Data type enforcement in Bigtable

Bigtable's flexible schema lets you store data of any type (strings, dates, numbers, JSON documents, or even images or PDFs) in a Bigtable table.

This document describes when Bigtable enforces a data type, which requires you to encode or decode the data in your application code. For a list of Bigtable data types, see Type in the Data API reference documentation.

Enforced types

Data type is enforced for the following data:

Aggregate column families (counters)
Timestamps
Materialized views

Aggregates

For the aggregate data type, encoding depends on the aggregation type. When you create an aggregate column family, you must specify an aggregation type.

This table shows the input type and encoding for each aggregation type.

Aggregate type	Input type	Encoding
Sum	Int64	`BigEndianBytes`
Min	Int64	`BigEndianBytes`
Max	Int64	`BigEndianBytes`
HLL	Bytes	Zetasketch HLL++

When you query the data in aggregate cells using SQL, SQL automatically incorporates type information.

When you read the data in aggregate cells using the Data API's ReadRows method, Bigtable returns bytes, so your application must decode the values using the encoding that Bigtable used to map the typed data to bytes.
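As a minimal sketch of that decoding step for Sum, Min, and Max cells, the following Python helper unpacks an 8-byte big-endian value. It assumes the bytes are a two's-complement signed Int64, and the function names are hypothetical and not part of any Bigtable client library. HLL cells use a different encoding and are not covered here.

```python
import struct


def decode_int64_aggregate(raw: bytes) -> int:
    """Decode an 8-byte big-endian signed integer (assumed BigEndianBytes layout)."""
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    # ">q" means big-endian, signed 64-bit integer.
    return struct.unpack(">q", raw)[0]


def encode_int64(value: int) -> bytes:
    """Inverse of decode_int64_aggregate, useful for building test values."""
    return struct.pack(">q", value)
```

The `raw` argument stands for the cell value bytes that your client returns from a ReadRows response.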

You can't convert a column family that contains non-aggregate data into an aggregate column family. Columns in aggregate column families can't contain non-aggregate cells, and standard column families can't contain aggregate cells.

For more information about creating tables with aggregate column families, see Create a table. For code samples that show how to increment an aggregate cell with encoded values, see Increment a value.

Timestamps

Each Bigtable cell has an Int64 timestamp that must be a microsecond value with, at most, millisecond precision. Bigtable rejects a timestamp with microsecond precision, such as 3023483279876543. In this example, the acceptable timestamp value is 3023483279876000. A timestamp is the number of microseconds since the Unix epoch, 1970-01-01 00:00:00 UTC.

Continuous materialized views

Preview

This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Continuous materialized views are read-only resources that you can read by using SQL or with a ReadRows Data API call. Data in a materialized view is typed based on the query that defines it. For an overview, see Continuous materialized views.

When you use SQL to query a continuous materialized view, SQL automatically incorporates type information.

When you read from a continuous materialized view using a Data API ReadRows request, you must know each column's type and decode it in your application code.

Aggregated values in a continuous materialized view are stored using the encoding described in the following table, based on the output type of the column from the view definition.

Type	Encoding
BOOL	1 byte value, 1 = true, 0 = false
BYTES	No encoding
INT64 (or INT, SMALLINT, INTEGER, BIGINT, TINYINT, BYTEINT)	64-bit big-endian
FLOAT64	64-bit IEEE 754, excluding NaN and +/-inf
STRING	UTF-8
TIME/TIMESTAMP	64-bit integer representing the number of microseconds since the Unix epoch (consistent with GoogleSQL)

For more information, see Encoding in the Data API reference.
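The table above can be turned into a small decoding helper for values read through ReadRows. The following Python sketch is illustrative only. The function name is hypothetical, and it assumes that FLOAT64 and TIME/TIMESTAMP values are big-endian like INT64, which the table doesn't state explicitly. Verify this against the Encoding reference before relying on it.

```python
import struct


def decode_view_value(raw: bytes, sql_type: str):
    """Decode a materialized-view cell value according to its column type."""
    t = sql_type.upper()
    if t == "BOOL":
        return raw == b"\x01"              # 1 = true, 0 = false
    if t == "BYTES":
        return raw                         # no encoding
    if t == "INT64":
        return struct.unpack(">q", raw)[0]  # 64-bit big-endian
    if t == "FLOAT64":
        # Assumption: big-endian IEEE 754 double.
        return struct.unpack(">d", raw)[0]
    if t == "STRING":
        return raw.decode("utf-8")
    if t in ("TIME", "TIMESTAMP"):
        # Assumption: big-endian; result is microseconds since the Unix epoch.
        return struct.unpack(">q", raw)[0]
    raise ValueError(f"unsupported type: {sql_type}")
```

Because the caller passes the type explicitly, the helper reflects the requirement that your application must already know each column's type.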

Structured row keys

Preview

Structured row keys let you access your data using multi-column keys, similar to composite keys in relational databases.

The type and encoding for structured row keys are defined by a row key schema that you can optionally add to a Bigtable table. Structured row key data is stored as bytes, but GoogleSQL for Bigtable automatically uses the type and encoding defined in the row key schema when you execute a SQL query on the table.

Using a row key schema to query a table with a ReadRows request isn't supported. A continuous materialized view has a row key schema by default. For more information about structured row keys, see Manage row key schemas.

Unenforced types

If no type information is provided, then Bigtable treats each cell as bytes with an unknown encoding.

When querying column families that are created without type enforcement, you must provide type information at read time to ensure that the data is read correctly. This matters for database functions whose behavior depends on the data type. GoogleSQL for Bigtable offers CAST functions to do type conversions at query time. These functions convert from bytes to the types that various functions expect.

While Bigtable doesn't enforce types, certain operations assume a data type. Knowing this helps you ensure that your data is written in a way that can be processed within the database. The following are examples:

Increments using ReadModifyWriteRow assume the cell contains a 64-bit big-endian signed integer.
The TO_VECTOR64 function in SQL expects the cell to contain a byte array that's a concatenation of the big-endian bytes of 64-bit floating point numbers.
The TO_VECTOR32 function in SQL expects the cell to contain a byte array that's a concatenation of the big-endian bytes of 32-bit floating point numbers.
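To write values that these operations can process, your application has to produce the expected byte layouts itself. The following Python sketch shows one way to build them with the standard `struct` module. The function names are hypothetical and aren't part of any Bigtable client library.

```python
import struct
from typing import Sequence


def encode_counter(value: int) -> bytes:
    """64-bit big-endian signed integer, the layout that increments assume."""
    return struct.pack(">q", value)


def encode_vector64(values: Sequence[float]) -> bytes:
    """Concatenated big-endian 64-bit floats, the layout TO_VECTOR64 expects."""
    return struct.pack(f">{len(values)}d", *values)


def encode_vector32(values: Sequence[float]) -> bytes:
    """Concatenated big-endian 32-bit floats, the layout TO_VECTOR32 expects.
    Values are narrowed to single precision, so some precision may be lost."""
    return struct.pack(f">{len(values)}f", *values)
```

A vector of n elements encodes to 8n bytes with `encode_vector64` and 4n bytes with `encode_vector32`.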


Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.
