Continuous materialized view queries

Preview

This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of theService Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see thelaunch stage descriptions.

To create a continuous materialized view of a Bigtable table, you run aSQL query that defines the continuous materialized view.

This document describes concepts and patterns to help you prepare yourcontinuous materialized view SQL query. Before you read this document, youshould be familiar withContinuous materializedviews andGoogleSQL forBigtable.

Continuous materialized views use restricted SQL syntax. Thefollowing pattern shows how to build a continuous materialized view SQL query:

SELECTexpressionASalias[,...]FROMfrom_item[WHEREbool_expression]GROUPBYexpression[,...];from_item:{table_name[as_alias]|field_path}as_alias:[AS]alias

If you want to build a continuous materialized view SQL query as an asynchronoussecondary index, use theORDER BY clause:

SELECTexpressionASalias[,...]FROMfrom_item[WHEREbool_expression]ORDERBYexpression[,...];from_item:{table_name[as_alias]|field_path}as_alias:[AS]alias

Query limitations

The following rules apply to a SQL query used to create a continuousmaterialized view:

  • Must be aSELECT statement.
  • Must have aGROUP BY clause or, for asynchronous secondary index queries, anORDER BY clause, but not both.
  • Must use only supported aggregation functions.
  • Can have multiple aggregations per group.

Supported aggregations

You can use the following aggregation functions in a SQL query that defines acontinuous materialized view:

  • COUNT
  • SUM
  • MIN
  • MAX
  • HLL_COUNT.INIT
  • HLL_COUNT.MERGE
  • HLL_COUNT.MERGE_PARTIAL
  • ANY_VALUE
  • BIT_AND
  • BIT_OR
  • BIT_XOR
  • AVG

If youSELECT COUNT(*) you must define a row key, like in the followingexample:

SELECT'*'AS_key,COUNT(*)AScountFROMfooGROUPBY_key;

Unsupported SQL features

You can't use the following SQL features:

  • Any feature not supported byGoogleSQL for Bigtable
  • ARRAY
  • ARRAY_AGG
  • ARRAY_CONCAT_AGG
  • COUNT_IF
  • CURRENT_TIME and other non-deterministic functions
  • DATE,DATETIME as output columns (UseTIMESTAMP or store a string.)
  • DESC sort in the output
  • DISTINCT option, as inSUM(*DISTINCT* value))
  • LIMIT/OFFSET
  • SELECT *
  • OVER clause to create a windowing aggregation
  • STRUCT

You also can't nestGROUP BY orORDER BY clauses or create map columns. For additionallimitations, seeLimitations.

Avoiding excluded rows

Input rows are excluded from a continuous materialized view in the followingcircumstances:

  • More than 1 MiB of data is selected from the row. For example, if yourquery isSELECT apple AS apples , SUM(banana) AS sum_bananas FROM my_tableGROUP BY apples, then any row that contains more than 1MiB of data in theapple andbanana columns is excluded from the continuous materializedview.
  • More than 1 MiB of data is output from the row. This might occur whenyou use queries such asSELECT REPEAT(apple, 1000) or use large constants.
  • More than 10 times more data is output than was selected.
  • The query doesn't match your data. This would include attempting todivide a zero, integer overflow, or expecting a row key format that isn'tused in every row key.

Excluded rows increment the user errors metric when they are first processed.For more information about metrics that can help you monitor your continuousmaterialized views, seeMetrics.

Query details

This section describes a continuous materialized view query and how the resultsmight look when the view is queried. Data in the source table is theinput, and the result data in the continuous materialized view is theoutput. Output data is either aggregated or unaggregated (in the defined key).

SELECT statement

The select statement configures the columns and aggregations used in thecontinuous materialized view. The statement must use either aGROUP BY clauseto aggregate across rows or anORDER BY clause to create an asynchronous secondaryindex.

SELECT * is not supported, butSELECT COUNT(*) is.

As in a typicalSELECT statement, you can have multiple aggregations per agrouped set of data. The ungrouped columns must be an aggregation result.

This is an example of a standardGROUP BY aggregation query in SQL:

SELECTmyfamily["node"]ASnode,myfamily["type"]AStype,COUNT(clicks)ASclicks_per_keyFROMmytableGROUPBYnode,type

Row keys and unaggregated data

You can specify a_key as the row key for a continuous materializedview. If you don't, columns in theGROUP BY clause form the key in theview.

Row keys defined by a_key column

You can optionally specify a_key column when you define yourcontinuous materialized view. (This is different from the_keycolumn that you get whenyou execute a SQL query on a Bigtable table.) If you specify a_key, the following rules apply:

  • You must group by_key, and you can't group by anything else except(optionally) by_timestamp. For more information, seeTimestamps.
  • The_key column must be of typeBYTES.

Specifying a_key is useful if you plan to read the view withReadRowsrather than with SQL, because it gives you control over the row key format. Onthe other hand, a SQL query to a view with a defined_key might need to decodethe_key explicitly instead of just returning structured key columns.

Row keys defined byGROUP BY orORDER BY clause

If you don't specify a_key, the unaggregated columns in yourSELECT listbecome the row key in the view. You can assign the key columns any namessupported by SQL conventions. Use this approach if you plan to use SQL to querythe view rather than aReadRows request.

Unaggregated output columns in theSELECT list must be included in theGROUPBY clause. The order in which columns are written in theGROUP BY clause isthe order in which the data is stored in the continuous materialized view rowkey. For example,GROUP BY a, b, c is implicitlyORDER BY a ASC, b ASC, cASC.

If you use anORDER BY clause instead of aGROUP BY clause to create anasynchronous secondary index, the columns in yourSELECT list that are part of theORDER BY clause become the row key in the view. The order in which columnsare written in theORDER BY clause is the order in which the data is storedin the continuous materialized view row key. For example,ORDER BY a, b, cstores the data with row keys ordered bya ASC, thenb ASC, thenc ASC.

Important: The columns that form the row key in a continuous materialized viewmust not be NULL. A NULL value in a key column causes the data in thecorresponding row in the source table to be excluded from the view.

Your SQL filter must eliminate potentialNULL or other invalid values that cancause errors. An invalid row, such as one containing aNULL key column, isomitted from the results and counted in thematerialized_view/user_errorsmetric. To debug user errors, try running the SQL query outside of a continuousmaterialized view.

Aggregated data

Aggregate columns in the query define the calculations that generate the datain the continuous materialized view.

The alias for an aggregate column is treated as a column qualifier in thecontinuous materialized view.

Consider the following example:

SELECTfam["baz"]ASbaz,SUM(fam["foo"])ASsum_foo,SUM(fam["bar"])ASsum_barFROMTABLEGROUPBYbaz;

The query output has the following characteristics:

  • Output for eachbaz is in a separate row inbaz ASC order.
  • If a givenbaz has at least onefoo, then the output row'ssum_foo isa non-NULL value.
  • If a givenbaz has at least onebar, then the output row'ssum_bar isa non-NULL value.
  • If a givenbaz has no value for either column, it's omitted from theresults.

Then if you query the view withSELECT *, the result looks similar to thefollowing:

bazsum_foosum_bar
baz1sum_foo1sum_bar1
baz2sum_foo2sum_bar2

Timestamps

Thedefault timestamp for an output cell in a continuous materialized view is0 (1970-01-01 00:00:00Z). This is visible when you read the view withReadRows and not when you query it with SQL.

To use a different timestamp in the output, you can add a column of theTIMESTAMP type to theSELECT list of the query and name it_timestamp.If you query the continuous materialized view usingReadRows,_timestamp becomes the timestamp for the other cells in the row.

A timestamp must not beNULL, must be greater than or equal to zero, and mustbe a multiple of 1,000 (millisecond precision). Bigtable doesn'tsupport cell timestamps earlier than the Unix epoch (1970-01-01T00:00:00Z).

Consider the following example, which resamples aggregate data by day. The queryuses theUNPACK function.

SELECT_key,TIMESTAMP_TRUNC(_timestamp,DAY)AS_timestamp,SUM(sum_family["sum_column"])ASsum_column,SUM(sum_family["foo"])ASsecond_sum_columnFROMUNPACK(SELECT*FROMmy_table(with_history=>TRUE))GROUPBY1,2

If a givenSUM has non-empty input for a given day, then the output rowcontains an aggregated value with a timestamp that matches the truncated day.

If you query the view withSELECT *, the result looks similar to thefollowing:

_key_timestampsum_columnsecond_sum_column
12024-05-01 00:00:00Z2399
22024-05-02 00:00:00Z45201
32024-05-03 00:00:00ZNULL56
42024-05-04 00:00:00Z8NULL

Encoding

If you query your continuous materialized view with SQL, you don't need to beaware of how aggregated values are encoded because SQL exposes the results astyped columns.

If you read from the view usingReadRows, you need to decode the aggregateddata in your read request. For more information onReadRows requests, seeReads.

Aggregated values in a continuous materialized view are stored using encodingdescribed in the following table, based on the output type of the column fromthe view definition.

TypeEncoding
BOOL1 byte value, 1 = true, 0 = false
BYTESNo encoding
INT64 (or INT, SMALLINT, INTEGER, BIGINT, TINYINT, BYTEINT)64-bit big-endian
FLOAT6464-bit IEEE 754, excluding NaN and +/-inf
STRINGUTF-8
TIME/TIMESTAMP64-bit integer representing the number of microseconds since the Unixepoch (consistent with GoogleSQL)
For more information, seeEncoding in the Data API reference.

What's next

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.