Query syntax

Query statements scan one or more tables or expressions and return the computedresult rows. This topic describes the syntax for SQL queries inGoogleSQL for BigQuery.

SQL syntax notation rules

The following table lists and describes the syntax notation rules that GoogleSQLdocumentation commonly uses.

NotationExampleDescription
Square brackets[ ]Optional clauses
Parentheses( )Literal parentheses
Vertical bar|LogicalXOR (exclusiveOR)
Curly braces{ }A set of options, such as{ a | b | c }. Select one option.
Ellipsis...The preceding item can repeat.
Comma,Literal comma
Comma followed by an ellipsis, ...The preceding item can repeat in a comma-separated list.
Item listitem [, ...]One or more items
[item, ...]Zero or more items
Double quotes""The enclosed syntax characters (for example,"{"..."}") are literal and required.
Angle brackets<>Literal angle brackets

SQL syntax

query_statement:query_exprquery_expr:  [WITH [RECURSIVE ] {non_recursive_cte |recursive_cte }[, ...] ]  {select | (query_expr ) |set_operation }  [ORDERBYexpression [{ ASC | DESC }] [, ...] ]  [LIMITcount [ OFFSETskip_rows ] ]select:SELECT    [WITHdifferential_privacy_clause ]    [ { ALL | DISTINCT } ]    [ AS {STRUCT |VALUE } ]select_list  [FROMfrom_clause[, ...] ]  [WHEREbool_expression ]  [GROUP BYgroup_by_specification ]  [HAVINGbool_expression ]  [QUALIFYbool_expression ]  [WINDOWwindow_clause ]

SELECT statement

SELECT  [WITHdifferential_privacy_clause ]  [ { ALL | DISTINCT } ]  [ AS {STRUCT |VALUE } ]select_listselect_list:  {select_all |select_expression } [, ...]select_all:  [expression. ]*  [ EXCEPT (column_name [, ...] ) ]  [ REPLACE (expression AScolumn_name [, ...] ) ]select_expression:expression [ [ AS ]alias ]

TheSELECT list defines the columns that the query will return. Expressions intheSELECT list can refer to columns in any of thefrom_items in itscorrespondingFROM clause.

Each item in theSELECT list is one of:

  • *
  • expression
  • expression.*

SELECT *

SELECT *, often referred to asselect star, produces one output column foreach column that's visible after executing the full query.

SELECT*FROM(SELECT"apple"ASfruit,"carrot"ASvegetable);/*-------+-----------+ | fruit | vegetable | +-------+-----------+ | apple | carrot    | +-------+-----------*/

SELECT expression

Items in aSELECT list can be expressions. These expressions evaluate to asingle value and produce one output column, with an optional explicitalias.

If the expression doesn't have an explicit alias, it receives an implicit aliasaccording to the rules forimplicit aliases, if possible.Otherwise, the column is anonymous and you can't refer to it by name elsewherein the query.

SELECT expression.*

An item in aSELECT list can also take the form ofexpression.*. Thisproduces one output column for each column or top-level field ofexpression.The expression must either be a table alias or evaluate to a single value of adata type with fields, such as a STRUCT.

The following query produces one output column for each column in the tablegroceries, aliased asg.

WITHgroceriesAS(SELECT"milk"ASdairy,"eggs"ASprotein,"bread"ASgrain)SELECTg.*FROMgroceriesASg;/*-------+---------+-------+ | dairy | protein | grain | +-------+---------+-------+ | milk  | eggs    | bread | +-------+---------+-------*/

More examples:

WITHlocationsAS(SELECTSTRUCT("Seattle"AScity,"Washington"ASstate)ASlocationUNIONALLSELECTSTRUCT("Phoenix"AScity,"Arizona"ASstate)ASlocation)SELECTl.location.*FROMlocationsl;/*---------+------------+ | city    | state      | +---------+------------+ | Seattle | Washington | | Phoenix | Arizona    | +---------+------------*/
WITHlocationsAS(SELECTARRAY<STRUCT<citySTRING,stateSTRING>>[("Seattle","Washington"),("Phoenix","Arizona")]ASlocation)SELECTl.LOCATION[offset(0)].*FROMlocationsl;/*---------+------------+ | city    | state      | +---------+------------+ | Seattle | Washington | +---------+------------*/

SELECT * EXCEPT

ASELECT * EXCEPT statement specifies the names of one or more columns toexclude from the result. All matching column names are omitted from the output.

WITHordersAS(SELECT5asorder_id,"sprocket"asitem_name,200asquantity)SELECT*EXCEPT(order_id)FROMorders;/*-----------+----------+ | item_name | quantity | +-----------+----------+ | sprocket  | 200      | +-----------+----------*/
Note:SELECT * EXCEPT doesn't exclude columns that don't have names.

SELECT * REPLACE

ASELECT * REPLACE statement specifies one or moreexpression AS identifier clauses. Each identifier must match a column namefrom theSELECT * statement. In the output column list, the column thatmatches the identifier in aREPLACE clause is replaced by the expression inthatREPLACE clause.

ASELECT * REPLACE statement doesn't change the names or order of columns.However, it can change the value and the value type.

WITHordersAS(SELECT5asorder_id,"sprocket"asitem_name,200asquantity)SELECT*REPLACE("widget"ASitem_name)FROMorders;/*----------+-----------+----------+ | order_id | item_name | quantity | +----------+-----------+----------+ | 5        | widget    | 200      | +----------+-----------+----------*/WITHordersAS(SELECT5asorder_id,"sprocket"asitem_name,200asquantity)SELECT*REPLACE(quantity/2ASquantity)FROMorders;/*----------+-----------+----------+ | order_id | item_name | quantity | +----------+-----------+----------+ | 5        | sprocket  | 100      | +----------+-----------+----------*/
Note:SELECT * REPLACE doesn't replace columns that don't have names.

SELECT DISTINCT

ASELECT DISTINCT statement discards duplicate rows and returns only theremaining rows.SELECT DISTINCT can't return columns of the following types:

In the following example,SELECT DISTINCT is used to produce distinct arrays:

WITHPlayerStatsAS(SELECT['Coolidge','Adams']asName,3asPointsScoredUNIONALLSELECT['Adams','Buchanan'],0UNIONALLSELECT['Coolidge','Adams'],1UNIONALLSELECT['Kiran','Noam'],1)SELECTDISTINCTNameFROMPlayerStats;/*------------------+ | Name             | +------------------+ | [Coolidge,Adams] | | [Adams,Buchanan] | | [Kiran,Noam]     | +------------------*/

In the following example,SELECT DISTINCT is used to produce distinct structs:

WITHPlayerStatsAS(SELECTSTRUCT<last_nameSTRING,first_nameSTRING,ageINT64>('Adams','Noam',20)ASPlayer,3ASPointsScoredUNIONALLSELECT('Buchanan','Jie',19),0UNIONALLSELECT('Adams','Noam',20),4UNIONALLSELECT('Buchanan','Jie',19),13)SELECTDISTINCTPlayerFROMPlayerStats;/*--------------------------+ | player                   | +--------------------------+ | {                        | |   last_name: "Adams",    | |   first_name: "Noam",    | |   age: 20                | |  }                       | +--------------------------+ | {                        | |   last_name: "Buchanan", | |   first_name: "Jie",     | |   age: 19                | |  }                       | +---------------------------*/

SELECT ALL

ASELECT ALL statement returns all rows, including duplicate rows.SELECT ALL is the default behavior ofSELECT.

SELECT AS STRUCT

SELECTASSTRUCTexpr[[AS]struct_field_name1][,...]

This produces avalue table with aSTRUCT row type, where theSTRUCT field names and types match the column namesand types produced in theSELECT list.

Example:

SELECTARRAY(SELECTASSTRUCT1a,2b)

SELECT AS STRUCT can be used in a scalar or array subquery to produce a singleSTRUCT type grouping multiple values together. Scalarand array subqueries (seeSubqueries) are normally notallowed to return multiple columns, but can return a single column withSTRUCT type.

SELECT AS VALUE

SELECT AS VALUE produces avalue table from anySELECT list that produces exactly one column. Instead of producing anoutput table with one column, possibly with a name, the output will be avalue table where the row type is just the value type that was produced in theoneSELECT column. Any alias the column had will be discarded in thevalue table.

Example:

SELECTASVALUESTRUCT(1ASa,2ASb)xyz

The query above produces a table with row typeSTRUCT<a int64, b int64>.

FROM clause

FROMfrom_clause[, ...]from_clause:from_item  [ {pivot_operator |unpivot_operator |match_recognize_clause } ]  [tablesample_operator ]from_item:  {table_name [as_alias ] [ FOR SYSTEM_TIME AS OFtimestamp_expression ]     | {join_operation | (join_operation ) }    | (query_expr ) [as_alias ]    |field_path    |unnest_operator    |cte_name [as_alias ]  }as_alias:  [ AS ]alias

TheFROM clause indicates the table or tables from which to retrieve rows,and specifies how to join those rows together to produce a single stream ofrows for processing in the rest of the query.

pivot_operator

SeePIVOT operator.

unpivot_operator

SeeUNPIVOT operator.

tablesample_operator

SeeTABLESAMPLE operator.

match_recognize_clause

SeeMATCH_RECOGNIZE clause.

table_name

The name (optionally qualified) of an existing table.

SELECT * FROM Roster;SELECT * FROM dataset.Roster;SELECT * FROM project.dataset.Roster;

FOR SYSTEM_TIME AS OF

FOR SYSTEM_TIME AS OF references the historical versions of the tabledefinition and rows that were current attimestamp_expression.

Limitations:

The source table in theFROM clause containingFOR SYSTEM_TIME AS OF mustnot be any of the following:

  • An array scan, including aflattened array or the outputof theUNNEST operator.
  • A common table expression defined by aWITH clause.
  • The source table in aCREATE TABLE FUNCTIONstatement creating a new table-valued function

timestamp_expression must be a constant expression. It can'tcontain the following:

  • Subqueries.
  • Correlated references (references to columns of a table that appear ata higher level of the query statement, such as in theSELECT list).
  • User-defined functions (UDFs).

The value oftimestamp_expression can't fall into the following ranges:

  • After the current timestamp (in the future).
  • More than seven (7) days before the current timestamp.

A single query statement can't reference a single table at more than one pointin time, including the current time. That is, a query can reference a tablemultiple times at the same timestamp, but not the current version and ahistorical version, or two different historical versions.

Note: DML statements always operate on the current version of the destinationtable, so if the destination table is used multiple times in the query, all ofthem must use the current version.

The default time zone fortimestamp_expression in aFOR SYSTEM_TIME AS OF expression isAmerica/Los_Angeles, even though thedefault time zone for timestamp literals isUTC.

Examples:

The following query returns a historical version of the table from one hour ago.

SELECT*FROMtFORSYSTEM_TIMEASOFTIMESTAMP_SUB(CURRENT_TIMESTAMP(),INTERVAL1HOUR);

The following query returns a historical version of the table at an absolutepoint in time.

SELECT*FROMtFORSYSTEM_TIMEASOF'2017-01-01 10:00:00-07:00';

The following query returns an error because thetimestamp_expression containsa correlated reference to a column in the containing query.

SELECT*FROMt1WHEREt1.aIN(SELECTt2.aFROMt2FORSYSTEM_TIMEASOFt1.timestamp_column);

The following operations show accessing a historical version of the table beforetable is replaced.

DECLAREbefore_replace_timestampTIMESTAMP;-- Create table books.CREATETABLEbooksASSELECT'Hamlet'title,'William Shakespeare'author;-- Get current timestamp before table replacement.SETbefore_replace_timestamp=CURRENT_TIMESTAMP();-- Replace table with different schema(title and release_date).CREATEORREPLACETABLEbooksASSELECT'Hamlet'title,DATE'1603-01-01'release_date;-- This query returns Hamlet, William Shakespeare as result.SELECT*FROMbooksFORSYSTEM_TIMEASOFbefore_replace_timestamp;

The following operations show accessing a historical version of the tablebefore a DML job.

DECLAREJOB_START_TIMESTAMPTIMESTAMP;-- Create table books.CREATEORREPLACETABLEbooksASSELECT'Hamlet'title,'William Shakespeare'author;-- Insert two rows into the books.INSERTbooks(title,author)VALUES('The Great Gatsby','F. Scott Fizgerald'),('War and Peace','Leo Tolstoy');SELECT*FROMbooks;SETJOB_START_TIMESTAMP=(SELECTstart_timeFROM`region-us`.INFORMATION_SCHEMA.JOBS_BY_USERWHEREjob_type="QUERY"ANDstatement_type="INSERT"ORDERBYstart_timeDESCLIMIT1);-- This query only returns Hamlet, William Shakespeare as result.SELECT*FROMbooksFORSYSTEM_TIMEASOFJOB_START_TIMESTAMP;

The following query returns an error because the DML operates on the currentversion of the table, and a historical version of the table from one day ago.

INSERTINTOt1SELECT*FROMt1FORSYSTEM_TIMEASOFTIMESTAMP_SUB(CURRENT_TIMESTAMP(),INTERVAL1DAY);

join_operation

SeeJoin operation.

query_expr

( query_expr ) [ [ AS ] alias ] is atable subquery.

field_path

In theFROM clause,field_path is any path thatresolves to a field within a data type.field_path can goarbitrarily deep into a nested data structure.

Some examples of validfield_path values include:

SELECT*FROMT1t1,t1.array_column;SELECT*FROMT1t1,t1.struct_column.array_field;SELECT(SELECTARRAY_AGG(c)FROMt1.array_columnc)FROMT1t1;SELECTa.struct_field1FROMT1t1,t1.array_of_structsa;SELECT(SELECTSTRING_AGG(a.struct_field1)FROMt1.array_of_structsa)FROMT1t1;

Field paths in theFROM clause must end in anarray field. Inaddition, field paths can't contain arrays before the end of the path. For example, the patharray_column.some_array.some_array_field is invalid because itcontains an array before the end of the path.

Note: If a path has only one name, it's interpreted as a table.To work around this, wrap the path usingUNNEST, or use thefully-qualified path.Note: If a path has more than one name, and it matches a fieldname, it's interpreted as a field name. To force the path to be interpreted asa table name, wrap the path using`.

unnest_operator

SeeUNNEST operator.

cte_name

Common table expressions (CTEs) in aWITH Clause act liketemporary tables that you can reference anywhere in theFROM clause.In the example below,subQ1 andsubQ2 are CTEs.

Example:

WITHsubQ1AS(SELECT*FROMRosterWHERESchoolID=52),subQ2AS(SELECTSchoolIDFROMsubQ1)SELECTDISTINCT*FROMsubQ2;

TheWITH clause hides any permanent tables with the same namefor the duration of the query, unless you qualify the table name, for example:

dataset.Roster orproject.dataset.Roster.

UNNEST operator

unnest_operator:  {UNNEST(array ) [as_alias ]    |array_path [as_alias ]  }  [ WITH OFFSET [as_alias ] ]array:  {array_expression |array_path }as_alias:  [AS]alias

TheUNNEST operator takes an array and returns a table with one row for eachelement in the array. The output ofUNNEST is onevalue table column.For theseARRAY element types,SELECT * against the value table columnreturns multiple columns:

  • STRUCT

Input values:

  • array_expression: An expression that produces an array and that's not anarray path.
  • array_path: Thepath to anARRAY type.

    • In an implicitUNNEST operation, the pathmuststart witharange variable name.
    • In an explicitUNNEST operation, the path can optionally start with arange variable name.

    TheUNNEST operation with anycorrelatedarray_path mustbe on the right side of aCROSS JOIN,LEFT JOIN, orINNER JOIN operation.

  • as_alias: If specified, defines the explicit name of the value tablecolumn containing the array element values. It can be used to refer tothe column elsewhere in the query.

  • WITH OFFSET:UNNEST destroys the order of elements in the inputarray. Use this optional clause to return an additional column withthe array element indexes, oroffsets. Offset counting starts at zero foreach row produced by theUNNEST operation. This column has anoptional alias; If the optional alias isn't used, the default column name isoffset.

    Example:

    SELECT*FROMUNNEST([10,20,30])asnumbersWITHOFFSET;/*---------+--------+ | numbers | offset | +---------+--------+ | 10      | 0      | | 20      | 1      | | 30      | 2      | +---------+--------*/

You can also useUNNEST outside of theFROM clause with theIN operator.

For several ways to useUNNEST, including construction, flattening, andfiltering, seeWork with arrays.

To learn more about the ways you can useUNNEST explicitly and implicitly,seeExplicit and implicitUNNEST.

UNNEST and structs

For an input array of structs,UNNESTreturns a row for each struct, with a separate column for each field in thestruct. The alias for each column is the name of the corresponding structfield.

Example:

SELECT*FROMUNNEST(ARRAY<STRUCT<xINT64,ySTRING,zSTRUCT<aINT64,bINT64>>>[(1,'foo',(10,11)),(3,'bar',(20,21))]);/*---+-----+----------+ | x | y   | z        | +---+-----+----------+ | 1 | foo | {10, 11} | | 3 | bar | {20, 21} | +---+-----+----------*/

Because theUNNEST operator returns avalue table,you can aliasUNNEST to define a range variable that you can referenceelsewhere in the query. If you reference the range variable in theSELECTlist, the query returns a struct containing all of the fields of the originalstruct in the input table.

Example:

SELECT*,struct_valueFROMUNNEST(ARRAY<STRUCT<xINT64,ySTRING>>[(1,'foo'),(3,'bar')])ASstruct_value;/*---+-----+--------------+ | x | y   | struct_value | +---+-----+--------------+ | 3 | bar | {3, bar}     | | 1 | foo | {1, foo}     | +---+-----+--------------*/

Explicit and implicitUNNEST

Array unnesting can be either explicit or implicit. To learn more, see thefollowing sections.

Explicit unnesting

TheUNNEST keyword is required in explicit unnesting. For example:

WITHCoordinatesAS(SELECT[1,2]ASposition)SELECTresultsFROMCoordinates,UNNEST(Coordinates.position)ASresults;

This example and the following examples use thearray_path calledCoordinates.position to illustrate unnesting.

Implicit unnesting

TheUNNEST keyword isn't used in implicit unnesting.

For example:

WITHCoordinatesAS(SELECT[1,2]ASposition)SELECTresultsFROMCoordinates,Coordinates.positionASresults;
Tables and implicit unnesting

When you usearray_path with implicitUNNEST,array_path must be prependedwith the table. For example:

WITHCoordinatesAS(SELECT[1,2]ASposition)SELECTresultsFROMCoordinates,Coordinates.positionASresults;

UNNEST andNULL values

UNNEST treatsNULL values as follows:

  • NULL and empty arrays produce zero rows.
  • An array containingNULL values produces rows containingNULL values.

PIVOT operator

FROMfrom_item[, ...]pivot_operatorpivot_operator:  PIVOT(aggregate_function_call [as_alias][, ...]    FORinput_column    IN (pivot_column [as_alias][, ...] )  ) [ASalias]as_alias:  [AS]alias

ThePIVOT operator rotates rows into columns, using aggregation.PIVOT is part of theFROM clause.

  • PIVOT can be used to modify any table expression.
  • CombiningPIVOT withFOR SYSTEM_TIME AS OF isn't allowed, although usersmay usePIVOT against a subquery input which itself usesFOR SYSTEM_TIME AS OF.
  • AWITH OFFSET clause immediately preceding thePIVOT operator isn'tallowed.

Conceptual example:

-- Before PIVOT is used to rotate sales and quarter into Q1, Q2, Q3, Q4 columns:/*---------+-------+---------+------+ | product | sales | quarter | year | +---------+-------+---------+------| | Kale    | 51    | Q1      | 2020 | | Kale    | 23    | Q2      | 2020 | | Kale    | 45    | Q3      | 2020 | | Kale    | 3     | Q4      | 2020 | | Kale    | 70    | Q1      | 2021 | | Kale    | 85    | Q2      | 2021 | | Apple   | 77    | Q1      | 2020 | | Apple   | 0     | Q2      | 2020 | | Apple   | 1     | Q1      | 2021 | +---------+-------+---------+------*/-- After PIVOT is used to rotate sales and quarter into Q1, Q2, Q3, Q4 columns:/*---------+------+----+------+------+------+ | product | year | Q1 | Q2   | Q3   | Q4   | +---------+------+----+------+------+------+ | Apple   | 2020 | 77 | 0    | NULL | NULL | | Apple   | 2021 | 1  | NULL | NULL | NULL | | Kale    | 2020 | 51 | 23   | 45   | 3    | | Kale    | 2021 | 70 | 85   | NULL | NULL | +---------+------+----+------+------+------*/

Definitions

Top-level definitions:

  • from_item: The table, subquery, ortable-valued function (TVF) on whichto perform a pivot operation. Thefrom_item mustfollow these rules.
  • pivot_operator: The pivot operation to perform on afrom_item.
  • alias: An alias to use for an item in the query.

pivot_operator definitions:

  • aggregate_function_call: An aggregate function call that aggregates allinput rows such thatinput_column matches a particular value inpivot_column. Each aggregation corresponding to a differentpivot_columnvalue produces a different column in the output.Follow these rules when creating anaggregate function call.
  • input_column: Takes a column and retrieves the row values for thecolumn,following these rules.
  • pivot_column: A pivot column to create for each aggregate functioncall. If an alias isn't provided, a default alias is created. A pivot columnvalue type must match the value type ininput_column so that the values canbe compared. It's possible to have a value inpivot_column that doesn'tmatch a value ininput_column. Must be a constant andfollow these rules.

Rules

Rules for afrom_item passed toPIVOT:

  • Thefrom_item may consist of anytable, subquery, or table-valued function(TVF) result.
  • Thefrom_item may not produce a value table.
  • Thefrom_item may not be a subquery usingSELECT AS STRUCT.

Rules foraggregate_function_call:

  • Must be an aggregate function. For example,SUM.
  • You may reference columns in a table passed toPIVOT, aswell as correlated columns, but may not access columns defined by thePIVOTclause itself.
  • A table passed toPIVOT may be accessed through its alias if one isprovided.
  • You can only use an aggregate function that takes one argument.
  • Except forCOUNT, you can only use aggregate functions that ignoreNULLinputs.
  • If you are usingCOUNT, you can use* as an argument.

Rules forinput_column:

  • May access columns from the input table, as well as correlated columns,not columns defined by thePIVOT clause, itself.
  • Evaluated against each row in the input table; aggregate and window functioncalls are prohibited.
  • Non-determinism is okay.
  • The type must be groupable.
  • The input table may be accessed through its alias if one is provided.

Rules forpivot_column:

  • Apivot_column must be a constant.
  • Named constants, such as variables, aren't supported.
  • Query parameters aren't supported.
  • If a name is desired for a named constant or query parameter,specify it explicitly with an alias.
  • Corner cases exist where a distinctpivot_columns can end up with the samedefault column names. For example, an input column might contain both aNULL value and the string literal"NULL". When this happens, multiplepivot columns are created with the same name. To avoid this situation,use aliases for pivot column names.
  • If apivot_column doesn't specify an alias, a column name is constructed asfollows:
FromToExample
NULLNULL Input: NULL
Output: "NULL"
INT64
NUMERIC
BIGNUMERIC
The number in string format with the following rules:
  • Positive numbers are preceded with_.
  • Negative numbers are preceded withminus_.
  • A decimal point is replaced with_point_.
Input: 1
Output: _1

Input: -1
Output: minus_1

Input: 1.0
Output: _1_point_0
BOOLTRUE orFALSE. Input: TRUE
Output: TRUE

Input: FALSE
Output: FALSE
STRINGThe string value. Input: "PlayerName"
Output: PlayerName
DATEThe date in_YYYY_MM_DD format. Input: DATE '2013-11-25'
Output: _2013_11_25
ENUMThe name of the enumeration constant. Input: COLOR.RED
Output: RED
STRUCT A string formed by computing thepivot_column name for each field and joining the results together with an underscore. The following rules apply:
  • If the field is named:<field_name>_<pivot_column_name_for_field_name>.
  • If the field is unnamed:<pivot_column_name_for_field_name>.

<pivot_column_name_for_field_name> is determined by applying the rules in this table, recursively. If no rule is available for anySTRUCT field, the entire pivot column is unnamed.

Due to implicit type coercion from theIN list values to the type of<value-expression>, field names must be present ininput_column to have an effect on the names of the pivot columns.

Input: STRUCT("one", "two")
Output: one_two

Input: STRUCT("one" AS a, "two" AS b)
Output: one_a_two_b
All other data typesNot supported. You must provide an alias.

Examples

The following examples reference a table calledProduce that looks like this:

WITHProduceAS(SELECT'Kale'asproduct,51assales,'Q1'asquarter,2020asyearUNIONALLSELECT'Kale',23,'Q2',2020UNIONALLSELECT'Kale',45,'Q3',2020UNIONALLSELECT'Kale',3,'Q4',2020UNIONALLSELECT'Kale',70,'Q1',2021UNIONALLSELECT'Kale',85,'Q2',2021UNIONALLSELECT'Apple',77,'Q1',2020UNIONALLSELECT'Apple',0,'Q2',2020UNIONALLSELECT'Apple',1,'Q1',2021)SELECT*FROMProduce/*---------+-------+---------+------+ | product | sales | quarter | year | +---------+-------+---------+------| | Kale    | 51    | Q1      | 2020 | | Kale    | 23    | Q2      | 2020 | | Kale    | 45    | Q3      | 2020 | | Kale    | 3     | Q4      | 2020 | | Kale    | 70    | Q1      | 2021 | | Kale    | 85    | Q2      | 2021 | | Apple   | 77    | Q1      | 2020 | | Apple   | 0     | Q2      | 2020 | | Apple   | 1     | Q1      | 2021 | +---------+-------+---------+------*/

With thePIVOT operator, the rows in thequarter column are rotated intothese new columns:Q1,Q2,Q3,Q4. The aggregate functionSUM isimplicitly grouped by all unaggregated columns other than thepivot_column:product andyear.

SELECT*FROMProducePIVOT(SUM(sales)FORquarterIN('Q1','Q2','Q3','Q4'))/*---------+------+----+------+------+------+ | product | year | Q1 | Q2   | Q3   | Q4   | +---------+------+----+------+------+------+ | Apple   | 2020 | 77 | 0    | NULL | NULL | | Apple   | 2021 | 1  | NULL | NULL | NULL | | Kale    | 2020 | 51 | 23   | 45   | 3    | | Kale    | 2021 | 70 | 85   | NULL | NULL | +---------+------+----+------+------+------*/

If you don't includeyear, thenSUM is grouped only byproduct.

SELECT*FROM(SELECTproduct,sales,quarterFROMProduce)PIVOT(SUM(sales)FORquarterIN('Q1','Q2','Q3','Q4'))/*---------+-----+-----+------+------+ | product | Q1  | Q2  | Q3   | Q4   | +---------+-----+-----+------+------+ | Apple   | 78  | 0   | NULL | NULL | | Kale    | 121 | 108 | 45   | 3    | +---------+-----+-----+------+------*/

You can select a subset of values in thepivot_column:

SELECT*FROM(SELECTproduct,sales,quarterFROMProduce)PIVOT(SUM(sales)FORquarterIN('Q1','Q2','Q3'))/*---------+-----+-----+------+ | product | Q1  | Q2  | Q3   | +---------+-----+-----+------+ | Apple   | 78  | 0   | NULL | | Kale    | 121 | 108 | 45   | +---------+-----+-----+------*/
SELECT*FROM(SELECTsales,quarterFROMProduce)PIVOT(SUM(sales)FORquarterIN('Q1','Q2','Q3'))/*-----+-----+----+ | Q1  | Q2  | Q3 | +-----+-----+----+ | 199 | 108 | 45 | +-----+-----+----*/

You can include multiple aggregation functions in thePIVOT. In this case, youmust specify an alias for each aggregation. These aliases are used to constructthe column names in the resulting table.

SELECT*FROM(SELECTproduct,sales,quarterFROMProduce)PIVOT(SUM(sales)AStotal_sales,COUNT(*)ASnum_recordsFORquarterIN('Q1','Q2'))/*--------+----------------+----------------+----------------+----------------+ |product | total_sales_Q1 | num_records_Q1 | total_sales_Q2 | num_records_Q2 | +--------+----------------+----------------+----------------+----------------+ | Kale   | 121            | 2              | 108            | 2              | | Apple  | 78             | 2              | 0              | 1              | +--------+----------------+----------------+----------------+----------------*/

UNPIVOT operator

FROMfrom_item[, ...]unpivot_operatorunpivot_operator:  UNPIVOT [ { INCLUDE NULLS | EXCLUDE NULLS } ] (    {single_column_unpivot |multi_column_unpivot }  ) [unpivot_alias]single_column_unpivot:values_column  FORname_column  IN (columns_to_unpivot)multi_column_unpivot:values_column_set  FORname_column  IN (column_sets_to_unpivot)values_column_set:  (values_column[, ...])columns_to_unpivot:unpivot_column [row_value_alias][, ...]column_sets_to_unpivot:  (unpivot_column [row_value_alias][, ...])unpivot_alias androw_value_alias:  [AS]alias

TheUNPIVOT operator rotates columns into rows.UNPIVOT is part of theFROM clause.

  • UNPIVOT can be used to modify any tableexpression.
  • CombiningUNPIVOT withFOR SYSTEM_TIME AS OF isn't allowed, althoughusers may useUNPIVOT against a subquery input which itself usesFOR SYSTEM_TIME AS OF.
  • AWITH OFFSET clause immediately preceding theUNPIVOT operator isn'tallowed.
  • PIVOT aggregations can't be reversed withUNPIVOT.

Conceptual example:

-- Before UNPIVOT is used to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:/*---------+----+----+----+----+ | product | Q1 | Q2 | Q3 | Q4 | +---------+----+----+----+----+ | Kale    | 51 | 23 | 45 | 3  | | Apple   | 77 | 0  | 25 | 2  | +---------+----+----+----+----*/-- After UNPIVOT is used to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:/*---------+-------+---------+ | product | sales | quarter | +---------+-------+---------+ | Kale    | 51    | Q1      | | Kale    | 23    | Q2      | | Kale    | 45    | Q3      | | Kale    | 3     | Q4      | | Apple   | 77    | Q1      | | Apple   | 0     | Q2      | | Apple   | 25    | Q3      | | Apple   | 2     | Q4      | +---------+-------+---------*/

Definitions

Top-level definitions:

  • from_item: The table, subquery, ortable-valued function (TVF) on whichto perform a pivot operation. Thefrom_item mustfollow these rules.
  • unpivot_operator: The pivot operation to perform on afrom_item.

unpivot_operator definitions:

  • INCLUDE NULLS: Add rows withNULL values to the result.
  • EXCLUDE NULLS: don't add rows withNULL values to the result.By default,UNPIVOT excludes rows withNULL values.
  • single_column_unpivot: Rotates columns into onevalues_columnand onename_column.
  • multi_column_unpivot: Rotates columns into multiplevalues_columns and onename_column.
  • unpivot_alias: An alias for the results of theUNPIVOT operation. Thisalias can be referenced elsewhere in the query.

single_column_unpivot definitions:

  • values_column: A column to contain the row values fromcolumns_to_unpivot.Follow these rules when creating a values column.
  • name_column: A column to contain the column names fromcolumns_to_unpivot.Follow these rules when creating a name column.
  • columns_to_unpivot: The columns from thefrom_item to populatevalues_column andname_column.Follow these rules when creating an unpivotcolumn.
    • row_value_alias: An optional alias for a column that's displayed for thecolumn inname_column. If not specified, the string value of thecolumn name is used.Follow these rules when creating arow value alias.

multi_column_unpivot definitions:

  • values_column_set: A set of columns to contain the row values fromcolumns_to_unpivot.Follow these rules whencreating a values column.
  • name_column: A set of columns to contain the column names fromcolumns_to_unpivot.Follow these rules whencreating a name column.
  • column_sets_to_unpivot: The columns from thefrom_item to unpivot.Follow these rules when creating an unpivotcolumn.
    • row_value_alias: An optional alias for a column set that's displayedfor the column set inname_column. If not specified, a string value forthe column set is used and each column in the string is separated with anunderscore (_). For example,(col1, col2) outputscol1_col2.Follow these rules when creating arow value alias.

Rules

Rules for afrom_item passed toUNPIVOT:

  • Thefrom_item may consist of anytable, subquery, or table-valued function(TVF) result.
  • Thefrom_item may not produce a value table.
  • Duplicate columns in afrom_item can't be referenced in theUNPIVOTclause.

Rules forunpivot_operator:

  • Expressions aren't permitted.
  • Qualified names aren't permitted. For example,mytable.mycolumn isn'tallowed.
  • In the case where theUNPIVOT result has duplicate column names:
    • SELECT * is allowed.
    • SELECT values_column causes ambiguity.

Rules forvalues_column:

  • It can't be a name used for aname_column or anunpivot_column.
  • It can be the same name as a column from thefrom_item.

Rules forname_column:

  • It can't be a name used for avalues_column or anunpivot_column.
  • It can be the same name as a column from thefrom_item.

Rules forunpivot_column:

  • Must be a column name from thefrom_item.
  • It can't reference duplicatefrom_item column names.
  • All columns in a column set must have equivalent data types.
    • Data types can't be coerced to a common supertype.
    • If the data types are exact matches (for example, a struct withdifferent field names), the data type of the first input isthe data type of the output.
  • You can't have the same name in the same column set. For example,(emp1, emp1) results in an error.
  • You can have a the same name in different column sets. For example,(emp1, emp2), (emp1, emp3) is valid.

Rules forrow_value_alias:

  • This can be a string or anINT64 literal.
  • The data type for allrow_value_alias clauses must be the same.
  • If the value is anINT64, therow_value_alias for eachunpivot_columnmust be specified.

Examples

The following examples reference a table calledProduce that looks like this:

WITHProduceAS(SELECT'Kale'asproduct,51asQ1,23asQ2,45asQ3,3asQ4UNIONALLSELECT'Apple',77,0,25,2)SELECT*FROMProduce/*---------+----+----+----+----+ | product | Q1 | Q2 | Q3 | Q4 | +---------+----+----+----+----+ | Kale    | 51 | 23 | 45 | 3  | | Apple   | 77 | 0  | 25 | 2  | +---------+----+----+----+----*/

With theUNPIVOT operator, the columnsQ1,Q2,Q3, andQ4 arerotated. The values of these columns now populate a new column calledSalesand the names of these columns now populate a new column calledQuarter.This is a single-column unpivot operation.

SELECT*FROMProduceUNPIVOT(salesFORquarterIN(Q1,Q2,Q3,Q4))/*---------+-------+---------+ | product | sales | quarter | +---------+-------+---------+ | Kale    | 51    | Q1      | | Kale    | 23    | Q2      | | Kale    | 45    | Q3      | | Kale    | 3     | Q4      | | Apple   | 77    | Q1      | | Apple   | 0     | Q2      | | Apple   | 25    | Q3      | | Apple   | 2     | Q4      | +---------+-------+---------*/

In this example, weUNPIVOT four quarters into two semesters.This is a multi-column unpivot operation.

SELECT*FROMProduceUNPIVOT((first_half_sales,second_half_sales)FORsemestersIN((Q1,Q2)AS'semester_1',(Q3,Q4)AS'semester_2'))/*---------+------------------+-------------------+------------+ | product | first_half_sales | second_half_sales | semesters  | +---------+------------------+-------------------+------------+ | Kale    | 51               | 23                | semester_1 | | Kale    | 45               | 3                 | semester_2 | | Apple   | 77               | 0                 | semester_1 | | Apple   | 25               | 2                 | semester_2 | +---------+------------------+-------------------+------------*/

TABLESAMPLE operator

TABLESAMPLE SYSTEM ( percent PERCENT )

Description

You can use theTABLESAMPLE operator to select a random sample of a dataset.This operator is useful when you're working with tables that have largeamounts of data and you don't need precise answers.

Sampling returns a variety of records while avoiding the costs associated withscanning and processing an entire table. Each execution of the query mightreturn different results because each execution processes an independentlycomputed sample. GoogleSQL doesn't cache the results of queries thatinclude aTABLESAMPLE clause.

Replacepercent with the percentage of the dataset that you want to include inthe results. The value must be between0 and100. The value can be a literalvalue or a query parameter. It can't be a variable.

For more information, seeTable sampling.

Example

The following query selects approximately 10% of a table's data:

SELECT*FROMdataset.my_tableTABLESAMPLESYSTEM(10PERCENT)

MATCH_RECOGNIZE clause

FROMfrom_itemMATCH_RECOGNIZE (  [PARTITION BYpartition_expr [, ... ] ]ORDER BYorder_expr [{ ASC | DESC }] [{ NULLS FIRST | NULLS LAST }] [, ...]MEASURES {measures_expr [AS] alias } [, ... ]  [ AFTER MATCH SKIP { PAST LAST ROW | TO NEXT ROW } ]PATTERN (pattern)DEFINEsymbol ASboolean_expr [, ... ]  [ OPTIONS ( [ use_longest_match = { TRUE | FALSE } ] ) ])

Description

TheMATCH_RECOGNIZE clause is an optional sub-clause of theFROM clause,used to filter and aggregate based on matches. Amatch is an ordered sequenceof rows that match a pattern that you specify.Matching rows works similarly to matching with regular expressions, butinstead of matching characters in a string, theMATCH_RECOGNIZE clause findsmatches across rows in a table.

Definitions

  • from_item: The data over which theMATCH_RECOGNIZE clause operates.
  • partition_expr: An expression for how to partition the input. Theexpression can't contain floating point types, non-groupable types,constants, or window functions.
  • order_expr: An orderable column or expression used to order the rowsbefore matching.
  • measures_expr: An aggregate expression that is evaluated per partition andmatch. The expression can reference columns and symbols.
  • alias: The output column name formeasures_expr.
  • PAST LAST ROW: Don't allow overlapping matches. The next match must startfrom one or more rows past the last row in the previous match. The defaultvalue forAFTER MATCH SKIP isPAST LAST ROW.
  • TO NEXT ROW: Allow overlapping matches. The next match can begin one rowafter the start of the previous match. Multiple matches that start at thesame row aren't allowed.
  • pattern: A sequence of symbols andpattern elements that defines a match.
  • symbol: A name given to a boolean expression used to construct thepattern.
  • boolean_expr: A boolean expression for a symbol that determines whethera row matches that symbol.
  • use_longest_match: If true, then the chosen match starting from anygiven row is the match that includes the most rows.The default value isFALSE. For more information, seeMatch disambiguation rules.

Output

When you use theMATCH_RECOGNIZE clause, input data is transformedin the following way:

If thePARTITION BY clause is present, then the output is the following:

  • There is one row for each match in each partition. Partitions with nomatches don't generate any rows.
  • There is one column for each expression in thePARTITION BY clause, andone column for each expression in theMEASURES clause.

If thePARTITION BY clause is omitted, then the output is the following:

  • There is one row for each match.
  • There is one column for each expression in theMEASURES clause.

PARTITION BY clause

ThePARTITION BY clause specifies a list of expressions that are usedto partition the input rows for patternmatching. Each partition is sorted according to theORDER BY clause andmatches must be entirely contained within a partition.If thePARTITION BY clause is omitted,then the entire input table belongs to a single partition.

You can't introduce an alias for a partition expression. If the expression is asingle column name, then that name is used as the column name in the output.Otherwise, the output column for that partition is anonymous.

ORDER BY clause

TheORDER BY clause within aMATCH_RECOGNIZE clause orders therows of the input for pattern matching andfollows the same rules as the standardORDER BY clause.The data type of eachexpression must beorderable.If all expressions in theORDER BYclause are tied, then the rows may be ordered arbitrarily.

MEASURES clause

TheMEASURES clause lists which aggregate expressions to compute for eachmatch, subject to the following rules:

  • You must provide an alias for each aggregate expression.
  • Aggregate expressions must aggregate any columns that they reference,except for columns in thePARTITION BY clause, which areguaranteed to have the same value for every row in a match.
  • To perform aggregation only on rows that matched a particular symbol, usethe syntaxsymbol_name.column_name. You can't use a symbol withouta column reference. You can't reference multiple symbols within the sameaggregation function.
  • A single row can match at most one symbol within a pattern. If a rowsatisfies the boolean expressions of multiple symbols in theDEFINEclause, only the symbol that that takes precedence can contribute to anaggregation expression. For more information about how todetermine which symbol counts towards a match, seematch disambiguation.

Order-sensitive aggregate functions

If an aggregate expression contains a function that depends on the order ofinput, such asARRAY_AGG orSTRING_AGG, the inputs are ordered accordingto theORDER BY clause of theMATCH_RECOGNIZE clause, unless they includethe keywordDISTINCT or an explicit ordering.If you explicitly specify an ordering within anaggregate function call, such asARRAY_AGG(x ORDER BY x), then the functionuses the order that you specified.

MEASURES clause functions

In addition to the standard aggregate functions, you can use the followingfunctions in theMEASURES clause:

  • FIRST(x): Returns the value ofx in the first row of the match, orNULL if the match is empty.
  • LAST(x): Returns the value ofx in the last row of the match, orNULLif the match is empty.
  • MATCH_NUMBER(): Returns the integer rank of the match, relative to othermatches in the same partition, starting at 1. Matches are ranked accordingto the order of their starting rows using theORDER BY clause.This function can be used with or without aggregation,because it returns the same value for every row in a match.
  • MATCH_ROW_NUMBER(): Returns an integer specifying the row number of thecurrent row, within the current match, starting at 1. The results of thisfunction must be aggregated.
  • CLASSIFIER(): Returns aSTRING value for each row in the match thatindicates the name of the symbol matched by the row. The results of thisfunction must be aggregated.

The following SQL snippet references a table with columns calledsales,sale_date, andcustomer, andshows examples of what is and isn't allowed in theMEASURES clause:

PARTITIONBYcustomerORDERBYsale_dateMEASURES# Error. No alias.SUM(sales),# Correct. You don't need to aggregate a partition column.customerAScustomer_name,# Error. You must aggregate all non-partition columns.salesAScustomer_sales,# Correct. You can aggregate table columns.SUM(sales)AStotal_sales,# Correct. Uses the symbol.column syntax.SUM(low_sales.sales)AStotal_low_sales,# Correct. Multiple symbols appear in separate aggregations.SUM(low_sales.sales)+SUM(high_sales.sales)AScombined_low_high,# Error. Multiple symbols appear in a single aggregation.SUM(low_sales.sales+high_sales.sales)ASbad_combined_sales,# Error. A single aggregation can't combine symbol-qualified and# non-symbol-qualified terms.SUM(low_sales.sales+sales)PATTERN(low_sales+high_sales*)DEFINElow_salesASsales <100,high_salesASsales >200

PATTERN clause

ThePATTERN clause specifies a pattern to match. A pattern is a sequence ofsymbols and operators. Adjacent symbols within a pattern must be separated by aspace or grouped separately using parentheses. Pattern matches are computedbased on the order specified in theORDER BY clause. Each pattern matchmust appear entirely within a partition. Patterns support thefollowing elements, listed in order of precedence:

ElementDescription
(<pattern>)Matches<pattern>. Parentheses indicate grouping.
<symbol>Matches a single row such that the associated expression in theDEFINE clause evaluates toTRUE for that row. See theDEFINE clause for details.
()Matches an empty row sequence unconditionally.
^Matches an empty row sequence, only before the first row of input.
$Matches an empty row sequence, only after the last row of input.
<pattern>?Matches<pattern> zero or one times; prefer once.
<pattern>??Matches<pattern> zero or one times; prefer zero.
<pattern>*Matches<pattern> zero or more times; prefer more.
<pattern>*?Matches<pattern> zero or more times; prefer fewer.
<pattern>+Matches<pattern> one or more times; prefer more.
<pattern>+?Matches<pattern> one or more times; prefer fewer.
<pattern>{n}Matches<pattern> exactly n times.
<pattern>{m,}Matches<pattern> at leastm times; prefer more.
<pattern>{m,}?Matches<pattern> at leastm times; prefer fewer.
<pattern>{,n}Matches<pattern> at mostn times; prefer more.
<pattern>{,n}?Matches<pattern> at mostn times; prefer fewer.
<pattern>{m,n}Matches<pattern> betweenm andn times, inclusive; prefer more.
<pattern>{m,n}?Matches<pattern> betweenm andn times, inclusive; prefer fewer.
<pattern1> <pattern2>Matches<pattern1>, followed by<pattern2>.
<pattern1> | <pattern2>Matches either<pattern1> or<pattern2>; prefer<pattern1>.
<pattern> |Matches either<pattern> or an empty row sequence; prefer<pattern>. This is equivalent to<pattern>?.
| <pattern>Matches either<pattern> or an empty row sequence; prefer the empty row sequence. This is equivalent to<pattern>??.

The values ofm andn must be non-null integerliterals or query parameters between0 and 10,000. If a quantifier contains an upper and lower bound, such as{m,n}, then the upper bound can't be less than the lower bound.

DEFINE clause

TheDEFINE clause lists all symbols used in the pattern. Each symbol isdefined by a boolean expression. The symbol canmatch a row if the expression evaluates toTRUE for that row.Every symbol must appear at least once in thePATTERN clause.

A single row canmatch at most one symbol within a match, even if it satisfies the booleanexpressions of multiple symbols in theDEFINE clause. Multiple matchescan't start at the same row. For more information, seematch disambiguation.

For example, you can use the following pattern and symbols inaMATCH_RECOGNIZE clause tomatch pairs of adjacent rows in which the value ofsales is less than 100 inthe first row and greater than 200 in the following row:

PATTERN(low_saleshigh_sales)DEFINElow_salesASsales <100,high_salesASsales >200

The following example matches one or more rows with sales less than 100,followed by at most one row with sales greater than 200, followedby any number of rows with sales less than 100:

PATTERN(low_sales+high_sales?low_sales*)DEFINElow_salesASsales <100,high_salesASsales >200

There are two functions unique to theDEFINE clause,PREV() andNEXT(),that you can use to define a symbol in relation to rows around thecurrent row. Each function accepts a column name and, optionally, a nonnegativeinteger indicating how many rows away to look:

  • PREV(column_name [, num_rows]): Returns the value ofcolumn_namethat occursnum_rows before the current row,orNULL if that row doesn't exist. The default value ofnum_rows is 1.
  • NEXT(column_name [, num_rows]): Returns the value ofcolumn_namethat occursnum_rows after the current row,orNULL if that row doesn't exist. The default value ofnum_rows is 1.

You can't use navigation functions, such asLEAD orLAG, to reference otherrow values.

The following example matches sequences of three rows in which the value ofxincreases and then decreases:

PATTERN(inclinepeakdecline)DEFINEinclineASx <NEXT(x),peakASPREV(x) <xANDx >NEXT(x),declineASPREV(x) >x

Match disambiguation

When multiple matches exist with the same starting row, only one is selectedaccording to the following rules:

  1. Longest match mode.

    • If theOPTIONS clause is present anduse_longest_match isTRUE,then the chosen match is the one that includes the most rows.In case of a tie, the chosen match is decided by operator preference.
    • If theOPTIONS clause is not present oruse_longest_match isFALSE, then the match is chosen based on operator preference.
  2. Operator preference.

    • The| operator gives preference to the left operand over the rightoperand.
    • Greedy quantifiers, which include?,*,+,{m,},{,n}, and{m,n}, give preference to matches that repeat the operand more timesover matches that repeat it fewer times.
    • Reluctant quantifiers, which include??,*?,{m,}?,{,n}?, and{m,n}?, give preference to matches that repeat the operand fewertimes over matches that repeat it more times.

When operator preferences in different parts of the pattern conflict with eachother, preferences that occur earlier in the match take priority over those thatoccur later in the match. For example, the patternA* B+ first prioritizesincreasing the number of rows that match theA symbol, even if thatresults in fewer rows that match theB symbol. In other words,AAB ischosen overABB.

The pattern(A|B B)+ prioritizes starting the match withAB rather thanBB, even if that results in fewer repetitions overall.

Examples

In the following example, pattern matches are single rows. The pattern matchesthelow symbol ifx is less than or equal to 2. The pattern matches thehigh symbol ifx is greater than or equal to 2. There are three matches,one for each row:

  • The first row of input only satisfies thelow symbol expression, sox = 1 contributes to thelow_agg aggregation but not thehigh_aggaggregation.
  • In the second row of input,x satisfies both symbol expressions, but thelow symbol takes precedence because in the| operator, preference isgiven to the symbol on the left. Therefore,x = 2 contributes only tothelow_agg aggregation.
  • The third row of input only satisfies thehigh symbol expression, sox = 3 contributes to thehigh_agg aggregation but not thelow_aggaggregation.
SELECT*FROM(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3)MATCH_RECOGNIZE(ORDERBYxMEASURESARRAY_AGG(high.x)AShigh_agg,ARRAY_AGG(low.x)ASlow_aggAFTERMATCHSKIPTONEXTROWPATTERN(low|high)DEFINElowASx<=2,highASx>=2);/*----------+---------+ | high_agg | low_agg | +----------+---------+ | NULL     | [1]     | | NULL     | [2]     | | [3]      | NULL    | +----------+---------*/

The following example is similar to the preceding example, except that thepattern is changed tolow | high+ and theuse_longest_match option is set toTRUE. There are three potential pattern matches that start at the second rowof input:

  1. Match the second row using thelow symbol.
  2. Match the second row using thehigh symbol.
  3. Match the second row using thehigh symbol and the third row using thehigh symbol.

The third option is chosen because it's strictly longest. Since there is notie for length, operator preference isn't considered.

SELECT*FROM(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3)MATCH_RECOGNIZE(ORDERBYxMEASURESARRAY_AGG(high.x)AShigh_agg,ARRAY_AGG(low.x)ASlow_aggAFTERMATCHSKIPTONEXTROWPATTERN(low|high+)DEFINElowASx<=2,highASx>=2OPTIONS(use_longest_match=TRUE));/*----------+---------+ | high_agg | low_agg | +----------+---------+ | NULL     | [1]     | | [2,3]    | NULL    | | [3]      | NULL    | +----------+---------*/

The following examples reference a table calledSales:

WITHSalesAS(SELECT'Daisy'AScustomer,DATE'2024-01-03'ASsale_date,'Electronics'ASproduct_category,500ASamountUNIONALLSELECT'Daisy',DATE'2024-01-04','Software',30UNIONALLSELECT'Ian',DATE'2024-02-01','Books',20UNIONALLSELECT'Ian',DATE'2024-02-08','Clothing',30UNIONALLSELECT'Ian',DATE'2024-02-10','Clothing',90UNIONALLSELECT'Daisy',DATE'2024-03-15','Software',40UNIONALLSELECT'Ian',DATE'2024-03-15','Electronics',300UNIONALLSELECT'Ian',DATE'2024-03-15','Electronics',400UNIONALLSELECT'Ian',DATE'2024-03-21','Software',30UNIONALLSELECT'Ian',DATE'2024-04-07','Books',50UNIONALLSELECT'Daisy',DATE'2024-06-28','Electronics',400UNIONALLSELECT'Daisy',DATE'2024-06-29','Clothing',100UNIONALLSELECT'Daisy',DATE'2024-06-30','Software',30UNIONALLSELECT'Ian',DATE'2024-07-07','Clothing',110)SELECT*FROMSales;/*----------+------------+------------------+--------+ | customer | sale_date  | product_category | amount | +----------+------------+------------------+--------+ | Daisy    | 2024-01-03 | Electronics      | 500    | | Daisy    | 2024-01-04 | Software         | 30     | | Ian      | 2024-02-01 | Books            | 20     | | Ian      | 2024-02-08 | Clothing         | 30     | | Ian      | 2024-02-10 | Clothing         | 90     | | Daisy    | 2024-03-15 | Software         | 40     | | Ian      | 2024-03-15 | Electronics      | 300    | | Ian      | 2024-03-15 | Electronics      | 400    | | Ian      | 2024-03-21 | Software         | 30     | | Ian      | 2024-04-07 | Books            | 50     | | Daisy    | 2024-06-28 | Electronics      | 400    | | Daisy    | 2024-06-29 | Clothing         | 100    | | Daisy    | 2024-06-30 | Software         | 30     | | Ian      | 2024-07-07 | Clothing         | 110    | +----------+------------+------------------+--------*/

The following example finds electronics purchases, followed by any number ofother purchases of other types, followed by software purchases. TheMEASURESclause aggregatesthe data in each match and computes total sales and software sales:

SELECT*FROMSalesMATCH_RECOGNIZE(PARTITIONBYcustomerORDERBYsale_dateMEASURESARRAY_AGG(STRUCT(sale_date,product_category,amount))ASsales,SUM(amount)AStotal_sale_amount,SUM(software.amount)ASsoftware_sale_amountPATTERN(electronics+any_category*?software+)DEFINEelectronicsASproduct_category='Electronics',softwareASproduct_category='Software',any_categoryASTRUE);/*----------+-----------------+------------------------+--------------+-------------------+----------------------+ | customer | sales.sale_date | sales.product_category | sales.amount | total_sale_amount | software_sale_amount | +----------+-----------------+------------------------+--------------+-------------------+----------------------+ | Daisy    | 2024-01-03      | Electronics            | 500          | 570               | 70                   | |          | 2024-01-04      | Software               | 30           |                   |                      | |          | 2024-03-15      | Software               | 40           |                   |                      | | Daisy    | 2024-06-28      | Electronics            | 400          | 530               | 30                   | |          | 2024-06-29      | Clothing               | 100          |                   |                      | |          | 2024-06-30      | Software               | 30           |                   |                      | | Ian      | 2024-03-15      | Electronics            | 300          | 730               | 30                   | |          | 2024-03-15      | Electronics            | 400          |                   |                      | |          | 2024-03-21      | Software               | 30           |                   |                      | +----------+-----------------+------------------------+--------------+-------------------+----------------------*/

The following example, like the previous example, matches electronicspurchases that were eventually followed by software purchases. The queryuses theLAST andFIRST functions that are uniqueto theMEASURES clause to compute the number of days between the finalelectronics purchase in a match and the first following software purchase:

SELECT*FROMSalesMATCH_RECOGNIZE(PARTITIONBYcustomerORDERBYsale_dateMEASURESLAST(electronics.sale_date)ASlast_electronics_sale_date,FIRST(software.sale_date)ASfirst_software_sale_date,DATE_DIFF(FIRST(software.sale_date),LAST(electronics.sale_date),day)ASdate_gapPATTERN(electronics+any_category*?software+)DEFINEelectronicsASproduct_category='Electronics',softwareASproduct_category='Software',any_categoryASTRUE);/*----------+----------------------------+--------------------------+----------+ | customer | last_electronics_sale_date | first_software_sale_date | date_gap | +----------+----------------------------+--------------------------+----------+ | Daisy    | 2024-01-03                 | 2024-01-04               | 1        | | Daisy    | 2024-06-28                 | 2024-06-30               | 2        | | Ian      | 2024-03-15                 | 2024-03-21               | 6        | +----------+----------------------------+--------------------------+----------*/

The following example matches a sequence of rows whereamount is less than50, followed by rows whereamount is between 50 and 100, followed by rowswhereamount is greater than 100. The query uses theMATCH_NUMBER,MATCH_ROW_NUMBER, andCLASSIFIER functions in theMEASURESclause to identify matches and their symbols in the results.

SELECT*FROMSalesMATCH_RECOGNIZE(PARTITIONBYcustomerORDERBYsale_dateMEASURESMATCH_NUMBER()ASmatch_number,ARRAY_AGG(STRUCT(MATCH_ROW_NUMBER()ASrow,CLASSIFIER()ASsymbol,sale_date,product_category))ASsalesPATTERN(low+mid+high+)DEFINElowASamount <50,midASamount>=50ANDamount<=100,highASamount >100);/*----------+--------------+-----------+--------------+-----------------+------------------------+ | customer | match_number | sales.row | sales.symbol | sales.sale_date | sales.product_category | +----------+--------------+-----------+--------------+-----------------+------------------------+ | Ian      | 1            | 1         | low          | 2024-02-01      | Books                  | |          |              | 2         | low          | 2024-02-08      | Clothing               | |          |              | 3         | mid          | 2024-02-10      | Clothing               | |          |              | 4         | high         | 2024-03-15      | Electronics            | |          |              | 5         | high         | 2024-03-15      | Electronics            | | Ian      | 2            | 1         | low          | 2024-03-21      | Software               | |          |              | 2         | mid          | 2024-04-07      | Books                  | |          |              | 3         | high         | 2024-07-07      | Clothing               | +----------+--------------+-----------+--------------+-----------------+------------------------*/

The following example is similar to the previous one, except it allowsoverlapping matches:

SELECT*FROMSalesMATCH_RECOGNIZE(PARTITIONBYcustomerORDERBYsale_dateMEASURESMATCH_NUMBER()ASmatch_number,ARRAY_AGG(STRUCT(MATCH_ROW_NUMBER()ASrow,CLASSIFIER()ASsymbol,sale_date,product_category))ASsalesAFTERMATCHSKIPTONEXTROWPATTERN(low+mid+high+)DEFINElowASamount <50,midASamount>=50ANDamount<=100,highASamount >100);/*----------+--------------+-----------+--------------+-----------------+------------------------+ | customer | match_number | sales.row | sales.symbol | sales.sale_date | sales.product_category | +----------+--------------+-----------+--------------+-----------------+------------------------+ | Ian      | 1            | 1         | low          | 2024-02-01      | Books                  | |          |              | 2         | low          | 2024-02-08      | Clothing               | |          |              | 3         | mid          | 2024-02-10      | Clothing               | |          |              | 4         | high         | 2024-03-15      | Electronics            | |          |              | 5         | high         | 2024-03-15      | Electronics            | | Ian      | 2            | 1         | low          | 2024-02-08      | Clothing               | |          |              | 2         | mid          | 2024-02-10      | Clothing               | |          |              | 3         | high         | 2024-03-15      | Electronics            | |          |              | 4         | high         | 2024-03-15      | Electronics            | | Ian      | 3            | 1         | low          | 2024-03-21      | Software               | |          |              | 2         | mid          | 2024-04-07      | Books                  | |          |              | 3         | high         | 2024-07-07      | Clothing               | +----------+--------------+-----------+--------------+-----------------+------------------------*/

Best practices

To scale the performance of queries that contain theMATCH_RECOGNIZE clause,use the following best practices:

  • Use thePARTITION BY clause.
    • Within each partition, rows are processed one at a time. Separatepartitions can be processed in parallel, subject to slot availability.
    • TheMATCH_RECOGNIZE clause generally runs faster when you usea larger number of smaller partitions, rather than fewer largepartitions.
    • Individual partitions which are too large, typically exceeding1 million rows, might cause anOut of memory error.
  • Use simple patterns.
    • Avoid patterns with a large number of operators.
    • Avoid bounded quantifiers with large bound values. For example,matching the patternsA+ orA* is faster than matchingA{,100}.
  • Use theORDER BY clause to guarantee the order in which the input rowsare processed. Otherwise theMATCH_RECOGNIZE clause can producenon-deterministic results.

For more example queries, see theMATCH_RECOGNIZE notebook tutorial.

Join operation

join_operation:  {cross_join_operation |condition_join_operation }cross_join_operation:from_itemcross_join_operatorfrom_itemcondition_join_operation:from_itemcondition_join_operatorfrom_itemjoin_conditioncross_join_operator:  {CROSS JOIN |, }condition_join_operator:  {[INNER] JOIN    |FULL [OUTER] JOIN    |LEFT [OUTER] JOIN    |RIGHT [OUTER] JOIN  }join_condition:  {on_clause |using_clause }on_clause:  ONbool_expressionusing_clause:  USING (column_list )

TheJOIN operation merges twofrom_items so that theSELECT clause canquery them as one source. The join operator and join condition specify how tocombine and discard rows from the twofrom_items to form a single source.

[INNER] JOIN

AnINNER JOIN, or simplyJOIN, effectively calculates the Cartesian productof the twofrom_items and discards all rows that don't meet the joincondition.Effectively means that it's possible to implement anINNER JOINwithout actually calculating the Cartesian product.

FROMAINNERJOINBONA.w=B.y/*Table A       Table B       Result+-------+     +-------+     +---------------+| w | x |  *  | y | z |  =  | w | x | y | z |+-------+     +-------+     +---------------+| 1 | a |     | 2 | k |     | 2 | b | 2 | k || 2 | b |     | 3 | m |     | 3 | c | 3 | m || 3 | c |     | 3 | n |     | 3 | c | 3 | n || 3 | d |     | 4 | p |     | 3 | d | 3 | m |+-------+     +-------+     | 3 | d | 3 | n |                            +---------------+*/
FROMAINNERJOINBUSING(x)/*Table A       Table B       Result+-------+     +-------+     +-----------+| x | y |  *  | x | z |  =  | x | y | z |+-------+     +-------+     +-----------+| 1 | a |     | 2 | k |     | 2 | b | k || 2 | b |     | 3 | m |     | 3 | c | m || 3 | c |     | 3 | n |     | 3 | c | n || 3 | d |     | 4 | p |     | 3 | d | m |+-------+     +-------+     | 3 | d | n |                            +-----------+*/

Example

This query performs anINNER JOIN on theRosterandTeamMascot tables.

SELECTRoster.LastName,TeamMascot.MascotFROMRosterJOINTeamMascotONRoster.SchoolID=TeamMascot.SchoolID;/*---------------------------+ | LastName   | Mascot       | +---------------------------+ | Adams      | Jaguars      | | Buchanan   | Lakers       | | Coolidge   | Lakers       | | Davis      | Knights      | +---------------------------*/

You can use acorrelatedINNER JOIN to flatten an arrayinto a set of rows. To learn more, seeConvert elements in an array to rows in a table.

CROSS JOIN

CROSS JOIN returns the Cartesian product of the twofrom_items. In otherwords, it combines each row from the firstfrom_item with each row from thesecondfrom_item.

If the rows of the twofrom_items are independent, then the result hasM * N rows, givenM rows in onefrom_item andN in the other. Note thatthis still holds for the case when eitherfrom_item has zero rows.

In aFROM clause, aCROSS JOIN can be written like this:

FROMACROSSJOINB/*Table A       Table B       Result+-------+     +-------+     +---------------+| w | x |  *  | y | z |  =  | w | x | y | z |+-------+     +-------+     +---------------+| 1 | a |     | 2 | c |     | 1 | a | 2 | c || 2 | b |     | 3 | d |     | 1 | a | 3 | d |+-------+     +-------+     | 2 | b | 2 | c |                            | 2 | b | 3 | d |                            +---------------+*/

You can use acorrelated cross join to convert orflatten an array into a set of rows, though the (equivalent)INNER JOIN ispreferred overCROSS JOIN for this case. To learn more, seeConvert elements in an array to rows in a table.

Examples

This query performs anCROSS JOIN on theRosterandTeamMascot tables.

SELECTRoster.LastName,TeamMascot.MascotFROMRosterCROSSJOINTeamMascot;/*---------------------------+ | LastName   | Mascot       | +---------------------------+ | Adams      | Jaguars      | | Adams      | Knights      | | Adams      | Lakers       | | Adams      | Mustangs     | | Buchanan   | Jaguars      | | Buchanan   | Knights      | | Buchanan   | Lakers       | | Buchanan   | Mustangs     | | ...                       | +---------------------------*/

Comma cross join (,)

CROSS JOINs can be written implicitly with a comma. This iscalled a comma cross join.

A comma cross join looks like this in aFROM clause:

FROMA,B/*Table A       Table B       Result+-------+     +-------+     +---------------+| w | x |  *  | y | z |  =  | w | x | y | z |+-------+     +-------+     +---------------+| 1 | a |     | 2 | c |     | 1 | a | 2 | c || 2 | b |     | 3 | d |     | 1 | a | 3 | d |+-------+     +-------+     | 2 | b | 2 | c |                            | 2 | b | 3 | d |                            +---------------+*/

You can't write comma cross joins inside parentheses. To learn more, seeJoin operations in a sequence.

FROM(A,B)//INVALID

You can use acorrelated comma cross join to convert orflatten an array into a set of rows. To learn more, seeConvert elements in an array to rows in a table.

Examples

This query performs a comma cross join on theRosterandTeamMascot tables.

SELECTRoster.LastName,TeamMascot.MascotFROMRoster,TeamMascot;/*---------------------------+ | LastName   | Mascot       | +---------------------------+ | Adams      | Jaguars      | | Adams      | Knights      | | Adams      | Lakers       | | Adams      | Mustangs     | | Buchanan   | Jaguars      | | Buchanan   | Knights      | | Buchanan   | Lakers       | | Buchanan   | Mustangs     | | ...                       | +---------------------------*/

FULL [OUTER] JOIN

AFULL OUTER JOIN (or simplyFULL JOIN) returns all fields for all matchingrows in bothfrom_items that meet the join condition. If a given row from onefrom_item doesn't join to any row in the otherfrom_item, the row returnswithNULL values for all columns from the otherfrom_item.

FROMAFULLOUTERJOINBONA.w=B.y/*Table A       Table B       Result+-------+     +-------+     +---------------------------+| w | x |  *  | y | z |  =  | w    | x    | y    | z    |+-------+     +-------+     +---------------------------+| 1 | a |     | 2 | k |     | 1    | a    | NULL | NULL || 2 | b |     | 3 | m |     | 2    | b    | 2    | k    || 3 | c |     | 3 | n |     | 3    | c    | 3    | m    || 3 | d |     | 4 | p |     | 3    | c    | 3    | n    |+-------+     +-------+     | 3    | d    | 3    | m    |                            | 3    | d    | 3    | n    |                            | NULL | NULL | 4    | p    |                            +---------------------------+*/
FROMAFULLOUTERJOINBUSING(x)/*Table A       Table B       Result+-------+     +-------+     +--------------------+| x | y |  *  | x | z |  =  | x    | y    | z    |+-------+     +-------+     +--------------------+| 1 | a |     | 2 | k |     | 1    | a    | NULL || 2 | b |     | 3 | m |     | 2    | b    | k    || 3 | c |     | 3 | n |     | 3    | c    | m    || 3 | d |     | 4 | p |     | 3    | c    | n    |+-------+     +-------+     | 3    | d    | m    |                            | 3    | d    | n    |                            | 4    | NULL | p    |                            +--------------------+*/

Example

This query performs aFULL JOIN on theRosterandTeamMascot tables.

SELECTRoster.LastName,TeamMascot.MascotFROMRosterFULLJOINTeamMascotONRoster.SchoolID=TeamMascot.SchoolID;/*---------------------------+ | LastName   | Mascot       | +---------------------------+ | Adams      | Jaguars      | | Buchanan   | Lakers       | | Coolidge   | Lakers       | | Davis      | Knights      | | Eisenhower | NULL         | | NULL       | Mustangs     | +---------------------------*/

LEFT [OUTER] JOIN

The result of aLEFT OUTER JOIN (or simplyLEFT JOIN) for twofrom_items always retains all rows of the leftfrom_item in theJOIN operation, even if no rows in the rightfrom_item satisfy the joinpredicate.

All rows from theleftfrom_item are retained;if a given row from the leftfrom_item doesn't join to any rowin therightfrom_item, the row will return withNULL values for allcolumns exclusively from the rightfrom_item. Rows from the rightfrom_item that don't join to any row in the leftfrom_item are discarded.

FROMALEFTOUTERJOINBONA.w=B.y/*Table A       Table B       Result+-------+     +-------+     +---------------------------+| w | x |  *  | y | z |  =  | w    | x    | y    | z    |+-------+     +-------+     +---------------------------+| 1 | a |     | 2 | k |     | 1    | a    | NULL | NULL || 2 | b |     | 3 | m |     | 2    | b    | 2    | k    || 3 | c |     | 3 | n |     | 3    | c    | 3    | m    || 3 | d |     | 4 | p |     | 3    | c    | 3    | n    |+-------+     +-------+     | 3    | d    | 3    | m    |                            | 3    | d    | 3    | n    |                            +---------------------------+*/
FROMALEFTOUTERJOINBUSING(x)/*Table A       Table B       Result+-------+     +-------+     +--------------------+| x | y |  *  | x | z |  =  | x    | y    | z    |+-------+     +-------+     +--------------------+| 1 | a |     | 2 | k |     | 1    | a    | NULL || 2 | b |     | 3 | m |     | 2    | b    | k    || 3 | c |     | 3 | n |     | 3    | c    | m    || 3 | d |     | 4 | p |     | 3    | c    | n    |+-------+     +-------+     | 3    | d    | m    |                            | 3    | d    | n    |                            +--------------------+*/

Example

This query performs aLEFT JOIN on theRosterandTeamMascot tables.

SELECTRoster.LastName,TeamMascot.MascotFROMRosterLEFTJOINTeamMascotONRoster.SchoolID=TeamMascot.SchoolID;/*---------------------------+ | LastName   | Mascot       | +---------------------------+ | Adams      | Jaguars      | | Buchanan   | Lakers       | | Coolidge   | Lakers       | | Davis      | Knights      | | Eisenhower | NULL         | +---------------------------*/

RIGHT [OUTER] JOIN

The result of aRIGHT OUTER JOIN (or simplyRIGHT JOIN) for twofrom_items always retains all rows of the rightfrom_item in theJOIN operation, even if no rows in the leftfrom_item satisfy the joinpredicate.

All rows from therightfrom_item are returned;if a given row from the rightfrom_item doesn't join to any rowin theleftfrom_item, the row will return withNULL values for allcolumns exclusively from the leftfrom_item. Rows from the leftfrom_itemthat don't join to any row in the rightfrom_item are discarded.

FROMARIGHTOUTERJOINBONA.w=B.y/*Table A       Table B       Result+-------+     +-------+     +---------------------------+| w | x |  *  | y | z |  =  | w    | x    | y    | z    |+-------+     +-------+     +---------------------------+| 1 | a |     | 2 | k |     | 2    | b    | 2    | k    || 2 | b |     | 3 | m |     | 3    | c    | 3    | m    || 3 | c |     | 3 | n |     | 3    | c    | 3    | n    || 3 | d |     | 4 | p |     | 3    | d    | 3    | m    |+-------+     +-------+     | 3    | d    | 3    | n    |                            | NULL | NULL | 4    | p    |                            +---------------------------+*/
FROMARIGHTOUTERJOINBUSING(x)/*Table A       Table B       Result+-------+     +-------+     +--------------------+| x | y |  *  | x | z |  =  | x    | y    | z    |+-------+     +-------+     +--------------------+| 1 | a |     | 2 | k |     | 2    | b    | k    || 2 | b |     | 3 | m |     | 3    | c    | m    || 3 | c |     | 3 | n |     | 3    | c    | n    || 3 | d |     | 4 | p |     | 3    | d    | m    |+-------+     +-------+     | 3    | d    | n    |                            | 4    | NULL | p    |                            +--------------------+*/

Example

This query performs aRIGHT JOIN on theRosterandTeamMascot tables.

SELECTRoster.LastName,TeamMascot.MascotFROMRosterRIGHTJOINTeamMascotONRoster.SchoolID=TeamMascot.SchoolID;/*---------------------------+ | LastName   | Mascot       | +---------------------------+ | Adams      | Jaguars      | | Buchanan   | Lakers       | | Coolidge   | Lakers       | | Davis      | Knights      | | NULL       | Mustangs     | +---------------------------*/

Join conditions

In ajoin operation, a join condition helps specify how tocombine rows in twofrom_items to form a single source.

The two types of join conditions are theON clause andUSING clause. You must use a join condition when you perform aconditional join operation. You can't use a join condition when you perform across join operation.

ON clause

ONbool_expression

Description

Given a row from each table, if theON clause evaluates toTRUE, the querygenerates a consolidated row with the result of combining the given rows.

Definitions:

  • bool_expression: The boolean expression that specifies the condition forthe join. This is frequently acomparison operation orlogical combination of comparison operators.

Details:

Similarly toCROSS JOIN,ON produces a column once for each column in eachinput table.

ANULL join condition evaluation is equivalent to aFALSE evaluation.

If a column-order sensitive operation such asUNION orSELECT * is used withtheON join condition, the resulting table contains all of the columns fromthe left input in order, and then all of the columns from the right input inorder.

Examples

The following examples show how to use theON clause:

WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4)SELECT*FROMAINNERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4)SELECTA.x,B.xFROMAINNERJOINBONA.x=B.x;/*Table A   Table B   Result (A.x, B.x)+---+     +---+     +-------+| x |  *  | x |  =  | x | x |+---+     +---+     +-------+| 1 |     | 2 |     | 2 | 2 || 2 |     | 3 |     | 3 | 3 || 3 |     | 4 |     +-------++---+     +---+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECT*FROMALEFTOUTERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTA.x,B.xFROMALEFTOUTERJOINBONA.x=B.x;/*Table A    Table B   Result+------+   +---+     +-------------+| x    | * | x |  =  | x    | x    |+------+   +---+     +-------------+| 1    |   | 2 |     | 1    | NULL || 2    |   | 3 |     | 2    | 2    || 3    |   | 4 |     | 3    | 3    || NULL |   | 5 |     | NULL | NULL |+------+   +---+     +-------------+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECT*FROMAFULLOUTERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTA.x,B.xFROMAFULLOUTERJOINBONA.x=B.x;/*Table A    Table B   Result+------+   +---+     +-------------+| x    | * | x |  =  | x    | x    |+------+   +---+     +-------------+| 1    |   | 2 |     | 1    | NULL || 2    |   | 3 |     | 2    | 2    || 3    |   | 4 |     | 3    | 3    || NULL |   | 5 |     | NULL | NULL |+------+   +---+     | NULL | 4    |                     | NULL | 5    |                     +-------------+*/

USING clause

USING(column_name_list)column_name_list:column_name[,...]

Description

When you are joining two tables,USING performs anequality comparison operation on the columns named incolumn_name_list. Each column name incolumn_name_list must appear in bothinput tables. For each pair of rows from the input tables, if theequality comparisons all evaluate toTRUE, one row is added to the resultingcolumn.

Definitions:

  • column_name_list: A list of columns to include in the join condition.
  • column_name: The column that exists in both of the tables that you arejoining.

Details:

ANULL join condition evaluation is equivalent to aFALSE evaluation.

If a column-order sensitive operation such asUNION orSELECT * is usedwith theUSING join condition, the resulting table contains columns in thisorder:

  • The columns fromcolumn_name_list in the order they appear in theUSINGclause.
  • All other columns of the left input in the order they appear in the input.
  • All other columns of the right input in the order they appear in the input.

A column name in theUSING clause must not be qualified by atable name.

If the join is anINNER JOIN or aLEFT OUTER JOIN, the outputcolumns are populated from the values in the first table. If thejoin is aRIGHT OUTER JOIN, the output columns are populated from the valuesin the second table. If the join is aFULL OUTER JOIN, the output columnsare populated bycoalescing the values from the left and righttables in that order.

Examples

The following example shows how to use theUSING clause with onecolumn name in the column name list:

WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT9UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT9UNIONALLSELECT9UNIONALLSELECT5)SELECT*FROMAINNERJOINBUSING(x);/*Table A    Table B   Result+------+   +---+     +---+| x    | * | x |  =  | x |+------+   +---+     +---+| 1    |   | 2 |     | 2 || 2    |   | 9 |     | 9 || 9    |   | 9 |     | 9 || NULL |   | 5 |     +---++------+   +---+*/

The following example shows how to use theUSING clause withmultiple column names in the column name list:

WITHAAS(SELECT1asx,15asyUNIONALLSELECT2,10UNIONALLSELECT9,16UNIONALLSELECTNULL,12),BAS(SELECT2asx,10asyUNIONALLSELECT9,17UNIONALLSELECT9,16UNIONALLSELECT5,15)SELECT*FROMAINNERJOINBUSING(x,y);/*Table A         Table B        Result+-----------+   +---------+     +---------+| x    | y  | * | x  | y  |  =  | x  | y  |+-----------+   +---------+     +---------+| 1    | 15 |   | 2  | 10 |     | 2  | 10 || 2    | 10 |   | 9  | 17 |     | 9  | 16 || 9    | 16 |   | 9  | 16 |     +---------+| NULL | 12 |   | 5  | 15 |+-----------+   +---------+*/

The following examples show additional ways in which to use theUSING clausewith one column name in the column name list:

WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT9UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT9UNIONALLSELECT9UNIONALLSELECT5)SELECTx,A.x,B.xFROMAINNERJOINBUSING(x)/*Table A    Table B   Result+------+   +---+     +--------------------+| x    | * | x |  =  | x    | A.x  | B.x  |+------+   +---+     +--------------------+| 1    |   | 2 |     | 2    | 2    | 2    || 2    |   | 9 |     | 9    | 9    | 9    || 9    |   | 9 |     | 9    | 9    | 9    || NULL |   | 5 |     +--------------------++------+   +---+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT9UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT9UNIONALLSELECT9UNIONALLSELECT5)SELECTx,A.x,B.xFROMALEFTOUTERJOINBUSING(x)/*Table A    Table B   Result+------+   +---+     +--------------------+| x    | * | x |  =  | x    | A.x  | B.x  |+------+   +---+     +--------------------+| 1    |   | 2 |     | 1    | 1    | NULL || 2    |   | 9 |     | 2    | 2    | 2    || 9    |   | 9 |     | 9    | 9    | 9    || NULL |   | 5 |     | 9    | 9    | 9    |+------+   +---+     | NULL | NULL | NULL |                     +--------------------+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT2UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT9UNIONALLSELECT9UNIONALLSELECT5)SELECTx,A.x,B.xFROMARIGHTOUTERJOINBUSING(x)/*Table A    Table B   Result+------+   +---+     +--------------------+| x    | * | x |  =  | x    | A.x  | B.x  |+------+   +---+     +--------------------+| 1    |   | 2 |     | 2    | 2    | 2    || 2    |   | 9 |     | 2    | 2    | 2    || 2    |   | 9 |     | 9    | NULL | 9    || NULL |   | 5 |     | 9    | NULL | 9    |+------+   +---+     | 5    | NULL | 5    |                     +--------------------+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT2UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT9UNIONALLSELECT9UNIONALLSELECT5)SELECTx,A.x,B.xFROMAFULLOUTERJOINBUSING(x);/*Table A    Table B   Result+------+   +---+     +--------------------+| x    | * | x |  =  | x    | A.x  | B.x  |+------+   +---+     +--------------------+| 1    |   | 2 |     | 1    | 1    | NULL || 2    |   | 9 |     | 2    | 2    | 2    || 2    |   | 9 |     | 2    | 2    | 2    || NULL |   | 5 |     | NULL | NULL | NULL |+------+   +---+     | 9    | NULL | 9    |                     | 9    | NULL | 9    |                     | 5    | NULL | 5    |                     +--------------------+*/

The following example shows how to use theUSING clause withonly some column names in the column name list.

WITHAAS(SELECT1asx,15asyUNIONALLSELECT2,10UNIONALLSELECT9,16UNIONALLSELECTNULL,12),BAS(SELECT2asx,10asyUNIONALLSELECT9,17UNIONALLSELECT9,16UNIONALLSELECT5,15)SELECT*FROMAINNERJOINBUSING(x);/*Table A         Table B         Result+-----------+   +---------+     +-----------------+| x    | y  | * | x  | y  |  =  | x   | A.y | B.y |+-----------+   +---------+     +-----------------+| 1    | 15 |   | 2  | 10 |     | 2   | 10  | 10  || 2    | 10 |   | 9  | 17 |     | 9   | 16  | 17  || 9    | 16 |   | 9  | 16 |     | 9   | 16  | 16  || NULL | 12 |   | 5  | 15 |     +-----------------++-----------+   +---------+*/

The following query performs anINNER JOIN on theRoster andTeamMascot table.The query returns the rows fromRoster andTeamMascot whereRoster.SchoolID is the same asTeamMascot.SchoolID. The results include asingleSchoolID column.

SELECT*FROMRosterINNERJOINTeamMascotUSING(SchoolID);/*----------------------------------------+ | SchoolID   | LastName   | Mascot       | +----------------------------------------+ | 50         | Adams      | Jaguars      | | 52         | Buchanan   | Lakers       | | 52         | Coolidge   | Lakers       | | 51         | Davis      | Knights      | +----------------------------------------*/

ON andUSING equivalency

TheON andUSING join conditions aren'tequivalent, but they share some rules and sometimes they can produce similarresults.

In the following examples, observe what is returned when all rowsare produced for inner and outer joins. Also, look at howeach join condition handlesNULL values.

WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4)SELECT*FROMAINNERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4)SELECT*FROMAINNERJOINBUSING(x);/*Table A   Table B   Result ON     Result USING+---+     +---+     +-------+     +---+| x |  *  | x |  =  | x | x |     | x |+---+     +---+     +-------+     +---+| 1 |     | 2 |     | 2 | 2 |     | 2 || 2 |     | 3 |     | 3 | 3 |     | 3 || 3 |     | 4 |     +-------+     +---++---+     +---+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECT*FROMALEFTOUTERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECT*FROMALEFTOUTERJOINBUSING(x);/*Table A    Table B   Result ON           Result USING+------+   +---+     +-------------+     +------+| x    | * | x |  =  | x    | x    |     | x    |+------+   +---+     +-------------+     +------+| 1    |   | 2 |     | 1    | NULL |     | 1    || 2    |   | 3 |     | 2    | 2    |     | 2    || 3    |   | 4 |     | 3    | 3    |     | 3    || NULL |   | 5 |     | NULL | NULL |     | NULL |+------+   +---+     +-------------+     +------+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4)SELECT*FROMAFULLOUTERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4)SELECT*FROMAFULLOUTERJOINBUSING(x);/*Table A   Table B   Result ON           Result USING+---+     +---+     +-------------+     +---+| x |  *  | x |  =  | x    | x    |     | x |+---+     +---+     +-------------+     +---+| 1 |     | 2 |     | 1    | NULL |     | 1 || 2 |     | 3 |     | 2    | 2    |     | 2 || 3 |     | 4 |     | 3    | 3    |     | 3 |+---+     +---+     | NULL | 4    |     | 4 |                    +-------------+     +---+*/

AlthoughON andUSING aren't equivalent, they can return the sameresults in some situations if you specify the columns you want to return.

In the following examples, observe what is returned when a specific rowis produced for inner and outer joins. Also, look at how eachjoin condition handlesNULL values.

WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTA.xFROMAINNERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTxFROMAINNERJOINBUSING(x);/*Table A    Table B   Result ON     Result USING+------+   +---+     +---+         +---+| x    | * | x |  =  | x |         | x |+------+   +---+     +---+         +---+| 1    |   | 2 |     | 2 |         | 2 || 2    |   | 3 |     | 3 |         | 3 || 3    |   | 4 |     +---+         +---+| NULL |   | 5 |+------+   +---+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTA.xFROMALEFTOUTERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTxFROMALEFTOUTERJOINBUSING(x);/*Table A    Table B   Result ON    Result USING+------+   +---+     +------+     +------+| x    | * | x |  =  | x    |     | x    |+------+   +---+     +------+     +------+| 1    |   | 2 |     | 1    |     | 1    || 2    |   | 3 |     | 2    |     | 2    || 3    |   | 4 |     | 3    |     | 3    || NULL |   | 5 |     | NULL |     | NULL |+------+   +---+     +------+     +------+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTA.xFROMAFULLOUTERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTxFROMAFULLOUTERJOINBUSING(x);/*Table A    Table B   Result ON    Result USING+------+   +---+     +------+     +------+| x    | * | x |  =  | x    |     | x    |+------+   +---+     +------+     +------+| 1    |   | 2 |     | 1    |     | 1    || 2    |   | 3 |     | 2    |     | 2    || 3    |   | 4 |     | 3    |     | 3    || NULL |   | 5 |     | NULL |     | NULL |+------+   +---+     | NULL |     | 4    |                     | NULL |     | 5    |                     +------+     +------+*/
WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTB.xFROMAFULLOUTERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTxFROMAFULLOUTERJOINBUSING(x);/*Table A    Table B   Result ON    Result USING+------+   +---+     +------+     +------+| x    | * | x |  =  | x    |     | x    |+------+   +---+     +------+     +------+| 1    |   | 2 |     | 2    |     | 1    || 2    |   | 3 |     | 3    |     | 2    || 3    |   | 4 |     | NULL |     | 3    || NULL |   | 5 |     | NULL |     | NULL |+------+   +---+     | 4    |     | 4    |                     | 5    |     | 5    |                     +------+     +------+*/

In the following example, observe what is returned whenCOALESCE is usedwith theON clause. It provides the same results as a querywith theUSING clause.

WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTCOALESCE(A.x,B.x)FROMAFULLOUTERJOINBONA.x=B.x;WITHAAS(SELECT1asxUNIONALLSELECT2UNIONALLSELECT3UNIONALLSELECTNULL),BAS(SELECT2asxUNIONALLSELECT3UNIONALLSELECT4UNIONALLSELECT5)SELECTxFROMAFULLOUTERJOINBUSING(x);/*Table A    Table B   Result ON    Result USING+------+   +---+     +------+     +------+| x    | * | x |  =  | x    |     | x    |+------+   +---+     +------+     +------+| 1    |   | 2 |     | 1    |     | 1    || 2    |   | 3 |     | 2    |     | 2    || 3    |   | 4 |     | 3    |     | 3    || NULL |   | 5 |     | NULL |     | NULL |+------+   +---+     | 4    |     | 4    |                     | 5    |     | 5    |                     +------+     +------+*/

Join operations in a sequence

TheFROM clause can contain multipleJOIN operations in a sequence.JOINs are bound from left to right. For example:

FROMAJOINBUSING(x)JOINCUSING(x)-- A JOIN B USING (x)        = result_1-- result_1 JOIN C USING (x) = result_2-- result_2                  = return value

You can also insert parentheses to groupJOINs:

FROM((AJOINBUSING(x))JOINCUSING(x))-- A JOIN B USING (x)        = result_1-- result_1 JOIN C USING (x) = result_2-- result_2                  = return value

With parentheses, you can groupJOINs so that they are bound in a differentorder:

FROM(AJOIN(BJOINCUSING(x))USING(x))-- B JOIN C USING (x)       = result_1-- A JOIN result_1          = result_2-- result_2                 = return value

AFROM clause can have multiple joins. Provided there are no comma cross joinsin theFROM clause, joins don't require parenthesis, though parenthesis canhelp readability:

FROMAJOINBJOINCJOINDUSING(w)ONB.x=C.yONA.z=B.x

If your clause contains comma cross joins, you must use parentheses:

FROMA,BJOINCJOINDONC.x=D.yONB.z=C.x//INVALID
FROMA,BJOIN(CJOINDONC.x=D.y)ONB.z=C.x//VALID

When comma cross joins are present in a query with a sequence of JOINs, theygroup from left to right like otherJOIN types:

FROMAJOINBUSING(x)JOINCUSING(x),D-- A JOIN B USING (x)        = result_1-- result_1 JOIN C USING (x) = result_2-- result_2 CROSS JOIN D     = return value

There can't be aRIGHT JOIN orFULL JOIN after a comma cross join unlessit's parenthesized:

FROMA,BRIGHTJOINCONTRUE//INVALID
FROMA,BFULLJOINCONTRUE//INVALID
FROMA,BJOINCONTRUE//VALID
FROMA,(BRIGHTJOINCONTRUE)//VALID
FROMA,(BFULLJOINCONTRUE)//VALID

Correlated join operation

A join operation iscorrelated when the rightfrom_item contains areference to at least one range variable orcolumn name introduced by the leftfrom_item.

In a correlated join operation, rows from the rightfrom_item are determinedby a row from the leftfrom_item. Consequently,RIGHT OUTER andFULL OUTERjoins can't be correlated because rightfrom_item rows can't be determinedin the case when there is no row from the leftfrom_item.

All correlated join operations must reference an array in the rightfrom_item.

This is a conceptual example of a correlated join operation that includesacorrelated subquery:

FROMAJOINUNNEST(ARRAY(SELECTASSTRUCT*FROMBWHEREA.ID=B.ID))ASC
  • Leftfrom_item:A
  • Rightfrom_item:UNNEST(...) AS C
  • A correlated subquery:(SELECT AS STRUCT * FROM B WHERE A.ID = B.ID)

This is another conceptual example of a correlated join operation.array_of_IDs is part of the leftfrom_item but is referenced in therightfrom_item.

FROMAJOINUNNEST(A.array_of_IDs)ASC

TheUNNEST operator can be explicit or implicit.These are both allowed:

FROMAJOINUNNEST(A.array_of_IDs)ASIDs
FROMAJOINA.array_of_IDsASIDs

In a correlated join operation, the rightfrom_item is re-evaluatedagainst each distinct row from the leftfrom_item. In the followingconceptual example, the correlated join operation firstevaluatesA andB, thenA andC:

FROMAJOINUNNEST(ARRAY(SELECTASSTRUCT*FROMBWHEREA.ID=B.ID))ASCONA.Name=C.Name

Caveats

  • In a correlatedLEFT JOIN, when the input table on the right side is emptyfor some row from the left side, it's as if no rows from the right sidesatisfied the join condition in a regularLEFT JOIN. When there are nojoining rows, a row withNULL values for all columns on the right side isgenerated to join with the row from the left side.
  • In a correlatedCROSS JOIN, when the input table on the right side isempty for some row from the left side, it's as if no rows from the rightside satisfied the join condition in a regular correlatedINNER JOIN. Thismeans that the row is dropped from the results.

Examples

This is an example of a correlated join, using theRoster andPlayerStats tables:

SELECT*FROMRosterJOINUNNEST(ARRAY(SELECTASSTRUCT*FROMPlayerStatsWHEREPlayerStats.OpponentID=Roster.SchoolID))ASPlayerMatchesONPlayerMatches.LastName='Buchanan'/*------------+----------+----------+------------+--------------+ | LastName   | SchoolID | LastName | OpponentID | PointsScored | +------------+----------+----------+------------+--------------+ | Adams      | 50       | Buchanan | 50         | 13           | | Eisenhower | 77       | Buchanan | 77         | 0            | +------------+----------+----------+------------+--------------*/

A common pattern for a correlatedLEFT JOIN is to have anUNNEST operationon the right side that references an array from some column introduced byinput on the left side. For rows where that array is empty orNULL,theUNNEST operation produces no rows on the right input. In that case, a rowwith aNULL entry in each column of the right input is created to join withthe row from the left input. For example:

SELECTA.name,item,ARRAY_LENGTH(A.items)item_count_for_nameFROMUNNEST([STRUCT('first'ASname,[1,2,3,4]ASitems),STRUCT('second'ASname,[]ASitems)])ASALEFTJOINA.itemsASitem;/*--------+------+---------------------+ | name   | item | item_count_for_name | +--------+------+---------------------+ | first  | 1    | 4                   | | first  | 2    | 4                   | | first  | 3    | 4                   | | first  | 4    | 4                   | | second | NULL | 0                   | +--------+------+---------------------*/

In the case of a correlatedINNER JOIN orCROSS JOIN, when the input on theright side is empty for some row from the left side, the final row is droppedfrom the results. For example:

SELECTA.name,itemFROMUNNEST([STRUCT('first'ASname,[1,2,3,4]ASitems),STRUCT('second'ASname,[]ASitems)])ASAINNERJOINA.itemsASitem;/*-------+------+ | name  | item | +-------+------+ | first | 1    | | first | 2    | | first | 3    | | first | 4    | +-------+------*/

WHERE clause

WHERE bool_expression

TheWHERE clause filters the results of theFROM clause.

Only rows whosebool_expression evaluates toTRUE are included. Rowswhosebool_expression evaluates toNULL orFALSE arediscarded.

The evaluation of a query with aWHERE clause is typically completed in thisorder:

  • FROM
  • WHERE
  • GROUP BY and aggregation
  • HAVING
  • WINDOW
  • QUALIFY
  • DISTINCT
  • ORDER BY
  • LIMIT

Evaluation order doesn't always match syntax order.

TheWHERE clause only references columns available via theFROM clause;it can't referenceSELECT list aliases.

Examples

This query returns returns all rows from theRoster tablewhere theSchoolID column has the value52:

SELECT*FROMRosterWHERESchoolID=52;

Thebool_expression can contain multiple sub-conditions:

SELECT*FROMRosterWHERESTARTS_WITH(LastName,"Mc")ORSTARTS_WITH(LastName,"Mac");

Expressions in anINNER JOIN have an equivalent expression in theWHERE clause. For example, a query usingINNERJOIN andON has anequivalent expression usingCROSS JOIN andWHERE. For example,the following two queries are equivalent:

SELECTRoster.LastName,TeamMascot.MascotFROMRosterINNERJOINTeamMascotONRoster.SchoolID=TeamMascot.SchoolID;
SELECTRoster.LastName,TeamMascot.MascotFROMRosterCROSSJOINTeamMascotWHERERoster.SchoolID=TeamMascot.SchoolID;

GROUP BY clause

GROUP BYgroup_by_specificationgroup_by_specification:  {groupable_items    | ALL    |grouping_sets_specification    |rollup_specification    |cube_specification    | ()  }

Description

TheGROUP BY clause groups together rows in a table that share common valuesfor certain columns. For a group of rows in the source table withnon-distinct values, theGROUP BY clause aggregates them into a singlecombined row. This clause is commonly used when aggregate functions arepresent in theSELECT list, or to eliminate redundancy in the output.

Definitions

  • groupable_items: Group rows in a table that share common valuesfor certain columns. To learn more, seeGroup rows by groupable items.
  • ALL: Automatically group rows. To learn more, seeGroup rows automatically.
  • grouping_sets_specification: Group rows with theGROUP BY GROUPING SETS clause. To learn more, seeGroup rows byGROUPING SETS.
  • rollup_specification: Group rows with theGROUP BY ROLLUP clause.To learn more, seeGroup rows byROLLUP.
  • cube_specification: Group rows with theGROUP BY CUBE clause.To learn more, seeGroup rows byCUBE.
  • (): Group all rows and produce a grand total. Equivalent to nogroup_by_specification.

Group rows by groupable items

GROUP BYgroupable_item[, ...]groupable_item:  {value    |value_alias    |column_ordinal  }

Description

TheGROUP BY clause can includegroupable expressionsand their ordinals.

Definitions

Group rows by values

TheGROUP BY clause can group rows in a table with non-distinctvalues in theGROUP BY clause. For example:

WITHPlayerStatsAS(SELECT'Adams'asLastName,'Noam'asFirstName,3asPointsScoredUNIONALLSELECT'Buchanan','Jie',0UNIONALLSELECT'Coolidge','Kiran',1UNIONALLSELECT'Adams','Noam',4UNIONALLSELECT'Buchanan','Jie',13)SELECTSUM(PointsScored)AStotal_points,LastNameFROMPlayerStatsGROUPBYLastName;/*--------------+----------+ | total_points | LastName | +--------------+----------+ | 7            | Adams    | | 13           | Buchanan | | 1            | Coolidge | +--------------+----------*/

GROUP BY clauses may also refer to aliases. If a query contains aliases intheSELECT clause, those aliases override names in the correspondingFROMclause. For example:

WITHPlayerStatsAS(SELECT'Adams'asLastName,'Noam'asFirstName,3asPointsScoredUNIONALLSELECT'Buchanan','Jie',0UNIONALLSELECT'Coolidge','Kiran',1UNIONALLSELECT'Adams','Noam',4UNIONALLSELECT'Buchanan','Jie',13)SELECTSUM(PointsScored)AStotal_points,LastNameASlast_nameFROMPlayerStatsGROUPBYlast_name;/*--------------+-----------+ | total_points | last_name | +--------------+-----------+ | 7            | Adams     | | 13           | Buchanan  | | 1            | Coolidge  | +--------------+-----------*/

You can use theGROUP BY clause with arrays. The following query executesbecause the array elements being grouped are the same length and group type:

WITHPlayerStatsAS(SELECT['Coolidge','Adams']asName,3asPointsScoredUNIONALLSELECT['Adams','Buchanan'],0UNIONALLSELECT['Coolidge','Adams'],1UNIONALLSELECT['Kiran','Noam'],1)SELECTSUM(PointsScored)AStotal_points,nameFROMPlayerStatsGROUPBYName;/*--------------+------------------+ | total_points | name             | +--------------+------------------+ | 4            | [Coolidge,Adams] | | 0            | [Adams,Buchanan] | | 1            | [Kiran,Noam]     | +--------------+------------------*/

You can use theGROUP BY clause with structs. The following query executesbecause the struct fields being grouped have the same group types:

WITHTeamStatsAS(SELECTARRAY<STRUCT<last_nameSTRING,first_nameSTRING,ageINT64>>[('Adams','Noam',20),('Buchanan','Jie',19)]ASTeam,3ASPointsScoredUNIONALLSELECT[('Coolidge','Kiran',21),('Yang','Jason',22)],4UNIONALLSELECT[('Adams','Noam',20),('Buchanan','Jie',19)],10UNIONALLSELECT[('Coolidge','Kiran',21),('Yang','Jason',22)],7)SELECTSUM(PointsScored)AStotal_points,TeamFROMTeamStatsGROUPBYTeam;/*--------------+--------------------------+ | total_points | teams                    | +--------------+--------------------------+ | 13           | [{                       | |              |    last_name: "Adams",   | |              |    first_name: "Noam",   | |              |    age: 20               | |              |  },{                     | |              |    last_name: "Buchanan",| |              |    first_name: "Jie",    | |              |    age: 19               | |              |  }]                      | +-----------------------------------------+ | 11           | [{                       | |              |    last_name: "Coolidge",| |              |    first_name: "Kiran",  | |              |    age: 21               | |              |  },{                     | |              |    last_name: "Yang",    | |              |    first_name: "Jason",  | |              |    age: 22               | |              |  }]                      | +--------------+--------------------------*/

To learn more about the data types that are supported for values in theGROUP BY clause, seeGroupable data types.

Group rows by column ordinals

TheGROUP BY clause can refer to expression names in theSELECT list. TheGROUP BY clause also allows ordinal references to expressions in theSELECTlist, using integer values.1 refers to the first value in theSELECT list,2 the second, and so forth. The value list can combineordinals and value names. The following queries are equivalent:

WITHPlayerStatsAS(SELECT'Adams'asLastName,'Noam'asFirstName,3asPointsScoredUNIONALLSELECT'Buchanan','Jie',0UNIONALLSELECT'Coolidge','Kiran',1UNIONALLSELECT'Adams','Noam',4UNIONALLSELECT'Buchanan','Jie',13)SELECTSUM(PointsScored)AStotal_points,LastName,FirstNameFROMPlayerStatsGROUPBYLastName,FirstName;/*--------------+----------+-----------+ | total_points | LastName | FirstName | +--------------+----------+-----------+ | 7            | Adams    | Noam      | | 13           | Buchanan | Jie       | | 1            | Coolidge | Kiran     | +--------------+----------+-----------*/
WITHPlayerStatsAS(SELECT'Adams'asLastName,'Noam'asFirstName,3asPointsScoredUNIONALLSELECT'Buchanan','Jie',0UNIONALLSELECT'Coolidge','Kiran',1UNIONALLSELECT'Adams','Noam',4UNIONALLSELECT'Buchanan','Jie',13)SELECTSUM(PointsScored)AStotal_points,LastName,FirstNameFROMPlayerStatsGROUPBY2,3;/*--------------+----------+-----------+ | total_points | LastName | FirstName | +--------------+----------+-----------+ | 7            | Adams    | Noam      | | 13           | Buchanan | Jie       | | 1            | Coolidge | Kiran     | +--------------+----------+-----------*/

Group rows byALL

GROUP BY ALL

Description

TheGROUP BY ALL clause groups rows by inferring grouping keys from theSELECT items.

The followingSELECT items are excluded from theGROUP BY ALL clause:

  • Expressions that includeaggregate functions.
  • Expressions that includewindow functions.
  • Expressions that don't reference a name from theFROM clause.This includes:
    • Constants
    • Query parameters
    • Correlated column references
    • Expressions that only referenceGROUP BY keys inferred fromotherSELECT items.

After exclusions are applied, an error is produced if any remainingSELECTitem includes a volatile function or has a non-groupable type.

If the set of inferred grouping keys is empty after exclusions are applied, allinput rows are considered a single group for aggregation. Thisbehavior is equivalent to writingGROUP BY ().

Examples

In the following example, the query groups rows byfirst_name andlast_name.total_points is excluded because it represents anaggregate function.

WITHPlayerStatsAS(SELECT'Adams'asLastName,'Noam'asFirstName,3asPointsScoredUNIONALLSELECT'Buchanan','Jie',0UNIONALLSELECT'Coolidge','Kiran',1UNIONALLSELECT'Adams','Noam',4UNIONALLSELECT'Buchanan','Jie',13)SELECTSUM(PointsScored)AStotal_points,FirstNameASfirst_name,LastNameASlast_nameFROMPlayerStatsGROUPBYALL;/*--------------+------------+-----------+ | total_points | first_name | last_name | +--------------+------------+-----------+ | 7            | Noam       | Adams     | | 13           | Jie        | Buchanan  | | 1            | Kiran      | Coolidge  | +--------------+------------+-----------*/

If the select list contains an analytic function, the query groups rows byfirst_name andlast_name.total_people is excluded because itcontains a window function.

WITHPlayerStatsAS(SELECT'Adams'asLastName,'Noam'asFirstName,3asPointsScoredUNIONALLSELECT'Buchanan','Jie',0UNIONALLSELECT'Coolidge','Kiran',1UNIONALLSELECT'Adams','Noam',4UNIONALLSELECT'Buchanan','Jie',13)SELECTCOUNT(*)OVER()AStotal_people,FirstNameASfirst_name,LastNameASlast_nameFROMPlayerStatsGROUPBYALL;/*--------------+------------+-----------+ | total_people | first_name | last_name | +--------------+------------+-----------+ | 3            | Noam       | Adams     | | 3            | Jie        | Buchanan  | | 3            | Kiran      | Coolidge  | +--------------+------------+-----------*/

If multipleSELECT items reference the sameFROM item, and any of them isa path expression prefix of another, only the prefix path is used for grouping.In the following example,coordinates is excluded becausex_coordinate andy_coordinate have already referencedValues.x andValues.y in theFROM clause, and they are prefixes of the path expression used inx_coordinate:

WITHValuesAS(SELECT1ASx,2ASyUNIONALLSELECT1ASx,4ASyUNIONALLSELECT2ASx,5ASy)SELECTValues.xASx_coordinate,Values.yASy_coordinate,[Values.x,Values.y]AScoordinatesFROMValuesGROUPBYALL/*--------------+--------------+-------------+ | x_coordinate | y_coordinate | coordinates | +--------------+--------------+-------------+ | 1            | 4            | [1, 4]      | | 1            | 2            | [1, 2]      | | 2            | 5            | [2, 5]      | +--------------+--------------+-------------*/

In the following example, the inferred set of grouping keys is empty. The queryreturns one row even when the input contains zero rows.

SELECTCOUNT(*)ASnum_rowsFROMUNNEST([])GROUPBYALL/*----------+ | num_rows | +----------+ | 0        | +----------*/

Group rows byGROUPING SETS

GROUP BY GROUPING SETS (grouping_list )grouping_list:  {rollup_specification    |cube_specification    |groupable_item    |groupable_item_set  }[, ...]groupable_item_set:  ( [groupable_item[, ...] ] )

Description

TheGROUP BY GROUPING SETS clause produces aggregated data for one or moregrouping sets. A grouping set is a group of columns by which rows canbe grouped together. This clause is helpful if you want to produceaggregated data for sets of data without using theUNION operation.For example,GROUP BY GROUPING SETS(x,y) is roughly equivalent toGROUP BY x UNION ALL GROUP BY y.

Definitions

  • grouping_list: A list of items that you can add to theGROUPING SETS clause. Grouping sets are generated based upon what is inthis list.
  • rollup_specification: Group rows with theROLLUP clause.Don't includeGROUP BY if you use this inside theGROUPING SETS clause.To learn more, seeGroup rows byROLLUP.
  • cube_specification: Group rows with theCUBE clause.Don't includeGROUP BY if you use this inside theGROUPING SETS clause.To learn more, seeGroup rows byCUBE.
  • groupable_item: Group rows in a table that share common valuesfor certain columns. To learn more, seeGroup rows by groupable items.AnonymousSTRUCT values aren't allowed.
  • groupable_item_set: Group rows by a set ofgroupable items. If the set contains nogroupable items, group all rows and produce a grand total.

Details

GROUP BY GROUPING SETS works by taking a grouping list, generatinggrouping sets from it, and then producing a table as a union of queriesgrouped by each grouping set.

For example,GROUP BY GROUPING SETS (a, b, c) generates thefollowing grouping sets from the grouping list,a, b, c, andthen produces aggregated rows for each of them:

  • (a)
  • (b)
  • (c)

Here is an example that includes groupable item sets inGROUP BY GROUPING SETS (a, (b, c), d):

Conceptual grouping setsActual grouping sets
(a)(a)
((b, c))(b, c)
(d)(d)

GROUP BY GROUPING SETS can includeROLLUP andCUBE operations, whichgenerate grouping sets. IfROLLUP is added, it generates rolled upgrouping sets. IfCUBE is added, it generates grouping set permutations.

The following grouping sets are generated forGROUP BY GROUPING SETS (a, ROLLUP(b, c), d).

Conceptual grouping setsActual grouping sets
(a)(a)
((b, c))(b, c)
((b))(b)
(())()
(d)(d)

The following grouping sets are generated forGROUP BY GROUPING SETS (a, CUBE(b, c), d):

Conceptual grouping setsActual grouping sets
(a)(a)
((b, c))(b, c)
((b))(b)
((c))(c)
(())()
(d)(d)

When evaluating the results for a particular grouping set,expressions that aren't in the grouping set are aggregated and produce aNULL placeholder.

You can filter results for specific groupable items. To learn more, see theGROUPING function

GROUPING SETS allows up to 4096 groupable items.

Examples

The following queries produce the same results, butthe first one usesGROUP BY GROUPING SETS and the second one doesn't:

-- GROUP BY with GROUPING SETSWITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYGROUPINGSETS(product_type,product_name)ORDERBYproduct_name/*--------------+--------------+-------------+ | product_type | product_name | product_sum | +--------------+--------------+-------------+ | shirt        | NULL         | 36          | | pants        | NULL         | 6           | | NULL         | jeans        | 6           | | NULL         | polo         | 25          | | NULL         | t-shirt      | 11          | +--------------+--------------+-------------*/
-- GROUP BY without GROUPING SETS-- (produces the same results as GROUPING SETS)WITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,NULL,SUM(product_count)ASproduct_sumFROMProductsGROUPBYproduct_typeUNIONALLSELECTNULL,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYproduct_nameORDERBYproduct_name

You can include groupable item sets in aGROUP BY GROUPING SETS clause.In the example below,(product_type, product_name) is a groupable item set.

-- GROUP BY with GROUPING SETS and a groupable item setWITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYGROUPINGSETS(product_type,(product_type,product_name))ORDERBYproduct_type,product_name;/*--------------+--------------+-------------+ | product_type | product_name | product_sum | +--------------+--------------+-------------+ | pants        | NULL         | 6           | | pants        | jeans        | 6           | | shirt        | NULL         | 36          | | shirt        | polo         | 25          | | shirt        | t-shirt      | 11          | +--------------+--------------+-------------*/
-- GROUP BY with GROUPING SETS but without a groupable item set-- (produces the same results as GROUPING SETS with a groupable item set)WITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,NULL,SUM(product_count)ASproduct_sumFROMProductsGROUPBYproduct_typeUNIONALLSELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYproduct_type,product_nameORDERBYproduct_type,product_name;

You can includeROLLUP in aGROUP BY GROUPING SETS clause. For example:

-- GROUP BY with GROUPING SETS and ROLLUPWITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYGROUPINGSETS(product_type,ROLLUP(product_type,product_name))ORDERBYproduct_type,product_name;/*--------------+--------------+-------------+ | product_type | product_name | product_sum | +--------------+--------------+-------------+ | NULL         | NULL         | 42          | | pants        | NULL         | 6           | | pants        | NULL         | 6           | | pants        | jeans        | 6           | | shirt        | NULL         | 36          | | shirt        | NULL         | 36          | | shirt        | polo         | 25          | | shirt        | t-shirt      | 11          | +--------------+--------------+-------------*/
-- GROUP BY with GROUPING SETS, but without ROLLUP-- (produces the same results as GROUPING SETS with ROLLUP)WITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYGROUPINGSETS(product_type,(product_type,product_name),product_type,())ORDERBYproduct_type,product_name;

You can includeCUBE in aGROUP BY GROUPING SETS clause.For example:

-- GROUP BY with GROUPING SETS and CUBEWITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYGROUPINGSETS(product_type,CUBE(product_type,product_name))ORDERBYproduct_type,product_name;/*--------------+--------------+-------------+ | product_type | product_name | product_sum | +--------------+--------------+-------------+ | NULL         | NULL         | 42          | | NULL         | jeans        | 6           | | NULL         | polo         | 25          | | NULL         | t-shirt      | 11          | | pants        | NULL         | 6           | | pants        | NULL         | 6           | | pants        | jeans        | 6           | | shirt        | NULL         | 36          | | shirt        | NULL         | 36          | | shirt        | polo         | 25          | | shirt        | t-shirt      | 11          | +--------------+--------------+-------------*/
-- GROUP BY with GROUPING SETS, but without CUBE-- (produces the same results as GROUPING SETS with CUBE)WITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYGROUPINGSETS(product_type,(product_type,product_name),product_type,product_name,())ORDERBYproduct_type,product_name;

Group rows byROLLUP

GROUP BY ROLLUP (grouping_list )grouping_list:  {groupable_item |groupable_item_set }[, ...]groupable_item_set:  (groupable_item[, ...] )

Description

TheGROUP BY ROLLUP clause produces aggregated data for rolled upgrouping sets. A grouping set is a group of columns by which rows canbe grouped together. This clause is helpful if you need to roll up totalsin a set of data.

Definitions

  • grouping_list: A list of items that you can add to theGROUPING SETS clause. This is used to create a generated listof grouping sets when the query is run.
  • groupable_item: Group rows in a table that share common valuesfor certain columns. To learn more, seeGroup rows by groupable items.anonymousSTRUCT valuesaren't allowed.
  • groupable_item_set: Group rows by a subset ofgroupable items.

Details

GROUP BY ROLLUP works by taking a grouping list, generatinggrouping sets from the prefixes inside this list, and then producing atable as a union of queries grouped by each grouping set. The resultinggrouping sets include an empty grouping set. In the empty grouping set, allrows are aggregated down to a single group.

For example,GROUP BY ROLLUP (a, b, c) generates thefollowing grouping sets from the grouping list,a, b, c, and then producesaggregated rows for each of them:

  • (a, b, c)
  • (a, b)
  • (a)
  • ()

Here is an example that includes groupable item sets inGROUP BY ROLLUP (a, (b, c), d):

Conceptual grouping setsActual grouping sets
(a, (b, c), d)(a, b, c, d)
(a, (b, c))(a, b, c)
(a)(a)
()()

When evaluating the results for a particular grouping set,expressions that aren't in the grouping set are aggregated and produce aNULL placeholder.

You can filter results by specific groupable items. To learn more, see theGROUPING function

ROLLUP allows up to 4095 groupable items (equivalent to 4096 grouping sets).

Examples

The following queries produce the same subtotals and a grand total, butthe first one usesGROUP BY withROLLUP and the second one doesn't:

-- GROUP BY with ROLLUPWITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYROLLUP(product_type,product_name)ORDERBYproduct_type,product_name;/*--------------+--------------+-------------+ | product_type | product_name | product_sum | +--------------+--------------+-------------+ | NULL         | NULL         | 42          | | pants        | NULL         | 6           | | pants        | jeans        | 6           | | shirt        | NULL         | 36          | | shirt        | t-shirt      | 11          | | shirt        | polo         | 25          | +--------------+--------------+-------------*/
-- GROUP BY without ROLLUP (produces the same results as ROLLUP)WITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYproduct_type,product_nameUNIONALLSELECTproduct_type,NULL,SUM(product_count)FROMProductsGROUPBYproduct_typeUNIONALLSELECTNULL,NULL,SUM(product_count)FROMProductsORDERBYproduct_type,product_name;

You can include groupable item sets in aGROUP BY ROLLUP clause.In the following example,(product_type, product_name) is agroupable item set.

WITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYROLLUP(product_type,(product_type,product_name))ORDERBYproduct_type,product_name;/*--------------+--------------+-------------+ | product_type | product_name | product_sum | +--------------+--------------+-------------+ | NULL         | NULL         | 42          | | pants        | NULL         | 6           | | pants        | jeans        | 6           | | shirt        | NULL         | 36          | | shirt        | polo         | 25          | | shirt        | t-shirt      | 11          | +--------------+--------------+-------------*/

Group rows byCUBE

GROUP BY CUBE (grouping_list )grouping_list:  {groupable_item |groupable_item_set }[, ...]groupable_item_set:  (groupable_item[, ...] )

Description

TheGROUP BY CUBE clause produces aggregated data for allgrouping setpermutations. A grouping set is a group of columns by which rowscan be grouped together. This clause is helpful if you need to create acontingency table to find interrelationshipsbetween items in a set of data.

Definitions

  • grouping_list: A list of items that you can add to theGROUPING SETS clause. This is used to create a generated listof grouping sets when the query is run.
  • groupable_item: Group rows in a table that share common valuesfor certain columns. To learn more, seeGroup rows by groupable items.AnonymousSTRUCT values aren't allowed.
  • groupable_item_set: Group rows by a set ofgroupable items.

Details

GROUP BY CUBE is similar toGROUP BY ROLLUP, except that it takes agrouping list and generates grouping sets from all permutations in thislist, including an empty grouping set. In the empty grouping set, all rowsare aggregated down to a single group.

For example,GROUP BY CUBE (a, b, c) generates the followinggrouping sets from the grouping list,a, b, c, and then producesaggregated rows for each of them:

  • (a, b, c)
  • (a, b)
  • (a, c)
  • (a)
  • (b, c)
  • (b)
  • (c)
  • ()

Here is an example that includes groupable item sets inGROUP BY CUBE (a, (b, c), d):

Conceptual grouping setsActual grouping sets
(a, (b, c), d)(a, b, c, d)
(a, (b, c))(a, b, c)
(a, d)(a, d)
(a)(a)
((b, c), d)(b, c, d)
((b, c))(b, c)
(d)(d)
()()

When evaluating the results for a particular grouping set,expressions that aren't in the grouping set are aggregated and produce aNULL placeholder.

You can filter results by specific groupable items. To learn more, see theGROUPING function

CUBE allows up to 12 groupable items (equivalent to 4096 grouping sets).

Examples

The following query groups rows by all combinations ofproduct_type andproduct_name to produce a contingency table:

-- GROUP BY with CUBEWITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYCUBE(product_type,product_name)ORDERBYproduct_type,product_name;/*--------------+--------------+-------------+ | product_type | product_name | product_sum | +--------------+--------------+-------------+ | NULL         | NULL         | 42          | | NULL         | jeans        | 6           | | NULL         | polo         | 25          | | NULL         | t-shirt      | 11          | | pants        | NULL         | 6           | | pants        | jeans        | 6           | | shirt        | NULL         | 36          | | shirt        | polo         | 25          | | shirt        | t-shirt      | 11          | +--------------+--------------+-------------*/

You can include groupable item sets in aGROUP BY CUBE clause.In the following example,(product_type, product_name) is agroupable item set.

WITHProductsAS(SELECT'shirt'ASproduct_type,'t-shirt'ASproduct_name,3ASproduct_countUNIONALLSELECT'shirt','t-shirt',8UNIONALLSELECT'shirt','polo',25UNIONALLSELECT'pants','jeans',6)SELECTproduct_type,product_name,SUM(product_count)ASproduct_sumFROMProductsGROUPBYCUBE(product_type,(product_type,product_name))ORDERBYproduct_type,product_name;/*--------------+--------------+-------------+ | product_type | product_name | product_sum | +--------------+--------------+-------------+ | NULL         | NULL         | 42          | | pants        | NULL         | 6           | | pants        | jeans        | 6           | | pants        | jeans        | 6           | | shirt        | NULL         | 36          | | shirt        | polo         | 25          | | shirt        | polo         | 25          | | shirt        | t-shirt      | 11          | | shirt        | t-shirt      | 11          | +--------------+--------------+-------------*/

HAVING clause

HAVING bool_expression

TheHAVING clause filters the results produced byGROUP BY oraggregation.GROUP BY or aggregation must be present in the query. Ifaggregation is present, theHAVING clause is evaluated once for everyaggregated row in the result set.

Only rows whosebool_expression evaluates toTRUE are included. Rowswhosebool_expression evaluates toNULL orFALSE arediscarded.

The evaluation of a query with aHAVING clause is typically completed in thisorder:

  • FROM
  • WHERE
  • GROUP BY and aggregation
  • HAVING
  • WINDOW
  • QUALIFY
  • DISTINCT
  • ORDER BY
  • LIMIT

Evaluation order doesn't always match syntax order.

TheHAVING clause references columns available via theFROM clause, aswell asSELECT list aliases. Expressions referenced in theHAVING clausemust either appear in theGROUP BY clause or they must be the result of anaggregate function:

SELECTLastNameFROMRosterGROUPBYLastNameHAVINGSUM(PointsScored) >15;

If a query contains aliases in theSELECT clause, those aliases override namesin aFROM clause.

SELECTLastName,SUM(PointsScored)ASpsFROMRosterGROUPBYLastNameHAVINGps >0;

Mandatory aggregation

Aggregation doesn't have to be present in theHAVING clause itself, butaggregation must be present in at least one of the following forms:

Aggregation function in theSELECT list.

SELECTLastName,SUM(PointsScored)AStotalFROMPlayerStatsGROUPBYLastNameHAVINGtotal >15;

Aggregation function in theHAVING clause.

SELECTLastNameFROMPlayerStatsGROUPBYLastNameHAVINGSUM(PointsScored) >15;

Aggregation in both theSELECT list andHAVING clause.

When aggregation functions are present in both theSELECT list andHAVINGclause, the aggregation functions and the columns they reference don't needto be the same. In the example below, the two aggregation functions,COUNT() andSUM(), are different and also use different columns.

SELECTLastName,COUNT(*)FROMPlayerStatsGROUPBYLastNameHAVINGSUM(PointsScored) >15;

ORDER BY clause

ORDER BY expression  [{ ASC | DESC }]  [{ NULLS FIRST | NULLS LAST }]  [, ...]

TheORDER BY clause specifies a column or expression as the sort criterion forthe result set. If anORDER BY clause isn't present, the order of the resultsof a query isn't defined. Column aliases from aFROM clause orSELECT listare allowed. If a query contains aliases in theSELECT clause, those aliasesoverride names in the correspondingFROM clause. The data type ofexpression must beorderable.

Optional Clauses

  • NULLS FIRST | NULLS LAST:
    • NULLS FIRST: Sort null values before non-null values.
    • NULLS LAST. Sort null values after non-null values.
  • ASC | DESC: Sort the results in ascending or descendingorder ofexpression values.ASC is thedefault value. If null ordering isn't specifiedwithNULLS FIRST orNULLS LAST:
    • NULLS FIRST is applied by default if the sort order is ascending.
    • NULLS LAST is applied by default if the sort order isdescending.

Examples

Use the default sort order (ascending).

SELECTx,yFROM(SELECT1ASx,trueASyUNIONALLSELECT9,trueUNIONALLSELECTNULL,false)ORDERBYx;/*------+-------+ | x    | y     | +------+-------+ | NULL | false | | 1    | true  | | 9    | true  | +------+-------*/

Use the default sort order (ascending), but return null values last.

SELECTx,yFROM(SELECT1ASx,trueASyUNIONALLSELECT9,trueUNIONALLSELECTNULL,false)ORDERBYxNULLSLAST;/*------+-------+ | x    | y     | +------+-------+ | 1    | true  | | 9    | true  | | NULL | false | +------+-------*/

Use descending sort order.

SELECTx,yFROM(SELECT1ASx,trueASyUNIONALLSELECT9,trueUNIONALLSELECTNULL,false)ORDERBYxDESC;/*------+-------+ | x    | y     | +------+-------+ | 9    | true  | | 1    | true  | | NULL | false | +------+-------*/

Use descending sort order, but return null values first.

SELECTx,yFROM(SELECT1ASx,trueASyUNIONALLSELECT9,trueUNIONALLSELECTNULL,false)ORDERBYxDESCNULLSFIRST;/*------+-------+ | x    | y     | +------+-------+ | NULL | false | | 9    | true  | | 1    | true  | +------+-------*/

It's possible to order by multiple columns. In the example below, the resultset is ordered first bySchoolID and then byLastName:

SELECTLastName,PointsScored,OpponentIDFROMPlayerStatsORDERBYSchoolID,LastName;

When used in conjunction withset operators,theORDER BY clause applies to the result set of the entire query; it doesn'tapply only to the closestSELECT statement. For this reason, it can be helpful(though it isn't required) to use parentheses to show the scope of theORDERBY.

This query without parentheses:

SELECT*FROMRosterUNIONALLSELECT*FROMTeamMascotORDERBYSchoolID;

is equivalent to this query with parentheses:

(SELECT*FROMRosterUNIONALLSELECT*FROMTeamMascot)ORDERBYSchoolID;

but isn't equivalent to this query, where theORDER BY clause applies only tothe secondSELECT statement:

SELECT*FROMRosterUNIONALL(SELECT*FROMTeamMascotORDERBYSchoolID);

You can also use integer literals as column references inORDER BY clauses. Aninteger literal becomes an ordinal (for example, counting starts at 1) intotheSELECT list.

Example - the following two queries are equivalent:

SELECTSUM(PointsScored),LastNameFROMPlayerStatsGROUPBYLastNameORDERBYLastName;
SELECTSUM(PointsScored),LastNameFROMPlayerStatsGROUPBY2ORDERBY2;

QUALIFY clause

QUALIFY bool_expression

TheQUALIFY clause filters the results of window functions.A window function is required to be present in theQUALIFY clause or theSELECT list.

Only rows whosebool_expression evaluates toTRUE are included. Rowswhosebool_expression evaluates toNULL orFALSE arediscarded.

The evaluation of a query with aQUALIFY clause is typically completed in thisorder:

  • FROM
  • WHERE
  • GROUP BY and aggregation
  • HAVING
  • WINDOW
  • QUALIFY
  • DISTINCT
  • ORDER BY
  • LIMIT

Evaluation order doesn't always match syntax order.

Examples

The following query returns the most popular vegetables in theProduce table and their rank.

SELECTitem,RANK()OVER(PARTITIONBYcategoryORDERBYpurchasesDESC)asrankFROMProduceWHEREProduce.category='vegetable'QUALIFYrank<=3/*---------+------+ | item    | rank | +---------+------+ | kale    | 1    | | lettuce | 2    | | cabbage | 3    | +---------+------*/

You don't have to include a window function in theSELECT list to useQUALIFY. The following query returns the most popular vegetables in theProduce table.

SELECTitemFROMProduceWHEREProduce.category='vegetable'QUALIFYRANK()OVER(PARTITIONBYcategoryORDERBYpurchasesDESC)<=3/*---------+ | item    | +---------+ | kale    | | lettuce | | cabbage | +---------*/

WINDOW clause

WINDOW named_window_expression [, ...]named_window_expression:  named_window AS { named_window | ( [ window_specification ] ) }

AWINDOW clause defines a list of named windows.A named window represents a group of rows in a table upon which to use awindow function. A named window can be defined withawindow specification or reference anothernamed window. If another named window is referenced, the definition of thereferenced window must precede the referencing window.

Examples

These examples reference a table calledProduce.They all return the sameresult. Note the differentways you can combine named windows and use them in a window function'sOVER clause.

SELECTitem,purchases,category,LAST_VALUE(item)OVER(item_window)ASmost_popularFROMProduceWINDOWitem_windowAS(PARTITIONBYcategoryORDERBYpurchasesROWSBETWEEN2PRECEDINGAND2FOLLOWING)
SELECTitem,purchases,category,LAST_VALUE(item)OVER(d)ASmost_popularFROMProduceWINDOWaAS(PARTITIONBYcategory),bAS(aORDERBYpurchases),cAS(bROWSBETWEEN2PRECEDINGAND2FOLLOWING),dAS(c)
SELECTitem,purchases,category,LAST_VALUE(item)OVER(cROWSBETWEEN2PRECEDINGAND2FOLLOWING)ASmost_popularFROMProduceWINDOWaAS(PARTITIONBYcategory),bAS(aORDERBYpurchases),cASb

Set operators

query_expr  [ { INNER | [ { FULL | LEFT } [ OUTER ] ] } ]  {UNION { ALL | DISTINCT } |INTERSECT DISTINCT |EXCEPT DISTINCT  }  [ {BY NAME [ ON (column_list) ] | [ STRICT ]CORRESPONDING [ BY (column_list) ] } ]query_expr

Set operators combine or filterresults from two or more input queries into a single result set.

Definitions

  • query_expr: One of two input queries whose results are combined or filteredinto a single result set.
  • UNION: Returns the combined results of the left and right input queries.Values in columns that are matched by position are concatenated vertically.
  • INTERSECT: Returns rows that are found in the results of both the left andright input queries.
  • EXCEPT: Returns rows from the left input query that aren't present in theright input query.
  • ALL: Executes the set operation on all rows.
  • DISTINCT: Excludes duplicate rows in the set operation.
  • BY NAME,CORRESPONDING: Matchescolumns by name instead of by position. TheBY NAME modifier is equivalent toSTRICT CORRESPONDING.For details, seeBY NAME orCORRESPONDING.
  • INNER,FULL | LEFT [OUTER],STRICT,ON,BY:Adjust how theBY NAME orCORRESPONDING modifier behaves whenthe column names don't match exactly. For details, seeBY NAME orCORRESPONDING.

Positional column matching

By default, columns are matched positionally and follow these rules. If theBY NAME orCORRESPONDING modifier isused, columns are matched by name, as described in the next section.

  • Columns from input queries are matched by their position in the queries. Thatis, the first column in the first input query is paired with the first columnin the second input query and so on.
  • The input queries on each side of the operator must return the same number ofcolumns.

Name-based column matching

To make set operations match columns by name instead of by column position,use theBY NAME orCORRESPONDING modifier.

WithBY NAME orSTRICT CORRESPONDING, the same column namesmust exist in each input, but they can be in different orders. Additionalmodifiers can be used to handle cases where the columns don't exactly match.

TheBY NAME modifier is equivalent toSTRICT CORRESPONDING, buttheBY NAME modifier is recommended because it's shorter and clearer.

Example:

SELECT1ASone_digit,10AStwo_digitUNIONALLBYNAMESELECT20AStwo_digit,2ASone_digit;-- Column values match by name and not position in query./*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 1         | 10        | | 2         | 20        | +-----------+-----------*/

Other column-related rules

  • For set operations other thanUNIONALL, all column types must supportequality comparison.
  • The results of the set operation always use the column names from thefirst input query.
  • The results of the set operation always use the supertypes of input typesin corresponding columns, so paired columns must also have either the samedata type or a common supertype.

Parenthesized set operators

  • Parentheses must be used to separate different set operations.Set operations likeUNION ALL andUNION DISTINCT are considered different.
  • Parentheses are also used to group set operations and control order ofoperations. InEXCEPT set operations, for example,query results can vary depending on the operation grouping.

The following examples illustrate the use of parentheses with setoperations:

-- Same set operations, no parentheses.query1UNIONALLquery2UNIONALLquery3;
-- Different set operations, parentheses needed.query1UNIONALL(query2UNIONDISTINCTquery3);
-- Invalidquery1UNIONALLquery2UNIONDISTINCTquery3;
-- Same set operations, no parentheses.query1EXCEPTDISTINCTquery2EXCEPTDISTINCTquery3;-- Equivalent query with optional parentheses, returns same results.(query1EXCEPTDISTINCTquery2)EXCEPTDISTINCTquery3;
-- Different execution order with a subquery, parentheses needed.query1EXCEPTDISTINCT(query2EXCEPTDISTINCTquery3);

Set operator behavior with duplicate rows

Consider a given rowR that appears exactlym times in the first input queryandn times in the second input query, wherem >= 0 andn >= 0:

  • ForUNION ALL, rowR appears exactlym + n times in theresult.
  • ForUNION DISTINCT, theDISTINCTis computed after theUNION is computed, so rowR appears exactlyone time.
  • ForINTERSECT DISTINCT, rowR appears once in the output ifm > 0 andn > 0.
  • ForEXCEPT DISTINCT, rowR appears once in the output ifm > 0 andn = 0.
  • If more than two input queries are used, the above operations generalizeand the output is the same as if the input queries were combinedincrementally from left to right.

BY NAME or CORRESPONDING

Use theBY NAME orCORRESPONDING modifierwith set operations to match columns by name instead of by position.TheBY NAME modifier is equivalent toSTRICT CORRESPONDING, but theBY NAME modifier is recommended because it's shorter and clearer.You can use mode prefixes to adjust howtheBY NAME orCORRESPONDING modifierbehaves when the column names don't match exactly.

  • BY NAME: Matchescolumns by name instead of by position.
    • Both input queries must have the same set of column names, but column ordercan be different. If a column in one input query doesn't appear in the otherquery, an error is raised.
    • Input queries can't contain duplicate columns.
    • Input queries that producevalue tables aren't supported.
  • INNER: Adjusts theBY NAME modifier behavior so that columns that appearin both input queries are included in the query results and any othercolumns are excluded.
    • No error is raised for the excluded columns that appear in one inputquery but not in the other input query.
    • At least one column must be common in both left and right input queries.
  • FULL [OUTER]: Adjusts theBY NAME modifier behavior so that all columnsfrom both input queries are included in the query results, even if somecolumns aren't present in both queries.
    • Columns from the left input query are returnedfirst, followed by unique columns from the right input query.
    • For columns in one input query that aren't present in the other query,aNULL value is added as its column value for the other query in the results.
  • LEFT [OUTER]: Adjusts theBY NAME modifier so that all columns from theleft input query are included in the results, even if some columns in theleft query aren't present in the right query.
    • For columns in the left query that aren't in the right query, aNULLvalue is added as its column value for the right query in the results.
    • At least one column name must be common in both left and right input queries.
  • OUTER: If used alone, equivalent toFULL OUTER.
  • ON (column_list): Used after theBY NAME modifier tospecify a comma-separated list of column names and the column order toreturn from the input queries.
    • IfBY NAME ON (column_list) is usedalone without mode prefixes likeINNER orFULL | LEFT [OUTER], then both the left and rightinput queries must contain all the columns in thecolumn_list.
    • If any mode prefixes are used, then any column names not in thecolumn_list are excluded from the results according to the mode used.
  • CORRESPONDING: Equivalent toINNER...BY NAME.
    • SupportsFULL | LEFT [OUTER] modes the same way they're supported by theBY NAME modifier.
    • SupportsINNER mode, but this mode has no effect. TheINNER mode is used with theBY NAMEmodifier to exclude unmatched columns between input queries, which isthe default behavior of theCORRESPONDING modifier. Therefore, usingINNER...CORRESPONDING produces the same results asCORRESPONDING.
  • STRICT: Adjusts theCORRESPONDING modifier to be equivalent to the defaultBY NAME modifier, where inputqueries must have the same set of column names.
  • BY (column_list): Equivalent toON (column_list) withBY NAME.

The following table shows the equivalent syntaxes between theBY NAME andCORRESPONDING modifiers, using theUNION ALL set operator as an example:

BY NAME syntaxEquivalent CORRESPONDING syntax
UNION ALL BY NAMEUNION ALL STRICT CORRESPONDING
INNER UNION ALL BY NAMEUNION ALL CORRESPONDING
{LEFT | FULL} [OUTER] UNION ALL BY NAME{LEFT | FULL} [OUTER] UNION ALL CORRESPONDING
[FULL] OUTER UNION ALL BY NAME[FULL] OUTER UNION ALL CORRESPONDING
UNION ALL BY NAME ON (col1, col2, ...)UNION ALL STRICT CORRESPONDING BY (col1, col2, ...)

The following table shows the behavior of the mode prefixes for theBY NAME andCORRESPONDING modifierswhen left and right input columns don't match:

Mode prefix and modifierBehavior when left and right input columns don't match
BY NAME (no prefix) or
STRICT CORRESPONDING
Error, all columns must match in both inputs.
INNER BY NAME or
CORRESPONDING (no prefix)
Drop all unmatched columns in both inputs.
FULL [OUTER] BY NAME or
FULL [OUTER] CORRESPONDING
Include all columns from both inputs. For column values that exist in one input but not in another, addNULL values.
LEFT [OUTER] BY NAME or
LEFT [OUTER] CORRESPONDING
Include all columns from the left input. For column values that exist in the left input but not in the right input, addNULL values. Drop any columns from the right input that don't exist in the left input.

For example set operations with modifiers, see the sections for each setoperator, such asUNION.

UNION

TheUNION operator returns the combined results of the left and right inputqueries. Columns are matched according to the rules described previously androws are concatenated vertically.

Examples

SELECT*FROMUNNEST(ARRAY<INT64>[1,2,3])ASnumberUNIONALLSELECT1;/*--------+ | number | +--------+ | 1      | | 2      | | 3      | | 1      | +--------*/
SELECT*FROMUNNEST(ARRAY<INT64>[1,2,3])ASnumberUNIONDISTINCTSELECT1;/*--------+ | number | +--------+ | 1      | | 2      | | 3      | +--------*/

The following example shows multiple chained operators:

SELECT*FROMUNNEST(ARRAY<INT64>[1,2,3])ASnumberUNIONDISTINCTSELECT1UNIONDISTINCTSELECT2;/*--------+ | number | +--------+ | 1      | | 2      | | 3      | +--------*/

The following example shows input queries with multiple columns. Bothqueries specify the same column names but in different orders. As a result, thecolumn values are matched by column position in the input query and the columnnames are ignored.

SELECT1ASone_digit,10AStwo_digitUNIONALLSELECT20AStwo_digit,2ASone_digit;-- Column values are matched by position and not column name./*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 1         | 10        | | 20        | 2         | +-----------+-----------*/

To resolve this ordering issue, the following example uses theBY NAME modifier to matchthe columns by name instead of by position in the query results.

SELECT1ASone_digit,10AStwo_digitUNIONALLBYNAMESELECT20AStwo_digit,2ASone_digit;-- Column values now match./*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 1         | 10        | | 2         | 20        | +-----------+-----------*/

The previous set operation withBY NAME is equivalent to using theSTRICTCORRESPONDING modifier. TheBY NAME modifier is recommended because it'sshorter and clearer than theSTRICT CORRESPONDING modifier.

SELECT1ASone_digit,10AStwo_digitUNIONALLSTRICTCORRESPONDINGSELECT20AStwo_digit,2ASone_digit;/*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 1         | 10        | | 2         | 20        | +-----------+-----------*/

The following example adds athree_digit column to the left input query and afour_digit column to the right input query. Because these columns aren'tpresent in both queries, theBY NAMEmodifier would trigger an error. Therefore, the exampleadds theINNERmode prefix so that the new columns are excluded from the results, executing thequery successfully.

SELECT1ASone_digit,10AStwo_digit,100ASthree_digitINNERUNIONALLBYNAMESELECT20AStwo_digit,2ASone_digit,1000ASfour_digit;/*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 1         | 10        | | 2         | 20        | +-----------+-----------*/

To include the differing columns in the results, the following example usestheFULL OUTER mode prefix to populateNULL values for the missing column ineach query.

SELECT1ASone_digit,10AStwo_digit,100ASthree_digitFULLOUTERUNIONALLBYNAMESELECT20AStwo_digit,2ASone_digit,1000ASfour_digit;/*-----------+-----------+-------------+------------+ | one_digit | two_digit | three_digit | four_digit | +-----------+-----------+-------------+------------+ | 1         | 10        | 100         | NULL       | | 2         | 20        | NULL        | 1000       | +-----------+-----------+-------------+------------*/

Similarly, the following example uses theLEFT OUTER mode prefix to includethe new column from only the left input query and populate aNULL value forthe missing column in the right input query.

SELECT1ASone_digit,10AStwo_digit,100ASthree_digitLEFTOUTERUNIONALLBYNAMESELECT20AStwo_digit,2ASone_digit,1000ASfour_digit;/*-----------+-----------+-------------+ | one_digit | two_digit | three_digit | +-----------+-----------+-------------+ | 1         | 10        | 100         | | 2         | 20        | NULL        | +-----------+-----------+-------------*/

The following example adds the modifierON (column_list)to return only the specified columns in the specified order.

SELECT1ASone_digit,10AStwo_digit,100ASthree_digitFULLOUTERUNIONALLBYNAMEON(three_digit,two_digit)SELECT20AStwo_digit,2ASone_digit,1000ASfour_digit;/*-------------+-----------+ | three_digit | two_digit | +-------------+-----------+ | 100         | 10        | | NULL        | 20        | +-----------+-------------*/

INTERSECT

TheINTERSECT operator returns rows that are found in the results of both theleft and right input queries.

Examples

SELECT*FROMUNNEST(ARRAY<INT64>[1,2,3,3,4])ASnumberINTERSECTDISTINCTSELECT*FROMUNNEST(ARRAY<INT64>[2,3,3,5])ASnumber;/*--------+ | number | +--------+ | 2      | | 3      | +--------*/

The following example shows multiple chained operations:

SELECT*FROMUNNEST(ARRAY<INT64>[1,2,3,3,4])ASnumberINTERSECTDISTINCTSELECT*FROMUNNEST(ARRAY<INT64>[2,3,3,5])ASnumberINTERSECTDISTINCTSELECT*FROMUNNEST(ARRAY<INT64>[3,3,4,5])ASnumber;/*--------+ | number | +--------+ | 3      | +--------*/

The following example shows input queries that specify multiple columns. Bothqueries specify the same column names but in different orders. As a result, thesame columns in differing order are considered different columns, so the querydoesn't detect any intersecting row values. Therefore, no results are returned.

WITHNumbersTableAS(SELECT1ASone_digit,10AStwo_digitUNIONALLSELECT2,20UNIONALLSELECT3,30)SELECTone_digit,two_digitFROMNumbersTableINTERSECTDISTINCTSELECT10AStwo_digit,1ASone_digit;-- No intersecting values detected because columns aren't recognized as the same./*-----------+-----------+ +-----------+-----------*/

To resolve this ordering issue, the following example uses theBY NAME modifier to matchthe columns by name instead of by position in the query results.

WITHNumbersTableAS(SELECT1ASone_digit,10AStwo_digitUNIONALLSELECT2,20UNIONALLSELECT3,30)SELECTone_digit,two_digitFROMNumbersTableINTERSECTDISTINCTBYNAMESELECT10AStwo_digit,1ASone_digit;/*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 1         | 10        | +-----------+-----------*/

The previous set operation withBY NAME is equivalent to using theSTRICTCORRESPONDING modifier. TheBY NAME modifier is recommended because it'sshorter and clearer than theSTRICT CORRESPONDING modifier.

WITHNumbersTableAS(SELECT1ASone_digit,10AStwo_digitUNIONALLSELECT2,20UNIONALLSELECT3,30)SELECTone_digit,two_digitFROMNumbersTableINTERSECTDISTINCTSTRICTCORRESPONDINGSELECT10AStwo_digit,1ASone_digit;/*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 1         | 10        | +-----------+-----------*/

For more syntax examples with theBY NAMEmodifier, see theUNION set operator.

EXCEPT

TheEXCEPT operator returns rows from the left input query that aren't presentin the right input query.

Examples

SELECT*FROMUNNEST(ARRAY<INT64>[1,2,3,3,4])ASnumberEXCEPTDISTINCTSELECT*FROMUNNEST(ARRAY<INT64>[1,2])ASnumber;/*--------+ | number | +--------+ | 3      | | 4      | +--------*/

The following example shows multiple chained operations:

SELECT*FROMUNNEST(ARRAY<INT64>[1,2,3,3,4])ASnumberEXCEPTDISTINCTSELECT*FROMUNNEST(ARRAY<INT64>[1,2])ASnumberEXCEPTDISTINCTSELECT*FROMUNNEST(ARRAY<INT64>[1,4])ASnumber;/*--------+ | number | +--------+ | 3      | +--------*/

The following example modifies the execution behavior of the set operations. Thefirst input query is used against the result of the last two input queriesinstead of the values of the last two queries individually. In this example,theEXCEPT result of the last two input queries is2. Therefore, theEXCEPT results of the entire query are any values other than2 in the firstinput query.

SELECT*FROMUNNEST(ARRAY<INT64>[1,2,3,3,4])ASnumberEXCEPTDISTINCT(SELECT*FROMUNNEST(ARRAY<INT64>[1,2])ASnumberEXCEPTDISTINCTSELECT*FROMUNNEST(ARRAY<INT64>[1,4])ASnumber);/*--------+ | number | +--------+ | 1      | | 3      | | 4      | +--------*/

The following example shows input queries that specify multiple columns. Bothqueries specify the same column names but in different orders. As a result, thesame columns in differing order are considered different columns, so the querydoesn't detect any common rows that should be excluded. Therefore, all columnvalues from the left input query are returned with no exclusions.

WITHNumbersTableAS(SELECT1ASone_digit,10AStwo_digitUNIONALLSELECT2,20UNIONALLSELECT3,30)SELECTone_digit,two_digitFROMNumbersTableEXCEPTDISTINCTSELECT10AStwo_digit,1ASone_digit;-- No values excluded because columns aren't recognized as the same./*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 1         | 10        | | 2         | 20        | | 3         | 30        | +-----------+-----------*/

To resolve this ordering issue, the following example uses theBY NAME modifier to matchthe columns by name instead of by position in the query results.

WITHNumbersTableAS(SELECT1ASone_digit,10AStwo_digitUNIONALLSELECT2,20UNIONALLSELECT3,30)SELECTone_digit,two_digitFROMNumbersTableEXCEPTDISTINCTBYNAMESELECT10AStwo_digit,1ASone_digit;/*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 2         | 20        | | 3         | 30        | +-----------+-----------*/

The previous set operation withBY NAME is equivalent to using theSTRICTCORRESPONDING modifier. TheBY NAME modifier is recommended because it'sshorter and clearer than theSTRICT CORRESPONDING modifier.

WITHNumbersTableAS(SELECT1ASone_digit,10AStwo_digitUNIONALLSELECT2,20UNIONALLSELECT3,30)SELECTone_digit,two_digitFROMNumbersTableEXCEPTDISTINCTSTRICTCORRESPONDINGSELECT10AStwo_digit,1ASone_digit;/*-----------+-----------+ | one_digit | two_digit | +-----------+-----------+ | 2         | 20        | | 3         | 30        | +-----------+-----------*/

For more syntax examples with theBY NAMEmodifier, see theUNION set operator.

LIMIT andOFFSET clause

LIMITcount[OFFSETskip_rows]

Limits the number of rows to return in a query. Optionally includesthe ability to skip over rows.

Definitions

  • LIMIT: Limits the number of rows to produce.

    count is anINT64 constant expression that represents thenon-negative, non-NULL limit. No more thancount rows are produced.LIMIT 0 returns 0 rows.

    If there is a set operation,LIMIT is applied after the set operation isevaluated.

  • OFFSET: Skips a specific number of rows before applyingLIMIT.

    skip_rows is anINT64 constant expression that representsthe non-negative, non-NULL number of rows to skip.

Details

The rows that are returned byLIMIT andOFFSET have undefined order unlessthese clauses are used afterORDER BY.

A constant expression can be represented by a general expression, literal, orparameter value.

Note: Although theLIMIT clause limits the rows that a query produces, itdoesn't limit the amount of data processed by that query.

Examples

SELECT*FROMUNNEST(ARRAY<STRING>['a','b','c','d','e'])ASletterORDERBYletterASCLIMIT2;/*---------+ | letter  | +---------+ | a       | | b       | +---------*/
SELECT*FROMUNNEST(ARRAY<STRING>['a','b','c','d','e'])ASletterORDERBYletterASCLIMIT3OFFSET1;/*---------+ | letter  | +---------+ | b       | | c       | | d       | +---------*/

WITH clause

WITH [RECURSIVE ] {non_recursive_cte |recursive_cte }[, ...]

AWITH clause contains one or more common table expressions (CTEs).A CTE acts like a temporary table that you can reference within a singlequery expression. Each CTE binds the results of asubqueryto a table name, which can be used elsewhere in the same query expression,butrules apply.

CTEs can benon-recursive orrecursive and you can include both of these in yourWITH clause. A recursive CTE references itself, where anon-recursive CTE doesn't. If a recursive CTE is included in theWITH clause,theRECURSIVE keyword must also be included.

You can include theRECURSIVE keyword in aWITH clause even if norecursive CTEs are present. You can learn more about theRECURSIVE keywordhere.

GoogleSQL only materializesthe results of recursive CTEs, but doesn't materialize the resultsof non-recursive CTEs inside theWITH clause. If a non-recursive CTE isreferenced in multiple places in a query, then the CTE is executed once for eachreference. TheWITH clause with non-recursive CTEs is useful primarily forreadability.

RECURSIVE keyword

AWITH clause can optionally include theRECURSIVE keyword, which doestwo things:

  • Enables recursion in theWITH clause. If this keyword isn't present,you can only include non-recursive common table expressions (CTEs).If this keyword is present, you can use bothrecursive andnon-recursive CTEs.
  • Changes the visibility of CTEs in theWITH clause. If thiskeyword isn't present, a CTE is only visible to CTEs defined after it in theWITH clause. If this keyword is present, a CTE is visible to all CTEs in theWITH clause where it was defined.

Non-recursive CTEs

non_recursive_cte:cte_name AS (query_expr )

A non-recursive common table expression (CTE) containsa non-recursivesubqueryand a name associated with the CTE.

  • A non-recursive CTE can't reference itself.
  • A non-recursive CTE can be referenced by the query expression thatcontains theWITH clause, butrules apply.
Examples

In this example, aWITH clause defines two non-recursive CTEs thatare referenced in the related set operation, where one CTE is referenced byeach of the set operation's input query expressions:

WITHsubQ1AS(SELECTSchoolIDFROMRoster),subQ2AS(SELECTOpponentIDFROMPlayerStats)SELECT*FROMsubQ1UNIONALLSELECT*FROMsubQ2

You can break up more complex queries into aWITH clause andWITHSELECT statement instead of writing nested table subqueries.For example:

WITHq1AS(my_query)SELECT*FROM(WITHq2AS(SELECT*FROMq1)SELECT*FROMq2)
WITHq1AS(my_query)SELECT*FROM(WITHq2AS(SELECT*FROMq1),# q1 resolves to my_queryq3AS(SELECT*FROMq1),# q1 resolves to my_queryq1AS(SELECT*FROMq1),# q1 (in the query) resolves to my_queryq4AS(SELECT*FROMq1)# q1 resolves to the WITH subquery on the previous line.SELECT*FROMq1)# q1 resolves to the third inner WITH subquery.

Recursive CTEs

recursive_cte:cte_name AS ( recursive_union_operation )recursive_union_operation:  base_term union_operator recursive_termbase_term:query_exprrecursive_term:query_exprunion_operator:  UNION ALL

A recursive common table expression (CTE) contains arecursivesubquery and a name associated with the CTE.

  • A recursive CTE references itself.
  • A recursive CTE can be referenced in the query expression that contains theWITH clause, butrules apply.
  • When a recursive CTE is defined in aWITH clause, theRECURSIVE keyword must be present.

A recursive CTE is defined by arecursive union operation. Therecursive union operation defines how input is recursively processedto produce the final CTE result. The recursive union operation has thefollowing parts:

  • base_term: Runs the first iteration of therecursive union operation. This term must follow thebase term rules.
  • union_operator: TheUNION operator returns the rows that are fromthe union of the base term and recursive term. WithUNION ALL,each row produced in iterationN becomes part of the final CTE result andinput for iterationN+1. Iterationstops when an iteration produces no rows to move into the next iteration.
  • recursive_term: Runs the remaining iterations.It must include one self-reference (recursive reference) to the recursive CTE.Only this term can include a self-reference. This termmust follow therecursive term rules.

A recursive CTE looks like this:

WITHRECURSIVET1AS((SELECT1ASn)UNIONALL(SELECTn+1ASnFROMT1WHEREn <3))SELECTnFROMT1/*---+ | n | +---+ | 2 | | 1 | | 3 | +---*/

The first iteration of a recursive union operation runs the base term.Then, each subsequent iteration runs the recursive term and producesnew rows which are unioned with the previous iteration. The recursiveunion operation terminates when a recursive term iteration produces no newrows.

If recursion doesn't terminate, the query fails after reaching 500 iterations.

To learn more about recursive CTEs and troubleshooting iteration limit errors,seeWork with recursive CTEs.

Examples of allowed recursive CTEs

This is a simple recursive CTE:

WITHRECURSIVET1AS((SELECT1ASn)UNIONALL(SELECTn+2FROMT1WHEREn <4))SELECT*FROMT1ORDERBYn/*---+ | n | +---+ | 1 | | 3 | | 5 | +---*/

Multiple subqueries in the same recursive CTE are okay, aslong as each recursion has a cycle length of 1. It's also okay for recursiveentries to depend on non-recursive entries and vice-versa:

WITHRECURSIVET0AS(SELECT1ASn),T1AS((SELECT*FROMT0)UNIONALL(SELECTn+1FROMT1WHEREn <4)),T2AS((SELECT1ASn)UNIONALL(SELECTn+1FROMT2WHEREn <4)),T3AS(SELECT*FROMT1INNERJOINT2USING(n))SELECT*FROMT3ORDERBYn/*---+ | n | +---+ | 1 | | 2 | | 3 | | 4 | +---*/

Aggregate functions can be invoked in subqueries, as long as they aren'taggregating on the table being defined:

WITHRECURSIVET0AS(SELECT*FROMUNNEST([60,20,30])),T1AS((SELECT1ASn)UNIONALL(SELECTn+(SELECTCOUNT(*)FROMT0)FROMT1WHEREn <4))SELECT*FROMT1ORDERBYn/*---+ | n | +---+ | 1 | | 4 | +---*/

INNER JOIN can be used inside subqueries:

WITHRECURSIVET0AS(SELECT1ASn),T1AS((SELECT1ASn)UNIONALL(SELECTn+1FROMT1INNERJOINT0USING(n)))SELECT*FROMT1ORDERBYn/*---+ | n | +---+ | 1 | | 2 | +---*/

CROSS JOIN can be used inside subqueries:

WITHRECURSIVET0AS(SELECT2ASp),T1AS((SELECT1ASn)UNIONALL(SELECTT1.n+T0.pFROMT1CROSSJOINT0WHERET1.n <4))SELECT*FROMT1CROSSJOINT0ORDERBYn/*---+---+ | n | p | +---+---+ | 1 | 2 | | 3 | 2 | | 5 | 2 | +---+---*/

Recursive CTEs can be used insideCREATE TABLE AS SELECT statements. Thefollowing example creates a table namednew_table inmydataset:

CREATEORREPLACETABLE`myproject.mydataset.new_table`ASWITHRECURSIVET1AS(SELECT1ASnUNIONALLSELECTn+1FROMT1WHEREn <3)SELECT*FROMT1

Recursive CTEs can be used insideCREATE VIEW AS SELECT statements. Thefollowing example creates a view namednew_view inmydataset:

CREATEORREPLACEVIEW`myproject.mydataset.new_view`ASWITHRECURSIVET1AS(SELECT1ASnUNIONALLSELECTn+1FROMT1WHEREn <3)SELECT*FROMT1

Recursive CTEs can be used insideINSERT statements. The following exampledemonstrates how to insert data into a table by using recursive CTEs:

-- create a temp table.CREATETEMPTABLEtmp_table(nINT64);-- insert some values into the temp table by using recursive CTEs.INSERTINTOtmp_table(n)WITHRECURSIVET1AS(SELECT1ASnUNIONALLSELECTn+1FROMT1WHEREn <3)SELECT*FROMT1
Examples of disallowed recursive CTEs

The following recursive CTE is disallowed because theself-reference doesn't include a set operator, base term, andrecursive term.

WITHRECURSIVET1AS(SELECT*FROMT1)SELECT*FROMT1-- Error

The following recursive CTE is disallowed because the self-reference toT1is in the base term. The self reference is only allowed in the recursive term.

WITHRECURSIVET1AS((SELECT*FROMT1)UNIONALL(SELECT1))SELECT*FROMT1-- Error

The following recursive CTE is disallowed because there are multipleself-references in the recursive term when there must only be one.

WITHRECURSIVET1AS((SELECT1ASn)UNIONALL((SELECT*FROMT1)UNIONALL(SELECT*FROMT1)))SELECT*FROMT1-- Error

The following recursive CTE is disallowed because the self-reference isinside anexpression subquery

WITHRECURSIVET1AS((SELECT1ASn)UNIONALL(SELECT(SELECTnFROMT1)))SELECT*FROMT1-- Error

The following recursive CTE is disallowed because there is aself-reference as an argument to a table-valued function (TVF).

WITHRECURSIVET1AS((SELECT1ASn)UNIONALL(SELECT*FROMMY_TVF(T1)))SELECT*FROMT1;-- Error

The following recursive CTE is disallowed because there is aself-reference as input to an outer join.

WITHRECURSIVET0AS(SELECT1ASn),T1AS((SELECT1ASn)UNIONALL(SELECT*FROMT1FULLOUTERJOINT0USING(n)))SELECT*FROMT1;-- Error

The following recursive CTE is disallowed because you can't use aggregationwith a self-reference.

WITHRECURSIVET1AS((SELECT1ASn)UNIONALL(SELECTCOUNT(*)FROMT1))SELECT*FROMT1;-- Error

The following recursive CTE is disallowed because you can't use thewindow functionOVER clause with a self-reference.

WITHRECURSIVET1AS((SELECT1.0ASn)UNIONALLSELECT1+AVG(n)OVER(ROWSbetween2PRECEDINGand0FOLLOWING)FROMT1WHEREn <10)SELECTnFROMT1;-- Error

The following recursive CTE is disallowed because you can't use aLIMIT clause with a self-reference.

WITHRECURSIVET1AS((SELECT1ASn)UNIONALL(SELECTnFROMT1LIMIT3))SELECT*FROMT1;-- Error

The following recursive CTEs are disallowed because you can't use anORDER BY clause with a self-reference.

WITHRECURSIVET1AS((SELECT1ASn)UNIONALL(SELECTn+1FROMT1ORDERBYn))SELECT*FROMT1;-- Error

The following recursive CTE is disallowed because tableT1 can't berecursively referenced from inside an innerWITH clause

WITHRECURSIVET1AS((SELECT1ASn)UNIONALL(WITHtAS(SELECTnFROMT1)SELECT*FROMt))SELECT*FROMT1-- Error

CTE rules and constraints

Common table expressions (CTEs) can be referenced inside the query expressionthat contains theWITH clause.

General rules

Here are some general rules and constraints to consider when working with CTEs:

  • Each CTE in the sameWITH clause must have a unique name.
  • You must include theRECURSIVE keyword keyword if theWITH clause contains a recursive CTE.
  • TheRECURSIVE keyword intheWITH clause changes the visibility of CTEs to other CTEs in thesameWITH clause. You can learn morehere.
  • WITH isn't allowed insideWITH RECURSIVE.
  • WITH RECURSIVE is allowed in theSELECT statement.
  • WITH RECURSIVE is only allowed at the top level of the query.
  • WITH RECURSIVE isn't allowed in functions.
  • WITH RECURSIVE isn't allowed in materialized views.
  • TheWITH RECURSIVE clause can't contain generative AI functions.
  • CREATE RECURSIVE VIEW isn't supported. To work around this, use theWITH RECURSIVE clause as thequery_expression in theCREATE VIEW statement. For more information, seeCREATE VIEW.
  • A local CTE overrides an outer CTE or table with the same name.
  • A CTE on a subquery may not reference correlated columns from the outer query.
Base term rules

The following rules apply to the base term in a recursive CTE:

  • The base term is required to be non-recursive.
  • The base term determines the names and types of all of thetable columns.
Recursive term rules

The following rules apply to the recursive term in a recursive CTE:

  • The recursive term must include exactly one reference to therecursively-defined table in the base term.
  • The recursive term must contain the same number of columns as thebase term, and the type of each column must be implicitly coercible tothe type of the corresponding column in the base term.
  • A recursive table reference can't be used as an operand to aFULL JOIN,a right operand to aLEFT JOIN, or a left operand to aRIGHT JOIN.
  • A recursive table reference can't be used with theTABLESAMPLE operator.
  • A recursive table reference can't be used as an operand to atable-valued function (TVF).
  • Use of theIN andEXISTS expression subqueries is limited within therecursive term. For example:
    • [NOT] IN and[NOT] EXISTS aren't allowed in theSELECT clause.
    • NOT IN isn't allowed in theWHERE clause.

The following rules apply to a subquery inside a recursive term:

  • A subquery with a recursive table reference must be aSELECT expression,not a set operation, such asUNION ALL.
  • A subquery can't contain, directly or indirectly, arecursive table reference anywhere outside of itsFROM clause.
  • A subquery with a recursive table reference can't contain anORDER BY orLIMIT clause.
  • A subquery with a recursive table reference can't invoke aggregate functions.
  • A subquery with a recursive table reference can't invoke window functions.
  • A subquery with a recursive table reference can't contain theDISTINCT keyword orGROUP BY clause.

CTE visibility

The visibility of a common table expression (CTE) within a query expressionis determined by whether or not you add theRECURSIVE keyword to theWITH clause where the CTE was defined. You can learn more about thesedifferences in the following sections.

Visibility of CTEs in aWITH clause with theRECURSIVE keyword

When you include theRECURSIVE keyword, references between CTEs in theWITHclause can go backwards and forwards. Cycles aren't allowed.

This is what happens when you have two CTEs that referencethemselves or each other in aWITH clause with theRECURSIVEkeyword. Assume thatA is the first CTE andB is the secondCTE in the clause:

  • A references A = Valid
  • A references B = Valid
  • B references A = Valid
  • A references B references A = Invalid (cycles aren't allowed)

A can reference itself because self-references are supported:

WITHRECURSIVEAAS(SELECT1ASnUNIONALL(SELECTn+1FROMAWHEREn <3))SELECT*FROMA/*---+ | n | +---+ | 1 | | 2 | | 3 | +---*/

A can referenceB because references between CTEs can go forwards:

WITHRECURSIVEAAS(SELECT*FROMB),BAS(SELECT1ASn)SELECT*FROMB/*---+ | n | +---+ | 1 | +---*/

B can referenceA because references between CTEs can go backwards:

WITHRECURSIVEAAS(SELECT1ASn),BAS(SELECT*FROMA)SELECT*FROMB/*---+ | n | +---+ | 1 | +---*/

This produces an error.A andB reference each other, which creates a cycle:

WITHRECURSIVEAAS(SELECT*FROMB),BAS(SELECT*FROMA)SELECT*FROMB-- Error

Visibility of CTEs in aWITH clause without theRECURSIVE keyword

When you don't include theRECURSIVE keyword in theWITH clause,references between CTEs in the clause can go backward but not forward.

This is what happens when you have two CTEs that referencethemselves or each other in aWITH clause withouttheRECURSIVE keyword. Assume thatA is the first CTE andBis the second CTE in the clause:

  • A references A = Invalid
  • A references B = Invalid
  • B references A = Valid
  • A references B references A = Invalid (cycles aren't allowed)

This produces an error.A can't reference itself because self-referencesaren't supported:

WITHAAS(SELECT1ASnUNIONALL(SELECTn+1FROMAWHEREn <3))SELECT*FROMA-- Error

This produces an error.A can't referenceB because references betweenCTEs can go backwards but not forwards:

WITHAAS(SELECT*FROMB),BAS(SELECT1ASn)SELECT*FROMB-- Error

B can referenceA because references between CTEs can go backwards:

WITHAAS(SELECT1ASn),BAS(SELECT*FROMA)SELECT*FROMB/*---+ | n | +---+ | 1 | +---*/

This produces an error.A andB reference each other, which creates acycle:

WITHAAS(SELECT*FROMB),BAS(SELECT*FROMA)SELECT*FROMB-- Error

AGGREGATION_THRESHOLD clause

Syntax for an aggregation threshold analysis rule–enforced query:

WITH AGGREGATION_THRESHOLD OPTIONS (threshold =threshold_amount,privacy_unit_column =column_name)

Syntax for an aggregation threshold analysis rule–enforced view:

WITH AGGREGATION_THRESHOLD [ OPTIONS (  [threshold =threshold_amount ],  [privacy_unit_column =column_name ]) ]

Description

Use theAGGREGATION_THRESHOLD clause to enforce anaggregation threshold. This clause counts the number of distinct privacy units(represented by the privacy unit column) for each group, and only outputs thegroups where the distinct privacy unit count satisfies theaggregation threshold. If you want to use anaggregation threshold analysis rule that you definedfor a view, use the syntax for ananalysis rule–enforced view.When querying a privacy-enforced view, theAGGREGATION_THRESHOLD clause doesnot need to include theOPTIONS clause.

Definitions:

  • threshold: The minimum number of distinctprivacy units (privacy unit column values) that need to contribute to each rowin the query results. If a potential row doesn't satisfy this threshold,that row is omitted from the query results.threshold_amount must bea positiveINT64 value.

    If you're using this query with an analysis rule–enforced view, you canoptionally add this query parameter to override thethreshold parameter forthe view. The threshold for the query must be equal to orgreater than the threshold for the view. If the threshold for the queryis less than the threshold for the view, an error is produced.

  • privacy_unit_column: The column that represents theprivacy unit column. Replacecolumn_name with thepath expression for the column. The first identifier in the path can startwith either a table name or a column name that's visible in theFROM clause.

    If you're using this query with an analysis rule–enforced view, you canoptionally add this query parameter. However, it must match thevalue forprivacy_unit_column on the view. If it doesn't, an error isproduced.

Details

The following functions can be used on any column in a query with theAGGREGATION_THRESHOLD clause, including the commonly usedCOUNT,SUM, andAVG functions:

  • APPROX_COUNT_DISTINCT
  • AVG
  • COUNT
  • COUNTIF
  • LOGICAL_AND
  • LOGICAL_OR
  • SUM
  • COVAR_POP
  • COVAR_SAMP
  • STDDEV_POP
  • STDDEV_SAMP
  • VAR_POP
  • VAR_SAMP

Example

In the following example, an aggregation threshold is enforcedon a query. Notice that some privacy units are dropped becausethere aren't enough distinct instances.

WITHExamTableAS(SELECT"Hansen"ASlast_name,"P91"AStest_id,510AStest_scoreUNIONALLSELECT"Wang","U25",500UNIONALLSELECT"Wang","C83",520UNIONALLSELECT"Wang","U25",460UNIONALLSELECT"Hansen","C83",420UNIONALLSELECT"Hansen","C83",560UNIONALLSELECT"Devi","U25",580UNIONALLSELECT"Devi","P91",480UNIONALLSELECT"Ivanov","U25",490UNIONALLSELECT"Ivanov","P91",540UNIONALLSELECT"Silva","U25",550)SELECTWITHAGGREGATION_THRESHOLDOPTIONS(threshold=3,privacy_unit_column=last_name)test_id,COUNT(DISTINCTlast_name)ASstudent_count,AVG(test_score)ASavg_test_scoreFROMExamTableGROUPBYtest_id;/*---------+---------------+----------------+ | test_id | student_count | avg_test_score | +---------+---------------+----------------+ | P91     | 3             | 510.0          | | U25     | 4             | 516.0          | +---------+---------------+----------------*/

In the following example, an aggregation threshold analysis rule is enforced ona view with the same results:

-- Create a table.CREATEORREPLACETABLEmydataset.ExamTableAS(SELECT"Hansen"ASlast_name,"P91"AStest_id,510AStest_scoreUNIONALLSELECT"Wang","U25",500UNIONALLSELECT"Wang","C83",520UNIONALLSELECT"Wang","U25",460UNIONALLSELECT"Hansen","C83",420UNIONALLSELECT"Hansen","C83",560UNIONALLSELECT"Devi","U25",580UNIONALLSELECT"Devi","P91",480UNIONALLSELECT"Ivanov","U25",490UNIONALLSELECT"Ivanov","P91",540UNIONALLSELECT"Silva","U25",550);-- Create a view for the table.CREATEORREPLACEVIEWmydataset.ExamViewOPTIONS(privacy_policy='{"aggregation_threshold_policy": {"threshold": 3, "privacy_unit_column": "last_name"}}')AS(SELECT*FROMmydataset.ExamTable);-- Query the aggregation threshold privacy-policy enforced view.SELECTWITHAGGREGATION_THRESHOLDtest_id,COUNT(DISTINCTlast_name)ASstudent_count,AVG(test_score)ASavg_test_scoreFROMmydataset.ExamViewGROUPBYtest_id;/*---------+---------------+----------------+ | test_id | student_count | avg_test_score | +---------+---------------+----------------+ | P91     | 3             | 510.0          | | U25     | 4             | 516.0          | +---------+---------------+----------------*/

In the following example, an aggregation threshold analysis rule is enforced onthe previous view, but the threshold is adjusted from3 in the view to4 in the query:

SELECTWITHAGGREGATION_THRESHOLDOPTIONS(threshold=4)test_id,COUNT(DISTINCTlast_name)ASstudent_count,AVG(test_score)ASavg_test_scoreFROMmydataset.ExamViewGROUPBYtest_id;/*---------+---------------+----------------+ | test_id | student_count | avg_test_score | +---------+---------------+----------------+ | U25     | 4             | 516.0          | +---------+---------------+----------------*/

In the following example, an aggregation threshold analysis rule is enforced onthe previous view, but the threshold is adjusted from3 in the view to5 in the query. While the analysis rule is satisfied, the query producesno data.

-- No data is produced.SELECTWITHAGGREGATION_THRESHOLDOPTIONS(threshold=5)test_id,COUNT(DISTINCTlast_name)ASstudent_count,AVG(test_score)ASavg_test_scoreFROMmydataset.ExamViewGROUPBYtest_id;

In the following example, an aggregation threshold analysis rule is enforced onthe previous view, but the threshold is adjusted from3 in the view to2 in the query:

-- Error: Aggregation threshold is too low.SELECTWITHAGGREGATION_THRESHOLDOPTIONS(threshold=2)test_id,COUNT(DISTINCTlast_name)ASstudent_count,AVG(test_score)ASavg_test_scoreFROMmydataset.ExamViewGROUPBYtest_id;

In the following example, an aggregation threshold analysis rule is enforced onthe previous view, but the threshold is adjusted fromlast_name in the viewtotest_id in the query:

-- Error: Can't override the privacy unit column set in view.SELECTWITHAGGREGATION_THRESHOLDOPTIONS(privacy_unit_column=test_id)test_id,COUNT(DISTINCTlast_name)ASstudent_count,AVG(test_score)ASavg_test_scoreFROMmydataset.ExamViewGROUPBYtest_id;

Differential privacy clause

WITH DIFFERENTIAL_PRIVACY OPTIONS(privacy_parameters )privacy_parameters:epsilon =expression,delta =expression,  [max_groups_contributed =expression ],privacy_unit_column =column_name

Description

This clause lets you transform the results of a query withdifferentially private aggregations. To learn more aboutdifferential privacy, seeDifferential privacy.

You can use the following syntax to build a differential privacy clause:

  • epsilon: Controls the amount of noise added to the results.A higher epsilon means less noise.expression must be a literal andreturn aFLOAT64.
  • delta: The probability that any row in the result fails tobe epsilon-differentially private.expression must be a literal and returnaFLOAT64.
  • max_groups_contributed: A positive integer identifying thelimit on the number of groups that an entity is allowed to contribute to.This number is also used to scale the noise for each group.expression mustbe a literal and return anINT64.
  • privacy_unit_column: The column that represents theprivacy unit column. Replacecolumn_name with the path expression for thecolumn. The first identifier in the path can start with either a table nameor a column name that's visible in theFROM clause.

If you want to use this syntax, add it after theSELECT keyword with one ormore differentially private aggregate functions in theSELECT list.To learn more about the privacy parameters in this syntax,seePrivacy parameters.

Privacy parameters

Privacy parameters control how the results of a query are transformed.Appropriate values for these settings can depend on many things suchas the characteristics of your data, the exposure level, and theprivacy level.

In this section, you can learn more about how you can useprivacy parameters to control how the results are transformed.

epsilon

Noise is added primarily based on the specifiedepsilondifferential privacy parameter. The higher the epsilon the less noise is added.More noise corresponding to smaller epsilons equals more privacy protection.

Noise can be eliminated by settingepsilon to1e20, which can beuseful during initial data exploration and experimentation withdifferential privacy. Extremely largeepsilon values, such as1e308,cause query failure.

GoogleSQL splitsepsilon between the differentially privateaggregates in the query. In addition to the explicitdifferentially private aggregate functions, the differential privacy processalso injects an implicit differentially private aggregate into the plan forremoving small groups that computes a noisy entity count per group. If you haven explicit differentially private aggregate functions in your query, then eachaggregate individually getsepsilon/(n+1) for its computation. If used withmax_groups_contributed, the effectiveepsilon per function per groups isfurther split bymax_groups_contributed. Additionally, if implicit clamping isused for an aggregate differentially private function, then half of thefunction's epsilon is applied towards computing implicit bounds, and half of thefunction's epsilon is applied towards the differentially private aggregationitself.

delta

Thedelta differential privacy parameter represents the probability that anyrow fails to beepsilon-differentially private in the result of adifferentially private query.

max_groups_contributed

Themax_groups_contributed differential privacy parameter is apositive integer that, if specified, scales the noise and limits the number ofgroups that each entity can contribute to.

max_groups_contributed is set by default, even if you don't specify it.The default value is1. Ifmax_groups_contributed is set toNULL, thenmax_groups_contributed is unspecified and there is no limit to the numberof groups that each entity can contribute to.

Ifmax_groups_contributed is unspecified, the language can't guarantee thatthe results will be differentially private. We recommend that you specifymax_groups_contributed. If you don't specifymax_groups_contributed, theresults might still be differentially private if certain preconditions are met.For example, if you know that the privacy unit column in a table or view isunique in theFROM clause, the entity can't contribute to more than one groupand therefore the results will be the same regardless of whethermax_groups_contributed is set.

privacy_unit_column

To learn about the privacy unit and how to define a privacy unit column, seeDefine a privacy unit column.

Differential privacy examples

This section contains examples that illustrate how to work withdifferential privacy in GoogleSQL.

Tables for examples

The examples in this section reference the following tables:

CREATEORREPLACETABLEprofessorsAS(SELECT101ASid,"pencil"ASitem,24ASquantityUNIONALLSELECT123,"pen",16UNIONALLSELECT123,"pencil",10UNIONALLSELECT123,"pencil",38UNIONALLSELECT101,"pen",19UNIONALLSELECT101,"pen",23UNIONALLSELECT130,"scissors",8UNIONALLSELECT150,"pencil",72);
CREATEORREPLACETABLEstudentsAS(SELECT1ASid,"pencil"ASitem,5ASquantityUNIONALLSELECT1,"pen",2UNIONALLSELECT2,"pen",1UNIONALLSELECT3,"pen",4);

Add noise

You can add noise to a differentially private query. Smaller groups might not beincluded. Smaller epsilons and more noise will provide greaterprivacy protection.

-- This gets the average number of items requested per professor and adds-- noise to the resultsSELECTWITHDIFFERENTIAL_PRIVACYOPTIONS(epsilon=10,delta=.01,max_groups_contributed=2,privacy_unit_column=id)item,AVG(quantity,contribution_bounds_per_group=>(0,100))ASaverage_quantityFROMprofessorsGROUPBYitem;-- These results will change each time you run the query.-- The scissors group was removed this time, but might not be-- removed the next time./*----------+------------------+ | item     | average_quantity | +----------+------------------+ | pencil   | 38.5038356810269 | | pen      | 13.4725028762032 | +----------+------------------*/

Remove noise

Removing noise removes privacy protection. Only remove noise fortesting queries on non-private data. Whenepsilon is high, noise is removedfrom the results.

-- This gets the average number of items requested per professor and removes-- noise from the resultsSELECTWITHDIFFERENTIAL_PRIVACYOPTIONS(epsilon=1e20,delta=.01,max_groups_contributed=2,privacy_unit_column=id)item,AVG(quantity,contribution_bounds_per_group=>(0,100))ASaverage_quantityFROMprofessorsGROUPBYitem;/*----------+------------------+ | item     | average_quantity | +----------+------------------+ | pencil   | 40               | | pen      | 18.5             | | scissors | 8                | +----------+------------------*/

Limit the groups in which a privacy unit ID can exist

A privacy unit column can exist within multiple groups. For example, in theprofessors table, the privacy unit column123 exists in thepencil andpen group. You can setmax_groups_contributed to different values to limit how manygroups each privacy unit column will be included in.

SELECTWITHDIFFERENTIAL_PRIVACYOPTIONS(epsilon=1e20,delta=.01,privacy_unit_column=id)item,AVG(quantity,contribution_bounds_per_group=>(0,100))ASaverage_quantityFROMprofessorsGROUPBYitem;-- The privacy unit column 123 was only included in the pen group in this example.-- Noise was removed from this query for demonstration purposes only./*----------+------------------+ | item     | average_quantity | +----------+------------------+ | pencil   | 40               | | pen      | 18.5             | | scissors | 8                | +----------+------------------*/

Using aliases

An alias is a temporary name given to a table, column, or expression present ina query. You can introduce explicit aliases in theSELECT list orFROMclause, or GoogleSQL infers an implicit alias for some expressions.Expressions with neither an explicit nor implicit alias are anonymous and thequery can't reference them by name.

Explicit aliases

You can introduce explicit aliases in either theFROM clause or theSELECTlist.

In aFROM clause, you can introduce explicit aliases for any item, includingtables, arrays, subqueries, andUNNEST clauses, using[AS] alias. TheASkeyword is optional.

Example:

SELECTs.FirstName,s2.SongNameFROMSingersASs,(SELECT*FROMSongs)ASs2;

You can introduce explicit aliases for any expression in theSELECT list using[AS] alias. TheAS keyword is optional.

Example:

SELECTs.FirstNameASname,LOWER(s.FirstName)ASlnameFROMSingerss;

Implicit aliases

In theSELECT list, if there is an expression that doesn't have an explicitalias, GoogleSQL assigns an implicit alias according to the followingrules. There can be multiple columns with the same alias in theSELECT list.

  • For identifiers, the alias is the identifier. For example,SELECT abcimpliesAS abc.
  • For path expressions, the alias is the last identifier in the path. Forexample,SELECT abc.def.ghi impliesAS ghi.
  • For field access using the "dot" member field access operator, the alias isthe field name. For example,SELECT (struct_function()).fname impliesASfname.

In all other cases, there is no implicit alias, so the column is anonymous andcan't be referenced by name. The data from that column will still be returnedand the displayed query results may have a generated label for that column, butthe label can't be used like an alias.

In aFROM clause,from_items aren't required to have an alias. Thefollowing rules apply:

  • If there is an expression that doesn't have an explicit alias, GoogleSQL assigns an implicit alias in these cases:
    • For identifiers, the alias is the identifier. For example,FROM abc impliesAS abc.
    • For path expressions, the alias is the last identifier in the path. For example,FROM abc.def.ghi impliesAS ghi
    • The column produced usingWITH OFFSET has the implicit aliasoffset.
  • Table subqueries don't have implicit aliases.
  • FROM UNNEST(x) doesn't have an implicit alias.

Alias visibility

After you introduce an explicit alias in a query, there are restrictions onwhere else in the query you can reference that alias. These restrictions onalias visibility are the result of GoogleSQL name scoping rules.

Visibility in theFROM clause

GoogleSQL processes aliases in aFROM clause from left to right,and aliases are visible only to subsequent path expressions in aFROMclause.

Example:

Assume theSingers table had aConcerts column ofARRAY type.

SELECTFirstNameFROMSingersASs,s.Concerts;

Invalid:

SELECTFirstNameFROMs.Concerts,SingersASs;//INVALID.

FROM clause aliases arenot visible to subqueries in the sameFROMclause. Subqueries in aFROM clause can't contain correlated references toother tables in the sameFROM clause.

Invalid:

SELECTFirstNameFROMSingersASs,(SELECT(2020-ReleaseDate)FROMs)//INVALID.

You can use any column name from a table in theFROM as an alias anywhere inthe query, with or without qualification with the table name.

Example:

SELECTFirstName,s.ReleaseDateFROMSingerssWHEREReleaseDate=1975;

If theFROM clause contains an explicit alias, you must use the explicit aliasinstead of the implicit alias for the remainder of the query (seeImplicit Aliases). A table alias is useful for brevity orto eliminate ambiguity in cases such as self-joins, where the same table isscanned multiple times during query processing.

Example:

SELECT*FROMSingersass,Songsass2ORDERBYs.LastName

Invalid —ORDER BY doesn't use the table alias:

SELECT*FROMSingersass,Songsass2ORDERBYSingers.LastName;//INVALID.

Visibility in theSELECT list

Aliases in theSELECT list are visible only to the following clauses:

  • GROUP BY clause
  • ORDER BY clause
  • HAVING clause

Example:

SELECTLastNameASlast,SingerIDFROMSingersORDERBYlast;

Visibility in theGROUP BY,ORDER BY, andHAVING clauses

These three clauses,GROUP BY,ORDER BY, andHAVING, can refer to only thefollowing values:

  • Tables in theFROM clause and any of their columns.
  • Aliases from theSELECT list.

GROUP BY andORDER BY can also refer to a third group:

  • Integer literals, which refer to items in theSELECT list. The integer1refers to the first item in theSELECT list,2 refers to the second item,etc.

Example:

SELECTSingerIDASsid,COUNT(Songid)ASs2idFROMSongsGROUPBY1ORDERBY2DESC;

The previous query is equivalent to:

SELECTSingerIDASsid,COUNT(Songid)ASs2idFROMSongsGROUPBYsidORDERBYs2idDESC;

Duplicate aliases

ASELECT list or subquery containing multiple explicit or implicit aliasesof the same name is allowed, as long as the alias name isn't referencedelsewhere in the query, since the reference would beambiguous.

When a top-levelSELECT list contains duplicate column names and nodestination table is specified, all duplicate columns, except for the first one,are automatically renamed to make them unique. The renamed columns appear in thequery result.

Example:

SELECT1ASa,2ASa;/*---+-----+ | a | a_1 | +---+-----+ | 1 | 2   | +---+-----*/

Duplicate column names in a table or view definition aren't supported. Thesestatements with queries that contain duplicate column names will fail:

-- This query fails.CREATETABLEmy_dataset.my_tableAS(SELECT1ASa,2ASa);
-- This query fails.CREATEVIEWmy_dataset.my_viewAS(SELECT1ASa,2ASa);

Ambiguous aliases

GoogleSQL provides an error if accessing a name is ambiguous, meaningit can resolve to more than one unique object in the query or in a table schema,including the schema of a destination table.

The following query contains column names that conflict between tables, sincebothSingers andSongs have a column namedSingerID:

SELECTSingerIDFROMSingers,Songs;

The following query contains aliases that are ambiguous in theGROUP BY clausebecause they are duplicated in theSELECT list:

SELECTFirstNameASname,LastNameASname,FROMSingersGROUPBYname;

The following query contains aliases that are ambiguous in theSELECT list andFROM clause because they share a column and field with same name.

  • Assume thePerson table has three columns:FirstName,LastName, andPrimaryContact.
  • Assume thePrimaryContact column represents a struct with these fields:FirstName andLastName.

The aliasP is ambiguous and will produce an error becauseP.FirstName intheGROUP BY clause could refer to eitherPerson.FirstName orPerson.PrimaryContact.FirstName.

SELECTFirstName,LastName,PrimaryContactASPFROMPersonASPGROUPBYP.FirstName;

A name isnot ambiguous inGROUP BY,ORDER BY orHAVING if it's botha column name and aSELECT list alias, as long as the name resolves to thesame underlying object. In the following example, the aliasBirthYear isn'tambiguous because it resolves to the same underlying column,Singers.BirthYear.

SELECTLastName,BirthYearASBirthYearFROMSingersGROUPBYBirthYear;

Range variables

In GoogleSQL, a range variable is a table expression alias in theFROM clause. Sometimes a range variable is known as atable alias. Arange variable lets you reference rows being scanned from a table expression.A table expression represents an item in theFROM clause that returns a table.Common items that this expression can represent includetables,value tables,subqueries,table-valued functions (TVFs),joins, andparenthesized joins.

In general, a range variable provides a reference to the rows of a tableexpression. A range variable can be used to qualify a column reference andunambiguously identify the related table, for examplerange_variable.column_1.

When referencing a range variable on its own without a specified column suffix,the result of a table expression is the row type of the related table.Value tables have explicit row types, so for range variables relatedto value tables, the result type is the value table's row type. Other tablesdon't have explicit row types, and for those tables, the range variabletype is a dynamically defined struct that includes all of thecolumns in the table.

Examples

In these examples, theWITH clause is used to emulate a temporary tablecalledGrid. This table has columnsx andy. A range variable calledCoordinate refers to the current row as the table is scanned.Coordinatecan be used to access the entire row or columns in the row.

The following example selects columnx from range variableCoordinate,which in effect selects columnx from tableGrid.

WITHGridAS(SELECT1x,2y)SELECTCoordinate.xFROMGridASCoordinate;/*---+ | x | +---+ | 1 | +---*/

The following example selects all columns from range variableCoordinate,which in effect selects all columns from tableGrid.

WITHGridAS(SELECT1x,2y)SELECTCoordinate.*FROMGridASCoordinate;/*---+---+ | x | y | +---+---+ | 1 | 2 | +---+---*/

The following example selects the range variableCoordinate, which is areference to rows in tableGrid. SinceGrid isn't a value table,the result type ofCoordinate is a struct that contains all the columnsfromGrid.

WITHGridAS(SELECT1x,2y)SELECTCoordinateFROMGridASCoordinate;/*--------------+ | Coordinate   | +--------------+ | {x: 1, y: 2} | +--------------*/

Value tables

In addition to standard SQLtables, GoogleSQL supportsvalue tables.In a value table, rather than having rows made up of a list of columns, each rowis a single value of typeSTRUCT, and there are no column names.

In the following example, a value table for aSTRUCT is produced with theSELECT AS VALUE statement:

SELECT*FROM(SELECTASVALUESTRUCT(123ASa,FALSEASb))/*-----+-------+ | a   | b     | +-----+-------+ | 123 | FALSE | +-----+-------*/

Return query results as a value table

You can use GoogleSQL to return query results as a value table. Thisis useful when you want to store a query result with aSTRUCT type as atable. To return a query result as a value table, use one of the followingstatements:

Value tables can also occur as the output of theUNNESToperator or asubquery. TheWITH clauseintroduces a value table if the subquery used produces a value table.

In contexts where a query with exactly one column is expected, a value tablequery can be used instead. For example, scalar andarraysubqueries normally require a single-column query,but in GoogleSQL, they also allow using a value table query.

Create a table with a value table

Value tables aren't supported as top-level queries in theCREATE TABLE statement, but they can be included in subqueries andUNNEST operations. For example, you can create a table from avalue table with this query:

CREATETABLEReviewsASSELECT*FROM(SELECTASVALUESTRUCT(5ASstar_rating,FALSEASup_down_rating))
Column NameData Type
star_ratingINT64
up_down_ratingBOOL

Use a set operation on a value table

You can't combine tables and value tables in aSET operation.

Table function calls

To call a TVF, use the function call in place of the table name in aFROMclause.

Appendix A: examples with sample data

These examples include statements which perform queries on theRoster andTeamMascot,andPlayerStats tables.

Sample tables

The following tables are used to illustrate the behavior of differentquery clauses in this reference.

Roster table

TheRoster table includes a list of player names (LastName) and theunique ID assigned to their school (SchoolID). It looks like this:

/*-----------------------+ | LastName   | SchoolID | +-----------------------+ | Adams      | 50       | | Buchanan   | 52       | | Coolidge   | 52       | | Davis      | 51       | | Eisenhower | 77       | +-----------------------*/

You can use thisWITH clause to emulate a temporary table name for theexamples in this reference:

WITHRosterAS(SELECT'Adams'asLastName,50asSchoolIDUNIONALLSELECT'Buchanan',52UNIONALLSELECT'Coolidge',52UNIONALLSELECT'Davis',51UNIONALLSELECT'Eisenhower',77)SELECT*FROMRoster

PlayerStats table

ThePlayerStats table includes a list of player names (LastName) and theunique ID assigned to the opponent they played in a given game (OpponentID)and the number of points scored by the athlete in that game (PointsScored).

/*----------------------------------------+ | LastName   | OpponentID | PointsScored | +----------------------------------------+ | Adams      | 51         | 3            | | Buchanan   | 77         | 0            | | Coolidge   | 77         | 1            | | Adams      | 52         | 4            | | Buchanan   | 50         | 13           | +----------------------------------------*/

You can use thisWITH clause to emulate a temporary table name for theexamples in this reference:

WITHPlayerStatsAS(SELECT'Adams'asLastName,51asOpponentID,3asPointsScoredUNIONALLSELECT'Buchanan',77,0UNIONALLSELECT'Coolidge',77,1UNIONALLSELECT'Adams',52,4UNIONALLSELECT'Buchanan',50,13)SELECT*FROMPlayerStats

TeamMascot table

TheTeamMascot table includes a list of unique school IDs (SchoolID) and themascot for that school (Mascot).

/*---------------------+ | SchoolID | Mascot   | +---------------------+ | 50       | Jaguars  | | 51       | Knights  | | 52       | Lakers   | | 53       | Mustangs | +---------------------*/

You can use thisWITH clause to emulate a temporary table name for theexamples in this reference:

WITHTeamMascotAS(SELECT50asSchoolID,'Jaguars'asMascotUNIONALLSELECT51,'Knights'UNIONALLSELECT52,'Lakers'UNIONALLSELECT53,'Mustangs')SELECT*FROMTeamMascot

GROUP BY clause

Example:

SELECTLastName,SUM(PointsScored)FROMPlayerStatsGROUPBYLastName;
LastNameSUM
Adams7
Buchanan13
Coolidge1

UNION

TheUNION operator combines the result sets of two or moreSELECT statementsby pairing columns from the result set of eachSELECT statement and verticallyconcatenating them.

Example:

SELECTMascotASX,SchoolIDASYFROMTeamMascotUNIONALLSELECTLastName,PointsScoredFROMPlayerStats;

Results:

XY
Jaguars50
Knights51
Lakers52
Mustangs53
Adams3
Buchanan0
Coolidge1
Adams4
Buchanan13

INTERSECT

This query returns the last names that are present in both Roster andPlayerStats.

SELECTLastNameFROMRosterINTERSECTDISTINCTSELECTLastNameFROMPlayerStats;

Results:

LastName
Adams
Coolidge
Buchanan

EXCEPT

The query below returns last names in Roster that arenot present inPlayerStats.

SELECTLastNameFROMRosterEXCEPTDISTINCTSELECTLastNameFROMPlayerStats;

Results:

LastName
Eisenhower
Davis

Reversing the order of theSELECT statements will return last names inPlayerStats that arenot present in Roster:

SELECTLastNameFROMPlayerStatsEXCEPTDISTINCTSELECTLastNameFROMRoster;

Results:

(empty)

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-11-24 UTC.