Lexical structure and syntax in GoogleSQL

A GoogleSQL statement comprises a series of tokens. Tokens include identifiers, quoted identifiers, literals, keywords, operators, and special characters. You can separate tokens with comments or whitespace such as spaces, backspaces, tabs, or newlines.

Identifiers

Identifiers are names that are associated with columns, tables, fields, path expressions, and more. They can be unquoted or quoted, and some are case-sensitive.

Unquoted identifiers

  • Must begin with a letter or an underscore (_) character.
  • Subsequent characters can be letters, numbers, or underscores (_).

Quoted identifiers

  • Must be enclosed by backtick (`) characters.
  • Can contain any characters, including spaces and symbols.
  • Can't be empty.
  • Have the same escape sequences as string literals.
  • If an identifier is the same as a reserved keyword, the identifier must be quoted. For example, the identifier FROM must be quoted. Additional rules apply for path expressions and field names.

Identifier examples

Path expression examples:

-- Valid. _5abc and dataField are valid identifiers.
_5abc.dataField

-- Valid. `5abc` and dataField are valid identifiers.
`5abc`.dataField

-- Invalid. 5abc is an invalid identifier because it's unquoted and starts
-- with a number rather than a letter or underscore.
5abc.dataField

-- Valid. abc5 and dataField are valid identifiers.
abc5.dataField

-- Invalid. abc5! is an invalid identifier because it's unquoted and contains
-- a character that isn't a letter, number, or underscore.
abc5!.dataField

-- Valid. `GROUP` and dataField are valid identifiers.
`GROUP`.dataField

-- Invalid. GROUP is an invalid identifier because it's unquoted and is a
-- stand-alone reserved keyword.
GROUP.dataField

-- Valid. abc5 and GROUP are valid identifiers.
abc5.GROUP

Function examples:

-- Valid. dataField is a valid identifier in a function called foo().
foo().dataField

Array access operation examples:

-- Valid. dataField is a valid identifier in an array called items.
items[OFFSET(3)].dataField

Named query parameter examples:

-- Valid. param and dataField are valid identifiers.
@param.dataField

Protocol buffer examples:

-- Valid. dataField is a valid identifier in a protocol buffer called foo.
(foo).dataField

Path expressions

A path expression describes how to navigate to an object in a graph of objects and generally follows this structure:

path:
  [path_expression][. ...]

path_expression:
  [first_part]/subsequent_part[ { / | : | - } subsequent_part ][...]

first_part:
  { unquoted_identifier | quoted_identifier }

subsequent_part:
  { unquoted_identifier | quoted_identifier | number }

  • path: A graph of one or more objects.
  • path_expression: An object in a graph of objects.
  • first_part: A path expression can start with a quoted or unquoted identifier. If the path expression starts with a reserved keyword, it must be a quoted identifier.
  • subsequent_part: Subsequent parts of a path expression can include non-identifiers, such as reserved keywords. If a subsequent part of a path expression starts with a reserved keyword, it may be quoted or unquoted.

Examples:

foo.bar
foo.bar/25
foo/bar:25
foo/bar/25-31
/foo/bar
/25/foo/bar

Field names

A field name represents the name of a field inside a complex data type such as a struct, protocol buffer message, or JSON object.

  • A field name can be a quoted identifier or an unquoted identifier.

Literals

A literal represents a constant value of a built-in data type. Some, but not all, data types can be expressed as literals.

String and bytes literals

A string literal represents a constant value of the string data type. A bytes literal represents a constant value of the bytes data type.

Both string and bytes literals must be quoted, either with single (') or double (") quotation marks, or triple-quoted with groups of three single (''') or three double (""") quotation marks.

Formats for quoted literals

The following table lists all of the ways you can format a quoted literal.

Literal, examples, and description for each format:
Quoted string
  • "abc"
  • "it's"
  • 'it\'s'
  • 'Title: "Boy"'
Quoted strings enclosed by single (') quotes can contain unescaped double (") quotes, as well as the inverse.
Backslashes (\) introduce escape sequences. See the Escape Sequences table below.
Quoted strings can't contain newlines, even when preceded by a backslash (\).
Triple-quoted string
  • """abc"""
  • '''it's'''
  • '''Title:"Boy"'''
  • '''two
    lines'''
  • '''why\?'''
Embedded newlines and quotes are allowed without escaping - see fourth example.
Backslashes (\) introduce escape sequences. See Escape Sequences table below.
A trailing unescaped backslash (\) at the end of a line isn't allowed.
End the string with three unescaped quotes in a row that match the starting quotes.
Raw string
  • r"abc+"
  • r'''abc+'''
  • r"""abc+"""
  • r'f\(abc,(.*),def\)'
Quoted or triple-quoted literals that have the raw string literal prefix (r or R) are interpreted as raw strings (sometimes described as regex strings).
Backslash characters (\) don't act as escape characters. If a backslash followed by another character occurs inside the string literal, both characters are preserved.
A raw string can't end with an odd number of backslashes.
Raw strings are useful for constructing regular expressions. The prefix is case-insensitive.
Bytes
  • B"abc"
  • B'''abc'''
  • b"""abc"""
Quoted or triple-quoted literals that have the bytes literal prefix (b or B) are interpreted as bytes.
Raw bytes
  • br'abc+'
  • RB"abc+"
  • RB'''abc'''
A bytes literal can be interpreted as raw bytes if both the r and b prefixes are present. These prefixes can be combined in any order and are case-insensitive. For example, rb'abc*' and rB'abc*' and br'abc*' are all equivalent. See the description for raw string to learn more about what you can do with a raw literal.

Escape sequences for string and bytes literals

The following table lists all valid escape sequences for representing non-alphanumeric characters in string and bytes literals. Any sequence not in this table produces an error.

Escape Sequence | Description
\a | Bell
\b | Backspace
\f | Formfeed
\n | Newline
\r | Carriage Return
\t | Tab
\v | Vertical Tab
\\ | Backslash (\)
\? | Question Mark (?)
\" | Double Quote (")
\' | Single Quote (')
\` | Backtick (`)
\ooo | Octal escape, with exactly 3 digits (in the range 0–7). Decodes to a single Unicode character (in string literals) or byte (in bytes literals).
\xhh or \Xhh | Hex escape, with exactly 2 hex digits (0–9 or A–F or a–f). Decodes to a single Unicode character (in string literals) or byte (in bytes literals). Examples:
  • '\x41' == 'A'
  • '\x41B' is 'AB'
  • '\x4' is an error
\uhhhh | Unicode escape, with lowercase 'u' and exactly 4 hex digits. Valid only in string literals or identifiers. The range D800-DFFF isn't allowed, as these values are surrogate Unicode values.
\Uhhhhhhhh | Unicode escape, with uppercase 'U' and exactly 8 hex digits. Valid only in string literals or identifiers. The range D800-DFFF isn't allowed, as these values are surrogate Unicode values. Also, values greater than 10FFFF aren't allowed.
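The escape rules in the table can be modeled with a small decoder. The following Python sketch is illustrative only and isn't how GoogleSQL is implemented: the function name decode_string_escapes is hypothetical, it handles the body of a non-raw string literal (without the enclosing quotes), and it doesn't cover bytes literals.

```python
# Illustrative model of the escape table for string literals.
# All names here are hypothetical; this is not GoogleSQL's implementation.
SIMPLE_ESCAPES = {
    "a": "\a", "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
    "v": "\v", "\\": "\\", "?": "?", '"': '"', "'": "'", "`": "`",
}
HEX_DIGITS = set("0123456789abcdefABCDEF")
OCT_DIGITS = set("01234567")


def _take(body: str, start: int, count: int, allowed: set) -> str:
    """Return exactly `count` characters from `allowed`, or raise."""
    digits = body[start:start + count]
    if len(digits) != count or any(ch not in allowed for ch in digits):
        raise ValueError(f"escape needs exactly {count} valid digits")
    return digits


def decode_string_escapes(body: str) -> str:
    """Decode the escape sequences in the body of a string literal."""
    out = []
    i = 0
    while i < len(body):
        if body[i] != "\\":
            out.append(body[i])
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("trailing backslash")
        kind = body[i + 1]
        if kind in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[kind])
            i += 2
        elif kind in OCT_DIGITS:
            # \ooo: exactly 3 octal digits, starting at the current digit.
            out.append(chr(int(_take(body, i + 1, 3, OCT_DIGITS), 8)))
            i += 4
        elif kind in "xX":
            # \xhh or \Xhh: exactly 2 hex digits.
            out.append(chr(int(_take(body, i + 2, 2, HEX_DIGITS), 16)))
            i += 4
        elif kind in "uU":
            # \uhhhh takes 4 hex digits, \Uhhhhhhhh takes 8.
            width = 4 if kind == "u" else 8
            code = int(_take(body, i + 2, width, HEX_DIGITS), 16)
            if 0xD800 <= code <= 0xDFFF or code > 0x10FFFF:
                raise ValueError("surrogate or out-of-range code point")
            out.append(chr(code))
            i += 2 + width
        else:
            # Any sequence not in the table is an error.
            raise ValueError(f"invalid escape sequence: \\{kind}")
    return "".join(out)
```

For instance, under these assumptions decode_string_escapes(r'\x41B') yields 'AB', while r'\x4' raises an error because the hex escape needs exactly two digits.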

Integer literals

Integer literals are either a sequence of decimal digits (0–9) or a hexadecimal value that's prefixed with "0x" or "0X". Integers can be prefixed by "+" or "-" to represent positive and negative values, respectively. Examples:

123
0xABC
-123

An integer literal is interpreted as an INT64.

An integer literal represents a constant value of the integer data type.
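As a rough illustration of these rules, the following Python sketch parses an integer literal and checks that the value fits in a signed 64-bit range. The function name parse_integer_literal is hypothetical, and treating an out-of-range value as an error is an assumption made for the sketch.

```python
import re

# Signed 64-bit bounds, matching the INT64 interpretation described above.
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1

# Optional sign, then either a 0x/0X-prefixed hex value or decimal digits.
INTEGER_LITERAL = re.compile(r"([+-]?)(0[xX][0-9A-Fa-f]+|[0-9]+)")


def parse_integer_literal(text: str) -> int:
    """Parse a decimal or hexadecimal integer literal (hypothetical helper)."""
    match = INTEGER_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer literal: {text!r}")
    sign, digits = match.groups()
    base = 16 if digits[:2].lower() == "0x" else 10
    value = int(digits, base)
    if sign == "-":
        value = -value
    # Assumption for this sketch: values outside INT64 are rejected.
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError(f"out of INT64 range: {text!r}")
    return value
```

With the examples above, "123" parses to 123, "0xABC" to 2748, and "-123" to -123.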

NUMERIC literals

You can construct NUMERIC literals using the NUMERIC keyword followed by a floating point value in quotes.

Examples:

SELECT NUMERIC '0';
SELECT NUMERIC '123456';
SELECT NUMERIC '-3.14';
SELECT NUMERIC '-0.54321';
SELECT NUMERIC '1.23456e05';
SELECT NUMERIC '-9.876e-3';

A NUMERIC literal represents a constant value of the NUMERIC data type.

Floating point literals

Syntax options:

[+-]DIGITS.[DIGITS][e[+-]DIGITS]
[+-][DIGITS].DIGITS[e[+-]DIGITS]
DIGITSe[+-]DIGITS

DIGITS represents one or more decimal digits (0 through 9) and e represents the exponent marker (e or E).

Examples:

123.456e-67
.1E4
58.
4e2
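The three syntax options can be written as a single regular expression. The following Python sketch transcribes them literally; the name is_float_literal is hypothetical, and the pattern only checks the shape of the token, not whether the value fits in a floating-point type.

```python
import re

# One alternative per syntax option, transcribed from the grammar above.
FLOAT_LITERAL = re.compile(
    r"""
      [+-]? [0-9]+ \. [0-9]* (?: [eE] [+-]? [0-9]+ )?  # digits, point, optional digits
    | [+-]? [0-9]* \. [0-9]+ (?: [eE] [+-]? [0-9]+ )?  # optional digits, point, digits
    | [0-9]+ [eE] [+-]? [0-9]+                         # digits with exponent, no point
    """,
    re.VERBOSE,
)


def is_float_literal(text: str) -> bool:
    """Return True if text has the shape of a floating point literal."""
    return FLOAT_LITERAL.fullmatch(text) is not None
```

All four examples above match this pattern. A plain digit sequence such as 123 doesn't match, because it has neither a decimal point nor an exponent marker and is therefore an integer literal.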

Numeric literals that contain either a decimal point or an exponent marker are presumed to be type double.

Implicit coercion of floating point literals to float type is possible if the value is within the valid float range.

There is no literal representation of NaN or infinity, but the following case-insensitive strings can be explicitly cast to float:

  • "NaN"
  • "inf" or "+inf"
  • "-inf"

A floating-point literal represents a constant value of the floating-point data type.

Array literals

Array literals are comma-separated lists of elements enclosed in square brackets. The ARRAY keyword is optional, and an explicit element type T is also optional.

Examples:

[1, 2, 3]
['x', 'y', 'xy']
ARRAY[1, 2, 3]
ARRAY<string>['x', 'y', 'xy']
ARRAY<int64>[]

An array literal represents a constant value of the array data type.

Struct literals

A struct literal is a struct whose fields are all literals. Struct literals can be written using any of the syntaxes for constructing a struct (tuple syntax, typeless struct syntax, or typed struct syntax).

Note that tuple syntax requires at least two fields, in order to distinguish it from an ordinary parenthesized expression. To write a struct literal with a single field, use typeless struct syntax or typed struct syntax.

Example | Output Type
(1, 2, 3) | STRUCT<INT64, INT64, INT64>
(1, 'abc') | STRUCT<INT64, STRING>
STRUCT(1 AS foo, 'abc' AS bar) | STRUCT<foo INT64, bar STRING>
STRUCT<INT64, STRING>(1, 'abc') | STRUCT<INT64, STRING>
STRUCT(1) | STRUCT<INT64>
STRUCT<INT64>(1) | STRUCT<INT64>

A struct literal represents a constant value of the struct data type.

Date literals

Syntax:

DATE 'date_canonical_format'

Date literals contain the DATE keyword followed by date_canonical_format, a string literal that conforms to the canonical date format, enclosed in single quotation marks. Date literals support a range between the years 1 and 9999, inclusive. Dates outside of this range are invalid.

For example, the following date literal represents September 27, 2014:

DATE '2014-09-27'

String literals in canonical date format also implicitly coerce to DATE type when used where a DATE-type expression is expected. For example, in the query

SELECT * FROM foo WHERE date_col = "2014-09-27"

the string literal "2014-09-27" will be coerced to a date literal.

A date literal represents a constant value of the date data type.

Timestamp literals

Syntax:

TIMESTAMP 'timestamp_canonical_format'

Timestamp literals contain the TIMESTAMP keyword and timestamp_canonical_format, a string literal that conforms to the canonical timestamp format, enclosed in single quotation marks.

Timestamp literals support a range between the years 1 and 9999, inclusive. Timestamps outside of this range are invalid.

A timestamp literal can include a numerical suffix to indicate the time zone:

TIMESTAMP '2014-09-27 12:30:00.45-08'

If this suffix is absent, the default time zone, America/Los_Angeles, is used.

For example, the following timestamp represents 12:30 p.m. on September 27, 2014 in the default time zone, America/Los_Angeles:

TIMESTAMP '2014-09-27 12:30:00.45'

For more information about time zones, see Time zone.

String literals with the canonical timestamp format, including those with time zone names, implicitly coerce to a timestamp literal when used where a timestamp expression is expected. For example, in the following query, the string literal "2014-09-27 12:30:00.45 America/Los_Angeles" is coerced to a timestamp literal.

SELECT * FROM foo
WHERE timestamp_col = "2014-09-27 12:30:00.45 America/Los_Angeles"

A timestamp literal can include these optional characters:

  • T or t
  • Z or z

If you use one of these characters, a space can't be included before or after it. These are valid:

TIMESTAMP '2017-01-18T12:34:56.123456Z'
TIMESTAMP '2017-01-18t12:34:56.123456'
TIMESTAMP '2017-01-18 12:34:56.123456z'
TIMESTAMP '2017-01-18 12:34:56.123456Z'

A timestamp literal represents a constant value of the timestamp data type.

Time zone

Since timestamp literals must be mapped to a specific point in time, a time zone is necessary to correctly interpret a literal. If a time zone isn't specified as part of the literal itself, then GoogleSQL uses the default time zone value, which the GoogleSQL implementation sets.

GoogleSQL can represent a time zone using a string, which represents the offset from Coordinated Universal Time (UTC).

Examples:

'-08:00'
'-8:15'
'+3:00'
'+07:30'
'-7'
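To make the offset strings concrete, the following Python sketch converts one into a signed number of minutes from UTC. The accepted shape (a sign, one or two hour digits, and an optional colon followed by one or two minute digits) is inferred from the examples above and is an assumption, as is the function name utc_offset_minutes.

```python
import re

# Assumed shape, inferred from the examples: sign, 1-2 hour digits,
# then an optional ":" with 1-2 minute digits.
UTC_OFFSET = re.compile(r"([+-])([0-9]{1,2})(?::([0-9]{1,2}))?")


def utc_offset_minutes(offset: str) -> int:
    """Convert an offset string such as '-08:00' to minutes from UTC."""
    match = UTC_OFFSET.fullmatch(offset)
    if match is None:
        raise ValueError(f"not a UTC offset: {offset!r}")
    sign, hours, minutes = match.groups()
    total = int(hours) * 60 + int(minutes or 0)
    return -total if sign == "-" else total
```

Under these assumptions, '-08:00' maps to -480 minutes and '+07:30' maps to 450 minutes. The sketch doesn't check that the hours and minutes fall within a realistic range.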

Time zones can also be expressed using string time zone names.

Examples:

TIMESTAMP '2014-09-27 12:30:00 America/Los_Angeles'
TIMESTAMP '2014-09-27 12:30:00 America/Argentina/Buenos_Aires'

Interval literals

An interval literal represents a constant value of the interval data type. There are two types of interval literals:

  • Interval literal with a single datetime part
  • Interval literal with a datetime part range

An interval literal can be used directly inside of the SELECT statement and as an argument in some functions that support the interval data type.

Interval literal with a single datetime part

Syntax:

INTERVAL int64_expression datetime_part

The single datetime part syntax includes an INT64 expression and a single interval-supported datetime part. For example:

-- 0 years, 0 months, 5 days, 0 hours, 0 minutes, 0 seconds (0-0 5 0:0:0)
INTERVAL 5 DAY

-- 0 years, 0 months, -5 days, 0 hours, 0 minutes, 0 seconds (0-0 -5 0:0:0)
INTERVAL -5 DAY

-- 0 years, 0 months, 0 days, 0 hours, 0 minutes, 1 seconds (0-0 0 0:0:1)
INTERVAL 1 SECOND

When a negative sign precedes the year or month part in an interval literal, the negative sign distributes over the years and months. Or, when a negative sign precedes the time part in an interval literal, the negative sign distributes over the hours, minutes, and seconds. For example:

-- -2 years, -1 months, 0 days, 0 hours, 0 minutes, and 0 seconds (-2-1 0 0:0:0)
INTERVAL -25 MONTH

-- 0 years, 0 months, 0 days, -1 hours, -30 minutes, and 0 seconds (0-0 0 -1:30:0)
INTERVAL -90 MINUTE
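The way the negative sign distributes in the two examples above can be reproduced with a little arithmetic. The following Python sketch is only a model of that behavior; the helper name split_with_sign is hypothetical.

```python
def split_with_sign(total: int, units_per_major: int) -> tuple[int, int]:
    """Split a signed count into (major, minor) parts that share its sign.

    For example, months split into (years, months) with units_per_major=12,
    and minutes split into (hours, minutes) with units_per_major=60.
    """
    sign = -1 if total < 0 else 1
    # Divide the magnitude, then apply the sign to both parts.
    major, minor = divmod(abs(total), units_per_major)
    return sign * major, sign * minor
```

With this model, split_with_sign(-25, 12) gives (-2, -1), matching INTERVAL -25 MONTH, and split_with_sign(-90, 60) gives (-1, -30), matching INTERVAL -90 MINUTE.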

For more information on how to construct an interval with a single datetime part, see Construct an interval with a single datetime part.

Interval literal with a datetime part range

Syntax:

INTERVAL datetime_parts_string starting_datetime_part TO ending_datetime_part

The range datetime part syntax includes a datetime parts string, a starting datetime part, and an ending datetime part.

For example:

-- 0 years, 0 months, 0 days, 10 hours, 20 minutes, 30.52 seconds (0-0 0 10:20:30.520)
INTERVAL '10:20:30.52' HOUR TO SECOND

-- 1 year, 2 months, 0 days, 0 hours, 0 minutes, 0 seconds (1-2 0 0:0:0)
INTERVAL '1-2' YEAR TO MONTH

-- 0 years, 1 month, -15 days, 0 hours, 0 minutes, 0 seconds (0-1 -15 0:0:0)
INTERVAL '1 -15' MONTH TO DAY

-- 0 years, 0 months, 1 day, 5 hours, 30 minutes, 0 seconds (0-0 1 5:30:0)
INTERVAL '1 5:30' DAY TO MINUTE

When a negative sign precedes the year or month part in an interval literal, the negative sign distributes over the years and months. Or, when a negative sign precedes the time part in an interval literal, the negative sign distributes over the hours, minutes, and seconds. For example:

-- -23 years, -2 months, 10 days, -12 hours, -30 minutes, and 0 seconds (-23-2 10 -12:30:0)
INTERVAL '-23-2 10 -12:30' YEAR TO MINUTE

-- -23 years, -2 months, 10 days, 0 hours, -30 minutes, and 0 seconds (-23-2 10 -0:30:0)
SELECT INTERVAL '-23-2 10 -0:30' YEAR TO MINUTE

-- Produces an error because the negative sign for minutes must come before the hour.
SELECT INTERVAL '-23-2 10 0:-30' YEAR TO MINUTE

-- Produces an error because the negative sign for months must come before the year.
SELECT INTERVAL '23--2 10 0:30' YEAR TO MINUTE

-- 0 years, -2 months, 10 days, 0 hours, 30 minutes, and 0 seconds (-0-2 10 0:30:0)
SELECT INTERVAL '-2 10 0:30' MONTH TO MINUTE

-- 0 years, 0 months, 0 days, 0 hours, -30 minutes, and -10 seconds (0-0 0 -0:30:10)
SELECT INTERVAL '-30:10' MINUTE TO SECOND

For more information on how to construct an interval with a datetime part range, see Construct an interval with a datetime part range.

Enum literals

There is no syntax for enum literals. Integer or string literals are coerced to the enum type when necessary, or explicitly cast to a specific enum type name. For more information, see Literal coercion.

An enum literal represents a constant value of the enum data type.

JSON literals

Syntax:

JSON 'json_formatted_data'

A JSON literal represents JSON-formatted data.

Example:

JSON '{ "id": 10, "type": "fruit", "name": "apple", "on_menu": true, "recipes": { "salads": [ { "id": 2001, "type": "Walnut Apple Salad" }, { "id": 2002, "type": "Apple Spinach Salad" } ], "desserts": [ { "id": 3001, "type": "Apple Pie" }, { "id": 3002, "type": "Apple Scones" }, { "id": 3003, "type": "Apple Crumble" } ] } }'

A JSON literal represents a constant value of the JSON data type.

Case sensitivity

GoogleSQL follows these rules for case sensitivity:

Category | Case-sensitive? | Notes
Keywords | No |
Function names | No |
Table names | See Notes | Table names are usually case-insensitive, but they might be case-sensitive when querying a database that uses case-sensitive table names.
Column names | No |
Field names | No |
All type names except for protocol buffer type names | No |
Protocol buffer type names | Yes |
Enum type names | Yes |
String values | Yes | Any value of type STRING preserves its case. For example, the result of an expression that produces a STRING value or a column value that's of type STRING.
String comparisons | Yes | However, string comparisons are case-insensitive in collations that are case-insensitive. This behavior also applies to operations affected by collation, such as GROUP BY and DISTINCT clauses.
Aliases within a query | No |
Regular expression matching | See Notes | Regular expression matching is case-sensitive by default, unless the expression itself specifies that it should be case-insensitive.
LIKE matching | Yes |
Property graph names | No |
Property graph label names | No |
Property graph property names | No |

Reserved keywords

Keywords are a group of tokens that have special meaning in the GoogleSQL language, and have the following characteristics:

  • Keywords can't be used as identifiers unless enclosed by backtick (`) characters.
  • Keywords are case-insensitive.

GoogleSQL has the following reserved keywords.

ALL
AND
ANY
ARRAY
AS
ASC
ASSERT_ROWS_MODIFIED
AT
BETWEEN
BY
CASE
CAST
COLLATE
CONTAINS
CREATE
CROSS
CUBE
CURRENT
DEFAULT
DEFINE
DESC
DISTINCT
ELSE
END
ENUM
ESCAPE
EXCEPT
EXCLUDE
EXISTS
EXTRACT
FALSE
FETCH
FOLLOWING
FOR
FROM
FULL
GRAPH_TABLE
GROUP
GROUPING
GROUPS
HASH
HAVING
IF
IGNORE
IN
INNER
INTERSECT
INTERVAL
INTO
IS
JOIN
LATERAL
LEFT
LIKE
LIMIT
LOOKUP
MERGE
NATURAL
NEW
NO
NOT
NULL
NULLS
OF
ON
OR
ORDER
OUTER
OVER
PARTITION
PRECEDING
PROTO
RANGE
RECURSIVE
RESPECT
RIGHT
ROLLUP
ROWS
SELECT
SET
SOME
STRUCT
TABLESAMPLE
THEN
TO
TREAT
TRUE
UNBOUNDED
UNION
UNNEST
USING
WHEN
WHERE
WINDOW
WITH
WITHIN
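Combining this list with the rules for unquoted identifiers gives a simple check for whether a stand-alone identifier needs backticks. The following Python sketch is illustrative only: the name needs_backticks is hypothetical, "letters" is assumed to mean ASCII letters, and the check doesn't apply to later parts of a path expression, where reserved keywords may appear unquoted.

```python
import re

# Reserved keywords, copied from the list above.
RESERVED_KEYWORDS = frozenset("""
ALL AND ANY ARRAY AS ASC ASSERT_ROWS_MODIFIED AT BETWEEN BY CASE CAST
COLLATE CONTAINS CREATE CROSS CUBE CURRENT DEFAULT DEFINE DESC DISTINCT
ELSE END ENUM ESCAPE EXCEPT EXCLUDE EXISTS EXTRACT FALSE FETCH FOLLOWING
FOR FROM FULL GRAPH_TABLE GROUP GROUPING GROUPS HASH HAVING IF IGNORE IN
INNER INTERSECT INTERVAL INTO IS JOIN LATERAL LEFT LIKE LIMIT LOOKUP MERGE
NATURAL NEW NO NOT NULL NULLS OF ON OR ORDER OUTER OVER PARTITION PRECEDING
PROTO RANGE RECURSIVE RESPECT RIGHT ROLLUP ROWS SELECT SET SOME STRUCT
TABLESAMPLE THEN TO TREAT TRUE UNBOUNDED UNION UNNEST USING WHEN WHERE
WINDOW WITH WITHIN
""".split())

# Assumption: "letters" in the unquoted identifier rules means ASCII letters.
UNQUOTED_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def needs_backticks(name: str) -> bool:
    """Return True if a stand-alone identifier must be written in backticks."""
    if UNQUOTED_IDENTIFIER.fullmatch(name) is None:
        return True
    # Keywords are case-insensitive, so compare in upper case.
    return name.upper() in RESERVED_KEYWORDS
```

Under these assumptions, needs_backticks("FROM") and needs_backticks("5abc") return True, while needs_backticks("dataField") returns False.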

Terminating semicolons

You can optionally use a terminating semicolon (;) when you submit a query string statement through an Application Programming Interface (API).

In a request containing multiple statements, you must separate statements with semicolons, but the semicolon is generally optional after the final statement. Some interactive tools require statements to have a terminating semicolon.

Trailing commas

You can optionally use a trailing comma (,) at the end of a column list in a SELECT statement. You might have a trailing comma as the result of programmatically creating a column list.

Example

SELECT name, release_date, FROM Books
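As an illustration of how a trailing comma can arise, the following Python sketch builds a column list by appending a comma after every column, including the last one. The function name build_select is hypothetical, and the sketch assumes that the column and table names are trusted, valid unquoted identifiers.

```python
def build_select(columns: list[str], table: str) -> str:
    """Build a SELECT statement whose column list ends with a trailing comma.

    Assumes `columns` and `table` are trusted, valid unquoted identifiers.
    """
    # Every column, including the last, is followed by a comma.
    column_list = "".join(f"{column}, " for column in columns)
    return f"SELECT {column_list}FROM {table}"
```

Calling build_select(["name", "release_date"], "Books") produces the statement shown in the example above, so the generator doesn't need a special case for the final column.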

Query parameters

You can use query parameters to substitute arbitrary expressions. However, query parameters can't be used to substitute identifiers, column names, table names, or other parts of the query itself. Query parameters are defined outside of the query statement.

Client APIs allow the binding of parameter names to values; the query engine substitutes a bound value for a parameter at execution time.

Parameterized queries have better query cache hit rates, resulting in lower query latency and lower overall CPU usage.

For example, instead of using a query like the following:

SELECT AlbumId FROM Albums
WHERE SEARCH(AlbumTitle_Tokens, 'cat')

use the following syntax:

SELECT AlbumId FROM Albums
WHERE SEARCH(AlbumTitle_Tokens, @p)

Spanner runs the query optimizer on distinct SQL. The fewer distinct SQL instances the application uses, the fewer times the query optimization is invoked.

Named query parameters

Syntax:

@parameter_name

A named query parameter is denoted using an identifier preceded by the @ character.

A named query parameter can start with an identifier or a reserved keyword. An identifier can be unquoted or quoted.

Example:

This example returns all rows where LastName is equal to the value of the named query parameter myparam.

SELECT * FROM Roster WHERE LastName = @myparam

Hints

@{ hint [, ...] }

hint:
  [engine_name.]hint_name = value

The purpose of a hint is to modify the execution strategy for a query without changing the result of the query. Hints generally don't affect query semantics, but may have performance implications. These hint types are available:

Hint syntax requires the @ character followed by curly braces. You can create one hint or a group of hints. The optional engine_name. prefix allows for multiple engines to define hints with the same hint_name. This is important if you need to suggest different engine-specific execution strategies or different engines support different hints.

You can assign identifiers and literals to hints.

  • Identifiers are useful for hints that are meant to act like enums. You can use an identifier to avoid using a quoted string. In the resolved AST, identifier hints are represented as string literals, so @{hint="abc"} is the same as @{hint=abc}. Identifier hints can also be used for hints that take a table name or column name as a single identifier.
  • NULL literals are allowed and are inferred as integers.

Hints are meant to apply only to the node they are attached to, and not to a larger scope. For example, a hint on a JOIN in the middle of the FROM clause is meant to apply to that JOIN only, and not other JOINs in the FROM clause. Statement-level hints can be used for hints that modify execution of an entire statement, for example an overall memory budget or deadline.

Examples

In this example, a literal is assigned to a hint. This hint is only used with two database engines called database_engine_a and database_engine_b. The value for the hint is different for each database engine.

@{ database_engine_a.file_count=23, database_engine_b.file_count=10 }

In this example, an identifier is assigned to a hint. There are unique identifiers for each hint type. You can view a list of hint types at the beginning of this topic.

@{ JOIN_METHOD=HASH_JOIN }

Comments

Comments are sequences of characters that the parser ignores. GoogleSQL supports the following types of comments.

Single-line comments

Use a single-line comment if you want the comment to appear on a line by itself.

Examples

# this is a single-line comment
SELECT book FROM library;

-- this is a single-line comment
SELECT book FROM library;

/* this is a single-line comment */
SELECT book FROM library;

SELECT book FROM library
/* this is a single-line comment */
WHERE book = "Ulysses";

Inline comments

Use an inline comment if you want the comment to appear on the same line as a statement. A comment that's prepended with # or -- must appear to the right of a statement.

Examples

SELECT book FROM library; # this is an inline comment

SELECT book FROM library; -- this is an inline comment

SELECT book FROM library; /* this is an inline comment */

SELECT book FROM library /* this is an inline comment */ WHERE book = "Ulysses";

Multiline comments

Use a multiline comment if you need the comment to span multiple lines. Nested multiline comments aren't supported.

Examples

SELECT book FROM library
/*
  This is a multiline comment
  on multiple lines
*/
WHERE book = "Ulysses";

SELECT book FROM library
/* this is a multiline comment
on two lines */
WHERE book = "Ulysses";

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.