Life of a Spanner Query

Client

Spanner supportsSQL queries. Here's a sample query:

SELECTs.SingerId,s.FirstName,s.LastName,s.SingerInfoFROMSingersASsWHEREs.FirstName=@firstName;

The construct@firstName is a reference to a query parameter. You can use aquery parameter anywhere a literal value can be used. Using parameters inprogrammatic APIs is strongly recommended. Use of query parameters helps avoidSQL injection attacks and the resulting queries are more likelyto benefit from various server-side caches. For more information, seeCaching.

Query parameters must be bound to a value when the query is executed. Forexample:

Statementstatement=Statement.newBuilder("SELECT s.SingerId...").bind("firstName").to("Jimi").build();try(ResultSetresultSet=dbClient.singleUse().executeQuery(statement)){while(resultSet.next()){...}}

Once Spanner receives an API call, it analyzes the query andbound parameters to determine which Spanner server node shouldprocess the query. The server sends back a stream of result rows that areconsumed by the calls toResultSet.next().

Query execution

Query execution begins with the arrival of an "execute query" request at someSpanner server. The server performs the following steps:

Validate the request
Parse the query text
Generate an initial query algebra
Generate an optimized query algebra
Generate an executable query plan
Execute the plan (check permissions, read data, encode results, etc.)

Parsing

The SQL parser analyzes the query text and converts it to anabstract syntaxtree. It extracts the basic query structure(SELECT …FROM … WHERE …) and does syntactic checks.

Algebra

Spanner'stype system can represent scalars,arrays, structures, etc. The query algebra defines operators for table scans,filtering, sorting/grouping, all sorts of joins, aggregation, and much more. Theinitial query algebra is built from the output of the parser. Field namereferences in the parse tree are resolved using the database schema. This codealso checks for semantic errors (e.g., incorrect number of parameters, typemismatches, and so forth).

The next step ("query optimization") takes the initial algebra and generates amore-optimal algebra. This might be simpler, more efficient, or just more-suitedto the capabilities of the execution engine. For example, the initial algebramight specify just a "join" while the optimized algebra specifies a "hash join".

Execution

The final executable query plan is built from the rewritten algebra. Basically,the executable plan is adirected acyclic graph of"iterators". Each iterator exposes a sequence of values. Iterators may consumeinputs to produce outputs (e.g., sort iterator). Queries that involve a singlesplit can be executed by a single server (the one that holds the data).The server will scan ranges from various tables, execute joins, performaggregation, and all other operations defined by the query algebra.

Queries that involve multiple splits will be factored into multiple pieces. Somepart of the query will continue to be executed on the main (root) server. Otherpartial subqueries are handed-off to leaf nodes (those that own the splits beingread). This hand-off can be recursively applied for complex queries, resultingin a tree of server executions. All servers agree on a timestamp so that thequery results are a consistent snapshot of the data. Each leaf server sends backa stream of partial results. For queries involving aggregation, these could bepartially-aggregated results. The query root server processes results from theleaf servers and runs the remainder of the query plan. For more information, seeQuery execution plans.

When a query involves multiple splits, Spanner can execute thequery in parallel across the splits. The degree of parallelism depends on therange of data that the query scans, the query execution plan, and thedistribution of data across splits. Spanner automatically setsthe maximum degree of parallelism for a query based on its instance size andinstance configuration (regional or multi-region) in orderto achieve optimal query performance and avoidoverloading the CPU.

Caching

Many of the artifacts of query processing are automatically cached and re-usedfor subsequent queries. This includes query algebras, executable query plans,etc. The caching is based on the query text, names and types of boundparameters. This is why using bound parameters (like@firstName inthe example above) is better than using literal values in the query text. Theformer can be cached once and reused regardless of the actual bound value. SeeOptimizing Spanner Query Performancefor more details.

Error handling

TheexecuteQuery (orexecuteStreamingSql) andstreamingRead methods returna stream ofPartialResultSet messages. For efficiency, a single row orcolumn value might be split across multiplePartialResultSet messages,especially for large data.

This stream can be interrupted by transient network errors, split handoffs,or server restarts. Split handoffs might occur during load balancing and serverrestarts might occur during upgrades.

To handle these interruptions, Spanner includes opaqueresume_token strings in somePartialResultSet messages.

Key points aboutresume_token:

Not everyPartialResultSet contains aresume_token.
Aresume_token is typically included only at the end of a complete row,marking a safe resumption point.
PartialResultSet with achunked_value (for large values splitacross messages) won't have aresume_token until the entire value androw are sent.
To resume an interrupted stream, send a new request with thelast received non-emptyresume_token.

The Spanner client libraries automatically manage this bufferingand recovery. They assemble complete rows fromPartialResultSet messagesand track the latestresume_token. If the connection drops, the library usesthe last valid token to restart the stream, discarding any partial datareceived after that token. This process ensures you see a continuous,duplicate-free stream of complete rows, even if transient failures occur.

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-17 UTC.

Movatterモバイル変換

Life of a Spanner Query Stay organized with collections Save and categorize content based on your preferences.