Query execution plans Stay organized with collections Save and categorize content based on your preferences.
Overview
This page provides concepts about query execution plans and how they are used bySpanner to perform queries in a distributed environment. To learn howto retrieve an execution plan for a specific query using theGoogle Cloud console, seeUnderstand how Spanner executesqueries. You can also view sampled historic query plans and compare theperformance of a query over time for certain queries. To learn more, seeSampled query plans.
Spanner uses declarative SQL statements to query its databases.SQL statements definewhat the user wants without specifyinghow to obtainthe results. Aquery execution plan is the set of steps for how the resultsare obtained. For a given SQL statement, there may be multiple ways to obtainthe results. The Spannerquery optimizer evaluatesdifferent execution plans and chooses the one it considers to be most efficient.Spanner then uses the execution plan to retrieve the results.Execution plans support GoogleSQL-dialect databases and PostgreSQL-dialect databases.
Conceptually, an execution plan is a tree of relational operators. Each operatorreads rows from its input(s) and produces output rows. The result of theoperator at the root of the execution is returned as the result of the SQLquery.
As an example, this query:
SELECT s.SongName FROM Songs AS s;results in a query execution plan that can be visualized as:

The queries and execution plans on this page are based on the following databaseschema:
CREATETABLESingers(SingerIdINT64NOTNULL,FirstNameSTRING(1024),LastNameSTRING(1024),SingerInfoBYTES(MAX),BirthDateDATE)PRIMARYKEY(SingerId);CREATEINDEXSingersByFirstLastNameONSingers(FirstName,LastName);CREATETABLEAlbums(SingerIdINT64NOTNULL,AlbumIdINT64NOTNULL,AlbumTitleSTRING(MAX),MarketingBudgetINT64)PRIMARYKEY(SingerId,AlbumId),INTERLEAVEINPARENTSingersONDELETECASCADE;CREATEINDEXAlbumsByAlbumTitleONAlbums(AlbumTitle);CREATEINDEXAlbumsByAlbumTitle2ONAlbums(AlbumTitle)STORING(MarketingBudget);CREATETABLESongs(SingerIdINT64NOTNULL,AlbumIdINT64NOTNULL,TrackIdINT64NOTNULL,SongNameSTRING(MAX),DurationINT64,SongGenreSTRING(25))PRIMARYKEY(SingerId,AlbumId,TrackId),INTERLEAVEINPARENTAlbumsONDELETECASCADE;CREATEINDEXSongsBySingerAlbumSongNameDescONSongs(SingerId,AlbumId,SongNameDESC),INTERLEAVEINAlbums;CREATEINDEXSongsBySongNameONSongs(SongName);CREATETABLEConcerts(VenueIdINT64NOTNULL,SingerIdINT64NOTNULL,ConcertDateDATENOTNULL,BeginTimeTIMESTAMP,EndTimeTIMESTAMP,TicketPricesARRAY<INT64>)PRIMARYKEY(VenueId,SingerId,ConcertDate);You can use the following Data Manipulation Language (DML) statements to adddata to these tables:
INSERTINTOSingers(SingerId,FirstName,LastName,BirthDate)VALUES(1,"Marc","Richards","1970-09-03"),(2,"Catalina","Smith","1990-08-17"),(3,"Alice","Trentor","1991-10-02"),(4,"Lea","Martin","1991-11-09"),(5,"David","Lomond","1977-01-29");INSERTINTOAlbums(SingerId,AlbumId,AlbumTitle)VALUES(1,1,"Total Junk"),(1,2,"Go, Go, Go"),(2,1,"Green"),(2,2,"Forever Hold Your Peace"),(2,3,"Terrified"),(3,1,"Nothing To Do With Me"),(4,1,"Play");INSERTINTOSongs(SingerId,AlbumId,TrackId,SongName,Duration,SongGenre)VALUES(2,1,1,"Let's Get Back Together",182,"COUNTRY"),(2,1,2,"Starting Again",156,"ROCK"),(2,1,3,"I Knew You Were Magic",294,"BLUES"),(2,1,4,"42",185,"CLASSICAL"),(2,1,5,"Blue",238,"BLUES"),(2,1,6,"Nothing Is The Same",303,"BLUES"),(2,1,7,"The Second Time",255,"ROCK"),(2,3,1,"Fight Story",194,"ROCK"),(3,1,1,"Not About The Guitar",278,"BLUES");Obtaining efficient execution plans is challenging because Spannerdivides data intosplits. Splits can move independently from each otherand get assigned to different servers, which could be in different physicallocations. To evaluate execution plans over the distributed data,Spanner uses execution based on:
- local execution ofsubplans in servers that contain the data
- orchestration and aggregation of multiple remote executions with aggressivedistribution pruning
Spanner uses the primitive operatordistributed union,along with its variantsdistributed cross apply anddistributed outer apply, to enable this model.
Sampled query plans
Spanner sampled query plans allow you to view samples of historicquery plans and compare the performance of a query over time. Not all querieshave sampled query plans available. Only queries that consume higher CPU mightbe sampled. The data retention for Spanner query plan samples is30 days. You can find query plan samples on theQuery insights page ofthe Google Cloud console. For instructions, seeView sampled query plans.
The anatomy of a sampled query plan is the same as a regular query executionplan. For more information on how to understand visual plans and use them todebug your queries, seeA tour of the query plan visualizer.
Common use cases for sampled query plans:
Some common use cases for sampled query plans include:
- Observe query plan changes due toschema changes(for example, adding or removing an index).
- Observe query plan changes due to anoptimizer version update.
- Observe query plan changes due tonew optimizer statistics,which are collected every three days automatically or performed manually usingthe
ANALYZEcommand.
If the performance of a query shows significant difference over time or if youwant to improve the performance of a query, seeSQL best practices toconstruct optimized query statements that help Spanner findefficient execution plans.
Life of a query
A SQL query in Spanner is first compiled into an execution plan,then it is sent to an initialroot server for execution. The root server ischosen so as to minimize the number of hops to reach the data being queried. Theroot server then:
- initiates remote execution of subplans (if necessary)
- waits for results from the remote executions
- handles any remaining local execution steps such as aggregating results
- returns results for the query
Remote servers that receive a subplan act as a "root" server for theirsubplan, following the same model as the topmost root server. The result is atree of remote executions. Conceptually, query execution flows from top tobottom, and query results are returned from bottom to top.The following diagramshows this pattern:
The following examples illustrate this pattern in more detail.
Aggregate queries
An aggregate query implementsGROUP BY queries.
For example, using this query:
SELECT s.SingerId, COUNT(*) AS SongCountFROM Songs AS sWHERE s.SingerId < 100GROUP BY s.SingerId;These are the results:
+----------+-----------+| SingerId | SongCount |+----------+-----------+| 3 | 1 || 2 | 8 |+----------+-----------+Conceptually, this is the execution plan:

Spanner sends the execution plan to a root server thatcoordinates the query execution and performs the remote distribution of subplans.
This execution plan starts with adistributed union, which distributessubplans to remote servers whose splits satisfySingerId < 100. After the scanon individual splits completes, thestream aggregate operator aggregates rowsto get the counts for eachSingerId. Theserialize result operator thenserializes the result. Finally, thedistributed union combines all resultstogether and returns the query results.
You can learn more about aggregates ataggregate operator.
Co-located join queries
Interleaved tables are physically stored with their rows of related tablesco-located. Aco-located join is a join between interleaved tables. Co-locatedjoins can offer performance benefits over joins that require indexes or backjoins.
For example, using this query:
SELECT al.AlbumTitle, so.SongNameFROM Albums AS al, Songs AS soWHERE al.SingerId = so.SingerId AND al.AlbumId = so.AlbumId;(This query assumes thatSongs is interleaved inAlbums.)
These are the results:
+-----------------------+--------------------------+| AlbumTitle | SongName |+-----------------------+--------------------------+| Nothing To Do With Me | Not About The Guitar || Green | The Second Time || Green | Starting Again || Green | Nothing Is The Same || Green | Let's Get Back Together || Green | I Knew You Were Magic || Green | Blue || Green | 42 || Terrified | Fight Story |+-----------------------+--------------------------+This is the execution plan:

This execution plan starts with adistributed union, whichdistributes subplans to remote servers that have splits of the tableAlbums.BecauseSongs is an interleaved table ofAlbums, each remote server is ableto execute the entire subplan on each remote server without requiring a join toa different server.
The subplans contain across apply. Each cross apply performs atablescan on tableAlbums to retrieveSingerId,AlbumId, andAlbumTitle. The cross apply then maps output from the table scan to outputfrom anindex scan on indexSongsBySingerAlbumSongNameDesc, subject to afilter of theSingerId in the index matching theSingerId from thetable scan output. Each cross apply sends its results to aserialize resultoperator which serializes theAlbumTitle andSongName data and returnsresults to the localdistributed unions. The distributed union aggregatesresults from the local distributed unions and returns them as the query result.
Index and back join queries
The example above used a join on two tables, one interleaved in the other.Execution plans are more complex and less efficient when two tables, or a tableand an index, are not interleaved.
Consider an index created with the following command:
CREATE INDEX SongsBySongName ON Songs(SongName)Use this index in this query:
SELECT s.SongName, s.DurationFROM Songs@{force_index=SongsBySongName} AS sWHERE STARTS_WITH(s.SongName, "B");These are the results:
+----------+----------+| SongName | Duration |+----------+----------+| Blue | 238 |+----------+----------+This is the execution plan:

The resulting execution plan is complicated because the indexSongsBySongNamedoes not contain columnDuration. To obtain theDuration value,Spanner needs toback join the indexed results to the tableSongs. This is a join but it is not co-located because theSongs table andthe global indexSongsBySongName are not interleaved. The resulting executionplan is more complex than the co-located join example becauseSpanner performs optimizations to speed up the execution if dataisn't co-located.
The top operator is adistributed cross apply. This input side ofthis operator are batches of rows from the indexSongsBySongName that satisfythe predicateSTARTS_WITH(s.SongName, "B"). The distributed cross applythen maps these batches to remote servers whose splits contain theDurationdata. The remote servers use atable scan to retrieve theDuration column.The table scan uses the filterCondition:($Songs_key_TrackId' =$batched_Songs_key_TrackId), which joinsTrackId from theSongs table toTrackId of the rows that were batched from the indexSongsBySongName.
The results are aggregated into the final query answer. In turn, the input sideof the distributed cross apply contains a distributed union/local distributedunion pair to evaluate rows from the index that satisfy theSTARTS_WITHpredicate.
Consider a slightly different query that doesn't select thes.Duration column:
SELECT s.SongNameFROM Songs@{force_index=SongsBySongName} AS sWHERE STARTS_WITH(s.SongName, "B");This query is able to fully leverage the index as shown in this execution plan:

The execution plan doesn't require a back join because all the columns requestedby the query are present in the index.
What's next
Learn aboutQuery execution operators
Learn about the Spannerquery optimizer
Learn how toManage the query optimizer
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.