Query execution operators

This page describes details about operators used in SpannerQuery execution plans. To learn how to retrieve an execution plan for aspecific query using the Google Cloud console, seeUnderstanding howSpanner executes queries.

The queries and execution plans on this page are based on the following databaseschema:

CREATETABLESingers(SingerIdINT64NOTNULL,FirstNameSTRING(1024),LastNameSTRING(1024),SingerInfoBYTES(MAX),BirthDateDATE)PRIMARYKEY(SingerId);CREATEINDEXSingersByFirstLastNameONSingers(FirstName,LastName);CREATETABLEAlbums(SingerIdINT64NOTNULL,AlbumIdINT64NOTNULL,AlbumTitleSTRING(MAX),MarketingBudgetINT64)PRIMARYKEY(SingerId,AlbumId),INTERLEAVEINPARENTSingersONDELETECASCADE;CREATEINDEXAlbumsByAlbumTitleONAlbums(AlbumTitle);CREATEINDEXAlbumsByAlbumTitle2ONAlbums(AlbumTitle)STORING(MarketingBudget);CREATETABLESongs(SingerIdINT64NOTNULL,AlbumIdINT64NOTNULL,TrackIdINT64NOTNULL,SongNameSTRING(MAX),DurationINT64,SongGenreSTRING(25))PRIMARYKEY(SingerId,AlbumId,TrackId),INTERLEAVEINPARENTAlbumsONDELETECASCADE;CREATEINDEXSongsBySingerAlbumSongNameDescONSongs(SingerId,AlbumId,SongNameDESC),INTERLEAVEINAlbums;CREATEINDEXSongsBySongNameONSongs(SongName);CREATETABLEConcerts(VenueIdINT64NOTNULL,SingerIdINT64NOTNULL,ConcertDateDATENOTNULL,BeginTimeTIMESTAMP,EndTimeTIMESTAMP,TicketPricesARRAY<INT64>)PRIMARYKEY(VenueId,SingerId,ConcertDate);

You can use the following Data Manipulation Language (DML) statements to adddata to these tables:

INSERTINTOSingers(SingerId,FirstName,LastName,BirthDate)VALUES(1,"Marc","Richards","1970-09-03"),(2,"Catalina","Smith","1990-08-17"),(3,"Alice","Trentor","1991-10-02"),(4,"Lea","Martin","1991-11-09"),(5,"David","Lomond","1977-01-29");INSERTINTOAlbums(SingerId,AlbumId,AlbumTitle)VALUES(1,1,"Total Junk"),(1,2,"Go, Go, Go"),(2,1,"Green"),(2,2,"Forever Hold Your Peace"),(2,3,"Terrified"),(3,1,"Nothing To Do With Me"),(4,1,"Play");INSERTINTOSongs(SingerId,AlbumId,TrackId,SongName,Duration,SongGenre)VALUES(2,1,1,"Let's Get Back Together",182,"COUNTRY"),(2,1,2,"Starting Again",156,"ROCK"),(2,1,3,"I Knew You Were Magic",294,"BLUES"),(2,1,4,"42",185,"CLASSICAL"),(2,1,5,"Blue",238,"BLUES"),(2,1,6,"Nothing Is The Same",303,"BLUES"),(2,1,7,"The Second Time",255,"ROCK"),(2,3,1,"Fight Story",194,"ROCK"),(3,1,1,"Not About The Guitar",278,"BLUES");
Note: You can run queries and retrieve execution plans even if the tables haveno data.

Leaf operators

Aleaf operator is an operator that has no children. The types of leafoperators are:

Array unnest

Anarray unnest operator flattens an input array into rows of elements. Eachresulting row contains up to two columns: the actual value from the array, andoptionally the zero-based position in the array.

For example, using this query:

SELECTa,bFROMUNNEST([1,2,3])aWITHOFFSETb;

The query flattens the array[1,2,3] in columna and shows the arrayposition in columnb.

These are the results:

ab
10
21
32

This is the execution plan:

array unnest operator

Generate relation

Agenerate relation operator returns zero or more rows.

Unit relation

Theunit relation returns one row. It is a special case of thegeneraterelation operator.

For example, using this query:

SELECT1+2ASResult;

The result is:

Result
3

This is the execution plan:

unit relation operator

Empty relation

Theempty relation returns no rows. It is a special case of thegeneraterelation operator.

For example, using this query:

SELECT*FROMalbumsLIMIT0

The result is:

No results

This is the execution plan:

empty relation operator

Scan

Ascan operator returns rows by scanning a source of rows. These are the typesof scan operators:

  • Table scan: The scan occurs on a table.
  • Index scan: The scan occurs on an index.
  • Batch scan: The scan occurs on intermediate tables created by otherrelational operators (for example, a table created by adistributed crossapply).

Whenever possible, Spanner applies predicates on keys as part ofa scan. Scans execute more efficiently when predicates are applied because thescan doesn't need to read the entire table or index. Predicates appear in theexecution plan in the formKeyPredicate: column=value.

In the worst case, a query may need to look up all the rows in a table. Thissituation leads to a full scan, and appears in the execution plan asfull scan:true.

For example, using this query:

SELECTs.lastnameFROMsingers@{FORCE_INDEX=SingersByFirstLastName}assWHEREs.firstname='Catalina';

These are the results:

LastName
Smith

This is the execution plan:

scan operator

In the execution plan, the top-leveldistributed unionoperator sends subplans to remote servers. Each subplan has aserializeresult operator and an index scan operator. The predicateKey Predicate: FirstName = 'Catalina' restricts the scan to rows in the indexSingersByFirstLastname that haveFirstName equal toCatalina. The outputof the index scan is returned to the serialize result operator.

Unary operators

Aunary operator is an operator that has a single relational child.

The following operators are unary operators:

Aggregate

Anaggregate operator implementsGROUP BY SQL statements and aggregatefunctions (such asCOUNT). The input for an aggregate operator is logicallypartitioned into groups arranged on key columns (or into a single group ifGROUP BY isn't present). For each group, zero or more aggregates are computed.

For example, using this query:

SELECTs.singerid,Avg(s.duration)ASaverage,Count(*)AScountFROMsongsASsGROUPBYsingerid;

The query groups bySingerId and performs anAVG aggregation and aCOUNTaggregation.

These are the results:

SingerIdaveragecount
32781
2225.8758

This is the execution plan:

aggregate operator

Aggregate operators can bestream-based orhash-based. The previousexecution plan shows a stream-based aggregate. Stream-based aggregates read fromalready pre-sorted input (ifGROUP BY is present) and compute the groupwithout blocking. Hash-based aggregates build hash tables to maintainincremental aggregates of multiple input rows simultaneously. Stream-basedaggregates are faster and use less memory than hash-based aggregates, butrequire the input to be sorted (either by key columns orsecondaryindexes).

For distributed scenarios, an aggregate operator can be separated into alocal-global pair. Each remote server performs the local aggregation on itsinput rows, and then returns its results to the root server. The root serverperforms the global aggregation.

Apply mutations

Anapply mutations operator applies the mutationsfrom aData Manipulation Statement (DML) to thetable. It's the top operator in a query plan for a DML statement.

For example, using this query:

DELETEFROMsingersWHEREfirstname='Alice';

These are the results:

4rowsdeletedThisstatementdeleted4rowsanddidnotreturnanyrows.

This is the execution plan:

apply mutations operator

Create batch

Acreate batch operator batches its input rows into a sequence. A create batchoperation usually occurs as a part of adistributed crossapply operation. The input rows can be re-orderedduring the batching. The number of input rows that get batched in each executionof the batch operator varies.

See thedistributed cross apply operator for anexample of a create batch operator in an execution plan.

Compute

Acompute operator produces output by reading its input rows and adding one ormore additional columns that are computed using scalar expressions. See theunion all operator for an example of a compute operator in anexecution plan.

Compute struct

Acompute struct operator creates a variable for a structure that containsfields for each of the input columns.

For example, using this query:

SELECTFirstName,ARRAY(SELECTASSTRUCTsong.SongName,song.SongGenreFROMSongsASsongWHEREsong.SingerId=singer.SingerId)FROMsingersASsingerWHEREsinger.SingerId=3;

These are the results:

FirstNameUnspecified
Alice[["Not About The Guitar","BLUES"]]

This is the execution plan:

compute struct operator

In the execution plan, the array subquery operator receives input from adistributed union operator, which receives input from acompute struct operator. The compute struct operator creates a structure fromtheSongName andSongGenre columns in theSongs table.

DataBlockToRowAdapter

ADataBlockToRowAdapter operator is automatically inserted by theSpanner query optimizer between a pair of operators that operateusing different execution methods. Its input is an operator using thebatch-oriented execution method and its output is fed into an operator executingin the row-oriented execution method. For more information, seeOptimize queryexecution.

Filter

Afilter operator reads all rows from its input, applies a scalar predicate oneach row, and then returns only the rows that satisfy the predicate.

For example, using this query:

SELECTs.lastnameFROM(SELECTs.lastnameFROMsingersASsLIMIT3)sWHEREs.lastnameLIKE'Rich%';

These are the results:

LastName
Richards

This is the execution plan:

filter operator

The predicate for singers whose last name starts withRich is implemented as afilter. The filter's input is the output from an indexscan, and thefilter's output are rows whereLastName starts withRich.

For performance, whenever a filter is directly positioned above ascan,the filter impacts how data is read. For example, consider a table with keyk.A filter with predicatek = 5 directly on top of a scan of the table looks forrows that matchk = 5, without reading the entire input. This results in moreefficient execution of the query. In the previous example, the filter operatorreads only the rows that satisfy theWHERE s.LastName LIKE 'Rich%' predicate.

Filter scan

Afilter scan operator is always on top of atable or index scan. Itworks with the scan to reduce the number of rows read from the database, and theresulting scan is typically faster than with afilter.Spanner applies the filter scan in certain conditions:

  • Seekable condition: The seekable condition applies if Spannercan determine a specific row to access in the table. In general, thishappens when the filter is on a prefix of the primary key. For example, ifthe primary key consists ofCol1 andCol2, then aWHERE clause thatincludes explicit values forCol1, orCol1 andCol2 is seekable. Inthat case, Spanner reads data only within the key range.
  • Residual condition: Any other condition where Spanner canevaluate the scan to limit the amount of data that's read.

For example, using this query:

SELECTlastnameFROMsingersWHEREsingerid=1

These are the results:

LastName
Richards

This is the execution plan:

filter scan operator

Limit

Alimit operator constrains the number of rows returned. An optionalOFFSETparameter specifies the starting row to return. For distributed scenarios, alimit operator can be separated into a local-global pair. Each remote serverapplies the local limit for its output rows, and then returns its results to theroot server. The root server aggregates the rows sent by the remote servers andthen applies the global limit.

For example, using this query:

SELECTs.songnameFROMsongsASsLIMIT3;

These are the results:

SongName
Not About The Guitar
The Second Time
Starting Again

This is the execution plan:

limit operator

The local limit is the limit for each remote server. The root server aggregatesthe rows from the remote servers and then applies the global limit.

Random ID Assign

Arandom ID assign operator produces output by reading its input rows andadding a random number to each row. It works with aFilter orSort operatorto achieve sampling methods. Supported sampling methods areBernoulliandReservoir.

For example, the following query uses Bernoulli sampling with a sampling rate of10 percent.

SELECTs.songnameFROMsongsASsTABLESAMPLEbernoulli(10PERCENT);

These are the results:

SongName
Starting Again
Nothing Is The Same

Note that because the result is a sample, the result could vary each time thequery is run even though the query is the same.

This is the execution plan:

bernoulli sample operator

In this execution plan, theRandom Id Assign operator receives its input fromadistributed union operator, which receives its inputfrom anindex scan. The operator returns the rows with random ids andtheFilter operator then applies a scalar predicate on the random ids andreturns approximately 10 percent of the rows.

The following example usesReservoir

sampling with a sampling rate of 2 rows.

SELECTs.songnameFROMsongsASsTABLESAMPLEreservoir(2rows);

These are the results:

SongName
I Knew You Were Magic
The Second Time

Note that because the result is a sample, the result could vary each time thequery is run even though the query is the same.

This is the execution plan:

reservoir sample operator

In this execution plan, theRandom Id Assign operator receives its input fromadistributed union operator, which receives its inputfrom anindex scan. The operator returns the rows with random ids andtheSort operator then applies the sort order on the random ids and applyLIMIT with 2 rows.

Local split union

Alocal split union operator finds tablesplits stored on the localserver, runs a subquery on each split, and then creates a union that combinesall results.

Alocal split union appears in execution plans that scan aplacementtable. Placements can increase the number of splits in a table, making it moreefficient to scan splits in batches based on their physical storage locations.

For example, suppose theSingers table uses a placement key to partitionsinger data:

CREATETABLESingers(SingerIdINT64NOTNULL,SingerNameSTRING(MAX)NOTNULL,...LocationSTRING(MAX)NOTNULLPLACEMENTKEY)PRIMARYKEY(SingerId);

Now, consider this query:

SELECTBirthDateFROMSingers;

This is the execution plan:

local split union operator

Thedistributed union sends a subquery to each batch ofsplits physically stored together in the same server. On each server, thelocalsplit union finds splits storingSingers data, executes the subquery on eachsplit, and returns the combined results. In this way, the distributed union andlocal split union work together to efficiently scan theSingers table. Withouta local split union, the distributed union would send one RPC per split insteadof per split batch, resulting in redundant RPC round trips when there's morethan one split per batch.

RowToDataBlockAdapter

ARowToDataBlockAdapter operator is automatically inserted by theSpanner query optimizer between a pair of operators that operateusing different execution methods. Its input is an operator using therow-oriented execution method and its output is fed into an operator executingin the batch-oriented execution method. For more information, seeOptimizequery execution.

Serialize result

Aserialize result operator is a special case of the compute struct operatorthat serializes each row of the final result of the query, for returning to theclient.

For example, using this query:

SELECTarray(selectasstructso.songname,so.songgenreFROMsongsASsoWHEREso.singerid=s.singerid)FROMsingersASs;

The query asks for an array ofSongNameandSongGenre based onSingerId.

These are the results:

Unspecified
[]
[[Let's Get Back Together, COUNTRY], [Starting Again, ROCK]]
[[Not About The Guitar, BLUES]]
[]
[]

This is the execution plan:

Serialize result operator

The serialize result operator creates a result that contains, for each row oftheSingers table, an array ofSongName andSongGenre pairs for the songsby the singer.

Sort

Asort operator reads its input rows, orders them by column(s), and thenreturns the sorted results.

For example, using this query:

SELECTs.songgenreFROMsongsASsORDERBYsonggenre;

These are the results:

SongGenre
BLUES
BLUES
BLUES
BLUES
CLASSICAL
COUNTRY
ROCK
ROCK
ROCK

This is the execution plan:

sort operator

In this execution plan, the sort operator receives its input rows from adistributed union operator, sorts the input rows, andreturns the sorted rows to aserialize result operator.

To constrain the number of rows returned, a sort operator can optionally haveLIMIT andOFFSET parameters. For distributed scenarios, a sort operator withaLIMIT orOFFSET operator is separated into a local-global pair. Eachremote server applies the sort order and the local limit or offset for its inputrows, and then returns its results to the root server. The root serveraggregates the rows sent by the remote servers, sorts them, and then applies theglobal limit/offset.

For example, using this query:

SELECTs.songgenreFROMsongsASsORDERBYsonggenreLIMIT3;

These are the results:

SongGenre
BLUES
BLUES
BLUES

This is the execution plan:

sort operator with limit

The execution plan shows the local limit for the remote servers and the globallimit for the root server.

TVF

Atable valued function operator produces output by reading its input rows andapplying the specified function. The function might implement mapping and returnthe same number of rows as input. It can also be a generator that returns morerows or a filter that returns less rows.

For example, using this query:

SELECTgenre,songnameFROMml.predict(modelgenreclassifier,TABLEsongs)

These are the results:

GenreSongName
CountryNot About The Guitar
RockThe Second Time
PopStarting Again
PopNothing Is The Same
CountryLet's Get Back Together
PopI Knew You Were Magic
ElectronicBlue
Rock42
RockFight Story

This is the execution plan:

tvf operator

Union input

Aunion input operator returns results to aunion all operator.See theunion all operator for an example of a union inputoperator in an execution plan.

Binary operators

Abinary operator is an operator that has two relational children. Thefollowing operators are binary operators:

Cross apply

Across apply operator runs a table query on each row retrieved by a query ofanother table, and returns the union of all the table query runs. Cross applyandouter apply operators execute row-oriented processing,unlike operators that execute set-based processing such ashashjoin . The cross apply operator has two inputs,input andmap.The cross apply operator applies each row in the input side to the map side. Theresult of the cross apply has columns from both the input and map sides.

For example, using this query:

SELECTsi.firstname,(SELECTso.songnameFROMsongsASsoWHEREso.singerid=si.singeridLIMIT1)FROMsingersASsi;

The query asks for the first name of each singer, along with the name of onlyone of the singer's songs.

These are the results:

FirstNameUnspecified
AliceNot About The Guitar
CatalinaLet's Get Back Together
DavidNULL
LeaNULL
MarcNULL

The first column is populated from theSingers table, and the second column ispopulated from theSongs table. In cases where aSingerId existed in theSingers table but there was no matchingSingerId in theSongs table, thesecond column containsNULL.

This is the execution plan:

cross apply operator

The top-level node is adistributed union operator. Thedistributed union operator distributes sub plans to remote servers. The subplancontains aserialize result operator that computes thesinger's first name and the name of one of the singer's songs and serializeseach row of the output.

The serialize result operator receives its input from a cross apply operator.The input side for the cross apply operator is a tablescan on theSingers table.

The map side for the cross apply operation contains the following (from top tobottom):

  • Anaggregate operator that returnsSongs.SongName.
  • Alimit operator that limits the number of songs returned to oneper singer.
  • An indexscan on theSongsBySingerAlbumSongNameDesc index.

The cross apply operator maps each row from the input side to a row in the mapside that has the sameSingerId. The cross apply operator output is theFirstName value from the input row, and theSongName value from the map row.(TheSongName value isNULL if there is no map row that matches onSingerId.) The distributed union operator at the top of the execution planthen combines all of the output rows from the remote servers and returns them asthe query results.

Hash join

Ahash join operator is a hash-based implementation of SQL joins. Hash joinsexecute set-based processing. The hash join operator reads rows from inputmarked asbuild and inserts them into a hash table based on a join condition.The hash join operator then reads rows from input marked asprobe. For eachrow it reads from the probe input, the hash join operator looks for matchingrows in the hash table. The hash join operator returns the matching rows as itsresult.

For example, using this query:

SELECTa.albumtitle,s.songnameFROMalbumsASajoin@{join_method=hash_join}songsASsONa.singerid=s.singeridANDa.albumid=s.albumid;

These are the results:

AlbumTitleSongName
Nothing To Do With MeNot About The Guitar
GreenThe Second Time
GreenStarting Again
GreenNothing Is The Same
GreenLet's Get Back Together
GreenI Knew You Were Magic
GreenBlue
Green42
TerrifiedFight Story

This is the execution plan:

hash join operator

In the execution plan,build is adistributed union thatdistributesscans on the tableAlbums.Probe is a distributed unionoperator that distributes scans on the indexSongsBySingerAlbumSongNameDesc.The hash join operator reads all rows from the build side. Each build row isplaced in a hash table based on the columns in the conditiona.SingerId =s.SingerId AND a.AlbumId = s.AlbumId. Next, the hash join operator reads allrows from the probe side. For each probe row, the hash join operator looks formatches in the hash table. The resulting matches are returned by the hash joinoperator.

Resulting matches in the hash table might also be filtered by a residualcondition before they're returned. (An example of where residual conditionsappear is in non-equality joins). Hash join execution plans can be complex dueto memory management and join variants. The main hash join algorithm is adaptedto handle inner, semi, anti, and outer join variants.

Note: Avisualized query planrendersbuild andprobe as the left and right children, respectively, of a hash join.

Merge join

Amerge join operator is a merge-based implementation of SQL join. Both sidesof the join produce rows ordered by the columns used in the join condition. Themerge join consumes both input streams concurrently and outputs rows when thejoin condition is satisfied. If the inputs are not originally sorted as requiredthen the optimizer adds explicitSort operators to the plan.

Merge join isn't selected automatically by the optimizer. To use thisoperator, set the join method toMERGE_JOIN on the query hint, as shownin the following example:

SELECTa.albumtitle,s.songnameFROMalbumsASajoin@{join_method=merge_join}songsASsONa.singerid=s.singeridANDa.albumid=s.albumid;

These are the results:

AlbumTitleSongName
GreenThe Second Time
GreenStarting Again
GreenNothing Is The Same
GreenLet's Get Back Together
GreenI Knew You Were Magic
GreenBlue
Green42
TerrifiedFight Story
Nothing To Do With MeNot About The Guitar

This is the execution plan:

merge join operator_1

In this execution plan, the merge join is distributed so that the join executeswhere the data is located. This also allows the merge join in this example tooperate without the introduction of additional sort operators, because bothtable scans are already sorted bySingerId,AlbumId, which is the joincondition. In this plan the left hand side scan of theAlbums table advanceswhenever itsSingerId,AlbumId is comparatively less than the right handsideSongsBySingerAlbumSongNameDesc index scanSingerId_1,AlbumId_1 pair.Similarly, the right hand side advances whenever it's less than the left handside. This merge advance continues searching for equivalences such thatresulting matches can be returned.

Consider anothermerge join example using the following query:

SELECTa.albumtitle,s.songnameFROMalbumsASajoin@{join_method=merge_join}songsASsONa.albumid=s.albumid;

It yields the following results:

AlbumTitleSongName
Total JunkThe Second Time
Total JunkStarting Again
Total JunkNothing Is The Same
Total JunkLet's Get Back Together
Total JunkI Knew You Were Magic
Total JunkBlue
Total Junk42
Total JunkNot About The Guitar
GreenThe Second Time
GreenStarting Again
GreenNothing Is The Same
GreenLet's Get Back Together
GreenI Knew You Were Magic
GreenBlue
Green42
GreenNot About The Guitar
Nothing To Do With MeThe Second Time
Nothing To Do With MeStarting Again
Nothing To Do With MeNothing Is The Same
Nothing To Do With MeLet's Get Back Together
Nothing To Do With MeI Knew You Were Magic
Nothing To Do With MeBlue
Nothing To Do With Me42
Nothing To Do With MeNot About The Guitar
PlayThe Second Time
PlayStarting Again
PlayNothing Is The Same
PlayLet's Get Back Together
PlayI Knew You Were Magic
PlayBlue
Play42
PlayNot About The Guitar
TerrifiedFight Story

This is the execution plan:

merge join operator_2

In the preceding execution plan, additionalSort operators have beenintroduced by the query optimizer to achieve the necessary properties for themerge join to execute. TheJOIN condition in this example's query is only onAlbumId, which isn't how the data is stored, so a sort must be added. Thequery engine supports a Distributed Merge algorithm, allowing the sort to happenlocally instead of globally, which distributes and parallelizes the CPU cost.

The resulting matches might also be filtered by a residual condition before theyare returned. (An example of where residual conditions appear is in non-equalityjoins). Merge join execution plans can be complex due to additional sortrequirements. The main merge join algorithm is adapted to handle inner, semi,anti, and outer join variants.

Push broadcast hash join

Apush broadcast hash join operator is a distributed hash-join-basedimplementation of SQL joins. The push broadcast hash join operator reads rowsfrom the input side in order to construct a batch of data. That batch is thenbroadcast to all servers containing map side data. On the destination serverswhere the batch of data is received, a hash join is built using the batch as thebuild side data and the local data is then scanned as the probe side of the hashjoin.

Push broadcast hash join isn't selected automatically by the optimizer. To usethis operator, set the join method toPUSH_BROADCAST_HASH_JOIN on thequery hint, as shown in the following example:

SELECTa.albumtitle,s.songnameFROMalbumsASajoin@{join_method=push_broadcast_hash_join}songsASsONa.singerid=s.singeridANDa.albumid=s.albumid;

These are the results:

AlbumTitleSongName
GreenThe Second Time
GreenStarting Again
GreenNothing Is The Same
GreenLets Get Back Together
GreenI Knew You Were Magic
GreenBlue
Green42
TerrifiedFight Story
Nothing To Do With MeNot About The Guitar

This is the execution plan:

push_broadcast hash_join operator

The input to the Push broadcast hash join is theAlbumsByAlbumTitle index.That input is serialized into a batch of data. That batch is then sent to allthe local splits of the indexSongsBySingerAlbumSongNameDesc, where the batchis then deserialized and built into a hash table. The hash table then uses thelocal index data as a probe returning resulting matches.

Resulting matches might also be filtered by a residual condition before they'rereturned. (An example of where residual conditions appear is in non-equalityjoins).

Outer apply

Anouter apply operator is similar to across apply operator,except an outer apply operator ensures that each execution on the map sidereturns at least one row by manufacturing a NULL-padded row if needed. (In otherwords, it provides left outer join semantics.)

Recursive union

Arecursive union operator performs a union of two inputs, one that representsabase case, and the other that represents arecursive case. It's used ingraph queries with quantified path traversals. The base input is processed firstand exactly once. The recursive input is processed until the recursionterminates. The recursion terminates when the upper bound, if specified, isreached, or when the recursion doesn't produce any new results. In the followingexample, theCollaborations table is added to the schema, and a property graphcalledMusicGraph is created.

CREATETABLECollaborations(SingerIdINT64NOTNULL,FeaturingSingerIdINT64NOTNULL,AlbumTitleSTRING(MAX)NOTNULL,)PRIMARYKEY(SingerId,FeaturingSingerId,AlbumTitle);CREATEORREPLACEPROPERTYGRAPHMusicGraphNODETABLES(SingersKEY(SingerId)LABELSingersPROPERTIES(BirthDate,FirstName,LastName,SingerId,SingerInfo))EDGETABLES(CollaborationsASCollabWithKEY(SingerId,FeaturingSingerId,AlbumTitle)SOURCEKEY(SingerId)REFERENCESSingers(SingerId)DESTINATIONKEY(FeaturingSingerId)REFERENCESSingers(SingerId)LABELCollabWithPROPERTIES(AlbumTitle,FeaturingSingerId,SingerId),);

The following graph query finds singers who have collaborated with a givensinger or collaborated with those collaborators.

GRAPHMusicGraphMATCH(singer:Singers{singerId:42})-[c:CollabWith]->{1,2}(featured:Singers)RETURNsinger.SingerIdASsinger,featured.SingerIdASfeatured

recursive union operator

Therecursive union operator filters theSingers table to find the singerwith the givenSingerId. This is the base input to therecursive union. Therecursive input to therecursive union comprises adistributed crossapply or other join operator for other queries thatrepeatedly joins theCollaborations table with the results of the previousiteration of the join. The rows from the base input form the zeroth iteration.At each iteration, the output of the iteration is stored by therecursive spoolscan. Rows from therecursive spool scan are joined with theCollaborationstable onspoolscan.featuredSingerId = Collaborations.SingerId. Recursionterminates when two iterations are complete, since that's the specified upperbound in the query.

N-ary operators

AnN-ary operator is an operator that has more than two relational children.The following operators are N-ary operators:

Union all

Aunion all operator combines all row sets of its children without removingduplicates. Union all operators receive their input fromunioninput operators that are distributed across multiple servers. Theunion all operator requires that its inputs have the same schema, that's, thesame set of data types for each column.

For example, using this query:

SELECT1a,2bUNIONALLSELECT3a,4bUNIONALLSELECT5a,6b;

The row type for the children consists of two integers.

These are the results:

ab
12
34
56

This is the execution plan:

union_all_operator

The union all operator combines its input rows, and in this example it sends theresults to aserialize result operator.

A query such as the following would succeed, because the same set of data typesis used for each column, even though the children use different variables forthe column names:

SELECT1a,2bUNIONALLSELECT3c,4e;

A query such as the following wouldn't succeed, because the children usedifferent data types for the columns:

SELECT1a,2bUNIONALLSELECT3a,'This is a string'b;

Scalar subqueries

Ascalar subquery is a SQL sub-expression that's part of a scalar expression.Spanner attempts to remove scalar subqueries whenever possible.In certain scenarios, however, plans can explicitly contain scalar subqueries.

For example, using this query:

SELECTfirstname,IF(firstname='Alice',(SELECTCount(*)FROMsongsWHEREduration >300),0)FROMsingers;

This is the SQL sub-expression:

SELECTCount(*)FROMsongsWHEREduration >300;

These are the results (of the complete query):

FirstName
Alice1
Catalina0
David0
Lea0
Marc0

This is the execution plan:

scalar subquery operator

The execution plan contains a scalar subquery, shown asScalar Subquery,over anaggregate operator.

Spanner sometimes converts scalar subqueries into anotheroperator such as a join or cross apply, to possibly improve performance.

For example, using this query:

SELECT*FROMsongsWHEREduration=(SELECTMax(duration)FROMsongs);

This is the SQL sub-expression:

SELECTMAX(Duration)FROMSongs;

These are the results (of the complete query):

SingerIdAlbumIdTrackIdSongNameDurationSongGenre
216Nothing Is The Same303BLUES

This is the execution plan:

scalar subquery operator not display in plan

The execution plan doesn't contain a scalar subquery becauseSpanner converted the scalar subquery to a cross apply.

Array subqueries

Anarray subquery is similar to a scalar subquery, except that the subquery isallowed to consume more than one input row. The consumed rows are converted to asingle scalar output array that contains one element per consumed input row.

For example, using this query:

SELECTa.albumid,array(selectconcertdateFROMconcertsWHEREconcerts.singerid=a.singerid)FROMalbumsASa;

This is the subquery:

SELECTconcertdateFROMconcertsWHEREconcerts.singerid=a.singerid;

The results of the subquery for eachAlbumId are converted into an array ofConcertDate rows against thatAlbumId. The execution plan contains an arraysubquery, shown asArray Subquery, above a distributed union operator:

array subquery operator

Distributed operators

The operators described previously on this page execute within the boundaries ofa single machine.Distributed operators execute across multiple servers.

The following operators are distributed operators:

The distributed union operator is the primitive operator from which distributedcross apply and distributed outer apply are derived.

Distributed operators appear in execution plans with adistributed unionvariant on top of one or morelocal distributed union variants. Adistributed union variant performs the remote distribution of subplans. A localdistributed union variant is on top of each of the scans performed for thequery, as shown in this execution plan:

distributed operator

The local distributed union variants ensure stable query execution when restartsoccur for dynamically changing split boundaries.

Whenever possible, a distributed union variant has a split predicate thatresults in split pruning, meaning the remote servers execute subplans on onlythe splits that satisfy the predicate. This improves both latency and overallquery performance.

Distributed union

Adistributed union operator conceptually divides one or more tables intomultiplesplits, remotely evaluates a subquery independently on eachsplit, and then unions all results.

For example, using this query:

SELECTs.songname,s.songgenreFROMsongsASsWHEREs.singerid=2ANDs.songgenre='ROCK';

These are the results:

SongNameSongGenre
Starting AgainROCK
The Second TimeROCK
Fight StoryROCK

This is the execution plan:

distributed union operator

The distributed union operator sends subplans to remote servers, which perform atablescan across splits that satisfy the query's predicateWHEREs.SingerId = 2 AND s.SongGenre = 'ROCK'. Aserializeresult operator computes theSongName andSongGenrevalues from the rows returned by the table scans. The distributed union operatorthen returns the combined results from the remote servers as the SQL queryresults.

Distributed merge union

Thedistributed merge union operator distributes a query across multipleremote servers. It then combines the query results to produce a sorted result,known as adistributed merge sort.

A distributed merge union executes the following steps:

  1. The root server sends a subquery to each remote server that hosts asplit of the queried data. The subquery includes instructions thatresults are sorted in a specific order.

  2. Each remote server executes the subquery on its split, then sends theresults back in the requested order.

  3. The root server merges the sorted subquery to produce a completely sortedresult.

Distributed merge union is turned on, by default, for Spanner Version 3and later.

Distributed cross apply

Adistributed cross apply (DCA) operator extends thecrossapply operator by executing across multiple servers. The DCAinput side groupsbatches of rows (unlike a regular cross apply operator,which acts on only one input row at a time). The DCA map side is a set of crossapply operators that execute on remote servers.

For example, using this query:

SELECTalbumtitleFROMsongsJOINalbumsONalbums.albumid=songs.albumid;

The results are in the format:

AlbumTitle
Green
Nothing To Do With Me
Play
Total Junk
Green

This is the execution plan:

distributed cross apply operator

The DCA input contains an indexscan on theSongsBySingerAlbumSongNameDesc index that batches rows ofAlbumId. The mapside for this cross apply operator is an index scan on the indexAlbumsByAlbumTitle, subject to the predicate ofAlbumId in the input rowmatching theAlbumId key in theAlbumsByAlbumTitle index. The mappingreturns theSongName for theSingerId values in the batched input rows.

To summarize the DCA process for this example, the DCA's input is the batchedrows from theAlbums table, and the DCA's output is the application of theserows to the map of the index scan.

Distributed outer apply

Adistributed outer apply operator extends theouter applyoperator by executing over multiple servers, similar to the way adistributed cross apply operator extends a cross apply operator.

For example, using this query:

SELECTlastname,concertdateFROMsingersLEFTOUTERjoin@{JOIN_TYPE=APPLY_JOIN}concertsONsingers.singerid=concerts.singerid;

The results are in the format:

LastNameConcertDate
Trentor2014-02-18
Smith2011-09-03
Smith2010-06-06
Lomond2005-04-30
Martin2015-11-04
Richards

This is the execution plan:

distributed outer apply operator

Apply mutations

Anapply mutations operator applies the mutationsfrom aData Manipulation Statement (DML) to thetable. It's the top operator in a query plan for a DML statement.

For example, using this query:

DELETEFROMsingersWHEREfirstname='Alice';

These are the results:

4rowsdeletedThisstatementdeleted4rowsanddidnotreturnanyrows.

This is the execution plan:

apply mutations operator

Additional information

This section describes items that are not standalone operators, but insteadexecute tasks to support one or more of the operators previously listed. Theitems described here are technically operators, but they're not separatedoperators in your query plan.

Struct constructor

Astruct constructor creates astruct, or a collection of fields. Ittypically creates a struct for rows that result from a compute operation. Astruct constructor isn't a standalone operator. Instead, it appears incomputestruct operators orserialize resultoperators.

For a compute struct operation, the struct constructor creates a struct socolumns for the computed rows can use a single variable reference to the struct.

For a serialize result operation, the struct constructor creates a struct toserialize the results.

For example, using this query:

SELECTIF(TRUE,struct(1ASA,1ASB),struct(2ASA,2ASB)).A;

These are the results:

A
1

This is the execution plan:

struct constructor

In the execution plan, struct constructors appear inside a serialize resultoperator.

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.