Spanner federated queries

As a data analyst, you can query data in Spanner from BigQueryusingfederated queries.

BigQuery Spanner federation enablesBigQuery to query data residing in Spanner inreal-time, without copying or moving data.

You can query Spanner data in two ways:

Use external datasets

The simplest way to query Spanner tables is tocreate an external dataset. Once you create the external dataset, your tables from the corresponding Spanner database are visible in BigQuery and you can use them in your queries - for example in joins, unions or subqueries. However, no data is moved from Spanner to BigQuery storage.

You don't need to create a connection to query Spanner data if you create an external dataset.

UseEXTERNAL_QUERY function

Like for other federated databases, you can also query Spannerdata with anEXTERNAL_QUERYfunction. This may be useful if you want to have more control over the connection parameters.

Before you begin

  • Ensure that your BigQuery administrator has created a Spanner connection andshared it with you. See Choose the right connection.
  • To get the permissions that you need to query a Spanner instance, ask your administrator to grant you the BigQuery Connection User (roles/bigquery.connectionUser) Identity and Access Management (IAM) role. You also need to ask your administrator to grant you one of the following:
    • If you are a fine-grained access control user, you need access to a database role that has theSELECT privilege on all Spanner schema objects in your queries.
    • If you aren't a fine-grained access control user, you need the Cloud Spanner Database Reader (roles/spanner.databaseReader) IAM role.

    For information about granting IAM roles, see Manage access to projects, folders, and organizations. For information about fine-grained access control, seeAbout fine-grained access control.

Choose the right connection

If you are a Spanner fine-grained access control user, when you run afederated query with anEXTERNAL_QUERY function, you must use aSpanner connection that specifies a database role. Then allqueries that you run with this connection use that database role.

If you use a connection that doesn't specify a database role, you must have theIAM roles indicated inBefore you begin.

Query data

To send a federated query to Spanner from a GoogleSQL query, use theEXTERNAL_QUERYfunction.

Formulate your Spanner query in either GoogleSQL orPostgreSQL, depending on the specified dialect of the database.

The following example makes a federated query to a Spannerdatabase namedorders and joins the results with a BigQuerytable namedmydataset.customers.

SELECTc.customer_id,c.name,rq.first_order_dateFROMmydataset.customersAScLEFTOUTERJOINEXTERNAL_QUERY('my-project.us.example-db','''SELECT customer_id, MIN(order_date) AS first_order_date  FROM orders  GROUP BY customer_id''')ASrqONrq.customer_id=c.customer_idGROUPBYc.customer_id,c.name,rq.first_order_date;

Spanner Data Boost

Data Boost is a fully managed, serverless feature that providesindependent compute resources for supported Spanner workloads.Data Boost lets you execute analytics queries and data exportswith near-zero impact to existing workloads on the provisionedSpanner instance. Data Boost lets you run federated querieswith independent compute capacity separate from your provisioned instances toavoid impacting existing workloads on Spanner. Data Boost ismost impactful when you run complex ad hoc queries, or when you want to processlarge amounts of data without impacting the existing Spannerworkload. Running federated queries with Data Boost can lead tosignificantly lower CPU consumption, and in some cases, lower query latency.

Before you begin

To get the permission that you need to enable access to Data Boost, ask your administrator to grant you theCloud Spanner Database Reader with DataBoost (roles/spanner.databaseReaderWithDataBoost) IAM role on the Spanner database. For more information about granting roles, seeManage access to projects, folders, and organizations.

This predefined role contains the spanner.databases.useDataBoost permission, which is required to enable access to Data Boost.

You might also be able to get this permission withcustom roles or otherpredefined roles.

Enable Data Boost

When using external datasets, Data Boost is always used and you don't have to enable it manually.

If you want to use Data Boost for yourEXTERNAL_QUERY queries, you must enable it whencreating a connection that is used by your query.

Read data in parallel

Spanner can divide certain queries into smaller pieces, orpartitions, and fetch the partitions in parallel. For more information, including a list of limitations, seeRead data in parallel in theSpanner documentation.

To view the query execution plan for aSpanner query, seeUnderstand how Spanner executes queries.

When running federated queries with external datasets, the "Read data in parallel" option is always used.

To enable parallel reads when using theEXTERNAL_QUERY,enable it when youcreate the Connection.

Manage query execution priority

When you run federated queries with anEXTERNAL_QUERY function, you can assign priority (high,medium, orlow) to individual queries by specifying thequery_execution_priority option:

SELECT*FROMEXTERNAL_QUERY('my-project.us.example-db','''SELECT customer_id, MIN(order_date) AS first_order_date  FROM orders  GROUP BY customer_id''','{"query_execution_priority":"high"}');

The default priority ismedium.

Queries with priorityhigh will compete with transactional traffic.Queries with prioritylow are best-effort, and might get preempted bybackground load, for example scheduled backups.

Caution: Queries withlow priority fall below queries like backup jobswhich might never complete within the timeouts for BigQuery.

When running federated queries with external datasets, all queries have alwaysmedium priority.

View a Spanner table schema

If you use external datasets, your Spanner tables are visible directly in BigQuery Studio and you can see their schemas.

However, you can also see the schemas without defining external datasets. You can useEXTERNAL_QUERY function also to query information_schema views to access database metadata. The following example returns information about the columns in the tableMyTable:

Google SQL database

SELECT*FROMEXTERNAL_QUERY('my-project.us.example-db','''SELECT t.column_name, t.spanner_type, t.is_nullable    FROM information_schema.columns AS t    WHERE      t.table_catalog =''      AND t.table_schema =''     AND t.table_name = 'MyTable'    ORDER BY t.ordinal_position''');

PostgreSQL database

SELECT*fromEXTERNAL_QUERY('my-project.us.postgresql.example-db','''SELECT t.column_name, t.data_type, t.is_nullable    FROM information_schema.columns AS t    WHERE      t.table_schema = 'public' and t.table_name='MyTable'    ORDER BY t.ordinal_position''');

For more information, see the following information schema references in theSpanner documentation:

Pricing

Cross region queries

BigQuery supports federated queries where Spannerinstances and BigQuery datasets are in different regions. Thesequeries incur an additional Spanner data transfer charge.For more information seeSpanner pricing.

You are charged for the data transfer, based on the followingSKUs:

  • Network Intra-region Cross-Zone Data Transfer Out
  • Network Inter-Region Data Transfer Out to the Same Continent
  • Network Inter-Region Data Transfer Out to a DifferentContinent

Data transfer is charged based on the BigQuery regionyou run the query in and the nearest Spanner region thathas read-write or read-only replicas.

For BigQuery multi-region configurations(US orEU), data transfer costs from Spanner are determined as follows:

  • BigQueryUS multi-region: Spanner regionus-central1
  • BigQueryEU multi-region: Spanner regioneurope-west1

For example:

  • BigQuery (US multi-region) and Spanner (us-central1):Costs apply for data transfer within the same region.
  • BigQuery (US multi-region) and Spanner (us-west4):Costs apply for Data transfer between regions within the same continent.

Troubleshooting

This section helps you troubleshoot issues you might encounter when sendinga federated query to Spanner.

Issue: Query is not root partitionable.
Resolution: If you configure the connection to read data in parallel,either the first operator in the query execution plan must be a distributedunion, or your execution plan must not have any distributed unions. To resolvethis error, view the query execution plan and rewrite the query. For moreinformation, seeUnderstand how Spanner executes queries.
Issue: Deadline exceeded.
Resolution: Select the option toread data in parallel and rewrite the query to be rootpartitionable. For more information, seeUnderstand how Spanner executes queries.

What's next

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-18 UTC.