Run federated queries with Data Boost Stay organized with collections Save and categorize content based on your preferences.
This page explains how to use Spanner Data Boost when you run federated queriesfrom BigQuery to a Spanner database. WithData Boost, federated queries run with minimal impact to existingworkloads on the provisioned Spanner instance. TheData Boost queries from BigQuery to aSpanner database can join BigQuery data withSpanner data.
Spanner federation lets BigQuery query dataresiding in Spanner in real time, without copying or moving data.To learn more about Spanner federated queries, seeSpanner federated queries.To learn about Data Boost, seeData Boostoverview.
Before you begin
Before you can run federated queries with Data Boost, you need tocomplete the following tasks:
Create a Spanner instance and database
If you don't have a Spanner instance and database, follow thesteps inCreate and query a database using the Google Cloud console to create them.
Enable the BigQuery connection API
The BigQuery connection API lets you manageBigQuery connections to external data sources such as aSpanner database.
Enable the BigQuery connection API.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (
roles/serviceusage.serviceUsageAdmin), which contains theserviceusage.services.enablepermission.Learn how to grant roles.
For more information, seeBigQuery connectionAPI in theBigQuery documentation.
Grant IAM permissions for Data Boost to principals
A principal must be granted the following permissions to run federated querieswith Data Boost:
spanner.instances.get- lets you get the configuration of an instance.spanner.databases.useDataBoost- lets you use theSpanner Data Boost computeresources to processpartitioned queries.
For more information about Spanner permissions, seeIdentity and Access Management(IAM) permissions.
To grant these required permissions, we recommend that you use theCloudSpanner Database Reader With DataBoost(roles/spanner.databaseReaderWithDataBoost) IAM role. You canadd that role to any principal that needs to run federated querieswith Data Boost. To learn more about predefined roles inSpanner, seePredefined roles. To learnhow to create a custom IAM role, seeCreate a customrole.
Run a federated Data Boost query
To run a Data Boost query from BigQuery to anexternal source, you need a BigQuery connection to the externalsource and the ID of the connection. When you run a federatedSpanner query with Data Boost, the external sourceis a Spanner database. After you create your connection ID, it'sused by BigQuery to run a Data Boostquery of a Spanner database.
Use one of the following options to create a BigQuery connectionID, and then use the connection ID to run a Data Boost query fromBigQuery:
Start in Spanner - Create theBigQuery external connection ID in the Spannerconsole. After your connection ID is created in the Spannerconsole, you're redirected to the BigQuery console to run afederated Data Boost query to a Spannerdatabase.
Start in BigQuery - Create theData Boost external connection ID in theBigQuery console or using the
bqcommand-line tool. Afteryou create the connection ID, you stay in the BigQueryconsole to run a federated Data Boost query to aSpanner database.
Start in Spanner to run a Data Boost query
To run a federated Data Boost query starting in theSpanner Studio, do the following:
Go to the SpannerInstances page in theGoogle Cloud console.
The console shows a list of your Spannerinstances.
Select a Spanner instance, and then select a database.
On theDatabase overview page, in the navigation menu, clickSpanner Studio.
ClickView in BiqQuery.
In theView in BigQuery dialog, enter a connection ID.
The connection ID is used to create a new BigQuery externalconnection to your Spanner database. You reference yourexternal connection using the following pattern:
PROJECT-ID.LOCATION.CONNECTION-IDAn error occurs if the ID already exists.
Fill in the rest of the dialog and do the following:
- SelectRead data in parallel.
- SelectUse Spanner Data Boost.
ClickView in BigQuery.
BigQuery Studio opens with the following query:
SELECT*FROMEXTERNAL_QUERY("PROJECT-ID.LOCATION.CONNECTION-ID","SELECT * FROM INFORMATION_SCHEMA.TABLES;");You can replace this with your federated query. For example, you might makea query that's similar to the following example. This example makes afederated query from a table named
ordersin a Spannerdatabase and joins the results with a BigQuery table namedmydataset.customers.SELECTc.customer_id,c.name,rq.first_order_dateFROMmydataset.customersAScLEFTOUTERJOINEXTERNAL_QUERY('my-project.us.example-db','''SELECT customer_id, MIN(order_date) AS first_order_date FROM orders GROUP BY customer_id''')ASrqONrq.customer_id=c.customer_idGROUPBYc.customer_id,c.name,rq.first_order_date;
Start in BigQuery to run a Data Boost query
To create an external data connection from BigQuery to aSpanner database and use that connection to run a federatedData Boost query from BigQuery, select one of thefollowing options:
Console
Go toCreate Spanner connectionsin the BigQuery documentation and follow the instructions ontheConsole tab.
In theExternal data source pane, do the following:
- SelectRead data in parallel.
- SelectUse Spanner Data Boost.
bq
Go toCreate Spanner connectionsin the BigQuery documentation and follow instructions on thebq* tab.
Set the following connection properties to
true:useParallelismuseDataBoost
The following example uses thebqmk command to create anew connection namedmy_connection with the two required properties forData Boost:
bqmk--connection--connection_type='CLOUD_SPANNER'--location='us'\--properties='{"database":"projects/my-project/instances/my-instance/databases/my-database", "useParallelism":true, "useDataBoost": true}'my_connectionWhat's next
- Learn about Data Boost inData Boost overview
- Use Data Boost in your applications
- Monitor Data Boost usage
- Monitor and manage Data Boost quota usage
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.