Introduction to federated queries

This page introduces how to use federated queries and provides guidance onquerying Spanner, AlloyDB, and Cloud SQL data from BigQuery.

Federated queries let you send a query statement to AlloyDB, Spanner, or Cloud SQL databases and get the result back as a temporary table. Federated queries use the BigQuery Connection API to establish a connection with AlloyDB, Spanner, or Cloud SQL. In your query, you use theEXTERNAL_QUERY function to send a query statement to the external database, using that database's SQL dialect. The results are converted to GoogleSQL data types.

Supported data stores

You can use federated queries with the following data stores:

Workflow

  • Identify the Google Cloud project that includes the data source that youwant to query.
  • Abigquery.admin usercreates a connection resource in BigQuery.
  • The admin usergrants permission to use the connection resourceto user B.
    • If the admin and user B are the same person, there is no need to grantpermission.
  • User B writes a query in BigQuery with the newEXTERNAL_QUERY SQL function.
Caution: The performance of federated queries might be lower than queries that readdata residing in BigQuery storage.

Alternatives to federated queries: external tables and datasets

Another option to query operational databases such as Bigtable, Spanner,Cloud Storage, Google Drive, and Salesforce Data Cloud, is to useexternal tables and datasets. External datasets and tables let you view tablesand their schemas and query them without using anEXTERNAL_QUERYSQL function. You don't have to bring data back into BigQuery andyou can use the BigQuery syntax instead of writing in the specificSQL database dialect of SQL.

Supported regions

For a list of supported locations, see the following sections:

AlloyDB and Cloud SQL

Federated queries are only supported in regions that support boththe external data source and BigQuery.

You can create a connection and run a federated query across regions according to the following rules:

Single regions

A BigQuery single region can only query a resource in the sameregion.

For example, if your dataset is inus-east4, you can queryCloud SQL instances or AlloyDB instances that arelocated inus-east4. Thequery processing locationis the BigQuery single region.

Multi-regions

A BigQuery multi-region can query any data source region in thesame large geographic area (US, EU).Multi-regional locationsaren't available for Cloud SQL instances, because these are only usedfor backups.

The query performance varies based on the proximity between the dataset and theexternal data source. For example, a federated query between a dataset inthe US multi-region and a Cloud SQL instance inus-central1 isfast. However, if you run the same query between the US multi-regionand a Cloud SQL instance inus-east4, the performance might be slower.

Thequery processing locationis the multi-region location, eitherUS orEU.

Spanner

ForSpanner, both regional and multi-regionalconfigurations are supported.A BigQuery single region/multi-region can query a Spanner instancein any supported Spanner region. For more details refer tocross region queries.

Data type mappings

When you execute a federated query, the data from the external data sourceis converted to GoogleSQLtypes. For more information, seeCloud SQL federated queries.

Quotas and limits

  • Cross-region federated querying. If theBigQuery query processing locationand the external data source location are different, this is a cross-regionquery. You can run up to 1 TB in cross-region queries per project perday. The following is an example of a cross-region query.
    • The Cloud SQL instance is inus-west1 while theBigQuery connection is based in the US multi-region. TheBigQuery query processing location isUS.
  • Quota. Users should control query quota in the external data source,such as Cloud SQL or AlloyDB. There is no extra quota setting for federatedquerying. To achieve workload isolation, it's recommended to only query adatabase read replica.
  • Maximum bytes billed allowed. This field isn't supported for federatedqueries. Calculating the bytes billed before actuallyexecuting the federated queries isn't possible.
  • Number of connections. A federated query can have at most 10 uniqueconnections.
  • Cloud SQLMySQL andPostgreSQL. Quotas and limitations apply.

Limitations

Federated queries are subject to the following limitations:

  • Performance. A federated query is likely to not be as fast as queryingonly BigQuery storage. BigQuery needs to wait forthe source database to execute the external query and temporarily move datafrom the external data source to BigQuery. Also, the sourcedatabase might not be optimized for complex analytical queries.

    The query performance also varies based on the proximity between the datasetand the external data source. For more information, seeSupportedregions.

  • Federated queries are read-only. The external query that is executedin the source database must be read-only. Therefore, DML or DDL statements arenot supported.

  • Unsupported data types. If your external query contains a data type thatis unsupported in BigQuery, the query fails immediately. You cancast the unsupported data type to a different supported data type.

  • Customer-managed encryption keys (CMEK). CMEK is configured separately for BigQuery and for external data sources.If you configure the source database to use CMEK but not BigQuery, then the temporarytable that contains results of a federated query is encrypted with a Google-owned and Google-managed encryption key.

Pricing

  • If you are using theon-demand pricing model, you are charged for the number of bytes returned fromthe external query when executing federated queries fromBigQuery. For more information, seeOn-demand analysispricing.

  • If you are usingBigQuery editions, you are charged based on the number of slotsyou use. For more information, seeCapacity computepricing.

SQL pushdowns

Federated queries are subject to the optimization technique known as SQL pushdowns.They improve the performance of a query by delegating operations like filtering downto the external data source instead of performing them in BigQuery.Reducing the amount of data transferred from the external data source can reducequery execution time and lower costs. SQL pushdowns include column pruning(SELECT clauses) and filter pushdowns (WHERE clauses).

When you use theEXTERNAL_QUERY function, SQL pushdowns work by rewriting the original query.In the following example, theEXTERNAL_QUERY function is used to communicate with a Cloud SQL database:

SELECTCOUNT(*)FROM(SELECT*FROMEXTERNAL_QUERY("CONNECTION_ID","select * from operations_table"))WHEREa='Y'ANDbNOTIN('COMPLETE','CANCELLED');

ReplaceCONNECTION_ID with the ID of theBigQuery connection.

Without SQL pushdowns, the following query is sent to Cloud SQL:

SELECT*FROMoperations_table

When this query is executed, the entire table is sent back to BigQuery,even though only some rows and columns are needed.

With SQL pushdowns, the following query is sent to Cloud SQL:

SELECT`a`,`b`FROM(SELECT*FROMoperations_table)tWHERE((`a`='Y')AND(NOT`b`IN('COMPLETE','CANCELLED')))

When this query is executed, only two columns and the rows that match the filteringpredicate are sent back to BigQuery.

SQL pushdowns are also applied when running federated queries withSpanner external datasets.

You can examine applied pushdowns (if any) in thequery plan.

Limitations

SQL pushdowns have various limitations that vary depending on the external data source and on the way you query data.

Limitations for query federation when usingEXTERNAL_QUERY

  • SQL pushdowns are only applied to federated queries of the formSELECT * FROM T.
  • Only column pruning and filter pushdowns are supported. Specifically compute, join, limit, order by and aggregation pushdowns aren't supported.
  • For filter pushdowns, literals must beof one of the following types:BOOL,INT64,FLOAT64,STRING,DATE,DATETIME,TIMESTAMP. Literals that are structs aren't supported.
  • SQL function pushdowns are applied only for functions that are supported by both BigQuery and a destination database.
  • SQL pushdowns are only supported for AlloyDB, Cloud SQL, and Spanner.
  • SQL pushdowns aren't supported for SAP Datasphere.

Limitations for query federation when using Spanner external datasets

  • Column pruning, filter, compute and partial aggregation pushdowns are supported. Specifically join, limit and order by pushdowns aren't supported.
  • For filter pushdowns, literals must be one of the following types:BOOL,INT64,FLOAT64,STRING,DATE,DATETIME,TIMESTAMP,BYTE or Arrays. Literals that are structs aren't supported.
  • SQL function pushdowns are applied only for functions that are supported by both BigQuery and Spanner.

Supported functions by data source

The following are supported SQL functions by data source. No functions aresupported for SAP Datasphere.

Cloud SQL MySQL

  • Logical operators:AND,OR,NOT.
  • Comparison operators:=,>,>=,<,<=,<>,IN,BETWEEN,IS NULL.
  • Arithmetic operators:+,-,* (only forINT64 andFLOAT64).

Cloud SQL PostgreSQL and AlloyDB

  • Logical operators:AND,OR,NOT.
  • Comparison operators:=,>,>=,<,<=,<>,IN,BETWEEN,IS NULL.
  • Arithmetic operators:+,-,*,/ (only forINT64,FLOAT64, andDATE types, except forDATE subtraction).

Spanner - PostgreSQL dialect

  • Logical operators:AND,OR,NOT.
  • Comparison operators:=,>,>=,<,<=,<>,IN,BETWEEN,IS NULL.
  • Arithmetic operators:+,-,*,/ (only forINT64,FLOAT64,NUMERIC).
  • When usingexternal datasets, additionally:

    • Compute pushdown
    • Partial Aggregate pushdown
    • String functions
    • Math functions
    • Cast functions
    • Array functions
  • Expect GoogleSQL semantics, notPostgreSQL semantics, when queries are run. Forexample:

    • NULL values sort first in ascending order by default,unlike PostgreSQL where they sort last bydefault.
    • PostgreSQLNUMERIC values read fromSpanner are handled in accordance with theSpanner to BigQuerytype mapping.For example, if a numeric column has the1.1234567891value, then the following query returns 0 rows:

      SELECT*FROMEXTERNAL_QUERY("CONNECTION_ID","SELECT * fromoperations_table where numeric_col = 1.123456789")
      However the following statement returns 1 row based onGoogleSQL semantics:
      SELECT*fromoperations_tablewherenumeric_col=1.123456789

    • JSON object normalization behaves differently. Keys aresorted strictly lexicographically in SpannerJSON, but in PostgreSQLPG JSONB, they aresorted first by key length, then lexicographically withequivalent key length.

Spanner - GoogleSQL dialect

  • Logical operators:AND,OR,NOT.
  • Comparison operators:=,>,>=,<,<=,<>,IN,BETWEEN,IS NULL.
  • Arithmetic operators:+,-,*,/ (only forINT64,FLOAT64,NUMERIC).
  • Safe arithmetic operators:SAFE_ADD,SAFE_SUBTRACT,SAFE_MULTIPLY,SAFE_DIVIDE(only forINT64,FLOAT64,NUMERIC).
  • When usingexternal datasets, additionally:
    • Compute pushdown,
    • Partial Aggregate pushdown,
    • String functions,
    • Math functions,
    • Cast functions,
    • Array functions.

Work with collations in external data sources

An external data source might have a collation set on a column(for example, case-insensitivity). When you execute a federated query, theremote database takes into account the configured collation.

Consider the following example where you have aflag column with acase-insensitive collation in the external data source:

SELECT*FROMEXTERNAL_QUERY("CONNECTION_ID","select * from operations_table where flag = 'Y'")

ReplaceCONNECTION_ID with the ID of theBigQuery connection.

The preceding query returns rows whereflag isy orY because the queryis executed on the external data source.

However, for query federation with Cloud SQL, SAP Datasphere,or AlloyDB data sources, if you add a filter on your main query,the query is executed on the BigQuery side with the defaultcollation. See the following query:

SELECT*FROM(SELECT*FROMEXTERNAL_QUERY("CONNECTION_ID","select * from operations_table"))WHEREflag='Y'

Due to the default case-sensitivecollationin BigQuery, the preceding query only returns rows where the flagisY and filters out rows where the flag isy. To make yourWHERE clausecase-insensitive, specify the collation in the query:

SELECT*FROM(SELECT*FROMEXTERNAL_QUERY("CONNECTION_ID","select * from operations_table"))WHERECOLLATE(flag,'und:ci')='Y'

What's next

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.