Cross-region dataset replication

With BigQuery dataset replication, you can set up automatic replication of a dataset between two different regions or multi-regions.

Overview

When you create a dataset in BigQuery, you select the region or multi-region where the data is stored. A region is a collection of data centers within a geographical area, and a multi-region is a large geographic area that contains two or more geographic regions. Your data is stored in one of the contained regions, and is not replicated within the multi-region. For more information about regions and multi-regions, see BigQuery locations.

BigQuery always stores copies of your data in two different Google Cloud zones within the dataset location. A zone is a deployment area for Google Cloud resources within a region. In all regions, replication between zones uses synchronous dual writes. Selecting a multi-region location does not provide cross-region replication or regional redundancy, so there is no increase in dataset availability in the event of a regional outage. Data is stored in a single region within the geographic location.

For additional geo-redundancy, you can replicate any dataset. BigQuery creates a secondary replica of the dataset, located in another region that you specify. This replica is then asynchronously replicated between two zones within the other region, for a total of four zonal copies.

Dataset replication

If you replicate a dataset, BigQuery stores the data in the regions that you specify.

  • Primary region. When you first create a dataset, BigQuery places the dataset in the primary region.

  • Secondary region. When you add a dataset replica, BigQuery places the replica in the secondary region.

Initially, the replica in the primary region is the primary replica, and the replica in the secondary region is the secondary replica.

The primary replica is writeable, and the secondary replica is read-only. Writes to the primary replica are asynchronously replicated to the secondary replica. Within each region, the data is stored redundantly in two zones. Network traffic never leaves the Google Cloud network.

The following diagram shows the replication that occurs when a dataset is replicated:

The primary replica in the primary zone of region 1 is simultaneously replicated to the primary and secondary zones of region 2.

If the primary region is online, you can manually switch to the secondary replica. For more information, see Promote the secondary replica.

Pricing

You are billed for the following for replicated datasets:

  • Storage. Storage bytes in the secondary region are billed as a separate copy in the secondary region. Tables and partitions in long-term storage aren't reset to active storage in the secondary replica.
  • Data replication. For more information on how you are billed for data replication, see Data replication pricing.

Data replication is managed by BigQuery and doesn't use your slot resources. You are billed for data replication separately.

Compute capacity in the secondary region

To run jobs and queries against the replica in the secondary region, you must purchase slots within the secondary region or run an on-demand query.

You can use the slots to perform read-only queries from the secondary replica. If you promote the secondary replica to be the primary, you can also use those slots to write to the replica.

You can purchase the same number of slots as you have in the primary region, or a different number of slots. If you purchase fewer slots, it might affect query performance.

Location considerations

Before adding a dataset replica, you need to create the initial dataset that you want to replicate in BigQuery, if it doesn't already exist. The location of the added replica is set to the location that you specify when adding the replica, and it must be different from the location of the initial dataset. This means that the data in your dataset is continually replicated between the location the dataset was created in and the location of the replica. For resources that require colocation, such as views, materialized views, or non-BigLake external tables, adding a replica in a location that is different from, or not compatible with, your source data's location could result in job errors.

When you replicate a dataset across regions, BigQuery ensures that data is located only in the locations where the replicas were created.

Colocation requirements

Dataset replication is subject to the following colocation requirements.

Cloud Storage

Querying data on Cloud Storage requires that the Cloud Storage bucket is colocated with the replica. Use the external table location considerations when deciding where to place your replica.

Limitations

BigQuery dataset replication is subject to the following limitations:

  • Streaming data written to the primary replica from the BigQuery Storage Write API or the tabledata.insertAll method, which is then replicated into the secondary replica, is replicated on a best-effort basis and may see high replication delay.
  • Streaming upserts written to the primary replica from Datastream or BigQuery change data capture ingestion, which are then replicated into the secondary replica, are replicated on a best-effort basis and may see high replication delay. Once replicated, the upserts in the secondary replica are merged into the secondary replica's table baseline according to the table's configured max_staleness value.
  • You can't enable fine-grained DML on a table in a replicated dataset, and you can't replicate a dataset that contains a table with fine-grained DML enabled.
  • Replication and switchover are managed through SQL data definition language (DDL) statements.
  • You are limited to one replica of each dataset for each region or multi-region. You cannot create two secondary replicas of the same dataset in the same destination region.
  • Resources within replicas are subject to the limitations described in Resource behavior.
  • Policy tags and associated data policies are not replicated to the secondary replica. Any queries that reference columns with policy tags in regions other than the original region fail, even if that replica is promoted.
  • Time travel is only available in the secondary replica after the creation of the secondary replica is complete.
  • The destination region size limit (in logical bytes) for enabling cross-region replication on a dataset is 10 PB for the us and eu multi-regions and 500 TB for other regions by default. These limits are configurable. For more information, reach out to Google Cloud Support.
  • The quota applies to logical resources.
  • You can only replicate a dataset with fewer than 100,000 tables.
  • You are limited to a maximum of 4 replicas added (then dropped) to the same region per dataset per day.
  • You are limited by bandwidth.
  • Tables with customer-managed encryption keys (CMEK) applied are not queryable in the secondary region if the replica_kms_key value is not configured.
  • BigLake tables are not supported.
  • You can't replicate external or federated datasets.
  • BigQuery Omni locations aren't supported.
  • You can't configure the following region pairs if you are configuring data replication for disaster recovery:
    • us-central1 - us multi-region
    • us-west1 - us multi-region
    • europe-west1 - eu multi-region
    • europe-west4 - eu multi-region
  • Routine-level access controls can't be replicated, but you can replicate dataset-level access controls for routines.
  • The following behavior applies to search indexes:
    • Only the search index metadata is replicated to the secondary region, not the index data itself.
    • If you switch over to the replica, then your index is deleted from the previous primary region and regenerated in the promoted region.
    • If you switch back and forth within 8 hours, then your index generation is delayed by 8 hours.

Resource behavior

Write operations are not supported on resources within the secondary replica; the secondary replica is read-only. If you need to create a copy of a resource in a secondary replica, you must either copy the resource or query the resource first, and then materialize the results outside of the secondary replica. For example, use CREATE TABLE AS SELECT to create a new resource from the secondary replica resource.
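As a sketch of that pattern (the dataset and table names here are hypothetical), a query run with its job location set to the secondary region can materialize results into a separate, writable dataset in that region:

```sql
-- Run with the job location set to the secondary region.
-- my_dataset is the replicated (read-only) dataset; scratch_dataset is a
-- hypothetical ordinary dataset created directly in the secondary region.
CREATE TABLE scratch_dataset.my_table_copy AS
SELECT *
FROM my_dataset.my_table;
```

The new table lives outside the replicated dataset, so it is writable even though the replica itself is read-only.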

Primary and secondary replicas are subject to the following differences:

| Region 1 primary replica | Region 2 secondary replica | Notes |
| --- | --- | --- |
| BigLake table | BigLake table | Not supported. |
| BigLake Apache Iceberg table | BigLake Apache Iceberg table | See [BigLake metastore cross-region replication and disaster recovery](/biglake/docs/about-managed-disaster-recovery). |
| External table | External table | Only the external table definition is replicated. The query fails when the Cloud Storage bucket is not colocated in the same location as the replica. |
| Logical view | Logical view | Logical views that reference a dataset or resource that is not located in the same location as the logical view fail when queried. |
| Managed table | Managed table | No difference. |
| Materialized view | Materialized view | If a referenced table is not in the same region as the materialized view, the query fails. Replicated materialized views may see staleness above the view's max staleness. |
| Model | Model | Stored as managed tables. |
| Remote function | Remote function | Connections are regional. Remote functions that reference a dataset or resource (connection) that is not located in the same location as the remote function fail when run. |
| Routines | User-defined function (UDF) or stored procedure | Routines that reference a dataset or resource that is not located in the same location as the routine fail when run. Any routine that references a connection, such as a remote function, does not work outside the source region. |
| Row access policy | Row access policy | No difference. |
| Search index | Search index | Only index metadata is replicated. Index data exists only in the primary region. |
| Stored procedure | Stored procedure | Stored procedures that reference a dataset or resource that is not located in the same location as the stored procedure fail when run. |
| Table clone | Managed table | Billed as a deep copy in the secondary replica. |
| Table snapshot | Table snapshot | Billed as a deep copy in the secondary replica. |
| Table-valued function (TVF) | TVF | TVFs that reference a dataset or resource that is not located in the same location as the TVF fail when run. |
| UDF | UDF | UDFs that reference a dataset or resource that is not located in the same location as the UDF fail when run. |
| Data policy on a column | Data policy on a column | Custom data policies that reference a UDF that is not located in the same location as the policy fail when querying the table that the policy is attached to. |

Outage scenarios

Cross-region replication is not intended for use as a disaster recovery plan during a total-region outage. In the case of a total-region outage in the primary replica's region, you cannot promote the secondary replica. Because secondary replicas are read-only, you can't run any write jobs on the secondary replica, and you can't promote the secondary replica until the primary replica's region is restored. For more information about preparing for disaster recovery, see Managed disaster recovery.

The following table explains the impact of total-region outages on your replicated data:

| Region 1 | Region 2 | Outage region | Impact |
| --- | --- | --- | --- |
| Primary replica | Secondary replica | Region 2 | Read-only jobs running in region 2 against the secondary replica fail. |
| Primary replica | Secondary replica | Region 1 | All jobs running in region 1 fail. Read-only jobs continue to run in region 2, where the secondary replica is located. The contents of region 2 are stale until region 2 is successfully synced with region 1. |

Use dataset replication

This section describes how to replicate a dataset, promote the secondary replica, and run BigQuery read jobs in the secondary region.

Required permissions

To get the permissions that you need to manage replicas, ask your administrator to grant you the bigquery.datasets.update permission.

Replicate a dataset

To replicate a dataset, use the ALTER SCHEMA ADD REPLICA DDL statement.

You can add a replica to any dataset that's located in a region or multi-region where the dataset is not already replicated. After you add a replica, it takes time for the initial copy operation to complete. You can still run queries referencing the primary replica while the data is being replicated, with no reduction in query processing capacity. You can't replicate data between locations contained within the same multi-region.

The following example creates a dataset named my_dataset in the us-central1 region and then adds a replica in the us-east4 region:

```sql
-- Create the primary replica in the us-central1 region.
CREATE SCHEMA my_dataset OPTIONS(location = 'us-central1');

-- Create a replica in the secondary region.
ALTER SCHEMA my_dataset
ADD REPLICA `my_replica`
OPTIONS(location = 'us-east4');
```

To confirm that the secondary replica has been successfully created, you can query the creation_complete column in the INFORMATION_SCHEMA.SCHEMATA_REPLICAS view.
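For example, a minimal sketch of that status check, assuming the replica was added in us-east4 as in the preceding example (the region qualifier and filter value are illustrative):

```sql
-- Returns one row per replica; creation_complete is TRUE once the
-- initial copy to the secondary region has finished.
SELECT
  schema_name,
  replica_name,
  creation_complete
FROM `region-us-east4`.INFORMATION_SCHEMA.SCHEMATA_REPLICAS
WHERE schema_name = 'my_dataset';
```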

After the secondary replica has been created, you can query it by explicitly setting the location of the query to the secondary region. If a location is not explicitly set, BigQuery uses the region of the primary replica of the dataset.

Promote the secondary replica

If the primary region is online, you can promote the secondary replica. Promotion switches the secondary replica to be the writeable primary. This operation completes within a few seconds if the secondary replica is caught up with the primary replica. If the secondary replica is not caught up, the promotion can't complete until it is. The secondary replica can't be promoted to the primary if the region containing the primary has an outage.

Note the following:

  • All writes to tables return errors while promotion is in process. The old primary replica becomes non-writable immediately when the promotion begins.
  • Tables that aren't fully replicated at the time the promotion is initiated return stale reads.

To promote a replica to be the primary replica, use the ALTER SCHEMA SET OPTIONS DDL statement and set the primary_replica option.

Note the following:

  • You must explicitly set the job location to the secondary region in the query settings. See Specify locations.

The following example promotes the us-east4 replica to be the primary:

```sql
ALTER SCHEMA my_dataset SET OPTIONS(primary_replica = 'us-east4');
```

To confirm that the secondary replica has been successfully promoted, you can query the replica_primary_assignment_complete column in the INFORMATION_SCHEMA.SCHEMATA_REPLICAS view.
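As an illustrative sketch, continuing the us-east4 example (the region qualifier and filter value are assumptions):

```sql
-- replica_primary_assignment_complete is TRUE once the promotion
-- has fully taken effect.
SELECT
  schema_name,
  replica_name,
  replica_primary_assignment_complete
FROM `region-us-east4`.INFORMATION_SCHEMA.SCHEMATA_REPLICAS
WHERE schema_name = 'my_dataset';
```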

Remove a dataset replica

To remove a replica and stop replicating the dataset, use the ALTER SCHEMA DROP REPLICA DDL statement.

The following example removes the us replica:

```sql
ALTER SCHEMA my_dataset DROP REPLICA IF EXISTS `us`;
```

You must first drop any secondary replicas before you can delete the entire dataset. If you delete the entire dataset (for example, by using the DROP SCHEMA statement) without dropping all secondary replicas, you receive the following error:

The dataset replica of the cross region dataset 'project_id:dataset_id' in region 'REGION' is not yet writable because the primary assignment is not yet complete.
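Deleting a replicated dataset is therefore a two-step sequence. The following sketch reuses the my_dataset and my_replica names from the earlier example; the CASCADE clause is shown only to illustrate deleting a non-empty dataset:

```sql
-- 1. Drop the secondary replica first.
ALTER SCHEMA my_dataset DROP REPLICA IF EXISTS `my_replica`;

-- 2. Then delete the dataset itself (CASCADE also deletes its contents).
DROP SCHEMA IF EXISTS my_dataset CASCADE;
```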

For more information, see Promote the secondary replica.

List dataset replicas

To list the dataset replicas in a project, query the INFORMATION_SCHEMA.SCHEMATA_REPLICAS view.
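For example, a minimal sketch that lists the replicas visible in the US multi-region (the region qualifier and the selected columns are assumptions about the view's schema):

```sql
SELECT
  catalog_name,
  schema_name,
  replica_name,
  creation_complete
FROM `region-us`.INFORMATION_SCHEMA.SCHEMATA_REPLICAS;
```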

Migrate datasets

You can use cross-region dataset replication to migrate your datasets from one region to another. The following example demonstrates the process of migrating the existing my_migration dataset from the US multi-region to the EU multi-region using cross-region replication.

Replicate the dataset

To begin the migration process, first replicate the dataset in the region that you want to migrate the data to. In this scenario, you are migrating the my_migration dataset to the EU multi-region.

```sql
-- Create a replica in the secondary region.
ALTER SCHEMA my_migration ADD REPLICA `eu` OPTIONS(location = 'eu');
```

This creates a secondary replica named eu in the EU multi-region. The primary replica is the my_migration dataset in the US multi-region.

Promote the secondary replica

To continue migrating the dataset to the EU multi-region, promote the secondary replica:

```sql
ALTER SCHEMA my_migration SET OPTIONS(primary_replica = 'eu');
```

After the promotion is complete, eu is the primary replica. It is a writable replica.

Complete the migration

To complete the migration from the US multi-region to the EU multi-region, delete the us replica. This step is not required, but is useful if you don't need a dataset replica beyond your migration needs.

```sql
ALTER SCHEMA my_migration DROP REPLICA IF EXISTS `us`;
```

Your dataset is located in the EU multi-region, and there are no replicas of the my_migration dataset. You have successfully migrated your dataset to the EU multi-region. The complete list of resources that are migrated can be found in Resource behavior.

Customer-managed encryption keys (CMEK)

Customer-managed Cloud Key Management Service keys are not automatically replicated when you create a secondary replica. To maintain the encryption on your replicated dataset, you must set the replica_kms_key for the location of the added replica. You can set the replica_kms_key using the ALTER SCHEMA ADD REPLICA DDL statement.

Replicating datasets with CMEK behaves as described in the following scenarios:

  • If the source dataset has a default_kms_key, you must provide a replica_kms_key that was created in the replica dataset's region when using the ALTER SCHEMA ADD REPLICA DDL statement.

  • If the source dataset doesn't have a value set for default_kms_key, you can't set the replica_kms_key.

  • If you are using Cloud KMS key rotation on either (or both) of the default_kms_key or the replica_kms_key, the replicated dataset is still queryable after the key rotation.

    • Key rotation in the primary region updates the key version only in tables created after the rotation; tables that existed prior to the key rotation still use the key version that was set prior to the rotation.
    • Key rotation in the secondary region updates all tables in the secondary replica to the new key version.
    • Switching the primary replica to the secondary replica updates all tables in the secondary replica (formerly the primary replica) to the new key version.
    • If the key version set on tables in the primary replica prior to key rotation is deleted, any tables still using that key version cannot be queried until the key version is updated. To update the key version, the old key version must be active (not disabled or deleted).
  • If the source dataset doesn't have a value set for default_kms_key, but there are individual tables in the source dataset with CMEK applied, those tables aren't queryable in the replicated dataset. To query the tables, do the following:

    • Add a default_kms_key value for the source dataset.
    • When you create a new replica using the ALTER SCHEMA ADD REPLICA DDL statement, set a value for the replica_kms_key option. The CMEK tables are queryable in the destination region.

    All of the CMEK tables in the destination region use the same replica_kms_key, regardless of the key used in the source region.

Create a replica with CMEK

The following example creates a replica in the us-west1 region with a replica_kms_key value set. For the CMEK key, grant the BigQuery service account permission to encrypt and decrypt.

```sql
-- Create a replica in the secondary region.
ALTER SCHEMA my_dataset
ADD REPLICA `us-west1`
OPTIONS(
  location = 'us-west1',
  replica_kms_key = 'my_us_west1_kms_key_name');
```

CMEK limitations

Replicated datasets with CMEK applied are subject to the following limitations:

  • You can't update the replicated Cloud KMS key after the replica is created.

  • You can't update the default_kms_key value on the source dataset after the dataset replicas have been created.

  • If the provided replica_kms_key is not valid in the destination region, the dataset won't be replicated.

Data policies assigned to a column

The following sections describe how data policies assigned directly to a column (Preview) interact with cross-region replication.

Data policies assigned to columns are regional resources. This means the data policy and its attached table must be in the same region. Data policies are automatically replicated when you create a secondary replica. If a data policy is attached to any table within a replicated dataset, BigQuery creates or updates the data policy and its corresponding IAM policies in the secondary region.

Mutability

Replicated data policies are read-only in secondary regions. You can't update the data policy in a secondary region. The original data policy attached to the table in the primary region remains mutable. A data policy in a region is mutable only if it is not attached to any secondary tables. If it is attached to tables in secondary regions, it becomes immutable. BigQuery rejects any operations that update or set IAM policies if the data policy is immutable.

Naming conflicts

The data policy resource is the same between the primary and secondary regions, except for the location. For a data policy and its replica in the secondary region, the IDs in the format projects/PROJECT_NUMBER/locations/LOCATION_ID/dataPolicies/DATA_POLICY_ID are identical, except for the value of LOCATION_ID. Replication fails if a data policy with a conflicting ID already exists in the secondary region. You must resolve the naming conflict in either the primary or secondary region before replication proceeds.

Custom masking policies

If you use a custom masking routine, ensure that you replicate the custom UDFs. You can include these UDFs as part of the dataset you are replicating.

Attaching, detaching, or deleting data policies to table columns

Attaching, detaching, or deleting a data policy to or from a column in the primary region is a table schema change. The table schemas in all secondary regions are updated to reflect the change. However, detaching a data policy from all table columns in the primary region makes the data policy orphaned in the secondary region, which you must clean up manually. Manually attaching a replicated data policy in the secondary region isn't recommended, because it can lead to complicated situations; for example, fine-grained access control (FGAC) grants can't be added or removed.

If you delete a detached data policy, a background job removes it from the table schema. When this happens, the table schema change propagates to the destination region. As with detaching policies, you must manually delete the data policy in the secondary region.

Drop a replica

If you drop the replica, the attached data policies are not automaticallydeleted. As with detaching policies, you must manually delete the data policy inthe secondary region.


Last updated 2026-02-18 UTC.