Add tables to a replication job

After you deploy a replication job, you cannot edit it or add tables to it. Instead, add the tables to a new or duplicate replication job.

Option 1: Create a new replication job

Adding tables to a new job is the simplest approach. It prevents historical reloading of all the tables and prevents data inconsistency issues.

The drawbacks are the increased overhead of managing multiple replication jobs and the consumption of more compute resources, as each job runs on a separate ephemeral Dataproc cluster by default. The latter can be mitigated to some extent by using a shared static Dataproc cluster for both jobs.

For more information about creating new jobs, see the Replication tutorials.

For more information about using a static Dataproc cluster in Cloud Data Fusion, see Run a pipeline against an existing Dataproc cluster.

Option 2: Stop the current replication job and create a duplicate

If you duplicate the replication job to add the tables, consider the following:

  • Enabling the snapshot for the duplicate job results in the historical load of all the tables from scratch. This is recommended if you cannot use the previous option, where you run separate jobs.

  • Disabling the snapshot to prevent the historical load can result in data loss, as there could be missed events between when the old pipeline stops and the new one starts. Creating an overlap to mitigate this issue isn't recommended, as it can also result in data loss: historical data for the new tables isn't replicated.
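The trade-off in the two points above can be illustrated with a small simulation. This is a conceptual sketch only, not Cloud Data Fusion code: the `Change` type, the `missed_changes` function, the table names, and the integer timestamps are all hypothetical, and the model assumes that a job with the snapshot disabled captures only changes made at or after its start time.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """A hypothetical change event in the source database."""
    time: int
    table: str


def missed_changes(changes, old_tables, new_tables, old_stop, new_start, snapshot):
    """Return the changes that never reach the target in this simplified model.

    old_stop:  time at which the original job is stopped
    new_start: time at which the duplicate job is started
    snapshot:  whether the duplicate job replicates existing data first
    """
    if snapshot:
        # A full historical load covers every table selected in the duplicate job.
        return [c for c in changes if c.table not in new_tables]
    missed = []
    for c in changes:
        captured_by_old = c.table in old_tables and c.time < old_stop
        captured_by_new = c.table in new_tables and c.time >= new_start
        if not (captured_by_old or captured_by_new):
            missed.append(c)
    return missed


changes = [
    Change(1, "orders"),
    Change(2, "customers"),
    Change(5, "orders"),
    Change(6, "customers"),
    Change(9, "orders"),
]
old_tables = {"orders"}               # tables in the original job
new_tables = {"orders", "customers"}  # tables in the duplicate job

# Snapshot disabled, with a gap between stopping the old job and starting the new one.
gap = missed_changes(changes, old_tables, new_tables, old_stop=4, new_start=8, snapshot=False)

# Snapshot disabled, with the jobs overlapping in time.
overlap = missed_changes(changes, old_tables, new_tables, old_stop=8, new_start=4, snapshot=False)

# Snapshot enabled: everything is reloaded from scratch.
full = missed_changes(changes, old_tables, new_tables, old_stop=4, new_start=8, snapshot=True)

print("gap:", gap)
print("overlap:", overlap)
print("snapshot:", full)
```

In this model, the gap scenario loses changes to both tables, the overlap scenario still loses the earlier change to the newly added `customers` table, and only the snapshot scenario loses nothing, at the cost of reloading every table.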

To create a duplicate replication job, follow these steps:

  1. Stop the existing pipeline.

  2. From the Replication jobs page, locate the job that you want to duplicate, and then click Duplicate.

  3. Enable the snapshot:

    1. Go to Configure source.
    2. In the Replicate existing data field, select Yes.

  4. Add tables in the Select tables and transformations window and follow the wizard to deploy the replication pipeline.

Note: If you run a duplicate replication job against the same target BigQuery dataset as the original job, don't run the original job again, as it can cause data inconsistency.


Last updated 2025-12-15 UTC.