Manage backfill for the objects of a stream

A stream in Datastream can backfill historical data, as well as stream ongoing changes into a destination. As part of creating a stream, you configured information about the source database for the stream.

If you selected the Backfill historical data checkbox, then Datastream will stream all existing data, in addition to changes to the data, from the source into the destination.

If you didn't select this checkbox, then Datastream will stream only changes to the data. To have Datastream stream a snapshot of all existing data from the source to the destination, you must initiate backfill for the objects that contain this data. The objects are in the form of database schemas, tables, and columns.

Another reason for initiating backfill for an object is that data is out of sync between the source and the destination. For example, if a user inadvertently deletes data in the destination, that data is lost. In this case, initiating backfill for the object serves as a "reset mechanism" because all data for the object is streamed into the destination again in a single operation. As a result, the data is synced between the source and the destination.

After initiating backfill for an object, you can stop backfill for it. For example, suppose a user modifies the database schema, and the schema or data becomes corrupted. You don't want this schema or data to be streamed into the destination, so you stop backfill for the object.

You can also stop backfill for objects for load balancing purposes. Datastream can run multiple backfills in parallel, which can put additional load on the source. If the load is significant, stop backfill for the objects, and then initiate backfill for them one by one.
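
The one-by-one approach can be sketched as a small loop that starts backfill for a single object and waits for it to reach a final status before moving on. This is only an illustration: `start_backfill` and `get_status` are hypothetical callables that you would supply (for example, thin wrappers around whatever API client or tooling you use), and the polling interval is an arbitrary choice.

```python
import time
from typing import Callable, Dict, Iterable

# Statuses after which a backfill makes no further progress
# (taken from the object statuses described on this page).
FINAL_STATUSES = {"Completed", "Stopped", "Failed"}


def backfill_one_by_one(
    objects: Iterable[str],
    start_backfill: Callable[[str], None],  # hypothetical: initiates backfill
    get_status: Callable[[str], str],       # hypothetical: returns status text
    poll_seconds: float = 30.0,             # arbitrary polling interval
) -> Dict[str, str]:
    """Run backfills sequentially so only one loads the source at a time."""
    results: Dict[str, str] = {}
    for name in objects:
        start_backfill(name)
        status = get_status(name)
        # Wait until this object's backfill has finished before starting
        # the next one.
        while status not in FINAL_STATUSES:
            time.sleep(poll_seconds)
            status = get_status(name)
        results[name] = status
    return results
```

The returned dictionary maps each object to its final status, so any object that ends as Failed can be picked out and initiated again.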

Object statuses

The various statuses in the lifecycle of initiating and stopping backfill for an object include:

  • No status (represented in the UI as -): Reasons for an object receiving this status include:

    • The stream hasn't been started.
    • The Backfill historical data checkbox wasn't selected (so the backfill is defined as manual).
    • The object is excluded explicitly from being backfilled automatically.
    • The stream is configured to include future tables. If this happens, then when new tables are added to the source, there's no automatic backfill task created for them (because new tables typically don't have any "historical" data to backfill).
    Note: For more information, see Configure information about the source database for the stream.
  • Pending: backfill hasn't yet started for the object.

  • Active: backfill is in progress for the object.

  • Completed: backfill is completed for the object.

  • Stopped: backfill is stopped for the object. If backfill is initiated again for the object, then Datastream will stream all existing data associated with the object from the source into the destination.

  • Failed: backfill failed for the object and the backfill must be initiated again.
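
As a rough model, the statuses and the restrictions stated in the procedures on this page (no initiating while Pending or Active, no stopping once Completed, Stopped, or Failed) can be written down as follows. The enum and function names are illustrative and not part of Datastream. The page doesn't say whether an object with no status can be stopped, so the sketch assumes it can't, because no backfill is running.

```python
from enum import Enum


class BackfillStatus(Enum):
    """Object statuses as shown in the UI (names are illustrative)."""
    NO_STATUS = "-"
    PENDING = "Pending"
    ACTIVE = "Active"
    COMPLETED = "Completed"
    STOPPED = "Stopped"
    FAILED = "Failed"


def can_initiate(status: BackfillStatus) -> bool:
    # Backfill can't be initiated while one is already queued or running.
    return status not in {BackfillStatus.PENDING, BackfillStatus.ACTIVE}


def can_stop(status: BackfillStatus) -> bool:
    # Only a queued or running backfill can be stopped. Treating NO_STATUS
    # as not stoppable is an assumption, not something the page states.
    return status in {BackfillStatus.PENDING, BackfillStatus.ACTIVE}
```

For example, `can_initiate(BackfillStatus.FAILED)` is true, which matches the rule that a failed backfill must be initiated again.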

Initiate backfill

  1. Go to the Streams page in the Google Cloud Console.

  2. Click the stream that contains objects for which you want to initiate backfill.

  3. Click the OBJECTS tab.

  4. Select the checkbox for each object for which you want to initiate backfill.

  5. Click INITIATE BACKFILL.

    If an object has a status of Pending or Active, then you can't initiate backfill for the object.

  6. If you selected only one object, then in the dialog, click INITIATE OBJECT BACKFILL. Otherwise, if you selected multiple objects, then click INITIATE OBJECT BACKFILLS.

    Datastream will start backfill for the objects that you selected, and the status of each object will change from Pending to Active to Completed. When an object has a status of Completed, this means that Datastream has read all the data for the object, but the data might still be loading to the destination.

    If an object has a status of Failed, then backfill failed for the object, and you must initiate the backfill again.

Stop backfill

  1. Go to the Streams page in the Google Cloud Console.

  2. Click the stream that contains objects for which you want to stop backfill.

  3. Click the OBJECTS tab.

  4. Select the checkbox for each object for which you want to stop backfill.

  5. Click STOP BACKFILL.

    If an object has a status of Completed, Stopped, or Failed, then you can't stop backfill for the object.

  6. If you selected only one object, then in the dialog, click STOP OBJECT BACKFILL. Otherwise, if you selected multiple objects, then click STOP OBJECT BACKFILLS.

    Datastream will stop backfill for the objects that you selected, and the status of each object will change to Stopped.

    When an object has this status, backfill is stopped for the object. If backfill is initiated again for the object, then Datastream will stream all existing data associated with the object from the source into the destination.
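
The console steps for initiating and stopping backfill can also be scripted. The sketch below only builds request URLs. It assumes that the Datastream REST API (v1) exposes `startBackfillJob` and `stopBackfillJob` methods on stream objects, called with an authenticated POST; confirm the method names and request format against the API reference before relying on them. The helper names and all identifiers passed in are placeholders, and the object ID is assumed to be the identifier that the API assigns to the stream object, which may differ from the table name.

```python
BASE_URL = "https://datastream.googleapis.com/v1"  # assumed API base URL

# Assumed method names; verify them against the Datastream API reference.
BACKFILL_METHODS = {"start": "startBackfillJob", "stop": "stopBackfillJob"}


def object_resource(project: str, location: str, stream: str, object_id: str) -> str:
    """Build the resource name of a stream object (all values are placeholders)."""
    return (
        f"projects/{project}/locations/{location}"
        f"/streams/{stream}/objects/{object_id}"
    )


def backfill_request_url(resource: str, action: str) -> str:
    """Return the URL to POST to in order to start or stop backfill."""
    if action not in BACKFILL_METHODS:
        raise ValueError(f"action must be 'start' or 'stop', got {action!r}")
    return f"{BASE_URL}/{resource}:{BACKFILL_METHODS[action]}"


# Example with placeholder identifiers:
resource = object_resource("my-project", "us-central1", "my-stream", "my-object-id")
start_url = backfill_request_url(resource, "start")
stop_url = backfill_request_url(resource, "stop")
```

Sending the request requires credentials with permission on the stream, which this sketch leaves out.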

You can also use the OBJECTS tab to view additional information about the objects of a stream. This information includes:

  • The status of the objects.
  • How many events Datastream processed and loaded into the destination for an object in the last 7 days.
  • The total size (in GB) of all events that Datastream processed and loaded into the destination for an object in the last 30 days.
  • Details about the table columns of the database schemas that are streamed from the source into the destination.

Last updated 2025-12-15 UTC.