Change streams overview

A change stream watches and streams out a Spanner database's data changes (inserts, updates, and deletes) in near real-time.

This page offers a high-level overview of Spanner change streams: what they do, and how they work. To learn how to create and manage change streams in your database and connect them with other services, follow the links in What's next.

Purpose of change streams

Change streams provide a flexible, scalable way to stream data changes to other services. Common use cases include:

  • Replicating Spanner data changes to a data warehouse, such as BigQuery, for analytics.

  • Triggering application logic based on data changes sent to a message queue, such as Pub/Sub.

  • Storing data changes in Cloud Storage, for compliance or archival purposes.

Change stream configuration

Spanner treats change streams as schema objects, much like tables and indexes. As such, you create, modify, and delete change streams using DDL statements, and you can view a database's change streams just like other DDL-managed schema objects.

You can configure a change stream to watch data changes across an entire database, or limit its scope to specific tables and columns. A database can have multiple change streams, and a particular table or column can have multiple streams watching it, within limits.

You can optionally configure a change stream with a data retention period, a value capture type, and filters that exclude TTL-based deletes, particular table modification types, or particular write transactions. Each of these options is described later on this page.

Issuing the DDL that creates a change stream starts a long-running operation. When it completes, the new change stream immediately begins to watch the tables and columns assigned to it.
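
For example, the following DDL sketch (GoogleSQL dialect; the stream name is illustrative) creates a change stream that watches the whole database:

    -- Watch every table and column in the database.
    CREATE CHANGE STREAM EverythingStream FOR ALL;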

Implicitly watching tables and columns

Change streams that watch an entire table implicitly watch all the columns in that table, even when that table definition is updated. For example, when you add new columns to that table, the change stream automatically begins to watch those new columns, without requiring any modification to that change stream's configuration. Similarly, the change stream automatically stops watching any columns that are dropped from that table.

Whole-database change streams work the same way. They implicitly watch every column in every table, automatically watching any tables or columns added after the change stream's creation, and ceasing to watch any tables or columns dropped.

Explicitly watching tables and columns

If you configure a change stream to watch only particular columns in a table, and you later add columns to that table, the change stream will not begin to watch those columns unless you reconfigure that change stream to do so.
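
As an illustration, here is a hedged sketch (GoogleSQL dialect; the Orders table, its columns, and the stream name are hypothetical) of creating a column-scoped stream and later widening it after a schema change:

    -- Watch only two columns of Orders; primary key columns are always included.
    CREATE CHANGE STREAM OrderUpdates FOR Orders(Status, Total);

    -- After adding a Notes column to Orders, the stream must be explicitly
    -- reconfigured before it watches the new column.
    ALTER CHANGE STREAM OrderUpdates SET FOR Orders(Status, Total, Notes);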

The database's schema treats change streams as dependent objects of any columns or tables that they explicitly watch. Before you can drop any such column or table, you must manually remove it from the configuration of any change stream explicitly watching it.

Types of data changes that change streams watch

The data changes that a change stream watches include all inserts, updates, and deletes made to the tables and columns that it watches. These changes can come from any write path, including DML statements, mutations, and batch writes.

Change streams can watch data changes only in user-created columns and tables. They don't watch indexes, views, other change streams, or system tables such as the information schema or statistics tables. Change streams don't watch generated columns unless the column is part of the primary key. Primary key columns are always tracked.

Furthermore, change streams don't watch schema changes or any data changes that directly result from schema changes, other than backfills for default values. For example, a change stream watching an entire database doesn't consider and record a table deletion as a data change, even though this action deletes all of that table's data from the database.

How Spanner writes and stores change streams

Every time Spanner detects a data change in a column being watched by a change stream, it writes a data change record to its internal storage. The data change write and the data change record are written within the same transaction. Spanner co-locates both of these writes so they are processed by the same server, minimizing write processing. The transaction is then replicated across the database's replicas, subjecting it to storage and replication costs. For more information, see Spanner pricing.

Content of a data change record

Every data change record written by a change stream includes the following information about the data change:

  • The name of the affected table

  • The names, values, and data types of the primary keys identifying the changed row

  • The names and data types of the changed row's columns that were captured based on the change stream definition

  • The old values of the row's columns. The availability of the old values and the content they track, which can be either the modified columns only or the entire tracked row, depends on the user-configured value capture type.

  • The new values of the row's columns. The availability of the new values and the content they track depends on the user-configured value capture type.

  • The modification type (insert, update, or delete)

  • The commit timestamp

  • The transaction ID

  • The record sequence number

  • The data change record's value capture type

For a deeper look at the structure of data change records, see Data change records.

Note: A change stream's value capture type configuration option controls the way that it records a changed row's values. OLD_AND_NEW_VALUES is this option's default setting. For more information, see value capture type.

Data retention

A change stream retains its data change records for a period of time between one and thirty days. You can use DDL to specify a data retention limit other than the one-day default when initially creating a change stream, or adjust it at any future time. Note that reducing a change stream's data retention limit makes all historical change data older than the new limit immediately and permanently unavailable to that change stream's readers.

This data retention period presents a trade-off: a longer retention period carries greater storage demands on the stream's database.
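
As a minimal sketch (GoogleSQL dialect; the stream and table names are illustrative), retention is controlled through the retention_period option:

    -- Create a stream that retains change records for seven days.
    CREATE CHANGE STREAM OrderStream FOR Orders
      OPTIONS (retention_period = '7d');

    -- Later, reduce the retention period; records older than 36 hours
    -- immediately become unavailable to readers.
    ALTER CHANGE STREAM OrderStream SET OPTIONS (retention_period = '36h');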

Value capture type

A change stream's value capture type configuration option controls the way that it stores a changed row's values. You can use DDL to specify one of the following value capture types for a change stream, as the sketch after this list shows:

  • OLD_AND_NEW_VALUES: Captures both old and new values of a row's modified columns.

  • NEW_VALUES: Captures only the new values of the non-key columns, but no old values.

  • NEW_ROW: Captures all new values of watched columns, both modified and unmodified, whenever any of those columns change. No old values are captured.

  • NEW_ROW_AND_OLD_VALUES: Captures all new values for both modified and unmodified columns, and old values for modified columns.
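
A minimal sketch of setting this option (GoogleSQL dialect; names illustrative):

    -- Capture the full new row on every change, but no old values.
    CREATE CHANGE STREAM OrderStream FOR Orders
      OPTIONS (value_capture_type = 'NEW_ROW');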

Exclude time-to-live based deletes

In Spanner, time-to-live (TTL) lets you set policies to periodically delete data from Spanner tables. By default, change streams include all TTL-based deletes. You can use exclude_ttl_deletes to set your change stream to exclude TTL-based deletes. When you set this filter to exclude TTL-based deletes, only future TTL-based deletes are excluded from your change stream.

The default value for this filter is false. To exclude TTL-based deletes, set the filter to true. You can either add the filter when you create a change stream or modify an existing change stream to include the filter.
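
A hedged sketch of both approaches (GoogleSQL dialect; names illustrative):

    -- Exclude TTL-based deletes when creating the stream.
    CREATE CHANGE STREAM OrderStream FOR Orders
      OPTIONS (exclude_ttl_deletes = true);

    -- Or add the filter to an existing stream; only future TTL-based
    -- deletes are excluded.
    ALTER CHANGE STREAM OrderStream SET OPTIONS (exclude_ttl_deletes = true);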

Table modification type

By default, change streams include all table modifications, such as inserts, updates, and deletes. You can filter one or more of these table modifications from your change stream's scope using the following filter options:

  • exclude_insert: exclude all INSERT table modifications
  • exclude_update: exclude all UPDATE table modifications
  • exclude_delete: exclude all DELETE table modifications

The default value for these filters is false. To exclude a specific type of table modification, set the filter to true. You can set one or more filters at the same time.

You can add a filter for a table modification type when you create a change stream, or modify the filter for a table modification type for an existing change stream.
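
For example, this sketch (GoogleSQL dialect; names illustrative) creates a stream that records inserts and updates but ignores deletes:

    -- Record INSERT and UPDATE modifications, but not DELETE.
    CREATE CHANGE STREAM OrderStream FOR Orders
      OPTIONS (exclude_delete = true);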

Transaction-level records exclusion

By default, a change stream watches all write transactions in the database because the allow_txn_exclusion DDL option is set to false. You can set the allow_txn_exclusion option to true to enable your change stream to ignore records from specified write transactions. If you don't set this option to true, then all write transactions are watched, even if you use the exclude_txn_from_change_streams parameter in your write transaction.

You can either enable this option when you create a change stream or modify an existing change stream.
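
A minimal sketch (GoogleSQL dialect; names illustrative):

    -- Allow individual write transactions to opt out of this stream.
    CREATE CHANGE STREAM OrderStream FOR Orders
      OPTIONS (allow_txn_exclusion = true);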

Exclude write transaction from change streams

To exclude a write transaction from change streams, you must set the exclude_txn_from_change_streams parameter to true. This parameter is part of TransactionOptions and BatchWriteRequest. The default value for this parameter is false. You can set this parameter with the RPC API, REST API, or the client libraries. For more information, see Specify a write transaction to be excluded from change streams.

You can't set this parameter to true for read-only transactions. If you do, the API returns an invalid argument error.

For change streams monitoring columns modified by a transaction with exclude_txn_from_change_streams set to true, two scenarios are possible:

  • If the DDL option allow_txn_exclusion is set to true, then the updates made within this transaction aren't recorded in the change stream.
  • If you don't set the DDL option allow_txn_exclusion or if it's set to false, then the updates made within this transaction are recorded in the change stream.

If you don't set the exclude_txn_from_change_streams option or if it's set to false, then any change streams monitoring columns modified by the transaction will capture the updates made within that transaction.

Reading change streams

Spanner offers multiple ways to read a change stream's data:

  • Through Dataflow, using the Apache Beam SpannerIO connector. This is our recommended solution for most change stream applications. Google also provides Dataflow templates for common use cases.

  • Directly, using the Spanner API. This trades away the abstraction and capabilities of Dataflow pipelines for maximum speed and flexibility.

  • Through the Debezium-based Kafka connector for Spanner change streams, which streams change records directly into Kafka topics.

You can provide partial isolation for change stream reads by using directed reads. Directed reads can help to minimize impact on transactional workloads in your database. You can use the Spanner API to route change stream reads to a specific replica type or region within a multi-region instance configuration or a custom regional configuration with optional read-only regions. For more information, see directed reads.

Using Dataflow

Use the Apache Beam SpannerIO connector to build Dataflow pipelines that read from change streams. After you configure the connector with details about a particular change stream, it automatically outputs new data change records into a single, unbounded PCollection data set, ready for further processing by subsequent transforms in the Dataflow pipeline.

Dataflow uses windowing functions to divide unbounded collections into logical components, or windows. As a result, Dataflow provides near real-time streaming when reading from change streams.

Google provides templates that let you rapidly build Dataflow pipelines for common change stream use cases, including sending all of a stream's data changes to a BigQuery dataset, or copying them to a Cloud Storage bucket.

For a more detailed overview of how change streams and Dataflow work together, see Build change streams connections with Dataflow.

Using the API

As an alternative to using Dataflow to build change stream pipelines, you can instead write code that uses the Spanner API to read a change stream's records directly. This lets you read data change records in the same way that the SpannerIO connector does, trading the abstraction and flexibility of Dataflow for the lowest possible latencies when reading change stream data.
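
For instance, in the GoogleSQL dialect each change stream exposes a read function named READ_<stream name>. The following is a hedged sketch of an initial query (the stream name is illustrative; change stream queries run in single-use read-only transactions through the ExecuteStreamingSQL API):

    SELECT ChangeRecord
    FROM READ_OrderStream (
      start_timestamp => '2025-01-01T00:00:00Z',  -- must fall within the retention window
      end_timestamp => NULL,       -- NULL keeps the query streaming indefinitely
      partition_token => NULL,     -- NULL starts the initial partition query
      heartbeat_milliseconds => 10000
    );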

To learn more, see Query change streams. For a more detailed discussion of how to query change streams and interpret the records returned, see Change streams partitions, records, and queries.

Using the Kafka connector

The Kafka connector directly outputs change stream records into a Kafka topic. It abstracts away the details of querying change streams using the Spanner API.

To learn more about how change streams and the Kafka connector work together, see Build change streams connections with the Kafka connector.

Limits

There are several limits on change streams, including the maximum number of change streams a database can have, and the maximum number of streams that can watch a single column. For a full list, see Change stream limits.

Permissions

Change streams use the following permissions:

  • Creating, updating, or dropping change streams requires spanner.databases.updateDdl.

  • Reading a change stream's data requires spanner.databases.select.

If you use the SpannerIO connector, then the owner of the Dataflow job that reads change stream data requires additional IAM permissions, either on your application database or on a separate metadata database; see Create a metadata database.

What's next
