Write from Dataflow to Bigtable

To write data from Dataflow to Bigtable, use the Apache Beam Bigtable I/O connector.

Note: Depending on your scenario, consider using one of the Google-provided Dataflow templates. Several of these write to Bigtable.

Parallelism

Parallelism is controlled by the number of nodes in the Bigtable cluster. Each node manages one or more key ranges, although key ranges can move between nodes as part of load balancing. For more information, see Understand performance in the Bigtable documentation.

You are charged for the number of nodes in your instance's clusters. See Bigtable pricing.

Performance

The following table shows performance metrics for Bigtable I/O write operations. The workloads were run on one e2-standard-2 worker, using the Apache Beam SDK 2.48.0 for Java. They did not use Runner v2.

100M records | 1 kB | 1 column    Throughput (bytes)    Throughput (elements)
Write                             65 MBps               60,000 elements per second

These metrics are based on simple batch pipelines. They are intended to compare performance between I/O connectors, and are not necessarily representative of real-world pipelines. Dataflow pipeline performance is complex, and is a function of VM type, the data being processed, the performance of external sources and sinks, and user code. Metrics are based on running the Java SDK, and aren't representative of the performance characteristics of other language SDKs. For more information, see Beam IO Performance.
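As a rough back-of-the-envelope reading of the table above, the following sketch divides the published figures to estimate how long the benchmark workload would take on a single worker and how many bytes each element implies. The class and method names are hypothetical, and the calculation assumes that 1 MB means 10^6 bytes and that throughput stays constant for the whole run, which real pipelines might not achieve.

```java
// Back-of-the-envelope arithmetic on the benchmark figures; not a performance model.
class ThroughputEstimate {

    // Figures copied from the performance table.
    static final long RECORDS = 100_000_000L;
    static final double ELEMENTS_PER_SECOND = 60_000.0;
    // Assumption: "65 MBps" is read as 65 * 10^6 bytes per second.
    static final double BYTES_PER_SECOND = 65_000_000.0;

    // Seconds needed to write the given number of records at a constant rate.
    static double secondsToWrite(long records, double elementsPerSecond) {
        return records / elementsPerSecond;
    }

    // Average bytes per element implied by the two throughput columns.
    static double bytesPerElement(double bytesPerSecond, double elementsPerSecond) {
        return bytesPerSecond / elementsPerSecond;
    }

    public static void main(String[] args) {
        double seconds = secondsToWrite(RECORDS, ELEMENTS_PER_SECOND);
        System.out.println("Estimated write time (s): " + Math.round(seconds));
        System.out.println("Estimated write time (min): " + Math.round(seconds / 60.0));
        System.out.println("Implied bytes per element: "
            + Math.round(bytesPerElement(BYTES_PER_SECOND, ELEMENTS_PER_SECOND)));
    }
}
```

Under these assumptions the workload takes about 1,667 seconds (roughly 28 minutes), and each element averages about 1,083 bytes, which is consistent with 1 kB records plus some per-row overhead.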

Best practices

  • In general, avoid using transactions. Transactions aren't guaranteed to be idempotent, and Dataflow might invoke them multiple times due to retries, causing unexpected values.

  • A single Dataflow worker might process data for many key ranges, leading to inefficient writes to Bigtable. Using GroupByKey to group data by Bigtable key can significantly improve write performance.

  • If you write large datasets to Bigtable, consider calling withFlowControl. This setting automatically rate-limits traffic to Bigtable, to ensure the Bigtable servers have enough resources available to serve data.
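The first two points above can be illustrated without any Beam or Bigtable dependency. The following is a minimal plain-Java sketch, not connector code: the in-memory map standing in for a table and all class and method names are hypothetical. It shows how a non-idempotent update (here an increment, chosen as an assumed example) produces a different result when the same write runs twice, while an idempotent overwrite does not, and how grouping records by row key collects all values for a key in one place before writing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these names come from Beam or Bigtable.
class WriteBestPracticesSketch {

    // Hypothetical in-memory stand-in for a table: row key -> stored value.
    static final Map<String, Long> table = new HashMap<>();

    // Non-idempotent update: every invocation changes the stored value again.
    static void increment(String rowKey, long delta) {
        table.merge(rowKey, delta, Long::sum);
    }

    // Idempotent update: repeating it leaves the same final value.
    static void setValue(String rowKey, long value) {
        table.put(rowKey, value);
    }

    // Runs the same write several times, the way a retried write might.
    static void runWithRetries(Runnable write, int attempts) {
        for (int i = 0; i < attempts; i++) {
            write.run();
        }
    }

    // Collects all values that share a row key, preserving first-seen key order.
    // Each input element is a {rowKey, value} pair.
    static Map<String, List<String>> groupByRowKey(List<String[]> records) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] record : records) {
            grouped.computeIfAbsent(record[0], k -> new ArrayList<>()).add(record[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        runWithRetries(() -> increment("user#1", 5L), 2);
        runWithRetries(() -> setValue("user#2", 5L), 2);
        System.out.println(table.get("user#1")); // prints 10: the repeat double-counted
        System.out.println(table.get("user#2")); // prints 5: the repeat was harmless

        List<String[]> records = Arrays.asList(
            new String[] {"row-a", "v1"},
            new String[] {"row-b", "v2"},
            new String[] {"row-a", "v3"});
        System.out.println(groupByRowKey(records)); // prints {row-a=[v1, v3], row-b=[v2]}
    }
}
```

In a real pipeline the grouping step would be the GroupByKey transform rather than a local map, and the write would go through the Bigtable I/O connector. The sketch only shows why repeated execution is safe for overwrites but not for read-modify-write updates.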

Last updated 2026-02-19 UTC.