Export data to Bigtable (reverse ETL)

This document describes how you can set up reverse ETL (RETL) from BigQuery to Bigtable. You can do this by using the EXPORT DATA statement to export data from a BigQuery table to a Bigtable table.

You can use a RETL workflow to Bigtable to combine BigQuery's analytics capabilities with Bigtable's low latency and high throughput. This workflow lets you serve data to application users without exhausting quotas and limits on BigQuery.

Characteristics of Bigtable tables

Bigtable tables are different from BigQuery tables in several ways:

  • Both Bigtable and BigQuery tables are made of rows, but a Bigtable row is made of a row key and column families, each of which holds an arbitrary number of columns.
  • Column families for a given table are created at table creation time but can also be added or removed later. When a column family is created, the columns that belong to it don't need to be specified.
  • Bigtable columns don't need to be defined ahead of time and can be used to store data in their name (also known as a qualifier) within data size limits within tables.
  • Bigtable columns can have any binary value within data size limits within tables.
  • Bigtable columns always have a temporal dimension (also known as version). Any number of values can be stored in any row for the same column as long as the timestamps are not identical.
  • A Bigtable timestamp is measured in microseconds since Unix epoch time; for example, 0 represents 1970-01-01T00:00:00 UTC. Timestamps must be a non-negative number of microseconds with millisecond granularity (only multiples of 1000 microseconds are accepted). The default Bigtable timestamp is 0. See the example after this list.
  • Data in Bigtable is read by row key, multiple row keys, a range of row keys, or by using a filter. At least one row key or row key range is required in all types of read requests except for a full table scan.
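To illustrate the timestamp rules above, the following GoogleSQL shows how a wall-clock time maps to microseconds since the Unix epoch, and how an arbitrary microsecond value could be truncated to the required multiple-of-1000 granularity. This is a sketch for illustration only; the column aliases are not part of any export.

SELECT
  -- 2023-03-28T10:40:54Z is 1680000054000000 microseconds after the epoch.
  UNIX_MICROS(TIMESTAMP "2023-03-28T10:40:54Z") AS micros,
  -- Truncate to millisecond granularity (a multiple of 1000 microseconds).
  DIV(UNIX_MICROS(CURRENT_TIMESTAMP()), 1000) * 1000 AS truncated_micros;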

For information about preparing BigQuery results for export to Bigtable, see Prepare query results for export.

Before you begin

You must create a Bigtable instance and a Bigtable table to receive the exported data.

Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.

Required roles

To get the permissions that you need to export BigQuery data to Bigtable, ask your administrator to grant you the required IAM roles on your project.

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Limitations

Location considerations

  • If your BigQuery dataset is in a multi-region, your Bigtable app profile must be configured to route data to a Bigtable cluster within that multi-region. For example, if your BigQuery dataset is in the US multi-region, the Bigtable cluster can be located in the us-west1 (Oregon) region, which is within the United States.
  • If your BigQuery dataset is in a single region, your Bigtable app profile must be configured to route data to a Bigtable cluster in the same region. For example, if your BigQuery dataset is in the asia-northeast1 (Tokyo) region, your Bigtable cluster must also be in the asia-northeast1 (Tokyo) region.

For more information, see Bigtable locations.
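For example, assuming a hypothetical instance and cluster in us-west1 (the names here are placeholders), an app profile with single-cluster routing could be created with the gcloud CLI:

gcloud bigtable app-profiles create bq-export-profile \
    --instance=my-instance \
    --route-to=my-uswest1-cluster \
    --description="Single-cluster routing for BigQuery exports"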

Supported BigQuery types

The following types of data are supported when they're written to Bigtable:

  • BYTES: Exported as is.
  • STRING: Converted to BYTES.
  • INTEGER: If bigtable_options.column_families.encoding is set to BINARY, the value is written in an 8-byte, big-endian format (most significant byte first). If bigtable_options.column_families.encoding is set to TEXT, the value is written as a human-readable string representing a number.
  • FLOAT: Written in the IEEE 754 8-byte output format.
  • BOOLEAN: If bigtable_options.column_families.encoding is set to BINARY, the value is written as a 1-byte value (false = 0x00 or true = 0x01). If bigtable_options.column_families.encoding is set to TEXT, the value is written as text ("true" or "false").
  • JSON: An exported column of JSON type is interpreted as a group of columns belonging to a specific Bigtable column family. Members of the JSON object are interpreted as columns, and their values are written to Bigtable. The name of the column to be written can be adjusted using the bigtable_options configuration. For example:

    JSON '{"FIELD1": "VALUE1", "FIELD2": "VALUE2"}' AS MY_COLUMN_FAMILY

    Here, the values VALUE1 and VALUE2 are written to Bigtable as columns FIELD1 and FIELD2 in the column family MY_COLUMN_FAMILY.

    Note: A JSON document nested in another JSON or STRUCT type is written to Bigtable as a string. Exporting a STRUCT value nested in another struct is subject to limitations explained in the Configure exports with bigtable_options section.
  • STRUCT: An exported column of STRUCT type is interpreted as a group of columns belonging to a specific Bigtable column family. Members of the struct are interpreted as columns, and their values are written to Bigtable. The name of the column to be written can be adjusted using the bigtable_options configuration. For example:

    STRUCT<FIELD1 STRING, FIELD2 INTEGER> AS MY_COLUMN_FAMILY

    Here, the values of FIELD1 and FIELD2 are written to Bigtable as columns FIELD1 and FIELD2 in the column family MY_COLUMN_FAMILY.
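As an illustration of the JSON mapping, the following sketch (with hypothetical table and column names, and assuming the attrs column family already exists in the destination table) exports a JSON column as the attrs column family, so the object's keys become column qualifiers:

EXPORT DATA OPTIONS (...) AS
SELECT
  id AS rowkey,
  -- The JSON object's keys (color, size) become column qualifiers in attrs.
  JSON '{"color": "red", "size": "L"}' AS attrs
FROM T;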

These supported data types are similar to those for reading from external Bigtable tables in BigQuery.

NULL values in Bigtable

NULL values in Bigtable have the following constraints:

  • Bigtable has no analog for NULL values. Exporting a NULL value for a given column family and column in Bigtable deletes the existing values from a Bigtable row.

  • If a Bigtable value with a given row key, column family, column qualifier, and timestamp doesn't exist prior to the export, the exported NULL values have no effect on the Bigtable row.

  • When exporting a NULL value of the STRUCT or JSON type, all column values belonging to the corresponding column family of the affected row are deleted. You should cast the NULL value to the STRUCT or JSON type in order for the SQL engine to attach a correct type to it. The following query deletes all data from column family column_family1 for a set of given row keys:

    EXPORT DATA OPTIONS (...) AS
    SELECT rowkey, CAST(NULL AS STRUCT<INT64>) AS column_family1 FROM T
  • Rows with NULL row keys are skipped during the export. The number of skipped rows is returned in export statistics to the caller.

Configure exports with bigtable_options

You can use the bigtable_options configuration during an export to bridge the differences between the BigQuery and Bigtable storage models. The configuration is expressed in the form of a JSON string, as seen in the following example:

EXPORT DATA OPTIONS (
  uri = "https://bigtable.googleapis.com/projects/PROJECT_ID/instances/INSTANCE_ID/appProfiles/APP_PROFILE_ID/tables/TABLE",
  bigtable_options = """{
    "columnFamilies": [{
      "familyId": "COLUMN_FAMILY_NAME",
      "encoding": "ENCODING_VALUE",
      "columns": [
        {
          "qualifierString": "BIGTABLE_COLUMN_QUALIFIER",
          ["qualifierEncoded": "BASE_64_ENCODED_VALUE",]
          "fieldName": "BIGQUERY_RESULT_FIELD_NAME"
        }
      ]
    }]
  }"""
)

The following list describes the possible fields used in a bigtable_options configuration:

  • columnFamilies: An array of column family descriptors.
  • columnFamilies.familyId: Identifier of the Bigtable column family.
  • columnFamilies.encoding: Can be set to BINARY or TEXT. For information about how types are encoded, see Supported BigQuery types.
  • columnFamilies.columns: An array of Bigtable column mappings.
  • columnFamilies.columns.qualifierString: Optional. A Bigtable column qualifier. Specify this value if the column qualifier contains no non-UTF-8 codes. The fields qualifierString and qualifierEncoded are mutually exclusive. If neither qualifierString nor qualifierEncoded is specified, fieldName is used as the column qualifier.
  • columnFamilies.columns.qualifierEncoded: Optional. A base64-encoded column qualifier. Similar to qualifierString; use it when the column qualifier contains non-UTF-8 codes.
  • columnFamilies.columns.fieldName: Required. The BigQuery result set field name. Can be an empty string in certain cases. For an example of how an empty fieldName value is used with fields of simple types, see Prepare query results for export.
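For instance, the following bigtable_options fragment (with hypothetical family and field names) maps the result field quote_id to a column qualifier supplied in base64; "cWlk" is the base64 encoding of the bytes q, i, d:

bigtable_options = """{
  "columnFamilies": [{
    "familyId": "ids",
    "columns": [
      {"qualifierEncoded": "cWlk", "fieldName": "quote_id"}
    ]
  }]
}"""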

Prepare query results for export

To export query results to Bigtable, the results must meet the following requirements:

  • The result set must contain a rowkey column of the type STRING or BYTES.
  • Row keys, column qualifiers, values, and timestamps must not exceed Bigtable data size limits within tables.
  • At least one column other than rowkey must be present in the result set.
  • Each result set column must be one of the supported BigQuery types. Any unsupported column types must be converted to one of the supported types before exporting to Bigtable, as shown in the example after this list.
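For example, DATE and NUMERIC aren't in the supported list, so a query along the following lines (with hypothetical column names) casts them to supported types before export:

EXPORT DATA OPTIONS (...) AS
SELECT
  id AS rowkey,
  STRUCT(
    CAST(order_date AS STRING) AS order_date,  -- DATE isn't supported; cast to STRING
    CAST(price AS FLOAT64) AS price            -- NUMERIC isn't supported; cast to FLOAT64
  ) AS order_info
FROM T;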

Bigtable doesn't require column qualifiers to be valid BigQuery column names, and Bigtable supports using any bytes. For information about overriding target column qualifiers for an export, see Configure exports with bigtable_options.

If you use exported values with Bigtable APIs, such as ReadModifyWriteRow, any numerical values must use the correct binary encoding.
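For example, if a counter cell will later be incremented with ReadModifyWriteRow, which expects a 64-bit big-endian integer, you could export it with BINARY encoding. A sketch with hypothetical table, family, and column names:

EXPORT DATA OPTIONS (
  uri = "https://bigtable.googleapis.com/projects/PROJECT_ID/instances/INSTANCE_ID/appProfiles/APP_PROFILE_ID/tables/TABLE",
  format = "CLOUD_BIGTABLE",
  bigtable_options = """{
    "columnFamilies": [{
      "familyId": "counters",
      "encoding": "BINARY",
      "columns": [{"qualifierString": "visits", "fieldName": "visit_count"}]
    }]
  }"""
) AS
SELECT user_id AS rowkey, STRUCT(visit_count) AS counters FROM T;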

By default, standalone result columns of types other than STRUCT or JSON are interpreted as values for a destination column family whose name equals the result column name, with a column qualifier equal to an empty string.

To demonstrate how these data types are written, consider the following SQL example, where column1 and column2 are standalone result columns:

SELECT x AS column1, y AS column2 FROM table

In this example query, SELECT x AS column1 writes values to Bigtable under the column1 column family and the '' (empty string) column qualifier when handling types other than JSON or STRUCT.

You can change how these types are written in an export by using the bigtable_options configuration, as seen in the following example:

EXPORT DATA OPTIONS (
  bigtable_options = """{
    "columnFamilies": [
      {
        "familyId": "ordered_at",
        "columns": [
          {"qualifierString": "order_time", "fieldName": ""}
        ]
      }
    ]
  }"""
) AS
SELECT
  order_id AS rowkey,
  STRUCT(product, amount) AS sales_info,
  -- UNIX_MILLIS returns milliseconds since the Unix epoch.
  UNIX_MILLIS(order_timestamp) AS ordered_at
FROM T

In this example, BigQuery table T contains the following row:

  order_id   order_timestamp        product    amount
  101        2023-03-28T10:40:54Z   Joystick   2

If you use the preceding bigtable_options configuration with table T, the following data is written to Bigtable:

  rowkey: 101
    sales_info (column family):
      product: Joystick (timestamp: 1970-01-01T00:00:00Z)
      amount: 2 (timestamp: 1970-01-01T00:00:00Z)
    ordered_at (column family):
      order_time: 1680000054000 (timestamp: 1970-01-01T00:00:00Z)

The value 1680000054000 represents 2023-03-28T10:40:54Z in milliseconds since Unix epoch time in the UTC time zone.

Auto-create new column families

To auto-create new column families in a Bigtable table, set the auto_create_column_families option in the EXPORT DATA statement to true. This option requires the bigtable.tables.update permission, which is included in roles such as Bigtable Administrator (roles/bigtable.admin).

EXPORT DATA OPTIONS (
  uri = "https://bigtable.googleapis.com/projects/PROJECT-ID/instances/INSTANCE-ID/appProfiles/APP_PROFILE_ID/tables/TABLE",
  format = "CLOUD_BIGTABLE",
  auto_create_column_families = true
) AS
SELECT
  order_id AS rowkey,
  STRUCT(product, amount) AS sales_info
FROM T

Set timestamp for all cells in a row using _CHANGE_TIMESTAMP

You can add a _CHANGE_TIMESTAMP column of the TIMESTAMP type to the result for export. Every cell written to Bigtable uses the timestamp value from the _CHANGE_TIMESTAMP column of the exported result row.

Bigtable doesn't support timestamps earlier than the Unix epoch (1970-01-01T00:00:00Z). If the _CHANGE_TIMESTAMP value is NULL, the Unix epoch time of 0 is used as the default timestamp value.

The following query writes cells for the product and amount columns with the timestamp specified in the order_timestamp column of table T.

EXPORT DATA OPTIONS (...) AS
SELECT
  rowkey,
  STRUCT(product, amount) AS sales_info,
  order_timestamp AS _CHANGE_TIMESTAMP
FROM T
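If order_timestamp can be NULL or earlier than the epoch, you might normalize it explicitly rather than rely on the default; a sketch:

EXPORT DATA OPTIONS (...) AS
SELECT
  rowkey,
  STRUCT(product, amount) AS sales_info,
  -- Replace NULL with the epoch, then clamp anything earlier than the epoch.
  GREATEST(
    IFNULL(order_timestamp, TIMESTAMP "1970-01-01T00:00:00Z"),
    TIMESTAMP "1970-01-01T00:00:00Z") AS _CHANGE_TIMESTAMP
FROM T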

Export continuously

If you want to continually process an export query, you can configure it as a continuous query.
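As a sketch, the EXPORT DATA statement itself stays the same; what changes is how the job is submitted. Assuming the bq command-line tool and a reservation that supports continuous queries, the invocation could look like the following (consult the continuous queries documentation for the exact flags):

bq query --use_legacy_sql=false --continuous=true \
'EXPORT DATA OPTIONS (...) AS SELECT ... FROM T'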

Export multiple results with the same rowkey value

When you export a result containing multiple rows with the same rowkey value, the values written to Bigtable end up in the same Bigtable row.

You can use this method to generate multiple versions of column values in the same row. In this example, the orders table in BigQuery contains the following data:

  id    customer   order_timestamp        amount_spent
  100   Bob        2023-01-01T10:10:54Z   10.99
  101   Alice      2023-01-02T12:10:50Z   102.7
  102   Bob        2023-01-04T15:17:01Z   11.1

The user then executes the following EXPORT DATA statement:

EXPORT DATA OPTIONS (
  uri = "https://bigtable.googleapis.com/projects/PROJECT-ID/instances/INSTANCE-ID/appProfiles/APP_PROFILE_ID/tables/TABLE",
  format = "CLOUD_BIGTABLE"
) AS
SELECT
  customer AS rowkey,
  STRUCT(amount_spent) AS orders_column_family,
  order_timestamp AS _CHANGE_TIMESTAMP
FROM orders

Using this statement with the BigQuery orders table results in the following data written to Bigtable:

  Row key: Alice
    orders_column_family:
      amount_spent: 102.7 (timestamp: 2023-01-02T12:10:50Z)

  Row key: Bob
    orders_column_family:
      amount_spent: 10.99 (timestamp: 2023-01-01T10:10:54Z)
      amount_spent: 11.1 (timestamp: 2023-01-04T15:17:01Z)

Exporting to Bigtable merges new values into the table instead of replacing entire rows. If values are already present in Bigtable for a row key, then new values can partially or fully override earlier values, depending on the column family, column names, and timestamps of the cells being written.

Caution: Avoid exporting results that have multiple rows with different values for the same row key, column family, column qualifier, and timestamp. The outcome of such an export is non-deterministic and depends on the query plan and scheduling inside BigQuery. BigQuery cannot determine whether the value overridden during the export existed in Bigtable before export or was previously inserted by the same export process.
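One way to avoid such collisions is to deduplicate the result set before export. The following sketch builds on the orders example and keeps a single row per row key and timestamp (the tie-break order here is arbitrary):

EXPORT DATA OPTIONS (...) AS
SELECT
  customer AS rowkey,
  STRUCT(amount_spent) AS orders_column_family,
  order_timestamp AS _CHANGE_TIMESTAMP
FROM orders
-- Keep one row per (row key, timestamp) pair to make the export deterministic.
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY customer, order_timestamp
  ORDER BY amount_spent DESC) = 1;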

Export multiple columns as Protocol Buffer (Protobuf) values

Protocol buffers provide a flexible and efficient mechanism for serializing structured data. Exporting as Protobuf can be beneficial given how differently types are handled between BigQuery and Bigtable. You can use BigQuery user-defined functions (UDFs) to export data as Protobuf binary values to Bigtable. For more information, see Export data as Protobuf columns.
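For instance, assuming a UDF created as described in Export data as Protobuf columns (the function name toMyProto and dataset mydataset here are hypothetical), the export could serialize several fields into a single BYTES cell:

EXPORT DATA OPTIONS (...) AS
SELECT
  customer AS rowkey,
  -- toMyProto is a placeholder for a UDF that returns Protobuf-encoded BYTES.
  STRUCT(mydataset.toMyProto(product, amount) AS sales_proto) AS sales_info
FROM T;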

Export optimization

You can change the throughput at which records are exported from BigQuery to Bigtable by modifying the number of nodes in the Bigtable destination cluster. Throughput (rows written per second) scales linearly with the number of nodes in the destination cluster. For example, if you double the number of nodes in your destination cluster, your export throughput roughly doubles.
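For example, a destination cluster (the names here are placeholders) could be scaled up before a large export with the gcloud CLI:

gcloud bigtable clusters update my-cluster \
    --instance=my-instance \
    --num-nodes=6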

Pricing

When you export data in a standard query, you are billed using data extraction pricing. When you export data in a continuous query, you are billed using BigQuery capacity compute pricing. To run continuous queries, you must have a reservation that uses the Enterprise or Enterprise Plus edition, and a reservation assignment that uses the CONTINUOUS job type.

After the data is exported, you're charged for storing the data in Bigtable. For more information, see Bigtable pricing.
