Bigtable to Cloud Storage SequenceFile template
The Bigtable to Cloud Storage SequenceFile template is a pipeline that reads data from a Bigtable table and writes the data to a Cloud Storage bucket in SequenceFile format. You can use the template to copy data from Bigtable to Cloud Storage.
Pipeline requirements
- The Bigtable table must exist.
- The output Cloud Storage bucket must exist before running the pipeline.
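Both requirements can be checked from the command line before launching the pipeline. The following is a sketch that assumes the gcloud CLI and the `cbt` CLI are installed and authenticated; the project, instance, and bucket names are placeholders.

```shell
# Hypothetical values; substitute your own.
PROJECT=my-project
INSTANCE=my-bigtable-instance
BUCKET=gs://my-export-bucket

# Confirm the Bigtable table exists by listing the instance's tables
# (cbt can be installed with: gcloud components install cbt).
cbt -project "$PROJECT" -instance "$INSTANCE" ls

# Confirm the destination bucket exists.
gcloud storage buckets describe "$BUCKET"
```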
Template parameters
Required parameters
- bigtableProject: The ID of the Google Cloud project that contains the Bigtable instance that you want to read data from.
- bigtableInstanceId: The ID of the Bigtable instance that contains the table.
- bigtableTableId: The ID of the Bigtable table to export.
- destinationPath: The Cloud Storage path where data is written. For example, gs://your-bucket/your-path/.
- filenamePrefix: The prefix of the SequenceFile filename. For example, output-.
Optional parameters
- bigtableAppProfileId: The ID of the Bigtable application profile to use for the export. If you don't specify an app profile, Bigtable uses the instance's default app profile: https://cloud.google.com/bigtable/docs/app-profiles#default-app-profile.
- bigtableStartRow: The row key at which to start the export. Defaults to the first row.
- bigtableStopRow: The row key at which to stop the export. Defaults to the last row.
- bigtableMaxVersions: The maximum number of cell versions to export. Defaults to 2147483647.
- bigtableFilter: Filter string. See: http://hbase.apache.org/book.html#thrift. Defaults to empty.
- bigtableReadRpcTimeoutMs: The timeout for the Bigtable read operation, in milliseconds. Defaults to 12 hours.
- bigtableReadRpcAttemptTimeoutMs: The timeout for each Bigtable read attempt, in milliseconds. Defaults to 10 minutes.
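For instance, the optional row-range and version parameters can be appended to the required ones in the `--parameters` string to export only part of a table. This is a sketch with placeholder project, instance, table, bucket, and row-key values:

```shell
# Export only the row range [user#1000, user#2000), keeping a single
# cell version per column. All values below are hypothetical.
gcloud dataflow jobs run bigtable-export-range \
    --gcs-location gs://dataflow-templates-us-central1/latest/Cloud_Bigtable_to_GCS_SequenceFile \
    --region us-central1 \
    --parameters \
bigtableProject=my-project,\
bigtableInstanceId=my-instance,\
bigtableTableId=my-table,\
destinationPath=gs://my-bucket/exports/,\
filenamePrefix=output-,\
bigtableStartRow=user#1000,\
bigtableStopRow=user#2000,\
bigtableMaxVersions=1
```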
Run the template
Console
- Go to the Dataflow Create job from template page. Go to Create job from template
- In the Job name field, enter a unique job name.
- Optional: For Regional endpoint, select a value from the drop-down menu. The default region is us-central1. For a list of regions where you can run a Dataflow job, see Dataflow locations.
- From the Dataflow template drop-down menu, select the Cloud Bigtable to SequenceFile Files on Cloud Storage template.
- In the provided parameter fields, enter your parameter values.
- Click Run job.
gcloud
Note: To use the Google Cloud CLI to run classic templates, you must have Google Cloud CLI version 138.0.0 or later.
In your shell or terminal, run the template:
gcloud dataflow jobs run JOB_NAME \
    --gcs-location gs://dataflow-templates-REGION_NAME/VERSION/Cloud_Bigtable_to_GCS_SequenceFile \
    --region REGION_NAME \
    --parameters \
bigtableProject=BIGTABLE_PROJECT_ID,\
bigtableInstanceId=INSTANCE_ID,\
bigtableTableId=TABLE_ID,\
bigtableAppProfileId=APPLICATION_PROFILE_ID,\
destinationPath=DESTINATION_PATH,\
filenamePrefix=FILENAME_PREFIX
Replace the following:
- JOB_NAME: a unique job name of your choice
- VERSION: the version of the template that you want to use. You can use the following values:
  - latest to use the latest version of the template, which is available in the non-dated parent folder in the bucket: gs://dataflow-templates-REGION_NAME/latest/
  - the version name, like 2023-09-12-00_RC00, to use a specific version of the template, which can be found nested in the respective dated parent folder in the bucket: gs://dataflow-templates-REGION_NAME/
- REGION_NAME: the region where you want to deploy your Dataflow job, for example, us-central1
- BIGTABLE_PROJECT_ID: the ID of the Google Cloud project of the Bigtable instance that you want to read data from
- INSTANCE_ID: the ID of the Bigtable instance that contains the table
- TABLE_ID: the ID of the Bigtable table to export
- APPLICATION_PROFILE_ID: the ID of the Bigtable application profile to be used for the export
- DESTINATION_PATH: the Cloud Storage path where data is written, for example, gs://mybucket/somefolder
- FILENAME_PREFIX: the prefix of the SequenceFile filename, for example, output-
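Putting the replacements together, a fully substituted invocation looks like the following sketch. Every value here is hypothetical; replace each one with your own project, instance, table, app profile, and bucket names.

```shell
# Run the latest version of the template in us-central1.
# All identifiers below are placeholders.
gcloud dataflow jobs run bigtable-sequencefile-export \
    --gcs-location gs://dataflow-templates-us-central1/latest/Cloud_Bigtable_to_GCS_SequenceFile \
    --region us-central1 \
    --parameters \
bigtableProject=my-project,\
bigtableInstanceId=my-instance,\
bigtableTableId=my-table,\
bigtableAppProfileId=default,\
destinationPath=gs://my-bucket/exports/,\
filenamePrefix=output-
```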
API
To run the template using the REST API, send an HTTP POST request. For more information on the API and its authorization scopes, see projects.templates.launch.
POST https://dataflow.googleapis.com/v1b3/projects/PROJECT_ID/locations/LOCATION/templates:launch?gcsPath=gs://dataflow-templates-LOCATION/VERSION/Cloud_Bigtable_to_GCS_SequenceFile

{
  "jobName": "JOB_NAME",
  "parameters": {
    "bigtableProject": "BIGTABLE_PROJECT_ID",
    "bigtableInstanceId": "INSTANCE_ID",
    "bigtableTableId": "TABLE_ID",
    "bigtableAppProfileId": "APPLICATION_PROFILE_ID",
    "destinationPath": "DESTINATION_PATH",
    "filenamePrefix": "FILENAME_PREFIX"
  },
  "environment": {
    "zone": "us-central1-f"
  }
}
Replace the following:
- PROJECT_ID: the Google Cloud project ID where you want to run the Dataflow job
- JOB_NAME: a unique job name of your choice
- VERSION: the version of the template that you want to use. You can use the following values:
  - latest to use the latest version of the template, which is available in the non-dated parent folder in the bucket: gs://dataflow-templates-REGION_NAME/latest/
  - the version name, like 2023-09-12-00_RC00, to use a specific version of the template, which can be found nested in the respective dated parent folder in the bucket: gs://dataflow-templates-REGION_NAME/
- LOCATION: the region where you want to deploy your Dataflow job, for example, us-central1
- BIGTABLE_PROJECT_ID: the ID of the Google Cloud project of the Bigtable instance that you want to read data from
- INSTANCE_ID: the ID of the Bigtable instance that contains the table
- TABLE_ID: the ID of the Bigtable table to export
- APPLICATION_PROFILE_ID: the ID of the Bigtable application profile to be used for the export
- DESTINATION_PATH: the Cloud Storage path where data is written, for example, gs://mybucket/somefolder
- FILENAME_PREFIX: the prefix of the SequenceFile filename, for example, output-
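As a sketch, the request above can be sent with curl, using gcloud to mint an access token. All project, job, and bucket values below are placeholders.

```shell
# Launch the template via the REST API. Requires an authenticated
# gcloud session; every identifier here is hypothetical.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{
        "jobName": "bigtable-sequencefile-export",
        "parameters": {
          "bigtableProject": "my-project",
          "bigtableInstanceId": "my-instance",
          "bigtableTableId": "my-table",
          "destinationPath": "gs://my-bucket/exports/",
          "filenamePrefix": "output-"
        }
      }' \
  "https://dataflow.googleapis.com/v1b3/projects/my-project/locations/us-central1/templates:launch?gcsPath=gs://dataflow-templates-us-central1/latest/Cloud_Bigtable_to_GCS_SequenceFile"
```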
Template source code
Java
This template's source code is in the GoogleCloudPlatform/cloud-bigtable-client repository on GitHub.
What's next
- Learn about Dataflow templates.
- See the list of Google-provided templates.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.