Export table data to Cloud Storage

This page describes how to export or extract data from BigQuery tables to Cloud Storage.

After you've loaded your data into BigQuery, you can export the data in several formats. BigQuery can export up to 1 GB of logical data size to a single file. If you are exporting more than 1 GB of data, you must export your data to multiple files. When you export your data to multiple files, the size of the files will vary.

You can also export the results of a query by using the EXPORT DATA statement. You can use EXPORT DATA OPTIONS to specify the format of the exported data.

Finally, you can use a service such as Dataflow to read data from BigQuery instead of exporting it from BigQuery. For more information about using Dataflow to read from and write to BigQuery, see the BigQuery I/O documentation.

Export limitations

When you export data from BigQuery, note the following:

Caution: If you are exporting data to a Cloud Storage bucket, we strongly recommend that you disable the Bucket Lock and Soft Delete retention policies on your bucket. When you export to a bucket with these retention policies, BigQuery attempts to rewrite files to the bucket, which can fail if the bucket's retention policy prevents a file from being overwritten, resulting in additional incurred charges. You can re-enable these policies after the export completes.
  • You cannot export table data to a local file, to Google Sheets, or to Google Drive. The only supported export location is Cloud Storage. For information on saving query results, see Downloading and saving query results.
  • You can export up to 1 GB of logical table data size to a single file. If you are exporting more than 1 GB of data, use a wildcard to export the data into multiple files. When you export data to multiple files, the size of the files will vary. To limit the exported file size, you can partition your data and export each partition.
  • The generated file size when using the EXPORT DATA statement is not guaranteed.
  • The number of files generated by an extract job can vary.
  • You cannot export nested and repeated data in CSV format. Nested and repeated data are supported for Avro, JSON, and Parquet exports.
  • When you export data in JSON format, INT64 (integer) data types are encoded as JSON strings to preserve 64-bit precision when the data is read by other systems (see the example after this list).
  • You cannot export data from multiple tables in a single extract job.
  • You cannot choose a compression type other than GZIP when you export data using the Google Cloud console.
  • When you export a table in JSON format, the symbols <, >, and & are converted by using the unicode notation \uNNNN, where N is a hexadecimal digit. For example, profit&loss becomes profit\u0026loss. This unicode conversion is done to avoid security vulnerabilities.
  • The order of exported table data is not guaranteed unless you use the EXPORT DATA statement and specify an ORDER BY clause in the query_statement.
  • BigQuery doesn't support Cloud Storage resource paths that include multiple consecutive slashes after the initial double slash. Cloud Storage object names can contain multiple consecutive slash ("/") characters. However, BigQuery converts multiple consecutive slashes into a single slash. For example, the following resource path, though valid in Cloud Storage, doesn't work in BigQuery: gs://bucket/my//object//name.
  • Any new data loaded into BigQuery while an extract job is running won't be included in that extract job. You must create a new extract job to export the new data.
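For example, a single row exported in JSON format might look like the following, with the INT64 value quoted as a string and the ampersand escaped (the column names are hypothetical):

{"campaign": "profit\u0026loss", "impressions": "9007199254740993"}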

Before you begin

Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.

Required permissions

To perform the tasks in this document, you need the following permissions.

Permissions to export data from a BigQuery table

To export data from a BigQuery table, you need the bigquery.tables.export IAM permission.

Each of the following predefined IAM roles includes the bigquery.tables.export permission:

  • roles/bigquery.dataViewer
  • roles/bigquery.dataOwner
  • roles/bigquery.dataEditor
  • roles/bigquery.admin

Permissions to run an extract job

To run an extract job, you need the bigquery.jobs.create IAM permission.

Each of the following predefined IAM roles includes the permissions that you need in order to run an extract job:

  • roles/bigquery.user
  • roles/bigquery.jobUser
  • roles/bigquery.admin

Permissions to write the data to the Cloud Storage bucket

To write the data to an existing Cloud Storage bucket, you need the following IAM permissions:

  • storage.objects.create
  • storage.objects.delete

Each of the following predefined IAM roles includes the permissions that you need in order to write the data to an existing Cloud Storage bucket:

  • roles/storage.objectAdmin
  • roles/storage.admin

For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.
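As a sketch of how these grants might be applied with gcloud (the project ID, bucket name, and user email are placeholders, and your organization might prefer group or service account bindings instead):

# Let the user run jobs and read table data in the project.
gcloud projects add-iam-policy-binding my-project \
    --member="user:alex@example.com" \
    --role="roles/bigquery.user"
gcloud projects add-iam-policy-binding my-project \
    --member="user:alex@example.com" \
    --role="roles/bigquery.dataViewer"

# Let the user write exported objects to the destination bucket.
gcloud storage buckets add-iam-policy-binding gs://my-export-bucket \
    --member="user:alex@example.com" \
    --role="roles/storage.objectAdmin"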

Export formats and compression types

BigQuery supports the following data formats and compression typesfor exported data.

Data format | Supported compression types | Details
CSV | GZIP | You can control the CSV delimiter in your exported data by using the --field_delimiter bq command-line tool flag or the configuration.extract.fieldDelimiter extract job property. Nested and repeated data is not supported.
JSON | GZIP | Nested and repeated data are supported.
Avro | DEFLATE, SNAPPY | GZIP is not supported for Avro exports. Nested and repeated data are supported. See Avro export details.
Parquet | SNAPPY, GZIP, ZSTD | Nested and repeated data are supported. See Parquet export details.

Export data

The following sections show you how to export your table data, table metadata, and query results to Cloud Storage.

Export table data

You can export table data by:

  • Using the Google Cloud console
  • Using the bq extract command in the bq command-line tool
  • Submitting an extract job using the API or client libraries

Select one of the following:

Console

  1. Open the BigQuery page in the Google Cloud console.

    Go to the BigQuery page

  2. In the left pane, click Explorer.

    If you don't see the left pane, click Expand left pane to open the pane.

  3. In the Explorer pane, expand your project, click Datasets, and then click your dataset.

  4. Click Overview > Tables, and then select a table.

  5. In the details pane, click Export.

  6. In the Export to Google Cloud Storage dialog:

    • For GCS Location, browse for the bucket, folder, or file where you want to export the data.
    • For Export format, choose the format for your exported data: CSV, JSON (Newline Delimited), Avro, or Parquet.
    • For Compression, select a compression format or select None for no compression.
  7. Click Save to export the table.

To check on the progress of the job, in the Explorer pane, click Job history, and look for an EXTRACT type job.

To export views to Cloud Storage, use the EXPORT DATA statement.

SQL

Use the EXPORT DATA statement. The following example exports selected fields from a table named mydataset.table1:

  1. In the Google Cloud console, go to theBigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following statement:

    EXPORT DATA
      OPTIONS (
        uri = 'gs://bucket/folder/*.csv',
        format = 'CSV',
        overwrite = true,
        header = true,
        field_delimiter = ';')
    AS (
      SELECT field1, field2
      FROM mydataset.table1
      ORDER BY field1
    );

  3. ClickRun.

For more information about how to run queries, see Run an interactive query.

bq

Use the bq extract command with the --destination_format flag.

(Optional) Supply the --location flag and set the value to your location.

Other optional flags include:

  • --compression: The compression type to use for exported files.
  • --field_delimiter: The character that indicates the boundary between columns in the output file for CSV exports. Both \t and tab are allowed for tab delimiters.
  • --print_header: When specified, print header rows for formats that have headers, such as CSV.

bq extract --location=location \
--destination_format format \
--compression compression_type \
--field_delimiter delimiter \
--print_header=boolean \
project_id:dataset.table \
gs://bucket/filename.ext

Where:

  • location is the name of your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, you can set the flag's value to asia-northeast1. You can set a default value for the location using the .bigqueryrc file.
  • format is the format for the exported data: CSV, NEWLINE_DELIMITED_JSON, AVRO, or PARQUET.
  • compression_type is a supported compression type for your data format. See Export formats and compression types.
  • delimiter is the character that indicates the boundary between columns in CSV exports. \t and tab are accepted names for tab.
  • boolean is true or false. When set to true, header rows are printed to the exported data if the data format supports headers. The default value is true.
  • project_id is your project ID.
  • dataset is the name of the source dataset.
  • table is the table you're exporting. If you use a partition decorator, then you must surround the table path with single quotation marks or escape the $ character.
  • bucket is the name of the Cloud Storage bucket to which you're exporting the data. The BigQuery dataset and the Cloud Storage bucket must be in the same location.
  • filename.ext is the name and extension of the exported data file. You can export to multiple files using a wildcard.

Examples:

For example, the following command exports mydataset.mytable into a gzip compressed file named myfile.csv. myfile.csv is stored in a Cloud Storage bucket named example-bucket.

bq extract \
--compression GZIP \
'mydataset.mytable' \
gs://example-bucket/myfile.csv

The default destination format is CSV. To export into JSON or Avro, use the --destination_format flag and set it to either NEWLINE_DELIMITED_JSON or AVRO. For example:

bq extract \
--destination_format NEWLINE_DELIMITED_JSON \
'mydataset.mytable' \
gs://example-bucket/myfile.json

The following command exports mydataset.mytable into an Avro file that is compressed using Snappy. The file is named myfile.avro. myfile.avro is exported to a Cloud Storage bucket named example-bucket.

bq extract \
--destination_format AVRO \
--compression SNAPPY \
'mydataset.mytable' \
gs://example-bucket/myfile.avro

The following command exports a single partition of mydataset.my_partitioned_table into a CSV file in Cloud Storage:

bq extract \
--destination_format CSV \
'mydataset.my_partitioned_table$0' \
gs://example-bucket/single_partition.csv

API

To export data, create an extract job and populate the job configuration.

Note: If you are exporting data to the Parquet format, it might be faster to use the BigQuery export to Parquet (via BigQuery Storage API) template rather than writing a custom solution.

(Optional) Specify your location in the location property in the jobReference section of the job resource.

  1. Create an extract job that points to the BigQuery source data and the Cloud Storage destination.

  2. Specify the source table by using the sourceTable configuration object that contains the project ID, dataset ID, and table ID.

  3. The destination URI(s) property must be fully qualified, in the format gs://bucket/filename.ext. Each URI can contain one '*' wildcard character and it must come after the bucket name.

  4. Specify the data format by setting the configuration.extract.destinationFormat property. For example, to export a JSON file, set this property to the value NEWLINE_DELIMITED_JSON. (A request-body sketch appears after the API notes that follow.)

  5. To check the job status, call jobs.get(job_id) with the ID of the job returned by the initial request.

    • If status.state = DONE, the job completed successfully.
    • If the status.errorResult property is present, the request failed, and that object includes information describing what went wrong.
    • If status.errorResult is absent, the job finished successfully, although there might have been some nonfatal errors. Nonfatal errors are listed in the returned job object's status.errors property.

API notes:

  • As a best practice, generate a unique ID and pass it as jobReference.jobId when calling jobs.insert to create a job. This approach is more robust to network failure because the client can poll or retry on the known job ID.

  • Calling jobs.insert on a given job ID is idempotent; in other words, you can retry as many times as you like on the same job ID, and at most one of those operations will succeed.
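Taken together, the steps above correspond to a jobs.insert request body along these lines; the project, dataset, table, and bucket names are placeholders:

{
  "jobReference": {
    "projectId": "my-project",
    "jobId": "my-unique-extract-job-id",
    "location": "US"
  },
  "configuration": {
    "extract": {
      "sourceTable": {
        "projectId": "my-project",
        "datasetId": "mydataset",
        "tableId": "mytable"
      },
      "destinationUris": ["gs://my-bucket/myfile-*.json"],
      "destinationFormat": "NEWLINE_DELIMITED_JSON"
    }
  }
}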

C#

Before trying this sample, follow the C# setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery C# API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

using Google.Cloud.BigQuery.V2;
using System;

public class BigQueryExtractTable
{
    public void ExtractTable(
        string projectId = "your-project-id",
        string bucketName = "your-bucket-name")
    {
        BigQueryClient client = BigQueryClient.Create(projectId);
        // Define a destination URI. Use a single wildcard URI if you think
        // your exported data will be larger than the 1 GB maximum value.
        string destinationUri = $"gs://{bucketName}/shakespeare-*.csv";
        BigQueryJob job = client.CreateExtractJob(
            projectId: "bigquery-public-data",
            datasetId: "samples",
            tableId: "shakespeare",
            destinationUri: destinationUri
        );
        job = job.PollUntilCompleted().ThrowOnAnyError();  // Waits for the job to complete.
        Console.Write($"Exported table to {destinationUri}.");
    }
}

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import (
	"context"
	"fmt"

	"cloud.google.com/go/bigquery"
)

// exportTableAsCSV demonstrates using an export job to
// write the contents of a table into Cloud Storage as CSV.
func exportTableAsCSV(projectID, gcsURI string) error {
	// projectID := "my-project-id"
	// gcsURI := "gs://mybucket/shakespeare.csv"
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("bigquery.NewClient: %v", err)
	}
	defer client.Close()

	srcProject := "bigquery-public-data"
	srcDataset := "samples"
	srcTable := "shakespeare"
	gcsRef := bigquery.NewGCSReference(gcsURI)
	gcsRef.FieldDelimiter = ","

	extractor := client.DatasetInProject(srcProject, srcDataset).Table(srcTable).ExtractorTo(gcsRef)
	extractor.DisableHeader = true
	// You can choose to run the job in a specific location for more complex data locality scenarios.
	// Ex: In this example, source dataset and GCS bucket are in the US.
	extractor.Location = "US"

	job, err := extractor.Run(ctx)
	if err != nil {
		return err
	}
	status, err := job.Wait(ctx)
	if err != nil {
		return err
	}
	if err := status.Err(); err != nil {
		return err
	}
	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.cloud.RetryOption;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.Table;
import com.google.cloud.bigquery.TableId;
import org.threeten.bp.Duration;

public class ExtractTableToCsv {

  public static void runExtractTableToCsv() {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "bigquery-public-data";
    String datasetName = "samples";
    String tableName = "shakespeare";
    String bucketName = "my-bucket";
    String destinationUri = "gs://" + bucketName + "/path/to/file";
    // For more information on export formats available see:
    // https://cloud.google.com/bigquery/docs/exporting-data#export_formats_and_compression_types
    // For more information on Job see:
    // https://googleapis.dev/java/google-cloud-clients/latest/index.html?com/google/cloud/bigquery/package-summary.html
    String dataFormat = "CSV";
    extractTableToCsv(projectId, datasetName, tableName, destinationUri, dataFormat);
  }

  // Exports datasetName:tableName to destinationUri as raw CSV
  public static void extractTableToCsv(
      String projectId,
      String datasetName,
      String tableName,
      String destinationUri,
      String dataFormat) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableId tableId = TableId.of(projectId, datasetName, tableName);
      Table table = bigquery.getTable(tableId);

      Job job = table.extract(dataFormat, destinationUri);

      // Blocks until this job completes its execution, either failing or succeeding.
      Job completedJob =
          job.waitFor(
              RetryOption.initialRetryDelay(Duration.ofSeconds(1)),
              RetryOption.totalTimeout(Duration.ofMinutes(3)));
      if (completedJob == null) {
        System.out.println("Job not executed since it no longer exists.");
        return;
      } else if (completedJob.getStatus().getError() != null) {
        System.out.println(
            "BigQuery was unable to extract due to an error: \n" + job.getStatus().getError());
        return;
      }
      System.out.println(
          "Table export successful. Check in GCS bucket for the " + dataFormat + " file.");
    } catch (BigQueryException | InterruptedException e) {
      System.out.println("Table extraction job was interrupted. \n" + e.toString());
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Import the Google Cloud client libraries
const {BigQuery} = require('@google-cloud/bigquery');
const {Storage} = require('@google-cloud/storage');

const bigquery = new BigQuery();
const storage = new Storage();

async function extractTableToGCS() {
  // Exports my_dataset:my_table to gcs://my-bucket/my-file as raw CSV.

  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  // const datasetId = "my_dataset";
  // const tableId = "my_table";
  // const bucketName = "my-bucket";
  // const filename = "file.csv";

  // Location must match that of the source table.
  const options = {
    location: 'US',
  };

  // Export data from the table into a Google Cloud Storage file
  const [job] = await bigquery
    .dataset(datasetId)
    .table(tableId)
    .extract(storage.bucket(bucketName).file(filename), options);

  console.log(`Job ${job.id} created.`);

  // Check the job's status for errors
  const errors = job.status.errors;
  if (errors && errors.length > 0) {
    throw errors;
  }
}

PHP

Before trying this sample, follow the PHP setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery PHP API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

use Google\Cloud\BigQuery\BigQueryClient;

/**
 * Extracts the given table as json to given GCS bucket.
 *
 * @param string $projectId The project Id of your Google Cloud Project.
 * @param string $datasetId The BigQuery dataset ID.
 * @param string $tableId The BigQuery table ID.
 * @param string $bucketName Bucket name in Google Cloud Storage
 */
function extract_table(
    string $projectId,
    string $datasetId,
    string $tableId,
    string $bucketName
): void {
    $bigQuery = new BigQueryClient([
      'projectId' => $projectId,
    ]);
    $dataset = $bigQuery->dataset($datasetId);
    $table = $dataset->table($tableId);
    $destinationUri = "gs://{$bucketName}/{$tableId}.json";
    // Define the format to use. If the format is not specified, 'CSV' will be used.
    $format = 'NEWLINE_DELIMITED_JSON';
    // Create the extract job
    $extractConfig = $table->extract($destinationUri)->destinationFormat($format);
    // Run the job
    $job = $table->runJob($extractConfig);  // Waits for the job to complete
    printf('Exported %s to %s' . PHP_EOL, $table->id(), $destinationUri);
}

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

# from google.cloud import bigquery
# client = bigquery.Client()
# bucket_name = 'my-bucket'

project = "bigquery-public-data"
dataset_id = "samples"
table_id = "shakespeare"

destination_uri = "gs://{}/{}".format(bucket_name, "shakespeare.csv")
dataset_ref = bigquery.DatasetReference(project, dataset_id)
table_ref = dataset_ref.table(table_id)

extract_job = client.extract_table(
    table_ref,
    destination_uri,
    # Location must match that of the source table.
    location="US",
)  # API request
extract_job.result()  # Waits for job to complete.

print(
    "Exported {}:{}.{} to {}".format(project, dataset_id, table_id, destination_uri)
)

Ruby

Before trying this sample, follow the Ruby setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Ruby API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

require "google/cloud/bigquery"

def extract_table bucket_name = "my-bucket",
                  dataset_id  = "my_dataset_id",
                  table_id    = "my_table_id"

  bigquery = Google::Cloud::Bigquery.new
  dataset  = bigquery.dataset dataset_id
  table    = dataset.table table_id

  # Define a destination URI. Use a single wildcard URI if you think
  # your exported data will be larger than the 1 GB maximum value.
  destination_uri = "gs://#{bucket_name}/output-*.csv"

  extract_job = table.extract_job destination_uri do |config|
    # Location must match that of the source table.
    config.location = "US"
  end
  extract_job.wait_until_done! # Waits for the job to complete

  puts "Exported #{table.id} to #{destination_uri}"
end

Export table metadata

To export table metadata from Iceberg tables, use the following SQL statement:

EXPORT TABLE METADATA FROM `[[PROJECT_NAME.]DATASET_NAME.]TABLE_NAME`;

Replace the following:

  • PROJECT_NAME: the name of the project for the table. The value defaults to the project that runs this query.
  • DATASET_NAME: the name of the dataset for the table.
  • TABLE_NAME: the name of the table.

The exported metadata is located in the STORAGE_URI/metadata folder, where STORAGE_URI is the table's storage location set in the options.
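For example, assuming an Iceberg table named my_iceberg_table in a dataset named mydataset, the statement looks like the following:

EXPORT TABLE METADATA FROM `mydataset.my_iceberg_table`;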

Export query results

You can export your query results to Cloud Storage in the Google Cloud console with the following steps:

  1. Open the BigQuery page in the Google Cloud console.

    Go to the BigQuery page

  2. Click SQL query.

  3. Enter a valid GoogleSQL query in the Query editor text area.

  4. Click Run.

  5. When the results are returned, click Save results > Cloud Storage.

  6. In the Export to Google Cloud Storage dialog:

    • For GCS Location, browse for the bucket, folder, or file where you want to export the data.
    • For Export format, choose the format for your exported data: CSV, JSON (Newline Delimited), Avro, or Parquet.
    • For Compression, select a compression format or select None for no compression.
  7. Click Save to export the query results.

To check on the progress of the job, expand the Job history pane and look for the job with the EXTRACT type.

Avro export details

BigQuery expresses Avro formatted data in the following ways:

  • The resulting export files are Avro container files.
  • Each BigQuery row is represented as an Avro record. Nested data is represented by nested record objects.
  • REQUIRED fields are represented as the corresponding Avro types. For example, a BigQuery INTEGER type maps to an Avro LONG type.
  • NULLABLE fields are represented as an Avro union of the corresponding type and "null".
  • REPEATED fields are represented as Avro arrays.
  • TIMESTAMP data types are represented as the timestamp-micros logical type (which annotates an Avro LONG type) by default in both extract jobs and EXPORT DATA. (Caution: you can add use_avro_logical_types=False to EXPORT DATA OPTIONS to disable the logical type so that the string type is used for the timestamp column instead, but extract jobs always use the Avro logical type.)
  • DATE data types are represented as the date logical type (which annotates an Avro INT type) by default in EXPORT DATA, but are represented as the string type by default in extract jobs. (Note: you can add use_avro_logical_types=False to EXPORT DATA OPTIONS to disable the logical type, or use the --use_avro_logical_types=True flag to enable the logical type in extract jobs.)
  • TIME data types are represented as the timestamp-micro logical type (which annotates an Avro LONG type) by default in EXPORT DATA, but are represented as the string type by default in extract jobs. (Note: you can add use_avro_logical_types=False to EXPORT DATA OPTIONS to disable the logical type, or use the --use_avro_logical_types=True flag to enable the logical type in extract jobs.)
  • DATETIME data types are represented as Avro STRING types (a string type with the custom named logical type datetime) by default in EXPORT DATA, but are represented as the string type by default in extract jobs. (Note: you can add use_avro_logical_types=False to EXPORT DATA OPTIONS to disable the logical type, or use the --use_avro_logical_types=True flag to enable the logical type in extract jobs.)
  • RANGE types aren't supported in Avro export.
Note: The encoding of string type follows the Internet Engineering Task Force RFC 3339 spec.

Parameterized NUMERIC(P[, S]) and BIGNUMERIC(P[, S]) data types transfer their precision and scale type parameters to the Avro decimal logical type.
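For illustration, a NUMERIC(10, 2) column named price could appear in the exported Avro schema roughly like this (the exact schema layout that BigQuery emits may differ):

{"name": "price", "type": {"type": "bytes", "logicalType": "decimal", "precision": 10, "scale": 2}}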

Caution: The Avro format can't be used in combination with GZIP compression. To compress Avro data, use the bq command-line tool or the API and specify one of the supported compression types for Avro data: DEFLATE or SNAPPY.

Parquet export details

BigQuery converts GoogleSQL data types to the following Parquet data types:

BigQuery data type | Parquet primitive type | Parquet logical type
Integer | INT64 | NONE
Numeric | FIXED_LEN_BYTE_ARRAY | DECIMAL (precision = 38, scale = 9)
Numeric(P[, S]) | FIXED_LEN_BYTE_ARRAY | DECIMAL (precision = P, scale = S)
BigNumeric | FIXED_LEN_BYTE_ARRAY | DECIMAL (precision = 76, scale = 38)
BigNumeric(P[, S]) | FIXED_LEN_BYTE_ARRAY | DECIMAL (precision = P, scale = S)
Floating point | FLOAT | NONE
Boolean | BOOLEAN | NONE
String | BYTE_ARRAY | STRING (UTF8)
Bytes | BYTE_ARRAY | NONE
Date | INT32 | DATE
Datetime | INT64 | TIMESTAMP (isAdjustedToUTC = false, unit = MICROS)
Time | INT64 | TIME (isAdjustedToUTC = true, unit = MICROS)
Timestamp | INT64 | TIMESTAMP (isAdjustedToUTC = false, unit = MICROS)
Geography | BYTE_ARRAY | GEOGRAPHY (edges = spherical)

The Parquet schema represents nested data as a group and repeated records as repeated groups. For more information about using nested and repeated data in BigQuery, see Specifying nested and repeated columns.

Caution: If you export a DATETIME type to Parquet, you cannot load the Parquet file directly back into the same table schema, because the converted value won't match the schema.

You can use the following workarounds for DATETIME types:

The GEOGRAPHY logical type is represented with GeoParquet metadata added to the exported file(s).

Exporting data into one or more files

The destinationUris property indicates the one or more locations and filenames where BigQuery should export your files.

BigQuery supports a single wildcard operator (*) in each URI. The wildcard can appear anywhere in the filename component. Using the wildcard operator instructs BigQuery to create multiple sharded files based on the supplied pattern. The wildcard operator is replaced with a number (starting at 0), left-padded to 12 digits. For example, a URI with a wildcard at the end of the filename would create files with 000000000000 appended to the first file, and 000000000001 appended to the second file, continuing in that pattern.

The following table describes several possible options for the destinationUris property:

destinationUris options
Single URI

Use a single URI if you are exporting table data that is 1 GB or less. This option is the most common use case, as exported data is generally less than the 1 GB maximum value. This option is not supported for the EXPORT DATA statement; you must use a single wildcard URI.

Property definition:

['gs://my-bucket/file-name.json']

Creates:

gs://my-bucket/file-name.json
Single wildcard URI

A single wildcard can be used only in the filename component of the URI.

Use a single wildcard URI if you think your exported data will be larger than the 1 GB maximum value. BigQuery shards your data into multiple files based on the provided pattern. The size of the exported files will vary.

Property definition:

['gs://my-bucket/file-name-*.json']

Creates:

gs://my-bucket/file-name-000000000000.json
gs://my-bucket/file-name-000000000001.json
gs://my-bucket/file-name-000000000002.json
...

['gs://my-bucket/*']

Creates:

gs://my-bucket/000000000000
gs://my-bucket/000000000001
gs://my-bucket/000000000002
...

Limit the exported file size

When you export more than 1 GB of data in a single export, you must use a wildcard to export the data into multiple files, and the size of the files varies. If you need to limit the maximum size of each exported file, one option is to randomly partition your data and then export each partition to a file:

  1. Determine the number of partitions you need, which is equal to the total size of your data divided by the chosen exported file size. For example, if you have 8,000 MB of data and you want each exported file to be approximately 20 MB, then you need 400 partitions.
  2. Create a new table that is partitioned and clustered by a new randomly generated column called export_id. The following example shows how to create a new processed_table from an existing table called source_table, which requires n partitions to achieve the chosen file size:

    CREATE TABLE my_dataset.processed_table
    PARTITION BY RANGE_BUCKET(export_id, GENERATE_ARRAY(0, n, 1))
    CLUSTER BY export_id
    AS (
      SELECT *, CAST(FLOOR(n * RAND()) AS INT64) AS export_id
      FROM my_dataset.source_table
    );
  3. For each integer i between 0 and n-1, run an EXPORT DATA statement on the following query (a combined sketch follows these steps):

    SELECT * EXCEPT (export_id)
    FROM my_dataset.processed_table
    WHERE export_id = i;
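Putting these steps together, each per-partition export can be wrapped in an EXPORT DATA statement similar to the following sketch; the bucket path is a placeholder, and 42 stands in for one value of i:

EXPORT DATA
  OPTIONS (
    uri = 'gs://my-bucket/exports/part-42-*.csv',
    format = 'CSV',
    overwrite = true)
AS (
  SELECT * EXCEPT (export_id)
  FROM my_dataset.processed_table
  WHERE export_id = 42
);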

Extract compressed table

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import (
	"context"
	"fmt"

	"cloud.google.com/go/bigquery"
)

// exportTableAsCompressedCSV demonstrates using an export job to
// write the contents of a table into Cloud Storage as compressed CSV.
func exportTableAsCompressedCSV(projectID, gcsURI string) error {
	// projectID := "my-project-id"
	// gcsURI := "gs://mybucket/shakespeare.csv"
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("bigquery.NewClient: %w", err)
	}
	defer client.Close()

	srcProject := "bigquery-public-data"
	srcDataset := "samples"
	srcTable := "shakespeare"
	gcsRef := bigquery.NewGCSReference(gcsURI)
	gcsRef.Compression = bigquery.Gzip

	extractor := client.DatasetInProject(srcProject, srcDataset).Table(srcTable).ExtractorTo(gcsRef)
	extractor.DisableHeader = true
	// You can choose to run the job in a specific location for more complex data locality scenarios.
	// Ex: In this example, source dataset and GCS bucket are in the US.
	extractor.Location = "US"

	job, err := extractor.Run(ctx)
	if err != nil {
		return err
	}
	status, err := job.Wait(ctx)
	if err != nil {
		return err
	}
	if err := status.Err(); err != nil {
		return err
	}
	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.ExtractJobConfiguration;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.TableId;

// Sample to extract a compressed table
public class ExtractTableCompressed {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String projectName = "MY_PROJECT_NAME";
    String datasetName = "MY_DATASET_NAME";
    String tableName = "MY_TABLE_NAME";
    String bucketName = "MY-BUCKET-NAME";
    String destinationUri = "gs://" + bucketName + "/path/to/file";
    // For more information on export formats available see:
    // https://cloud.google.com/bigquery/docs/exporting-data#export_formats_and_compression_types
    String compressed = "gzip";
    // For more information on Job see:
    // https://googleapis.dev/java/google-cloud-clients/latest/index.html?com/google/cloud/bigquery/package-summary.html
    String dataFormat = "CSV";

    extractTableCompressed(
        projectName, datasetName, tableName, destinationUri, dataFormat, compressed);
  }

  public static void extractTableCompressed(
      String projectName,
      String datasetName,
      String tableName,
      String destinationUri,
      String dataFormat,
      String compressed) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableId tableId = TableId.of(projectName, datasetName, tableName);

      ExtractJobConfiguration extractConfig =
          ExtractJobConfiguration.newBuilder(tableId, destinationUri)
              .setCompression(compressed)
              .setFormat(dataFormat)
              .build();

      Job job = bigquery.create(JobInfo.of(extractConfig));

      // Blocks until this job completes its execution, either failing or succeeding.
      Job completedJob = job.waitFor();
      if (completedJob == null) {
        System.out.println("Job not executed since it no longer exists.");
        return;
      } else if (completedJob.getStatus().getError() != null) {
        System.out.println(
            "BigQuery was unable to extract due to an error: \n" + job.getStatus().getError());
        return;
      }
      System.out.println("Table extract compressed successful");
    } catch (BigQueryException | InterruptedException e) {
      System.out.println("Table extraction job was interrupted. \n" + e.toString());
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Import the Google Cloud client libraries
const {BigQuery} = require('@google-cloud/bigquery');
const {Storage} = require('@google-cloud/storage');

const bigquery = new BigQuery();
const storage = new Storage();

async function extractTableCompressed() {
  // Exports my_dataset:my_table to gcs://my-bucket/my-file as a compressed file.

  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  // const datasetId = "my_dataset";
  // const tableId = "my_table";
  // const bucketName = "my-bucket";
  // const filename = "file.csv";

  // Location must match that of the source table.
  const options = {
    location: 'US',
    gzip: true,
  };

  // Export data from the table into a Google Cloud Storage file
  const [job] = await bigquery
    .dataset(datasetId)
    .table(tableId)
    .extract(storage.bucket(bucketName).file(filename), options);

  console.log(`Job ${job.id} created.`);

  // Check the job's status for errors
  const errors = job.status.errors;
  if (errors && errors.length > 0) {
    throw errors;
  }
}

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

# from google.cloud import bigquery
# client = bigquery.Client()
# bucket_name = 'my-bucket'

destination_uri = "gs://{}/{}".format(bucket_name, "shakespeare.csv.gz")
dataset_ref = bigquery.DatasetReference(project, dataset_id)
table_ref = dataset_ref.table("shakespeare")
job_config = bigquery.job.ExtractJobConfig()
job_config.compression = bigquery.Compression.GZIP

extract_job = client.extract_table(
    table_ref,
    destination_uri,
    # Location must match that of the source table.
    location="US",
    job_config=job_config,
)  # API request
extract_job.result()  # Waits for job to complete.

Example use case

This example shows how you can export data to Cloud Storage.

Suppose you are streaming data to BigQuery from endpoint logs continuously. A daily snapshot is to be exported to Cloud Storage for backup and archival purposes. The best choice is an extract job, subject to certain quotas and limitations.

Submit an extract job with the API or client libraries, passing in a unique ID as jobReference.jobId. Extract jobs are asynchronous. Check the job status using the unique job ID used to create the job. The job completed successfully if status.state is DONE. If status.errorResult is present, the job failed and needs to be retried.
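A minimal sketch of this flow with the Python client library follows; the table, bucket, and job ID prefix are placeholders:

import uuid

from google.cloud import bigquery

client = bigquery.Client()

# Generate the job ID client-side so the request can be retried safely
# and the job can be looked up later by its known ID.
job_id = "daily-export-{}".format(uuid.uuid4())

job_config = bigquery.ExtractJobConfig()
job_config.destination_format = "AVRO"

client.extract_table(
    "my-project.mydataset.endpoint_logs",    # placeholder source table
    "gs://my-bucket/snapshots/logs-*.avro",  # placeholder destination URI
    job_id=job_id,
    job_config=job_config,
    location="US",
)

# Later (or after a network failure), look the job up and check its state.
job = client.get_job(job_id, location="US")
if job.state == "DONE" and job.error_result is None:
    print("Export succeeded")
elif job.error_result is not None:
    print("Export failed: {}".format(job.error_result))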

Batch data processing

Suppose a nightly batch job is used to load data by a fixed deadline. After this load job completes, a table with statistics is materialized from a query as described in the preceding section. Data from this table is retrieved and compiled into a PDF report and sent to a regulator.

Since the amount of data that needs to be read is small, use the tabledata.list API to retrieve all rows of the table in JSON dictionary format. If there is more than one page of data, the results have the pageToken property set. To retrieve the next page of results, make another tabledata.list call and include the token value as the pageToken parameter. If the API call fails with a 5xx error, retry with exponential backoff. Most 4xx errors cannot be retried. For better decoupling of BigQuery export and report generation, results should be persisted to disk.
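The following Python sketch mirrors that paging loop using the client library, which forwards pageToken between the underlying tabledata.list calls and retries transient errors for you; the table name and output file are placeholders:

import json

from google.cloud import bigquery

client = bigquery.Client()

rows = client.list_rows(
    "my-project.mydataset.report_stats",  # placeholder statistics table
    page_size=500,  # rows per underlying tabledata.list call
)

report_rows = []
for page in rows.pages:  # each page corresponds to one tabledata.list response
    for row in page:
        report_rows.append(dict(row.items()))  # JSON-dictionary-style rows

# Persist the results to disk to decouple the export from report generation.
with open("report_rows.json", "w") as f:
    json.dump(report_rows, f, default=str)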

Quota policy

For information on extract job quotas, see Extract jobs on the Quotas and limits page.

Usage for extract jobs is available in INFORMATION_SCHEMA. The job entry in the JOBS_BY_* system tables for an extract job contains a total_bytes_processed value that can be used to monitor the aggregate usage and ensure that it stays under 50 TiB per day. To learn how to query the INFORMATION_SCHEMA.JOBS view to get the total_bytes_processed value, see the INFORMATION_SCHEMA.JOBS schema.

View current quota usage

You can view your current usage of query, load, extract, or copy jobs by running an INFORMATION_SCHEMA query to view metadata about the jobs run over a specified time period. You can compare your current usage against the quota limit to determine your quota usage for a particular type of job. The following example query uses the INFORMATION_SCHEMA.JOBS view to list the number of query, load, extract, and copy jobs by project:

SELECT
  sum(case when job_type="QUERY" then 1 else 0 end) as QRY_CNT,
  sum(case when job_type="LOAD" then 1 else 0 end) as LOAD_CNT,
  sum(case when job_type="EXTRACT" then 1 else 0 end) as EXT_CNT,
  sum(case when job_type="COPY" then 1 else 0 end) as CPY_CNT
FROM `region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE date(creation_time) = CURRENT_DATE()

You can set up a Cloud Monitoring alerting policy that monitors the number of bytes exported.

  1. In the Google Cloud console, go to the Alerting page:

    Go to Alerting

    If you use the search bar to find this page, then select the result whose subheading is Monitoring.

  2. On the Alerting page, click Create Policy.

  3. Under Policy configuration mode, select Code editor (MQL or PromQL).

  4. In the PromQL query editor, enter the following query:

    (
      sum by (project_id, quota_metric, location) (
        increase({"serviceruntime.googleapis.com/quota/rate/net_usage", monitored_resource="consumer_quota", service="bigquery.googleapis.com"}[1m]))
      /
      max by (project_id, quota_metric, location) (
        {"serviceruntime.googleapis.com/quota/limit", monitored_resource="consumer_quota", service="bigquery.googleapis.com", limit_name="ExtractBytesPerDay"})
    ) > 0.01

    If Auto-run isn't enabled, then click Run Query.

  5. Configure the rest of your alert and click Create policy.

For detailed steps about creating PromQL-based alerting policies, see Create PromQL-based alerting policies (Console).

Troubleshooting

Diagnose and troubleshoot issues with extract jobs.

Diagnose issues using Logs Explorer

To diagnose issues with extract jobs, you can use the Logs Explorer to review the logs for a specific extract job and identify possible errors. The following Logs Explorer filter returns information about your extract jobs:

resource.type="bigquery_resource"
protoPayload.methodName="jobservice.insert"
(protoPayload.serviceData.jobInsertRequest.resource.jobConfiguration.query.query=~"EXPORT" OR
protoPayload.serviceData.jobCompletedEvent.eventName="extract_job_completed" OR
protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.query=~"EXPORT")

Exceeded extract bytes per day quota error

BigQuery returns this error when the extraction exceeds the default 50 TiB daily limit in a project. For more information about extract job limits, see Extract jobs.

Error message

Your usage exceeded quota for ExtractBytesPerDay

Diagnosis

If you are exporting a table that is larger than 50 TiB, the export fails because it exceeds the extraction limit. If you want to export table data for specific table partitions, you can use a partition decorator to identify the partitions to export.

If you would like to gather usage of exported data over recent days, you can try the following:

  • View the quotas for your project with filter criteria such as Name: Extract bytes per day or Metric: bigquery.googleapis.com/quota/extract/bytes, along with the Show usage chart, to see your usage trend over a few days.

  • Alternatively, you can query INFORMATION_SCHEMA.JOBS_BY_PROJECT to see your total extract bytes over a few days. For example, the following query returns the daily total bytes processed by EXTRACT jobs in the past seven days.

    SELECT
      TIMESTAMP_TRUNC(creation_time, DAY) AS day,
      SUM(total_bytes_processed) / POW(1024, 3) AS total_gibibytes_processed
    FROM `region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
    WHERE creation_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY) AND CURRENT_TIMESTAMP()
      AND job_type = "EXTRACT"
    GROUP BY 1
    ORDER BY 2 DESC
  • You can then further refine the results by identifying the specific jobs that are consuming more bytes than expected. The following example returns the top 100 EXTRACT jobs that processed more than 100 GB over the past seven days.

    SELECT
      creation_time,
      job_id,
      total_bytes_processed / POW(1024, 3) AS total_gigabytes_processed
    FROM `region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
    WHERE creation_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY) AND CURRENT_TIMESTAMP()
      AND job_type = "EXTRACT"
      AND total_bytes_processed > (POW(1024, 3) * 100)
    ORDER BY total_bytes_processed DESC
    LIMIT 100

You can alternatively use the jobs explorer with filters like Bytes processed more than to filter for high processing jobs for a specified period of time.

Resolution

One method of resolving this quota error is to create a slot reservation and assign your project to the reservation with the PIPELINE job type. This method can bypass the limit check because it uses your dedicated reservations rather than a free shared slot pool. If needed, the reservation can be deleted if you want to use a shared slot pool later on.
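A hedged sketch of this approach using reservation DDL follows; the administration project, region, reservation name, slot capacity, and edition are placeholders that depend on your setup:

-- Create a reservation in the administration project.
CREATE RESERVATION `admin-project.region-us.export_reservation`
  OPTIONS (
    edition = 'ENTERPRISE',
    slot_capacity = 100);

-- Assign the project that runs the extract jobs to the reservation for PIPELINE jobs.
CREATE ASSIGNMENT `admin-project.region-us.export_reservation.export_assignment`
  OPTIONS (
    assignee = 'projects/my-project',
    job_type = 'PIPELINE');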

For alternative approaches that allow exporting more than 50 TiB, see the notes section in Extract jobs.

Pricing

For information on data export pricing, see the BigQuery pricing page.

Once the data is exported, you are charged for storing the data in Cloud Storage. For more information, see Cloud Storage pricing.

Table security

To control access to tables in BigQuery, see Control access to resources with IAM.

What's next
