Load Blob Storage data into BigQuery

You can load data from Blob Storage to BigQuery using the BigQuery Data Transfer Service for Blob Storage connector. With the BigQuery Data Transfer Service, you can schedule recurring transfer jobs that add your latest data from Blob Storage to BigQuery.

Before you begin

Before you create a Blob Storage data transfer, do the following:
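The transfer configuration that you create later in this guide loads data into an existing BigQuery dataset and destination table. If you haven't created them yet, a minimal sketch using the bq CLI (the dataset, table, and schema here are placeholders):

bq mk --dataset mydataset
bq mk --table mydataset.mytable name:STRING,post_abbr:STRING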

Required permissions

Ensure that you have granted the following permissions.

Required BigQuery roles

To get the permissions that you need to create a BigQuery Data Transfer Service data transfer, ask your administrator to grant you the BigQuery Admin (roles/bigquery.admin) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to create a BigQuery Data Transfer Service data transfer. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to create a BigQuery Data Transfer Service data transfer:

  • BigQuery Data Transfer Service permissions:
    • bigquery.transfers.update
    • bigquery.transfers.get
  • BigQuery permissions:
    • bigquery.datasets.get
    • bigquery.datasets.getIamPolicy
    • bigquery.datasets.update
    • bigquery.datasets.setIamPolicy
    • bigquery.jobs.create

You might also be able to get these permissions with custom roles or other predefined roles.

For more information, see Grant bigquery.admin access.
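If you manage access with the gcloud CLI, granting the BigQuery Admin role looks roughly like the following (the project ID and user email are placeholders):

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="user:USER_EMAIL" \
    --role="roles/bigquery.admin"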

Required Blob Storage roles

For information about the required permissions in Blob Storage to enable the data transfer, see Shared access signature (SAS).
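The SAS token that the transfer uses is the query-string portion of a shared access signature URL. For illustration only, a token granting read and list access to a container has a shape similar to the following (all field values here are hypothetical, not a working token):

sv=2022-11-02&sr=c&sp=rl&se=2026-01-01T00:00:00Z&sig=REDACTED

The sp field encodes the granted permissions; see the linked Azure documentation for the permissions that the transfer requires.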

Limitations

Blob Storage data transfers are subject to the following limitations:

Set up a Blob Storage data transfer

Select one of the following options:

Console

  1. Go to the Data transfers page in the Google Cloud console.

    Go to Data transfers

  2. Click Create transfer.

  3. On the Create transfer page, do the following:

    • In the Source type section, for Source, select Azure Blob Storage & ADLS.

    • In the Transfer config name section, for Display name, enter a name for the data transfer.

    • In the Schedule options section:

      • Select a Repeat frequency. If you select Hours, Days, Weeks, or Months, you must also specify a frequency. You can also select Custom to specify a custom repeat frequency. If you select On-demand, then this data transfer runs when you manually trigger the transfer.
      • If applicable, select either Start now or Start at set time and provide a start date and run time.
    • In the Destination settings section:

      • For Dataset, select the dataset that you created to store your data.
      • Select Native table if you want to transfer to a BigQuery table.
      • Select Apache Iceberg if you want to transfer to a BigLake Iceberg table in BigQuery.
    • In the Data source details section, do the following:

      • For Destination table, enter the name of the table you created to store the data in BigQuery. Destination table names support parameters.
      • For Azure storage account name, enter the Blob Storage account name.
      • For Container name, enter the Blob Storage container name.
      • For Data path, enter the path to filter files to be transferred. See examples.
      • For SAS token, enter the Azure SAS token.
      • For File format, choose your source data format.
      • For Write disposition, select WRITE_APPEND to incrementally append new data to the destination table, or WRITE_TRUNCATE to overwrite data in the destination table during each transfer run. WRITE_APPEND is the default value for Write disposition.

      For more information about how BigQuery Data Transfer Service ingests data using either WRITE_APPEND or WRITE_TRUNCATE, see Data ingestion for Azure Blob transfers. For more information about the writeDisposition field, see JobConfigurationLoad.

    • In the Transfer options section, do the following:

      • For Number of errors allowed, enter an integer value for the maximum number of bad records that can be ignored. The default value is 0.
      • (Optional) For Decimal target types, enter a comma-separated list of possible SQL data types that decimal values in the source data are converted to. Which SQL data type is selected for conversion depends on the following conditions:
        • In the order of NUMERIC, BIGNUMERIC, and STRING, a type is picked if it is in your specified list and if it supports the precision and the scale.
        • If none of your listed data types support the precision and the scale, the data type supporting the widest range in your specified list is selected. If a value exceeds the supported range when reading the source data, an error is thrown.
        • The data type STRING supports all precision and scale values.
        • If this field is left empty, the data type defaults to NUMERIC,STRING for ORC, and NUMERIC for other file formats.
        • This field cannot contain duplicate data types.
        • The order of the data types that you list is ignored.
    • If you chose CSV or JSON as your file format, in the JSON, CSV section, check Ignore unknown values to accept rows that contain values that don't match the schema.

    • If you chose CSV as your file format, in the CSV section, enter any additional CSV options for loading data.

    • In the Notification options section, you can choose to enable email notifications and Pub/Sub notifications.

      • When you enable email notifications, the transfer administrator receives an email notification when a transfer run fails.
      • When you enable Pub/Sub notifications, choose a topic name to publish to or click Create a topic to create one.
    • If you use CMEKs, in the Advanced options section, select Customer-managed key. A list of your available CMEKs appears for you to choose from. For information about how CMEKs work with the BigQuery Data Transfer Service, see Specify encryption key with transfers.

  4. Click Save.

bq

Use the bq mk --transfer_config command to create a Blob Storage transfer:

bq mk \
--transfer_config \
--project_id=PROJECT_ID \
--data_source=DATA_SOURCE \
--display_name=DISPLAY_NAME \
--target_dataset=DATASET \
--destination_kms_key=DESTINATION_KEY \
--params=PARAMETERS

Replace the following:

  • PROJECT_ID: (Optional) the project ID containing your target dataset. If not specified, your default project is used.
  • DATA_SOURCE: azure_blob_storage.
  • DISPLAY_NAME: the display name for the data transfer configuration. The transfer name can be any value that lets you identify the transfer if you need to modify it later.
  • DATASET: the target dataset for the data transfer configuration.
  • DESTINATION_KEY: (Optional) the Cloud KMS key resource ID, for example, projects/project_name/locations/us/keyRings/key_ring_name/cryptoKeys/key_name.
  • PARAMETERS: the parameters for the data transfer configuration, listed in JSON format. For example, --params={"param1":"value1", "param2":"value2"}. The following are the parameters for a Blob Storage data transfer:
    • destination_table_name_template: Required. The name of your destination table. Destination table names support runtime parameters; see the example after the sample command below.
    • storage_account: Required. The Blob Storage account name.
    • container: Required. The Blob Storage container name.
    • data_path: Optional. The path to filter files to be transferred. See examples.
    • sas_token: Required. The Azure SAS token.
    • file_format: Optional. The type of files you want to transfer: CSV, JSON, AVRO, PARQUET, or ORC. The default value is CSV.
    • write_disposition: Optional. Select WRITE_APPEND to append data to the destination table, or WRITE_TRUNCATE to overwrite data in the destination table. The default value is WRITE_APPEND.
    • max_bad_records: Optional. The number of allowed bad records. The default value is 0.
    • decimal_target_types: Optional. A comma-separated list of possible SQL data types that decimal values in the source data are converted to. If this field is not provided, the data type defaults to NUMERIC,STRING for ORC, and NUMERIC for the other file formats.
    • ignore_unknown_values: Optional, and ignored if file_format is not JSON or CSV. Set to true to accept rows that contain values that don't match the schema.
    • field_delimiter: Optional, and applies only when file_format is CSV. The character that separates fields. The default value is ,.
    • skip_leading_rows: Optional, and applies only when file_format is CSV. Indicates the number of header rows that you don't want to import. The default value is 0.
    • allow_quoted_newlines: Optional, and applies only when file_format is CSV. Indicates whether to allow newlines within quoted fields.
    • allow_jagged_rows: Optional, and applies only when file_format is CSV. Indicates whether to accept rows that are missing trailing optional columns. The missing values are filled in with NULL.

For example, the following creates a Blob Storage data transfer called mytransfer:

bq mk \
--transfer_config \
--data_source=azure_blob_storage \
--display_name=mytransfer \
--target_dataset=mydataset \
--destination_kms_key=projects/myproject/locations/us/keyRings/mykeyring/cryptoKeys/key1 \
--params='{"destination_table_name_template":"mytable","storage_account":"myaccount","container":"mycontainer","data_path":"myfolder/*.csv","sas_token":"my_sas_token_value","file_format":"CSV","max_bad_records":"1","ignore_unknown_values":"true","field_delimiter":"|","skip_leading_rows":"1","allow_quoted_newlines":"true","allow_jagged_rows":"false"}'
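As noted in the destination_table_name_template description, destination table names support runtime parameters. For example, to load each run into a date-sharded table, the template value can include a parameter such as {run_date} (a sketch assuming the standard transfer runtime parameter syntax; the table name is a placeholder):

"destination_table_name_template":"mytable_{run_date}"

A transfer run on January 1, 2025 would then write to a table named mytable_20250101.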

API

Use the projects.locations.transferConfigs.create method and supply an instance of the TransferConfig resource.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.api.gax.rpc.ApiException;
import com.google.cloud.bigquery.datatransfer.v1.CreateTransferConfigRequest;
import com.google.cloud.bigquery.datatransfer.v1.DataTransferServiceClient;
import com.google.cloud.bigquery.datatransfer.v1.ProjectName;
import com.google.cloud.bigquery.datatransfer.v1.TransferConfig;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sample to create azure blob storage transfer config.
public class CreateAzureBlobStorageTransfer {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    final String projectId = "MY_PROJECT_ID";
    final String displayName = "MY_TRANSFER_DISPLAY_NAME";
    final String datasetId = "MY_DATASET_ID";
    String tableId = "MY_TABLE_ID";
    String storageAccount = "MY_AZURE_STORAGE_ACCOUNT_NAME";
    String containerName = "MY_AZURE_CONTAINER_NAME";
    String dataPath = "MY_AZURE_FILE_NAME_OR_PREFIX";
    String sasToken = "MY_AZURE_SAS_TOKEN";
    String fileFormat = "CSV";
    String fieldDelimiter = ",";
    String skipLeadingRows = "1";
    Map<String, Value> params = new HashMap<>();
    params.put(
        "destination_table_name_template", Value.newBuilder().setStringValue(tableId).build());
    params.put("storage_account", Value.newBuilder().setStringValue(storageAccount).build());
    params.put("container", Value.newBuilder().setStringValue(containerName).build());
    params.put("data_path", Value.newBuilder().setStringValue(dataPath).build());
    params.put("sas_token", Value.newBuilder().setStringValue(sasToken).build());
    params.put("file_format", Value.newBuilder().setStringValue(fileFormat).build());
    params.put("field_delimiter", Value.newBuilder().setStringValue(fieldDelimiter).build());
    params.put("skip_leading_rows", Value.newBuilder().setStringValue(skipLeadingRows).build());
    createAzureBlobStorageTransfer(projectId, displayName, datasetId, params);
  }

  public static void createAzureBlobStorageTransfer(
      String projectId, String displayName, String datasetId, Map<String, Value> params)
      throws IOException {
    TransferConfig transferConfig =
        TransferConfig.newBuilder()
            .setDestinationDatasetId(datasetId)
            .setDisplayName(displayName)
            .setDataSourceId("azure_blob_storage")
            .setParams(Struct.newBuilder().putAllFields(params).build())
            .setSchedule("every 24 hours")
            .build();
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (DataTransferServiceClient client = DataTransferServiceClient.create()) {
      ProjectName parent = ProjectName.of(projectId);
      CreateTransferConfigRequest request =
          CreateTransferConfigRequest.newBuilder()
              .setParent(parent.toString())
              .setTransferConfig(transferConfig)
              .build();
      TransferConfig config = client.createTransferConfig(request);
      System.out.println("Azure Blob Storage transfer created successfully: " + config.getName());
    } catch (ApiException ex) {
      System.out.print("Azure Blob Storage transfer was not created." + ex.toString());
    }
  }
}

Specify encryption key with transfers

You can specify customer-managed encryption keys (CMEKs) to encrypt data for a transfer run. You can use a CMEK to support transfers from Azure Blob Storage.

When you specify a CMEK with a transfer, the BigQuery Data Transfer Service applies the CMEK to any intermediate on-disk cache of ingested data so that the entire data transfer workflow is CMEK compliant.

You cannot update an existing transfer to add a CMEK if the transfer was not originally created with a CMEK. For example, you cannot change a destination table that was originally default encrypted to now be encrypted with CMEK. Conversely, you also cannot change a CMEK-encrypted destination table to have a different type of encryption.

You can update a CMEK for a transfer if the transfer configuration was originally created with a CMEK. When you update a CMEK for a transfer configuration, the BigQuery Data Transfer Service propagates the CMEK to the destination tables at the next run of the transfer, where the BigQuery Data Transfer Service replaces any outdated CMEKs with the new CMEK during the transfer run. For more information, see Update a transfer.

You can also use project default keys. When you specify a project default key with a transfer, the BigQuery Data Transfer Service uses the project default key as the default key for any new transfer configurations.
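Using a CMEK also requires that the BigQuery service account for your project can encrypt and decrypt with the key. A hedged sketch of granting that access with the gcloud CLI, assuming the standard BigQuery CMEK setup (the key, key ring, location, and project number are placeholders):

gcloud kms keys add-iam-policy-binding key1 \
    --keyring=mykeyring \
    --location=us \
    --member="serviceAccount:bq-PROJECT_NUMBER@bigquery-encryption.iam.gserviceaccount.com" \
    --role="roles/cloudkms.cryptoKeyEncrypterDecrypter"

See the BigQuery CMEK documentation for the exact service account and role to use in your project.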

Troubleshoot transfer setup

If you are having issues setting up your data transfer, see Blob Storage transfer issues.
