Create BigLake external tables for Delta Lake

Important: The term "BigLake" on this page refers to an access delegation functionality for external tables in BigQuery. For information about BigLake, the stand-alone Google Cloud product that includes BigLake metastore, the Apache Iceberg REST catalog, and BigLake tables for Apache Iceberg, see BigLake overview.

BigLake lets you access Delta Lake tables with more granular access control. Delta Lake is an open source, tabular data storage format developed by Databricks that supports petabyte-scale data tables.

BigQuery supports the following features with Delta Lake tables:

  • Access delegation: Query structured data in external data stores with access delegation. Access delegation decouples access to the Delta Lake table from access to the underlying data store.
  • Fine-grained access control: Enforce fine-grained security at the table level, including row-level and column-level security. For Delta Lake tables based on Cloud Storage, you can also use dynamic data masking.
  • Schema evolution: Schema changes in the Delta Lake tables are autodetected. Changes to the schema are reflected in the BigQuery table.

Delta Lake tables also support all BigLake features when you configure them as BigLake tables.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

    Go to project selector

  2. Verify that billing is enabled for your Google Cloud project.

  3. Enable the BigQuery Connection and BigQuery Reservation APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the APIs

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

  5. Ensure that you have a BigQuery dataset.

  6. Ensure that your version of the Google Cloud SDK is 366.0.0 or later:

    gcloud version

    If needed, update the Google Cloud SDK.

  7. Create a Cloud resource connection based on your external data source, and grant that connection access to Cloud Storage. If you don't have the appropriate permissions to create a connection, ask your BigQuery administrator to create a connection and share it with you. A command-line sketch covering steps 3 and 7 follows this list.
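The following Cloud Shell sketch shows one way to work through steps 3 and 7. The connection ID myconnection, the region us, the project myproject, and the bucket gs://my-delta-bucket are placeholders for illustration only; substitute your own values, and use the service account email reported by bq show for the IAM binding.

# Step 3: enable the required APIs.
gcloud services enable bigqueryconnection.googleapis.com bigqueryreservation.googleapis.com

# Step 7: create a Cloud resource connection and look up its service account.
bq mk --connection --location=us --project_id=myproject \
    --connection_type=CLOUD_RESOURCE myconnection
bq show --connection myproject.us.myconnection

# Grant the connection's service account read access to the bucket that holds the Delta Lake data.
gcloud storage buckets add-iam-policy-binding gs://my-delta-bucket \
    --member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
    --role=roles/storage.objectViewer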

Required roles

The following permissions are required to create a Delta Lake table:

  • bigquery.tables.create
  • bigquery.connections.delegate

The BigQuery Admin (roles/bigquery.admin) predefined Identity and Access Management role includes these permissions.

If you are not a principal in this role, ask your administratorto grant you these permissions or to create the Delta Lake table for you.

Additionally, to allow BigQuery users to query the table, the service account associated with the connection must have the following roles and access (a sketch for granting the roles follows this list):

  • BigQuery Viewer (roles/bigquery.viewer) role
  • BigQuery Connection User (roles/bigquery.connectionUser) role
  • Access to the Cloud Storage bucket that contains that data
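A minimal sketch for granting those roles to the connection's service account, assuming the hypothetical project myproject and the service account email from the earlier bq show --connection output; bucket access is covered in the setup sketch above.

# Grant the roles listed above (role IDs as given in this guide) to the connection's service account.
gcloud projects add-iam-policy-binding myproject \
    --member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
    --role=roles/bigquery.viewer

gcloud projects add-iam-policy-binding myproject \
    --member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
    --role=roles/bigquery.connectionUser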

For more information on Identity and Access Management roles and permissions in BigQuery, see Predefined roles and permissions.

Create tables with Delta Lake

To create Delta Lake tables, follow these steps.

SQL

Use the CREATE EXTERNAL TABLE statement to create the Delta Lake table:

CREATE EXTERNAL TABLE `PROJECT_ID.DATASET.DELTALAKE_TABLE_NAME`
WITH CONNECTION `PROJECT_ID.REGION.CONNECTION_ID`
OPTIONS (
  format = "DELTA_LAKE",
  uris = ['DELTA_TABLE_GCS_BASE_PATH']
);

Replace the following values:

  • PROJECT_ID: the ID of the project that you want to create the Delta Lake table in
  • DATASET: the BigQuery dataset to contain the Delta Lake table
  • DELTALAKE_TABLE_NAME: the name of your Delta Lake table
  • REGION: the region that contains the connection to create the Delta Lake table, for example us
  • CONNECTION_ID: the connection ID, for example myconnection

    When you view the connection details in the Google Cloud console, the connection ID is the value in the last section of the fully qualified connection ID that is shown in Connection ID, for example projects/myproject/locations/connection_location/connections/myconnection.

  • DELTA_TABLE_GCS_BASE_PATH: the Delta Lake table prefix
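For illustration, here is the same statement with hypothetical values filled in: project myproject, dataset mydataset, connection myconnection in us, and a Delta Lake table stored under gs://my-delta-bucket/warehouse/basictable.

CREATE EXTERNAL TABLE `myproject.mydataset.basictable`
WITH CONNECTION `myproject.us.myconnection`
OPTIONS (
  format = "DELTA_LAKE",
  uris = ['gs://my-delta-bucket/warehouse/basictable']
);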

bq

In a command-line environment, use the bq mk command to create the Delta Lake table:

bq mk --table --external_table_definition=DEFINITION_FILE PROJECT_ID:DATASET.DELTALAKE_TABLE_NAME

Replace the following values:

  • DEFINITION_FILE: the path to a table definition file
  • PROJECT_ID: the ID of the project that you want to create the Delta Lake table in
  • DATASET: the BigQuery dataset to contain the Delta Lake table
  • DELTALAKE_TABLE_NAME: the name of your Delta Lake table
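The DEFINITION_FILE can be generated with the bq mkdef command. A minimal sketch, assuming the same hypothetical connection, bucket, and table names as in the SQL example:

# Write a Delta Lake table definition that references the connection.
bq mkdef --source_format=DELTA_LAKE \
    --connection_id=myproject.us.myconnection \
    "gs://my-delta-bucket/warehouse/basictable" > mytable_def

# Create the external table from the definition file.
bq mk --table --external_table_definition=mytable_def myproject:mydataset.basictable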

REST

Use the BigQuery API to create a Delta Lake table by calling the tables.insert API method:

REQUEST='{
  "autodetect": true,
  "externalDataConfiguration": {
    "sourceFormat": "DELTA_LAKE",
    "connectionId": "PROJECT_ID.REGION.CONNECTION_ID",
    "sourceUris": [
      "DELTA_TABLE_GCS_BASE_PATH"
    ]
  },
  "tableReference": {
    "tableId": "DELTALAKE_TABLE_NAME"
  }
}'

echo $REQUEST | curl -X POST -d @- -H "Content-Type: application/json" -H "Authorization: Bearer $(gcloud auth print-access-token)" "https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/datasets/DATASET/tables?autodetect_schema=true"

Replace the following values:

  • PROJECT_ID: the ID of the project that you want to create the Delta Lake table in
  • REGION: the region that contains the connection to create the Delta Lake table, for example us
  • CONNECTION_ID: the connection ID, for example myconnection

    When you view the connection details in the Google Cloud console, the connection ID is the value in the last section of the fully qualified connection ID that is shown in Connection ID, for example projects/myproject/locations/connection_location/connections/myconnection.

  • DELTA_TABLE_GCS_BASE_PATH: the Delta Lake table prefix

  • DELTALAKE_TABLE_NAME: the name of your Delta Lake table

  • DATASET: the BigQuery dataset to contain the Delta Lake table

When you create Delta Lake tables, the Delta Lake prefix is used as the URI for the table. For example, for a table whose log files are under gs://bucket/warehouse/basictable/_delta_log, the table URI is gs://bucket/warehouse/basictable. When you run queries on the Delta Lake table, BigQuery reads data under the prefix to identify the current version of the table and then computes the metadata and the files for the table.

Although you can create Delta Lake external tables without a connection, it is not recommended for the following reasons:

  • Users might experience ACCESS_DENIED errors when trying to access files on Cloud Storage.
  • Features such as fine-grained access control are only available in Delta Lake BigLake tables.

Update Delta Lake tables

To update (refresh) the schema of Delta Lake tables, follow these steps.

bq

In a command-line environment, use the bq update command to update (refresh) the schema of the Delta Lake table:

bq update --autodetect_schema PROJECT_ID:DATASET.DELTALAKE_TABLE_NAME

Replace the following values:

  • PROJECT_ID: the ID of the project that contains the Delta Lake table
  • DATASET: the BigQuery dataset that contains the Delta Lake table
  • DELTALAKE_TABLE_NAME: the name of your Delta Lake table

REST

Use the BigQuery API to update a Delta Lake table by calling the tables.patch API method:

REQUEST='{
  "externalDataConfiguration": {
    "sourceFormat": "DELTA_LAKE",
    "sourceUris": [
      "DELTA_TABLE_GCS_BASE_PATH"
    ],
    "connectionId": "PROJECT_ID.REGION.CONNECTION_ID",
    "autodetect": true
  }
}'

echo $REQUEST | curl -X PATCH -d @- -H "Content-Type: application/json" -H "Authorization: Bearer $(gcloud auth print-access-token)" "https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/datasets/DATASET/tables/DELTALAKE_TABLE_NAME?autodetect_schema=true"

Replace the following values:

  • DELTA_TABLE_GCS_BASE_PATH: the Delta Lake table prefix
  • PROJECT_ID: the ID of the project that contains the Delta Lake table
  • REGION: the region that contains the connection to create the Delta Lake table, for example us
  • CONNECTION_ID: the connection ID, for example myconnection

    When you view the connection details in the Google Cloud console, the connection ID is the value in the last section of the fully qualified connection ID that is shown in Connection ID, for example projects/myproject/locations/connection_location/connections/myconnection.

  • DELTALAKE_TABLE_NAME: the name of your Delta Lake table

  • DATASET: the BigQuery dataset that contains the Delta Lake table

Query Delta Lake tables

After creating a Delta Lake BigLake table, you can query it using GoogleSQL syntax, the same as you would a standard BigQuery table. For example:

SELECT field1, field2 FROM mydataset.my_cloud_storage_table;

For more information, see Query Cloud Storage data in BigLake tables.

An external connection associated with a service account is used to connect to the data store. Because the service account retrieves data from the data store, users only need access to the Delta Lake table.

Data mapping

BigQuery converts Delta Lake data types to BigQuery data types as shown in the following table:

Delta Lake type                       BigQuery type
boolean                               BOOL
byte                                  INT64
int                                   INT64
long                                  INT64
float                                 FLOAT64
double                                FLOAT64
Decimal (P/S)                         NUMERIC or BIG_NUMERIC, depending on precision
date                                  DATE
time                                  TIME
timestamp (not a partition column)    TIMESTAMP
timestamp (partition column)          DATETIME
string                                STRING
binary                                BYTES
array<Type>                           ARRAY<Type>
struct                                STRUCT
map<KeyType, ValueType>               ARRAY<Struct<key KeyType, value ValueType>>
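Because map values surface in BigQuery as ARRAY<Struct<key, value>>, individual entries can be read with UNNEST. A short sketch, assuming a hypothetical table mydataset.mytable with an id column and a Delta Lake map column named attributes:

SELECT
  t.id,
  entry.key,
  entry.value
FROM mydataset.mytable AS t,
  UNNEST(t.attributes) AS entry;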

Limitations

Delta Lake tables have BigLake table limitations and also the following limitations:

  • Supports Delta Lake reader version 3 with relative path deletion vectors and column mapping.
  • Doesn't support Delta Lake V2 checkpoints.
  • You must list the reader version in the last log entry file. For example, new tables must include 00000..0.json.
  • Change data capture (CDC) operations aren't supported. Any existing CDC operations are ignored.
  • The schema is autodetected. Modifying the schema by using BigQuery isn't supported.
  • Table column names must adhere to BigQuery column name restrictions.
  • Materialized views aren't supported.
  • The Read API isn't supported for Delta Lake.
  • The timestamp_ntz data type isn't supported for Delta Lake BigLake tables.

Troubleshooting

This section provides help with Delta Lake BigLake tables. For more general help with troubleshooting BigQuery queries, see Troubleshoot query issues.

Query timeout and resource errors

Check the logs directory (gs://bucket/warehouse/basictable/_delta_log) of the Delta Lake table and look for JSON files with a version number greater than the previous checkpoint. You can obtain the version number by listing the directory or inspecting the _delta_log/_last_checkpoint file. JSON files larger than 10 MiB can slow down table expansion, which can lead to timeout and resource issues. To resolve this issue, use the following command to create a new checkpoint so that queries skip reading the JSON files:

spark.sql("ALTER TABLE delta.`gs://bucket/mydeltatabledir` SET TBLPROPERTIES ('delta.checkpointInterval' = '1')");

Users can then use the same command to reset the checkpoint interval to either the default value of 10 or to a value that avoids having more than 50 MB of JSON files between checkpoints.
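For example, once the new checkpoint exists, the interval can be set back to the default with the same statement (the table path is the same illustrative path used above):

spark.sql("ALTER TABLE delta.`gs://bucket/mydeltatabledir` SET TBLPROPERTIES ('delta.checkpointInterval' = '10')");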

Invalid column name

Ensure that column mapping is turned on for the Delta Lake table. Column mapping is supported with Reader version 2 or greater. For Reader version 1, set 'delta.columnMapping.mode' to 'name' by using the following command:

spark.sql("ALTER TABLE delta.`gs://bucket/mydeltatabledir` SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name', 'delta.minReaderVersion' = '3', 'delta.minWriterVersion' = '7')");

If the invalid column name adheres to flexible column name restrictions, contact Cloud Customer Care or biglake-help@google.com.

Access denied errors

To diagnose issues with Delta Lake BigLake tables, check the following:

Performance

To improve query performance, try the following steps:

  • Use Delta Lake utilities to compact the underlying data files and remove redundant files, such as data and metadata.

  • Ensure that delta.checkpoint.writeStatsAsStruct is set to true (a sketch for setting it follows this list).

  • Ensure that variables that are frequently used in predicate clauses are in partition columns.
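A sketch for turning on delta.checkpoint.writeStatsAsStruct, reusing the illustrative table path from the troubleshooting commands above:

spark.sql("ALTER TABLE delta.`gs://bucket/mydeltatabledir` SET TBLPROPERTIES ('delta.checkpoint.writeStatsAsStruct' = 'true')");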

Large datasets (greater than 100 TB) might benefit from additional configurations and features. If the preceding steps do not resolve your issues, consider contacting Customer Care or biglake-help@google.com, especially for datasets larger than 100 TB.
