Use externally partitioned data

Important: The term "BigLake" on this page refers to an accessdelegation functionality for external tables in BigQuery. Forinformation about BigLake, the stand-alone Google Cloudproduct that includes BigLake metastore, the Apache Iceberg REST catalog,and BigLake tables for Apache Iceberg see BigLake overview.

You can use BigQuery external tables to query partitioned data inthe following data stores:

The external partitioned data must use adefault Hive partitioning layoutand be in one of the following formats:

Avro
CSV
JSON
ORC
Parquet

To query externally partitioned data,you must create aBigLake tableor anexternal table.We recommend using BigLake tables because they let you enforcefine-grained security at the table level.For information about BigLake and external tables, seeIntroduction to BigLake tablesandIntroduction to external tables.

You enable Hive partitioning support by setting the appropriateoptions in thetable definition file.For instructions about querying managed partitioned tables, seeIntroduction to partitioned tables.

Partition schema

The following sections explain thedefault Hive partitioned layout and theschema detection modes that BigQuery supports.

To avoid reading unnecessary files and to improve performance, you can usepredicate filters on partition keys in queries.

Supported data layouts

Hive partition keys appear as normal columns when you query data fromCloud Storage.The data must follow a default Hive partitioned layout.For example, the following files follow the default layout—thekey-value pairs are configured as directories with an equal sign (=) as a separator,and the partition keys are always in the same order:

gs://my_bucket/my_table/dt=2019-10-31/lang=en/my_filenamegs://my_bucket/my_table/dt=2018-10-31/lang=fr/my_filename

The common source URI prefix in this example isgs://my_bucket/my_table.

Unsupported data layouts

If the partition key names are not encoded in the directory path,partition schema detection fails. For example, consider the following path,which does not encode the partition key names:

gs://my_bucket/my_table/2019-10-31/en/my_filename

Files where the schema is not in a consistent order also fail detection.For example, consider the following two files with inverted partitionkey encodings:

gs://my_bucket/my_table/dt=2019-10-31/lang=en/my_filenamegs://my_bucket/my_table/lang=fr/dt=2018-10-31/my_filename

Detection modes

BigQuery supports three modes of Hive partition schema detection:

AUTO: Key names and types are automatically detected. The following typescan be detected:
- STRING
- INTEGER
- DATE
  For example,/date=2018-10-18/.
- TIMESTAMP
  For example,/time=2018-10-18 16:00:00+00/.
STRINGS: Key names are automatically converted toSTRING type.
CUSTOM: Partition key schema is encoded as specified in the source URIprefix.

Custom partition key schema

To use aCUSTOM schema, you must specify the schema in the source URI prefixfield. Using aCUSTOM schema lets you specify the type for each partition key.The values must validly parse as the specified type or the query fails.

For example, if you set thesource_uri_prefix flag togs://my_bucket/my_table/{dt:DATE}/{val:STRING},BigQuery treatsval as a STRING,dt as a DATE, andusesgs://my_bucket/my_table as the source URI prefix for the matched files.

Partition pruning

BigQuery prunes partitions when possible using query predicates onthe partition keys. This lets BigQuery avoid readingunnecessary files, which helps improve performance.

Predicate filters on partition keys in queries

When you create an externally partitioned table, you can require the use ofpredicate filters on partition keys by enabling therequirePartitionFilteroption underHivePartitioningOptions.

When this option is enabled, attempts to query the externally partitioned tablewithout specifying aWHERE clause produce the following error:Cannot query over table <table_name> without a filter over column(s)<partition key names> that can be used for partition elimination.

Note: There must be at least one predicate thatonly references one or more partition keys for the filter to be consideredeligible for partition elimination. For example, for a table with partition keyval and columnf in the file, both of the followingWHERE clauses satisfy the requirement:
WHERE val = "key"
WHERE val = "key" AND f = "column"

However,WHERE (val = "key" OR f = "column") is not sufficient.

Limitations

Hive partitioning support is built assuming a common source URI prefix forall URIs that ends immediately before partition encoding, as follows:gs://BUCKET/PATH_TO_TABLE/.

The directory structure of a Hive partitioned table is assumed to have the samepartitioning keys appear in the same order, with a maximum of tenpartition keys per table.

The data must follow a default Hive partitioning layout.
The Hive partitioning keys and the columns in the underlying files cannot overlap.
Support is forGoogleSQL only.
Alllimitationsfor querying external data sources stored on Cloud Storage apply.

What's next

Learn aboutpartitioned tables.
Learn how touse SQL in BigQuery.

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.

Movatterモバイル変換