Create clustered tables

You can reduce the amount of data processed by a query by using clustered tables in BigQuery.

With clustered tables, table data is organized based on the values of specified columns, also called the clustering columns. BigQuery sorts the data by the clustering columns, then stores the rows that have similar values in the same or nearby physical blocks. When a query filters on a clustered column, BigQuery efficiently scans only the relevant blocks and skips the data that doesn't match the filter.
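
For example, if a table is clustered by customer_id, a query that filters on that column lets BigQuery skip blocks whose values fall outside the filter. The following is a minimal sketch, assuming a hypothetical table mydataset.transactions clustered by customer_id:

-- Only blocks whose customer_id range can match 'C12345' are scanned;
-- the remaining blocks are skipped.
SELECT customer_id, SUM(transaction_amount) AS total
FROM mydataset.transactions
WHERE customer_id = 'C12345'
GROUP BY customer_id;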

Before you begin

Required roles

To get the permissions that you need to create a table, ask your administrator to grant you the following IAM roles:

  • BigQuery Job User (roles/bigquery.jobUser) on the project if you're creating a table by loading data or by saving query results to a table.
  • BigQuery Data Editor (roles/bigquery.dataEditor) on the dataset where you're creating the table.

For more information about granting roles, see Manage access to projects, folders, and organizations.

These predefined roles contain the permissions required to create a table. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to create a table:

  • bigquery.tables.create on the dataset where you're creating the table.
  • bigquery.tables.getData on all tables and views that your query references if you're saving query results as a table.
  • bigquery.jobs.create on the project if you're creating the table by loading data or by saving query results to a table.
  • bigquery.tables.updateData on the table if you're appending to or overwriting a table with query results.

You might also be able to get these permissions with custom roles or other predefined roles.

Table naming requirements

When you create a table in BigQuery, the table name must be unique per dataset. The table name can:

  • Contain characters with a total of up to 1,024 UTF-8 bytes.
  • Contain Unicode characters in category L (letter), M (mark), N (number), Pc (connector, including underscore), Pd (dash), Zs (space). For more information, see General Category.

The following are all examples of valid table names: table 01, ग्राहक, 00_お客様, étudiant-01.

Caveats:

  • Table names are case-sensitive by default. mytable and MyTable can coexist in the same dataset, unless they are part of a dataset with case-sensitivity turned off.
  • Some table names and table name prefixes are reserved. If you receive an error saying that your table name or prefix is reserved, then select a different name and try again.
  • If you include multiple dot operators (.) in a sequence, the duplicate operators are implicitly stripped.

    For example, this: project_name....dataset_name..table_name

    Becomes this: project_name.dataset_name.table_name

Clustered column requirements

You can specify the columns used to create the clustered table when you create a table in BigQuery. After the table is created, you can modify the columns used to create the clustered table. For details, see Modifying the clustering specification.

Clustering columns must be top-level, non-repeated columns, and they must be one of the following data types:

  • BIGNUMERIC
  • BOOL
  • DATE
  • DATETIME
  • GEOGRAPHY
  • INT64
  • NUMERIC
  • RANGE
  • STRING
  • TIMESTAMP

You can specify up to four clustering columns. When you specify multiple columns, the order of the columns determines how the data is sorted. For example, if the table is clustered by columns a, b, and c, the data is sorted in the same order: first by column a, then by column b, and then by column c. As a best practice, place the most frequently filtered or aggregated column first.

The order of your clustering columns also affects query performance and pricing. For more information about query best practices for clustered tables, see Querying clustered tables.
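
As an illustration, the following sketch clusters a hypothetical orders table by three columns, with the most frequently filtered column first; the table and column names are illustrative, not part of the examples elsewhere on this page:

CREATE TABLE mydataset.orders
(
  customer_id STRING,  -- filtered most often, so it sorts first
  order_date DATE,     -- sorted second, within each customer_id
  store_id INT64,      -- sorted third, within each (customer_id, order_date)
  amount NUMERIC
)
CLUSTER BY customer_id, order_date, store_id;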

Create an empty clustered table with a schema definition

To create an empty clustered table with a schema definition:

Console

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.
  3. In the Explorer pane, expand your project, click Datasets, and then select a dataset.
  4. In the Dataset info section, click Create table.
  5. In the Create table pane, specify the following details:
    1. In the Source section, select Empty table in the Create table from list.
    2. In the Destination section, specify the following details:
      1. For Dataset, select the dataset in which you want to create the table.
      2. In the Table field, enter the name of the table that you want to create.
      3. Verify that the Table type field is set to Native table.
    3. In the Schema section, enter the schema definition. You can enter schema information manually by using one of the following methods:
      • Option 1: Click Edit as text and paste the schema in the form of a JSON array (a sample array follows this procedure). When you use a JSON array, you generate the schema using the same process as creating a JSON schema file. You can view the schema of an existing table in JSON format by entering the following command:
        bq show --format=prettyjson dataset.table
      • Option 2: Click Add field and enter the table schema. Specify each field's Name, Type, and Mode.
    4. For Clustering order, enter between one and four comma-separated column names.
    5. Optional: In the Advanced options section, if you want to use a customer-managed encryption key, then select the Use a customer-managed encryption key (CMEK) option. By default, BigQuery encrypts customer content stored at rest by using a Google-owned and Google-managed encryption key.
    6. Click Create table.
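
For reference, the JSON schema array mentioned in Option 1 might look like the following for the two-column example used elsewhere on this page; treat it as a sketch:

[
  {"name": "customer_id", "type": "STRING", "mode": "NULLABLE"},
  {"name": "transaction_amount", "type": "NUMERIC", "mode": "NULLABLE"}
]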

SQL

Use the CREATE TABLE DDL statement with the CLUSTER BY option. The following example creates a clustered table named myclusteredtable in mydataset:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following statement:

    CREATE TABLE mydataset.myclusteredtable
    (
      customer_id STRING,
      transaction_amount NUMERIC
    )
    CLUSTER BY customer_id
    OPTIONS (
      description = 'a table clustered by customer_id');

  3. Click Run.

For more information about how to run queries, see Run an interactive query.

bq

Use the bq mk command with the following flags:

  • --table (or the -t shortcut).
  • --schema. You can supply the table's schema definition inline or use a JSON schema file.
  • --clustering_fields. You can specify up to four clustering columns.

Optional parameters include --expiration, --description, --time_partitioning_type, --time_partitioning_field, --time_partitioning_expiration, --destination_kms_key, and --label.

If you are creating a table in a project other than your default project, add the project ID to the dataset in the following format: project_id:dataset.

--destination_kms_key is not demonstrated here. For information about using --destination_kms_key, see customer-managed encryption keys.

Enter the following command to create an empty clustered table with a schema definition:

bq mk \
    --table \
    --expiration INTEGER1 \
    --schema SCHEMA \
    --clustering_fields CLUSTER_COLUMNS \
    --description "DESCRIPTION" \
    --label KEY:VALUE,KEY:VALUE \
    PROJECT_ID:DATASET.TABLE

Replace the following:

  • INTEGER1: the default lifetime, in seconds, for the table. The minimum value is 3,600 seconds (one hour). The expiration time evaluates to the current UTC time plus the integer value. If you set the table's expiration time when you create a table, the dataset's default table expiration setting is ignored. Setting this value deletes the table after the specified time.
  • SCHEMA: an inline schema definition in the format COLUMN:DATA_TYPE,COLUMN:DATA_TYPE or the path to the JSON schema file on your local machine.
  • CLUSTER_COLUMNS: a comma-separated list of up to four clustering columns. The list cannot contain any spaces.
  • DESCRIPTION: a description of the table, in quotes.
  • KEY:VALUE: the key-value pair that represents a label. You can enter multiple labels using a comma-separated list.
  • PROJECT_ID: your project ID.
  • DATASET: a dataset in your project.
  • TABLE: the name of the table you're creating.

When you specify the schema on the command line, you cannot include a RECORD (STRUCT) type, you cannot include a column description, and you cannot specify the column's mode. All modes default to NULLABLE. To include descriptions, modes, and RECORD types, supply a JSON schema file instead.
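
For illustration, a JSON schema file that adds a description, an explicit mode, and a RECORD type might look like the following; the field names are hypothetical:

[
  {
    "name": "customer_id",
    "type": "STRING",
    "mode": "REQUIRED",
    "description": "Unique customer identifier"
  },
  {
    "name": "address",
    "type": "RECORD",
    "mode": "NULLABLE",
    "fields": [
      {"name": "city", "type": "STRING"},
      {"name": "zipcode", "type": "STRING"}
    ]
  }
]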

Examples:

Enter the following command to create a clustered table named myclusteredtable in mydataset in your default project. The table's expiration is set to 2,592,000 seconds (30 days), the description is set to This is my clustered table, and the label is set to organization:development. The command uses the -t shortcut instead of --table.

The schema is specified inline as timestamp:timestamp,customer_id:string,transaction_amount:float. The specified clustering field customer_id is used to cluster the table.

bq mk \
    -t \
    --expiration 2592000 \
    --schema 'timestamp:timestamp,customer_id:string,transaction_amount:float' \
    --clustering_fields customer_id \
    --description "This is my clustered table" \
    --label organization:development \
    mydataset.myclusteredtable

Enter the following command to create a clustered table named myclusteredtable in myotherproject, not your default project. The description is set to This is my clustered table, and the label is set to organization:development. The command uses the -t shortcut instead of --table. This command does not specify a table expiration. If the dataset has a default table expiration, it is applied. If the dataset has no default table expiration, the table never expires.

The schema is specified in a local JSON file: /tmp/myschema.json. The customer_id field is used to cluster the table.

bq mk \
    -t \
    --schema /tmp/myschema.json \
    --clustering_fields=customer_id \
    --description "This is my clustered table" \
    --label organization:development \
    myotherproject:mydataset.myclusteredtable

After the table is created, you can update the table's description and labels.

Terraform

Use the google_bigquery_table resource.

Note: To create BigQuery objects using Terraform, you must enable the Cloud Resource Manager API.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

The following example creates a table named mytable that is clustered on the ID and Created columns:

resource "google_bigquery_dataset" "default" {  dataset_id                      = "mydataset"  default_partition_expiration_ms = 2592000000  # 30 days  default_table_expiration_ms     = 31536000000 # 365 days  description                     = "dataset description"  location                        = "US"  max_time_travel_hours           = 96 # 4 days  labels = {    billing_group = "accounting",    pii           = "sensitive"  }}resource "google_bigquery_table" "default" {  dataset_id          = google_bigquery_dataset.default.dataset_id  table_id            = "mytable"  deletion_protection = false # set to "true" in production  clustering = ["ID", "Created"]  schema = <<EOF[  {    "name": "ID",    "type": "INT64",    "description": "Item ID"  },  {    "name": "Item",    "type": "STRING",    "mode": "NULLABLE"  }, {   "name": "Created",   "type": "TIMESTAMP" }]EOF}

To apply your Terraform configuration in a Google Cloud project, complete the steps in the following sections.

Prepare Cloud Shell

  1. Launch Cloud Shell.
  2. Set the default Google Cloud project where you want to apply your Terraform configurations.

    You only need to run this command once per project, and you can run it in any directory.

    export GOOGLE_CLOUD_PROJECT=PROJECT_ID

    Environment variables are overridden if you set explicit values in the Terraform configuration file.

Prepare the directory

Each Terraform configuration file must have its own directory (also called a root module).

  1. In Cloud Shell, create a directory and a new file within that directory. The filename must have the .tf extension, for example main.tf. In this tutorial, the file is referred to as main.tf.
    mkdir DIRECTORY && cd DIRECTORY && touch main.tf
  2. If you are following a tutorial, you can copy the sample code in each section or step.

    Copy the sample code into the newly created main.tf.

    Optionally, copy the code from GitHub. This is recommended when the Terraform snippet is part of an end-to-end solution.

  3. Review and modify the sample parameters to apply to your environment.
  4. Save your changes.
  5. Initialize Terraform. You only need to do this once per directory.
    terraform init

    Optionally, to use the latest Google provider version, include the -upgrade option:

    terraform init -upgrade

Apply the changes

  1. Review the configuration and verify that the resources that Terraform is going to create or update match your expectations:
    terraform plan

    Make corrections to the configuration as necessary.

  2. Apply the Terraform configuration by running the following command and entering yes at the prompt:
    terraform apply

    Wait until Terraform displays the "Apply complete!" message.

  3. Open your Google Cloud project to view the results. In the Google Cloud console, navigate to your resources in the UI to make sure that Terraform has created or updated them.
Note: Terraform samples typically assume that the required APIs are enabled in your Google Cloud project.

API

Call the tables.insert method with a defined table resource that specifies the clustering.fields property and the schema property.
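
A minimal sketch of the table resource body for that call might look like the following; the project, dataset, and table IDs are placeholders, and the field list mirrors the SQL example earlier on this page:

{
  "tableReference": {
    "projectId": "PROJECT_ID",
    "datasetId": "mydataset",
    "tableId": "myclusteredtable"
  },
  "clustering": {
    "fields": ["customer_id"]
  },
  "schema": {
    "fields": [
      {"name": "customer_id", "type": "STRING"},
      {"name": "transaction_amount", "type": "NUMERIC"}
    ]
  }
}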

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the table to create.
# table_id = "your-project.your_dataset.your_table_name"

schema = [
    bigquery.SchemaField("full_name", "STRING"),
    bigquery.SchemaField("city", "STRING"),
    bigquery.SchemaField("zipcode", "INTEGER"),
]

table = bigquery.Table(table_id, schema=schema)
table.clustering_fields = ["city", "zipcode"]
table = client.create_table(table)  # Make an API request.
print(
    "Created clustered table {}.{}.{}".format(
        table.project, table.dataset_id, table.table_id
    )
)

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import (
  "context"
  "fmt"
  "time"

  "cloud.google.com/go/bigquery"
)

// createTableClustered demonstrates creating a BigQuery table with advanced properties like
// partitioning and clustering features.
func createTableClustered(projectID, datasetID, tableID string) error {
  // projectID := "my-project-id"
  // datasetID := "mydatasetid"
  // tableID := "mytableid"
  ctx := context.Background()
  client, err := bigquery.NewClient(ctx, projectID)
  if err != nil {
    return fmt.Errorf("bigquery.NewClient: %v", err)
  }
  defer client.Close()

  sampleSchema := bigquery.Schema{
    {Name: "timestamp", Type: bigquery.TimestampFieldType},
    {Name: "origin", Type: bigquery.StringFieldType},
    {Name: "destination", Type: bigquery.StringFieldType},
    {Name: "amount", Type: bigquery.NumericFieldType},
  }
  metaData := &bigquery.TableMetadata{
    Schema: sampleSchema,
    TimePartitioning: &bigquery.TimePartitioning{
      Field:      "timestamp",
      Expiration: 90 * 24 * time.Hour,
    },
    Clustering: &bigquery.Clustering{
      Fields: []string{"origin", "destination"},
    },
  }
  tableRef := client.Dataset(datasetID).Table(tableID)
  if err := tableRef.Create(ctx, metaData); err != nil {
    return err
  }
  return nil
}

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Clustering;
import com.google.cloud.bigquery.Field;
import com.google.cloud.bigquery.Schema;
import com.google.cloud.bigquery.StandardSQLTypeName;
import com.google.cloud.bigquery.StandardTableDefinition;
import com.google.cloud.bigquery.TableId;
import com.google.cloud.bigquery.TableInfo;
import com.google.cloud.bigquery.TimePartitioning;
import com.google.common.collect.ImmutableList;

public class CreateClusteredTable {

  public static void runCreateClusteredTable() {
    // TODO(developer): Replace these variables before running the sample.
    String datasetName = "MY_DATASET_NAME";
    String tableName = "MY_TABLE_NAME";
    createClusteredTable(datasetName, tableName);
  }

  public static void createClusteredTable(String datasetName, String tableName) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableId tableId = TableId.of(datasetName, tableName);

      TimePartitioning partitioning = TimePartitioning.of(TimePartitioning.Type.DAY);

      Schema schema =
          Schema.of(
              Field.of("name", StandardSQLTypeName.STRING),
              Field.of("post_abbr", StandardSQLTypeName.STRING),
              Field.of("date", StandardSQLTypeName.DATE));

      Clustering clustering =
          Clustering.newBuilder().setFields(ImmutableList.of("name", "post_abbr")).build();

      StandardTableDefinition tableDefinition =
          StandardTableDefinition.newBuilder()
              .setSchema(schema)
              .setTimePartitioning(partitioning)
              .setClustering(clustering)
              .build();

      TableInfo tableInfo = TableInfo.newBuilder(tableId, tableDefinition).build();
      bigquery.create(tableInfo);
      System.out.println("Clustered table created successfully");
    } catch (BigQueryException e) {
      System.out.println("Clustered table was not created. \n" + e.toString());
    }
  }
}

Create a clustered table from a query result

There are two ways to create a clustered table from a query result: use a CREATE TABLE ... AS SELECT DDL statement, or run a query that writes its results to a clustered destination table. Both approaches are shown in the following sections.

You can create a clustered table by querying either a partitioned table or a non-partitioned table. You cannot change an existing table to a clustered table by using query results.

When you create a clustered table from a query result, you must use standard SQL. Legacy SQL is not supported for querying clustered tables or for writing query results to clustered tables.

SQL

To create a clustered table from a query result, use the CREATE TABLE DDL statement with the CLUSTER BY option. The following example creates a new table clustered by customer_id by querying an existing unclustered table:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following statement:

    CREATE TABLE mydataset.clustered_table
    (
      customer_id STRING,
      transaction_amount NUMERIC
    )
    CLUSTER BY customer_id
    AS (
      SELECT * FROM mydataset.unclustered_table
    );

  3. Click Run.

For more information about how to run queries, see Run an interactive query.

bq

Enter the following command to create a new, clustered destination table from a query result:

bq --location=LOCATION query \
    --use_legacy_sql=false \
    'QUERY'

Replace the following:

  • LOCATION: the name of your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, you can set the flag's value to asia-northeast1. You can set a default value for the location using the .bigqueryrc file (a sketch follows this list).
  • QUERY: a query in GoogleSQL syntax. You cannot use legacy SQL to query clustered tables or to write query results to clustered tables. The query can contain a CREATE TABLE DDL statement that specifies the options for creating your clustered table. You can use DDL rather than specifying the individual command-line flags.
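
For example, a minimal .bigqueryrc sketch that sets a default location might look like the following; the file holds default flag values, one per line, and the region shown is just an assumption:

# In ~/.bigqueryrc: set a default location for bq commands.
--location=asia-northeast1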

Examples:

Enter the following command to write query results to a clustered destination table named myclusteredtable in mydataset. mydataset is in your default project. The query retrieves data from a non-partitioned table: mytable. The table's customer_id column is used to cluster the table. The table's timestamp column is used to create a partitioned table.

bq query --use_legacy_sql=false \
    'CREATE TABLE
       mydataset.myclusteredtable
     PARTITION BY
       DATE(timestamp)
     CLUSTER BY
       customer_id
     AS (
       SELECT *
       FROM `mydataset.mytable`
     );'

API

To save query results to a clustered table, call the jobs.insert method, configure a query job, and include a CREATE TABLE DDL statement that creates your clustered table.

Specify your location in the location property in the jobReference section of the job resource.
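
Put together, a hedged sketch of such a query job body might look like the following; the project ID and location are placeholders, and the DDL statement mirrors the SQL example earlier on this page:

{
  "jobReference": {
    "projectId": "PROJECT_ID",
    "location": "US"
  },
  "configuration": {
    "query": {
      "query": "CREATE TABLE mydataset.clustered_table (customer_id STRING, transaction_amount NUMERIC) CLUSTER BY customer_id AS (SELECT * FROM mydataset.unclustered_table)",
      "useLegacySql": false
    }
  }
}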

Create a clustered table when you load data

You can create a clustered table by specifying clustering columns when you load data into a new table. You do not need to create an empty table before loading data into it. You can create the clustered table and load your data at the same time.

For more information about loading data, see Introduction to loading data into BigQuery.

To define clustering when you create a load job, use one of the following options:

SQL

Use the LOAD DATA statement. The following example loads AVRO data to create a table that is partitioned by the transaction_date field and clustered by the customer_id field. It also configures the partitions to expire after three days.

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following statement:

    LOAD DATA INTO mydataset.mytable
    PARTITION BY transaction_date
    CLUSTER BY customer_id
    OPTIONS (
      partition_expiration_days = 3)
    FROM FILES(
      format = 'AVRO',
      uris = ['gs://bucket/path/file.avro']);

  3. Click Run.

For more information about how to run queries, see Run an interactive query.

API

To define a clustering configuration when creating a table through a load job, you can populate the Clustering properties for the table.
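
A minimal sketch of such a load job configuration might look like the following; the project ID, Cloud Storage URI, and clustering field are illustrative:

{
  "configuration": {
    "load": {
      "destinationTable": {
        "projectId": "PROJECT_ID",
        "datasetId": "mydataset",
        "tableId": "mytable"
      },
      "sourceUris": ["gs://bucket/path/file.avro"],
      "sourceFormat": "AVRO",
      "clustering": {
        "fields": ["customer_id"]
      }
    }
  }
}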

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import (
  "context"
  "fmt"

  "cloud.google.com/go/bigquery"
)

// importClusteredTable demonstrates creating a table from a load job and defining partitioning and clustering
// properties.
func importClusteredTable(projectID, destDatasetID, destTableID string) error {
  // projectID := "my-project-id"
  // datasetID := "mydataset"
  // tableID := "mytable"
  ctx := context.Background()
  client, err := bigquery.NewClient(ctx, projectID)
  if err != nil {
    return fmt.Errorf("bigquery.NewClient: %v", err)
  }
  defer client.Close()

  gcsRef := bigquery.NewGCSReference("gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv")
  gcsRef.SkipLeadingRows = 1
  gcsRef.Schema = bigquery.Schema{
    {Name: "timestamp", Type: bigquery.TimestampFieldType},
    {Name: "origin", Type: bigquery.StringFieldType},
    {Name: "destination", Type: bigquery.StringFieldType},
    {Name: "amount", Type: bigquery.NumericFieldType},
  }
  loader := client.Dataset(destDatasetID).Table(destTableID).LoaderFrom(gcsRef)
  loader.TimePartitioning = &bigquery.TimePartitioning{
    Field: "timestamp",
  }
  loader.Clustering = &bigquery.Clustering{
    Fields: []string{"origin", "destination"},
  }
  loader.WriteDisposition = bigquery.WriteEmpty

  job, err := loader.Run(ctx)
  if err != nil {
    return err
  }
  status, err := job.Wait(ctx)
  if err != nil {
    return err
  }
  if status.Err() != nil {
    return fmt.Errorf("job completed with error: %v", status.Err())
  }
  return nil
}

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Clustering;
import com.google.cloud.bigquery.Field;
import com.google.cloud.bigquery.FormatOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.LoadJobConfiguration;
import com.google.cloud.bigquery.Schema;
import com.google.cloud.bigquery.StandardSQLTypeName;
import com.google.cloud.bigquery.TableId;
import com.google.cloud.bigquery.TimePartitioning;
import com.google.common.collect.ImmutableList;

public class LoadTableClustered {

  public static void runLoadTableClustered() throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String datasetName = "MY_DATASET_NAME";
    String tableName = "MY_TABLE_NAME";
    String sourceUri = "/path/to/file.csv";
    loadTableClustered(datasetName, tableName, sourceUri);
  }

  public static void loadTableClustered(String datasetName, String tableName, String sourceUri)
      throws Exception {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableId tableId = TableId.of(datasetName, tableName);

      Schema schema =
          Schema.of(
              Field.of("name", StandardSQLTypeName.STRING),
              Field.of("post_abbr", StandardSQLTypeName.STRING),
              Field.of("date", StandardSQLTypeName.DATE));

      TimePartitioning partitioning = TimePartitioning.of(TimePartitioning.Type.DAY);

      Clustering clustering =
          Clustering.newBuilder().setFields(ImmutableList.of("name", "post_abbr")).build();

      LoadJobConfiguration loadJobConfig =
          LoadJobConfiguration.builder(tableId, sourceUri)
              .setFormatOptions(FormatOptions.csv())
              .setSchema(schema)
              .setTimePartitioning(partitioning)
              .setClustering(clustering)
              .build();

      Job loadJob = bigquery.create(JobInfo.newBuilder(loadJobConfig).build());

      // Blocks until this load table job completes its execution, either failing or succeeding.
      Job completedJob = loadJob.waitFor();

      // Check for errors
      if (completedJob == null) {
        throw new Exception("Job not executed since it no longer exists.");
      } else if (completedJob.getStatus().getError() != null) {
        // You can also look at queryJob.getStatus().getExecutionErrors() for all
        // errors, not just the latest one.
        throw new Exception(
            "BigQuery was unable to load into the table due to an error: \n"
                + loadJob.getStatus().getError());
      }
      System.out.println("Data successfully loaded into clustered table during load job");
    } catch (BigQueryException | InterruptedException e) {
      System.out.println("Data not loaded into clustered table during load job \n" + e.toString());
    }
  }
}

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the table to create.
# table_id = "your-project.your_dataset.your_table_name"

job_config = bigquery.LoadJobConfig(
    skip_leading_rows=1,
    source_format=bigquery.SourceFormat.CSV,
    schema=[
        bigquery.SchemaField("timestamp", bigquery.SqlTypeNames.TIMESTAMP),
        bigquery.SchemaField("origin", bigquery.SqlTypeNames.STRING),
        bigquery.SchemaField("destination", bigquery.SqlTypeNames.STRING),
        bigquery.SchemaField("amount", bigquery.SqlTypeNames.NUMERIC),
    ],
    time_partitioning=bigquery.TimePartitioning(field="timestamp"),
    clustering_fields=["origin", "destination"],
)

job = client.load_table_from_uri(
    ["gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv"],
    table_id,
    job_config=job_config,
)

job.result()  # Waits for the job to complete.

table = client.get_table(table_id)  # Make an API request.
print(
    "Loaded {} rows and {} columns to {}".format(
        table.num_rows, len(table.schema), table_id
    )
)
