Create external table with hive partitioning

Create an external table using hive partitioning.
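All of the samples below point at hive-partitioned parquet files in Cloud Storage. As a sketch of the expected layout (the second object name here is hypothetical; only `file1.parquet` appears in the sample data referenced below), partition keys are encoded as `key=value` path segments under a common prefix:

```
gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/dt=2020-11-15/file1.parquet
gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/dt=2020-11-16/file2.parquet
```

BigQuery derives a partitioning column (here, `dt`) from these path segments rather than from the file contents.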

Code sample

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import (
	"context"
	"fmt"

	"cloud.google.com/go/bigquery"
)

// createTableExternalHivePartitioned demonstrates creating an external table with hive partitioning.
func createTableExternalHivePartitioned(projectID, datasetID, tableID string) error {
	// projectID := "my-project-id"
	// datasetID := "mydatasetid"
	// tableID := "mytableid"
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("bigquery.NewClient: %w", err)
	}
	defer client.Close()

	// First, we'll define table metadata to represent a table that's backed by parquet files held in
	// Cloud Storage.
	//
	// Example file:
	// gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/dt=2020-11-15/file1.parquet
	metadata := &bigquery.TableMetadata{
		Description: "An example table that demonstrates hive partitioning against external parquet files",
		ExternalDataConfig: &bigquery.ExternalDataConfig{
			SourceFormat: bigquery.Parquet,
			SourceURIs:   []string{"gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/*"},
			AutoDetect:   true,
		},
	}

	// The layout of the files in here is compatible with the layout requirements for hive partitioning,
	// so we can add an optional Hive partitioning configuration to leverage the object paths for deriving
	// partitioning column information.
	//
	// For more information on how partitions are extracted, see:
	// https://cloud.google.com/bigquery/docs/hive-partitioned-queries-gcs
	//
	// We have a "/dt=YYYY-MM-DD/" path component in our example files as documented above.  Autolayout will
	// expose this as a column named "dt" of type DATE.
	metadata.ExternalDataConfig.HivePartitioningOptions = &bigquery.HivePartitioningOptions{
		Mode:                   bigquery.AutoHivePartitioningMode,
		SourceURIPrefix:        "gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/",
		RequirePartitionFilter: true,
	}

	// Create the external table.
	tableRef := client.Dataset(datasetID).Table(tableID)
	if err := tableRef.Create(ctx, metadata); err != nil {
		return fmt.Errorf("table creation failure: %w", err)
	}
	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.ExternalTableDefinition;
import com.google.cloud.bigquery.FormatOptions;
import com.google.cloud.bigquery.HivePartitioningOptions;
import com.google.cloud.bigquery.TableId;
import com.google.cloud.bigquery.TableInfo;

// Sample to create external table using hive partitioning
public class CreateTableExternalHivePartitioned {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String datasetName = "MY_DATASET_NAME";
    String tableName = "MY_TABLE_NAME";
    String sourceUri =
        "gs://cloud-samples-data/bigquery/hive-partitioning-samples/customlayout/*";
    String sourceUriPrefix =
        "gs://cloud-samples-data/bigquery/hive-partitioning-samples/customlayout/{pkey:STRING}/";
    createTableExternalHivePartitioned(datasetName, tableName, sourceUriPrefix, sourceUri);
  }

  public static void createTableExternalHivePartitioned(
      String datasetName, String tableName, String sourceUriPrefix, String sourceUri) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      // Configuring partitioning options
      HivePartitioningOptions hivePartitioningOptions =
          HivePartitioningOptions.newBuilder()
              .setMode("CUSTOM")
              .setRequirePartitionFilter(true)
              .setSourceUriPrefix(sourceUriPrefix)
              .build();

      TableId tableId = TableId.of(datasetName, tableName);
      ExternalTableDefinition customTable =
          ExternalTableDefinition.newBuilder(sourceUri, FormatOptions.parquet())
              .setAutodetect(true)
              .setHivePartitioningOptions(hivePartitioningOptions)
              .build();
      bigquery.create(TableInfo.of(tableId, customTable));
      System.out.println("External table created using hive partitioning options");
    } catch (BigQueryException e) {
      System.out.println("External table was not created. " + e.toString());
    }
  }
}

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

# Demonstrates creating an external table with hive partitioning.

# TODO(developer): Set table_id to the ID of the table to create.
table_id = "your-project.your_dataset.your_table_name"

# TODO(developer): Set source uri.
# Example file:
# gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/dt=2020-11-15/file1.parquet
uri = "gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/*"

# TODO(developer): Set source uri prefix.
source_uri_prefix = (
    "gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/"
)

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# Configure the external data source.
external_config = bigquery.ExternalConfig("PARQUET")
external_config.source_uris = [uri]
external_config.autodetect = True

# Configure partitioning options.
hive_partitioning_opts = bigquery.HivePartitioningOptions()

# The layout of the files in here is compatible with the layout requirements for hive partitioning,
# so we can add an optional Hive partitioning configuration to leverage the object paths for deriving
# partitioning column information.
#
# For more information on how partitions are extracted, see:
# https://cloud.google.com/bigquery/docs/hive-partitioned-queries-gcs
#
# We have a "/dt=YYYY-MM-DD/" path component in our example files as documented above.
# Autolayout will expose this as a column named "dt" of type DATE.
hive_partitioning_opts.mode = "AUTO"
hive_partitioning_opts.require_partition_filter = True
hive_partitioning_opts.source_uri_prefix = source_uri_prefix

external_config.hive_partitioning = hive_partitioning_opts

table = bigquery.Table(table_id)
table.external_data_configuration = external_config
table = client.create_table(table)  # Make an API request.
print(
    "Created table {}.{}.{}".format(table.project, table.dataset_id, table.table_id)
)
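The "AUTO" mode described in the sample comments can be illustrated locally. The sketch below is purely illustrative and runs without any Google Cloud dependency; `parse_hive_partitions` is a hypothetical helper, not a client-library API. It mimics how `key=value` path segments after the source URI prefix become partition column values (BigQuery performs this extraction server-side, and in AUTO mode also infers column types such as DATE for `dt`):

```python
# Illustrative only: BigQuery does this extraction server-side.
# parse_hive_partitions is a hypothetical helper, not part of google-cloud-bigquery.
def parse_hive_partitions(object_path: str, source_uri_prefix: str) -> dict:
    """Return the key=value partition segments found between the prefix and the file name."""
    if not object_path.startswith(source_uri_prefix):
        raise ValueError("object does not share the source URI prefix")
    remainder = object_path[len(source_uri_prefix):]
    partitions = {}
    for segment in remainder.split("/")[:-1]:  # drop the trailing file name
        if "=" in segment:
            key, _, value = segment.partition("=")
            partitions[key] = value
    return partitions


prefix = "gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/"
path = prefix + "dt=2020-11-15/file1.parquet"
print(parse_hive_partitions(path, prefix))  # {'dt': '2020-11-15'}
```

With `require_partition_filter` set to `True` as above, queries against the created table must filter on the derived column (for example, `WHERE dt = "2020-11-15"`) or BigQuery rejects them.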

What's next

To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.