Data definition language (DDL) statements in GoogleSQL
Data definition language (DDL) statements let you create and modify BigQuery resources using GoogleSQL query syntax. You can use DDL commands to create, alter, and delete resources, such as the following:
- Datasets
- Tables
- Table schemas
- Table clones
- Table snapshots
- Views
- User-defined functions (UDFs)
- Indexes
- Capacity commitments and reservations
- Row-level access policies
Required permissions
To create a job that runs a DDL statement, you must have the bigquery.jobs.create permission for the project where you are running the job. Each DDL statement also requires specific permissions on the affected resources, which are documented under each statement.
IAM roles
The predefined IAM roles bigquery.user, bigquery.jobUser, and bigquery.admin include the required bigquery.jobs.create permission.
For more information about IAM roles in BigQuery, see Predefined roles and permissions or the IAM permissions reference.
Run DDL statements
You can run DDL statements by using the Google Cloud console, by using the bq command-line tool, by calling the jobs.query REST API, or programmatically using the BigQuery API client libraries.
Console
Go to the BigQuery page in the Google Cloud console.
Click Compose new query.
Enter the DDL statement into the Query editor text area. For example:
CREATE TABLE mydataset.newtable (x INT64)
Click Run.
bq
Enter the bq query command and supply the DDL statement as the query parameter. Set the use_legacy_sql flag to false.
bq query --use_legacy_sql=false \
'CREATE TABLE mydataset.newtable ( x INT64 )'
API
Call the jobs.query method and supply the DDL statement in the request body's query property.
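For example, a minimal request body for jobs.query might look like the following sketch (the table name is illustrative):

{
  "query": "CREATE TABLE mydataset.newtable (x INT64)",
  "useLegacySql": false
}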
DDL functionality extends the information returned by a Jobs resource. statistics.query.statementType includes the following additional values for DDL support:
- CREATE_TABLE
- CREATE_TABLE_AS_SELECT
- DROP_TABLE
- CREATE_VIEW
- DROP_VIEW
statistics.query has 2 additional fields:
- ddlOperationPerformed: The DDL operation performed, possibly dependent on the existence of the DDL target. Current values include:
  - CREATE: The query created the DDL target.
  - SKIP: No-op. Examples: CREATE TABLE IF NOT EXISTS was submitted, and the table exists. Or DROP TABLE IF EXISTS was submitted, and the table does not exist.
  - REPLACE: The query replaced the DDL target. Example: CREATE OR REPLACE TABLE was submitted, and the table already exists.
  - DROP: The query deleted the DDL target.
- ddlTargetTable: When you submit a CREATE TABLE/VIEW statement or a DROP TABLE/VIEW statement, the target table is returned as an object with 3 fields:
  - "projectId": string
  - "datasetId": string
  - "tableId": string
Java
Call the BigQuery.create() method to start a query job. Call the Job.waitFor() method to wait for the DDL query to finish.
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation. To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.QueryJobConfiguration;

// Sample to create a view using DDL
public class DDLCreateView {

  public static void runDDLCreateView() {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "MY_PROJECT_ID";
    String datasetId = "MY_DATASET_ID";
    String tableId = "MY_VIEW_ID";
    String ddl =
        "CREATE VIEW "
            + "`" + projectId + "." + datasetId + "." + tableId + "`"
            + " OPTIONS("
            + " expiration_timestamp=TIMESTAMP_ADD("
            + " CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),"
            + " friendly_name=\"new_view\","
            + " description=\"a view that expires in 2 days\","
            + " labels=[(\"org_unit\", \"development\")]"
            + " )"
            + " AS SELECT name, state, year, number"
            + " FROM `bigquery-public-data.usa_names.usa_1910_current`"
            + " WHERE state LIKE 'W%'";
    ddlCreateView(ddl);
  }

  public static void ddlCreateView(String ddl) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      QueryJobConfiguration config = QueryJobConfiguration.newBuilder(ddl).build();

      // Create the view; waitFor() blocks until the job completes.
      Job job = bigquery.create(JobInfo.of(config));
      job = job.waitFor();
      if (job.isDone()) {
        System.out.println("View created successfully");
      } else {
        System.out.println("View was not created");
      }
    } catch (BigQueryException | InterruptedException e) {
      System.out.println("View was not created. \n" + e.toString());
    }
  }
}
Node.js
Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
// Import the Google Cloud client library and create a client
const {BigQuery} = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

async function ddlCreateView() {
  // Creates a view via a DDL query

  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  // const projectId = "my_project"
  // const datasetId = "my_dataset"
  // const tableId = "my_new_view"

  const query = `
  CREATE VIEW \`${projectId}.${datasetId}.${tableId}\`
  OPTIONS(
      expiration_timestamp=TIMESTAMP_ADD(
          CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
      friendly_name="new_view",
      description="a view that expires in 2 days",
      labels=[("org_unit", "development")]
  )
  AS SELECT name, state, year, number
  FROM \`bigquery-public-data.usa_names.usa_1910_current\`
  WHERE state LIKE 'W%'`;

  // For all options, see https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
  const options = {
    query: query,
  };

  // Run the query as a job
  const [job] = await bigquery.createQueryJob(options);
  job.on('complete', metadata => {
    console.log(`Created new view ${tableId} via job ${metadata.id}`);
  });
}

Python
Call the Client.query() method to start a query job. Call the QueryJob.result() method to wait for the DDL query to finish.
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
# from google.cloud import bigquery
# project = 'my-project'
# dataset_id = 'my_dataset'
# table_id = 'new_view'
# client = bigquery.Client(project=project)

sql = """
CREATE VIEW `{}.{}.{}`
OPTIONS(
    expiration_timestamp=TIMESTAMP_ADD(
        CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
    friendly_name="new_view",
    description="a view that expires in 2 days",
    labels=[("org_unit", "development")]
)
AS SELECT name, state, year, number
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    WHERE state LIKE 'W%'
""".format(project, dataset_id, table_id)

job = client.query(sql)  # API request.
job.result()  # Waits for the query to finish.

print(
    'Created new view "{}.{}.{}".'.format(
        job.destination.project,
        job.destination.dataset_id,
        job.destination.table_id,
    )
)

On-demand query size calculation
If you use on-demand billing, BigQuery charges for data definition language (DDL) queries based on the number of bytes processed by the query.
| DDL statement | Bytes processed |
|---|---|
| CREATE TABLE | None. |
| CREATE TABLE ... AS SELECT ... | The sum of bytes processed for all the columns referenced from the tables scanned by the query. |
| CREATE VIEW | None. |
| DROP TABLE | None. |
| DROP VIEW | None. |
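To estimate these charges before running a statement, you can perform a dry run with the bq command-line tool, which reports the bytes a query would process without running it. A sketch, using the public Shakespeare sample table:

bq query --use_legacy_sql=false --dry_run \
'CREATE TABLE mydataset.newtable AS
SELECT corpus, word FROM `bigquery-public-data.samples.shakespeare`'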
For more information about cost estimation, see Estimate and control costs.
CREATE SCHEMA statement
Creates a new dataset.
Key Point: This SQL statement uses the term SCHEMA to refer to a logical collection of tables, views, and other resources. The equivalent concept in BigQuery is a dataset. In this context, SCHEMA does not refer to BigQuery table schemas.

Syntax
CREATE SCHEMA [ IF NOT EXISTS ]
[project_name.]dataset_name
[DEFAULT COLLATE collate_specification]
[OPTIONS(schema_option_list)]
Arguments
- IF NOT EXISTS: If any dataset exists with the same name, the CREATE statement has no effect.
- DEFAULT COLLATE collate_specification: When a new table is created in the dataset, the table inherits a default collation specification unless a collation specification is explicitly specified for a table or a column. If you remove or change this collation specification later with the ALTER SCHEMA statement, this will not change existing collation specifications in this dataset. If you want to update an existing collation specification in a dataset, you must alter the column that contains the specification.
- project_name: The name of the project where you are creating the dataset. Defaults to the project that runs this DDL statement.
- dataset_name: The name of the dataset to create.
- schema_option_list: A list of options for creating the dataset.
Details
The dataset is created in the location that you specify in the query settings. For more information, see Specifying your location.
For more information about creating a dataset, see Creating datasets. For information about quotas, see Dataset limits.
schema_option_list
The option list specifies options for the dataset. Specify the options in the following format:

NAME=VALUE, ...

The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| default_kms_key_name | STRING | Specifies the default Cloud KMS key for encrypting table data in this dataset. You can override this value when you create a table. |
| default_partition_expiration_days | FLOAT64 | Specifies the default expiration time, in days, for table partitions in this dataset. You can override this value when you create a table. |
| default_rounding_mode | STRING | Example: default_rounding_mode = "ROUND_HALF_EVEN". This specifies the default rounding mode that's used for values written to any new tables in the dataset. |
| default_table_expiration_days | FLOAT64 | Specifies the default expiration time, in days, for tables in this dataset. You can override this value when you create a table. |
| description | STRING | The description of the dataset. |
| failover_reservation | STRING | Associates the dataset to a reservation in the case of a failover scenario. |
| friendly_name | STRING | A descriptive name for the dataset. |
| is_case_insensitive | BOOL | TRUE if the dataset and its table names are case-insensitive, otherwise FALSE. By default, this is FALSE, which means the dataset and its table names are case-sensitive. |
| is_primary | BOOLEAN | Declares if the dataset is the primary replica. |
| labels | ARRAY<STRUCT<STRING, STRING>> | An array of labels for the dataset, expressed as key-value pairs. |
| location | STRING | The location in which to create the dataset. If you don't specify this option, the dataset is created in the location where the query runs. If you specify this option and also explicitly set the location for the query job, the two values must match; otherwise the query fails. |
| max_time_travel_hours | SMALLINT | Specifies the duration in hours of the time travel window for the dataset. The max_time_travel_hours value must be an integer expressed in multiples of 24 (48, 72, 96, 120, 144, 168) between 48 (2 days) and 168 (7 days). 168 hours is the default if this option isn't specified. |
| primary_replica | STRING | The replica name to set as the primary replica. |
| storage_billing_model | STRING | Alters the storage billing model for the dataset. Set the storage_billing_model value to PHYSICAL to use physical bytes when calculating storage charges, or to LOGICAL to use logical bytes. LOGICAL is the default. When you change a dataset's billing model, it takes 24 hours for the change to take effect. Once you change a dataset's storage billing model, you must wait 14 days before you can change the storage billing model again. |
| tags | ARRAY<STRUCT<STRING, STRING>> | An array of IAM tags for the dataset, expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.datasets.create | The project where you create the dataset. |
Examples
Creating a new dataset
The following example creates a dataset with a default table expiration and a set of labels.
CREATE SCHEMA mydataset
OPTIONS(
  location="us",
  default_table_expiration_days=3.75,
  labels=[("label1","value1"),("label2","value2")]
)
Creating a case-insensitive dataset
The following example creates a case-insensitive dataset. Both the dataset name and table names inside the dataset are case-insensitive.
CREATE SCHEMA mydataset
OPTIONS(is_case_insensitive=TRUE)
Creating a dataset with collation support
The following example creates a dataset with a collation specification.
CREATE SCHEMA mydataset
DEFAULT COLLATE 'und:ci'
CREATE TABLE statement
Creates a new table.
Syntax
CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] TABLE [ IF NOT EXISTS ] table_name
[(
  column | constraint_definition[, ...]
)]
[DEFAULT COLLATE collate_specification]
[PARTITION BY partition_expression]
[CLUSTER BY clustering_column_list]
[WITH CONNECTION connection_name]
[OPTIONS(table_option_list)]
[AS query_statement]

column :=
  column_definition

constraint_definition :=
  [primary_key]
  | [[CONSTRAINT constraint_name] foreign_key, ...]

primary_key :=
  PRIMARY KEY (column_name[, ...]) NOT ENFORCED

foreign_key :=
  FOREIGN KEY (column_name[, ...]) foreign_reference

foreign_reference :=
  REFERENCES primary_key_table (column_name[, ...]) NOT ENFORCED
Arguments
- OR REPLACE: Replaces any table with the same name if it exists. Cannot appear with IF NOT EXISTS.
- TEMP | TEMPORARY: Creates a temporary table.
- IF NOT EXISTS: If any table exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
- table_name: The name of the table to create. See Table path syntax. For temporary tables, do not include the project name or dataset name.
- column: The table's schema information.
- constraint_definition: An expression that defines a table constraint.
- collation_specification: When a new column is added to the table without an explicit collation specification, the column inherits this collation specification for STRING types. If you remove or change this collation specification later with the ALTER TABLE statement, this will not change existing collation specifications in this table. If you want to update an existing collation specification in a table, you must alter the column that contains the specification. If the table is part of a dataset, the default collation specification for this table overrides the default collation specification for the dataset.
- partition_expression: An expression that determines how to partition the table.
- clustering_column_list: A comma-separated list of column references that determine how to cluster the table. You cannot have collation on columns in this list.
- connection_name: Specifies a connection resource that has credentials for accessing the external data. Specify the connection name in the form PROJECT_ID.LOCATION.CONNECTION_ID. If the project ID or location contains a dash, enclose the connection name in backticks (`). To use a default connection, specify DEFAULT instead of the connection string containing PROJECT_ID.LOCATION.CONNECTION_ID.
- table_option_list: A list of options for creating the table.
- query_statement: The query from which the table should be created. For the query syntax, see SQL syntax reference. If a collation specification is used on this table, collation passes through this query statement.
- primary_key: An expression that defines a primary key table constraint. BigQuery only supports unenforced primary keys.
- foreign_key: An expression that defines a foreign key table constraint. BigQuery only supports unenforced foreign keys.
Details
CREATE TABLE statements must comply with the following rules:
- Only one CREATE statement is allowed.
- Either the column list, the AS query_statement clause, or both must be present.
- When both the column list and the AS query_statement clause are present, BigQuery ignores the names in the AS query_statement clause and matches the columns with the column list by position.
- When the AS query_statement clause is present and the column list is absent, BigQuery determines the column names and types from the AS query_statement clause.
- Column names must be specified either through the column list, the AS query_statement clause, or the schema of the table in the LIKE clause.
- Duplicate column names are not allowed.
- When both the LIKE and the AS query_statement clause are present, the column list in the query statement must match the columns of the table referenced by the LIKE clause.
- Table names are case-sensitive unless the dataset they belong to is not. To create a case-insensitive dataset, see Creating a case-insensitive dataset. To alter a dataset to make it case-insensitive, see Turning on case insensitivity for a dataset.
Limitations:
- It is not possible to create an ingestion-time partitioned table from the result of a query. Instead, use a CREATE TABLE DDL statement to create the table, and then use an INSERT DML statement to insert data into it.
- It is not possible to use the OR REPLACE modifier to replace a table with a different kind of partitioning. Instead, DROP the table, and then use a CREATE TABLE ... AS SELECT ... statement to recreate it, as shown in the sketch after this list.
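For example, the following sketch (with hypothetical table names) replaces a table partitioned by day with one partitioned by month:

-- Recreate the data under the new partitioning scheme.
CREATE TABLE mydataset.mytable_monthly
PARTITION BY DATE_TRUNC(transaction_date, MONTH)
AS SELECT * FROM mydataset.mytable;

-- Then drop the original table.
DROP TABLE mydataset.mytable;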
This statement supports the following variants, which have the same limitations:
- CREATE TABLE LIKE: Create a table with the same schema as an existing table.
- CREATE TABLE COPY: Create a table by copying schema and data from an existing table.
column
(column_name column_schema[, ...]) contains the table's schema information in a comma-separated list.
Note: Options cannot be set on ARRAY or STRUCT elements.

column :=
  column_name column_schema

column_schema :=
  {
    simple_type
    | STRUCT<field_list>
    | ARRAY<array_element_schema>
  }
  [PRIMARY KEY NOT ENFORCED | REFERENCES table_name(column_name) NOT ENFORCED]
  [DEFAULT default_expression
    | GENERATED ALWAYS AS (generation_expression) STORED OPTIONS(generation_option_list)]
  [NOT NULL]
  [OPTIONS(column_option_list)]

simple_type :=
  { data_type | STRING COLLATE collate_specification }

field_list :=
  field_name column_schema [, ...]

array_element_schema :=
  { simple_type | STRUCT<field_list> } [NOT NULL]
- column_name is the name of the column. A column name:
  - Must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_)
  - Must start with a letter or underscore
  - Can be up to 300 characters
- column_schema: Similar to a data type, but supports an optional NOT NULL constraint for types other than ARRAY. column_schema also supports options on top-level columns and STRUCT fields. column_schema can be used only in the column definition list of CREATE TABLE statements. It cannot be used as a type in expressions.
- simple_type: Any supported data type aside from STRUCT and ARRAY. If simple_type is a STRING, it supports an additional clause for collation, which defines how a resulting STRING can be compared and sorted. The syntax looks like this:

  STRING COLLATE collate_specification

  If you have DEFAULT COLLATE collate_specification assigned to the table, the collation specification for a column overrides the specification for the table.
- default_expression: The default value assigned to the column. You cannot specify DEFAULT if GENERATED ALWAYS AS is specified.
- generation_expression: (Preview) An expression for an automatically generated embedding column. Setting this field enables autonomous embedding generation on the table. The only supported generation_expression syntax is a call to the AI.EMBED function.
  - You can't specify GENERATED ALWAYS AS if DEFAULT is specified.
  - The connection_id argument to AI.EMBED is required when used in a generation expression.
  - The type of the column must be STRUCT<result ARRAY<FLOAT64>, status STRING>.
- generation_option_list: The options for a generated column. The only supported option is asynchronous = TRUE.
- field_list: Represents the fields in a struct.
- field_name: The name of the struct field. Struct field names have the same restrictions as column names.
- NOT NULL: When the NOT NULL constraint is present for a column or field, the column or field is created with REQUIRED mode. Conversely, when the NOT NULL constraint is absent, the column or field is created with NULLABLE mode. Columns and fields of ARRAY type do not support the NOT NULL modifier. For example, a column_schema of ARRAY<INT64> NOT NULL is invalid, since ARRAY columns have REPEATED mode and can be empty but cannot be NULL. An array element in a table can never be NULL, regardless of whether the NOT NULL constraint is specified. For example, ARRAY<INT64> is equivalent to ARRAY<INT64 NOT NULL>. The NOT NULL attribute of a table's column_schema does not propagate through queries over the table. If table T contains a column declared as x INT64 NOT NULL, for example, CREATE TABLE dataset.newtable AS SELECT x FROM T creates a table named dataset.newtable in which x is NULLABLE.
partition_expression
PARTITION BY is an optional clause that controls table and vector index partitioning. partition_expression is an expression that determines how to partition the table or vector index. The partition expression can contain the following values:
- _PARTITIONDATE. Partition by ingestion time with daily partitions. This syntax cannot be used with the AS query_statement clause.
- DATE(_PARTITIONTIME). Equivalent to _PARTITIONDATE. This syntax cannot be used with the AS query_statement clause.
- <date_column>. Partition by a DATE column with daily partitions.
- DATE({ <timestamp_column> | <datetime_column> }). Partition by a TIMESTAMP or DATETIME column with daily partitions.
- DATETIME_TRUNC(<datetime_column>, { DAY | HOUR | MONTH | YEAR }). Partition by a DATETIME column with the specified partitioning type.
- TIMESTAMP_TRUNC(<timestamp_column>, { DAY | HOUR | MONTH | YEAR }). Partition by a TIMESTAMP column with the specified partitioning type.
- TIMESTAMP_TRUNC(_PARTITIONTIME, { DAY | HOUR | MONTH | YEAR }). Partition by ingestion time with the specified partitioning type. This syntax cannot be used with the AS query_statement clause.
- DATE_TRUNC(<date_column>, { MONTH | YEAR }). Partition by a DATE column with the specified partitioning type.
- RANGE_BUCKET(<int64_column>, GENERATE_ARRAY(<start>, <end>[, <interval>])). Partition by an integer column with the specified range, where:
  - start is the start of range partitioning, inclusive.
  - end is the end of range partitioning, exclusive.
  - interval is the width of each range within the partition. Defaults to 1.
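For example, the following sketch (with hypothetical column names) creates a table whose partitions each cover a range of 10 customer IDs, from 0 up to 100:

CREATE TABLE mydataset.customer_totals
(
  customer_id INT64,
  total NUMERIC
)
PARTITION BY RANGE_BUCKET(customer_id, GENERATE_ARRAY(0, 100, 10))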
clustering_column_list
CLUSTER BY is an optional clause that controls table clustering. clustering_column_list is a comma-separated list that determines how to cluster the table. The clustering column list can contain a list of up to four clustering columns.
table_option_list
The option list lets you set table options such as a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a table option list in the following format:
NAME=VALUE, ...
NAME and VALUE must be one of the following combinations:
| NAME | VALUE | Details |
|---|---|---|
| expiration_timestamp | TIMESTAMP | Example: expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC". This property is equivalent to the expirationTime table resource property. |
| partition_expiration_days | FLOAT64 | Example: partition_expiration_days=7. Sets the partition expiration in days. For more information, see Set the partition expiration. By default, partitions don't expire. This property is equivalent to the timePartitioning.expirationMs table resource property but uses days instead of milliseconds. One day is equivalent to 86400000 milliseconds, or 24 hours. This property can only be set if the table is partitioned. |
| require_partition_filter | BOOL | Example: require_partition_filter=true. Specifies whether queries on this table must include a predicate filter that filters on the partitioning column. For more information, see Set partition filter requirements. The default value is false. This property is equivalent to the timePartitioning.requirePartitionFilter table resource property. This property can only be set if the table is partitioned. |
| kms_key_name | STRING | Example: kms_key_name="projects/project_id/locations/location/keyRings/keyring/cryptoKeys/key". This property is equivalent to the encryptionConfiguration.kmsKeyName table resource property. See more details about Protecting data with Cloud KMS keys. |
| friendly_name | STRING | Example: friendly_name="my_table". This property is equivalent to the friendlyName table resource property. |
| description | STRING | Example: description="a table that expires in 2025". This property is equivalent to the description table resource property. |
| labels | ARRAY<STRUCT<STRING, STRING>> | Example: labels=[("org_unit", "development")]. This property is equivalent to the labels table resource property. |
| default_rounding_mode | STRING | Example: default_rounding_mode = "ROUND_HALF_EVEN". This specifies the default rounding mode that's used for values written to any new NUMERIC or BIGNUMERIC columns in the table. This property is equivalent to the defaultRoundingMode table resource property. |
| enable_change_history | BOOL | In preview. Example: enable_change_history=TRUE. Set this property to TRUE to capture change history on the table, which you can then view with the CHANGES function. |
| max_staleness | INTERVAL | Example: max_staleness=INTERVAL "4:0:0" HOUR TO SECOND. The maximum interval behind the current time where it's acceptable to read stale data. For example, with change data capture, when this option is set, the table copy operation is denied if data is more stale than the max_staleness value. |
| enable_fine_grained_mutations | BOOL | In preview. Example: enable_fine_grained_mutations=TRUE. Set this property to TRUE to enable fine-grained mutations on the table. |
| storage_uri | STRING | In preview. Example: storage_uri="gs://bucket/path". A fully qualified location prefix for the external folder where data is stored. Supports Cloud Storage paths. Required for managed tables. |
| file_format | STRING | In preview. Example: file_format=PARQUET. The open-source file format in which the table data is stored. Only PARQUET is supported. Required for managed tables. The default is PARQUET. |
| table_format | STRING | In preview. Example: table_format=ICEBERG. The open table format in which metadata-only snapshots are stored. Only ICEBERG is supported. Required for managed tables. The default is ICEBERG. |
| tags | ARRAY<STRUCT<STRING, STRING>> | An array of IAM tags for the table, expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
VALUE is a constant expression containing only literals, query parameters, and scalar functions.
The constant expression cannot contain:

- A reference to a table
- Subqueries or SQL statements such as SELECT, CREATE, or UPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions:
  - ARRAY_TO_STRING
  - REPLACE
  - REGEXP_REPLACE
  - RAND
  - FORMAT
  - LPAD
  - RPAD
  - REPEAT
  - SESSION_USER
  - GENERATE_ARRAY
  - GENERATE_DATE_ARRAY
If VALUE evaluates to NULL, the corresponding option NAME in the CREATE TABLE statement is ignored.
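For example, in the following sketch the description option evaluates to NULL, so the table is created as if no description had been specified:

CREATE TABLE mydataset.newtable (x INT64)
OPTIONS(description=NULL)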
column_option_list
Specify a column option list in the following format:
NAME=VALUE, ...
NAME and VALUE must be one of the following combinations:
| NAME | VALUE | Details |
|---|---|---|
| description | STRING | Example: description="a unique id". This property is equivalent to the schema.fields[].description table resource property. |
| rounding_mode | STRING | Example: rounding_mode = "ROUND_HALF_EVEN". This specifies the rounding mode that's used for values written to a NUMERIC or BIGNUMERIC type column or STRUCT field. This property is equivalent to the roundingMode table resource property. |
| data_policies | ARRAY<STRING> | Applies a data policy to a column in a table (Preview). |
VALUE is a constant expression containing only literals, query parameters, and scalar functions.
The constant expression cannot contain:

- A reference to a table
- Subqueries or SQL statements such as SELECT, CREATE, or UPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions:
  - ARRAY_TO_STRING
  - REPLACE
  - REGEXP_REPLACE
  - RAND
  - FORMAT
  - LPAD
  - RPAD
  - REPEAT
  - SESSION_USER
  - GENERATE_ARRAY
  - GENERATE_DATE_ARRAY
Setting the VALUE replaces the existing value of that option for the column, if there was one. Setting the VALUE to NULL clears the column's value for that option.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.create | The dataset where you create the table. |
In addition, the OR REPLACE clause requires bigquery.tables.update and bigquery.tables.updateData permissions.
If the OPTIONS clause includes any expiration options, then the bigquery.tables.delete permission is also required.
Examples
Creating a new table
The following example creates a partitioned table named newtable in mydataset:
CREATE TABLE mydataset.newtable
(
  x INT64 OPTIONS(description="An optional INTEGER field"),
  y STRUCT<
    a ARRAY<STRING> OPTIONS(description="A repeated STRING field"),
    b BOOL
  >
)
PARTITION BY _PARTITIONDATE
OPTIONS(
  expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC",
  partition_expiration_days=1,
  description="a table that expires in 2025, with each partition living for 24 hours",
  labels=[("org_unit", "development")]
)
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.
If the table name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The table uses the following partition_expression to partition the table: PARTITION BY _PARTITIONDATE. This expression partitions the table using the date in the _PARTITIONDATE pseudocolumn.
The table schema contains two columns:
- x: An integer, with description "An optional INTEGER field"
- y: A STRUCT containing two columns:
- a: An array of strings, with description "A repeated STRING field"
- b: A boolean
The table option list specifies the:
- Table expiration time: January 1, 2025 at 00:00:00 UTC
- Partition expiration time: 1 day
- Description: A table that expires in 2025
- Label: org_unit = development
Creating a new table from an existing table
The following example creates a table named top_words in mydataset from a query:
CREATE TABLE mydataset.top_words
OPTIONS(
  description="Top ten words per Shakespeare corpus"
) AS
SELECT
  corpus,
  ARRAY_AGG(STRUCT(word, word_count) ORDER BY word_count DESC LIMIT 10) AS top_words
FROM `bigquery-public-data`.samples.shakespeare
GROUP BY corpus;
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.top_words, your table qualifier might be `myproject.mydataset.top_words`.
If the table name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The table schema contains 2 columns:
- corpus: Name of a Shakespeare corpus
- top_words: An ARRAY of STRUCTs containing 2 fields: word (a STRING) and word_count (an INT64 with the word count)

Note: When you examine the table schema in the Google Cloud console, a STRUCT is displayed as a RECORD column, and an ARRAY is displayed as a REPEATED column. The STRUCT and ARRAY data types are used to create nested and repeated data in BigQuery. For more information, see Specifying nested and repeated fields.
The table option list specifies the:
- Description: Top ten words per Shakespeare corpus
Creating a table only if the table doesn't exist
The following example creates a table named newtable in mydataset only if no table named newtable exists in mydataset. If the table name exists in the dataset, no error is returned, and no action is taken.
CREATE TABLE IF NOT EXISTS mydataset.newtable
(
  x INT64,
  y STRUCT<
    a ARRAY<STRING>,
    b BOOL
  >
)
OPTIONS(
  expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC",
  description="a table that expires in 2025",
  labels=[("org_unit", "development")]
)
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.
The table schema contains 2 columns:
- x: An integer
- y: A STRUCT containing a (an array of strings) and b (a boolean)
Note: When you examine the table schema in the Google Cloud console, a STRUCT is displayed as a RECORD, and an ARRAY is displayed as REPEATED. The STRUCT and ARRAY data types are used to create nested and repeated data in BigQuery. For more information, see Specifying nested and repeated fields.
The table option list specifies the:
- Expiration time: January 1, 2025 at 00:00:00 UTC
- Description: A table that expires in 2025
- Label: org_unit = development
Creating or replacing a table
The following example creates a table named newtable in mydataset, and if newtable exists in mydataset, it is overwritten with an empty table.
CREATE OR REPLACE TABLE mydataset.newtable
(
  x INT64,
  y STRUCT<
    a ARRAY<STRING>,
    b BOOL
  >
)
OPTIONS(
  expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC",
  description="a table that expires in 2025",
  labels=[("org_unit", "development")]
)
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.
The table schema contains 2 columns:
- x: An integer
- y: A STRUCT containing a (an array of strings) and b (a boolean)
Note: When you examine the table schema in the Google Cloud console, a STRUCT is displayed as a RECORD, and an ARRAY is displayed as REPEATED. The STRUCT and ARRAY data types are used to create nested and repeated data in BigQuery. For more information, see Specifying nested and repeated fields.
The table option list specifies the:
- Expiration time: January 1, 2025 at 00:00:00 UTC
- Description: A table that expires in 2025
- Label: org_unit = development
Creating a table with REQUIRED columns
The following example creates a table named newtable in mydataset. The NOT NULL modifier in the column definition list of a CREATE TABLE statement specifies that a column or field is created in REQUIRED mode.
CREATE TABLE mydataset.newtable
(
  x INT64 NOT NULL,
  y STRUCT<
    a ARRAY<STRING>,
    b BOOL NOT NULL,
    c FLOAT64
  > NOT NULL,
  z STRING
)
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.
If the table name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The table schema contains 3 columns:
- x: A REQUIRED integer
- y: A REQUIRED STRUCT containing a (an array of strings), b (a REQUIRED boolean), and c (a NULLABLE float)
- z: A NULLABLE string

Note: When you examine the table schema in the Google Cloud console, a STRUCT is displayed as a RECORD, and an ARRAY is displayed as REPEATED. The STRUCT and ARRAY data types are used to create nested and repeated data in BigQuery. For more information, see Specifying nested and repeated fields.
Creating a table with collation support
The following examples create a table named newtable in mydataset with columns a, b, c, and a struct with fields x and y.
All STRING column schemas in this table are collated with 'und:ci':
CREATE TABLE mydataset.newtable
(
  a STRING,
  b STRING,
  c STRUCT<
    x FLOAT64,
    y ARRAY<STRING>
  >
)
DEFAULT COLLATE 'und:ci';
Only b and y are collated with 'und:ci':
CREATE TABLE mydataset.newtable
(
  a STRING,
  b STRING COLLATE 'und:ci',
  c STRUCT<
    x FLOAT64,
    y ARRAY<STRING COLLATE 'und:ci'>
  >
);
Creating a table with parameterized data types
The following example creates a table named newtable in mydataset. The parameters in parentheses specify that the column contains a parameterized data type. See Parameterized Data Types for more information about parameterized types.
CREATE TABLE mydataset.newtable
(
  x STRING(10),
  y STRUCT<
    a ARRAY<BYTES(5)>,
    b NUMERIC(15, 2) OPTIONS(rounding_mode='ROUND_HALF_EVEN'),
    c FLOAT64
  >,
  z BIGNUMERIC(35)
)
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. Instead of mydataset.newtable, your table qualifier should be `myproject.mydataset.newtable`.
If the table name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The table schema contains 3 columns:
- x: A parameterized string with a maximum length of 10
- y: A STRUCT containing a (an array of parameterized bytes with a maximum length of 5), b (a parameterized NUMERIC with a maximum precision of 15, maximum scale of 2, and rounding mode set to 'ROUND_HALF_EVEN'), and c (a float)
- z: A parameterized BIGNUMERIC with a maximum precision of 35 and maximum scale of 0
Creating a partitioned table
The following example creates a partitioned table named newtable in mydataset using a DATE column:
CREATE TABLE mydataset.newtable (transaction_id INT64, transaction_date DATE)
PARTITION BY transaction_date
OPTIONS(
  partition_expiration_days=3,
  description="a table partitioned by transaction_date"
)
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.
The table schema contains 2 columns:
- transaction_id: An integer
- transaction_date: A date
The table option list specifies the:
- Partition expiration: Three days
- Description: A table partitioned by transaction_date
Creating a partitioned table from the result of a query
The following example creates a partitioned table named days_with_rain in mydataset using a DATE column:
CREATE TABLE mydataset.days_with_rain
PARTITION BY date
OPTIONS(
  partition_expiration_days=365,
  description="weather stations with precipitation, partitioned by day"
) AS
SELECT
  DATE(CAST(year AS INT64), CAST(mo AS INT64), CAST(da AS INT64)) AS date,
  (SELECT ANY_VALUE(name) FROM `bigquery-public-data.noaa_gsod.stations` AS stations
   WHERE stations.usaf = stn) AS station_name,  -- Stations can have multiple names
  prcp
FROM `bigquery-public-data.noaa_gsod.gsod2017` AS weather
WHERE prcp != 99.9  -- Filter unknown values
  AND prcp > 0      -- Filter stations/days with no precipitation

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.days_with_rain, your table qualifier might be `myproject.mydataset.days_with_rain`.
The table schema contains 3 columns:

- date: The DATE of data collection
- station_name: The name of the weather station as a STRING
- prcp: The amount of precipitation in inches as a FLOAT64
The table option list specifies the:
- Partition expiration: One year
- Description: Weather stations with precipitation, partitioned by day
Creating a clustered table
Example 1
The following example creates a clustered table named myclusteredtable in mydataset. The table is a partitioned table, partitioned by a truncated TIMESTAMP column and clustered by a STRING column named customer_id.
CREATE TABLE mydataset.myclusteredtable
(
  input_timestamp TIMESTAMP,
  customer_id STRING,
  transaction_amount NUMERIC
)
PARTITION BY TIMESTAMP_TRUNC(input_timestamp, HOUR)
CLUSTER BY customer_id
OPTIONS(
  partition_expiration_days=3,
  description="a table clustered by customer_id"
)
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.
The table schema contains 3 columns:
- input_timestamp: The time of data collection as a TIMESTAMP
- customer_id: The customer ID as a STRING
- transaction_amount: The transaction amount as NUMERIC
The table option list specifies the:
- Partition expiration: 3 days
- Description: A table clustered by customer_id
Example 2
The following example creates a clustered table named myclusteredtable in mydataset. The table is an ingestion-time partitioned table.
CREATE TABLE mydataset.myclusteredtable
(
  customer_id STRING,
  transaction_amount NUMERIC
)
PARTITION BY DATE(_PARTITIONTIME)
CLUSTER BY customer_id
OPTIONS(
  partition_expiration_days=3,
  description="a table clustered by customer_id"
)
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.
The table schema contains 2 columns:
- customer_id: The customer ID as a STRING
- transaction_amount: The transaction amount as NUMERIC
The table option list specifies the:
- Partition expiration: 3 days
- Description: A table clustered by customer_id
Example 3
The following example creates a clustered table named myclusteredtable in mydataset. The table is not partitioned.
CREATE TABLE mydataset.myclusteredtable
(
  customer_id STRING,
  transaction_amount NUMERIC
)
CLUSTER BY customer_id
OPTIONS(
  description="a table clustered by customer_id"
)
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.
The table schema contains 2 columns:
- customer_id: The customer ID as a STRING
- transaction_amount: The transaction amount as NUMERIC
The table option list specifies the:
- Description: A table clustered by customer_id
Creating a table with autonomous embedding generation
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
The following example creates a table named embedded_table in mydataset with an autonomous embedding generation column embedding that generates embeddings from the content column:
CREATE TABLE mydataset.embedded_table (
  id INT64,
  content STRING,
  embedding STRUCT<result ARRAY<FLOAT64>, status STRING>
    GENERATED ALWAYS AS (
      AI.EMBED(
        content,
        connection_id => "US.embed_connection",
        endpoint => "text-embedding-005"))
    STORED OPTIONS(asynchronous=true));
Creating a clustered table from the result of a query
Example 1
The following example creates a partitioned and clustered table named myclusteredtable in mydataset using the result of a query.
CREATE TABLE mydataset.myclusteredtable
(
  input_timestamp TIMESTAMP,
  customer_id STRING,
  transaction_amount NUMERIC
)
PARTITION BY DATE(input_timestamp)
CLUSTER BY customer_id
OPTIONS(
  partition_expiration_days=3,
  description="a table clustered by customer_id"
)
AS SELECT * FROM mydataset.myothertable
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.
The table schema contains 3 columns:
- input_timestamp: The time of data collection as a TIMESTAMP
- customer_id: The customer ID as a STRING
- transaction_amount: The transaction amount as NUMERIC
The table option list specifies the:
- Partition expiration: 3 days
- Description: A table clustered by customer_id
Example 2
The following example creates a clustered table named myclusteredtable in mydataset using the result of a query. The table is not partitioned.
CREATE TABLE mydataset.myclusteredtable
(
  customer_id STRING,
  transaction_amount NUMERIC
)
CLUSTER BY customer_id
OPTIONS(
  description="a table clustered by customer_id"
)
AS SELECT * FROM mydataset.myothertable
If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.
The table schema contains 2 columns:
- customer_id: The customer ID as a STRING
- transaction_amount: The transaction amount as NUMERIC
The table option list specifies the:
- Description: A table clustered by customer_id
Creating a temporary table
The following example creates a temporary table named Example and inserts values into it.
CREATE TEMP TABLE Example
(
  x INT64,
  y STRING
);

INSERT INTO Example
VALUES (5, 'foo');

INSERT INTO Example
VALUES (6, 'bar');

SELECT *
FROM Example;
This script returns the following output:
+-----+---+-----+
| Row | x |  y  |
+-----+---+-----+
| 1   | 5 | foo |
| 2   | 6 | bar |
+-----+---+-----+

Load data across clouds
Example 1
Suppose you have a BigLake table named myawsdataset.orders that references data from Amazon S3. You want to transfer data from that table to a BigQuery table myotherdataset.shipments in the US multi-region.
First, display information about the myawsdataset.orders table:
bq show myawsdataset.orders;

The output is similar to the following:
  Last modified              Schema                   Type       Total URIs   Expiration
 ----------------- -------------------------- ---------- ------------ -----------
  31 Oct 17:40:28   |- l_orderkey: integer     EXTERNAL   1
                    |- l_partkey: integer
                    |- l_suppkey: integer
                    |- l_linenumber: integer
                    |- l_returnflag: string
                    |- l_linestatus: string
                    |- l_commitdate: date
Next, display information about the myotherdataset.shipments table:
bq show myotherdataset.shipments
The output is similar to the following. Some columns are omitted to simplify the output.
  Last modified              Schema                  Total Rows   Total Bytes   Expiration   Time Partitioning   Clustered Fields   Total Logical
 ----------------- --------------------------- ------------ ------------- ------------ ------------------- ------------------ ---------------
  31 Oct 17:34:31   |- l_orderkey: integer      3086653      210767042                                                          210767042
                    |- l_partkey: integer
                    |- l_suppkey: integer
                    |- l_commitdate: date
                    |- l_shipdate: date
                    |- l_receiptdate: date
                    |- l_shipinstruct: string
                    |- l_shipmode: string
Now, using the CREATE TABLE AS SELECT statement you can selectively load data to the myotherdataset.orders table in the US multi-region:
CREATE OR REPLACE TABLE myotherdataset.orders
PARTITION BY DATE_TRUNC(l_commitdate, YEAR)
AS SELECT * FROM myawsdataset.orders
WHERE EXTRACT(YEAR FROM l_commitdate) = 1992;
Note: If the query fails with a ResourceExhausted error, retry after some time. If the issue persists, you can contact support.

You can then perform a join operation with the newly created table:
SELECT
  orders.l_orderkey,
  orders.l_partkey,
  orders.l_suppkey,
  orders.l_commitdate,
  orders.l_returnflag,
  shipments.l_shipmode,
  shipments.l_shipinstruct
FROM myotherdataset.shipments
JOIN `myotherdataset.orders` AS orders
  ON orders.l_orderkey = shipments.l_orderkey
  AND orders.l_partkey = shipments.l_partkey
  AND orders.l_suppkey = shipments.l_suppkey
WHERE orders.l_returnflag = 'R';  -- 'R' means refunded.
When new data is available, append the 1993 data to the destination table using the INSERT INTO SELECT statement:
INSERT INTO myotherdataset.orders
SELECT * FROM myawsdataset.orders
WHERE EXTRACT(YEAR FROM l_commitdate) = 1993;
Example 2
The following example inserts data into an ingestion-time partitioned table:
CREATE TABLE mydataset.orders (id STRING, numeric_id INT64)
PARTITION BY _PARTITIONDATE;
After creating a partitioned table, you can insert data into the ingestion-timepartitioned table:
INSERT INTO mydataset.orders (_PARTITIONTIME, id, numeric_id)
SELECT TIMESTAMP("2023-01-01"), id, numeric_id
FROM mydataset.ordersof23
WHERE numeric_id > 4000000;
CREATE TABLE LIKE statement
Creates a new table with all of the same metadata as another table.
Syntax
CREATE [ OR REPLACE ] TABLE [ IF NOT EXISTS ] table_name
LIKE [[project_name.]dataset_name.]source_table_name
...
[OPTIONS(table_option_list)]
Details
This statement is a variant of the CREATE TABLE statement and has the same limitations. Other than the use of the LIKE clause in place of a column list, the syntax is identical to the CREATE TABLE syntax.
The CREATE TABLE LIKE statement copies only the metadata of the source table. You can use the AS query_statement clause to include data in the new table.
The new table has no relationship to the source table after creation; thus modifications to the source table will not propagate to the new table.
By default, the new table inherits partitioning, clustering, and options metadata from the source table. You can customize metadata in the new table by using the optional clauses in the SQL statement. For example, if you want to specify a different set of options for the new table, then include the OPTIONS clause with a list of options and values. This behavior matches that of ALTER TABLE SET OPTIONS.
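For example, the following sketch (with illustrative table names) copies sourcetable's schema but overrides its description:

CREATE TABLE mydataset.newtable
LIKE mydataset.sourcetable
OPTIONS(description="Schema copied from sourcetable, with a new description")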
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.create | The dataset where you create the table. |
| bigquery.tables.get | The source table. |
In addition, the OR REPLACE clause requires bigquery.tables.update and bigquery.tables.updateData permissions.
If the OPTIONS clause includes any expiration options, then the bigquery.tables.delete permission is also required.
Examples
Example 1
The following example creates a new table named newtable in mydataset with the same metadata as sourcetable:
CREATE TABLE mydataset.newtable
LIKE mydataset.sourcetable
Example 2
The following example creates a new table named newtable in mydataset with the same metadata as sourcetable and the data from the SELECT statement:
CREATE TABLE mydataset.newtable
LIKE mydataset.sourcetable
AS SELECT * FROM mydataset.myothertable
CREATE TABLE COPY statement
Creates a table that has the same metadata and data as another table. The source table can be a table, a table clone, or a table snapshot.
Syntax
CREATE [ OR REPLACE ] TABLE [ IF NOT EXISTS ] table_name
COPY source_table_name
...
[OPTIONS(table_option_list)]
Details
This statement is a variant of the CREATE TABLE statement and has the same limitations. Other than the use of the COPY clause in place of a column list, the syntax is identical to the CREATE TABLE syntax.
The CREATE TABLE COPY statement copies both the metadata and data from the source table.
The new table inherits partitioning and clustering from the source table. By default, the table options metadata from the source table are also inherited, but you can override table options by using the OPTIONS clause. The behavior is equivalent to running ALTER TABLE SET OPTIONS after the table is copied.
The new table has no relationship to the source table after creation;modifications to the source table are not propagated to the new table.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.create | The dataset where you create the table copy. |
| bigquery.tables.get | The source table. |
| bigquery.tables.getData | The source table. |
In addition, the OR REPLACE clause requires bigquery.tables.update and bigquery.tables.updateData permissions.
If the OPTIONS clause includes any expiration options, then the bigquery.tables.delete permission is also required.
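As a minimal sketch (with illustrative table names), the following statement copies both the schema and data of a source table and overrides the description:

CREATE TABLE mydataset.newtable
COPY mydataset.sourcetable
OPTIONS(description="A copy of sourcetable")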
CREATE SNAPSHOT TABLE statement
Creates a table snapshot based on a source table. The source table can be a table, a table clone, or a table snapshot.
Syntax
CREATE SNAPSHOT TABLE [ IF NOT EXISTS ] table_snapshot_name
CLONE source_table_name
[FOR SYSTEM_TIME AS OF time_expression]
[OPTIONS(snapshot_option_list)]
Arguments
- IF NOT EXISTS: If a table snapshot or other table resource exists with the same name, the CREATE statement has no effect.
- table_snapshot_name: The name of the table snapshot that you want to create. The table snapshot name must be unique per dataset. See Table path syntax.
- source_table_name: The name of the table that you want to snapshot or the table snapshot that you want to copy. See Table path syntax. If the source table is a standard table, then BigQuery creates a table snapshot of the source table. If the source table is a table snapshot, then BigQuery creates a copy of the table snapshot.
- FOR SYSTEM_TIME AS OF: Lets you select the version of the table that was current at the time specified by timestamp_expression. It can only be used when creating a snapshot of a table; it can't be used when making a copy of a table snapshot.
- snapshot_option_list: Additional table snapshot creation options such as a label and an expiration time.
Details
CREATE SNAPSHOT TABLE statements must comply with the following rules:
- Only one CREATE statement is allowed.
- The source table must be one of the following:
  - A table
  - A table clone
  - A table snapshot
- The FOR SYSTEM_TIME AS OF clause can only be used when creating a snapshot of a table or table clone; it can't be used when making a copy of a table snapshot.
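For example, the following sketch (with hypothetical table names) creates a snapshot of a table as it existed one hour ago:

CREATE SNAPSHOT TABLE mydataset.mytablesnapshot
CLONE mydataset.mytable
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)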
snapshot_option_list
The option list lets you set table snapshot options such as a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a table snapshot option list in the following format:
NAME=VALUE, ...
NAME and VALUE must be one of the following combinations:
| NAME | VALUE | Details |
|---|---|---|
| expiration_timestamp | TIMESTAMP | Example: expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC". This property is equivalent to the expirationTime table resource property. |
| friendly_name | STRING | Example: friendly_name="my_table_snapshot". This property is equivalent to the friendlyName table resource property. |
| description | STRING | Example: description="A table snapshot that expires in 2025". This property is equivalent to the description table resource property. |
| labels | ARRAY<STRUCT<STRING, STRING>> | Example: labels=[("org_unit", "development")]. This property is equivalent to the labels table resource property. |
| tags | ARRAY<STRUCT<STRING, STRING>> | An array of IAM tags expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
VALUE is a constant expression that contains only literals, query parameters, and scalar functions.
The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as SELECT, CREATE, and UPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions:
  - ARRAY_TO_STRING
  - REPLACE
  - REGEXP_REPLACE
  - RAND
  - FORMAT
  - LPAD
  - RPAD
  - REPEAT
  - SESSION_USER
  - GENERATE_ARRAY
  - GENERATE_DATE_ARRAY
If VALUE evaluates to NULL, the corresponding option NAME in the CREATE SNAPSHOT TABLE statement is ignored.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.create | The dataset where you create the table snapshot. |
| bigquery.tables.createSnapshot | The source table. |
| bigquery.tables.get | The source table. |
| bigquery.tables.getData | The source table. |
Examples
Create a table snapshot: fail if it already exists
The following example creates a table snapshot of the table myproject.mydataset.mytable. The table snapshot is created in the dataset mydataset and is named mytablesnapshot:
CREATE SNAPSHOT TABLE `myproject.mydataset.mytablesnapshot`
CLONE `myproject.mydataset.mytable`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name="my_table_snapshot",
  description="A table snapshot that expires in 2 days",
  labels=[("org_unit", "development")]
)
If the table snapshot name already exists in the dataset, then the followingerror is returned:
Already Exists: myproject.mydataset.mytablesnapshot
The table snapshot option list specifies the following:
- Expiration time: 48 hours after the time the table snapshot is created
- Friendly name: my_table_snapshot
- Description: A table snapshot that expires in 2 days
- Label: org_unit = development
Create a table snapshot: ignore if it already exists
The following example creates a table snapshot of the table myproject.mydataset.mytable. The table snapshot is created in the dataset mydataset and is named mytablesnapshot:
CREATE SNAPSHOT TABLE IF NOT EXISTS `myproject.mydataset.mytablesnapshot`
CLONE `myproject.mydataset.mytable`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name="my_table_snapshot",
  description="A table snapshot that expires in 2 days",
  labels=[("org_unit", "development")]
)
The table snapshot option list specifies the following:
- Expiration time: 48 hours after the time the table snapshot is created
- Friendly name: my_table_snapshot
- Description: A table snapshot that expires in 2 days
- Label: org_unit = development
If the table snapshot name already exists in the dataset, then no action is taken, and no error is returned.
For information about restoring table snapshots, see CREATE TABLE CLONE.
For information about removing table snapshots, see DROP SNAPSHOT TABLE.
CREATE TABLE CLONE statement
Creates a table clone based on a source table. The source table can be a table, a table clone, or a table snapshot.
Syntax
CREATE TABLE [ IF NOT EXISTS ] destination_table_name
CLONE source_table_name [FOR SYSTEM_TIME AS OF time_expression]
...
[OPTIONS(table_option_list)]
Details
Other than the use of the CLONE clause in place of a column list, the syntax is identical to the CREATE TABLE syntax.
Arguments
- IF NOT EXISTS: If the specified destination table name already exists, the CREATE statement has no effect.
- destination_table_name: The name of the table that you want to create. The table name must be unique per dataset. The table name can contain the following:
  - Up to 1,024 characters
  - Letters (upper or lower case), numbers, and underscores
- OPTIONS(table_option_list): Lets you specify additional table creation options such as a label and an expiration time.
- source_table_name: The name of the source table.
CREATE TABLE CLONE statements must comply with the following rules:
- Only one CREATE statement is allowed.
- The table that is being cloned must be a table, a table clone, or a table snapshot.
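For example, the following sketch (with hypothetical table names) clones a table's state from one day ago, which can be useful for recovering from recent changes:

CREATE TABLE mydataset.mytable_restored
CLONE mydataset.mytable
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)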
OPTIONS
CREATE TABLE CLONE options are the same as CREATE TABLE options.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.create | The dataset where you create the table clone. |
| bigquery.tables.get | The source table. |
| bigquery.tables.getData | The source table. |
| bigquery.tables.restoreSnapshot | The source table (required only if the source table is a table snapshot). |
If the OPTIONS clause includes any expiration options, then the bigquery.tables.delete permission is also required.
Examples
Restore a table snapshot: fail if destination table already exists
The following example creates the table myproject.mydataset.mytable from the table snapshot myproject.mydataset.mytablesnapshot:
CREATE TABLE `myproject.mydataset.mytable`
CLONE `myproject.mydataset.mytablesnapshot`
OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 365 DAY),
  friendly_name = "my_table",
  description = "A table that expires in 1 year",
  labels = [("org_unit", "development")]
)
If the table name exists in the dataset, then the following error is returned:
Already Exists: myproject.mydataset.mytable.
The table option list specifies the following:
- Expiration time: 365 days after the time that the table is created
- Friendly name: my_table
- Description: A table that expires in 1 year
- Label: org_unit = development
Create a clone of a table: ignore if the destination table already exists
The following example creates the table clone myproject.mydataset.mytableclone based on the table myproject.mydataset.mytable:
CREATE TABLE IF NOT EXISTS `myproject.mydataset.mytableclone`
CLONE `myproject.mydataset.mytable`
OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 365 DAY),
  friendly_name = "my_table",
  description = "A table that expires in 1 year",
  labels = [("org_unit", "development")]
)
The table option list specifies the following:
- Expiration time: 365 days after the time the table is created
- Friendly name: my_table
- Description: A table that expires in 1 year
- Label: org_unit = development
If the table name exists in the dataset, then no action is taken, and no error is returned.
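Create a clone of a table as of an earlier point in time

The FOR SYSTEM_TIME AS OF clause from the syntax above lets you clone a table as it existed at an earlier point within the time travel window. A minimal sketch, assuming mytable existed one hour ago (the destination name mytable_restored is illustrative):

CREATE TABLE `myproject.mydataset.mytable_restored`
CLONE `myproject.mydataset.mytable`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)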
For information about creating a copy of a table, see CREATE TABLE COPY.
For information about creating a snapshot of a table, see CREATE SNAPSHOT TABLE.
CREATE VIEW statement
Creates a new view.
Syntax
CREATE [OR REPLACE] VIEW [IF NOT EXISTS] view_name
[(view_column_name_list)]
[OPTIONS (view_option_list)]
AS query_expression

view_column_name_list := view_column[, ...]

view_column := column_name [OPTIONS (view_column_option_list)]
Arguments
- OR REPLACE: Replaces any view with the same name if it exists. Cannot appear with IF NOT EXISTS.
- IF NOT EXISTS: If a view or other table resource exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
- view_name: The name of the view you're creating. See Table path syntax.
- view_column_name_list: Lets you explicitly specify the column names of the view, which may be aliases to the column names in the underlying SQL query.
- view_option_list: Additional view creation options such as a label and an expiration time.
- query_expression: The GoogleSQL query expression used to define the view.
Details
CREATE VIEW statements must comply with the following rules:
- Only one CREATE statement is allowed.
view_column_name_list
The view's column name list is optional. The names must be unique but do not have to be the same as the column names of the underlying SQL query. For example, if your view is created with the following statement:
CREATE VIEW mydataset.age_groups(age, count) AS
SELECT age, COUNT(*)
FROM mydataset.people
GROUP BY age;

Then you can query it with:
SELECT age, count FROM mydataset.age_groups;

The number of columns in the column name list must match the number of columns in the underlying SQL query. If columns in the table of the underlying SQL query are added or dropped, the view becomes invalid and must be recreated. For example, if the age column is dropped from the mydataset.people table, then the view created in the previous example becomes invalid.
view_column_option_list
The view_column_option_list lets you specify optional top-level column options. Column options for a view have the same syntax and requirements as for a table, but with a different list of NAME and VALUE fields:
NAME | VALUE | Details |
|---|---|---|
description | STRING | Example: |
view_option_list
The option list allows you to set view options such as a label and an expiration time. You can include multiple options using a comma-separated list.
Specify a view option list in the following format:
NAME=VALUE, ...
NAME and VALUE must be one of the following combinations:
NAME | VALUE | Details |
|---|---|---|
expiration_timestamp | TIMESTAMP | Example: This property is equivalent to the expirationTime table resource property. |
friendly_name | STRING | Example: This property is equivalent to the friendlyName table resource property. |
description | STRING | Example: This property is equivalent to the description table resource property. |
labels | ARRAY<STRUCT<STRING, STRING>> | Example: This property is equivalent to the labels table resource property. |
privacy_policy | STRING | The policies to enforce when anyone queries the view. To learn more about the policies available for a view, see the privacy_policy section below. |
tags | ARRAY<STRUCT<STRING, STRING>> | An array of IAM tags for the view, expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
VALUE is a constant expression containing only literals, query parameters, and scalar functions.

The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as SELECT, CREATE, or UPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions: ARRAY_TO_STRING, REPLACE, REGEXP_REPLACE, RAND, FORMAT, LPAD, RPAD, REPEAT, SESSION_USER, GENERATE_ARRAY, GENERATE_DATE_ARRAY
If VALUE evaluates to NULL, the corresponding option NAME in the CREATE VIEW statement is ignored.
privacy_policy
The following policies are available in the privacy_policy view option to create analysis rules. A policy represents a condition that needs to be met before a query can be run.
| Policy | Details |
|---|---|
aggregation_threshold_policy | The aggregation threshold policy to enforce when a view is queried. Syntax: '{ "aggregation_threshold_policy": { "threshold": value, "privacy_unit_columns": value }}' Parameters: Example: privacy_policy='{"aggregation_threshold_policy": {"threshold" : 50, "privacy_unit_columns": "ID"}}' |
differential_privacy_policy | A differential privacy policy for the view. When this parameter is included, only differentially private queries can be run on the view. Syntax: '{ "differential_privacy_policy": { "privacy_unit_column": value, "max_epsilon_per_query": value, "epsilon_budget": value, "delta_per_query": value, "delta_budget": value, "max_groups_contributed": value }}' Parameters: Example: privacy_policy='{"differential_privacy_policy": { "privacy_unit_column": "contributor_id", "max_epsilon_per_query": 0.01, "epsilon_budget": 25.6, "delta_per_query": 0.005, "delta_budget": 9.6, "max_groups_contributed": 2}}' |
join_restriction_policy | A join restriction policy for the view. When this parameter is included, only the specified joins can be run on the specified columns in the view. This policy can be used alone or with other policies, such as the aggregation threshold or differential privacy policy. Syntax: '{ "join_restriction_policy": { "join_condition": value, "join_allowed_columns": value }}' Parameters: Example: privacy_policy='{"join_restriction_policy": { "join_condition": "JOIN_ANY", "join_allowed_columns": ["col1", "col2"]}}' |
Default project in view body
If the view is created in the same project used to run the CREATE VIEW statement, the view body query_expression can reference entities without specifying the project; the default project is the project that owns the view. Consider the sample query below.
CREATE VIEW myProject.myDataset.myView AS
SELECT * FROM anotherDataset.myTable;

After running the above CREATE VIEW query in the project myProject, you can run the query SELECT * FROM myProject.myDataset.myView. Regardless of the project you choose to run this SELECT query, the referenced table anotherDataset.myTable is always resolved against project myProject.
If the view is not created in the same project used to run the CREATE VIEW statement, then all references in the view body query_expression must be qualified with project IDs. For instance, the preceding sample CREATE VIEW query is invalid if it runs in a project different from myProject.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.tables.create | The dataset where you create the view. |
In addition, the OR REPLACE clause requires bigquery.tables.update permission.
If the OPTIONS clause includes an expiration time, then the bigquery.tables.delete permission is also required.
Examples
Creating a new view
The following example creates a view named newview in mydataset:
CREATE VIEW `myproject.mydataset.newview`
OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name = "newview",
  description = "a view that expires in 2 days",
  labels = [("org_unit", "development")]
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

If the view name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.table
The view is defined using the following GoogleSQL query:
SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The view option list specifies the following:
- Expiration time: 48 hours from the time the view is created
- Friendly name: newview
- Description: A view that expires in 2 days
- Label: org_unit = development
Creating a view only if the view doesn't exist
The following example creates a view named newview in mydataset only if no view named newview exists in mydataset. If the view name exists in the dataset, no error is returned, and no action is taken.
CREATE VIEW IF NOT EXISTS `myproject.mydataset.newview`
OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name = "newview",
  description = "a view that expires in 2 days",
  labels = [("org_unit", "development")]
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

The view is defined using the following GoogleSQL query:
SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The view option list specifies the following:
- Expiration time: 48 hours from the time the view is created
- Friendly name: newview
- Description: A view that expires in 2 days
- Label: org_unit = development
Creating or replacing a view
The following example creates a view named newview in mydataset, and if newview exists in mydataset, it is overwritten using the specified query expression.
CREATE OR REPLACE VIEW `myproject.mydataset.newview`
OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name = "newview",
  description = "a view that expires in 2 days",
  labels = [("org_unit", "development")]
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The view is defined using the following GoogleSQL query:
SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The view option list specifies the following:
- Expiration time: 48 hours from the time the view is created
- Friendly name: newview
- Description: A view that expires in 2 days
- Label: org_unit = development
Creating a view with column descriptions
The following example creates a view named newview in mydataset. This view definition provides the column description for each column in mytable. You can rename columns from the original query.
CREATE VIEW `myproject.mydataset.newview` (
  column_1_new_name OPTIONS (DESCRIPTION='Description of the column 1 contents'),
  column_2_new_name OPTIONS (DESCRIPTION='Description of the column 2 contents'),
  column_3_new_name OPTIONS (DESCRIPTION='Description of the column 3 contents')
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
CREATE MATERIALIZED VIEW statement
Creates a new materialized view.
Syntax
CREATE [OR REPLACE] MATERIALIZED VIEW [IF NOT EXISTS] materialized_view_name
[PARTITION BY partition_expression]
[CLUSTER BY clustering_column_list]
[OPTIONS (materialized_view_option_list)]
AS query_expression
Arguments
- OR REPLACE: Replaces a materialized view with the same name if it exists. Cannot appear with IF NOT EXISTS.
- IF NOT EXISTS: If a materialized view or other table resource exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
- materialized_view_name: The name of the materialized view you're creating. See Table path syntax. If the project_name is omitted from the materialized view name, or it is the same as the project that runs this DDL query, then the latter is also used as the default project for references to tables, functions, and other resources in query_expression. The default project of the references is fixed and does not depend on the future queries that invoke the new materialized view. Otherwise, all references in query_expression must be qualified with project names. The materialized view name must be unique per dataset.
- partition_expression: An expression that determines how to partition the table. A materialized view can only be partitioned in the same way as the table in query_expression (the base table) is partitioned.
- clustering_column_list: A comma-separated list of column references that determine how to cluster the materialized view.
- materialized_view_option_list: Allows you to specify additional materialized view options such as whether refresh is enabled, the refresh interval, a label, and an expiration time.
- query_expression: The GoogleSQL query expression used to define the materialized view.
Details
CREATE MATERIALIZED VIEW statements must comply with the following rules:
- Only one CREATE statement is allowed.
Default project in materialized view body
If the materialized view is created in the same project used to run the CREATE MATERIALIZED VIEW statement, the materialized view body query_expression can reference entities without specifying the project; the default project is the project that owns the materialized view. Consider the sample query below.
CREATE MATERIALIZED VIEW myProject.myDataset.myView AS
SELECT * FROM anotherDataset.myTable;

After running the above CREATE MATERIALIZED VIEW query in the project myProject, you can run the query SELECT * FROM myProject.myDataset.myView. Regardless of the project you choose to run this SELECT query, the referenced table anotherDataset.myTable is always resolved against project myProject.
If the materialized view is not created in the same project used to run the CREATE MATERIALIZED VIEW statement, then all references in the materialized view body query_expression must be qualified with project IDs. For instance, the preceding sample CREATE MATERIALIZED VIEW query is invalid if it runs in a project different from myProject.
materialized_view_option_list
The option list allows you to set materialized view options such as whether refresh is enabled, the refresh interval, a label, and an expiration time. You can include multiple options using a comma-separated list.
Specify a materialized view option list in the following format:
NAME=VALUE, ...
NAME andVALUE must be one of the following combinations:
NAME | VALUE | Details |
|---|---|---|
enable_refresh | BOOLEAN | Example: |
refresh_interval_minutes | FLOAT64 | Example: |
expiration_timestamp | TIMESTAMP | Example: This property is equivalent to the expirationTime table resource property. |
max_staleness | INTERVAL | Example: |
allow_non_incremental_definition | BOOLEAN | Example: |
kms_key_name | STRING | Example: This property is equivalent to the encryptionConfiguration.kmsKeyName table resource property. See more details about Protecting data with Cloud KMS keys. |
friendly_name | STRING | Example: This property is equivalent to the friendlyName table resource property. |
description | STRING | Example: This property is equivalent to the description table resource property. |
labels | ARRAY<STRUCT<STRING, STRING>> | Example: This property is equivalent to the labels table resource property. |
tags | ARRAY<STRUCT<STRING, STRING>> | An array of IAM tags for the materialized view, expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
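As an illustration, the following sketch combines several of these options in one statement; the interval literal and the view definition are illustrative, not prescriptive:

CREATE MATERIALIZED VIEW `myproject.mydataset.mv_with_staleness`
OPTIONS (
  enable_refresh = true,
  refresh_interval_minutes = 60,
  max_staleness = INTERVAL "4:0:0" HOUR TO SECOND,
  allow_non_incremental_definition = true
)
AS SELECT col_1, COUNT(*) AS cnt
FROM `myproject.mydataset.mytable`
GROUP BY col_1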
Required permissions
This statement requires the followingIAM permissions:
| Permission | Resource |
|---|---|
bigquery.tables.create | The dataset where you create the materialized view. |
In addition, the OR REPLACE clause requires bigquery.tables.update permission.
If the OPTIONS clause includes any expiration options, then the bigquery.tables.delete permission is also required.
Examples
Creating a new materialized view
The following example creates a materialized view named new_mv in mydataset:
CREATE MATERIALIZED VIEW `myproject.mydataset.new_mv`
OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name = "new_mv",
  description = "a materialized view that expires in 2 days",
  labels = [("org_unit", "development")],
  enable_refresh = true,
  refresh_interval_minutes = 20
)
AS SELECT column_1, SUM(column_2) AS sum_2, AVG(column_3) AS avg_3
FROM `myproject.mydataset.mytable`
GROUP BY column_1
If the materialized view name exists in the dataset, the following error is returned:
Already Exists: project_id:dataset.materialized_view
When you use a DDL statement to create a materialized view, you must specify the project, dataset, and materialized view in the following format: `project_id.dataset.materialized_view` (including the backticks if project_id contains special characters); for example, `myproject.mydataset.new_mv`.
The materialized view is defined using the following GoogleSQL query:
SELECT column_1, SUM(column_2) AS sum_2, AVG(column_3) AS avg_3 FROM `myproject.mydataset.mytable` GROUP BY column_1
The materialized view option list specifies the following:
- Expiration time: 48 hours from the time the materialized view is created
- Friendly name: new_mv
- Description: A materialized view that expires in 2 days
- Label: org_unit = development
- Refresh enabled: true
- Refresh interval: 20 minutes
Creating a materialized view only if the materialized view doesn't exist
The following example creates a materialized view named new_mv in mydataset only if no materialized view named new_mv exists in mydataset. If the materialized view name exists in the dataset, no error is returned, and no action is taken.
CREATE MATERIALIZED VIEW IF NOT EXISTS `myproject.mydataset.new_mv`
OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name = "new_mv",
  description = "a view that expires in 2 days",
  labels = [("org_unit", "development")],
  enable_refresh = false
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The materialized view is defined using the following GoogleSQL query:
SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`
The materialized view option list specifies the following:
- Expiration time: 48 hours from the time the view is created
- Friendly name: new_mv
- Description: A view that expires in 2 days
- Label: org_unit = development
- Refresh enabled: false
Creating a materialized view with partitioning and clustering
The following example creates a materialized view named new_mv in mydataset, partitioned by the col_datetime column and clustered by the col_int column:
CREATE MATERIALIZED VIEW `myproject.mydataset.new_mv`
PARTITION BY DATE(col_datetime)
CLUSTER BY col_int
AS SELECT col_int, col_datetime, COUNT(1) AS cnt
FROM `myproject.mydataset.mv_base_table`
GROUP BY col_int, col_datetime
The base table, mv_base_table, must also be partitioned by the col_datetime column. For more information, see Working with partitioned and clustered tables.
CREATE MATERIALIZED VIEW AS REPLICA OF statement
Creates a replica of a materialized view. The source materialized view must be over an Amazon Simple Storage Service (Amazon S3) BigLake table. You can use the materialized view replica to make Amazon S3 data available locally for joins.
For more information, see Create materialized view replicas.
Syntax
CREATE MATERIALIZED VIEW replica_name
[OPTIONS (materialized_view_replica_option_list)]
AS REPLICA OF source_materialized_view_name
Arguments
- replica_name: The name of the materialized view replica you're creating, in table path syntax. If the project name is omitted from the materialized view replica name, the current project is used as the default. The materialized view replica name must be unique for each dataset.
- materialized_view_replica_option_list: Allows you to specify options such as the replication interval.
- source_materialized_view_name: The name of the materialized view you are replicating, in table path syntax. The source materialized view must be over an Amazon S3 BigLake table, and must be authorized on the dataset that contains that table.
materialized_view_replica_option_list
The option list lets you set materialized view replica options.
Specify a materialized view replica option list in the following format:
NAME=VALUE, ...
NAME | VALUE | Details |
|---|---|---|
replication_interval_seconds | INT64 | Specifies how often to replicate the data from the source materialized view to the replica. Must be a value between Example: |
Required permissions
This statement requires the following IAM permissions:
- bigquery.tables.create
- bigquery.tables.get
- bigquery.tables.getData
- bigquery.tables.replicateData
- bigquery.jobs.create
Example
The following example creates a materialized view replica named mv_replica in bq_dataset:
CREATE MATERIALIZED VIEW `myproject.bq_dataset.mv_replica`
OPTIONS (
  replication_interval_seconds = 600
)
AS REPLICA OF `myproject.s3_dataset.my_s3_mv`
CREATE EXTERNAL SCHEMA statement
Creates a new federated dataset.
A federated dataset is a connection between BigQuery and an external data source at the dataset level. For more information, see the BigQuery documentation on creating federated datasets.
Syntax
CREATE EXTERNAL SCHEMA [IF NOT EXISTS] dataset_name
[WITH CONNECTION connection_name]
[OPTIONS (external_schema_option_list)]
Arguments
- IF NOT EXISTS: If any dataset exists with the same name, the CREATE statement has no effect.
- dataset_name: The name of the dataset to create.
- connection_name: Specifies a connection resource that has credentials for accessing the external data. Specify the connection name in the form PROJECT_ID.LOCATION.CONNECTION_ID. If the project ID or location contains a dash, enclose the connection name in backticks (`).
- external_schema_option_list: A list of options for creating the federated dataset.
Details
The dataset is created in the location that you specify in the query settings. For more information, see Specify locations. The location must support the kind of federated dataset that you are creating. For example, you can only create AWS Glue federated datasets in AWS locations.
For more information about creating a dataset, see Create datasets. For information about quotas, see dataset limits.
external_schema_option_list
The option list specifies options for the federated dataset. Specify the options in the following format: NAME=VALUE, ...
The following options are supported:
NAME | VALUE | Details |
|---|---|---|
description | STRING | The description of the dataset. |
friendly_name | STRING | A descriptive name for the dataset. |
labels | ARRAY<STRUCT<STRING, STRING>> | An array of labels for the dataset, expressed as key-value pairs. |
location | STRING | The location in which to create the dataset. If you don't specify this option, the dataset is created in the location where the query runs. If you specify this option and also explicitly set the location for the query job, the two values must match; otherwise the query fails. The location must support the kind of federated dataset that you are creating, for example, you can only create AWS Glue federated datasets in AWS locations. |
external_source | STRING | The source of the external dataset. For AWS Glue federated datasets this must be an Amazon Resource Name (ARN), with a prefix identifying the source, such as aws-glue://. For Spanner federated datasets, this must be a specific Spanner database with a google-cloudspanner:/ prefix. For example: google-cloudspanner:/projects/my_project/instances/my_instance/databases/my_database. |
tags | ARRAY<STRUCT<STRING, STRING>> | An array of IAM tags expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
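For example, a Spanner federated dataset might be created with a sketch like the following, using the external_source format from the table above; the project, instance, and database names are illustrative, and connection and location requirements depend on your setup (see the Spanner federated dataset documentation):

CREATE EXTERNAL SCHEMA spanner_dataset
OPTIONS (
  external_source = 'google-cloudspanner:/projects/my_project/instances/my_instance/databases/my_database',
  location = 'us-central1'
);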
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.datasets.create | The project where you create the federated dataset. |
bigquery.connections.use | The project where you create the federated dataset. |
bigquery.connections.delegate | The project where you create the federated dataset. |
Examples
The following example creates an AWS Glue federated dataset:
CREATE EXTERNAL SCHEMA mydataset
WITH CONNECTION myproject.`aws-us-east-1`.myconnection
OPTIONS (
  external_source = 'aws-glue://arn:aws:glue:us-east-1:123456789:database/test_database',
  location = 'aws-us-east-1'
);

CREATE EXTERNAL TABLE statement
Creates a new external table.
External tables let BigQuery query data that is stored outside of BigQuery storage. For more information about external tables, see Introduction to external data sources.
Syntax
CREATE [OR REPLACE] EXTERNAL TABLE [IF NOT EXISTS] table_name
[(column_name column_schema, ...)]
[WITH CONNECTION {connection_name | DEFAULT}]
[WITH PARTITION COLUMNS [(partition_column_name partition_column_type, ...)]]
OPTIONS (external_table_option_list, ...);
Arguments
- OR REPLACE: Replaces any external table with the same name if it exists. Cannot appear with IF NOT EXISTS.
- IF NOT EXISTS: If an external table or other table resource exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
- table_name: The name of the external table. See Table path syntax.
- column_name: The name of a column in the table.
- column_schema: Specifies the schema of the column. It uses the same syntax as the column_schema definition in the CREATE TABLE statement. If you don't include this clause, BigQuery detects the schema automatically.
- connection_name: Specifies a connection resource that has credentials for accessing the external data. Specify the connection name in the form PROJECT_ID.LOCATION.CONNECTION_ID. If the project ID or location contains a dash, enclose the connection name in backticks (`). To use a default connection, specify DEFAULT instead of the connection string containing PROJECT_ID.LOCATION.CONNECTION_ID.
- partition_column_name: The name of a partition column. Include this field if your external data uses a hive-partitioned layout. For more information, see Supported data layouts.
- partition_column_type: The partition column type.
- external_table_option_list: A list of options for creating the external table.
Details
The CREATE EXTERNAL TABLE statement does not support creating temporary external tables.
To create an externally partitioned table, use the WITH PARTITION COLUMNS clause to specify the partition schema details. BigQuery validates the column definitions against the external data location. The schema declaration must strictly follow the ordering of the fields in the external path. For more information about external partitioning, see Querying externally partitioned data.
external_table_option_list
The option list specifies options for creating the external table. The format and uris options are required. Specify the option list in the following format: NAME=VALUE, ...
| Options | |
|---|---|
allow_jagged_rows |
If true, allow rows that are missing trailing optional columns. Applies to CSV data. |
allow_quoted_newlines |
If true, allow quoted data sections that contain newline characters in the file. Applies to CSV data. |
bigtable_options |
Only required when creating a Bigtable external table. Specifies the schema of the Bigtable external table in JSON format. For a list of Bigtable table definition options, see BigtableOptions in the REST API reference. |
compression |
The compression type of the data source. Supported values include: GZIP. If not specified, the data source is uncompressed. Applies to CSV and JSON data. |
decimal_target_types |
Determines how to convert a Decimal type. Example: |
description |
A description of this table. |
enable_list_inference |
If true, use schema inference specifically for Parquet LIST logical type. Applies to Parquet data. |
enable_logical_types |
If true, convert Avro logical types into their corresponding SQL types. Applies to Avro data. |
encoding |
The character encoding of the data. Supported values include: UTF8 (or UTF-8) and ISO_8859_1 (or ISO-8859-1). Applies to CSV data. |
enum_as_string |
If true, infer Parquet ENUM logical type as STRING instead of BYTES by default. Applies to Parquet data. |
expiration_timestamp |
The time when this table expires. If not specified, the table does not expire. Example: |
field_delimiter |
The separator for fields in a CSV file. Applies to CSV data. |
format |
The format of the external data. Supported values include: AVRO, CLOUD_BIGTABLE, CSV, DATASTORE_BACKUP, DELTA_LAKE, GOOGLE_SHEETS, NEWLINE_DELIMITED_JSON (or JSON), ORC, and PARQUET. The value JSON is equivalent to NEWLINE_DELIMITED_JSON. |
hive_partition_uri_prefix |
A common prefix for all source URIs before the partition key encoding begins. Applies only to hive-partitioned external tables. Applies to Avro, CSV, JSON, Parquet, and ORC data. Example: |
file_set_spec_type |
Specifies how to interpret source URIs for load jobs and external tables. Supported values include:

- FILE_SYSTEM_MATCH (default): Expands source URIs by listing files from the object store.
- NEW_LINE_DELIMITED_MANIFEST: Indicates that the provided URIs are newline-delimited manifest files, with one URI per line. Wildcard URIs are not supported in the manifest files.

For example, if you have a source URI of "gs://bucket/path/file" and the file_set_spec_type is FILE_SYSTEM_MATCH, then the file is used directly as a data file. If the file_set_spec_type is NEW_LINE_DELIMITED_MANIFEST, then each line in the file is interpreted as a URI that points to a data file. |
ignore_unknown_values |
If true, ignore extra values that are not represented in the table schema, without returning an error. Applies to CSV and JSON data. |
json_extension |
For JSON data, indicates a particular JSON interchange format. If not specified, BigQuery reads the data as generic JSON records. Supported values include: GEOJSON. |
max_bad_records |
The maximum number of bad records to ignore when reading the data. Applies to: CSV, JSON, and Google Sheets data. |
max_staleness |
Applicable for BigLake tables and object tables. Specifies whether cached metadata is used by operations against the table, and how fresh the cached metadata must be in order for the operation to use it. To disable metadata caching, specify 0. This is the default. To enable metadata caching, specify an interval literal value between 30 minutes and 7 days. For example, specify INTERVAL 4 HOUR for a 4-hour staleness interval. |
null_marker |
The string that represents NULL values in a CSV file. Applies to CSV data. |
null_markers |
(Preview) The list of strings that represent NULL values in a CSV file. This option cannot be used with the null_marker option. Applies to CSV data. |
object_metadata |
Only required when creating an object table. Set the value of this option to SIMPLE when creating an object table. |
preserve_ascii_control_characters |
If true, embedded ASCII control characters (the first 32 characters in the ASCII table) are preserved. Applies to CSV data. |
projection_fields |
A list of entity properties to load. Applies to Datastore data. |
quote |
The string used to quote data sections in a CSV file. If your data contains quoted newline characters, also set the allow_quoted_newlines property to true. Applies to CSV data. |
reference_file_schema_uri |
User-provided reference file with the table schema. Applies to Parquet/ORC/AVRO data. Example: |
require_hive_partition_filter |
If true, all queries over this table require a partition filter that can be used to eliminate partitions when reading data. Applies to Avro, CSV, JSON, Parquet, and ORC data. |
sheet_range |
Range of a Google Sheets spreadsheet to query from. Applies to Google Sheets data. Example: |
skip_leading_rows |
The number of rows at the top of a file to skip when reading the data. Applies to CSV and Google Sheets data. |
source_column_match |
(Preview) This controls the strategy used to match loaded columns to the schema. If this value is unspecified, then the default is based on how the schema is provided. If autodetect is enabled, then the default behavior is to match columns by name. Otherwise, the default is to match columns by position. This is done to keep the behavior backward-compatible. Supported values include:

- POSITION: Matches columns by position. This option assumes that the columns are ordered the same way as the schema.
- NAME: Matches columns by name, reading the header row as the column names and reordering columns to match the field names in the schema.
|
tags |
ARRAY<STRUCT<STRING, STRING>>. An array of IAM tags for the table, expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
time_zone |
(Preview) Default time zone applied when parsing timestamp values that have no specific time zone. Check valid time zone names. If this value is not present, timestamp values without a specific time zone are parsed using the default time zone UTC. Applies to CSV and JSON data. |
date_format |
(Preview) Format elements that define how the DATE values are formatted in the input files. If this value is present, this format is the only compatible DATE format. Schema auto-detection also decides the DATE column type based on this format instead of the existing format. If this value is not present, the DATE field is parsed with the default formats. Applies to CSV and JSON data. |
datetime_format |
(Preview) Format elements that define how the DATETIME values are formatted in the input files. If this value is present, this format is the only compatible DATETIME format. Schema auto-detection also decides the DATETIME column type based on this format instead of the existing format. If this value is not present, the DATETIME field is parsed with the default formats. Applies to CSV and JSON data. |
time_format |
(Preview) Format elements that define how the TIME values are formatted in the input files. If this value is present, this format is the only compatible TIME format. Schema auto-detection also decides the TIME column type based on this format instead of the existing format. If this value is not present, the TIME field is parsed with the default formats. Applies to CSV and JSON data. |
timestamp_format |
(Preview) Format elements that define how the TIMESTAMP values are formatted in the input files. If this value is present, this format is the only compatible TIMESTAMP format. Schema auto-detection also decides the TIMESTAMP column type based on this format instead of the existing format. If this value is not present, the TIMESTAMP field is parsed with the default formats. Applies to CSV and JSON data. |
uris | For external tables, including object tables, that aren't Bigtable tables:
An array of fully qualified URIs for the external data locations. Each URI can contain one asterisk (*) wildcard character. The following examples show valid uris values:
For Bigtable tables:
The URI identifying the Bigtable table to use as a data source. You can only specify one Bigtable URI. Example: For more information on constructing a Bigtable URI, see Retrieve the Bigtable URI. |
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.tables.create | The dataset where you create the external table. |
In addition, the OR REPLACE clause requires bigquery.tables.update permission.
If the OPTIONS clause includes an expiration time, then the bigquery.tables.delete permission is also required.
Examples
The following example creates a BigLake table and explicitly specifies the schema. It also specifies refreshing the metadata cache automatically at a system-defined interval.
CREATE OR REPLACE EXTERNAL TABLE mydataset.newtable (x INT64, y STRING, z BOOL)
WITH CONNECTION myconnection
OPTIONS (
  format = "PARQUET",
  max_staleness = STALENESS_INTERVAL,
  metadata_cache_mode = 'AUTOMATIC'
);

The following example creates an external table from multiple URIs. The data format is CSV. This example uses schema auto-detection.
CREATE EXTERNAL TABLE dataset.CsvTable
OPTIONS (
  format = 'CSV',
  uris = ['gs://bucket/path1.csv', 'gs://bucket/path2.csv']
);

The following example creates an external table from a CSV file and explicitly specifies the schema. It also specifies the field delimiter ('|') and sets the maximum number of bad records allowed.
CREATE OR REPLACE EXTERNAL TABLE dataset.CsvTable (x INT64, y STRING)
OPTIONS (
  format = 'CSV',
  uris = ['gs://bucket/path1.csv'],
  field_delimiter = '|',
  max_bad_records = 5
);

The following example creates an externally partitioned table. It uses schema auto-detection to detect both the file schema and the hive partitioning layout. If the external path is gs://bucket/path/field_1=first/field_2=1/data.parquet, the partition columns are detected as field_1 (STRING) and field_2 (INT64).
CREATE EXTERNAL TABLE dataset.AutoHivePartitionedTable
WITH PARTITION COLUMNS
OPTIONS (
  uris = ['gs://bucket/path/*'],
  format = 'PARQUET',
  hive_partition_uri_prefix = 'gs://bucket/path',
  require_hive_partition_filter = false
);
The following example creates an externally partitioned table by explicitly specifying the partition columns. This example assumes that the external file path has the patterngs://bucket/path/field_1=first/field_2=1/data.parquet.
CREATE EXTERNAL TABLE dataset.CustomHivePartitionedTable
WITH PARTITION COLUMNS (
  field_1 STRING,  -- column order must match the external path
  field_2 INT64
)
OPTIONS (
  uris = ['gs://bucket/path/*'],
  format = 'PARQUET',
  hive_partition_uri_prefix = 'gs://bucket/path',
  require_hive_partition_filter = false
);
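Object tables use the same statement with the object_metadata option in place of format. A minimal sketch, assuming a connection named us.myconnection and a bucket of images (both names are illustrative):

CREATE EXTERNAL TABLE dataset.ImageObjectTable
WITH CONNECTION `us.myconnection`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://bucket/images/*'],
  max_staleness = INTERVAL 4 HOUR,
  metadata_cache_mode = 'AUTOMATIC'
);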
CREATE FUNCTION statement
Creates a new user-defined function (UDF). BigQuery supports UDFs written in SQL, JavaScript, or Python.
Syntax
To create a SQL UDF, use the following syntax:
CREATE [OR REPLACE] [TEMPORARY | TEMP] FUNCTION [IF NOT EXISTS]
  [[project_name.]dataset_name.]function_name
  ([named_parameter[, ...]])
  [RETURNS data_type]
  AS (sql_expression)
  [OPTIONS (function_option_list)]

named_parameter:
  param_name param_type
To create a JavaScript UDF, use the following syntax:
CREATE [OR REPLACE] [TEMPORARY | TEMP] FUNCTION [IF NOT EXISTS]
  [[project_name.]dataset_name.]function_name
  ([named_parameter[, ...]])
  RETURNS data_type
  [determinism_specifier]
  LANGUAGE js
  [OPTIONS (function_option_list)]
  AS javascript_code

named_parameter:
  param_name param_type

determinism_specifier:
  { DETERMINISTIC | NOT DETERMINISTIC }
To create a Python UDF, use the following syntax:
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
Note: For support during the preview, email bq-python-udf-feedback@google.com.

CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
  [project_name.]dataset_name.function_name
  ([named_parameter[, ...]])
  RETURNS data_type
  LANGUAGE python
  [WITH CONNECTION connection_path]
  OPTIONS (function_option_list)
  AS python_code

named_parameter:
  param_name param_type
To create a remote function, use the following syntax:
CREATE [OR REPLACE] [TEMPORARY | TEMP] FUNCTION [IF NOT EXISTS]
  [[project_name.]dataset_name.]function_name
  ([named_parameter[, ...]])
  RETURNS data_type
  REMOTE WITH CONNECTION connection_path
  [OPTIONS (function_option_list)]

named_parameter:
  param_name param_type
Routine names must contain only letters, numbers, and underscores, and be at most 256 characters long.
Arguments
- OR REPLACE: Replaces any function with the same name if it exists. Cannot appear with IF NOT EXISTS.
- IF NOT EXISTS: If any function exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
- TEMP or TEMPORARY: Creates a temporary function. If the clause is not present, the statement creates a persistent UDF. You can reuse persistent UDFs across multiple queries, whereas you can only use temporary UDFs in a single query, script, session, or procedure.
- project_name: For persistent functions, the name of the project where you are creating the function. Defaults to the project that runs the DDL query. Do not include the project name for temporary functions.
- dataset_name: For persistent functions, the name of the dataset where you are creating the function. Defaults to the defaultDataset in the request. Do not include the dataset name for temporary functions.
- function_name: The name of the function.
- named_parameter: A comma-separated param_name and param_type pair. The value of param_type is a BigQuery data type. For a SQL UDF, the value of param_type can also be ANY TYPE.
- determinism_specifier: Applies only to JavaScript UDFs. Provides a hint to BigQuery as to whether the query result can be cached (see the example after this list). Can be one of the following values:
  - DETERMINISTIC: The function always returns the same result when passed the same arguments. The query result is potentially cacheable. For example, if the function add_one(i) always returns i + 1, the function is deterministic.
  - NOT DETERMINISTIC: The function does not always return the same result when passed the same arguments, and therefore is not cacheable. For example, if the function add_random(i) returns i + rand(), the function is not deterministic and BigQuery does not use cached results.

  If all of the invoked functions are DETERMINISTIC, BigQuery tries to cache the result, unless the results can't be cached for other reasons. For more information, see Using cached query results.
- data_type: The data type that the function returns.
  - If the function is defined in SQL, then the RETURNS clause is optional. If the RETURNS clause is omitted, then BigQuery infers the result type of the function from the SQL function body when a query calls the function.
  - If the function is defined in JavaScript, then the RETURNS clause is required. For more information about allowed values for data_type, see Supported JavaScript UDF data types.
- sql_expression: The SQL expression that defines the function.
- function_option_list: A list of options for creating the function.
- javascript_code: The definition of a JavaScript function. The value is a string literal. If the code includes quotes and backslashes, it must be either escaped or represented as a raw string. For example, the code return "\n"; can be represented as one of the following:
  - Quoted string: "return \"\\n\";". Both quotes and backslashes need to be escaped.
  - Triple quoted string: """return "\\n";""". Backslashes need to be escaped, quotes don't.
  - Raw string: r"""return "\n";""". No escaping is needed.
- python_code: The definition of a Python function. The value is a string literal. If the code includes quotes and backslashes, it must be escaped or represented as a raw string. For example, the code return "\n"; can be represented as one of the following:
  - Quoted string: "return \"\\n\";". Both quotes and backslashes need to be escaped.
  - Triple quoted string: """return "\\n";""". Backslashes need to be escaped, quotes don't.
  - Raw string: r"""return "\n";""". No escaping is needed.
- connection_name: Specifies a connection resource that has credentials for accessing the remote endpoint or for running Python code. Specify the connection name in the form project_name.location.connection_id. If the project name or location contains a dash, enclose the connection name in backticks (`).
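For instance, a deterministic JavaScript UDF can be declared like the following sketch; the function body is illustrative:

CREATE TEMP FUNCTION AddOne(i FLOAT64)
RETURNS FLOAT64
DETERMINISTIC
LANGUAGE js
AS r"""
  // Same input always yields the same output, so results are cacheable.
  return i + 1;
""";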
function_option_list
The option list specifies options for creating a UDF. The following options are supported:
NAME | VALUE | Details |
|---|---|---|
description | STRING | A description of the UDF. This option isn't supported when creating a temporary function. |
library | ARRAY<STRING> | An array of JavaScript libraries to include in the function definition. Applies only to JavaScript and Python UDFs. For more information, see Including JavaScript libraries. Example: |
endpoint | STRING | An HTTP endpoint of Cloud Functions. Applies only to remote functions. Example: For more information, see Create a remote function. |
user_defined_context | ARRAY<STRUCT<STRING, STRING>> | A list of key-value pairs that will be sent with every HTTP request when the function is invoked. Applies only to remote functions. Example: |
max_batching_rows | INT64 | The maximum number of rows in each HTTP request. If not specified, BigQuery decides how many rows are included in an HTTP request. Applies only to remote functions and Python UDFs. |
runtime_version | STRING | The name of the runtime version used to run the provided Python code. Applies only to Python UDFs. Example: |
entry_point | STRING | The name of the function defined in the Python code as the entry point when the Python UDF is invoked. Applies only to Python UDFs. |
packages | ARRAY<STRING> | An array of Python packages to install in the function definition. Applies only to Python UDFs. For more information, see Use third party packages. Example: |
container_cpu | FLOAT64 | Amount of CPU provisioned for a Python UDF container instance. Applies only to Python UDFs. For more information, see Configure container limits for Python UDFs. |
container_memory | STRING | Amount of memory provisioned for a Python UDF container instance. Applies only to Python UDFs. For more information, see Configure container limits for Python UDFs. |
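As an illustration, a remote function that combines several of these options might look like the following sketch; the endpoint URL, context keys, and connection name are illustrative:

CREATE FUNCTION mydataset.classify_text(text STRING)
RETURNS STRING
REMOTE WITH CONNECTION `us.myconnection`
OPTIONS (
  endpoint = 'https://us-central1-myproject.cloudfunctions.net/classify',
  user_defined_context = [("mode", "classification")],
  max_batching_rows = 50
);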
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.routines.create | The dataset where you create the function. |
In addition, the OR REPLACE clause requires bigquery.routines.update permission.
To create a remote function, additional IAM permissions are needed:
| Permission | Resource |
|---|---|
bigquery.connections.delegate | The connection which you use to create the remote function. |
Examples
Create a SQL UDF
The following example creates a persistent SQL UDF named multiplyInputs in a dataset named mydataset.
CREATE FUNCTION mydataset.multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
AS (x * y);

Create a JavaScript UDF
The following example creates a temporary JavaScript UDF named multiplyInputs and calls it from inside a SELECT statement.
CREATE TEMP FUNCTION multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
LANGUAGE js
AS r"""
  return x*y;
""";

SELECT multiplyInputs(a, b) FROM (SELECT 3 AS a, 2 AS b);

Create a remote function
The following example creates a temporary remote function named tempRemoteMultiplyInputs in the US location, using a connection called myconnection in the US region.
CREATE TEMP FUNCTION tempRemoteMultiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
REMOTE WITH CONNECTION us.myconnection
OPTIONS (endpoint = "https://us-central1-myproject.cloudfunctions.net/multiply");

The following example creates a persistent remote function named remoteMultiplyInputs in a dataset named mydataset, using a connection called myconnection. The location and project of the dataset and the connection must match.
CREATE FUNCTION mydataset.remoteMultiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
REMOTE WITH CONNECTION us.myconnection
OPTIONS (endpoint = "https://us-central1-myproject.cloudfunctions.net/multiply");

Create a Python UDF
The following example creates a Python UDF named multiplyInputs.
CREATE FUNCTION mydataset.multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
LANGUAGE python
OPTIONS (
  entry_point = 'multiply',
  runtime_version = 'python-3.11',
  packages = ['pandas==2.2']
)
AS r"""
import pandas as pd

def multiply(df: pd.DataFrame):
  return df['x'] * df['y']
""";

CREATE AGGREGATE FUNCTION statement (SQL)
Creates a new SQL user-defined aggregate function (UDAF).
Syntax
To create a SQL UDAF, use the following syntax:
CREATE [OR REPLACE] [{TEMPORARY | TEMP}] AGGREGATE FUNCTION [IF NOT EXISTS]
function_path ([function_parameter[, ...]])
[RETURNS data_type]
AS (sql_function_body)
[OPTIONS (function_option_list)]

function_path:
  [[project_name.]dataset_name.]function_name

function_parameter:
  parameter_name data_type [NOT AGGREGATE]
Arguments
- OR REPLACE: Replaces any function with the same name if it exists. OR REPLACE can't appear with IF NOT EXISTS.
- IF NOT EXISTS: If any function exists with the same name, the CREATE statement has no effect. IF NOT EXISTS can't appear with OR REPLACE.
- TEMP or TEMPORARY: The function is temporary; that is, it exists for the lifetime of a single query, script, session, or procedure. A temporary function can't have the same name as a built-in function. If the names match, an error is produced. If TEMP or TEMPORARY is not included, a persistent function is created. You can reuse persistent functions across multiple queries.
- function_path: The path where the function must be created and the name of the function.
  - project_name: For persistent functions, the name of the project where you are creating the function. Defaults to the project that runs the DDL query. Don't include the project name for temporary functions.
  - dataset_name: For persistent functions, the name of the dataset where you are creating the function. Defaults to defaultDataset in the request. Don't include the dataset name for temporary functions.
  - function_name: The name of the function. Function names must contain only letters, numbers, and underscores, and be at most 256 characters long.
- function_parameter: A parameter for the function.
  - parameter_name: The name of the function parameter.
  - parameter_data_type: The GoogleSQL data type for the function parameter.
  - NOT AGGREGATE: The function parameter is not an aggregate. A non-aggregate function parameter can appear anywhere in the function definition.
- return_data_type: The GoogleSQL data type that the function should return. GoogleSQL infers the result data type of the function from the function body when the RETURNS clause is omitted.
- function_body: The SQL expression that defines the function body.
- function_option_list: A list of options for creating the function. For more information, see function_option_list.
function_option_list
The option list specifies options for creating a SQL UDAF. The following options are supported:
NAME | VALUE | Details |
|---|---|---|
description | STRING | A description of the UDAF. |
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.routines.create | The dataset where you create the function. |
In addition, the OR REPLACE clause requires the bigquery.routines.update permission.
Examples
Create and call a SQL UDAF
The following example shows a persistent SQL UDAF that includes a non-aggregate function parameter. Inside the function definition, the aggregate SUM method takes the aggregate function parameter dividend, while the non-aggregate division operator (/) takes the non-aggregate function parameter divisor.
CREATE AGGREGATE FUNCTION myProject.myDataset.ScaledSum(
  dividend FLOAT64,
  divisor FLOAT64 NOT AGGREGATE)
RETURNS FLOAT64
AS (
  SUM(dividend) / divisor
);

-- Call the SQL UDAF.
SELECT ScaledSum(col1, 2) AS scaled_sum
FROM (
  SELECT 1 AS col1
  UNION ALL
  SELECT 3 AS col1
  UNION ALL
  SELECT 5 AS col1
);

/*------------*
 | scaled_sum |
 +------------+
 | 4.5        |
 *------------*/
CREATE AGGREGATE FUNCTION statement (JavaScript)
Creates a new JavaScript user-defined aggregate function (UDAF).
Syntax
To create a JavaScript UDAF, use the following syntax:
CREATE [OR REPLACE] [{TEMPORARY | TEMP}] AGGREGATE FUNCTION [IF NOT EXISTS]
function_path ([function_parameter[, ...]])
RETURNS return_data_type
LANGUAGE js
[OPTIONS (function_option_list)]
AS function_body

function_path:
  [[project_name.]dataset_name.]function_name

function_parameter:
  parameter_name parameter_data_type [NOT AGGREGATE]
Arguments
- OR REPLACE: Replaces any function with the same name if it exists. OR REPLACE can't appear with IF NOT EXISTS.
- IF NOT EXISTS: If any function exists with the same name, the CREATE statement has no effect. IF NOT EXISTS can't appear with OR REPLACE.
- TEMP or TEMPORARY: The function is temporary; that is, it exists for the lifetime of a single query, script, session, or procedure. A temporary function can't have the same name as a built-in function. If the names match, an error is produced. If TEMP or TEMPORARY is not included, a persistent function is created. You can reuse persistent functions across multiple queries.
- function_path: The path where the function must be created and the name of the function.
  - project_name: For persistent functions, the name of the project where you are creating the function. Defaults to the project that runs the DDL query. Don't include the project name for temporary functions.
  - dataset_name: For persistent functions, the name of the dataset where you are creating the function. Defaults to defaultDataset in the request. Don't include the dataset name for temporary functions.
  - function_name: The name of the function. Function names must contain only letters, numbers, and underscores, and be at most 256 characters long.
- function_parameter: A parameter for the function.
  - parameter_name: The name of the function parameter.
  - parameter_data_type: The GoogleSQL data type for the function parameter.
  - NOT AGGREGATE: The function parameter is not an aggregate. Only one non-aggregate function parameter is allowed per JavaScript UDAF, and it must be the last parameter in the list.
- return_data_type: The GoogleSQL data type that the function should return.
- function_body: The JavaScript expression that defines the function body. For more information, see function_body.
- function_option_list: A list of options for creating the function. For more information, see function_option_list.
function_body
The body of the JavaScript function must be a quoted string literal that represents the JavaScript code. To learn more about the different types of quoted string literals you can use, see Formats for quoted literals.
Only certain type encodings are allowed. To learn more, see SQL type encodings in a JavaScript UDAF.
The JavaScript function body must include four JavaScript functions that initialize, aggregate, merge, and finalize the results for the JavaScript UDAF. To learn more about the initialState, aggregate, merge, and finalize JavaScript functions, see Required aggregation functions in a JavaScript UDAF.
Only serialized data can be passed into the JavaScript aggregate functions. If you need to serialize data such as functions or symbols to pass them into the aggregate functions, use the JavaScript serialization functions. For more information, see Serialization functions for a JavaScript UDAF.
function_option_list
The option list specifies options for creating a JavaScript UDAF. The following options are supported:
NAME | VALUE | Details |
|---|---|---|
description | STRING | A description of the UDAF. |
library | ARRAY<STRING> | An array of JavaScript libraries to include in the JavaScript UDAF function body. Example: |
SQL type encodings in a JavaScript UDAF
In JavaScript UDAFs, GoogleSQL data types represent JavaScript data types in the following manner:
| GoogleSQL data type | JavaScript data type | Notes |
|---|---|---|
ARRAY | Array | An array of arrays is not supported. To get around this limitation, use the Array<Object<Array>> (JavaScript) and ARRAY<STRUCT<ARRAY>> (GoogleSQL) data types. |
BIGNUMERIC | Number or String | Same as NUMERIC. |
BOOL | Boolean | |
BYTES | Uint8Array | |
DATE | Date | |
FLOAT64 | Number | |
INT64 | BigInt | |
JSON | Various types | The GoogleSQL JSON data type can be converted into a JavaScript Object, Array, or other GoogleSQL-supported JavaScript data type. |
NUMERIC | Number or String | If a NUMERIC value can be represented exactly as an IEEE 754 floating-point value (range [-2^53, 2^53]) and has no fractional part, it is encoded as a Number data type, otherwise it is encoded as a String data type. |
STRING | String | |
STRUCT | Object | Each STRUCT field is a named property in the Object data type. An unnamed STRUCT field is not supported. |
TIMESTAMP | Date | Date contains a microsecond field with the microsecond fraction of TIMESTAMP. |
Required aggregation functions in a JavaScript UDAF
The JavaScript function body must include the following exportable JavaScript functions:
- initialState function: Sets up the initial aggregation state of the UDAF and then returns the initial aggregation state.

  Syntax:

  export function initialState([nonAggregateParam]) {...}

  Parameters:

  - nonAggregateParam: Replace this parameter with a NOT AGGREGATE function parameter name.

  Examples:

  export function initialState() {...}
  export function initialState(initialSum) {...}

- aggregate function: Aggregates one row of data, updating state to store the result of the aggregation. Doesn't return a value.

  Syntax:

  export function aggregate(state, aggregateParam[, ...][, nonAggregateParam]) {...}

  Parameters:

  - state: The aggregate state, which is initialState on the first invocation, and then the state from the previous call to aggregate thereafter.
  - aggregateParam: The name of an aggregation parameter in the JavaScript UDAF. The argument for this parameter will be aggregated.
  - nonAggregateParam: Replace with a NOT AGGREGATE function parameter name.

  Example:

  export function aggregate(currentState, aggX, aggWeight, initialSum)

- merge function: Combines two aggregation states from a prior call to the aggregate, merge, or initialState function. This function does not return a value.

  Syntax:

  export function merge(state, partialState[, nonAggregateParam]) {...}

  Parameters:

  - state: The state into which partialState is merged.
  - partialState: The second aggregation state to merge.
  - nonAggregateParam: Replace with a NOT AGGREGATE function parameter name.

  Details:

  Depending on the size and organization of the underlying data being queried, the merge function might or might not be called. For example, if a particular set of data is small, or the data is partitioned in a way that results in small sets of data, the merge function won't be called.

  Example:

  export function merge(currentState, partialState, initialSum)

- finalize function: Computes the final aggregation result and then returns this result for the UDAF.

  Syntax:

  export function finalize(state[, nonAggregateParam]) {...}

  Parameters:

  - state: The final aggregation state.
  - nonAggregateParam: Replace with a NOT AGGREGATE function parameter name.

  The final aggregation state is returned by the merge function (or the aggregate function if merge is never invoked). If the input is empty after NULL filtering, the final aggregation state is initialState.

  Example:

  export function finalize(finalState, initialSum)
Serialization functions for a JavaScript UDAF
If you want to work with non-serializable aggregation states, the JavaScript UDAF must provide the serialize and deserialize functions:
- serialize function: Converts an aggregation state into a BigQuery-serializable object. An object in JavaScript is BigQuery-serializable if all fields are a JavaScript primitive data type (for example, String, Number, null, undefined), another BigQuery-serializable object, or a JavaScript Array where all elements are either primitives or BigQuery-serializable objects.

  Syntax:

  export function serialize(state[, nonAggregateParam]) {...}

  Arguments:

  - state: The aggregation state to serialize.
  - nonAggregateParam: Replace with a NOT AGGREGATE function parameter name.

  Example:

  export function serialize(stateToSerialize, initialSum)

- deserialize function: Converts a serialized state into an aggregation state. An aggregation state can be passed into the serialize, aggregate, merge, and finalize functions.

  Syntax:

  export function deserialize(serializedState[, nonAggregateParam]) {...}

  Arguments:

  - serializedState: The serialized state to convert into the aggregation state.
  - nonAggregateParam: Replace with a NOT AGGREGATE function parameter name.

  Example:

  export function deserialize(stateToDeserialize, initialSum)
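Putting these together, the following sketch shows a UDAF whose state holds a JavaScript Set, which is not BigQuery-serializable, so it provides serialize and deserialize to round-trip the state through a plain array; the function name and logic are illustrative:

CREATE TEMP AGGREGATE FUNCTION CountDistinctPositive(x FLOAT64)
RETURNS FLOAT64
LANGUAGE js
AS r'''
  export function initialState() {
    // A Set is not BigQuery-serializable, so serialize/deserialize are required.
    return {seen: new Set()};
  }
  export function aggregate(state, x) {
    if (x > 0) state.seen.add(x);
  }
  export function merge(state, partialState) {
    for (const v of partialState.seen) state.seen.add(v);
  }
  export function finalize(state) {
    return state.seen.size;
  }
  // Convert the Set into a plain array that BigQuery can serialize.
  export function serialize(state) {
    return {seen: Array.from(state.seen)};
  }
  // Rebuild the Set from the serialized array.
  export function deserialize(serializedState) {
    return {seen: new Set(serializedState.seen)};
  }
''';

-- Call the JavaScript UDAF.
SELECT CountDistinctPositive(x) AS distinct_positive
FROM UNNEST([1.0, 1.0, 2.0, -3.0]) AS x;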
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.routines.create | The dataset where you create the function. |

In addition, the OR REPLACE clause requires the bigquery.routines.update permission.
Examples
Calculate the positive sum of all rows
A JavaScript UDAF is similar to a JavaScript UDF, but defines an aggregate function instead of a scalar function. In the following example, a temporary JavaScript UDAF calculates the sum of all rows that have a positive value. The JavaScript UDAF body is quoted within a raw string:
CREATE TEMP AGGREGATE FUNCTION SumPositive(x FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS r'''
  export function initialState() {
    return {sum: 0};
  }
  export function aggregate(state, x) {
    if (x > 0) {
      state.sum += x;
    }
  }
  export function merge(state, partialState) {
    state.sum += partialState.sum;
  }
  export function finalize(state) {
    return state.sum;
  }
''';

-- Call the JavaScript UDAF.
WITH numbers AS (
  SELECT * FROM UNNEST([1.0, -1.0, 3.0, -3.0, 5.0, -5.0]) AS x)
SELECT SumPositive(x) AS sum FROM numbers;

/*-----*
 | sum |
 +-----+
 | 9.0 |
 *-----*/
Get the weighted average of all rows
A JavaScript UDAF can have aggregate and non-aggregate parameters. In the following example, the JavaScript UDAF calculates the weighted average for x after starting with an initial sum (initialSum). x and weight are aggregate parameters, and initialSum is a non-aggregate parameter:
CREATE OR REPLACE AGGREGATE FUNCTION my_project.my_dataset.WeightedAverage(
  x INT64,
  weight FLOAT64,
  initialSum FLOAT64 NOT AGGREGATE)
RETURNS INT64
LANGUAGE js AS '''
  export function initialState(initialSum) {
    return {count: 0, sum: initialSum};
  }
  export function aggregate(state, x, weight) {
    state.count += 1;
    state.sum += Number(x) * weight;
  }
  export function merge(state, partialState) {
    state.sum += partialState.sum;
    state.count += partialState.count;
  }
  export function finalize(state) {
    return state.sum / state.count;
  }
''';

SELECT my_project.my_dataset.WeightedAverage(item, weight, 2) AS weighted_average
FROM (
  SELECT 1 AS item, 2.45 AS weight
  UNION ALL
  SELECT 3 AS item, 0.11 AS weight
  UNION ALL
  SELECT 5 AS item, 7.02 AS weight);

/*------------------*
 | weighted_average |
 +------------------+
 | 13               |
 *------------------*/
CREATE TABLE FUNCTION statement
Creates a new table function, also called a table-valued function (TVF).
Syntax
CREATE [OR REPLACE] TABLE FUNCTION [IF NOT EXISTS]
[[project_name.]dataset_name.]function_name
  ([function_parameter[, ...]])
  [RETURNS TABLE <column_declaration[, ...]>]
  [OPTIONS(table_function_options_list)]
  AS sql_query

function_parameter:
  parameter_name { data_type | ANY TYPE | TABLE <column_declaration[, ...]> }

column_declaration:
  column_name data_type
Arguments
- OR REPLACE: Replaces any table function with the same name if it exists. Cannot appear with IF NOT EXISTS.
- IF NOT EXISTS: If any table function exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
- project_name: The name of the project where you are creating the function. Defaults to the project that runs this DDL statement.
- dataset_name: The name of the dataset where you are creating the function.
- function_name: The name of the function to create.
- function_parameter: A parameter for the function, specified as a parameter name and a data type. The value of data_type is a scalar BigQuery data type or ANY TYPE, or a table schema. Using table parameters in a table function is in Preview.
- RETURNS TABLE: The schema of the table that the function returns, specified as a comma-separated list of column name and data type pairs. If RETURNS TABLE is absent, BigQuery infers the output schema from the query statement in the function body. If RETURNS TABLE is included, the names in the returned table type must match column names from the SQL query.
- sql_query: Specifies the SQL query to run. The SQL query must include names for all columns.
table_function_options_list
The table_function_options_list lets you specify table function options. Table function options have the same syntax and requirements as table options but with a different list of NAMEs and VALUEs:

| NAME | VALUE | Details |
|---|---|---|
| description | STRING | The description of the table function. |
Details
BigQuery coerces argument types when possible. For example, if the parameter type is FLOAT64 and you pass an INT64 value, then BigQuery coerces it to a FLOAT64.

If a parameter type is ANY TYPE, the function accepts an input of any type for this argument. The type that you pass to the function must be compatible with the function definition. If you pass an argument with an incompatible type, the query returns an error. If more than one parameter has type ANY TYPE, BigQuery does not enforce any type relationship between them. If the parameter type is a table schema, the function accepts an input table containing a superset of those columns, in any order.
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.routines.create | The dataset where you create the table function. |

In addition, the OR REPLACE clause requires the bigquery.routines.update permission.
Examples
The following table function takes an INT64 parameter that is used to filter the results of a query:
CREATE OR REPLACE TABLE FUNCTION mydataset.names_by_year(y INT64)
AS
  SELECT year, name, SUM(number) AS total
  FROM `bigquery-public-data.usa_names.usa_1910_current`
  WHERE year = y
  GROUP BY year, name
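You can then call the table function in the FROM clause of a query. For example:

SELECT *
FROM mydataset.names_by_year(1950)
ORDER BY total DESC
LIMIT 5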
The following example specifies the return TABLE type in the RETURNS clause:
CREATE OR REPLACE TABLE FUNCTION mydataset.names_by_year(y INT64)
RETURNS TABLE<name STRING, year INT64, total INT64>
AS
  SELECT year, name, SUM(number) AS total
  FROM `bigquery-public-data.usa_names.usa_1910_current`
  WHERE year = y
  GROUP BY year, name
The following example computes total sales for items with the name item_name from the orders table:
CREATE TABLE FUNCTION mydataset.compute_sales(
  orders TABLE<item STRING, sales INT64>,
  item_name STRING)
AS (
  SELECT SUM(sales) AS total_sales, item
  FROM orders
  WHERE item = item_name
  GROUP BY item
);
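To call a table function that takes a table parameter, you can pass an existing table with the TABLE keyword or pass a subquery. The following sketch assumes a table mydataset.order_data (a hypothetical name) whose columns include item and sales:

SELECT *
FROM mydataset.compute_sales(TABLE mydataset.order_data, 'custom_widget');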
CREATE PROCEDURE statement
Creates a new procedure, which is a block of statements that can be called from other queries. Procedures can call themselves recursively.
Syntax
To create a GoogleSQL stored procedure, use the following syntax:
CREATE [OR REPLACE] PROCEDURE [IF NOT EXISTS]
[[project_name.]dataset_name.]procedure_name(procedure_argument[, ...])
[OPTIONS(procedure_option_list)]
BEGIN
  multi_statement_query
END;

procedure_argument:
  [procedure_argument_mode] argument_name argument_type

procedure_argument_mode:
  IN | OUT | INOUT
To create a stored procedure for Apache Spark, use the following syntax:
CREATE [OR REPLACE] PROCEDURE [IF NOT EXISTS]
[[project_name.]dataset_name.]procedure_name(procedure_argument[, ...])
[EXTERNAL SECURITY external_security]
WITH CONNECTION connection_project_id.connection_region.connection_id
[OPTIONS(procedure_option_list)]
LANGUAGE language
[AS pyspark_code]

procedure_argument:
  [procedure_argument_mode] argument_name argument_type

procedure_argument_mode:
  IN | OUT | INOUT

external_security:
  INVOKER
Arguments
- OR REPLACE: Replaces any procedure with the same name if it exists. Cannot appear with IF NOT EXISTS.
- IF NOT EXISTS: If any procedure exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
- project_name: The name of the project where you are creating the procedure. Defaults to the project that runs this DDL query. If the project name contains special characters such as colons, quote it in backticks (example: `google.com:my_project`).
- dataset_name: The name of the dataset where you are creating the procedure. Defaults to the defaultDataset in the request.
- procedure_name: The name of the procedure to create.
- external_security: The procedure is executed with the privileges of the user that calls it.
- connection_project_id: The project that contains the connection to run Spark procedures, for example, myproject.
- connection_region: The region that contains the connection to run Spark procedures, for example, us.
- connection_id: The connection ID, for example, myconnection. When you view the connection details in the Google Cloud console, the connection ID is the value in the last section of the fully qualified connection ID that is shown in Connection ID, for example, projects/myproject/locations/connection_location/connections/myconnection. For more information, see Create a stored procedure for Apache Spark.
- multi_statement_query: The multi-statement query to run.
- language: The language in which the stored procedure for Apache Spark is written. BigQuery supports stored procedures for Apache Spark that are written in Python, Java, or Scala.
- pyspark_code: The PySpark code for the stored procedure for Apache Spark if you want to pass the body of the procedure inline. Cannot appear with main_file_uri in procedure_option_list.
- argument_type: Any valid BigQuery type.
- procedure_argument_mode: Specifies whether an argument is an input, an output, or both.
procedure_option_list
The procedure_option_list lets you specify procedure options. Procedure options have the same syntax and requirements as table options but with a different list of NAMEs and VALUEs:
| NAME | VALUE | Details |
|---|---|---|
| strict_mode | BOOL | Example: strict_mode=FALSE. If strict_mode is TRUE, the procedure body undergoes additional checks for errors such as non-existent tables or columns, and the CREATE PROCEDURE statement fails if the body fails any of these checks. Strict mode is useful for catching many common types of errors, but the checks are not exhaustive, and successful creation of a procedure with strict_mode doesn't guarantee that the procedure will successfully execute at runtime. If strict_mode is FALSE, the procedure body is checked only for syntax. Procedures that invoke themselves recursively should be created with strict_mode=FALSE to avoid errors caused by the procedure not yet existing while it's being validated. The default value is TRUE. |
| description | STRING | A description of the procedure. Example: description="A procedure that runs a query." |
| engine | STRING | The engine type for processing stored procedures for Apache Spark. Must be specified for stored procedures for Spark. Valid value: engine="SPARK". |
| runtime_version | STRING | The runtime version of stored procedures for Spark. If not specified, the system default runtime version is used. Stored procedures for Spark support the same list of runtime versions as Serverless for Apache Spark. However, we recommend that you specify a runtime version. For more information, see Serverless for Apache Spark runtime releases. Example: runtime_version="1.1". |
| container_image | STRING | Custom container image for the runtime environment of the stored procedure for Spark. If not specified, the system default container image that includes the default Spark, Java, and Python packages associated with a runtime version is used. You can provide a custom container Docker image that includes your own built Java or Python dependencies. Because Spark is mounted into your custom container at runtime, you must omit Spark from your custom container image. For optimized performance, we recommend that you host your image in Artifact Registry. For more information, see Use custom containers with Serverless for Apache Spark. |
| properties | ARRAY<STRUCT<STRING, STRING>> | Key-value pairs that set properties for stored procedures for Spark, in the format properties=[("key", "value"), ...]. Stored procedures for Spark support most Spark properties and a list of Serverless for Apache Spark properties. If you specify unsupported Spark properties, such as YARN-related Spark properties, BigQuery fails to create the stored procedure. For example, the following dry run fails: bq query --nouse_legacy_sql --dry_run 'CREATE PROCEDURE my_bq_project.my_dataset.spark_proc() WITH CONNECTION `my-project-id.us.my-connection` OPTIONS(engine="SPARK", main_file_uri="gs://my-bucket/my-pyspark-main.py", properties=[("spark.executor.instances", "3"), ("spark.yarn.am.memory", "3g")]) LANGUAGE PYTHON'. It returns: Error in query string: Invalid value: Invalid properties: Attempted to set unsupported properties: [spark:spark.yarn.am.memory] at [1:1] |
| main_file_uri | STRING | The Cloud Storage URI of the main Python, Scala, or Java JAR file of the Spark application. Applies only to stored procedures for Spark. Alternatively, if the body of the stored procedure is written in Python, you can pass the code inline in the AS pyspark_code clause instead of setting main_file_uri. Example: main_file_uri="gs://my-bucket/my-pyspark-main.py". For Scala and Java languages, this field contains a path to only one JAR file. You can set only one value for main_file_uri. Example: main_file_uri="gs://my-bucket/my-scala-main.jar". |
| main_class | STRING | Applies only to stored procedures for Spark written in Java and Scala. Specify a fully qualified class name in a JAR file set with the jar_uris option. Example: main_class="com.example.wordcount". |
| py_file_uris | ARRAY<STRING> | Python files to be placed on the PYTHONPATH of the PySpark application. Optional. Cloud Storage URIs of Python files to pass to the PySpark framework. Supported file formats include .py, .egg, and .zip. Example: py_file_uris=["gs://my-bucket/my-pyspark-file1.py", "gs://my-bucket/my-pyspark-file2.py"]. |
| jar_uris | ARRAY<STRING> | Path to the JAR files to include on the driver and executor classpaths. Applies only to stored procedures for Apache Spark. Optional. Cloud Storage URIs of JAR files to add to the classpath of the Spark driver and tasks. Example: jar_uris=["gs://my-bucket/my-lib1.jar", "gs://my-bucket/my-lib2.jar"]. |
| file_uris | ARRAY<STRING> | Files to be placed in the working directory of each executor. Applies only to stored procedures for Apache Spark. Optional. Cloud Storage URIs of files to be placed in the working directory of each executor. Example: file_uris=["gs://my-bucket/my-file1", "gs://my-bucket/my-file2"]. |
| archive_uris | ARRAY<STRING> | Archive files to be extracted into the working directory of each executor. Applies only to stored procedures for Apache Spark. Optional. Cloud Storage URIs of archive files to be extracted into the working directory of each executor. Supported file formats include .jar, .tar, .tar.gz, .tgz, and .zip. Example: archive_uris=["gs://my-bucket/my-archive1.zip", "gs://my-bucket/my-archive2.zip"]. |
Argument mode
IN indicates that the argument is only an input to the procedure. You can specify either a variable or a value expression for IN arguments.

OUT indicates that the argument is an output of the procedure. An OUT argument is initialized to NULL when the procedure starts. You must specify a variable for OUT arguments.

INOUT indicates that the argument is both an input to and an output from the procedure. You must specify a variable for INOUT arguments. An INOUT argument can be referenced in the body of a procedure as a variable and assigned new values.

If none of IN, OUT, or INOUT is specified, the argument is treated as an IN argument.
Variable scope
If a variable is declared outside a procedure, passed as an INOUT or OUT argument to a procedure, and the procedure assigns a new value to that variable, that new value is visible outside of the procedure.
Variables declared in a procedure are not visible outside of the procedure,and vice versa.
An OUT or INOUT argument can be assigned a value using SET, in which case the modified value is visible outside of the procedure. If the procedure exits successfully, then the value of the OUT or INOUT argument is the final value assigned to that variable.

Temporary tables exist for the duration of the script, so if a procedure creates a temporary table, the caller of the procedure can reference the temporary table as well.
Default project in procedure body
Procedure bodies can reference entities without specifying the project; the default project is the project that owns the procedure, not necessarily the project used to run the CREATE PROCEDURE statement. Consider the following sample query.

CREATE PROCEDURE myProject.myDataset.QueryTable()
BEGIN
  SELECT * FROM anotherDataset.myTable;
END;

After creating the above procedure, you can run the query CALL myProject.myDataset.QueryTable(). Regardless of the project you choose to run this CALL query, the referenced table anotherDataset.myTable is always resolved against project myProject.
Required permissions
This statement requires the following IAM permission:

| Permission | Resource |
|---|---|
| bigquery.routines.create | The dataset where you create the procedure. |

To create a stored procedure for Apache Spark, additional IAM permissions are needed:

| Permission | Resource |
|---|---|
| bigquery.connections.delegate | The connection which you use to create the stored procedure for Apache Spark. |

In addition, the OR REPLACE clause requires the bigquery.routines.update permission.
SQL examples
You can also see examples of stored procedures for Apache Spark.
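As a minimal sketch, the following creates a stored procedure for Spark with an inline Python body; the project, dataset, connection, runtime version, and table names are placeholders that you would replace with your own:

CREATE OR REPLACE PROCEDURE my_bq_project.my_dataset.spark_proc()
WITH CONNECTION `my-project-id.us.my-connection`
OPTIONS (engine = 'SPARK', runtime_version = '1.1')
LANGUAGE PYTHON AS r'''
from pyspark.sql import SparkSession

# Read a BigQuery table and show a few rows.
spark = SparkSession.builder.appName("spark-proc").getOrCreate()
df = spark.read.format("bigquery").option("table", "my_dataset.my_table").load()
df.show()
'''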
The following example creates a SQL procedure that both takes x as an input argument and returns x as output; because no argument mode is present for the argument delta, it is an input argument. The procedure consists of a block containing a single statement, which assigns the sum of the two input arguments to x.

CREATE PROCEDURE mydataset.AddDelta(INOUT x INT64, delta INT64)
BEGIN
  SET x = x + delta;
END;

The following example calls the AddDelta procedure from the example above, passing it the variable accumulator both times; because the changes to x within AddDelta are visible outside of AddDelta, these procedure calls increment accumulator by a total of 8.

DECLARE accumulator INT64 DEFAULT 0;
CALL mydataset.AddDelta(accumulator, 5);
CALL mydataset.AddDelta(accumulator, 3);
SELECT accumulator;

This returns the following:

+-------------+
| accumulator |
+-------------+
| 8           |
+-------------+

The following example creates the procedure SelectFromTablesAndAppend, which takes target_date as an input argument and returns rows_added as an output. The procedure creates a temporary table DataForTargetDate from a query; then, it calculates the number of rows in DataForTargetDate and assigns the result to rows_added. Next, it inserts a new row into TargetTable, passing the value of target_date as one of the column names. Finally, it drops the table DataForTargetDate and returns rows_added.

CREATE PROCEDURE mydataset.SelectFromTablesAndAppend(
  target_date DATE, OUT rows_added INT64)
BEGIN
  CREATE TEMP TABLE DataForTargetDate AS
  SELECT t1.id, t1.x, t2.y
  FROM dataset.partitioned_table1 AS t1
  JOIN dataset.partitioned_table2 AS t2
  ON t1.id = t2.id
  WHERE t1.date = target_date
    AND t2.date = target_date;

  SET rows_added = (SELECT COUNT(*) FROM DataForTargetDate);

  SELECT id, x, y, target_date  -- note that target_date is a parameter
  FROM DataForTargetDate;

  DROP TABLE DataForTargetDate;
END;

The following example declares a variable rows_added, then passes it as an argument to the SelectFromTablesAndAppend procedure from the previous example, along with the value of CURRENT_DATE; then it returns a message stating how many rows were added.

DECLARE rows_added INT64;
CALL mydataset.SelectFromTablesAndAppend(CURRENT_DATE(), rows_added);
SELECT FORMAT('Added %d rows', rows_added);

CREATE ROW ACCESS POLICY statement
Creates or replaces a row-level access policy. Row-level access policies on a table must have unique names.
Syntax
CREATE [OR REPLACE] ROW ACCESS POLICY [IF NOT EXISTS]
row_access_policy_name ON table_name
[GRANT TO (grantee_list)]
FILTER USING (filter_expression);

Arguments

- IF NOT EXISTS: If any row-level access policy exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
- row_access_policy_name: The name of the row-level access policy that you are creating. The row-level access policy name must be unique for each table. The row-level access policy name can contain the following:
  - Up to 256 characters.
  - Letters (upper or lowercase), numbers, and underscores. Must start with a letter.
- table_name: The name of the table that you want to create a row-level access policy for. The table must already exist.
- GRANT TO grantee_list: An optional clause that specifies the initial members that the row-level access policy should be created with. Caution: If no grantee_list is provided, then the row-level access policy for the specified filter is initialized with no principals; this configuration prevents all data reads by everyone. grantee_list is a list of iam_member users or groups. Strings must be valid IAM principals, or members, following the format of an IAM Policy Binding member, and must be quoted. The following types are supported:
  - user:{emailid}: An email address that represents a specific Google account. Example: user:alice@example.com
  - serviceAccount:{emailid}: An email address that represents a service account. Example: serviceAccount:my-other-app@appspot.gserviceaccount.com
  - group:{emailid}: An email address that represents a Google group. Example: group:admins@example.com
  - domain:{domain}: The Google Workspace domain (primary) that represents all the users of that domain. Example: domain:example.com
  - allAuthenticatedUsers: A special identifier that represents all service accounts and all users on the internet who have authenticated with a Google Account. This identifier includes accounts that aren't connected to a Google Workspace or Cloud Identity domain, such as personal Gmail accounts. Users who aren't authenticated, such as anonymous visitors, aren't included.
  - allUsers: A special identifier that represents anyone who is on the internet, including authenticated and unauthenticated users. Because BigQuery requires authentication before a user can access the service, allUsers includes only authenticated users.

  You can combine a series of iam_member values if they are comma-separated and quoted separately. For example: "user:alice@example.com","group:admins@example.com","user:sales@example.com"
- filter_expression: Defines the subset of table rows to show only to the members of the grantee_list. The filter_expression is similar to the WHERE clause in a SELECT query. The following are valid filter expressions:
  - GoogleSQL scalar functions.
  - SESSION_USER(), to restrict access only to rows that belong to the user running the query. If none of the row-level access policies are applicable to the querying user, then the user has no access to the data in the table.
  - TRUE. Grants the principals in the grantee_list field access to all rows of the table.

The filter expression cannot contain the following:

- References to other tables, such as subqueries.
- SQL statements such as SELECT, CREATE, or UPDATE.
- User-defined functions.
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.rowAccessPolicies.create | The target table. |
| bigquery.rowAccessPolicies.setIamPolicy | The target table. |
| bigquery.tables.getData | The target table. |
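Examples

The following example creates a row-level access policy that lets only members of a sales group see rows for the US region; the table, column, and group names are illustrative:

CREATE ROW ACCESS POLICY us_filter
ON mydataset.my_table
GRANT TO ('group:sales@example.com')
FILTER USING (region = 'US');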
CREATE CAPACITY statement
Purchases slots by creating a new capacity commitment.

Caution: Before you purchase slots, understand the details of the commitment plans and pricing.

Syntax

CREATE CAPACITY `project_id.location_id.commitment_id`
OPTIONS (capacity_commitment_option_list);
Arguments
- project_id: The project ID of the administration project that will maintain ownership of this commitment.
- location_id: The location of the commitment.
- commitment_id: The ID of the commitment. The value must be unique to the project and location. It must start and end with a lowercase letter or a number and contain only lowercase letters, numbers, and dashes.
- capacity_commitment_option_list: The options you can set to describe the capacity commitment.
capacity_commitment_option_list
The option list specifies options for the capacity commitment. Specify the options in the following format: NAME=VALUE, ...
The following options are supported:
| NAME | TYPE | Details |
|---|---|---|
| plan | String | The commitment plan to purchase. Supported values include: ANNUAL, THREE_YEAR, and TRIAL. For more information, see Commitment plans. |
| renewal_plan | String | The commitment renewal plan. Applies only when plan is ANNUAL, THREE_YEAR, or TRIAL. For more information, see Renewing commitments. |
| slot_count | Integer | The number of slots in the commitment. |
| edition | String | The edition associated with this commitment. For more information about editions, see Introduction to BigQuery editions. |
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.capacityCommitments.create | The administration project that maintains ownership of the commitments. |
Example
The following example creates a capacity commitment of 100 annual slots that are located in the region-us region and managed by a project admin_project:
CREATE CAPACITY `admin_project.region-us.my-commitment`
OPTIONS (
  slot_count = 100,
  plan = 'ANNUAL');
CREATE RESERVATION statement
Creates a reservation. For more information, see Introduction to Reservations.
Syntax
CREATE RESERVATION `project_id.location_id.reservation_id`
OPTIONS (reservation_option_list);
Arguments
- project_id: The project ID of the administration project where the capacity commitment was created.
- location_id: The location of the reservation.
- reservation_id: The reservation ID.
- reservation_option_list: The options you can set to describe the reservation.
reservation_option_list
The option list specifies options for the reservation. Specify the options in the following format: NAME=VALUE, ...
The following options are supported:
| NAME | TYPE | Details |
|---|---|---|
| ignore_idle_slots | BOOLEAN | If the value is true, then the reservation uses only the slots that are provisioned to it. The default value is false. For more information, see Idle slots. |
| slot_capacity | INTEGER | The number of slots to allocate to the reservation. If this reservation was created with an edition, this is equivalent to the amount of baseline slots. |
| target_job_concurrency | INTEGER | A soft upper bound on the number of jobs that can run concurrently in this reservation. |
| edition | STRING | The edition associated with this reservation. For more information about editions, see Introduction to BigQuery editions. |
| autoscale_max_slots | INTEGER | The maximum number of slots that could be added to the reservation by autoscaling. |
| secondary_location | STRING | The secondary location to use in the case of disaster recovery. |
| max_slots | INTEGER | The maximum number of slots the reservation can consume. For more details about predictable reservations, see Reservation predictability. |
| scaling_mode | STRING | The scaling mode of the reservation. This value must be configured together with max_slots. Also, this value must be aligned with ignore_idle_slots. For details, see Reservation predictability. |
| labels | <ARRAY<STRUCT<STRING, STRING>>> | An array of labels for the reservation, expressed as key-value pairs. |
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.reservations.create | The administration project that maintains ownership of the commitments. |
Example
The following example creates a reservation of 100 slots in the project admin_project:
CREATE RESERVATION `admin_project.region-us.prod`
OPTIONS (slot_capacity = 100);
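For reservations that use editions, autoscaling can add slots on top of the baseline. The following sketch creates a reservation with 100 baseline slots that autoscaling can grow by up to 200 additional slots; the edition value and reservation name are illustrative:

CREATE RESERVATION `admin_project.region-us.autoscale_prod`
OPTIONS (
  edition = 'ENTERPRISE',
  slot_capacity = 100,
  autoscale_max_slots = 200);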
CREATE ASSIGNMENT statement
Assigns a project, folder, or organization to a reservation.
Syntax
CREATE ASSIGNMENT `project_id.location_id.reservation_id.assignment_id`
OPTIONS (assignment_option_list)
Arguments
- project_id: The project ID of the administration project where the reservation was created.
- location_id: The location of the reservation.
- reservation_id: The reservation ID.
- assignment_id: The ID of the assignment. The value must be unique to the project and location. It must start and end with a lowercase letter or a number and contain only lowercase letters, numbers, and dashes.
- assignment_option_list: The options you can set to describe the assignment.

To remove a project from any reservations and use on-demand billing instead, set reservation_id to none.
assignment_option_list
The option list specifies options for the assignment. Specify the options in the following format: NAME=VALUE, ...
The following options are supported:
| NAME | TYPE | Details |
|---|---|---|
| assignee | String | The ID of the project, folder, or organization to assign to the reservation. |
| job_type | String | The type of job to assign to this reservation. Supported values include QUERY, PIPELINE, ML_EXTERNAL, CONTINUOUS, and BACKGROUND. For more information, see Reservation assignments. |
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.reservationAssignments.create | The administration project and the assignee. |
Example
The following example assigns the project my_project to the prod reservation for query jobs:
CREATE ASSIGNMENT `admin_project.region-us.prod.my_assignment`
OPTIONS (
  assignee = 'projects/my_project',
  job_type = 'QUERY');
The following example assigns an organization to the prod reservation for pipeline jobs, such as load and extract jobs:
CREATE ASSIGNMENT `admin_project.region-us.prod.my_assignment`
OPTIONS (
  assignee = 'organizations/1234',
  job_type = 'PIPELINE');
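As noted above, to remove a project from any reservations and use on-demand billing instead, create an assignment with none as the reservation ID. A minimal sketch:

CREATE ASSIGNMENT `admin_project.region-us.none.my_assignment`
OPTIONS (
  assignee = 'projects/my_project',
  job_type = 'QUERY');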
CREATE SEARCH INDEX statement
Creates a new search index on one or more columns of a table.

A search index enables efficient queries using the SEARCH function.
Syntax
CREATE SEARCH INDEX [IF NOT EXISTS] index_name
ON table_name({ALL COLUMNS [WITH COLUMN OPTIONS (column[, ...])] | column[, ...]})
[OPTIONS (index_option_list)]

column:
  column_name [OPTIONS (index_column_option_list)]
Arguments
- IF NOT EXISTS: If there is already a search index by that name on the table, do nothing. If the table has a search index by a different name, then return an error.
- index_name: The name of the search index you're creating. Since the search index is always created in the same project and dataset as the base table, there is no need to specify these in the name.
- table_name: The name of the table. See Table path syntax.
- ALL COLUMNS: If data types are not specified, creates a search index on every column in the table which contains a STRING field. If data types are specified, creates a search index on every column in the table which matches any of the data types specified.
- WITH COLUMN OPTIONS: Can only be used with ALL COLUMNS to set options on specific indexed columns.
- column_name: The name of a top-level column in the table which is one of the following supported data types or contains a field with one of the supported data types:

| Supported data types | Notes |
|---|---|
| STRING | Primitive data type. |
| INT64 | Primitive data type. |
| TIMESTAMP | Primitive data type. |
| ARRAY<PRIMITIVE_DATA_TYPE> | Must contain a primitive data type in this list. |
| STRUCT or ARRAY<STRUCT> | Must contain at least one nested field that is a primitive data type in this list or ARRAY<PRIMITIVE_DATA_TYPE>. |
| JSON | Must contain at least one nested field of a type that matches any data types in this list. |

- index_column_option_list: The list of options to set on indexed columns.
- index_option_list: The list of options to set on the search index.
Details
You can create only one search index per base table. You cannot create a search index on a view or materialized view. To modify which columns are indexed, DROP the current index and create a new one.

BigQuery returns an error if any column_name is not a STRING or does not contain a STRING field, or if you call CREATE SEARCH INDEX on ALL COLUMNS of a table which contains no STRING fields.

Creating a search index fails on a table which has column ACLs or row filters; however, these may all be added to the table after creation of the index.
index_option_list
The option list specifies options for the search index. Specify the options in the following format: NAME=VALUE, ...
The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| analyzer | STRING | The text analyzer to use to generate tokens for the search index. The supported values are 'LOG_ANALYZER' (the default), 'NO_OP_ANALYZER', and 'PATTERN_ANALYZER'. Example: analyzer='NO_OP_ANALYZER'. |
| analyzer_options | JSON-formatted STRING | The text analyzer configurations to set when creating a search index. Supported when analyzer is equal to 'LOG_ANALYZER' or 'PATTERN_ANALYZER'. For examples of JSON-formatted strings with different text analyzers, see Work with text analyzers. |
| data_types | ARRAY<STRING> | An array of data types to set when creating a search index. Supported data types are 'STRING', 'INT64', and 'TIMESTAMP'. Example: data_types=['STRING', 'INT64']. |
| default_index_column_granularity | STRING | In Preview. The default granularity of information to store for each indexed column. The supported values are 'COLUMN' and 'GLOBAL'. Example: default_index_column_granularity='COLUMN'. |
index_column_option_list
| NAME | VALUE | Details |
|---|---|---|
| index_granularity | STRING | In Preview. The granularity of information to store for the indexed column. This setting overrides the default granularity specified in the default_index_column_granularity index option. The supported values are 'COLUMN' and 'GLOBAL'. Example: index_granularity='GLOBAL'. |
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.tables.createIndex | The base table where you create the index. |
Examples
The following example creates a search index called my_index on all string columns of my_table. In this case, the index is only created on column a.
CREATE TABLE dataset.my_table(a STRING, b INT64);

CREATE SEARCH INDEX my_index
ON dataset.my_table(ALL COLUMNS);
The following example creates a search index on columns a, my_struct.string_field, and b that uses the NO_OP_ANALYZER text analyzer. It sets the default index column granularity to COLUMN and overrides the setting for column a to GLOBAL.
CREATE TABLE dataset.complex_table(
  a STRING,
  my_struct STRUCT<string_field STRING, int_field INT64>,
  b ARRAY<STRING>);

CREATE SEARCH INDEX my_index
ON dataset.complex_table(
  a OPTIONS (index_granularity = 'GLOBAL'),
  my_struct,
  b)
OPTIONS (
  analyzer = 'NO_OP_ANALYZER',
  default_index_column_granularity = 'COLUMN');
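The following sketch uses the data_types option with ALL COLUMNS to index every column whose type matches the list; it uses a separate hypothetical table, since a base table can have only one search index:

CREATE TABLE dataset.typed_table(a STRING, b INT64, c TIMESTAMP);

CREATE SEARCH INDEX my_typed_index
ON dataset.typed_table(ALL COLUMNS)
OPTIONS (data_types = ['STRING', 'INT64']);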
CREATE VECTOR INDEX statement
Creates a new vector index on a column of a table.

A vector index lets you perform a vector search more quickly, with the trade-off of reducing recall and so returning more approximate results.
Syntax
CREATE [OR REPLACE] VECTOR INDEX [IF NOT EXISTS] index_name
ON table_name(column_name)
[STORING(stored_column_name[, ...])]
[PARTITION BY partition_expression]
OPTIONS (index_option_list);
Arguments
- OR REPLACE: Replaces any vector index with the same name if it exists. Can't appear with IF NOT EXISTS.
- IF NOT EXISTS: If there is already a vector index by that name on the table, do nothing. If the table has a vector index by a different name, then return an error.
- index_name: The name of the vector index you're creating. Since the index is always created in the same project and dataset as the base table, there is no need to specify these in the name.
- table_name: The name of the table. See Table path syntax.
- column_name: The name of a column with a type of ARRAY<FLOAT64>, or, if you're using autonomous embedding generation (Preview), a STRUCT<result ARRAY<FLOAT64>, status STRING> column. If the column type is ARRAY<FLOAT64>, then all elements in the array must be non-NULL, and all values in the column must have the same array dimensions.
- stored_column_name: The name of a top-level column in the table to store in the vector index. The column type can't be RANGE. Stored columns are not used if the table has a row-level access policy or the column has a policy tag. To learn more, see Store columns and pre-filter.
- partition_expression: An expression that determines how to partition the vector index. You can only partition TreeAH indexes. (Preview)
- index_option_list: The list of options to set on the vector index.
Details
You can only create vector indexes on standard tables.

You can create only one vector index per table. You can't create a vector index on a table that already has a search index with the same index name.

To modify which column is indexed, DROP the current index and create a new one.
index_option_list
The option list specifies options for the vector index. Specify the options in the following format: NAME=VALUE, ...
The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| index_type | STRING | Required. The algorithm to use to build the vector index. The supported values are IVF and TREE_AH. |
| distance_type | STRING | Specifies the default distance type to use when performing a vector search using this index. The supported values are EUCLIDEAN, COSINE, and DOT_PRODUCT. EUCLIDEAN is the default. The index creation itself always uses EUCLIDEAN distance for training. If you specify a value for the distance_type argument of the VECTOR_SEARCH function, that value is used instead of this option. |
| ivf_options | JSON-formatted STRING | The options to use with the IVF algorithm. Defaults to '{}' to denote that all underlying options use their corresponding default values. The only supported option is num_lists, which determines how many lists the IVF algorithm creates. During indexing, vectors are assigned to the list corresponding to their nearest cluster centroid. If you omit this argument, BigQuery determines a default value based on your data characteristics; the default value works well for most use cases. Example: ivf_options='{"num_lists": 2500}'. The statement fails if the JSON string contains unsupported options or invalid values. |
| tree_ah_options | JSON-formatted STRING | The options to use with the TREE_AH algorithm. Defaults to '{}' to denote that all underlying options use their corresponding default values. Two options are supported: leaf_node_embedding_count and normalization_type. For example, tree_ah_options='{"leaf_node_embedding_count": 1000, "normalization_type": "L2"}'. The statement fails if the JSON string contains unsupported options or invalid values. |
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.tables.createIndex | The table where you create the vector index. |

If you choose to use the OR REPLACE clause, you must also have the bigquery.tables.updateIndex permission.
Examples
The following example creates a vector index on the embedding column of my_table:
CREATE TABLE my_dataset.my_table(id INT64, embedding ARRAY<FLOAT64>);

CREATE VECTOR INDEX my_index
ON my_dataset.my_table(embedding)
OPTIONS (index_type = 'IVF');
The following example creates a vector index on the embedding column of my_table, and specifies the distance type to use and the IVF options:
CREATE TABLE my_dataset.my_table(id INT64, embedding ARRAY<FLOAT64>);

CREATE VECTOR INDEX my_index
ON my_dataset.my_table(embedding)
OPTIONS (
  index_type = 'IVF',
  distance_type = 'COSINE',
  ivf_options = '{"num_lists": 2500}');
The following example creates a vector index on the embedding column of my_table, and specifies the distance type to use and the TREE_AH options:
CREATE TABLE my_dataset.my_table(id INT64, embedding ARRAY<FLOAT64>);

CREATE VECTOR INDEX my_index
ON my_dataset.my_table(embedding)
OPTIONS (
  index_type = 'TREE_AH',
  distance_type = 'EUCLIDEAN',
  tree_ah_options = '{"normalization_type": "L2"}');
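After the index is created and populated, queries that call the VECTOR_SEARCH function over the indexed column can use it. The following is a minimal sketch of such a query; the query embedding literal is illustrative and must match the dimensions of the indexed vectors:

SELECT query.id, base.id AS neighbor_id, distance
FROM VECTOR_SEARCH(
  TABLE my_dataset.my_table,
  'embedding',
  (SELECT 1 AS id, [1.0, 2.0, 3.0] AS embedding),
  top_k => 5);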
CREATE DATA_POLICY statement
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
Note: To provide feedback or request support for this feature, send an email to bigquery-security@google.com.

Creates or replaces a data policy. The name of the data policy must be unique within the project.
Syntax
CREATE [OR REPLACE] DATA_POLICY [IF NOT EXISTS]
`project_id.region-location_id.data_policy_id`
OPTIONS (index_option_list);
Arguments
- OR REPLACE: Replaces any data policy with the same name if it exists. Can't appear with IF NOT EXISTS.
- IF NOT EXISTS: If there is already a data policy by that name in the project, the CREATE statement has no effect.
- project_id: The project ID of the project where the data policy will reside.
- location_id: The location of the data policy.
- data_policy_id: The name of the data policy, which is unique within the project that the data policy resides in.
- index_option_list: The list of options to set on the data policy.
index_option_list
The option list specifies options for the data policy. Specify the options in the following format: NAME=VALUE, ...
The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| data_policy_type | STRING | The supported values are DATA_MASKING_POLICY and RAW_DATA_ACCESS_POLICY. You can't update this field once the data policy has been created. |
| masking_expression | STRING | Specifies the predefined masking rule or a custom masking routine. |
Required permissions
The user or service account that creates a data policy must have the bigquery.dataPolicies.create permission.

The bigquery.dataPolicies.create permission is included in the BigQuery Data Policy Admin, BigQuery Admin, and BigQuery Data Owner roles.

If you are creating a data policy that references a custom masking routine, you also need routine permissions.
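Example

The following sketch creates a data masking policy that applies a predefined masking rule; the policy name and masking expression are illustrative:

CREATE DATA_POLICY `my_project.region-us.my_masking_policy`
OPTIONS (
  data_policy_type = 'DATA_MASKING_POLICY',
  masking_expression = 'ALWAYS_NULL');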
ALTER SCHEMA SET DEFAULT COLLATE statement
Sets collation specifications on a dataset.
Syntax
ALTER SCHEMA [IF EXISTS] [project_name.]dataset_name
SET DEFAULT COLLATE collate_specification
Arguments
- IF EXISTS: If no dataset exists with that name, the statement has no effect.
- DEFAULT COLLATE collate_specification: When a new table is created in the dataset, the table inherits a default collation specification unless a collation specification is explicitly specified for a column. The updated collation specification only applies to tables created afterwards.
- project_name: The name of the project that contains the dataset. Defaults to the project that runs this DDL statement.
- dataset_name: The name of the dataset.
- collate_specification: Specifies the collation specifications to set.
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.datasets.get | The dataset to alter. |
| bigquery.datasets.update | The dataset to alter. |
Example
Assume you have an existing table, mytable_a, in a dataset called mydataset. For example:

CREATE SCHEMA mydataset

CREATE TABLE mydataset.mytable_a (
  number INT64,
  word STRING
)

+----------------------+
| mydataset.mytable_a  |
| number INT64         |
| word STRING          |
+----------------------+

At a later time, you decide to add a collation specification to your dataset. For example:

ALTER SCHEMA mydataset
SET DEFAULT COLLATE 'und:ci'
If you create a new table for your dataset, it inherits COLLATE 'und:ci' for all STRING columns. For example, collation is added to characters when you create the mytable_b table in the mydataset dataset:

CREATE TABLE mydataset.mytable_b (
  amount INT64,
  characters STRING
)

+--------------------------------------+
| mydataset.mytable_b                  |
| amount INT64                         |
| characters STRING COLLATE 'und:ci'   |
+--------------------------------------+

However, although you have updated the collation specification for the dataset, your existing table, mytable_a, continues to use the previous collation specification. For example:

+---------------------+
| mydataset.mytable_a |
| number INT64        |
| word STRING         |
+---------------------+

ALTER SCHEMA SET OPTIONS statement
Sets options on a dataset.
The statement runs in the location of the dataset if the dataset exists, unless you specify the location in the query settings. For more information, see Specifying your location.
Syntax
ALTER SCHEMA [IF EXISTS] [project_name.]dataset_name
SET OPTIONS (schema_set_options_list)
Arguments
- IF EXISTS: If no dataset exists with that name, the statement has no effect.
- project_name: The name of the project that contains the dataset. Defaults to the project that runs this DDL statement.
- dataset_name: The name of the dataset.
- schema_set_options_list: The list of options to set.
schema_set_options_list
The option list specifies options for the dataset. Specify the options in the following format: NAME=VALUE, ...

The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| default_kms_key_name | STRING | Specifies the default Cloud KMS key for encrypting table data in this dataset. You can override this value when you create a table. |
| default_partition_expiration_days | FLOAT64 | Specifies the default expiration time, in days, for table partitions in this dataset. You can override this value when you create a table. |
| default_rounding_mode | STRING | Example: default_rounding_mode = "ROUND_HALF_EVEN". This specifies the default rounding mode that's used for values written to any new NUMERIC or BIGNUMERIC columns in newly created tables in the dataset. It doesn't affect existing tables. The supported values are "ROUND_HALF_AWAY_FROM_ZERO" and "ROUND_HALF_EVEN". |
| default_table_expiration_days | FLOAT64 | Specifies the default expiration time, in days, for tables in this dataset. You can override this value when you create a table. |
| description | STRING | The description of the dataset. |
| failover_reservation | STRING | Associates the dataset to a reservation in the case of a failover scenario. |
| friendly_name | STRING | A descriptive name for the dataset. |
| is_case_insensitive | BOOL | TRUE if the dataset and its table names are case-insensitive, otherwise FALSE. By default, this is FALSE, which means the dataset and its table names are case-sensitive. |
| is_primary | BOOLEAN | Declares if the dataset is the primary replica. |
| labels | <ARRAY<STRUCT<STRING, STRING>>> | An array of labels for the dataset, expressed as key-value pairs. |
| max_time_travel_hours | SMALLINT | Specifies the duration in hours of the time travel window for the dataset. The max_time_travel_hours value must be an integer expressed in multiples of 24 (48, 72, 96, 120, 144, 168) between 48 (2 days) and 168 (7 days). 168 hours is the default if this option isn't specified. |
| primary_replica | STRING | The replica name to set as the primary replica. |
| storage_billing_model | STRING | Alters the storage billing model for the dataset. Set the storage_billing_model value to PHYSICAL to use physical bytes when calculating storage charges, or to LOGICAL to use logical bytes. LOGICAL is the default. When you change a dataset's billing model, it takes 24 hours for the change to take effect. Once you change a dataset's storage billing model, you must wait 14 days before you can change the storage billing model again. |
| tags | <ARRAY<STRUCT<STRING, STRING>>> | An array of IAM tags for the dataset, expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.datasets.get | The dataset to alter. |
| bigquery.datasets.update | The dataset to alter. |
Examples
Setting the default table expiration for a dataset
The following example sets the default table expiration.
ALTER SCHEMA mydataset
SET OPTIONS (
  default_table_expiration_days = 3.75)
Turning on case insensitivity for a dataset
The following example turns on case insensitivity for the name of a dataset and the table names within that dataset.
ALTER SCHEMA mydataset
SET OPTIONS (
  is_case_insensitive = TRUE)
ALTER SCHEMA ADD REPLICA statement
Adds a replica to a schema (preview).
Syntax
ALTER SCHEMA [IF EXISTS] [project_name.]dataset_name
ADD REPLICA replica_name
[OPTIONS (add_replica_options_list)]
Arguments
- IF EXISTS: If no dataset exists with that name, the statement has no effect.
- dataset_name: The name of the dataset to alter. See Table path syntax.
- replica_name: The name of the new replica. Conventionally, this is the same as the location you are creating the replica in.
- add_replica_options_list: The list of options to set.
add_replica_options_list
The option list specifies options for the replica. Specify the options in the following format: NAME=VALUE, ...
The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| location | STRING | The location in which to create the replica. |
| replica_kms_key | STRING | The Cloud Key Management Service key set in the destination region. replica_kms_key is used as a substitute encryption key in the destination region for any keys used in the source region. Any table in the source region that's encrypted with a Cloud KMS key is encrypted with the replica_kms_key. This value must be a Cloud KMS key created in the replica dataset's region, not the source dataset's region. For more information about setting up a Cloud KMS key, see Grant encryption and decryption permission. |
Required permissions
To get the permissions that you need to manage replicas, ask your administrator to grant you the BigQuery Data Editor (roles/bigquery.dataEditor) IAM role on your schema. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.
Examples
The following example adds a secondary replica that is named EU in the EU multi-region to a schema that is named cross_region_dataset:

ALTER SCHEMA cross_region_dataset
ADD REPLICA `EU`
OPTIONS (location = 'eu');
ALTER SCHEMA DROP REPLICA statement
Drops a replica from a schema (preview).
Syntax
ALTER SCHEMA [IF EXISTS] dataset_name
DROP REPLICA replica_name

- IF EXISTS: If no dataset exists with that name, the statement has no effect.
- dataset_name: The name of the dataset to alter. See Table path syntax.
- replica_name: The name of the replica to drop.
Required permissions
To get the permissions that you need to manage replicas, ask your administrator to grant you the BigQuery Data Editor (roles/bigquery.dataEditor) IAM role on your schema. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.
Examples
The following example removes a replica that is located in the us-east4 region from the cross_region_dataset dataset:

ALTER SCHEMA cross_region_dataset
DROP REPLICA `us-east4`
ALTER TABLE SET OPTIONS statement
Sets the options on a table.
Syntax
ALTER TABLE [IF EXISTS] table_name
SET OPTIONS (table_set_options_list)
Arguments
- IF EXISTS: If no table exists with that name, the statement has no effect.
- table_name: The name of the table to alter. See Table path syntax.
- table_set_options_list: The list of options to set.
Details
This statement is not supported for external tables.
table_set_options_list
The option list lets you set table options such as a label and an expiration time. You can include multiple options using a comma-separated list.

Specify a table option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:
| NAME | VALUE | Details |
|---|---|---|
| expiration_timestamp | TIMESTAMP | Example: expiration_timestamp=TIMESTAMP "2030-01-01 00:00:00 UTC". This property is equivalent to the expirationTime table resource property. |
| partition_expiration_days | FLOAT64 | Example: partition_expiration_days=7. Sets the partition expiration in days. For more information, see Set the partition expiration. By default, partitions don't expire. This property is equivalent to the timePartitioning.expirationMs table resource property but uses days instead of milliseconds. One day is equivalent to 86400000 milliseconds, or 24 hours. This property can only be set if the table is partitioned. |
| require_partition_filter | BOOL | Example: require_partition_filter=true. Specifies whether queries on this table must include a predicate filter that filters on the partitioning column. For more information, see Set partition filter requirements. The default value is false. This property is equivalent to the timePartitioning.requirePartitionFilter table resource property. This property can only be set if the table is partitioned. |
| kms_key_name | STRING | Example: kms_key_name="projects/project_id/locations/location/keyRings/keyring/cryptoKeys/key". This property is equivalent to the encryptionConfiguration.kmsKeyName table resource property. See more details about Protecting data with Cloud KMS keys. |
| friendly_name | STRING | Example: friendly_name="my_table". This property is equivalent to the friendlyName table resource property. |
| description | STRING | Example: description="a table that expires in 2030". This property is equivalent to the description table resource property. |
| labels | ARRAY<STRUCT<STRING, STRING>> | Example: labels=[("org_unit", "development")]. This property is equivalent to the labels table resource property. |
| default_rounding_mode | STRING | Example: default_rounding_mode = "ROUND_HALF_EVEN". This specifies the default rounding mode that's used for values written to any new NUMERIC or BIGNUMERIC columns in the table. This property is equivalent to the defaultRoundingMode table resource property. |
| enable_change_history | BOOL | In preview. Example: enable_change_history=TRUE. Set this property to TRUE to capture change history on the table, which you can then view by using the CHANGES function. |
| max_staleness | INTERVAL | Example: max_staleness=INTERVAL "4:0:0" HOUR TO SECOND. The maximum interval behind the current time where it's acceptable to read stale data. For example, with change data capture, when this option is set, the table copy operation is denied if data is more stale than the max_staleness value. |
| enable_fine_grained_mutations | BOOL | In preview. Example: enable_fine_grained_mutations=TRUE. Set this property to TRUE to enable fine-grained mutations on the table. |
| storage_uri | STRING | In preview. Example: storage_uri="gs://mybucket/table_path". A fully qualified location prefix for the external folder where data is stored. Supports the gs://bucket/path format. Required for managed tables. |
| file_format | STRING | In preview. Example: file_format="PARQUET". The open-source file format in which the table data is stored. Only the PARQUET value is supported. Required for managed tables. The default is PARQUET. |
| table_format | STRING | In preview. Example: table_format="ICEBERG". The open table format in which metadata-only snapshots are stored. Only the ICEBERG value is supported. Required for managed tables. The default is ICEBERG. |
| tags | ARRAY<STRUCT<STRING, STRING>> | An array of IAM tags for the table, expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
VALUE is a constant expression containing only literals, query parameters, and scalar functions.

The constant expression cannot contain:

- A reference to a table
- Subqueries or SQL statements such as SELECT, CREATE, or UPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions: ARRAY_TO_STRING, REPLACE, REGEXP_REPLACE, RAND, FORMAT, LPAD, RPAD, REPEAT, SESSION_USER, GENERATE_ARRAY, GENERATE_DATE_ARRAY

Setting the value replaces the existing value of that option for the table, if there was one. Setting the value to NULL clears the table's value for that option.
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
Setting the expiration timestamp and description on a table
The following example sets the expiration timestamp on a table to seven days from the execution time of the ALTER TABLE statement, and sets the description as well:

ALTER TABLE mydataset.mytable
SET OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY),
  description = "Table that expires seven days from now")
Setting the require partition filter attribute on a partitioned table
The following example sets the timePartitioning.requirePartitionFilter attribute on a partitioned table:

ALTER TABLE mydataset.mypartitionedtable
SET OPTIONS (require_partition_filter = true)

Queries that reference this table must use a filter on the partitioning column, or else BigQuery returns an error. Setting this option to true can help prevent mistakes in querying more data than intended.
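For example, if the table is partitioned on a DATE column named transaction_date (an illustrative name), a query like the following satisfies the requirement:

SELECT *
FROM mydataset.mypartitionedtable
WHERE transaction_date = '2025-01-01'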
Clearing the expiration timestamp on a table
The following example clears the expiration timestamp on a table so that it will not expire:

ALTER TABLE mydataset.mytable
SET OPTIONS (expiration_timestamp = NULL)
ALTER TABLE ADD COLUMN statement
Adds one or more new columns to an existing table schema.
Syntax
ALTER TABLE table_name
ADD COLUMN [IF NOT EXISTS] column [, ...]

Arguments

- table_name: The name of the table. See Table path syntax.
- IF NOT EXISTS: If the column name already exists, the statement has no effect.
- column: The column to add. This includes the name of the column and the schema to add. The column name and schema use the same syntax used in the CREATE TABLE statement.
Details
You cannot use this statement to create:

- Partitioned columns.
- Clustered columns.
- Nested columns inside existing RECORD fields.

You cannot add a REQUIRED column to an existing table schema. However, you can create a nested REQUIRED column as part of a new RECORD field.
This statement is not supported for external tables.

Without the IF NOT EXISTS clause, if the table already contains a column with that name, the statement returns an error. If the IF NOT EXISTS clause is included and the column name already exists, no error is returned, and no action is taken.

The value of the new column for existing rows is set to one of the following:

- NULL if the new column was added with NULLABLE mode. This is the default mode.
- An empty ARRAY if the new column was added with REPEATED mode.

For more information about schema modifications in BigQuery, see Modifying table schemas.
Required permissions
This statement requires the following IAM permissions:

| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
Adding columns
The following example adds the following columns to an existing table named mytable:

- Column A of type STRING.
- Column B of type GEOGRAPHY.
- Column C of type NUMERIC with REPEATED mode.
- Column D of type DATE with a description.

ALTER TABLE mydataset.mytable
ADD COLUMN A STRING,
ADD COLUMN IF NOT EXISTS B GEOGRAPHY,
ADD COLUMN C ARRAY<NUMERIC>,
ADD COLUMN D DATE OPTIONS (description = "my description")

If any of the columns named A, C, or D already exist, the statement fails. If column B already exists, the statement succeeds because of the IF NOT EXISTS clause.
Adding a RECORD column

The following example adds a column named A of type STRUCT that contains the following nested columns:

- Column B of type GEOGRAPHY.
- Column C of type INT64 with REPEATED mode.
- Column D of type INT64 with REQUIRED mode.
- Column E of type TIMESTAMP with a description.

ALTER TABLE mydataset.mytable
ADD COLUMN A STRUCT<
  B GEOGRAPHY,
  C ARRAY<INT64>,
  D INT64 NOT NULL,
  E TIMESTAMP OPTIONS (description = "creation time")
>

The query fails if the table already has a column named A, even if that column does not contain any of the nested columns that are specified.

The new STRUCT named A is nullable, but the nested column D within A is required for any STRUCT values of A.
Adding collation support to a column
When you create a new column for your table, you can specifically assign a new collation specification to that column.

ALTER TABLE mydataset.mytable
ADD COLUMN word STRING COLLATE 'und:ci'
ALTER TABLE ADD FOREIGN KEY statement
Adds a foreign key constraint to an existing table. You can add multiple foreign key constraints by using additional ADD FOREIGN KEY statements.
Syntax
ALTER TABLE [[project_name.]dataset_name.]fk_table_name
ADD [CONSTRAINT [IF NOT EXISTS] constraint_name]
FOREIGN KEY (fk_column_name[, ...])
REFERENCES pk_table_name (pk_column_name[, ...]) NOT ENFORCED
[ADD ...];
Arguments
- project_name: The name of the project that contains the table to alter. Defaults to the project that runs this DDL statement if undefined.
- dataset_name: The name of the dataset that contains the table to alter.
- fk_table_name: The name of the existing table to add a foreign key to.
- IF NOT EXISTS: If a constraint of the same name already exists in the defined table, the statement has no effect.
- constraint_name: The name of the constraint to add.
- fk_column_name: In the foreign key table, the name of the foreign key column. Only top-level columns can be used as foreign key columns.
- pk_table_name: The name of the table that contains the primary key.
- pk_column_name: In the primary key table, the name of the primary key column. Only top-level columns can be used as primary key columns.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
The following example adds the my_fk_name foreign key constraint to the fk_table table. This example depends on an existing table, pk_table.

1. Add a primary key to the pk_table table:

   ALTER TABLE pk_table ADD PRIMARY KEY (x, y) NOT ENFORCED;

2. Create a table named fk_table for the foreign key:

   CREATE TABLE fk_table (x int64, y int64, i int64, j int64, u int64, v int64);

3. Add the my_fk_name foreign key constraint to the fk_table:

   ALTER TABLE fk_table ADD CONSTRAINT my_fk_name FOREIGN KEY (u, v) REFERENCES pk_table (x, y) NOT ENFORCED;
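If the constraint might already exist, a variant of step 3 can use the IF NOT EXISTS clause from the syntax above so that re-running the statement is a no-op rather than an error; this is a sketch using the same table and constraint names:

ALTER TABLE fk_table
  ADD CONSTRAINT IF NOT EXISTS my_fk_name
  FOREIGN KEY (u, v) REFERENCES pk_table (x, y) NOT ENFORCED;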
The following example adds the fk and fk2 foreign key constraints to the fk_table table in a single statement. This example depends on an existing table, pk_table.

1. Add a primary key to the pk_table table:

   ALTER TABLE pk_table ADD PRIMARY KEY (x, y) NOT ENFORCED;

2. Create a table named fk_table for multiple foreign key constraints:

   CREATE TABLE fk_table (x int64, y int64, i int64, j int64, u int64, v int64);

3. Add the fk and fk2 constraints to fk_table in one statement:

   ALTER TABLE fk_table
     ADD PRIMARY KEY (x, y) NOT ENFORCED,
     ADD CONSTRAINT fk FOREIGN KEY (u, v) REFERENCES pk_table (x, y) NOT ENFORCED,
     ADD CONSTRAINT fk2 FOREIGN KEY (i, j) REFERENCES pk_table (x, y) NOT ENFORCED;
ALTER TABLE ADD PRIMARY KEY statement
Adds a primary key to an existing table.
Syntax
ALTER TABLE [[project_name.]dataset_name.]table_name
ADD PRIMARY KEY (column_list) NOT ENFORCED;
Arguments
- project_name: The name of the project containing the table with a primary key. Defaults to the project that runs this DDL statement if undefined.
- dataset_name: The name of the dataset that contains the table with a primary key.
- table_name: The name of the existing table with a primary key.
- column_list: The list of columns to be added as primary keys.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
The following example adds the primary key constraint of x and y to the pk_table table.

ALTER TABLE pk_table ADD PRIMARY KEY (x, y) NOT ENFORCED;
ALTER TABLE RENAME TO statement
Renames a clone, snapshot, or table.
The ALTER TABLE RENAME TO statement recreates the table in the destination dataset with the creation timestamp of the original table. If you have configured dataset-level table expiration, the renamed table might be immediately deleted if its original creation timestamp falls outside of the expiration window.
Syntax
ALTER TABLE [IF EXISTS] table_name RENAME TO new_table_name
Arguments
- IF EXISTS: If no table exists with that name, the statement has no effect.
- table_name: The name of the table to rename. See Table path syntax.
- new_table_name: The new name of the table. The value of new_table_name must only include the name of the table, not the full table path syntax. The new name cannot be an existing table name.
Details
- If you want to rename a table that has data streaming into it, you must stop the streaming, commit any pending streams, and wait for BigQuery to indicate that streaming is not in use.
- While a table can usually be renamed 5 hours after the last streaming operation, it might take longer.
- Existing table ACLs and row access policies are preserved, but table ACL and row access policy updates made during the table rename are not preserved.
- You can't concurrently rename a table and run a DML statement on that table.
- Renaming a table removes all Data Catalog tags (deprecated) and Dataplex Universal Catalog aspects on the table.
- Any search index or vector index created on the table is dropped when the table is renamed.
- You can't rename external tables.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
Renaming a table
The following example renames the table mydataset.mytable to mydataset.mynewtable:

ALTER TABLE mydataset.mytable RENAME TO mynewtable
ALTER TABLE RENAME COLUMN statement
Caution: Renaming a column deletes all Data Catalog tags (deprecated) and Dataplex Universal Catalog aspects that are attached to it. Primary key columns can't be renamed.
Renames one or more columns in an existing table schema.
Syntax
ALTER TABLE [IF EXISTS] table_name
RENAME COLUMN [IF EXISTS] column_to_column[, ...]

column_to_column :=
  column_name TO new_column_name
Arguments
- (ALTER TABLE) IF EXISTS: If the specified table does not exist, the statement has no effect.
- table_name: The name of the table to alter. See Table path syntax.
- (ALTER COLUMN) IF EXISTS: If the specified column does not exist, the statement has no effect.
- column_name: The name of the top-level column you're altering.
- new_column_name: The new name of the column. The new name cannot be an existing column name.
Details
This statement is not supported for external tables.
If the table to be modified has active row-level access policies, the statement returns an error.
Without the IF EXISTS clause, if the table does not contain a column with that name, then the statement returns an error. If the IF EXISTS clause is included and the column name does not exist, then no error is returned, and no action is taken.
This statement only renames the column in the table. Any objects that refer to the column, such as views or materialized views, must be updated or recreated separately.
You cannot use this statement to rename the following:
- Subfields, such as nested columns in a STRUCT
- Partitioning columns
- Clustering columns
- Fields that are part of primary key constraints or foreign key constraints
- Columns in a table that has row access policies

After one or more columns in a table are renamed, you cannot do the following:
- Query the table with legacy SQL.
- Query the table as a wildcard table.

Renaming the columns back to their original names removes these restrictions.
Multiple RENAME COLUMN statements in one ALTER TABLE statement are supported. The sequence of renames is interpreted and validated in order. Each column_name must refer to a column name that exists after all preceding renames have been applied. RENAME COLUMN cannot be used with other ALTER TABLE actions in one statement.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
Renaming columns
The following example renames columns from an existing table named mytable:
- Column A -> columnA
- Column B -> columnB

ALTER TABLE mydataset.mytable
  RENAME COLUMN A TO columnA,
  RENAME COLUMN IF EXISTS B TO columnB

If column A does not exist, then the statement fails. If column B does not exist, then the statement still succeeds because of the IF EXISTS clause.
The following example swaps the names of columnA and columnB:

ALTER TABLE mydataset.mytable
  RENAME COLUMN columnA TO temp_col,
  RENAME COLUMN columnB TO columnA,
  RENAME COLUMN temp_col TO columnB
ALTER TABLE DROP COLUMN statement
Drops one or more columns from an existing table schema.
Syntax
ALTER TABLE table_name
DROP COLUMN [IF EXISTS] column_name [, ...]

Arguments
- table_name: The name of the table to alter. See Table path syntax. The table must already exist and have a schema.
- IF EXISTS: If the specified column does not exist, the statement has no effect.
- column_name: The name of the column to drop.
Details
Dropping a column is a metadata-only operation and does not immediately free up the storage that is associated with the dropped column. The storage is freed up the next time the table is written to, typically when you perform a DML operation on it or when a background optimization job happens. Since DROP COLUMN is not a data cleanup operation, there is no guaranteed time window within which the data will be deleted.
There are two options for immediately reclaiming storage:
- Overwrite the table with a SELECT * EXCEPT query, as shown in the sketch after this list.
- Export the data to Cloud Storage, delete the unwanted columns, and then load the data into a new table with the correct schema.
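As a minimal sketch of the first option, assuming the column to remove is named unwanted_col (a hypothetical name):

-- Rewriting the table without the column frees its storage immediately.
CREATE OR REPLACE TABLE mydataset.mytable AS
SELECT * EXCEPT (unwanted_col)
FROM mydataset.mytable;

Note that a rewrite like this does not automatically carry over partitioning, clustering, or table options, so re-specify them in the CREATE OR REPLACE TABLE statement if the table uses them.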
You can restore a dropped column in a table using time travel (see the sketch after this list).

You cannot use this statement to drop the following:
- Partitioned columns
- Clustered columns
- Fields that are part of primary key constraints or foreign key constraints
- Nested columns inside existing RECORD fields
- Columns in a table that has row access policies

After one or more columns in a table are dropped, you cannot do the following:
- Query the table with legacy SQL.
- Query the table as a wildcard table.
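For the time travel restore mentioned before this list, one possible sketch is to rewrite the table from a point in time before the DROP COLUMN statement ran; the one-hour offset here is a placeholder, not a recommendation:

-- Recreate the table as it looked an hour ago, including the dropped column.
CREATE OR REPLACE TABLE mydataset.mytable AS
SELECT *
FROM mydataset.mytable
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);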
This statement is not supported for external tables.
Without the IF EXISTS clause, if the table does not contain a column with that name, then the statement returns an error. If the IF EXISTS clause is included and the column name does not exist, then no error is returned, and no action is taken.
This statement only removes the column from the table. Any objects that refer to the column, such as views or materialized views, must be updated or recreated separately.
For more information about schema modifications in BigQuery, see Modifying table schemas.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
Dropping columns
The following example drops the following columns from an existing table named mytable:
- Column A
- Column B

ALTER TABLE mydataset.mytable
  DROP COLUMN A,
  DROP COLUMN IF EXISTS B

If the column named A does not exist, then the statement fails. If column B does not exist, then the statement still succeeds because of the IF EXISTS clause.
After one or more columns in a table are dropped, you cannot do the following:
- Query the table with legacy SQL.
- Accelerate queries on the table with BigQuery BI Engine.
- Query the table as a wildcard table.
- Copy the table in the Google Cloud console.
- Copy the table using the bq cp command.

Recreating the table using CREATE TABLE ... AS SELECT ... removes these restrictions.
ALTER TABLE DROP CONSTRAINT statement
Drops a constraint from an existing table. You can use this statement to drop foreign key constraints from a table.
Syntax
ALTER TABLE [[project_name.]dataset_name.]table_name
DROP CONSTRAINT [IF EXISTS] constraint_name;
Arguments
- project_name: The name of the project that contains the table. Defaults to the project that runs this DDL statement if undefined.
- dataset_name: The name of the dataset that contains the table.
- table_name: The name of the existing table with the constraint.
- IF EXISTS: If no constraint of that name exists on the defined table, the statement has no effect.
- constraint_name: The name of the constraint to drop.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
The following example drops the constraint myConstraint from the existing table mytable.

ALTER TABLE mytable DROP CONSTRAINT myConstraint;
ALTER TABLE DROP PRIMARY KEY statement
Drops a primary key from an existing table.
Syntax
ALTER TABLE [[project_name.]dataset_name.]table_name
DROP PRIMARY KEY [IF EXISTS];
Arguments
- project_name: The name of the project containing the table with a primary key. Defaults to the project that runs this DDL statement if undefined.
- dataset_name: The name of the dataset that contains the table with a primary key.
- table_name: The name of the existing table with a primary key.
- IF EXISTS: If no primary key exists in the defined table, the statement has no effect.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
The following example drops the primary key from the existing table myTable.

ALTER TABLE myTable DROP PRIMARY KEY;
ALTER TABLE SET DEFAULT COLLATE statement
Sets collation specifications on a table.
Syntax
ALTER TABLE table_name SET DEFAULT COLLATE collate_specification
Arguments
- table_name: The name of the table to alter. See Table path syntax. The table must already exist and have a schema.
- SET DEFAULT COLLATE collate_specification: When a new column is created in the schema, and the column does not have an explicit collation specification, the column inherits this collation specification for STRING types. The updated collation specification only applies to columns added afterwards. If you want to add a collation specification on a new column in an existing table, you can do this when you add the column. If you add a collation specification directly on a column, the collation specification for the column has precedence over the table's default collation specification. You cannot update an existing collation specification on a column.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Example
Assume you have an existing table, mytable, in a schema called mydataset.

CREATE TABLE mydataset.mytable (
  number INT64,
  word STRING
) DEFAULT COLLATE 'und:ci'

When you create mytable, all STRING columns inherit COLLATE 'und:ci'. The resulting table has this structure:

+--------------------------------+
| mydataset.mytable              |
| number INT64                   |
| word STRING COLLATE 'und:ci'   |
+--------------------------------+

At a later time, you decide to change the collation specification for your table.

ALTER TABLE mydataset.mytable
SET DEFAULT COLLATE ''

Although you have updated the collation specification, your existing column, word, continues to use the previous collation specification.

+--------------------------------+
| mydataset.mytable              |
| number INT64                   |
| word STRING COLLATE 'und:ci'   |
+--------------------------------+

However, if you create a new column for your table, the new column includes the new collation specification. In the following example, a column called name is added. Because the new collation specification is empty, the default collation specification is used.

ALTER TABLE mydataset.mytable
ADD COLUMN name STRING

+--------------------------------+
| mydataset.mytable              |
| number INT64                   |
| word STRING COLLATE 'und:ci'   |
| name STRING                    |
+--------------------------------+

ALTER COLUMN SET OPTIONS statement
Sets options, such as the column description, on a column in a table or view in BigQuery.
Syntax
ALTER { TABLE | VIEW } [IF EXISTS] name
ALTER COLUMN [IF EXISTS] column_name
SET OPTIONS ({ column_set_options_list | view_column_set_options_list })
Arguments
- (ALTER { TABLE | VIEW }) IF EXISTS: If no table or view exists with that name, then the statement has no effect.
- name: The name of the table or view to alter. See Table path syntax.
- (ALTER COLUMN) IF EXISTS: If the specified column does not exist, the statement has no effect.
- column_name: The name of the top-level column you're altering. Modifying subfields, such as nested columns in a STRUCT, is not supported.
- column_set_options_list: The list of options to set on the column of the table. This option must be used with TABLE.
- view_column_set_options_list: The list of options to set on the column of the view. This option must be used with VIEW.
Details
This statement is not supported for external tables.
column_set_options_list
Specify a column option list in the following format:
NAME=VALUE, ...
NAME and VALUE must be one of the following combinations:
| NAME | VALUE | Details |
|---|---|---|
| description | STRING | This property is equivalent to the schema.fields[].description table resource property. |
| rounding_mode | STRING | Specifies the rounding mode that's used for values written to a NUMERIC or BIGNUMERIC type column. This property is equivalent to the schema.fields[].roundingMode table resource property. |
| data_policies | ARRAY<STRING> | Applies a data policy to a column in a table (Preview). |
VALUE is a constant expression containing only literals, query parameters, and scalar functions.

The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as SELECT, CREATE, or UPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions: ARRAY_TO_STRING, REPLACE, REGEXP_REPLACE, RAND, FORMAT, LPAD, RPAD, REPEAT, SESSION_USER, GENERATE_ARRAY, GENERATE_DATE_ARRAY
Setting the VALUE replaces the existing value of that option for the column, if there was one. Setting the VALUE to NULL clears the column's value for that option.
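For example, a sketch that sets the rounding_mode option on a hypothetical NUMERIC column named price; ROUND_HALF_EVEN is assumed here to be one of the supported rounding modes:

ALTER TABLE mydataset.mytable
ALTER COLUMN price
SET OPTIONS (rounding_mode = 'ROUND_HALF_EVEN');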
view_column_set_options_list
The view_column_option_list lets you specify optional top-level column options. Column options for a view have the same syntax and requirements as for a table, but with a different list of NAME and VALUE fields:

| NAME | VALUE | Details |
|---|---|---|
| description | STRING | The description of the column. |
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
The following example sets a new description on a table column called price:

ALTER TABLE mydataset.mytable
ALTER COLUMN price
SET OPTIONS (description = 'Price per unit');

The following example sets a new description on a view column called total:

ALTER VIEW mydataset.myview
ALTER COLUMN total
SET OPTIONS (description = 'Total sales of the product');
ALTER COLUMN DROP NOT NULL statement
Removes a NOT NULL constraint from a column in a table in BigQuery.
Syntax
ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column_name DROP NOT NULL
Arguments
- (ALTER TABLE) IF EXISTS: If no table exists with that name, the statement has no effect.
- table_name: The name of the table to alter. See Table path syntax.
- (ALTER COLUMN) IF EXISTS: If the specified column does not exist, the statement has no effect.
- column_name: The name of the top-level column you're altering. Modifying subfields is not supported.
Details
If a column does not have a NOT NULL constraint, the query returns an error.
This statement is not supported for external tables.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
The following example removes the NOT NULL constraint from a column called mycolumn:

ALTER TABLE mydataset.mytable ALTER COLUMN mycolumn DROP NOT NULL
ALTER COLUMN SET DATA TYPE statement
Changes the data type of a column in a table in BigQuery to a less restrictive data type. For example, a NUMERIC data type can be changed to a BIGNUMERIC type but not the reverse.
Syntax
ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column_name
SET DATA TYPE column_schema
Arguments
- (ALTER TABLE) IF EXISTS: If no table exists with that name, the statement has no effect.
- table_name: The name of the table to alter. See Table path syntax.
- (ALTER COLUMN) IF EXISTS: If the specified column does not exist, the statement has no effect.
- column_name: The name of the top-level column you're altering. Modifying subfields is not supported.
- column_schema: The schema that you're converting the column to. This schema uses the same syntax used in the CREATE TABLE statement.
Details
The following data type conversions are supported:
- INT64 to NUMERIC, BIGNUMERIC, FLOAT64
- NUMERIC to BIGNUMERIC, FLOAT64

You can also convert data types from more restrictive to less restrictive parameterized data types. For example, you can increase the maximum length of a string type or increase the precision or scale of a numeric type.

The following are examples of valid parameterized data type conversions:
- NUMERIC(10, 6) to NUMERIC(12, 8)
- NUMERIC to BIGNUMERIC(40, 20)
- STRING(5) to STRING(7)
This statement is not supported for external tables.
Without the IF EXISTS clause, if the table does not contain a column with that name, the statement returns an error. If the IF EXISTS clause is included and the column name does not exist, no error is returned, and no action is taken.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
Changing the data type for a column
The following example changes the data type of column c1 from an INT64 to NUMERIC:

CREATE TABLE dataset.my_table (c1 INT64);

ALTER TABLE dataset.my_table
ALTER COLUMN c1 SET DATA TYPE NUMERIC;
Changing the data type for a field
The following example changes the data type of one of the fields in the s1 column:

CREATE TABLE dataset.my_table (s1 STRUCT<a INT64, b STRING>);

ALTER TABLE dataset.my_table
ALTER COLUMN s1 SET DATA TYPE STRUCT<a NUMERIC, b STRING>;
Changing precision
The following example changes the precision of a parameterized data type column:

CREATE TABLE dataset.my_table (pt NUMERIC(7, 2));

ALTER TABLE dataset.my_table
ALTER COLUMN pt SET DATA TYPE NUMERIC(8, 2);
ALTER COLUMN SET DEFAULT statement
Sets the default value of a column.
Syntax
ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column_name
SET DEFAULT default_expression;
Arguments
- (ALTER TABLE) IF EXISTS: If the specified table does not exist, the statement has no effect.
- table_name: The name of the table to alter. See Table path syntax.
- (ALTER COLUMN) IF EXISTS: If the specified column does not exist, the statement has no effect.
- column_name: The name of the top-level column to add a default value to.
- default_expression: The default value assigned to the column. The expression must be a literal or one of the functions that are allowed in default value expressions.
Details
Setting the default value for a column only affects future inserts to the table. It does not change any existing table data.
The type of the default value must match the type of the column. A STRUCT type can only have a default value set for the entire STRUCT field. You cannot set the default value for a subset of the fields. You cannot set the default value of an array to NULL or set an element within an array to NULL.
If the default value is a function, it is evaluated at the time that the value is written to the table, not the time the table is created.
You can't set default values on columns that are primary keys.
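As an illustration of the write-time evaluation, here is a sketch using a hypothetical table mydataset.events with an id column and a TIMESTAMP column created_at:

ALTER TABLE mydataset.events
ALTER COLUMN created_at
SET DEFAULT CURRENT_TIMESTAMP();

-- Each insert evaluates the default independently, so these two rows
-- receive different timestamps:
INSERT INTO mydataset.events (id) VALUES (1);
INSERT INTO mydataset.events (id) VALUES (2);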
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
The following example sets the default value of the column mycolumn to the current time:

ALTER TABLE mydataset.mytable
ALTER COLUMN mycolumn
SET DEFAULT CURRENT_TIME();
ALTER COLUMN DROP DEFAULT statement
Removes the default value assigned to a column. This is the same as setting the default value to NULL.
Syntax
ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column_name
DROP DEFAULT;
Arguments
- (ALTER TABLE) IF EXISTS: If the specified table does not exist, the statement has no effect.
- table_name: The name of the table to alter. See Table path syntax.
- (ALTER COLUMN) IF EXISTS: If the specified column does not exist, the statement has no effect.
- column_name: The name of the top-level column to remove the default value from. If you drop the default value from a column that does not have a default set, an error is returned.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The table to alter. |
| bigquery.tables.update | The table to alter. |
Examples
The following example removes the default value from the column mycolumn:

ALTER TABLE mydataset.mytable
ALTER COLUMN mycolumn
DROP DEFAULT;
ALTER VIEW SET OPTIONS statement
Sets the options on a view.
Syntax
ALTER VIEW [IF EXISTS] view_name
SET OPTIONS (view_set_options_list)
Arguments
- IF EXISTS: If no view exists with that name, the statement has no effect.
- view_name: The name of the view to alter. See Table path syntax.
- view_set_options_list: The list of options to set.
view_set_options_list
The option list allows you to set view options such as a label and an expiration time. You can include multiple options using a comma-separated list.

Specify a view option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:
| NAME | VALUE | Details |
|---|---|---|
| expiration_timestamp | TIMESTAMP | This property is equivalent to the expirationTime table resource property. |
| friendly_name | STRING | This property is equivalent to the friendlyName table resource property. |
| description | STRING | This property is equivalent to the description table resource property. |
| labels | ARRAY<STRUCT<STRING, STRING>> | This property is equivalent to the labels table resource property. |
| privacy_policy | STRING | The policies to enforce when anyone queries the view. |
| tags | ARRAY<STRUCT<STRING, STRING>> | An array of IAM tags for the view, expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
VALUE is a constant expression containing only literals, query parameters, and scalar functions.

The constant expression cannot contain:
- A reference to a table
- Subqueries or SQL statements such as SELECT, CREATE, or UPDATE
- User-defined functions, aggregate functions, or analytic functions
- The following scalar functions: ARRAY_TO_STRING, REPLACE, REGEXP_REPLACE, RAND, FORMAT, LPAD, RPAD, REPEAT, SESSION_USER, GENERATE_ARRAY, GENERATE_DATE_ARRAY
Setting the value replaces the existing value of that option for the view, if there was one. Setting the value to NULL clears the view's value for that option.
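For example, the following statement clears a view's description by setting the option to NULL (the view name is reused from the example that follows):

ALTER VIEW mydataset.myview
SET OPTIONS (description = NULL);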
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The view to alter. |
| bigquery.tables.update | The view to alter. |
Examples
Setting the expiration timestamp and description on a view
The following example sets the expiration timestamp on a view to seven days from the execution time of the ALTER VIEW statement, and sets the description as well:

ALTER VIEW mydataset.myview
SET OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY),
  description = "View that expires seven days from now"
)
ALTER MATERIALIZED VIEW SET OPTIONS statement
Sets the options on a materialized view.
Syntax
ALTER MATERIALIZED VIEW [IF EXISTS] materialized_view_name
SET OPTIONS (materialized_view_set_options_list)
Arguments
- IF EXISTS: If no materialized view exists with that name, the statement has no effect.
- materialized_view_name: The name of the materialized view to alter. See Table path syntax.
- materialized_view_set_options_list: The list of options to set.
materialized_view_set_options_list
The option list allows you to set materialized view options such as whether refresh is enabled, the refresh interval, a label, and an expiration time. You can include multiple options using a comma-separated list.
Specify a materialized view option list in the following format:
NAME=VALUE, ...
NAME and VALUE must be one of the following combinations:
| NAME | VALUE | Details |
|---|---|---|
| enable_refresh | BOOLEAN | Whether automatic refresh is enabled. |
| refresh_interval_minutes | FLOAT64 | The automatic refresh interval, in minutes. |
| expiration_timestamp | TIMESTAMP | This property is equivalent to the expirationTime table resource property. |
| max_staleness | INTERVAL | The maximum staleness accepted for queries against the materialized view. |
| allow_non_incremental_definition | BOOLEAN | Whether the materialized view can use a non-incremental definition. |
| kms_key_name | STRING | This property is equivalent to the encryptionConfiguration.kmsKeyName table resource property. See more details about Protecting data with Cloud KMS keys. |
| friendly_name | STRING | This property is equivalent to the friendlyName table resource property. |
| description | STRING | This property is equivalent to the description table resource property. |
| labels | ARRAY<STRUCT<STRING, STRING>> | This property is equivalent to the labels table resource property. |
| tags | ARRAY<STRUCT<STRING, STRING>> | An array of IAM tags for the materialized view, expressed as key-value pairs. The key should be the namespaced key name, and the value should be the short name. |
Setting the value replaces the existing value of that option for the materialized view, if there was one. Setting the value to NULL clears the materialized view's value for that option.
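For example, the following statement clears a materialized view's expiration by setting the option to NULL (the view name is a placeholder):

ALTER MATERIALIZED VIEW mydataset.my_mv
SET OPTIONS (expiration_timestamp = NULL);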
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.get | The materialized view to alter. |
| bigquery.tables.update | The materialized view to alter. |
Examples
Setting the enable refresh state and refresh interval on a materialized view
The following example enables refresh and sets the refresh interval to 20 minutes on a materialized view:

ALTER MATERIALIZED VIEW mydataset.my_mv
SET OPTIONS (
  enable_refresh = true,
  refresh_interval_minutes = 20
)
ALTER ORGANIZATION SET OPTIONS statement
Sets the options on an organization.
Syntax
ALTER ORGANIZATION SET OPTIONS (organization_set_options_list);
Arguments
- organization_set_options_list: The list of options to set.
organization_set_options_list
The option list specifies options for the organization. Specify the options in the following format:

NAME=VALUE, ...
The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| default_kms_key_name | STRING | The default Cloud Key Management Service key for encrypting table data, including temporary or anonymous tables. For more information, see Customer-managed Cloud KMS keys. |
| default_time_zone | STRING | The default time zone to use in time zone-dependent SQL functions, when a time zone is not specified as an argument. For more information, see time zones. |
| default_query_job_timeout_ms | INT64 | The default time after which a query job times out. The timeout period must be between 5 minutes and 48 hours. |
| default_interactive_query_queue_timeout_ms | INT64 | The default amount of time that an interactive query is queued. If unset, the default is 6 hours. The minimum value is 1 millisecond. The maximum value is 48 hours. To disable interactive query queueing, set the value to -1. |
| default_batch_query_queue_timeout_ms | INT64 | The default amount of time that a batch query is queued. If unset, the default is 24 hours. The minimum value is 1 millisecond. The maximum value is 48 hours. To disable batch query queueing, set the value to -1. |
| default_query_optimizer_options | STRING | The history-based query optimizations to apply. |
| query_runtime | STRING | Specifies whether the BigQuery query processor uses the advanced runtime. |
Setting the value replaces the existing value of that option for the organization, if there is one. Setting the value to NULL clears the organization's value for that option.
Required permissions
The ALTER ORGANIZATION SET OPTIONS statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.config.update | The organization to alter. |
Examples
The following example sets the default time zone to America/Chicago and the default query job timeout to one hour for an organization in the US region:

ALTER ORGANIZATION
SET OPTIONS (
  `region-us.default_time_zone` = "America/Chicago",
  `region-us.default_query_job_timeout_ms` = 3600000
);
The following example sets the default time zone, the default Cloud KMS key, the default query job timeout, and the default interactive and batch queue timeouts to NULL, clearing the organization-level default settings:

ALTER ORGANIZATION
SET OPTIONS (
  `region-us.default_time_zone` = NULL,
  `region-us.default_kms_key_name` = NULL,
  `region-us.default_query_job_timeout_ms` = NULL,
  `region-us.default_interactive_query_queue_timeout_ms` = NULL,
  `region-us.default_batch_query_queue_timeout_ms` = NULL
);
ALTER PROJECT SET OPTIONS statement
Sets the options on a project.
Syntax
ALTER PROJECT project_id SET OPTIONS (project_set_options_list);
Arguments
- project_id: The name of the project you're altering. This argument is optional, and defaults to the project that runs this DDL query.
- project_set_options_list: The list of options to set.
project_set_options_list
The option list specifies options for the project. Specify the options in the following format:

NAME=VALUE, ...
The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| default_kms_key_name | STRING | The default Cloud Key Management Service key for encrypting table data, including temporary or anonymous tables. For more information, see Customer-managed Cloud KMS keys. |
| default_time_zone | STRING | The default time zone to use in time zone-dependent SQL functions, when a time zone is not specified as an argument. For more information, see time zones. |
| default_query_job_timeout_ms | INT64 | The default time after which a query job times out. The timeout period must be between 5 minutes and 48 hours. |
| default_interactive_query_queue_timeout_ms | INT64 | The default amount of time that an interactive query is queued. If unset, the default is 6 hours. The minimum value is 1 millisecond. The maximum value is 48 hours. To disable interactive query queueing, set the value to -1. |
| default_batch_query_queue_timeout_ms | INT64 | The default amount of time that a batch query is queued. If unset, the default is 24 hours. The minimum value is 1 millisecond. The maximum value is 48 hours. To disable batch query queueing, set the value to -1. |
| default_query_optimizer_options | STRING | The history-based query optimizations to apply. |
| default_cloud_resource_connection_id | STRING | The default connection to use when creating tables and models (Preview). Only specify the connection's ID, and exclude the attached project ID and region prefixes. Using default connections can cause the permissions granted to the connection's service account to be updated, depending on the type of table or model you create. For more information, see the Default connection overview. |
| default_sql_dialect_option | STRING | The default SQL query dialect for executing query jobs using the bq command-line tool or BigQuery API. Changing this setting doesn't affect the default dialect in the console. |
| query_runtime | STRING | Specifies whether the BigQuery query processor uses the advanced runtime. |
| enable_reservation_based_fairness | BOOL | Determines how idle slots are shared. |
Setting the value replaces the existing value of that option for the project, if there was one. Setting the value to NULL clears the project's value for that option.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.config.update | The project to alter. |
Examples
The following example sets the default time zone to America/New_York and the default query job timeout to 30 minutes for a project in the us region.

ALTER PROJECT project_id
SET OPTIONS (
  `region-us.default_time_zone` = "America/New_York",
  `region-us.default_query_job_timeout_ms` = 1800000
);
The following example sets the default time zone, the default Cloud KMS key, the default query job timeout, the default interactive and batch queue timeouts, and the default SQL dialect to NULL, clearing the project-level default settings:

ALTER PROJECT project_id
SET OPTIONS (
  `region-us.default_time_zone` = NULL,
  `region-us.default_kms_key_name` = NULL,
  `region-us.default_query_job_timeout_ms` = NULL,
  `region-us.default_interactive_query_queue_timeout_ms` = NULL,
  `region-us.default_batch_query_queue_timeout_ms` = NULL,
  `region-us.default_sql_dialect_option` = NULL
);
ALTER BI_CAPACITY SET OPTIONS statement
Sets the options on BigQuery BI Engine capacity.
Syntax
ALTER BI_CAPACITY `project_id.location_id.default`
SET OPTIONS (bi_capacity_options_list)
Arguments
- project_id: Optional project ID of the project that will benefit from BI Engine acceleration. If omitted, the query project ID is used.
- location_id: The location where data needs to be cached, prefixed with region-. Examples: region-us, region-us-central1.
- bi_capacity_options_list: The list of options to set.
bi_capacity_options_list
The option list specifies a set of options for BigQuery BI Engine capacity.
Specify the option list in the following format:
NAME=VALUE, ...
The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| size_gb | INT64 | Specifies the size of the reservation in gigabytes. |
| preferred_tables | ARRAY<STRING> | List of tables that acceleration should be applied to. Format: project.dataset.table or dataset.table. If the project is omitted, the query project is used. |
Setting VALUE replaces the existing value of that option for the BI Engine capacity, if there is one. Setting VALUE to NULL clears the value for that option.
Required permissions
This statement requires the followingIAM permissions:
| Permission | Resource |
|---|---|
bigquery.bireservations.update | BI Engine reservation |
Examples
Allocating BI Engine capacity without preferred tables

ALTER BI_CAPACITY `my-project.region-us.default`
SET OPTIONS (size_gb = 250)

Deallocating BI capacity

ALTER BI_CAPACITY `my-project.region-us.default`
SET OPTIONS (size_gb = 0)

Removing a set of preferred tables from reservation

ALTER BI_CAPACITY `my-project.region-us.default`
SET OPTIONS (preferred_tables = NULL)

Allocating BI capacity with preferred tables list

ALTER BI_CAPACITY `my-project.region-us.default`
SET OPTIONS (
  size_gb = 250,
  preferred_tables = ["data_project1.dataset1.table1", "data_project2.dataset2.table2"]
)

Overwriting list of preferred tables without changing the size

ALTER BI_CAPACITY `region-us.default`
SET OPTIONS (preferred_tables = ["dataset1.table1", "data_project2.dataset2.table2"])
ALTER CAPACITY SET OPTIONS statement
Alters an existing capacity commitment.
Syntax
ALTER CAPACITY `project_id.location_id.commitment_id`
SET OPTIONS (alter_capacity_commitment_option_list);
Arguments
- project_id: The project ID of the administration project that maintains ownership of this commitment.
- location_id: The location of the commitment.
- commitment_id: The ID of the commitment. The value must be unique to the project and location. It must start and end with a lowercase letter or a number and contain only lowercase letters, numbers, and dashes.
- alter_capacity_commitment_option_list: The options you can set to alter the capacity commitment.
alter_capacity_commitment_option_list
The option list specifies options for the capacity commitment. Specify the options in the following format:

NAME=VALUE, ...
The following options are supported:
| NAME | TYPE | Details |
|---|---|---|
| plan | String | The commitment plan to purchase. Supported values include: ANNUAL, THREE_YEAR, and TRIAL. For more information, see slot commitments. |
| renewal_plan | String | The plan this capacity commitment is converted to after commitment_end_time passes. Once the plan is changed, the committed period is extended according to the commitment plan. Applicable for ANNUAL, THREE_YEAR, and TRIAL commitments. |
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.capacityCommitments.update | The administration project that maintains ownership of the commitments. |
Example
The following example changes a capacity commitment to a three-year plan that is located in the region-us region and managed by a project admin_project:

ALTER CAPACITY `admin_project.region-us.my-commitment`
SET OPTIONS (plan = 'THREE_YEAR');
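Similarly, a sketch that changes only the renewal plan of the same commitment, assuming an ANNUAL renewal is wanted:

ALTER CAPACITY `admin_project.region-us.my-commitment`
SET OPTIONS (renewal_plan = 'ANNUAL');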
ALTER RESERVATION SET OPTIONS statement
Alters an existing reservation.
Syntax
ALTER RESERVATION `project_id.location_id.reservation_id`
SET OPTIONS (alter_reservation_option_list);
Arguments
- project_id: The project ID of the administration project that maintains ownership of this reservation.
- location_id: The location of the reservation.
- reservation_id: The ID of the reservation. The value must be unique to the project and location. It must start and end with a lowercase letter or a number and contain only lowercase letters, numbers, and dashes.
- alter_reservation_option_list: The options you can set to alter the reservation.
alter_reservation_option_list
The option list specifies options for the reservation. Specify the options in the following format:

NAME=VALUE, ...
The following options are supported:
| NAME | TYPE | Details |
|---|---|---|
| ignore_idle_slots | BOOLEAN | If the value is true, then the reservation uses only the slots that are provisioned to it. The default value is false. For more information, see Idle slots. |
| slot_capacity | INTEGER | The number of slots to allocate to the reservation. If this reservation was created with an edition, this is equivalent to the amount of baseline slots. |
| target_job_concurrency | INTEGER | A soft upper bound on the number of jobs that can run concurrently in this reservation. |
| autoscale_max_slots | INTEGER | The maximum number of slots that can be added to the reservation by autoscaling. |
| secondary_location | STRING | The secondary location to use in the case of disaster recovery. |
| is_primary | BOOLEAN | If the value is true, the reservation is set to be the primary reservation. |
| labels | ARRAY<STRUCT<STRING, STRING>> | An array of labels for the reservation, expressed as key-value pairs. |
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.reservations.update | The administration project that maintains ownership of the commitments. |
Examples
Autoscaling example
The following example changes an autoscaling reservation to 300 baseline slots and 400 autoscaling slots, for a maximum reservation size of 700. These slots are located in the region-us region and managed by a project admin_project:

ALTER RESERVATION `admin_project.region-us.my-reservation`
SET OPTIONS (slot_capacity = 300, autoscale_max_slots = 400);
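As another sketch based on the options table above, the following statement stops the same reservation from consuming idle slots:

ALTER RESERVATION `admin_project.region-us.my-reservation`
SET OPTIONS (ignore_idle_slots = true);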
ALTER VECTOR INDEX REBUILD statement
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of theService Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see thelaunch stage descriptions.
Note: To provide feedback or request support for this feature, send email to bq-vector-search@google.com.
Rebuilds a vector index on a table.
Syntax
ALTER VECTOR INDEX [IF EXISTS] index_name
ON table_name
REBUILD;
Arguments
- IF EXISTS: If no vector index exists with that name, the statement has no effect.
- index_name: The name of the vector index to rebuild.
- table_name: The name of the table that the vector index is on. See Table path syntax. If the table doesn't contain a vector index, or if the table contains a different vector index than the one specified in the index_name argument, the query fails.
- REBUILD: Indicates that the statement rebuilds the vector index. This argument is required.
Details
Use the ALTER VECTOR INDEX REBUILD statement to rebuild an active vector index on a table without having to drop the vector index, and without any index downtime. When you run the statement, BigQuery creates a shadow index on the table and trains it in the background. BigQuery promotes the shadow index to be the active index when the shadow index has enough coverage.
To run the ALTER VECTOR INDEX REBUILD statement, you must create a reservation assignment with a job type of BACKGROUND for the project that contains the table. If you run the statement without an appropriate reservation, the query fails.
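As a sketch, such an assignment can be created with the CREATE ASSIGNMENT DDL statement; the project, reservation, and assignment names here are placeholders:

CREATE ASSIGNMENT `admin_project.region-us.my-reservation.my-assignment`
OPTIONS (
  assignee = 'projects/my_project',
  job_type = 'BACKGROUND'
);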
You can have only one vector index rebuild in progress at a time. The ALTER VECTOR INDEX REBUILD statement completes before the shadow index replaces the active index, because the shadow index training and cutover happen asynchronously. If you start another vector index rebuild before the shadow index replaces the initial index, the second rebuild request fails.
Required permissions
To get the permissions that you need to alter vector indexes, ask your administrator to grant you the BigQuery Data Editor (roles/bigquery.dataEditor) or BigQuery Data Owner (roles/bigquery.dataOwner) IAM role on your table. For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Examples
The following example rebuilds the index1 vector index on the sales table:

ALTER VECTOR INDEX IF EXISTS index1
ON mydataset.sales
REBUILD;
ALTER DATA_POLICY statement
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of theService Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see thelaunch stage descriptions.
Note: To provide feedback or request support for this feature, send an email to bigquery-security@google.com.
Syntax
ALTER DATA_POLICY [IF EXISTS] `project_id.region-location_id.data_policy_id`
SET OPTIONS (alter_option_list);
Arguments
- IF EXISTS: If no data policy exists with that name, the statement has no effect.
- project_id: The project ID of the project that the data policy resides in.
- location_id: The location of the data policy.
- data_policy_id: The name of the data policy to be updated.
- alter_option_list: The list of options to update on the data policy.
alter_option_list
The option list specifies options for the data policy. Specify the options in the following format:

NAME=VALUE, ...
The following options are supported:
| NAME | VALUE | Details |
|---|---|---|
| data_policy_type | STRING | Set it to DATA_MASKING_POLICY. |
| masking_expression | STRING | Specifies the predefined masking rule or a custom masking routine. |
Required permissions
The user or service account that updates a data policy must have the bigquery.dataPolicies.update permission.
The bigquery.dataPolicies.update permission is included in the BigQuery Data Policy Admin, BigQuery Admin, and BigQuery Data Owner roles.
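As a hedged sketch, the following statement points an existing masking policy at a predefined rule; the project and policy names are placeholders, and ALWAYS_NULL is assumed to name a predefined masking rule:

ALTER DATA_POLICY IF EXISTS `my_project.region-us.my_masking_policy`
SET OPTIONS (
  data_policy_type = 'DATA_MASKING_POLICY',
  masking_expression = 'ALWAYS_NULL'
);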
DROP SCHEMA statement
Deletes a dataset.
Syntax
DROP [EXTERNAL] SCHEMA [IF EXISTS] [project_name.]dataset_name [CASCADE | RESTRICT]

Arguments
- EXTERNAL: Specifies that the dataset is a federated dataset. The DROP EXTERNAL statement only removes the external definition from BigQuery. The data stored in the external location is not affected.
- IF EXISTS: If no dataset exists with that name, the statement has no effect.
- project_name: The name of the project that contains the dataset. Defaults to the project that runs this DDL statement.
- dataset_name: The name of the dataset to delete.
- CASCADE: Deletes the dataset and all resources within the dataset, such as tables, views, and functions. You must have permission to delete the resources, or else the statement returns an error. For a list of BigQuery permissions, see Predefined roles and permissions.
- RESTRICT: Deletes the dataset only if it's empty. Otherwise, returns an error. If you don't specify either CASCADE or RESTRICT, then the default behavior is RESTRICT.
Details
The statement runs in the location of the dataset if it exists, unless you specify the location in the query settings. For more information, see Specifying your location.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.datasets.delete | The dataset to delete. |
| bigquery.tables.delete | The dataset to delete. If the dataset is empty, then this permission is not required. |
Examples
The following example deletes the dataset named mydataset. If the dataset does not exist or is not empty, then the statement returns an error.

DROP SCHEMA mydataset

The following example drops the dataset named mydataset and any resources in that dataset. If the dataset does not exist, then no error is returned.

DROP SCHEMA IF EXISTS mydataset CASCADE

UNDROP SCHEMA statement
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of theService Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see thelaunch stage descriptions.
Undeletes a dataset within your time travel window.
Syntax
UNDROP SCHEMA [IF NOT EXISTS] [project_name.]dataset_name
[OPTIONS (location="us")]

Arguments
- IF NOT EXISTS: If a dataset already exists with that name, the statement has no effect.
- project_name: The name of the project that contained the deleted dataset. Defaults to the project that runs this DDL statement.
- dataset_name: The name of the dataset to undelete.
- location: Original location of the deleted dataset.
Details
When you run this statement, you must specify the location where the dataset was deleted. If you don't, the US multi-region is used.
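For example, a sketch that undeletes a dataset that lived in the EU multi-region; the location value is an assumption about where the dataset was deleted:

UNDROP SCHEMA mydataset
OPTIONS (location = "eu");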
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.datasets.create | The project where you are undeleting the dataset. |
| bigquery.datasets.get | The dataset that you are undeleting. |
Examples
The following example undeletes the dataset named mydataset. If the dataset already exists or has passed the time travel window, then the statement returns an error.

UNDROP SCHEMA mydataset;

DROP TABLE statement
Deletes a table or table clone.
Syntax
DROP TABLE [IF EXISTS] table_name

Arguments
- IF EXISTS: If no table exists with that name, the statement has no effect.
- table_name: The name of the table to delete. See Table path syntax.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.delete | The table to delete. |
| bigquery.tables.get | The table to delete. |
Examples
Deleting a table
The following example deletes a table named mytable in the dataset mydataset:

DROP TABLE mydataset.mytable

If the table name does not exist in the dataset, the following error is returned:
Error: Not found: Table myproject:mydataset.mytable
Deleting a table only if the table exists
The following example deletes a table named mytable in mydataset only if the table exists. If the table name does not exist in the dataset, no error is returned, and no action is taken.

DROP TABLE IF EXISTS mydataset.mytable
DROP SNAPSHOT TABLE statement
Deletes a table snapshot.
Syntax
DROP SNAPSHOT TABLE [IF EXISTS] table_snapshot_name

Arguments
- IF EXISTS: If no table snapshot exists with that name, then the statement has no effect.
- table_snapshot_name: The name of the table snapshot to delete. See Table path syntax.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.deleteSnapshot | The table snapshot to delete. |
Examples
Delete a table snapshot: fail if it doesn't exist
The following example deletes the table snapshot named mytablesnapshot in the mydataset dataset:

DROP SNAPSHOT TABLE mydataset.mytablesnapshot

If the table snapshot does not exist in the dataset, then the following error is returned:
Error: Not found: Table snapshot myproject:mydataset.mytablesnapshot
Delete a table snapshot: ignore if it doesn't exist
The following example deletes the table snapshot named mytablesnapshot in the mydataset dataset.

DROP SNAPSHOT TABLE IF EXISTS mydataset.mytablesnapshot

If the table snapshot doesn't exist in the dataset, then no action is taken, and no error is returned.
For information about creating table snapshots, see CREATE SNAPSHOT TABLE.
For information about restoring table snapshots, see CREATE TABLE CLONE.
DROP EXTERNAL TABLE statement
Deletes an external table.
Syntax
DROP EXTERNAL TABLE [IF EXISTS] table_name

Arguments
- IF EXISTS: If no external table exists with that name, then the statement has no effect.
- table_name: The name of the external table to delete. See Table path syntax.
Details
If table_name exists but is not an external table, the statement returns the following error:

Cannot drop table_name which has type TYPE. An external table was expected.

The DROP EXTERNAL statement only removes the external table definition from BigQuery. The data stored in the external location is not affected.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.delete | The external table to delete. |
| bigquery.tables.get | The external table to delete. |
Examples
The following example drops the external table named external_table from the dataset mydataset. It returns an error if the external table does not exist.

DROP EXTERNAL TABLE mydataset.external_table

The following example drops the external table named external_table from the dataset mydataset. If the external table does not exist, no error is returned.

DROP EXTERNAL TABLE IF EXISTS mydataset.external_table

DROP VIEW statement
Deletes a view.
Syntax
DROP VIEW [IF EXISTS] view_name

Arguments
- IF EXISTS: If no view exists with that name, the statement has no effect.
- view_name: The name of the view to delete. See Table path syntax.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.delete | The view to delete. |
| bigquery.tables.get | The view to delete. |
Examples
Deleting a view
The following example deletes a view named myview in mydataset:

DROP VIEW mydataset.myview
If the view name does not exist in the dataset, the following error is returned:
Error: Not found: Table myproject:mydataset.myview
Deleting a view only if the view exists
The following example deletes a view named myview in mydataset only if the view exists. If the view name does not exist in the dataset, no error is returned, and no action is taken.

DROP VIEW IF EXISTS mydataset.myview
DROP MATERIALIZED VIEW statement
Deletes a materialized view.
Syntax
DROP MATERIALIZED VIEW [IF EXISTS] mv_name

Arguments
- IF EXISTS: If no materialized view exists with that name, the statement has no effect.
- mv_name: The name of the materialized view to delete. See Table path syntax.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.tables.delete | The materialized view to delete. |
| bigquery.tables.get | The materialized view to delete. |
Examples
Deleting a materialized view
The following example deletes a materialized view named my_mv in mydataset:

DROP MATERIALIZED VIEW mydataset.my_mv
If the materialized view name does not exist in the dataset, the following error is returned:
Error: Not found: Table myproject:mydataset.my_mv
If you are deleting a materialized view in another project, you must specify the project, dataset, and materialized view in the following format: `project_id.dataset.materialized_view` (including the backticks if project_id contains special characters); for example, `myproject.mydataset.my_mv`.
Deleting a materialized view only if it exists
The following example deletes a materialized view named my_mv in mydataset only if the materialized view exists. If the materialized view name does not exist in the dataset, no error is returned, and no action is taken.

DROP MATERIALIZED VIEW IF EXISTS mydataset.my_mv

If you are deleting a materialized view in another project, you must specify the project, dataset, and materialized view in the following format: `project_id.dataset.materialized_view` (including the backticks if project_id contains special characters); for example, `myproject.mydataset.my_mv`.
DROP FUNCTION statement
Deletes a persistent user-defined function (UDF) or user-defined aggregate function (UDAF).
Syntax
DROP FUNCTION [IF EXISTS] [[project_name.]dataset_name.]function_name

Arguments
- IF EXISTS: If no function exists with that name, the statement has no effect.
- project_name: The name of the project containing the function to delete. Defaults to the project that runs this DDL query. If the project name contains special characters such as colons, quote it in backticks ` (example: `google.com:my_project`).
- dataset_name: The name of the dataset containing the function to delete. Defaults to the defaultDataset in the request.
- function_name: The name of the function you're deleting.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.routines.delete | The function to delete. |
Examples
The following example statement deletes the function parseJsonAsStruct contained in the dataset mydataset.

DROP FUNCTION mydataset.parseJsonAsStruct;

The following example statement deletes the function parseJsonAsStruct from the dataset sample_dataset in the project other_project.

DROP FUNCTION `other_project`.sample_dataset.parseJsonAsStruct;

DROP TABLE FUNCTION
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of theService Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see thelaunch stage descriptions.
Deletes a table function.
Syntax
DROP TABLE FUNCTION [IF EXISTS] [[project_name.]dataset_name.]function_name
Arguments
- IF EXISTS: If no table function exists with this name, the statement has no effect.
- project_name: The name of the project containing the table function to delete. Defaults to the project that runs this DDL query.
- dataset_name: The name of the dataset containing the table function to delete.
- function_name: The name of the table function to delete.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
| bigquery.routines.delete | The table function to delete. |
Example
The following example deletes a table function named my_table_function:

DROP TABLE FUNCTION mydataset.my_table_function;

DROP PROCEDURE statement
Deletes a stored procedure.
Syntax
DROP PROCEDURE [IF EXISTS] [[project_name.]dataset_name.]procedure_name
Arguments
- IF EXISTS: If no procedure exists with that name, the statement has no effect.
- project_name: The name of the project containing the procedure to delete. Defaults to the project that runs this DDL query. If the project name contains special characters such as colons, it should be quoted in backticks ` (example: `google.com:my_project`).
- dataset_name: The name of the dataset containing the procedure to delete. Defaults to the defaultDataset in the request.
- procedure_name: The name of the procedure you're deleting.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.routines.delete | The procedure to delete. |
Examples
The following example statement deletes the procedure myprocedure contained in the dataset mydataset.
DROP PROCEDURE mydataset.myprocedure;
The following example statement deletes the procedure myProcedure from the dataset sample_dataset in the project other_project.
DROP PROCEDURE `other-project`.sample_dataset.myProcedure;
DROP ROW ACCESS POLICY statement
Deletes a row-level access policy.
Important: You cannot delete the last row-level access policy from a table using DROP ROW ACCESS POLICY. Attempting to do so results in an error. To delete the last row-level access policy on a table, you must use DROP ALL ROW ACCESS POLICIES instead.
Syntax
DROP ROW ACCESS POLICY [IF EXISTS] row_access_policy_name ON table_name;
DROP ALL ROW ACCESS POLICIES ON table_name;
Arguments
- IF EXISTS: If no row-level access policy exists with that name, the statement has no effect.
- row_access_policy_name: The name of the row-level access policy that you are deleting. Each row-level access policy on a table has a unique name.
- table_name: The name of the table with the row-level access policy or policies that you want to delete.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.rowAccessPolicies.delete | The row-level access policy to delete. |
bigquery.rowAccessPolicies.setIamPolicy | The row-level access policy to delete. |
bigquery.rowAccessPolicies.list | The table to delete all row-level access policies on. Only required for DROP ALL statements. |
Examples
Delete a row-level access policy from a table:
DROP ROW ACCESS POLICY my_row_filter ON project.dataset.my_table;
Delete all the row-level access policies from a table:
DROP ALL ROW ACCESS POLICIES ON project.dataset.my_table;
DROP CAPACITY statement
Deletes a capacity commitment.
Syntax
DROP CAPACITY [IF EXISTS] project_id.location.capacity-commitment-id
Arguments
- IF EXISTS: If no capacity commitment exists with that ID, the statement has no effect.
- project_id: The project ID of the administration project where the commitment was created.
- location: The location of the commitment.
- capacity-commitment-id: The capacity commitment ID.
To find the capacity commitment ID, query the INFORMATION_SCHEMA.CAPACITY_COMMITMENTS_BY_PROJECT table.
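For example, a query along the following lines returns the commitment IDs for a region; the region qualifier and the selected column name are assumptions, so adjust them to your administration project's location:
SELECT capacity_commitment_id
FROM `region-us`.INFORMATION_SCHEMA.CAPACITY_COMMITMENTS_BY_PROJECT;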
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.capacityCommitments.delete | The administration project that maintains ownership of the commitments. |
Example
The following example deletes the capacity commitment:
DROP CAPACITY `admin_project.region-us.1234`
DROP RESERVATION statement
Deletes a reservation.
Syntax
DROP RESERVATION [IF EXISTS] project_id.location.reservation_id
Arguments
- IF EXISTS: If no reservation exists with that ID, the statement has no effect.
- project_id: The project ID of the administration project where the reservation was created.
- location: The location of the reservation.
- reservation_id: The reservation ID.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.reservations.delete | The administration project that maintains ownership of the reservations. |
Example
The following example deletes the reservation prod:
DROP RESERVATION `admin_project.region-us.prod`
DROP ASSIGNMENT statement
Deletes a reservation assignment.
Syntax
DROP ASSIGNMENT [IF EXISTS] project_id.location.reservation_id.assignment_id
Arguments
- IF EXISTS: If no assignment exists with that ID, the statement has no effect.
- project_id: The project ID of the administration project where the reservation was created.
- location: The location of the reservation.
- reservation_id: The reservation ID.
- assignment_id: The assignment ID.
To find the assignment ID, query the INFORMATION_SCHEMA.ASSIGNMENTS view.
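For example, a query along these lines returns the assignment IDs for a region; the region qualifier and the selected column name are assumptions:
SELECT assignment_id
FROM `region-us`.INFORMATION_SCHEMA.ASSIGNMENTS;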
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.reservationAssignments.delete | The administration project and the assignee. |
Example
The following example deletes an assignment from the reservation named prod:
DROP ASSIGNMENT `admin_project.region-us.prod.1234`
DROP SEARCH INDEX statement
Deletes a search index on a table.
Syntax
DROP SEARCH INDEX [IF EXISTS] index_name ON table_name
Arguments
- IF EXISTS: If no search index exists with that name on the table, the statement has no effect.
- index_name: The name of the search index to be deleted.
- table_name: The name of the table with the index.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.tables.deleteIndex | The table with the search index to delete. |
Example
The following example deletes a search index my_index from my_table:
DROP SEARCH INDEX my_index ON dataset.my_table;
DROP VECTOR INDEX statement
Deletes a vector index on a table.
Syntax
DROP VECTOR INDEX [IF EXISTS] index_name ON table_name
Arguments
- IF EXISTS: If no vector index exists with that name on the table, the statement has no effect.
- index_name: The name of the vector index to be deleted.
- table_name: The name of the table with the vector index.
Required permissions
This statement requires the following IAM permissions:
| Permission | Resource |
|---|---|
bigquery.tables.deleteIndex | The table with the vector index to delete. |
Example
The following example deletes a vector index my_index from my_table:
DROP VECTOR INDEX my_index ON dataset.my_table;
DROP DATA_POLICY statement
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of theService Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see thelaunch stage descriptions.
Note: To provide feedback or request support for this feature, send an email to bigquery-security@google.com.
Deletes a data policy in a project.
Syntax
DROP DATA_POLICY [IF EXISTS] `project_id.location_id.data_policy_id`;
Arguments
- IF EXISTS: If no data policy exists with that name, the statement has no effect.
- project_id: The project ID of the project where the data policy resides.
- location_id: The location of the data policy.
- data_policy_id: The name of the data policy to be deleted.
Required permissions
The user or service account that deletes a data policy must have the bigquery.dataPolicies.delete permission. This permission is included in the BigQuery Data Policy Admin, BigQuery Admin, and BigQuery Data Owner roles.
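Example
As a sketch, assuming a data policy named my_data_policy in region-us of myproject, the statement would look like this:
DROP DATA_POLICY IF EXISTS `myproject.region-us.my_data_policy`;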
Table path syntax
Use the following syntax when specifying the path of a table resource, including standard tables, views, materialized views, external tables, and table snapshots.
table_path := [[project_name.]dataset_name.]table_name

- project_name: The name of the project that contains the table resource. Defaults to the project that runs the DDL query. If the project name contains special characters such as colons, quote the name in backticks ` (example: `google.com:my_project`).
- dataset_name: The name of the dataset that contains the table resource. Defaults to the defaultDataset in the request.
- table_name: The name of the table resource.
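For example, the following statement creates a table through a fully qualified path; the backticks are required here because the hypothetical project name contains a colon:
CREATE TABLE `google.com:my_project`.mydataset.mytable (x INT64)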
When you create a table in BigQuery, the table name must be unique per dataset. The table name can:
- Contain characters with a total of up to 1,024 UTF-8 bytes.
- Contain Unicode characters in category L (letter), M (mark), N (number), Pc (connector, including underscore), Pd (dash), Zs (space). For more information, see General Category.
The following are all examples of valid table names: table 01, ग्राहक, 00_お客様, étudiant-01.
Caveats:
- Table names are case-sensitive by default. mytable and MyTable can coexist in the same dataset, unless they are part of a dataset with case-sensitivity turned off (see the sketch after this list).
- Some table names and table name prefixes are reserved. If you receive an error saying that your table name or prefix is reserved, then select a different name and try again.
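A minimal sketch of turning case sensitivity off at the dataset level, assuming a new dataset named mydataset (is_case_insensitive is the dataset option that controls this behavior):
CREATE SCHEMA mydataset OPTIONS (is_case_insensitive = TRUE)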
If you include multiple dot operators (.) in a sequence, the duplicate operators are implicitly stripped. For example, this:
project_name....dataset_name..table_name
Becomes this:
project_name.dataset_name.table_name