Updating partitioned table data using DML

This page provides an overview of data manipulation language (DML) support forpartitioned tables.

For more information on DML, see:

Tables used in examples

The following JSON schema definitions represent the tables used in the exampleson this page.

mytable: aningestion-time partitioned table

    [      {"name": "field1", "type": "INTEGER"},      {"name": "field2", "type": "STRING"}    ]

mytable2: a standard (non-partitioned) table

    [      {"name": "id", "type": "INTEGER"},      {"name": "ts", "type": "TIMESTAMP"}    ]

mycolumntable: apartitioned tablethat is partitioned by using thetsTIMESTAMP column

    [      {"name": "field1", "type": "INTEGER"},      {"name": "field2", "type": "STRING"}      {"name": "field3", "type": "BOOLEAN"}      {"name": "ts", "type": "TIMESTAMP"}    ]

In examples whereCOLUMN_ID appears, replace it with the name of thecolumn you wish to operate on.

Inserting data

You use a DMLINSERT statementto add rows to a partitioned table.

Inserting data into ingestion-time partitioned tables

When you use a DML statement to add rows to an ingestion-time partitioned table,you can specify the partition to which the rows should be added. You referencethe partition using the_PARTITIONTIME pseudocolumn.

For example, the followingINSERT statement adds a row to the May 1, 2017partition ofmytable —“2017-05-01”.

INSERTINTOproject_id.dataset.mytable(_PARTITIONTIME,field1,field2)SELECTTIMESTAMP("2017-05-01"),1,"one"

Only timestamps that correspond to exact date boundaries can be used. Forexample, the following DML statement returns an error:

INSERTINTOproject_id.dataset.mytable(_PARTITIONTIME,field1,field2)SELECTTIMESTAMP("2017-05-01 21:30:00"),1,"one"

Note: The_PARTITIONTIME pseudocolumn can also be modified using an UPDATE statement.

Inserting data into partitioned tables

Inserting data into a partitioned table using DML is the same as inserting datainto a non-partitioned table.

For example, the followingINSERT statement adds rows to partitioned tablemycolumntable by selecting data frommytable2 (a non-partitioned table).

INSERTINTOproject_id.dataset.mycolumntable(ts,field1)SELECTts,idFROMproject_id.dataset.mytable2

Deleting data

You use a DMLDELETE statementto delete rows from a partitioned table.

Deleting data in ingestion-time partitioned tables

The followingDELETE statement deletes all rows from the June 1, 2017partition ("2017-06-01") ofmytable wherefield1 is equal to21. Youreference the partition using the_PARTITIONTIME pseudocolumn.

DELETEproject_id.dataset.mytableWHEREfield1=21AND_PARTITIONTIME="2017-06-01"

Deleting data in partitioned tables

Deleting data in a partitioned table using DML is the same as deleting datafrom a non-partitioned table.

For example, the followingDELETE statement deletes all rows from theJune 1, 2017 partition ("2017-06-01") ofmycolumntable wherefield1 isequal to21.

DELETEproject_id.dataset.mycolumntableWHEREfield1=21ANDDATE(ts)="2017-06-01"

Using DML DELETE to delete partitions

If a qualifyingDELETE statement covers all rows in a partition,BigQuery removes the entire partition. This removal is donewithout scanning bytes or consuming slots. The following example of aDELETEstatement covers the entire partition of a filter on the_PARTITIONDATEpseudocolumn:

DELETEmydataset.mytableWHERE_PARTITIONDATEIN('2076-10-07','2076-03-06');

Common disqualifications

Queries with the following characteristics may not benefit from the optimization:

Partial partition coverage
References to non-partitioning columns
Recently ingested data through the BigQueryStorage Write API or thelegacy streaming API
Filters with subqueries or unsupported predicates

Eligibility for optimization can vary with the type of partitioning, theunderlying storage metadata, and the filter predicates. As a best practice,perform a dry run to verify that the query results in 0 bytes processed.

Multi-statement transaction

This optimization works within amulti-statement transaction. The following query example replaces a partition with data from another table in a single transaction, without scanning the partition fortheDELETE statement.

DECLAREREPLACE_DAYDATE;BEGINTRANSACTION;-- find the partition which we want to replaceSETREPLACE_DAY=(SELECTMAX(d)FROMmydataset.mytable_staging);-- delete the entire partition from mytableDELETEFROMmydataset.mytableWHEREpart_col=REPLACE_DAY;-- insert the new data into the same partition in mytableINSERTINTOmydataset.mytableSELECT*FROMmydataset.mytable_stagingWHEREpart_col=REPLACE_DAY;COMMITTRANSACTION;

Updating data

You use anUPDATE statementto update rows in a partitioned table.

Updating data in ingestion-time partitioned tables

The followingUPDATE statement moves rows from one partition to another.Rows in the May 1, 2017 partition (“2017-05-01”) ofmytable wherefield1is equal to21 are moved to the June 1, 2017 partition (“2017-06-01”).

UPDATEproject_id.dataset.mytableSET_PARTITIONTIME="2017-06-01"WHERE_PARTITIONTIME="2017-05-01"ANDfield1=21

Updating data in partitioned tables

Updating data in a partitioned table using DML is the same as updating datafrom a non-partitioned table. For example, the followingUPDATEstatement moves rows from one partition to another. Rows in the May 1, 2017partition (“2017-05-01”) ofmytable wherefield1 is equal to21 aremoved to the June 1, 2017 partition (“2017-06-01”).

UPDATEproject_id.dataset.mycolumntableSETts="2017-06-01"WHEREDATE(ts)="2017-05-01"ANDfield1=21

DML in hourly, monthly, and yearly partitioned tables

You can use DML statements to modify an hourly, monthly, or yearly partitioned table.Provide the hour, month, or year range of the relevant dates/timestamps/datetimes, asin the following example for monthly partitioned tables:

bqquery--nouse_legacy_sql'DELETE FROM my_dataset.my_table WHERE    TIMESTAMP_TRUNC(ts_column, MONTH) = "2020-01-01 00:00:00";'

Or another example for partitioned tables withDATETIME column:

bqquery--nouse_legacy_sql'DELETE FROM my_dataset.my_table WHERE    dt_column BETWEEN DATETIME("2020-01-01")    AND DATETIME("2020-05-01");'

Using a`MERGE` statement

You use a DMLMERGE statementto combineINSERT,UPDATE, andDELETE operations for a partitioned tableinto one statement and perform them atomically.

Pruning partitions when using a`MERGE` statement

When you run aMERGE statement against a partitioned table, you can limitwhich partitions are scanned by including the partitioning column in eithera subquery filter, asearch_condition filter, or amerge_condition filter.Pruning can occur when scanning the source table or the target table, or both.

Each of the examples below queries an ingestion-time partitioned table usingthe_PARTITIONTIME pseudocolumn as a filter.

Using a subquery to filter source data

In the followingMERGE statement, the subquery in theUSING clause filterson the_PARTITIONTIME pseudocolumn in the source table.

MERGEdataset.targetTUSING(SELECT*FROMdataset.sourceWHERE_PARTITIONTIME='2018-01-01')SONT.COLUMN_ID=S.COLUMN_IDWHENMATCHEDTHENDELETE

Looking at the query execution plan, the subquery runs first. Only the rows inthe'2018-01-01' partition in the source table are scanned. Here is therelevant stage in the query plan:

READ $10:name, $11:_PARTITIONTIMEFROM temp.sourceWHERE equal($11, 1514764800.000000000)

Using a filter in the`search_condition` of a`when_clause`

If asearch_condition contains a filter, then the query optimizer attempts toprune partitions. For example, in the followingMERGE statement, eachWHENMATCHED andWHEN NOT MATCHED clause contains a filter on the_PARTITIONTIMEpseudocolumn.

MERGEdataset.targetTUSINGdataset.sourceSONT.COLUMN_ID=S.COLUMN_IDWHENMATCHEDANDT._PARTITIONTIME='2018-01-01'THENUPDATESETCOLUMN_ID=S.COLUMN_IDWHENMATCHEDANDT._PARTITIONTIME='2018-01-02'THENUPDATESETCOLUMN_ID=S.COLUMN_ID+10WHENNOTMATCHEDBYSOURCEANDT._PARTITIONTIME='2018-01-03'THENDELETE

During the join stage, only the following partitions are scanned in the targettable:'2018-01-01','2018-01-02', and'2018-01-03' — that is, theunion of all thesearch_condition filters.

From the query execution plan:

READ$1:COLUMN_ID, $2:_PARTITIONTIME, $3:$file_temp_id, $4:$row_temp_idFROM temp.targetWHERE or(equal($2, 1514764800.000000000), equal($2, 1514851200.000000000), equal($2, 1514937600.000000000))

However, in the following example, theWHEN NOT MATCHED BY SOURCE clause doesnot have a filter expression:

MERGEdataset.targetTUSINGdataset.sourceSONT.COLUMN_ID=S.COLUMN_IDWHENMATCHEDANDT._PARTITIONTIME='2018-01-01'THENUPDATESETCOLUMN_ID=S.COLUMN_IDWHENNOTMATCHEDBYSOURCETHENUPDATESETCOLUMN_ID=COLUMN_ID+1

This query must scan the entire target table to compute theWHEN NOT MATCHED BYSOURCE clause. As a result, no partitions are pruned.

Using a constant false predicate in a`merge_condition`

If you use theWHEN NOT MATCHED andWHEN NOT MATCHED BY SOURCE clausestogether, then BigQuery usually performs a full outer join, whichcannot be pruned. However, if the merge condition uses a constant falsepredicate, then BigQuery can use the filter condition forpartition pruning. For more information about the use of constant falsepredicates, see the description of themerge_condition clause in theMERGE statementdocumentation.

The following example scans only the'2018-01-01' partition in both the targetand source tables.

MERGEdataset.targetTUSINGdataset.sourceSONFALSEWHENNOTMATCHEDAND_PARTITIONTIME='2018-01-01'THENINSERT(COLUMN_ID)VALUES(COLUMN_ID)WHENNOTMATCHEDBYSOURCEAND_PARTITIONTIME='2018-01-01'THENDELETE

Using a filter in a`merge_condition`

The query optimizer attempts to use a filter in amerge_condition to prunepartitions. The query optimizer might or might not be able to push the predicatedown to the table scanning stage, depending on the type of join.

In the following example, themerge_condition is used as a predicate to jointhe source and target tables. The query optimizer can push this predicate downwhen it scans both tables. As a result, the query only scans the'2018-01-01'partition in both the target and source tables.

MERGEdataset.targetTUSINGdataset.sourceSONT.COLUMN_ID=S.COLUMN_IDANDT._PARTITIONTIME='2018-01-01'ANDS._PARTITIONTIME='2018-01-01'WHENMATCHEDTHENUPDATESETCOLUMN_ID=NEW_VALUE

In the next example, themerge_condition does not contain a predicate for thesource table, so no partition pruning can be performed on the source table. Thestatement does contain a predicate for the target table, but the statement usesaWHEN NOT MATCHED BY SOURCE clause, rather than aWHEN MATCHED clause. Thatmeans the query has to scan the entire target table for the rows that don'tmatch.

MERGEdataset.targetTUSINGdataset.sourceSONT.COLUMN_ID=S.COLUMN_IDANDT._PARTITIONTIME='2018-01-01'WHENNOTMATCHEDBYSOURCETHENUPDATESETCOLUMN_ID=NEW_VALUE

Limitations

For information about DML limitations, seeLimitationson theDML referencepage.

Quotas

For information about DML quota information, seeDML statements on theQuotas and limits page.

Pricing

For information about DML pricing, see how to compute the query size for DMLstatements run onPartitioned tables.

Table security

To control access to tables in BigQuery, seeControl access to resources with IAM.

What's next

Learn how tocreate partitioned tables
Learn how toquery partitioned tables
Get anintroduction to DML
Learn how to compose DML statements usingDML syntax

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-18 UTC.

Movatterモバイル変換