Partitioned Data Manipulation Language

PartitionedData Manipulation Language (partitionedDML) is designed for the following types of bulk updates and deletes:

  • Periodic cleanup and garbage collection. Examples are deleting old rows orsetting columns toNULL.
  • Backfilling new columns with default values. An example is using anUPDATEstatement to set a new column's value toFalse where it is currentlyNULL.

Partitioned DML isn't suitable for small-scale transaction processing. If youwant to run a statement on a few rows, use transactional DMLs with identifiableprimary keys. For more information, seeUsing DML.

If you need to commit a large number of blind writes, but don't require anatomic transaction, you can bulk modify your Spanner tablesusing batch write. For more information, seeModify data using batch writes.

You can get insights on active partitioned DML queries and their progress fromstatistics tables in your Spanner database. For more information,seeActive partitioned DMLs statistics.

DML and partitioned DML

Spanner supports two execution modes for DML statements:

  • DML, which is suitable for transaction processing. For more information, seeUsing DML.

  • Partitioned DML, which enables large-scale, database-wide operations withminimal impact on concurrent transaction processing by partitioning the keyspace and running the statement over partitions in separate, smaller-scopedtransactions. For more information, seeUsing partitioned DML.

The following table highlights some of the differences between the two executionmodes.

DMLPartitioned DML
Rows that don't match theWHERE clause might be locked.Only rows that match theWHERE clause are locked.
Transaction size limits apply.Spanner handles the transaction limits and per-transaction concurrency limits.
Statements don't need to be idempotent.A DML statement must beidempotent to ensure consistent results.
A transaction can include multiple DML and SQL statements.A partitioned transaction can include only one DML statement.
There are no restrictions on complexity of statements.Statements must befully partitionable.
You create read-write transactions in your client code.Spanner creates the transactions.

Partitionable and idempotent

When a partitioned DML statement runs, rows in one partition don't have accessto rows in other partitions, and you cannot choose how Spanner createsthe partitions. Partitioning ensures scalability, but it also means thatpartitioned DML statements must befully partitionable. That is, thepartitioned DML statement must be expressible as the union of a set ofstatements, where each statement accesses a single row of the table and eachstatement accesses no other tables. For example, a DML statement that accessesmultiple tables or performs a self-join is not partitionable. If the DMLstatement is not partitionable, Spanner returns the errorBadUsage.

These DML statements are fully partitionable, because each statement canbe applied to a single row in the table:

UPDATE Singers SET LastName = NULL WHERE LastName = '';DELETE FROM Albums WHERE MarketingBudget > 10000;

This DML statement is not fully partitionable, because it accesses multipletables:

# Not fully partitionableDELETE FROM Singers WHERESingerId NOT IN (SELECT SingerId FROM Concerts);

Spanner might execute a partitioned DML statement multiple timesagainst some partitions due to network-level retries. As a result, a statementmight be executed more than once against a row. The statement must therefore beidempotent to yield consistent results. A statement is idempotent if executingit multiple times against a single row leads to the same result.

This DML statement is idempotent:

UPDATE Singers SET MarketingBudget = 1000 WHERE true;

This DML statement is not idempotent:

UPDATE Singers SET MarketingBudget = 1.5 * MarketingBudget WHERE true;

Delete rows from parent tables with indexed child tables

When you use a partitioned DML statement to delete rows in a parent table, theoperation might fail with the error:The transaction contains too manymutations. This occurs if the parent table has interleaved child tables thatcontain a global index. Mutations to the child table rows themselves aren'tcounted against the transaction'smutation limit.However, the corresponding mutations to the index entries are counted. If alarge number of child table index entries are affected, the transaction mightexceed the mutation limit.

To avoid this error, delete the rows in two separate partitioned DML statements:

  1. Run a partitioned delete on the child tables.
  2. Run a partitioned delete on the parent table.

This two-step process helps keep the mutation count within the allowed limitsfor each transaction. Alternatively, you can drop the global index on the childtable before deleting the parent rows.

Row locking

Spanner acquires a lock only if a row is a candidate for update ordeletion. This behavior is different fromDML execution, which might read-lockrows that don't match theWHERE clause.

Execution and transactions

Whether a DML statement is partitioned or not depends on the client librarymethod that you choose for execution. Each client library provides separatemethods forDML execution andPartitioned DML execution.

You can execute only one partitioned DML statement in a call to the clientlibrary method.

Spanner does not apply the partitioned DML statements atomicallyacross the entire table. Spanner does, however, apply partitionedDML statements atomically across each partition.

Partitioned DML does not support commit or rollback. Spannerexecutes and applies the DML statement immediately.

  • If you cancel the operation, Spanner cancels the executingpartitions and doesn't start the remaining partitions.Spanner does not roll back any partitions that have alreadyexecuted.
  • If the execution of the statement causes an error, then execution stopsacross all partitions and Spanner returns that error for theentire operation. Some examples of errors are violations of data typeconstraints, violations ofUNIQUE INDEX, and violations ofON DELETE NOACTION. Depending on the point in time when the execution failed, thestatement might have successfully run against some partitions, and mightnever have been run against other partitions.

If the partitioned DML statement succeeds, then Spanner ran thestatement at least once against each partition of the key range.

Count of modified rows

A partitioned DML statement returns a lower bound on the number of modifiedrows. It might not be an exact count of the number of rows modified, becausethere is no guarantee that Spanner counts all the modified rows.

Transaction limits

Spanner creates the partitions and transactions that it needs toexecute a partitioned DML statement. Transaction limits or per-transactionconcurrency limits apply, but Spanner attempts to keep thetransactions within the limits.

Spanner allows a maximum of 20,000concurrent partitioned DML statements per database.

Unsupported features

Spanner does not support some features for partitioned DML:

  • INSERT is not supported.
  • Google Cloud console: You can't execute partitioned DML statements in theGoogle Cloud console.
  • Query plans and profiling: The Google Cloud CLI and the clientlibraries don't support query plans and profiling.
  • Subqueries that read from another table, or a different row of the same table.

For complex scenarios, such as moving a table or transformations that requirejoins across tables, considerusing the Dataflow connector.

Best practices

Apply the following best practices to improve the performance of yourpartitioned DML statements:

  • Avoid high concurrency: Running a large number of partitioned DMLstatements concurrently (for example, more than 100) might lead to lockcontention on internal system tables, degrading performance. Instead ofrunning a high number of concurrent statements, use a single partitioned DMLstatement.
  • UtilizePDML_MAX_PARALLELISM:To increase the throughput of a single partitioned DML statement, especiallyon tables with many splits, set a higher value for thePDML_MAX_PARALLELISM statement hint. This allows the single statement touse more parallelism internally. Setting a higher value forPDML_MAX_PARALLELISM results in more compute usage, so you should try tobalance compute usage and increased processing speed.
  • Let Spanner handle partitioning: Avoid manually shardingyour data (for example, using primary key ranges) and running separatepartitioned DML statements on each shard. Partitioned DML is designed toefficiently partition the work across the entire table. Custom shardingoften increases overhead and might worsen contention.
  • Understand partitioning scope: Partitioned DML operations areparallelized across all splits in the entire database, not just the splitscontaining data for the table being modified. This means that for databaseswith a large number of splits, there might be overhead even if the targettable is small or the modified data is localized. Partitioned DML might notbe the most efficient choice for modifying a very small portion of a largedatabase.
  • Consider alternatives for frequent, small deletions: For use casesinvolving frequent deletions of a small number of known rows, using DMLstatements within transactions or theBatchWrite API might offer betterperformance and lower overhead than using partitioned DML.

Examples

The following code example updates theMarketingBudget column of theAlbumstable.

C++

You use theExecutePartitionedDml() function to execute a partitioned DML statement.

voidDmlPartitionedUpdate(google::cloud::spanner::Clientclient){namespacespanner=::google::cloud::spanner;autoresult=client.ExecutePartitionedDml(spanner::SqlStatement("UPDATE Albums SET MarketingBudget = 100000""  WHERE SingerId > 1"));if(!result)throwstd::move(result).status();std::cout <<"Updated at least " <<result->row_count_lower_bound            <<" row(s) [spanner_dml_partitioned_update]\n";}

C#

You use theExecutePartitionedUpdateAsync() method to execute a partitioned DML statement.

usingGoogle.Cloud.Spanner.Data;usingSystem;usingSystem.Threading.Tasks;publicclassUpdateUsingPartitionedDmlCoreAsyncSample{publicasyncTask<long>UpdateUsingPartitionedDmlCoreAsync(stringprojectId,stringinstanceId,stringdatabaseId){stringconnectionString=$"Data Source=projects/{projectId}/instances/{instanceId}/databases/{databaseId}";usingvarconnection=newSpannerConnection(connectionString);awaitconnection.OpenAsync();usingvarcmd=connection.CreateDmlCommand("UPDATE Albums SET MarketingBudget = 100000 WHERE SingerId > 1");longrowCount=awaitcmd.ExecutePartitionedUpdateAsync();Console.WriteLine($"{rowCount} row(s) updated...");returnrowCount;}}

Go

You use thePartitionedUpdate() method to execute a partitioned DML statement.

import("context""fmt""io""cloud.google.com/go/spanner")funcupdateUsingPartitionedDML(wio.Writer,dbstring)error{ctx:=context.Background()client,err:=spanner.NewClient(ctx,db)iferr!=nil{returnerr}deferclient.Close()stmt:=spanner.Statement{SQL:"UPDATE Albums SET MarketingBudget = 100000 WHERE SingerId > 1"}rowCount,err:=client.PartitionedUpdate(ctx,stmt)iferr!=nil{returnerr}fmt.Fprintf(w,"%d record(s) updated.\n",rowCount)returnnil}

Java

You use theexecutePartitionedUpdate() method to execute a partitioned DML statement.

staticvoidupdateUsingPartitionedDml(DatabaseClientdbClient){Stringsql="UPDATE Albums SET MarketingBudget = 100000 WHERE SingerId > 1";longrowCount=dbClient.executePartitionedUpdate(Statement.of(sql));System.out.printf("%d records updated.\n",rowCount);}

Node.js

You use therunPartitionedUpdate() method to execute a partitioned DML statement.

// Imports the Google Cloud client libraryconst{Spanner}=require('@google-cloud/spanner');/** * TODO(developer): Uncomment the following lines before running the sample. */// const projectId = 'my-project-id';// const instanceId = 'my-instance';// const databaseId = 'my-database';// Creates a clientconstspanner=newSpanner({projectId:projectId,});// Gets a reference to a Cloud Spanner instance and databaseconstinstance=spanner.instance(instanceId);constdatabase=instance.database(databaseId);try{const[rowCount]=awaitdatabase.runPartitionedUpdate({sql:'UPDATE Albums SET MarketingBudget = 100000 WHERE SingerId > 1',});console.log(`Successfully updated${rowCount} records.`);}catch(err){console.error('ERROR:',err);}finally{// Close the database when finished.database.close();}

PHP

You use theexecutePartitionedUpdate() method to execute a partitioned DML statement.

use Google\Cloud\Spanner\SpannerClient;/** * Updates sample data in the database by partition with a DML statement. * * This updates the `MarketingBudget` column which must be created before * running this sample. You can add the column by running the `add_column` * sample or by running this DDL statement against your database: * *     ALTER TABLE Albums ADD COLUMN MarketingBudget INT64 * * Example: * ``` * update_data($instanceId, $databaseId); * ``` * * @param string $instanceId The Spanner instance ID. * @param string $databaseId The Spanner database ID. */function update_data_with_partitioned_dml(string $instanceId, string $databaseId): void{    $spanner = new SpannerClient();    $instance = $spanner->instance($instanceId);    $database = $instance->database($databaseId);    $rowCount = $database->executePartitionedUpdate(        'UPDATE Albums SET MarketingBudget = 100000 WHERE SingerId > 1'    );    printf('Updated %d row(s).' . PHP_EOL, $rowCount);}

Python

You use theexecute_partitioned_dml() method to execute a partitioned DML statement.

# instance_id = "your-spanner-instance"# database_id = "your-spanner-db-id"spanner_client=spanner.Client()instance=spanner_client.instance(instance_id)database=instance.database(database_id)row_ct=database.execute_partitioned_dml("UPDATE Albums SET MarketingBudget = 100000 WHERE SingerId > 1")print("{} records updated.".format(row_ct))

Ruby

You use theexecute_partitioned_update() method to execute a partitioned DML statement.

# project_id  = "Your Google Cloud project ID"# instance_id = "Your Spanner instance ID"# database_id = "Your Spanner database ID"require"google/cloud/spanner"spanner=Google::Cloud::Spanner.newproject:project_idclient=spanner.clientinstance_id,database_idrow_count=client.execute_partition_update("UPDATE Albums SET MarketingBudget = 100000 WHERE SingerId > 1")puts"#{row_count} records updated."

The following code example deletes rows from theSingers table, based on theSingerId column.

C++

voidDmlPartitionedDelete(google::cloud::spanner::Clientclient){namespacespanner=::google::cloud::spanner;autoresult=client.ExecutePartitionedDml(spanner::SqlStatement("DELETE FROM Singers WHERE SingerId > 10"));if(!result)throwstd::move(result).status();std::cout <<"Deleted at least " <<result->row_count_lower_bound            <<" row(s) [spanner_dml_partitioned_delete]\n";}

C#

usingGoogle.Cloud.Spanner.Data;usingSystem;usingSystem.Threading.Tasks;publicclassDeleteUsingPartitionedDmlCoreAsyncSample{publicasyncTask<long>DeleteUsingPartitionedDmlCoreAsync(stringprojectId,stringinstanceId,stringdatabaseId){stringconnectionString=$"Data Source=projects/{projectId}/instances/{instanceId}/databases/{databaseId}";usingvarconnection=newSpannerConnection(connectionString);awaitconnection.OpenAsync();usingvarcmd=connection.CreateDmlCommand("DELETE FROM Singers WHERE SingerId > 10");longrowCount=awaitcmd.ExecutePartitionedUpdateAsync();Console.WriteLine($"{rowCount} row(s) deleted...");returnrowCount;}}

Go

import("context""fmt""io""cloud.google.com/go/spanner")funcdeleteUsingPartitionedDML(wio.Writer,dbstring)error{ctx:=context.Background()client,err:=spanner.NewClient(ctx,db)iferr!=nil{returnerr}deferclient.Close()stmt:=spanner.Statement{SQL:"DELETE FROM Singers WHERE SingerId > 10"}rowCount,err:=client.PartitionedUpdate(ctx,stmt)iferr!=nil{returnerr}fmt.Fprintf(w,"%d record(s) deleted.",rowCount)returnnil}

Java

staticvoiddeleteUsingPartitionedDml(DatabaseClientdbClient){Stringsql="DELETE FROM Singers WHERE SingerId > 10";longrowCount=dbClient.executePartitionedUpdate(Statement.of(sql));System.out.printf("%d records deleted.\n",rowCount);}

Node.js

// Imports the Google Cloud client libraryconst{Spanner}=require('@google-cloud/spanner');/** * TODO(developer): Uncomment the following lines before running the sample. */// const projectId = 'my-project-id';// const instanceId = 'my-instance';// const databaseId = 'my-database';// Creates a clientconstspanner=newSpanner({projectId:projectId,});// Gets a reference to a Cloud Spanner instance and databaseconstinstance=spanner.instance(instanceId);constdatabase=instance.database(databaseId);try{const[rowCount]=awaitdatabase.runPartitionedUpdate({sql:'DELETE FROM Singers WHERE SingerId > 10',});console.log(`Successfully deleted${rowCount} records.`);}catch(err){console.error('ERROR:',err);}finally{// Close the database when finished.database.close();}

PHP

use Google\Cloud\Spanner\SpannerClient;/** * Delete sample data in the database by partition with a DML statement. * * This updates the `MarketingBudget` column which must be created before * running this sample. You can add the column by running the `add_column` * sample or by running this DDL statement against your database: * *     ALTER TABLE Albums ADD COLUMN MarketingBudget INT64 * * Example: * ``` * update_data($instanceId, $databaseId); * ``` * * @param string $instanceId The Spanner instance ID. * @param string $databaseId The Spanner database ID. */function delete_data_with_partitioned_dml(string $instanceId, string $databaseId): void{    $spanner = new SpannerClient();    $instance = $spanner->instance($instanceId);    $database = $instance->database($databaseId);    $rowCount = $database->executePartitionedUpdate(        'DELETE FROM Singers WHERE SingerId > 10'    );    printf('Deleted %d row(s).' . PHP_EOL, $rowCount);}

Python

# instance_id = "your-spanner-instance"# database_id = "your-spanner-db-id"spanner_client=spanner.Client()instance=spanner_client.instance(instance_id)database=instance.database(database_id)row_ct=database.execute_partitioned_dml("DELETE FROM Singers WHERE SingerId > 10")print("{} record(s) deleted.".format(row_ct))

Ruby

# project_id  = "Your Google Cloud project ID"# instance_id = "Your Spanner instance ID"# database_id = "Your Spanner database ID"require"google/cloud/spanner"spanner=Google::Cloud::Spanner.newproject:project_idclient=spanner.clientinstance_id,database_idrow_count=client.execute_partition_update("DELETE FROM Singers WHERE SingerId > 10")puts"#{row_count} records deleted."

What's next?

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.