Introduction to data transformation

This document describes the different ways you can transform data in yourBigQuery tables.

For more information about data integrations, seeIntroduction to loading, transforming, and exporting data.

Methods of transforming data

You can transform data in BigQuery in the following ways:

The following table shows the different characteristics of each transformationmethod.

Transform methodTransformation targetDefinition methodTransformation frequency
Data manipulation language (DML)Table (in place)SQL DMLUser-initiated or scheduled
Materialized viewsMaterialized viewSQL queryAutomatic or manual refresh
Continuous queriesTable,Pub/Sub topic,Bigtable tableSQL query with EXPORT DATAContinuous
DataformTableDataform core (SQLX)Scheduled (pipelines)
BigQuery pipelinesTableBigQuery pipelinesScheduled (pipelines)
Data preparationTableVisual editorScheduled

You can alsoreview the change history of a BigQuery tableto examine the transformations made to a table in a specified time range.

Transform data with DML

You can usedata manipulation language (DML)to transform data in your BigQuery tables. DML statements areGoogleSQL queries thatmanipulate existing table data to add or delete rows, modify datain existing rows, or merge data with values from another table. DMLtransformations are also supported inpartitioned tables.

You can run multiple DML statements concurrently, where BigQueryqueues several DML statements that transform your data one after the other.BigQuery manageshow concurrent DML statements are run,based upon the transformation type.

Transform data with materialized views

Materialized views views areprecomputed views that periodically cache the results of a SQL query for increasedperformance and efficiency. BigQuery leverages precomputedresults from materialized views and whenever possible reads only changes fromthe base tables to compute up-to-date results.

Materialized views are precomputed in the background when the base tables change.Any incremental data changes from the base tables are automatically added to thematerialized views, with no user action required.

Transform data with continuous queries

Preview

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of theService Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see thelaunch stage descriptions.

Continuous queriesare SQL statements that run continuously. Continuous queries let you analyzeincoming data in BigQuery in real time. You can insert the outputrows produced by a continuous query into a BigQuery table orexport them to Pub/Sub or Bigtable.

Transform data with Dataform

Dataform lets you manage data transformation in the extract,load, and transform (ELT) process for data integration. After extractingraw data from source systems and loading it into BigQuery, you canuse Dataform to transform it into an organized, tested, and documentedsuite of tables. While in DML you take an imperative approach by telling BigQueryhow exactly to transform your data, in Dataform you writedeclarative statements where Dataform then determines thetransformation needed to achieve that state.

In Dataform, you can develop, test, and version controlSQL workflows for data transformationfrom data source declarations to output tables, views, or materialized views.You can develop SQL workflows with Dataform core or pure JavaScript.Dataform core is an open sourcemeta-language that extends SQL with SQLX and JavaScript. You can useDataform core to manage dependencies, set up automated dataquality testing, and document table or column descriptions within the code.

Dataform stores your SQL workflow code inrepositoriesand uses Git to track file changes. Development workspaces in Dataformlet you work on the contents of the repository without affecting the work ofothers who are working in the same repository. You can connect Dataformrepositories to third-party Git providers, including Azure DevOps Services,Bitbucket, GitHub, and GitLab.

You can run or schedule SQL workflowswith Dataform release configurations and workflow configurations.Alternatively, you can schedule executions either with Cloud Composer, orwith Workflows and Cloud Scheduler. During execution,Dataform executes SQL queries in BigQuery inorder of object dependencies in your SQL workflow. After execution, you can useyour defined tables and views for analysis in BigQuery.

To learn more about creating data transformation SQL workflows inDataform, seeDataform overview andDataform features.

Transform data with BigQuery pipelines

BigQuery pipelines are powered by Dataformand let you create and manage data transformationin extract, load, transform (ELT) or extract, transform, load (ETL) processes.

You can create and manage BigQuery pipelines in a visual wayin BigQuery Studio.

To learn more about creating BigQuery pipelines,seeCreate pipelines.

Prepare data in BigQuery

To reduce the toil of data preparation, BigQuery lets you cleandata with Gemini-generated transformation suggestions. Data preparationin BigQuery offers the following assistance:

  • Applying transformations and data quality rules
  • Standardizing and enriching data
  • Automating schema mapping

You can validate the results in a preview of your data before executing thechanges on all your data.

For more information, seeIntroduction to BigQuery data preparation.

What's next

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.