Create pipelines

This document describes how to create pipelines in BigQuery. Pipelines are powered by Dataform.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the BigQuery, Dataform, and Vertex AI APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the APIs

Required roles for pipelines

To get the permissions that you need to create pipelines, ask your administrator to grant you the following IAM roles on the project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

For more information about Dataform IAM, see Control access with IAM.

Note: When you create a pipeline, BigQuery grants you the Dataform Admin role (roles/dataform.admin) on that pipeline. All users with the Dataform Admin role granted on the Google Cloud project have owner access to all the pipelines created in the project. To override this behavior, see Grant a specific role upon resource creation.

Required roles for notebook options

To get the permissions that you need to select a runtime template in notebook options, ask your administrator to grant you the Notebook Runtime User (roles/aiplatform.notebookRuntimeUser) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

If you don't have this role, you can select the default notebook runtime specification.

Set the default region for code assets

If this is the first time you are creating a code asset, you should set the default region for code assets. You can't change the region for a code asset after it is created.

Note: If you create a pipeline and choose a different default region than the one you have been using for code assets (for example, choosing us-west1 when you have been using us-central1), then that pipeline and all code assets you create afterwards use that new region by default. Existing code assets continue to use the region they were assigned when they were created.

All code assets in BigQuery Studio use the same default region. To set the default region for code assets, follow these steps:

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, find the project in which you have enabled code assets.

  3. Click View actions next to the project, and then click Change my default code region.

  4. For Region, select the region that you want to use for code assets.

  5. Click Select.

For a list of supported regions, see BigQuery Studio locations.

Create a pipeline

To create a pipeline, follow these steps:

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the tab bar of the editor pane, click the arrow next to the + sign, and then click Pipeline.

  3. Optional: To rename the pipeline, click the pipeline name, and then type a new name.

  4. Click Get started, and then go to the Settings tab.

  5. In the Authentication section, choose to authorize the pipeline with your Google Account user credentials or a service account.

    • To use your Google Account user credentials (Preview), select Execute with my user credentials.
    • To use a service account, select Execute with selected service account, and then select a service account.
  6. In the Processing location section, select a processing location for the pipeline.

    • To enable the automatic selection of a location, select Automatic location selection. This option selects a location based on the datasets referenced in the request. The selection process is as follows:

      • If your query references datasets from the same location, BigQuery uses that location.
      • If your query references datasets from two or more different locations, an error occurs. For details about this limitation, see Cross-region dataset replication.
      • If your query doesn't reference any datasets, BigQuery defaults to the US multi-region.
    • To pick a specific region, select Region, then choose a region in the Region menu. Alternatively, you can use the @@location system variable in your query, as shown in the sketch after this list. For more information, see Specify locations.

    • To pick a multi-region, select Multi-region, then choose a multi-region in the Multi-region menu.

    The pipeline processing location doesn't need to match your default storage location for code assets.
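
    To pin the location from inside the query itself, you can set the @@location system variable at the start of a multi-statement query. A minimal sketch follows; the region, dataset, and table names are placeholders:

        -- Pin the processing location for this multi-statement query
        -- instead of relying on automatic location selection.
        SET @@location = 'us-west1';

        -- Placeholder query; replace with your own dataset and table.
        SELECT COUNT(*) AS row_count
        FROM mydataset.mytable;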

SQLX options

To configure the SQLX settings for your pipeline, do the following in the SQLX options section:

  1. In the Default project field, enter the name of an existing Google Cloud project. This value is used for defaultProject in the workflow_settings.yaml file and for defaultDatabase in the dataform.json file (see the sketch after this list). The default project is used by pipeline tasks during their execution.

    Note: The project name isn't validated, so it's possible to enter any non-empty string. However, if the project doesn't exist, the pipeline execution fails.
  2. Optional: In the Default dataset field, search for and select an existing dataset. The list of available datasets is filtered based on the selected project and processing location. This value is used for defaultDataset in the workflow_settings.yaml file. The default dataset is used by pipeline tasks during their execution.

    Note: Setting the default dataset and then changing the pipeline's region invalidates the dataset selection. Changing the project can also invalidate the dataset selection. If a given dataset doesn't exist in the selected project, it is created.
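
To illustrate how these fields map to the configuration file, here is a minimal sketch of a workflow_settings.yaml; the values are placeholders, and defaultLocation is shown as a typical companion key:

    # workflow_settings.yaml (placeholder values)
    defaultProject: my-project      # from the Default project field
    defaultDataset: my_dataset      # from the Default dataset field
    defaultLocation: us-central1    # BigQuery region for the pipeline

In older projects that use dataform.json instead, the equivalent keys are defaultDatabase and defaultSchema.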

Notebook options

To add a notebook to your pipeline, do the following in the Notebook options section:

  1. In the Runtime template field, either accept the default notebook runtime, or search for and select an existing runtime.

    Note: A notebook runtime template must be located in the same region as the pipeline that specifies it.

    Note: When you include a notebook in a BigQuery pipeline, you can't change the network of the Vertex AI runtime instance. The runtime is restricted to the default network, and selecting a different network isn't supported.
  2. In the Cloud Storage bucket field, click Browse and select or create a Cloud Storage bucket for storing the output of notebooks in your pipeline.

  3. Follow the steps in Add a principal to a bucket-level policy to add your custom Dataform service account as a principal to the Cloud Storage bucket that you plan to use for storing output of scheduled pipeline runs, and grant the Storage Admin role (roles/storage.admin) to this principal.

    The selected custom Dataform service account must be granted the Storage Admin IAM role on the selected bucket.

Add a pipeline task

To add a task to a pipeline, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

    If you don't see the left pane, click Expand left pane to open the pane.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. To add a code asset, select one of the following options:

    SQL query

    1. Click Add task, and then select Query. You can either create a new query or import an existing one.

    2. Optional: In the Query task details pane, in the Run after menu, select a task to precede your query.

    Create a new query

    1. Click the arrow menu next to Edit Query and select either In context or In new tab.

    2. Search for an existing query.

    3. Select a query name and then press Enter.

    4. Click Save.

    5. Optional: To rename the query, click the query name on the pipeline pane, click Edit Query, click the existing query name at the top of the screen, and then type a new name.

    Import an existing query

    1. Click the arrow menu next to Edit Query and click Import a copy.

    2. Search for an existing query to import or select an existing query from the search pane. When you import a query, the original remains unchanged because the query's source file is copied into the pipeline.

    3. Click Edit to open the imported query.

    4. Click Save.

    Notebook

    1. Click Add task, and then select Notebook. You can either create a new notebook or import an existing one. To change settings for notebook runtime templates, see Notebook options.

    2. Optional: In the Notebook task details pane, in the Run after menu, select a task to precede your notebook.

    Create a new notebook

    1. Click the arrow menu next to Edit Notebook and select either In context or In new tab.

    2. Search for an existing notebook.

    3. Select a notebook name and then press Enter.

    4. Click Save.

    5. Optional: To rename the notebook, click the notebook name on the pipeline pane, click Edit Notebook, click the existing notebook name at the top of the screen, and then type a new name.

    Import an existing notebook

    1. Click the arrow menu next to Edit Notebook and click Import a copy.

    2. Search for an existing notebook to import or select an existing notebook from the search pane. When you import a notebook, the original remains unchanged because the notebook's source file is copied into the pipeline.

    3. To open the imported notebook, click Edit.

    4. Click Save.

    Data preparation

    1. Click Add task, and then select Data preparation. You can either create a new data preparation or import an existing one.

    2. Optional: In the Data preparation task details pane, in the Run after menu, select a task to precede your data preparation.

    Create a new data preparation

    1. Click the arrow menu next to Edit Data preparation and select either In context or In new tab.

    2. Search for an existing data preparation.

    3. Select a data preparation name and then press Enter.

    4. Click Save.

    5. Optional: To rename the data preparation, click the data preparation name on the pipeline pane, click Edit Data preparation, click the name at the top of the screen, and then enter a new name.

    Import an existing data preparation

    1. Click the arrow menu next to Edit Data preparation and click Import a copy.

    2. Search for an existing data preparation to import or select an existing data preparation from the search pane. When you import a data preparation, the original remains unchanged because the data preparation's source file is copied into the pipeline.

    3. To open the imported data preparation, click Edit.

    4. Click Save.

    Table

    Preview

    This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

    Note: To provide feedback or request support, contact dataform-preview-support@google.com.
    1. Click Add task, and then select Table.

    2. In the Create new pane, select Table or Incremental table.

    3. Verify the default project for the table, or select a new project.

    4. Verify the default dataset for the table, or select a new dataset.

    5. Enter a name for the table.

    6. In the Table task details pane, click Open to open the task.

    7. Configure the task using the settings in Details > Configuration or in the config block of the code editor for the table.

      For metadata changes, use the Configuration tab. This tab lets you edit a specific value in the config block from the code editor, such as a string or an array, that is formatted like a JavaScript object. Using this tab helps you avoid syntax errors and verify that your settings are correct.

      Optional: In the Run after menu, select a task to precede your table.

      You can also define the metadata for your pipeline task in the config block in the editor, as shown in the sketch after these steps. For more information, see Creating tables.

      The editor validates your code and displays the validation status.

      Note: When you use JavaScript functions as values in the config block, you can't edit the JavaScript functions on the Configuration tab.
    8. In Details > Compiled queries, view the SQL compiled from the SQLX code.

    9. Click Run to run the SQL in your pipeline.

    10. In Query results, inspect the data preview.
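
    To illustrate the config block, here is a minimal sketch of an incremental table definition in SQLX; the dataset, table, and column names are placeholders, and ref, self, when, and incremental are Dataform core functions:

        config {
          type: "incremental",    // or "table" to rebuild the table fully on each run
          schema: "my_dataset",   // placeholder dataset
          description: "Daily rollup of raw events."
        }

        SELECT
          event_date,
          COUNT(*) AS event_count
        FROM
          ${ref("raw_events")}    -- placeholder upstream table
        ${when(incremental(),
          `WHERE event_date > (SELECT MAX(event_date) FROM ${self()})`)}
        GROUP BY event_date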

    View

    Preview

    This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

    Note: To provide feedback or request support, contact dataform-preview-support@google.com.
    1. Click Add task, and then select View.

    2. In the Create new pane, select View or Materialized view.

    3. Verify the default project for the view, or select a new project.

    4. Verify the default dataset for the view, or select a new dataset.

    5. Enter a name for the view.

    6. In the View task details pane, click Open to open the task.

    7. Configure the task using the settings in Details > Configuration or in the config block of the code editor for the view.

      For metadata changes, use the Configuration tab. This tab lets you edit a specific value in the config block from the code editor, such as a string or an array, that is formatted like a JavaScript object. Using this tab helps you avoid syntax errors and verify that your settings are correct.

      Optional: In the Run after menu, select a task to precede your view.

      You can also define the metadata for your pipeline task in the config block in the editor, as shown in the sketch after these steps. For more information, see Creating a view with Dataform core.

      The editor validates your code and displays the validation status.

      Note: When you use JavaScript functions as values in the config block, you can't edit the JavaScript functions on the Configuration tab.
    8. In Details > Compiled queries, view the SQL compiled from the SQLX code.

    9. Click Run to run the SQL in your pipeline.

    10. In Query results, inspect the data preview.
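
    Similarly, here is a minimal sketch of a view definition in SQLX; the dataset, table, and column names are placeholders, and setting materialized to true produces a materialized view instead:

        config {
          type: "view",
          schema: "my_dataset",    // placeholder dataset
          materialized: false,     // set to true for a materialized view
          description: "Users active in the last 30 days."
        }

        SELECT
          user_id,
          last_seen_date
        FROM
          ${ref("daily_events")}   -- placeholder upstream table
        WHERE
          last_seen_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)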

Edit a pipeline task

To edit a pipeline task, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click the selected task.

  5. To change the preceding task, in the Run after menu, select a task that will precede your task.

  6. To edit the contents of the selected task, click Edit.

  7. In the new tab that opens, edit the task contents, and then save changes to the task.

Delete a pipeline task

To delete a task from a pipeline, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click the selected task.

  5. In the Task details pane, click the Delete icon.

Share a pipeline

Important: If you enhance security by setting the enable_private_workspace field (Preview) to true in the projects.locations.updateConfig Dataform API method, only the pipeline creator can read and write code in that pipeline. For more information, see Enable private workspaces.

Note: You can share a pipeline but not a task within the pipeline.

To share a pipeline, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click Share, and then select Manage permissions.

  5. Click Add user/group.

  6. In the New principals field, enter the name of at least one user or group.

  7. For Assign Roles, select a role.

  8. Click Save.

Share a link to a pipeline

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click Share, and then select Share link. The URL for your pipeline is copied to your computer's clipboard.

Run a pipeline

To manually run the current version of a pipeline, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click Run. If you selected Execute with my user credentials for your authentication, you must authorize your Google Account (Preview).

  5. Optional: To inspect the run, view past manual runs.

Authorize your Google Account

Preview

This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Note: To request support or provide feedback for this feature, contact dataform-preview-support@google.com.

To authenticate the resource with your Google Account user credentials, you must manually grant permission for BigQuery pipelines to get the access token for your Google Account and access the source data on your behalf. You can grant manual approval with the OAuth dialog interface.

You only need to give permission to BigQuery pipelines once.

To revoke the permission that you granted, follow these steps:

  1. Go to your Google Account page.
  2. Click BigQuery Pipelines.
  3. Click Remove access.
Warning: Revoking access permissions prevents any future pipeline runs that this Google Account owns across all regions.

If your pipeline contains a notebook, you must also manually grant permission for Colab Enterprise to get the access token for your Google Account and access the source data on your behalf. You only need to give permission once. You can revoke this permission on the Google Account page.

