Create pipelines

This document describes how to create pipelines in BigQuery. Pipelines are powered by Dataform.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the BigQuery, Dataform, and Vertex AI APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the APIs

Required roles for pipelines

To get the permissions that you need to create pipelines, ask your administrator to grant you the following IAM roles on the project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

For more information about Dataform IAM, see Control access with IAM.

Note: When you create a pipeline, BigQuery grants you the Dataform Admin role (roles/dataform.admin) on that pipeline. All users with the Dataform Admin role granted on the Google Cloud project have owner access to all the pipelines created in the project. To override this behavior, see Grant a specific role upon resource creation.

Required roles for notebook options

To get the permissions that you need to select a runtime template in notebook options, ask your administrator to grant you the Notebook Runtime User (roles/aiplatform.notebookRuntimeUser) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

If you don't have this role, you can select the default notebook runtime specification.

Set the default region for code assets

If this is the first time you are creating a code asset, you should set the default region for code assets. You can't change the region for a code asset after it is created.

Note: If you create a pipeline and choose a different default region than the one you have been using for code assets (for example, choosing us-west1 when you have been using us-central1), then that pipeline and all code assets you create afterwards use that new region by default. Existing code assets continue to use the region they were assigned when they were created.

All code assets in BigQuery Studio use the same default region. To set the default region for code assets, follow these steps:

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, find the project in which you have enabled code assets.

  3. Click View actions next to the project, and then click Change my default code region.

  4. For Region, select the region that you want to use for code assets.

  5. Click Select.

For a list of supported regions, see BigQuery Studio locations.

Create a pipeline

To create a pipeline, follow these steps:

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the tab bar of the editor pane, click the arrow next to the + sign, and then click Pipeline.

  3. Optional: To rename the pipeline, click the pipeline name, and then type a new name.

  4. Click Get started, and then go to the Settings tab.

  5. In the Authentication section, choose to authorize the pipeline with your Google Account user credentials or a service account.

    • To use your Google Account user credentials (Preview), select Execute with my user credentials.
    • To use a service account, select Execute with selected service account, and then select a service account.
  6. In the Processing location section, select a processing location for the pipeline.

    • To enable the automatic selection of a location, select Automatic location selection. This option selects a location based on the datasets referenced in the request. The selection process is as follows:

      • If your query references datasets from the same location, BigQuery uses that location.
      • If your query references datasets from two or more different locations, an error occurs. For details about this limitation, see Cross-region dataset replication.
      • If your query doesn't reference any datasets, BigQuery defaults to the US multi-region.
    • To pick a specific region, select Region, then choose a region in the Region menu. Alternatively, you can use the @@location system variable in your query, as shown in the sketch after this list. For more information, see Specify locations.

    • To pick a multi-region, select Multi-region, then choose a multi-region in the Multi-region menu.

    The pipeline processing location doesn't need to match your default storage location for code assets.
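
    To pin the location from inside the query itself, you can set the @@location system variable at the start of a multi-statement query. A minimal sketch follows; the region, dataset, and table names are placeholders:

        -- Pin the processing location for this multi-statement query
        -- instead of relying on automatic location selection.
        SET @@location = 'us-west1';

        -- Placeholder query; replace with your own dataset and table.
        SELECT COUNT(*) AS row_count
        FROM mydataset.mytable;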

SQLX options

To configure the SQLX settings for your pipeline, do the following in the SQLX options section:

  1. In the Default project field, enter the name of an existing Google Cloud project. This value is used for defaultProject in the workflow_settings.yaml file and for defaultDatabase in the dataform.json file (see the sketch after this list). The default project is used by pipeline tasks during their execution.

    Note: The project name isn't validated, so it's possible to enter any non-empty string. However, if the project doesn't exist, the pipeline execution fails.
  2. Optional: In the Default dataset field, search for and select an existing dataset. The list of available datasets is filtered based on the selected project and processing location. This value is used for defaultDataset in the workflow_settings.yaml file. The default dataset is used by pipeline tasks during their execution.

    Note: Setting the default dataset and then changing the pipeline's region invalidates the dataset selection. Changing the project can also invalidate the dataset selection. If a given dataset doesn't exist in the selected project, it is created.
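
To illustrate how these fields map to the configuration file, here is a minimal sketch of a workflow_settings.yaml; the values are placeholders, and defaultLocation is shown as a typical companion key:

    # workflow_settings.yaml (placeholder values)
    defaultProject: my-project      # from the Default project field
    defaultDataset: my_dataset      # from the Default dataset field
    defaultLocation: us-central1    # BigQuery region for the pipeline

In older projects that use dataform.json instead, the equivalent keys are defaultDatabase and defaultSchema.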

Notebook options

To add a notebook to your pipeline, do the following in the Notebook options section:

  1. In the Runtime template field, either accept the default notebook runtime, or search for and select an existing runtime.

    Note: A notebook runtime template must be located in the same region as the pipeline that specifies it.

    Note: When you include a notebook in a BigQuery pipeline, you can't change the network of the Vertex AI runtime instance. The runtime is restricted to the default network, and selecting a different network isn't supported.
  2. In the Cloud Storage bucket field, click Browse and select or create a Cloud Storage bucket for storing the output of notebooks in your pipeline.

  3. Follow the steps in Add a principal to a bucket-level policy to add your custom Dataform service account as a principal to the Cloud Storage bucket that you plan to use for storing output of scheduled pipeline runs, and grant the Storage Admin role (roles/storage.admin) to this principal.

    The selected custom Dataform service account must be granted the Storage Admin IAM role on the selected bucket.

Add a pipeline task

To add a task to a pipeline, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

    If you don't see the left pane, click Expand left pane to open the pane.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. To add a code asset, select one of the following options:

    SQL query

    1. Click Add task, and then select Query. You can either create a new query or import an existing one.

    2. Optional: In the Query task details pane, in the Run after menu, select a task to precede your query.

    Create a new query

    1. Click the arrow menu next to Edit Query and select either In context or In new tab.

    2. Search for an existing query.

    3. Select a query name and then press Enter.

    4. Click Save.

    5. Optional: To rename the query, click the query name on the pipeline pane, click Edit Query, click the existing query name at the top of the screen, and then type a new name.

    Import an existing query

    1. Click the arrow menu next to Edit Query and click Import a copy.

    2. Search for an existing query to import or select an existing query from the search pane. When you import a query, the original remains unchanged because the query's source file is copied into the pipeline.

    3. Click Edit to open the imported query.

    4. Click Save.

    Notebook

    1. Click Add task, and then select Notebook. You can either create a new notebook or import an existing one. To change settings for notebook runtime templates, see Notebook options.

    2. Optional: In the Notebook task details pane, in the Run after menu, select a task to precede your notebook.

    Create a new notebook

    1. Click the arrow menu next to Edit Notebook and select either In context or In new tab.

    2. Search for an existing notebook.

    3. Select a notebook name and then press Enter.

    4. Click Save.

    5. Optional: To rename the notebook, click the notebook name on the pipeline pane, click Edit Notebook, click the existing notebook name at the top of the screen, and then type a new name.

    Import an existing notebook

    1. Click the arrow menu next to Edit Notebook and click Import a copy.

    2. Search for an existing notebook to import or select an existing notebook from the search pane. When you import a notebook, the original remains unchanged because the notebook's source file is copied into the pipeline.

    3. To open the imported notebook, click Edit.

    4. Click Save.

    Data preparation

    1. Click Add task, and then select Data preparation. You can either create a new data preparation or import an existing one.

    2. Optional: In the Data preparation task details pane, in the Run after menu, select a task to precede your data preparation.

    Create a new data preparation

    1. Click the arrow menu next to Edit Data preparation and select either In context or In new tab.

    2. Search for an existing data preparation.

    3. Select a data preparation name and then press Enter.

    4. Click Save.

    5. Optional: To rename the data preparation, click the data preparation name on the pipeline pane, click Edit Data preparation, click the name at the top of the screen, and then enter a new name.

    Import an existing data preparation

    1. Click the arrow menu next to Edit Data preparation and click Import a copy.

    2. Search for an existing data preparation to import or select an existing data preparation from the search pane. When you import a data preparation, the original remains unchanged because the data preparation's source file is copied into the pipeline.

    3. To open the imported data preparation, click Edit.

    4. Click Save.

    Table

    Preview

    This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

    Note: To provide feedback or request support, contact dataform-preview-support@google.com.
    1. Click Add task, and then select Table.

    2. In the Create new pane, select Table or Incremental table.

    3. Verify the default project for the table, or select a new project.

    4. Verify the default dataset for the table, or select a new dataset.

    5. Enter a name for the table.

    6. In the Table task details pane, click Open to open the task.

    7. Configure the task using the settings in Details > Configuration or in the config block of the code editor for the table.

      For metadata changes, use the Configuration tab. This tab lets you edit a specific value in the config block from the code editor, such as a string or an array, that is formatted like a JavaScript object. Using this tab helps you avoid syntax errors and verify that your settings are correct.

      Optional: In the Run after menu, select a task to precede your table.

      You can also define the metadata for your pipeline task in the config block in the editor, as shown in the sketch after these steps. For more information, see Creating tables.

      The editor validates your code and displays the validation status.

      Note: When you use JavaScript functions as values in the config block, you can't edit the JavaScript functions on the Configuration tab.
    8. In Details > Compiled queries, view the SQL compiled from the SQLX code.

    9. Click Run to run the SQL in your pipeline.

    10. In Query results, inspect the data preview.
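
    To illustrate the config block, here is a minimal sketch of an incremental table definition in SQLX; the dataset, table, and column names are placeholders, and ref, self, when, and incremental are Dataform core functions:

        config {
          type: "incremental",    // or "table" to rebuild the table fully on each run
          schema: "my_dataset",   // placeholder dataset
          description: "Daily rollup of raw events."
        }

        SELECT
          event_date,
          COUNT(*) AS event_count
        FROM
          ${ref("raw_events")}    -- placeholder upstream table
        ${when(incremental(),
          `WHERE event_date > (SELECT MAX(event_date) FROM ${self()})`)}
        GROUP BY event_date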

    View

    Preview

    This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

    Note: To provide feedback or request support, contact dataform-preview-support@google.com.
    1. Click Add task, and then select View.

    2. In the Create new pane, select View or Materialized view.

    3. Verify the default project for the view, or select a new project.

    4. Verify the default dataset for the view, or select a new dataset.

    5. Enter a name for the view.

    6. In the View task details pane, click Open to open the task.

    7. Configure the task using the settings in Details > Configuration or in the config block of the code editor for the view.

      For metadata changes, use the Configuration tab. This tab lets you edit a specific value in the config block from the code editor, such as a string or an array, that is formatted like a JavaScript object. Using this tab helps you avoid syntax errors and verify that your settings are correct.

      Optional: In the Run after menu, select a task to precede your view.

      You can also define the metadata for your pipeline task in the config block in the editor, as shown in the sketch after these steps. For more information, see Creating a view with Dataform core.

      The editor validates your code and displays the validation status.

      Note: When you use JavaScript functions as values in the config block, you can't edit the JavaScript functions on the Configuration tab.
    8. In Details > Compiled queries, view the SQL compiled from the SQLX code.

    9. Click Run to run the SQL in your pipeline.

    10. In Query results, inspect the data preview.
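
    Similarly, here is a minimal sketch of a view definition in SQLX; the dataset, table, and column names are placeholders, and setting materialized to true produces a materialized view instead:

        config {
          type: "view",
          schema: "my_dataset",    // placeholder dataset
          materialized: false,     // set to true for a materialized view
          description: "Users active in the last 30 days."
        }

        SELECT
          user_id,
          last_seen_date
        FROM
          ${ref("daily_events")}   -- placeholder upstream table
        WHERE
          last_seen_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)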

Edit a pipeline task

To edit a pipeline task, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click the selected task.

  5. To change the preceding task, in the Run after menu, select a task that will precede your task.

  6. To edit the contents of the selected task, click Edit.

  7. In the new tab that opens, edit the task contents, and then save changes to the task.

Delete a pipeline task

To delete a task from a pipeline, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click the selected task.

  5. In the Task details pane, click the Delete icon.

Share a pipeline

Important: If you enhance security by setting the enable_private_workspace field (Preview) to true in the projects.locations.updateConfig Dataform API method, only the pipeline creator can read and write code in that pipeline. For more information, see Enable private workspaces.

Note: You can share a pipeline but not a task within the pipeline.

To share a pipeline, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click Share, and then select Manage permissions.

  5. Click Add user/group.

  6. In the New principals field, enter the name of at least one user or group.

  7. For Assign Roles, select a role.

  8. Click Save.

Share a link to a pipeline

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click Share, and then select Share link. The URL for your pipeline is copied to your computer's clipboard.

Run a pipeline

To manually run the current version of a pipeline, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the left pane, click Explorer.

  3. In the Explorer pane, expand your project, click Pipelines, and then select a pipeline.

  4. Click Run. If you selected Execute with my user credentials for your authentication, you must authorize your Google Account (Preview).

  5. Optional: To inspect the run, view past manual runs.

Authorize your Google Account

Preview

This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Note: To request support or provide feedback for this feature, contact dataform-preview-support@google.com.

To authenticate the resource with your Google Account user credentials, you must manually grant permission for BigQuery pipelines to get the access token for your Google Account and access the source data on your behalf. You can grant manual approval with the OAuth dialog interface.

You only need to give permission to BigQuery pipelines once.

To revoke the permission that you granted, follow these steps:

  1. Go to your Google Account page.
  2. Click BigQuery Pipelines.
  3. Click Remove access.
Warning: Revoking access permissions prevents any future pipeline runs that this Google Account owns across all regions.

If your pipeline contains a notebook, you must also manually grant permission for Colab Enterprise to get the access token for your Google Account and access the source data on your behalf. You only need to give permission once. You can revoke this permission on the Google Account page.

