Connect to a third-party Git repository

This document shows you how to connect a remote repository to a Dataform repository. After you connect the repositories, the changes you make in a Dataform development workspace can be pushed to and pulled from the remote Git repository.

Important: Dataform can't connect directly to a private IP address of an on-premises Git host.

You can connect a remote repository through HTTPS or SSH.

The following table lists supported Git providers and connection methods that are available for their repositories:

Git provider            Connection method
Azure DevOps Services   SSH
Bitbucket               SSH
GitHub                  SSH or HTTPS
GitLab                  SSH or HTTPS

Important: To connect your remote Git repository to Dataform, verify that your Git host has a public IP address. If your Git host is configured with only a private IP address (for example, 10.x.x.x, 172.16.x.x-172.31.x.x, or 192.168.x.x), Dataform is unable to establish a connection. Connecting a remote Git repository to a Dataform repository can fail if the remote repository isn't open to the public internet, for example, if it's behind a firewall. In this case, add the required Dataform egress IP address ranges to your firewall rules to enable connections to protected remote repositories.
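
As a quick pre-check, the private ranges named above can be tested in the shell. The following is a minimal sketch, not part of Dataform: the function name `is_private_ipv4` is hypothetical, it assumes you have already resolved your Git host to a dotted-quad IPv4 address, and it does not validate that the input is a well-formed address.

```shell
# Return 0 if the given IPv4 address falls in one of the private ranges
# named above: 10.x.x.x, 172.16.x.x-172.31.x.x, or 192.168.x.x.
is_private_ipv4() {
  local ip="$1" second
  case "$ip" in
    10.*|192.168.*) return 0 ;;
    172.*)
      second="${ip#172.}"      # drop the leading "172."
      second="${second%%.*}"   # keep only the second octet
      [ "$second" -ge 16 ] 2>/dev/null && [ "$second" -le 31 ]
      return
      ;;
  esac
  return 1
}

# Classify a few illustrative addresses.
for ip in 10.1.2.3 172.20.0.5 172.32.0.1 192.168.1.10 203.0.113.7; do
  if is_private_ipv4 "$ip"; then
    echo "$ip: private range, not reachable by Dataform"
  else
    echo "$ip: not in a private range"
  fi
done
```

An address outside these ranges can still be unreachable, for example behind a firewall, so treat a negative result only as a first filter.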

Before you begin

  1. If your organization or project restricts remote Git repositories with the dataform.restrictGitRemotes Organization Policy, ensure that the remote Git repository is added to the allowlist in the policy before you create a Dataform repository that you want to connect to a remote repository. For more information, see Restrict remote repositories.
  2. Select or create a Dataform repository. You need it later to share a secret with your default Dataform service agent.

Required roles

To get the permissions that you need to connect a Dataform repository to a remote Git repository, ask your administrator to grant you the Dataform Admin (roles/dataform.admin) IAM role on repositories. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Connect a remote repository through SSH

To connect a remote repository through SSH, you need to generate an SSH key and a Secret Manager secret. The SSH key consists of a public SSH key and a private SSH key. You need to share the public SSH key with your Git provider, and create a Secret Manager secret with the private SSH key. Then, share the secret with your default Dataform service agent.

Dataform uses the secret with the private SSH key to sign in to your Git provider to commit changes on behalf of the developers. Dataform makes these commits using the developer's Google Cloud email address so you can tell who made each commit.

Warning: The private SSH key is shared among all Dataform users who use the corresponding service agent or service account. We recommend that you create a machine user with your Git provider and limit its access to the remote Git repositories you plan to use with Dataform. Only Google Cloud project owners and Dataform users with the Dataform Admin role can use the key to connect repositories. Dataform users are not able to see the key itself.

To connect a remote repository to a Dataform repository through SSH, follow these steps:
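
The key and secret setup described above can be sketched with `ssh-keygen` and the `gcloud secrets` commands. This is a minimal sketch under stated assumptions, not the only supported workflow: the helper names (`run`, `dataform_service_agent`, `setup_ssh_secret`), the project number, the secret name, and the key path are all hypothetical placeholders, and the RSA key type with an empty passphrase is an assumption that you should check against your Git provider's requirements. The script is a dry run by default and only prints the commands.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them (requires ssh-keygen and an authenticated gcloud).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Build the default Dataform service agent address from a project number.
dataform_service_agent() {
  printf 'service-%s@gcp-sa-dataform.iam.gserviceaccount.com\n' "$1"
}

# Generate a key pair, store the private key as a secret, and let the
# service agent read it. All three arguments are placeholders.
setup_ssh_secret() {
  local project_number="$1" secret_name="$2" key_path="$3"
  local agent
  agent="$(dataform_service_agent "$project_number")"

  # 1. Key pair: key_path holds the private key, key_path.pub the public key
  #    that you upload to your Git provider.
  run ssh-keygen -t rsa -b 4096 -N "" -C "dataform-machine-user" -f "$key_path"

  # 2. Secret whose value is the private key file.
  run gcloud secrets create "$secret_name" \
    --replication-policy=automatic --data-file="$key_path"

  # 3. Grant the Secret Manager Secret Accessor role to the service agent.
  run gcloud secrets add-iam-policy-binding "$secret_name" \
    --member="serviceAccount:${agent}" \
    --role="roles/secretmanager.secretAccessor"
}

setup_ssh_secret 123456789012 dataform-ssh-key ./dataform_key
```

The `gcloud` commands act on the currently configured project, so the project number passed to `setup_ssh_secret` is assumed to belong to that same project.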

  1. In your Git provider, do the following:

    Azure DevOps Services

    1. In Azure DevOps Services, create a private SSH key.
    2. Upload the public SSH key to your Azure DevOps Services repository.

    Bitbucket

    1. In Bitbucket, create a private SSH key.
    2. Upload the public SSH key to your Bitbucket repository.

    GitHub

    1. In GitHub, create a private SSH key.
    2. Upload the GitHub public SSH key to your GitHub repository.

    GitLab

    1. In GitLab, create a private SSH key.
    2. Upload the GitLab public SSH key to your GitLab repository.

  2. In Secret Manager, create a secret and set your private SSH key as the secret value.

    1. Grant access to the secret to your default Dataform service agent.

      Your default Dataform service agent is in the following format:

      service-PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com
    2. Grant the roles/secretmanager.secretAccessor role to the service agent or service account.

  3. In the Google Cloud console, go to the Dataform page.

  4. Select the Dataform repository that you want to connect to the remote repository.

  5. On the repository page, click Settings > Connect with Git.

  6. In the Link to remote repository pane, in the Remote Git repository URL field, enter the URL of the remote Git repository, ending with .git.

    The URL of the remote Git repository must be in one of the following formats:

    • Absolute URL: ssh://git@{host_name}[:{port}]/{repository_path}, where the port is optional.
    • SCP-like URL: git@{host_name}:{repository_path}.

  7. In the Default remote branch name field, enter the name of the main development branch of the remote Git repository.

  8. In the Secret drop-down, select your secret for the remote Git repository.

  9. In the SSH public host key value field, enter the public host key of your Git provider.

    Azure DevOps Services

    1. To retrieve the Azure DevOps Services public host key, run the following command in the terminal:

      ssh-keyscan -t rsa ssh.dev.azure.com
    2. Copy one of the returned keys, omitting ssh.dev.azure.com from the beginning of the line. The value that you copy must be in the following format:

      ALGORITHM BASE64_KEY_VALUE

      For example:

      ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7Hr1oTWqNqOlzGJOfGJ4NakVyIzf1rXYd4d7wo6jBlkLvCA4odBlL0mDUyZ0/QUfTTqeu+tm22gOsv+VrVTMk6vwRU75gY/y9ut5Mb3bR5BV58dKXyq9A9UeB5Cakehn5Zgm6x1mKoVyf+FFn26iYqXJRgzIZZcZ5V6hrE0Qg39kZm4az48o0AUbf6Sp4SLdvnuMa2sVNwHBboS7EJkm57XQPVU3/QpyNLHbWDdzwtrlS+ez30S3AdYhLKEOxAG8weOnyrtLJAUen9mTkol8oII1edf7mWWbWVf0nBmly21+nZcmCTISQBtdcyPaEno7fFQMDD26/s0lfKob4Kw8H

      Verify this key is still up-to-date with Azure DevOps Services.

    Bitbucket

    1. To retrieve the Bitbucket public host key, run the following command in the terminal:

      curl https://bitbucket.org/site/ssh
    2. The command returns a list of public host keys. Choose one of the keys from the list, and copy it, omitting bitbucket.org from the beginning of the line. The value that you copy must be in the following format:

      ALGORITHM BASE64_KEY_VALUE

      For example:

      ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIazEu89wgQZ4bqs3d63QSMzYVa0MuJ2e2gKTKqu+UUO

      Verify this key is still up-to-date with Bitbucket.

    GitHub

    1. To retrieve the GitHub public host key, see GitHub's SSH key fingerprints.
    2. The page contains a list of public host keys. Choose one of them, and copy it, omitting github.com from the beginning of the line. The value that you copy must be in the following format:

      ALGORITHM BASE64_KEY_VALUE

      For example:

      ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOMqqnkVzrm0SdG6UOoqKLsabgH5C9okWi0dh2l9GKJl

      Verify this key is still up-to-date with GitHub.

    GitLab

    1. To retrieve the GitLab public host key, see SSH known_hosts entries.
    2. The page contains a list of public host keys. Choose one of them, and copy it, omitting gitlab.com from the beginning of the line. The value that you copy must be in the following format:

      ALGORITHM BASE64_KEY_VALUE

      For example:

      ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAfuCHKVTjquxvt6CM6tdG4SLp1Btn/nOeHHE5UOzRdf

      Verify this key is still up-to-date with GitLab.

  10. Click Link.
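
Before you paste values into the console, the URL formats from step 6 and the host key format from step 9 can be checked locally. The following is a minimal sketch: the function names `is_valid_ssh_remote` and `strip_host_field` are hypothetical, the regular expressions are an approximation of the two listed URL formats rather than the exact validation that Dataform applies, and the sample key value is a placeholder.

```shell
# Return 0 if the URL ends with ".git" and matches one of the two SSH
# formats listed in step 6.
is_valid_ssh_remote() {
  local url="$1"
  case "$url" in
    *.git) ;;
    *) return 1 ;;
  esac
  # Absolute URL: ssh://git@{host_name}[:{port}]/{repository_path}
  printf '%s\n' "$url" | grep -Eq '^ssh://git@[^/:@]+(:[0-9]+)?/.+$' && return 0
  # SCP-like URL: git@{host_name}:{repository_path}
  printf '%s\n' "$url" | grep -Eq '^git@[^/:@]+:.+$'
}

# Read known_hosts-style lines on stdin and drop the leading host field,
# leaving "ALGORITHM BASE64_KEY_VALUE". Comment lines are skipped.
strip_host_field() {
  awk 'NF >= 3 && $1 !~ /^#/ { print $2, $3 }'
}

# Illustrative checks with placeholder values.
for url in 'git@github.com:example-org/example-repo.git' \
           'ssh://git@git.example.com:2222/team/repo.git' \
           'https://github.com/example-org/example-repo.git'; do
  if is_valid_ssh_remote "$url"; then
    echo "ok:       $url"
  else
    echo "rejected: $url"
  fi
done

printf 'git.example.com ssh-ed25519 AAAAC3PLACEHOLDER\n' | strip_host_field
```

With a real host, the second helper can be fed from the command shown in step 9, for example `ssh-keyscan -t rsa ssh.dev.azure.com | strip_host_field`. Still compare the result against the key that your provider publishes.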

Connect a remote repository through HTTPS

To connect a remote repository through HTTPS, you need to create a Secret Manager secret with a personal access token, and share the secret with your default Dataform service agent.

Dataform then uses the access token to sign in to your Git provider to commit changes on behalf of the developers. Dataform makes these commits using the developer's Google Cloud email address so you can tell who made each commit.

Warning: The access token is shared among all Dataform users who use the corresponding service agent or service account. We recommend that you create a machine user with your Git provider and limit its access to the remote Git repositories you plan to use with Dataform. Only Google Cloud project owners and Dataform users with the Dataform Admin role can use the token to connect repositories. Dataform users are not able to see the token itself.

To connect a remote repository to a Dataform repository through HTTPS, follow these steps:

  1. In your Git provider, do the following:

    GitHub

    1. In GitHub, create a fine-grained personal access token or a classic personal access token.

      • For a fine-grained GitHub personal access token, do the following:
      1. Set repository access to only selected repositories, then select the repository that you want to connect to.

      2. Grant read and write access on contents of the repository.

      3. Set a token expiration time appropriate to your needs.

      • For a classic GitHub personal access token, do the following:
      1. Grant Dataform the repo permission.

      2. Set a token expiration time appropriate to your needs.

    2. If your organization uses SAML single sign-on (SSO), authorize the token.

    GitLab

    1. In GitLab, create a GitLab personal access token.

    2. Name the token dataform.

      The GitLab personal access token must be named dataform.

    3. Grant Dataform the api, read_repository, and write_repository permissions.

    4. Set a token expiration time appropriate to your needs.

  2. In Secret Manager, create a secret containing the personal access token of your remote repository.

  3. Grant access to the secret to your default Dataform service agent.

    Your default Dataform service agent is in the following format:

    service-PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com
    1. Grant the roles/secretmanager.secretAccessor role to the service agent.

  4. In the Google Cloud console, go to the Dataform page.

  5. Select the Dataform repository that you want to connect to the remote repository.

  6. On the repository page, click Settings > Connect with Git.

  7. In the Link to remote repository pane, in the Remote Git repository URL field, enter the URL of the remote Git repository, ending with .git.

    The URL of the remote Git repository cannot contain usernames or passwords.

  8. In the Default remote branch name field, enter the name of the main development branch of the remote Git repository.

  9. In the Secret drop-down, select your secret for the remote Git repository.

  10. Click Link.
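
The URL constraints from step 7 (an HTTPS URL that ends with .git and carries no username or password) can be checked before you enter the URL. The following is a minimal sketch: the function name `is_valid_https_remote` is hypothetical, and the regular expression is an approximation of those constraints, not the validation that Dataform itself performs.

```shell
# Return 0 if the URL uses https://, ends with ".git", and has no
# "user@" or "user:password@" part before the host name.
is_valid_https_remote() {
  local url="$1"
  # [^/@]+ is the host (and optional port); an "@" there means credentials.
  printf '%s\n' "$url" | grep -Eq '^https://[^/@]+/.+\.git$'
}

# Illustrative checks with placeholder URLs.
for url in 'https://github.com/example-org/example-repo.git' \
           'https://user:token@github.com/example-org/example-repo.git' \
           'https://gitlab.com/example-group/example-repo'; do
  if is_valid_https_remote "$url"; then
    echo "ok:       $url"
  else
    echo "rejected: $url"
  fi
done
```

Keeping credentials out of the URL matches the flow above, where the personal access token is supplied only through the Secret Manager secret.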

Edit the remote repository connection

To edit a connection between a Dataform repository and a remote Git repository, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

  2. Click the repository that you want to edit.

  3. On the repository page, click Settings > Edit Git connection.

  4. On the Link to remote repository pane, edit connection settings.

  5. Click Update.


Last updated 2025-12-15 UTC.