Translate SQL queries with the translation API

This document describes how to use the translation API in BigQuery to translate scripts written in other SQL dialects into GoogleSQL queries. The translation API can simplify the process of migrating workloads to BigQuery.

Before you begin

Before you submit a translation job, complete the following steps:

  1. Ensure that you have all the required permissions.
  2. Enable the BigQuery Migration API.
  3. Collect the source files containing the SQL scripts and queries to be translated.
  4. Upload the source files to Cloud Storage.

Required permissions

To get the permissions that you need to create translation jobs using the translation API, ask your administrator to grant you the MigrationWorkflow Editor (roles/bigquerymigration.editor) IAM role on the parent resource. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the following permissions, which are required to create translation jobs using the translation API:

  • bigquerymigration.workflows.create
  • bigquerymigration.workflows.get

You might also be able to get these permissions with custom roles or other predefined roles.
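If you manage access from the command line, a role grant along these lines would work. This is a sketch; PROJECT_ID and USER_EMAIL are placeholders for your own values:

# Grant the MigrationWorkflow Editor role on the project.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="user:USER_EMAIL" \
    --role="roles/bigquerymigration.editor"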

Enable the BigQuery Migration API

If your Google Cloud project was created before February 15, 2022, enable the BigQuery Migration API as follows:

  1. In the Google Cloud console, go to the BigQuery Migration API page.

    Go to BigQuery Migration API

  2. Click Enable.

Note: Projects created after February 15, 2022 have this API enabled automatically.
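If you prefer the command line, you can also enable the API with the Google Cloud CLI; a minimal sketch:

# Enable the BigQuery Migration API for the current project.
gcloud services enable bigquerymigration.googleapis.com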

Upload input files to Cloud Storage

If you want to use the Google Cloud console or the BigQuery Migration API to perform a translation job, you must upload the source files containing the queries and scripts you want to translate to Cloud Storage. You can also upload any metadata files or configuration YAML files to the same Cloud Storage bucket containing the source files. For more information about creating buckets and uploading files to Cloud Storage, see Create buckets and Upload objects from a filesystem.
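As a sketch of this step with the Google Cloud CLI (the bucket name and local paths are placeholders for your own values):

# Create a bucket and upload the SQL source files to it.
gcloud storage buckets create gs://my_sql_migration_bucket --location=us
gcloud storage cp ./sql_sources/*.sql gs://my_sql_migration_bucket/input/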

Supported task types

The translation API can translate the following SQL dialects into GoogleSQL:

  • Amazon Redshift SQL - Redshift2BigQuery_Translation
  • Apache HiveQL and Beeline CLI - HiveQL2BigQuery_Translation
  • Apache Impala - Impala2BigQuery_Translation
  • Apache Spark SQL - SparkSQL2BigQuery_Translation
  • Azure Synapse T-SQL - AzureSynapse2BigQuery_Translation
  • GoogleSQL (BigQuery) - Bigquery2Bigquery_Translation
  • Greenplum SQL - Greenplum2BigQuery_Translation
  • IBM Db2 SQL - Db22BigQuery_Translation
  • IBM Netezza SQL and NZPLSQL - Netezza2BigQuery_Translation
  • MySQL SQL - MySQL2BigQuery_Translation
  • Oracle SQL, PL/SQL, Exadata - Oracle2BigQuery_Translation
  • PostgreSQL SQL - Postgresql2BigQuery_Translation
  • Presto or Trino SQL - Presto2BigQuery_Translation
  • Snowflake SQL - Snowflake2BigQuery_Translation
  • SQLite - SQLite2BigQuery_Translation
  • SQL Server T-SQL - SQLServer2BigQuery_Translation
  • Teradata and Teradata Vantage - Teradata2BigQuery_Translation
  • Vertica SQL - Vertica2BigQuery_Translation

Handling unsupported SQL functions with helper UDFs

When translating SQL from a source dialect to BigQuery, some functions might not have a direct equivalent. To address this, the BigQuery Migration Service (and the broader BigQuery community) provides helper user-defined functions (UDFs) that replicate the behavior of these unsupported source dialect functions.

These UDFs are often found in the bqutil public dataset, allowing translated queries to initially reference them using the format bqutil.<dataset>.<function>(). For example, bqutil.fn.cw_count().

Important considerations for production environments:

While bqutil offers convenient access to these helper UDFs for initial translation and testing, direct reliance on bqutil for production workloads is not recommended for several reasons:

  1. Version control: The bqutil project hosts the latest version of these UDFs, which means their definitions can change over time. Relying directly on bqutil could lead to unexpected behavior or breaking changes in your production queries if a UDF's logic is updated.
  2. Dependency isolation: Deploying UDFs to your own project isolates your production environment from external changes.
  3. Customization: You might need to modify or optimize these UDFs to better suit your specific business logic or performance requirements. This is only possible if they are within your own project.
  4. Security and governance: Your organization's security policies might restrict direct access to public datasets like bqutil for production data processing. Copying UDFs to your controlled environment aligns with such policies.

Deploying helper UDFs to your project:

For reliable and stable production use, you should deploy these helper UDFs into your own project and dataset. This gives you full control over their version, customization, and access. For detailed instructions on how to deploy these UDFs, refer to the UDFs deployment guide on GitHub. This guide provides the necessary scripts and steps to copy the UDFs into your environment.
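As an illustration of the difference this makes in practice, the sketch below contrasts a translated query that references the public bqutil dataset with the same query pointed at your own copy. The project and dataset names are placeholders, and the zero-argument call to cw_count mirrors the example above rather than documenting that function's exact signature:

# During translation and testing: the query references the public bqutil dataset.
bq query --use_legacy_sql=false \
    'SELECT bqutil.fn.cw_count() AS n'

# In production: after deploying the UDFs into your own project and dataset
# (my_project.my_udfs is a placeholder), reference your copy instead.
bq query --use_legacy_sql=false \
    'SELECT `my_project.my_udfs`.cw_count() AS n'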

Locations

The translation API is available in the following processing locations:

Asia Pacific
  • Delhi: asia-south2
  • Hong Kong: asia-east2
  • Jakarta: asia-southeast2
  • Melbourne: australia-southeast2
  • Mumbai: asia-south1
  • Osaka: asia-northeast2
  • Seoul: asia-northeast3
  • Singapore: asia-southeast1
  • Sydney: australia-southeast1
  • Taiwan: asia-east1
  • Tokyo: asia-northeast1

Europe
  • Belgium: europe-west1 (Low CO2)
  • Berlin: europe-west10
  • EU multi-region: eu
  • Finland: europe-north1 (Low CO2)
  • Frankfurt: europe-west3
  • London: europe-west2 (Low CO2)
  • Madrid: europe-southwest1 (Low CO2)
  • Milan: europe-west8
  • Netherlands: europe-west4 (Low CO2)
  • Paris: europe-west9 (Low CO2)
  • Stockholm: europe-north2 (Low CO2)
  • Turin: europe-west12
  • Warsaw: europe-central2
  • Zürich: europe-west6 (Low CO2)

Americas
  • Columbus, Ohio: us-east5
  • Dallas: us-south1 (Low CO2)
  • Iowa: us-central1 (Low CO2)
  • Las Vegas: us-west4
  • Los Angeles: us-west2
  • Mexico: northamerica-south1
  • Northern Virginia: us-east4
  • Oregon: us-west1 (Low CO2)
  • Québec: northamerica-northeast1 (Low CO2)
  • São Paulo: southamerica-east1 (Low CO2)
  • Salt Lake City: us-west3
  • Santiago: southamerica-west1 (Low CO2)
  • South Carolina: us-east1
  • Toronto: northamerica-northeast2 (Low CO2)
  • US multi-region: us

Africa
  • Johannesburg: africa-south1

Middle East
  • Dammam: me-central2
  • Doha: me-central1
  • Israel: me-west1

Submit a translation job

To submit a translation job using the translation API, use the projects.locations.workflows.create method and supply an instance of the MigrationWorkflow resource with a supported task type.

Once the job is submitted, you can issue a query to get results.

Create a batch translation

The following curl command creates a batch translation job where the input and output files are stored in Cloud Storage. The source_target_mapping field contains a list that maps the source directories to an optional relative path for the target output.

curl -d "{  \"tasks\": {      string: {        \"type\": \"TYPE\",        \"translation_details\": {            \"target_base_uri\": \"TARGET_BASE\",            \"source_target_mapping\": {              \"source_spec\": {                  \"base_uri\": \"BASE\"              }            },            \"target_types\": \"TARGET_TYPES\",        }      }  }  }" \  -H "Content-Type:application/json" \  -H "Authorization: BearerTOKEN" -X POST https://bigquerymigration.googleapis.com/v2alpha/projects/PROJECT_ID/locations/LOCATION/workflows

Replace the following:

  • TYPE: the task type of the translation, which determines the source and target dialect.
  • TARGET_BASE: the base URI for all translation outputs.
  • BASE: the base URI for all files read as sources for translation.
  • TARGET_TYPES (optional): the generated output types. If not specified, SQL is generated.

    • sql (default): the translated SQL query files.
    • suggestion: AI-generated suggestions.

    The output is stored in a subfolder in the output directory. The subfolder is named based on the value in TARGET_TYPES.

  • TOKEN: the token for authentication. To generate a token, use the gcloud auth print-access-token command or the OAuth 2.0 playground (use the scope https://www.googleapis.com/auth/cloud-platform).

  • PROJECT_ID: the project to process the translation.

  • LOCATION: the location where the job is processed.

The preceding command returns a response that includes a workflow ID written in the format projects/PROJECT_ID/locations/LOCATION/workflows/WORKFLOW_ID.

Example batch translation

To translate the Teradata SQL scripts in the Cloud Storage directory gs://my_data_bucket/teradata/input/ and store the results in the Cloud Storage directory gs://my_data_bucket/teradata/output/, you might use the following request:

{"tasks":{"task_name":{"type":"Teradata2BigQuery_Translation","translation_details":{"target_base_uri":"gs://my_data_bucket/teradata/output/","source_target_mapping":{"source_spec":{"base_uri":"gs://my_data_bucket/teradata/input/"}},}}}}
Note: The string "task_name" in this example is an identifier for the translation task and can be set to any value you prefer.
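If you save the payload above to a local file, one way to submit it is with curl. This is a sketch; request.json is a hypothetical filename, and the us location matches the example output below:

curl -X POST \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -d @request.json \
    "https://bigquerymigration.googleapis.com/v2alpha/projects/PROJECT_ID/locations/us/workflows"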

This call will return a message containing the created workflow ID in the "name" field:

{"name":"projects/123456789/locations/us/workflows/12345678-9abc-def1-2345-6789abcdef00","tasks":{"task_name":{/*...*/}},"state":"RUNNING"}

To get the updated status for the workflow, run a GET query. The job sends outputs to Cloud Storage as it progresses. The job state changes to COMPLETED after all the requested target_types are generated. If the task succeeds, you can find the translated SQL query in gs://my_data_bucket/teradata/output.
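To inspect or download the generated files, you could use the Google Cloud CLI. A sketch using the paths from this example; translated_output/ is a hypothetical local directory:

# List the generated files, then copy them locally.
gcloud storage ls gs://my_data_bucket/teradata/output/
gcloud storage cp -r gs://my_data_bucket/teradata/output/ ./translated_output/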

Example batch translation with AI suggestions

Preview

This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Note: The translation API can call Gemini through the BigQuery Vertex AI integration to generate suggestions for your translated SQL query, based on your AI configuration YAML file.

The following example translates the Teradata SQL scripts located in the gs://my_data_bucket/teradata/input/ Cloud Storage directory and stores the results in the gs://my_data_bucket/teradata/output/ Cloud Storage directory, with additional AI suggestions:

{"tasks":{"task_name":{"type":"Teradata2BigQuery_Translation","translation_details":{"target_base_uri":"gs://my_data_bucket/teradata/output/","source_target_mapping":{"source_spec":{"base_uri":"gs://my_data_bucket/teradata/input/"}},"target_types":"suggestion",}}}}
Note: To generate AI suggestions, the Cloud Storage source directory must contain at least one configuration YAML file with a suffix of .ai_config.yaml. To learn how to write the configuration YAML file for AI suggestions, see Create a Gemini-based configuration YAML file.

After the task runs successfully, AI suggestions can be found in the gs://my_data_bucket/teradata/output/suggestion Cloud Storage directory.

Create an interactive translation job with string literal inputs and outputs

The following curl command creates a translation job with string literal inputs and outputs. The source_target_mapping field contains a list that maps the source literal entries to an optional relative path for the target output.

curl -d "{  \"tasks\": {      string: {        \"type\": \"TYPE\",        \"translation_details\": {        \"source_target_mapping\": {            \"source_spec\": {              \"literal\": {              \"relative_path\": \"PATH\",              \"literal_string\": \"STRING\"              }            }        },        \"target_return_literals\": \"TARGETS\",        }      }  }  }" \  -H "Content-Type:application/json" \  -H "Authorization: BearerTOKEN" -X POST https://bigquerymigration.googleapis.com/v2alpha/projects/PROJECT_ID/locations/LOCATION/workflows

Replace the following:

  • TYPE: the task type of the translation, which determines the source and target dialect.
  • PATH: the identifier of the literal entry, similar to a filename or path.
  • STRING: a string of literal input data (for example, SQL) to be translated.
  • TARGETS: the expected targets that the user wants directly returned in the response in the literal format. These should be in the target URI format (for example, GENERATED_DIR + target_spec.relative_path + source_spec.literal.relative_path). Anything not in this list is not returned in the response. The generated directory, GENERATED_DIR, for general SQL translations is sql/.
  • TOKEN: the token for authentication. To generate a token, use the gcloud auth print-access-token command or the OAuth 2.0 playground (use the scope https://www.googleapis.com/auth/cloud-platform).
  • PROJECT_ID: the project to process the translation.
  • LOCATION: the location where the job is processed.

The preceding command returns a response that includes a workflow ID written in the format projects/PROJECT_ID/locations/LOCATION/workflows/WORKFLOW_ID.

When your job completes, you can view the results by querying the job and examining the inline translation_literals field in the response.

Example interactive translation

To translate the Hive SQL string select 1 interactively, you might use the following request:

"tasks":{string:{"type":"HiveQL2BigQuery_Translation","translation_details":{"source_target_mapping":{"source_spec":{"literal":{"relative_path":"input_file","literal_string":"select 1"}}},"target_return_literals":"sql/input_file",}}}
Note: The task name (string in this example) is an identifier for the translation task and can be set to any value you prefer.

You can use any relative_path you would like for your literal, but the translated literal will only appear in the results if you include sql/$relative_path in your target_return_literals. You can also include multiple literals in a single query, in which case each of their relative paths must be included in target_return_literals.

This call will return a message containing the created workflow ID in the "name" field:

{"name":"projects/123456789/locations/us/workflows/12345678-9abc-def1-2345-6789abcdef00","tasks":{"task_name":{/*...*/}},"state":"RUNNING"}

To get the updated status for the workflow, run a GET query. The job is complete when "state" changes to COMPLETED. If the task succeeds, you can find the translated SQL in the response message:

{"name":"projects/123456789/locations/us/workflows/12345678-9abc-def1-2345-6789abcdef00","tasks":{"string":{"id":"0fedba98-7654-3210-1234-56789abcdef","type":"HiveQL2BigQuery_Translation",/* ... */"taskResult":{"translationTaskResult":{"translatedLiterals":[{"relativePath":"sql/input_file","literalString":"-- Translation time: 2023-10-05T21:50:49.885839Z\n-- Translation job ID: projects/123456789/locations/us/workflows/12345678-9abc-def1-2345-6789abcdef00\n-- Source: input_file\n-- Translated from: Hive\n-- Translated to: BigQuery\n\nSELECT\n    1\n;\n"}],"reportLogMessages":[...]}},/* ... */}},"state":"COMPLETED","createTime":"2023-10-05T21:50:49.543221Z","lastUpdateTime":"2023-10-05T21:50:50.462758Z"}

Explore the translation output

After running the translation job, retrieve the results by specifying the translation job workflow ID using the following command:

curl \
-H "Content-Type: application/json" \
-H "Authorization: Bearer TOKEN" \
-X GET https://bigquerymigration.googleapis.com/v2alpha/projects/PROJECT_ID/locations/LOCATION/workflows/WORKFLOW_ID

Replace the following:

  • TOKEN: the token for authentication. To generate a token, use the gcloud auth print-access-token command or the OAuth 2.0 playground (use the scope https://www.googleapis.com/auth/cloud-platform).
  • PROJECT_ID: the project to process the translation.
  • LOCATION: the location where the job is processed.
  • WORKFLOW_ID: the ID generated when you create a translation workflow.

The response contains the status of your migration workflow and any completed files in target_return_literals. You can poll this endpoint to check your workflow's status.
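A minimal polling sketch, assuming PROJECT_ID, LOCATION, and WORKFLOW_ID are set as environment variables and jq is installed:

# Poll the workflow every 10 seconds until its state is COMPLETED.
until [ "$(curl -s \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://bigquerymigration.googleapis.com/v2alpha/projects/${PROJECT_ID}/locations/${LOCATION}/workflows/${WORKFLOW_ID}" \
    | jq -r '.state')" = "COMPLETED" ]; do
  echo "Workflow still running; checking again in 10 seconds..."
  sleep 10
done
echo "Workflow completed."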
