Customize Python functions for BigQuery DataFrames

BigQuery DataFrames lets you turn your custom Python functions into BigQuery artifacts that you can run on BigQuery DataFrames objects at scale. This extensibility support lets you perform operations beyond what is possible with BigQuery DataFrames and SQL APIs, so you can take advantage of open source libraries.

There are two variants of this extensibility mechanism: user-defined functions and remote functions.

Required roles

To get the permissions that you need to complete the tasks in this document, ask your administrator to grant you the following IAM roles on your project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

User-defined functions (UDFs)

With UDFs (Preview), you can turn your custom Python function into a Python UDF. For an example usage, see Create a persistent Python UDF.

Creating a UDF in BigQuery DataFrames creates a BigQuery routine as the Python UDF in the specified dataset. For a full set of supported parameters, see bigframes.pandas.udf.
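As a minimal sketch (the dataset and routine names here are placeholders), a scalar function can be deployed as a Python UDF by wrapping it with bigframes.pandas.udf. The deployment and apply calls are shown commented because they require a Google Cloud project:

```python
# The scalar logic is plain Python. Because UDF code must be
# self-contained, any imports go inside the function body.
def area_code(phone: str) -> str:
    # Return the first run of three digits in a phone number.
    import re
    match = re.search(r"\d{3}", phone)
    return match.group(0) if match else ""

# Deploying it as a BigQuery Python UDF (names are placeholders):
# import bigframes.pandas as bpd
# area_code_udf = bpd.udf(dataset="my_dataset", name="area_code")(area_code)
# df["area"] = df["phone"].apply(area_code_udf)
```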

Requirements

To use a BigQuery DataFrames UDF, enable the BigQuery API in your project. If you provide the bigquery_connection parameter, you must also enable the BigQuery Connection API in your project.
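For example, assuming the standard gcloud CLI and a placeholder project ID, you could enable these APIs as follows:

```shell
# Enable the BigQuery API (the project ID is a placeholder).
gcloud services enable bigquery.googleapis.com --project=my-project

# Only needed if you pass the bigquery_connection parameter:
gcloud services enable bigqueryconnection.googleapis.com --project=my-project
```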

Clean up

In addition to cleaning up the cloud artifacts directly in the Google Cloud console or with other tools, you can clean up the BigQuery DataFrames UDFs that were created with an explicit name argument by using the bigframes.pandas.get_global_session().bqclient.delete_routine(routine_id) command.

Limitations

  • The code in the UDF must be self-contained, meaning it must not contain any references to an import or variable defined outside of the function body.
  • The code in the UDF must be compatible with Python 3.11, as that is the environment in which the code is executed in the cloud.
  • Re-running the UDF definition code after trivial changes in the function code (for example, renaming a variable or inserting a new line) causes the UDF to be re-created, even if these changes are inconsequential to the behavior of the function.
  • The user code is visible to users with read access on the BigQuery routines, so you should include sensitive content only with caution.
  • A project can have up to 1,000 Cloud Run functions at a time in a BigQuery location.

The BigQuery DataFrames UDF deploys a user-defined BigQuery Python function, and the related limitations apply.

Remote functions

BigQuery DataFrames lets you turn your custom scalar functions into BigQuery remote functions. For an example usage, see Create a remote function. For a full set of supported parameters, see remote_function.

Creating a remote function in BigQuery DataFrames creates the following:

  • A Cloud Run function.
  • A BigQuery connection.

    By default, a connection named bigframes-default-connection is used. You can use a pre-configured BigQuery connection if you prefer, in which case the connection creation is skipped. The service account for the default connection is granted the Cloud Run Invoker role (roles/run.invoker).

  • A BigQuery remote function that invokes the Cloud Run function through the BigQuery connection.
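As a sketch of how these pieces fit together (the dataset, connection, and function names are assumptions), a scalar function could be deployed with remote_function and applied to a column. The deployment calls are commented out because they create cloud resources:

```python
# Plain scalar logic that would run inside the Cloud Run function.
def duration_bucket(seconds: float) -> str:
    # Classify a duration into a coarse bucket.
    return "short" if seconds < 60 else "long"

# Deploying it as a BigQuery remote function (names are placeholders):
# import bigframes.pandas as bpd
# duration_bucket_remote = bpd.remote_function(
#     dataset="my_dataset",
#     name="duration_bucket",
#     bigquery_connection="bigframes-default-connection",
# )(duration_bucket)
# df["bucket"] = df["seconds"].apply(duration_bucket_remote)
```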

Requirements

To use BigQuery DataFrames remote functions, you must enable thefollowing APIs:

When you use BigQuery DataFrames remote functions, you need the Project IAM Admin role (roles/resourcemanager.projectIamAdmin) if you're using a default BigQuery connection, or the Browser role (roles/browser) if you're using a pre-configured connection. You can avoid this requirement by setting the bigframes.pandas.options.bigquery.skip_bq_connection_check option to True, in which case the connection (default or pre-configured) is used as-is without any existence or permission check. If you're using a pre-configured connection and skipping the connection check, verify the following:

  • The connection is created in the right location.
  • The service account for the connection has the Cloud Run Invoker role (roles/run.invoker) on the project.
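A sketch of skipping the connection check with a pre-configured connection (the connection path is a placeholder); the lines are shown commented because they take effect only against a real project:

```python
# import bigframes.pandas as bpd
#
# # Use the pre-configured connection as-is, with no existence or
# # permission check (and no need for the IAM roles above):
# bpd.options.bigquery.skip_bq_connection_check = True
# bpd.options.bigquery.bq_connection = (
#     "projects/my-project/locations/us/connections/my-connection"
# )
```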

View and manage connections

BigQuery connections are created in the same location as the BigQuery DataFrames session, using the name you provide in the custom function definition. To view and manage connections, do the following:

  1. In the Google Cloud console, go to the BigQuery page.


  2. Select the project in which you created the remote function.

  3. In the left pane, click Explorer.


  4. In the Explorer pane, expand the project, and then click Connections.

BigQuery remote functions are created in the dataset you specify, or they are created in an anonymous dataset, which is a type of hidden dataset. If you don't set a name for a remote function during its creation, BigQuery DataFrames applies a default name that begins with the bigframes prefix. To view and manage remote functions created in a user-specified dataset, do the following:

  1. In the Google Cloud console, go to the BigQuery page.


  2. Select the project in which you created the remote function.

  3. In the left pane, click Explorer.


  4. In the Explorer pane, expand the project, and then click Datasets.

  5. Click the dataset in which you created the remote function.

  6. Click the Routines tab.

To view and manage Cloud Run functions, do the following:

  1. Go to the Cloud Run page.


  2. Select the project in which you created the function.

  3. In the list of available services, filter on the Function deployment type.

  4. To identify functions created by BigQuery DataFrames, look for function names with the bigframes prefix.

Clean up

In addition to cleaning up the cloud artifacts directly in the Google Cloud console or with other tools, you can clean up the BigQuery remote functions that were created without an explicit name argument and their associated Cloud Run functions in the following ways:

  • For a BigQuery DataFrames session, use the session.close() command.
  • For the default BigQuery DataFrames session, use the bigframes.pandas.close_session() command.
  • For a past session with session_id, use the bigframes.pandas.clean_up_by_session_id(session_id) command.

You can also clean up the BigQuery remote functions that were created with an explicit name argument and their associated Cloud Run functions by using the bigframes.pandas.get_global_session().bqclient.delete_routine(routine_id) command.
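The cleanup commands above can be sketched together (the session ID and routine ID are placeholders, so the calls are shown commented):

```python
# import bigframes.pandas as bpd
#
# # Remote functions created without an explicit name, plus their
# # associated Cloud Run functions:
# session.close()                         # a specific session object
# bpd.close_session()                     # the default session
# bpd.clean_up_by_session_id(session_id)  # a past session
#
# # Functions created with an explicit name argument:
# bpd.get_global_session().bqclient.delete_routine(
#     "my-project.my_dataset.my_routine"
# )
```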

Limitations

  • Remote functions take about 90 seconds to become usable when you first create them. Additional package dependencies might add to the latency.
  • Re-running the remote function definition code after trivial changes in and around the function code (for example, renaming a variable, inserting a new line, or inserting a new cell in the notebook) might cause the remote function to be re-created, even if these changes are inconsequential to the behavior of the function.
  • The user code is visible to users with read access on the Cloud Run functions, so you should include sensitive content only with caution.
  • A project can have up to 1,000 Cloud Run functions at a time in a region. For more information, see Quotas.

What's next

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-18 UTC.