Model inference overview
Machine learning inference is the process of running data points into a machine learning model to calculate an output such as a single numerical score. This process is also referred to as "operationalizing a machine learning model" or "putting a machine learning model into production."
This document describes the types of inference that BigQuery ML supports, which include batch prediction and online prediction.
Batch prediction
The following sections describe the available ways of performing prediction in BigQuery ML.
Inference using BigQuery ML trained models
Prediction in BigQuery ML is used not only for supervised learning models, but also for unsupervised learning models.
BigQuery ML supports prediction through the ML.PREDICT function with the following models:
| Model Category | Model Types | What ML.PREDICT does |
|---|---|---|
| Supervised learning | Linear & logistic regression, Boosted trees, Random forest, Deep neural networks, Wide-and-Deep, AutoML Tables | Predict the label, either a numerical value for regression tasks or a categorical value for classification tasks. |
| Unsupervised learning | K-means | Assign the cluster to the entity. |
| Unsupervised learning | PCA | Apply dimensionality reduction to the entity by transforming it into the space spanned by the eigenvectors. |
| Unsupervised learning | Autoencoder | Transform the entity into the embedded space. |
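As a minimal sketch of batch prediction with a trained model, assuming a model named `mydataset.my_model` that was previously created with a CREATE MODEL statement and an input table named `mydataset.input_data` (both names are illustrative placeholders):

```sql
-- Run batch prediction with a BigQuery ML trained model.
-- The model and table names are illustrative placeholders.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.my_model`,
  (SELECT * FROM `mydataset.input_data`));
```

The output columns that ML.PREDICT returns depend on the model type, as summarized in the preceding table.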
Inference using imported models
With this approach, you create and train a model outside of BigQuery, import it by using the CREATE MODEL statement, and then run inference on it by using the ML.PREDICT function. All inference processing occurs in BigQuery, using data from BigQuery. Imported models can perform supervised or unsupervised learning.
BigQuery ML supports the following types of imported models:
- Open Neural Network Exchange (ONNX) for models trained in PyTorch, scikit-learn, and other popular ML frameworks.
- TensorFlow
- TensorFlow Lite
- XGBoost
Use this approach to make use of custom models developed with a range of ML frameworks while taking advantage of BigQuery ML's inference speed and co-location with data.
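As a sketch of this pattern, assuming an ONNX model file at a hypothetical Cloud Storage path and illustrative dataset and table names:

```sql
-- Import an ONNX model from Cloud Storage into BigQuery ML.
-- The bucket path, dataset, and table names are illustrative placeholders.
CREATE OR REPLACE MODEL `mydataset.imported_onnx_model`
  OPTIONS (
    model_type = 'ONNX',
    model_path = 'gs://my-bucket/models/model.onnx');

-- Run inference inside BigQuery against the imported model.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.imported_onnx_model`,
  (SELECT * FROM `mydataset.new_data`));
```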
To learn more, try one of the following tutorials:
- Make predictions with imported TensorFlow models
- Make predictions with scikit-learn models in ONNX format
- Make predictions with PyTorch models in ONNX format
Inference using remote models
With this approach, you can create a reference to a model hosted in Vertex AI Inference by using the CREATE MODEL statement, and then run inference on it by using the ML.PREDICT function. All inference processing occurs in Vertex AI, using data from BigQuery. Remote models can perform supervised or unsupervised learning.
Use this approach to run inference against large models that require the GPU hardware support provided by Vertex AI. If most of your models are hosted by Vertex AI, this also lets you run inference against these models by using SQL, without having to manually build data pipelines to take data to Vertex AI and bring prediction results back to BigQuery.
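A minimal sketch of this pattern, assuming a Cloud resource connection named `us.my_connection` and a hypothetical Vertex AI endpoint URL:

```sql
-- Create a remote model that references a Vertex AI endpoint.
-- The connection name and endpoint URL are illustrative placeholders.
CREATE OR REPLACE MODEL `mydataset.my_remote_model`
  REMOTE WITH CONNECTION `us.my_connection`
  OPTIONS (
    endpoint = 'https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/endpoints/1234567890');

-- ML.PREDICT sends the input rows to Vertex AI for inference
-- and returns the results to BigQuery.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.my_remote_model`,
  (SELECT * FROM `mydataset.input_data`));
```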
For step-by-step instructions, see Make predictions with remote models on Vertex AI.
Batch inference with BigQuery models in Vertex AI
BigQuery ML has built-in support for batch prediction, without the need to use Vertex AI. It is also possible to register a BigQuery ML model to Model Registry in order to perform batch prediction in Vertex AI using a BigQuery table as input. However, this can only be done by using the Vertex AI API and setting `InstanceConfig.instanceType` to `object`.
Online prediction
The built-in inference capability of BigQuery ML is optimized for large-scale use cases, such as batch prediction. While BigQuery ML delivers low-latency inference results when handling small input data, you can achieve faster online prediction through seamless integration with Vertex AI.
You can manage BigQuery ML models within the Vertex AI environment, which eliminates the need to export models from BigQuery ML before deploying them as Vertex AI endpoints. By managing models within Vertex AI, you get access to all of the Vertex AI MLOps capabilities, and also to features such as Vertex AI Feature Store.
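One way to opt in to this integration is at training time, through the model_registry training option. A minimal sketch, with illustrative dataset, model, and column names:

```sql
-- Train a logistic regression model and register it with
-- Vertex AI Model Registry at creation time.
-- Dataset, model, and column names are illustrative placeholders.
CREATE OR REPLACE MODEL `mydataset.my_registered_model`
  OPTIONS (
    model_type = 'LOGISTIC_REG',
    model_registry = 'vertex_ai',
    vertex_ai_model_id = 'my_vertex_model',
    input_label_cols = ['label'])
AS
SELECT * FROM `mydataset.training_data`;
```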
Additionally, you have the flexibility to export BigQuery ML models to Cloud Storage for availability on other model hosting platforms.
What's next
- For more information about using Vertex AI models to generate text and embeddings, see Generative AI overview.
- For more information about using Cloud AI APIs to perform AI tasks, see AI application overview.
For more information about supported SQL statements and functions for different model types, see the following documents: