The CREATE MODEL statement for remote models over LLMs

This document describes the CREATE MODEL statement for creating remote models in BigQuery over models in Vertex AI by using SQL. Alternatively, you can use the Google Cloud console user interface to create a model by using a UI (Preview) instead of constructing the SQL statement yourself.

After you create the remote model, you can use supported BigQuery ML functions, such as AI.GENERATE_TEXT and AI.GENERATE_EMBEDDING, to perform generative AI with that model.

CREATE MODEL syntax

{CREATE MODEL | CREATE MODEL IF NOT EXISTS | CREATE OR REPLACE MODEL}
`project_id.dataset.model_name`
REMOTE WITH CONNECTION {`project_id.region.connection_id` | DEFAULT}
OPTIONS(
  ENDPOINT = 'vertex_ai_llm_endpoint'
  [, PROMPT_COL = 'prompt_col']
  [, INPUT_LABEL_COLS = input_label_cols]
  [, MAX_ITERATIONS = max_iterations]
  [, LEARNING_RATE_MULTIPLIER = learning_rate_multiplier]
  [, DATA_SPLIT_METHOD = 'data_split_method']
  [, DATA_SPLIT_EVAL_FRACTION = data_split_eval_fraction]
  [, DATA_SPLIT_COL = 'data_split_col']
  [, EVALUATION_TASK = 'evaluation_task'])
[AS SELECT prompt_column, label_column FROM `project_id.dataset.table_name`]

CREATE MODEL

Creates and trains a new model in the specified dataset. If the model name exists, CREATE MODEL returns an error.

CREATE MODEL IF NOT EXISTS

Creates and trains a new model only if the model doesn't exist in the specified dataset.

CREATE OR REPLACE MODEL

Creates and trains a model and replaces an existing model with the same name in the specified dataset.

model_name

The name of the model you're creating or replacing. The model name must be unique in the dataset: no other model or table can have the same name. The model name must follow the same naming rules as a BigQuery table. A model name can:

  • Contain up to 1,024 characters
  • Contain letters (upper or lower case), numbers, and underscores

model_name is case-sensitive.

If you don't have a default project configured, then you must prepend the project ID to the model name in the following format, including backticks:

`[PROJECT_ID].[DATASET].[MODEL]`

For example, `myproject.mydataset.mymodel`.

REMOTE WITH CONNECTION

Syntax

`[PROJECT_ID].[LOCATION].[CONNECTION_ID]`

BigQuery uses a Cloud resource connection to interact with the Vertex AI endpoint.

The connection elements are as follows:

  • PROJECT_ID: the project ID of the project that contains the connection.
  • LOCATION: the location used by the connection. The connection must be in the same location as the dataset that contains the model.
  • CONNECTION_ID: the connection ID—for example, myconnection.

    To find your connection ID, view the connection details in the Google Cloud console. The connection ID is the value in the last section of the fully qualified connection ID that is shown in Connection ID—for example, projects/myproject/locations/connection_location/connections/myconnection.

    To use a default connection, specify DEFAULT instead of the connection string containing PROJECT_ID.LOCATION.CONNECTION_ID.

If you are creating a remote model over a Vertex AI model that uses supervised tuning, you need to grant the Vertex AI Service Agent role to the connection's service account in the project where you create the model. Otherwise, you need to grant the Vertex AI User role to the connection's service account in the project where you create the model.

If you are using the remote model to analyze unstructured data from an object table, you must also grant the Vertex AI Service Agent role to the service account of the connection associated with the object table. You can find the object table's connection in the Google Cloud console, on the Details pane for the object table.

Example

`myproject.us.my_connection`

ENDPOINT

Syntax

ENDPOINT = 'vertex_ai_llm_endpoint'

Description

The Vertex AI endpoint for the remote model to use. You can specify the name of the Vertex AI model, for example gemini-2.5-flash, or you can specify the Vertex AI model's endpoint URL, for example https://europe-west6-aiplatform.googleapis.com/v1/projects/myproject/locations/europe-west6/publishers/google/models/gemini-2.5-flash. If you specify the model name, BigQuery ML automatically identifies and uses the full endpoint of the Vertex AI model. The resource portion of the URL is saved as the remote endpoint value in the model metadata. For example, projects/myproject/locations/europe-west6/publishers/google/models/gemini-2.5-flash.

Arguments

A STRING value that contains the model name of the target Vertex AI LLM. The following LLMs are supported:

Gemini multimodal models

All of the generally available and preview Gemini models are supported.

Note: To provide feedback or request support for the models in preview, send an email to bqml-feedback@google.com.

To always use the latest version of a Gemini model, specify one of the Gemini auto-updated aliases. For more information on Gemini model versioning, see Model versions and lifecycle.

For supported Gemini models, you can specify the global endpoint, as shown in the following example:

https://aiplatform.googleapis.com/v1/projects/test-project/locations/global/publishers/google/models/gemini-2.0-flash-001

Using the global endpoint for your requests can improve overall availability while reducing resource exhausted (429) errors, which occur when you exceed your quota for a regional endpoint. If you want to use Gemini 2.0+ in a region where it isn't available, you can avoid migrating your data to a different region by using the global endpoint instead. You can only use the global endpoint with the AI.GENERATE_TEXT function.

Note: Don't use the global endpoint if you have requirements for the data processing location, because when you use the global endpoint, you can't control or know the region where your processing requests are handled.
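Putting the pieces above together, a CREATE MODEL statement that targets the global endpoint might look like the following sketch. The project, dataset, and connection names are hypothetical placeholders:

```sql
CREATE OR REPLACE MODEL `mydataset.global_flash_model`
  -- Hypothetical connection; must exist and have the Vertex AI User role.
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  OPTIONS (
    -- Full endpoint URL with the "global" location instead of a region.
    ENDPOINT = 'https://aiplatform.googleapis.com/v1/projects/myproject/locations/global/publishers/google/models/gemini-2.0-flash-001');
```

Remember that a model created this way can only be used with the AI.GENERATE_TEXT function, per the restriction above.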

After you create a remote model based on a Gemini model, you can use the model with BigQuery ML's generative AI functions, such as AI.GENERATE_TEXT.

Note: Using Gemini 2.5 models with any of these functions incurs charges for the thinking process. With some functions, you can set a budget for the thinking process for Gemini 2.5 Flash and Gemini 2.5 Flash-Lite models. You can't set a budget for Gemini 2.5 Pro models. See the documentation for a given function for details.

Claude models

The following Anthropic Claude models are supported:

  • claude-haiku-4-5@20251001
  • claude-sonnet-4-5@20250929
  • claude-opus-4-1@20250805
  • claude-opus-4@20250514
  • claude-sonnet-4@20250514
  • claude-3-7-sonnet@20250219
  • claude-3-5-haiku@20241022
  • claude-3-haiku@20240307

You must enable Claude models in Vertex AI before you can use them. For more information, see Enable a partner model.

Although Claude models are multimodal, you can only use text input with Claude models in BigQuery ML.

After you create a remote model based on a Claude model, you can use the model with the AI.GENERATE_TEXT function to generate text based on a prompt you provide in a query or from a column in a standard table.
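For example, a remote model over a Claude model might be created as follows. This is a sketch: the dataset and connection names are hypothetical, and the model must first be enabled in Vertex AI as described above:

```sql
CREATE OR REPLACE MODEL `mydataset.claude_model`
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  -- Claude endpoints require the version suffix, for example @20241022.
  OPTIONS (ENDPOINT = 'claude-3-5-haiku@20241022');
```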

Mistral AI models

The following Mistral AI models are supported:

  • mistral-large-2411
  • mistral-nemo
  • mistral-small-2503

Don't use a version suffix with any Mistral AI model.

You must enable Mistral AI models in Vertex AI before you can use them. For more information, see Enable a partner model.

After you create a remote model based on a Mistral AI model, you can use the model with the AI.GENERATE_TEXT function to generate text based on a prompt you provide in a query or from a column in a standard table.

Llama models

To create a Llama model in BigQuery ML, you must specify it as an OpenAI API endpoint in the format openapi/<publisher_name>/<model_name>.

The following Llama models are supported:

  • Llama 4 Scout 17B-16E, endpoint meta/llama-4-scout-17b-16e-instruct-maas
  • Llama 4 Maverick 17B-128E, endpoint meta/llama-4-maverick-17b-128e-instruct-maas
  • Llama 3.3 70B (Preview), endpoint openapi/meta/llama-3.3-70b-instruct-maas
  • Llama 3.2 90B (Preview), endpoint openapi/meta/llama-3.2-90b-vision-instruct-maas
  • Llama 3.1 405B (GA), endpoint openapi/meta/llama-3.1-405b-instruct-maas
  • Llama 3.1 70B (Preview), endpoint openapi/meta/llama-3.1-70b-instruct-maas
  • Llama 3.1 8B (Preview), endpoint openapi/meta/llama-3.1-8b-instruct-maas

Important: For Llama 4.0 and greater models, you must create the dataset and connection for the remote model in the same region as the Llama model endpoint.

You must enable Llama models in Vertex AI before you can use them. For more information, see Enable a partner model.

After you create a remote model based on a Llama model, you can use the model with the AI.GENERATE_TEXT function to generate text based on a prompt you provide in a query or from a column in a standard table.
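Because the endpoint format differs from the other partner models, a Llama example may help. The following sketch uses hypothetical dataset and connection names and one of the endpoint strings listed above:

```sql
CREATE OR REPLACE MODEL `mydataset.llama_model`
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  -- OpenAI API endpoint format: openapi/<publisher_name>/<model_name>.
  OPTIONS (ENDPOINT = 'openapi/meta/llama-3.1-405b-instruct-maas');
```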

multimodal embedding models

The multimodalembedding embedding model is supported. You must specify the 001 version of the model, multimodalembedding@001.

After you create a remote model based on a multimodalembedding embedding model, you can use the model with the AI.GENERATE_EMBEDDING function to generate embeddings from text data in a BigQuery table or from visual content in a BigQuery object table.
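As a concrete sketch, creating the remote model looks like the following; the dataset and connection names are hypothetical:

```sql
CREATE OR REPLACE MODEL `mydataset.multimodal_embedding_model`
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  -- Only the 001 version is supported, per the note above.
  OPTIONS (ENDPOINT = 'multimodalembedding@001');
```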

text embedding models

The following text embedding models are supported:

  • gemini-embedding-001, which supports both English and multilingual input. (Preview)
  • text-embedding
  • text-multilingual-embedding

For text-embedding and text-multilingual-embedding models, you must specify a supported model version.

After you create a remote model based on an embedding model, you can use the model with the AI.GENERATE_EMBEDDING function to generate embeddings from text data in a BigQuery table.
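A minimal text embedding model sketch follows. The dataset and connection names are hypothetical, and the versioned endpoint shown is illustrative; use a version listed in the supported model versions reference:

```sql
CREATE OR REPLACE MODEL `mydataset.text_embedding_model`
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  -- text-embedding models require an explicit, supported version suffix.
  OPTIONS (ENDPOINT = 'text-embedding-005');
```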

For information that can help you choose between the supported models, see Model information.

Retired models

The following Vertex AI models are retired as of April 9, 2025:

  • PaLM models
  • Gemini 1.0 models
  • textembedding-gecko@001 and textembedding-gecko@002 models

The following Vertex AI models are retired as of May 24, 2025:

  • gemini-1.5-pro-001 and gemini-1.5-flash-001 models
  • textembedding-gecko@003 and textembedding-gecko-multilingual@001 models

The following Vertex AI models are retired as of September 24, 2025:

  • gemini-1.5-pro-002 and gemini-1.5-flash-002 models

For more information on retired Vertex AI models, see Retired models.

PROMPT_COL

Syntax

PROMPT_COL = 'prompt_col'

Description

The name of the prompt column in the training data table to use when performing supervised tuning. If you don't specify a value for this option, you must have a column named or aliased as prompt in your input data. You can only use this option when performing supervised tuning with a supported model. If you specify this option, you must also specify the AS SELECT clause.

Arguments

A STRING value. The default value is prompt.

INPUT_LABEL_COLS

Syntax

INPUT_LABEL_COLS = input_label_cols

Description

The name of the label column in the training data table to use when performing supervised tuning. If you don't specify a value for this option, you must have a column named or aliased as label in your input data. You can only use this option when performing supervised tuning with a supported model. If you specify this option, you must also specify the AS SELECT clause.

Arguments

A one-element ARRAY<STRING> value. The default value is an empty array.

MAX_ITERATIONS

Syntax

MAX_ITERATIONS = max_iterations

Description

The number of steps to run when performing supervised tuning. You can only use this option when performing supervised tuning with a supported model. If you specify this option, you must also specify the AS SELECT clause.

When you use a Gemini model, BigQuery ML automatically converts the MAX_ITERATIONS value to epochs, which is what Gemini models use for training. The default value for MAX_ITERATIONS is the number of rows in the input data, which is equivalent to one epoch. To use multiple epochs, specify a multiple of the number of rows in your training data. For example, if you have 100 rows of input data and you want to use two epochs, specify 200 for the argument value. If you provide a value that isn't a multiple of the number of rows in the input data, BigQuery ML rounds up to the nearest epoch. For example, if you have 100 rows of input data and you specify 101 for the MAX_ITERATIONS value, training is performed with two epochs.

For more information about the parameters that are used to tune Gemini models, see Create a tuning job.

For more guidance on choosing the number of epochs for Gemini models, see Recommended configurations.

Arguments

An INT64 value greater than or equal to 1. Typically, 100 steps take about an hour to complete. The default value is 300.

LEARNING_RATE_MULTIPLIER

Syntax

LEARNING_RATE_MULTIPLIER = learning_rate_multiplier

Description

A multiplier to apply to the recommended learning rate when performing supervised tuning. You can only use this option with a remote model that targets a Vertex AI gemini-1.5-pro-002 or gemini-1.5-flash-002 model. If you specify this option, you must also specify the AS SELECT clause.

Arguments

A positive FLOAT64 value. The default value is 1.0.

DATA_SPLIT_METHOD

Syntax

DATA_SPLIT_METHOD = {'AUTO_SPLIT' | 'RANDOM' | 'CUSTOM' | 'SEQ' | 'NO_SPLIT'}

Description

The method used to split input data into training and evaluation sets when performing supervised tuning. You can only use this option when performing supervised tuning with a supported model. If you specify this option, you must also specify the AS SELECT clause.

Training data is used to train the model. Evaluation data is used to avoid overfitting by using early stopping.

The percentage sizes of the data sets produced by the various arguments for this option are approximate. Larger input data sets come closer to the percentages described than smaller input data sets do.

You can see the model's data split information in the following ways:

  • The data split method and percentage are shown in the Training Options section of the model's Details page on the BigQuery page of the Google Cloud console.
  • Links to temporary tables that contain the split data are available in the Model Details section of the model's Details page on the BigQuery page of the Google Cloud console. You can also return this information from the DataSplitResult field in the BigQuery API. These tables are saved for 48 hours. If you need this information for more than 48 hours, then you should export this data or copy it to permanent tables.

Arguments

This option accepts the following values:

  • AUTO_SPLIT: This is the default value. This option splits the data as follows:

    • If there are fewer than 500 rows in the input data, then all rows are used as training data.
    • If there are more than 500 rows in the input data, then data is randomized and split as follows:

      • If there are between 500 and 50,000 rows in the input data, then 20% of the data is used as evaluation data and 80% is used as training data.
      • If there are more than 50,000 rows, then 10,000 rows are used as evaluation data and the remaining rows are used as training data.
  • RANDOM: Data is randomized before being split into sets. To customize the data split, you can use this option with the DATA_SPLIT_EVAL_FRACTION option. If you don't specify that option, data is split in the same way as for the AUTO_SPLIT option.

    A random split is deterministic: different training runs produce the same split results if the same underlying training data is used.

    Note: A random split is based on the FARM_FINGERPRINT of the data (including the column name and schema), so tables with the same content but different column names and schemas might get different splitting and different evaluation metrics.
  • CUSTOM: Split data using the value provided in the DATA_SPLIT_COL option. The DATA_SPLIT_COL value must be the name of a column of type BOOL. Rows with a value of TRUE or NULL are used as evaluation data, and rows with a value of FALSE are used as training data.

  • SEQ: Split data sequentially by using the value in a specified column of one of the following types:

    • NUMERIC
    • BIGNUMERIC
    • STRING
    • TIMESTAMP

    The data is sorted smallest to largest based on the specified column.

    The last n rows are used as evaluation data, where n is determined by the value specified for DATA_SPLIT_EVAL_FRACTION. The remaining rows are used as training data.

    All rows with split values smaller than the threshold are used as training data. The remaining rows, including those with NULL values, are used as evaluation data.

    Use the DATA_SPLIT_COL option to identify the column that contains the data split information.

  • NO_SPLIT: No data split; all input data is used as training data.
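As an illustration of the CUSTOM method, the following sketch routes rows into the evaluation set with a hypothetical BOOL column named holdout; the dataset, connection, and table names are placeholders:

```sql
CREATE OR REPLACE MODEL `mydataset.custom_split_model`
  REMOTE WITH CONNECTION `myproject.us.my_connection`
  OPTIONS (
    ENDPOINT = 'gemini-2.0-flash-001',
    DATA_SPLIT_METHOD = 'CUSTOM',
    -- BOOL column: TRUE or NULL rows become evaluation data,
    -- FALSE rows become training data. Excluded from features.
    DATA_SPLIT_COL = 'holdout')
AS SELECT prompt, label, holdout FROM `mydataset.training_table`;
```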

DATA_SPLIT_EVAL_FRACTION

Syntax

DATA_SPLIT_EVAL_FRACTION = data_split_eval_fraction

Description

The fraction of the data to use as evaluation data when performing supervised tuning. Use when you specify RANDOM or SEQ as the value for the DATA_SPLIT_METHOD option. You can only use this option when performing supervised tuning with a supported model. If you specify this option, you must also specify the AS SELECT clause.

Arguments

A FLOAT64 value in the range [0, 1.0]. The default is 0.2. The service maintains the accuracy of the input value to two decimal places.

DATA_SPLIT_COL

Syntax

DATA_SPLIT_COL = 'data_split_col'

Description

The name of the column to use to sort input data into the training or evaluation set when performing supervised tuning. Use when you are specifying CUSTOM or SEQ as the value for DATA_SPLIT_METHOD. You can only use this option when performing supervised tuning with a supported model. If you specify this option, you must also specify the AS SELECT clause.

If you are specifying SEQ as the value for DATA_SPLIT_METHOD, then the data is first sorted smallest to largest based on the specified column. The last n rows are used as evaluation data, where n is the value specified for DATA_SPLIT_EVAL_FRACTION. The remaining rows are used as training data.

If you are specifying CUSTOM as the value for DATA_SPLIT_METHOD, then you must provide the name of a column of type BOOL. Rows with a value of TRUE or NULL are used as evaluation data, and rows with a value of FALSE are used as training data.

The column you specify for DATA_SPLIT_COL can't be used as a feature or label, and the column is excluded from features automatically.

Arguments

A STRING value.

EVALUATION_TASK

Syntax

EVALUATION_TASK = 'evaluation_task'

Description

When performing supervised tuning, the type of task that you want to tune the model to perform. You can only use this option when performing supervised tuning with a supported model. If you specify this option, you must also specify the AS SELECT clause.

Arguments

A STRING value. The valid options are the following:

  • TEXT_GENERATION
  • CLASSIFICATION
  • SUMMARIZATION
  • QUESTION_ANSWERING
  • UNSPECIFIED

The default value is UNSPECIFIED.

AS SELECT

Syntax

AS SELECT prompt_column, label_column FROM `project_id.dataset.table_name`

Description

Provides the training data to use when performing supervised tuning. You can only use this clause when performing supervised tuning with a supported model.

Arguments

  • prompt_column: The name of the column in the training data table that contains the prompt for evaluating the content in the label_column column. This column must be of STRING type or be cast to STRING. If you specify a value for the PROMPT_COL option, you must specify the same value for prompt_column. Otherwise this value must be prompt. If your table does not have a prompt column, use an alias to specify an existing table column. For example, SELECT hint AS prompt, label FROM mydataset.mytable.
  • label_column: The name of the column in the training data table that contains the examples to train the model with. This column must be of STRING type or be cast to STRING. If you specify a value for the INPUT_LABEL_COLS option, you must specify the same value for label_column. Otherwise this value must be label. If your table does not have a label column, use an alias to specify an existing table column. For example, SELECT prompt, feature AS label FROM mydataset.mytable.
  • project_id: The project ID of the project that contains the training data table.
  • dataset: The name of the dataset that contains the training data table. After optional data splitting, the number of rows in the training dataset must be greater than or equal to 10.
  • table_name: The name of the training data table.

Supervised tuning

If you create a remote model that references any of the following models, you can optionally configure supervised tuning at the same time:

  • gemini-2.5-pro
  • gemini-2.5-flash-lite
  • gemini-2.0-flash-001
  • gemini-2.0-flash-lite-001

To configure supervised tuning, specify the AS SELECT clause, and optionally some of the other CREATE MODEL arguments that affect supervised tuning. Supervised tuning lets you train the model on your own data to make it better suited for your use case. However, not all models have their performance improved by tuning. To learn more about whether tuning would make sense for your use case, see Use cases for using supervised fine-tuning.

After you create a tuned model, use the ML.EVALUATE function to evaluate whether the tuned model performs well for your use case. To learn more, try the Use tuning and evaluation to improve LLM performance tutorial.
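A minimal evaluation call might look like the following sketch. The model and table names are hypothetical, and depending on the task, ML.EVALUATE may also accept additional arguments (such as a task type STRUCT); see the ML.EVALUATE reference for the full signature:

```sql
SELECT *
FROM ML.EVALUATE(
  MODEL `mydataset.tuned_model`,
  -- Hypothetical evaluation table with the same prompt/label schema
  -- as the training data.
  (SELECT prompt, label FROM `mydataset.emotion_classification_eval`));
```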

Costs

When using supervised tuning with remote models over Vertex AI LLMs, costs are calculated based on the following:

  • The bytes processed from the training data table specified in the AS SELECT clause. These charges are billed from BigQuery to your project. For more information, see BigQuery pricing.
  • The number of tokens processed to tune the LLM. These charges are billed from Vertex AI to your project. For more information, see Vertex AI pricing.

Locations

For information about supported locations, see Locations for remote models.

Examples

The following examples create BigQuery ML remote models.

Create a model without tuning

The following example creates a BigQuery ML remote model over a Vertex AI model:

CREATE OR REPLACE MODEL `mydataset.flash_model`
  REMOTE WITH CONNECTION `myproject.us.test_connection`
  OPTIONS (ENDPOINT = 'gemini-2.0-flash-001');

Create a tuned model

The following example creates a BigQuery ML remote model over a tuned version of a Vertex AI model:

CREATE OR REPLACE MODEL `mydataset.tuned_model`
  REMOTE WITH CONNECTION `myproject.us.test_connection`
  OPTIONS (
    endpoint = 'gemini-2.0-flash-001',
    max_iterations = 500,
    prompt_col = 'prompt',
    input_label_cols = ['label'])
AS
SELECT
  CONCAT(
    'Please do sentiment analysis on the following text and only output a number from 0 to 5 where 0 means sadness, 1 means joy, 2 means love, 3 means anger, 4 means fear, and 5 means surprise. Text: ',
    text_column) AS prompt,
  sentiment_column AS label
FROM `mydataset.emotion_classification_train`;

Create a partner model that uses the default connection

The following example creates a BigQuery ML remote model over a Mistral AI model:

CREATE OR REPLACE MODEL `mydataset.mistral_model`
  REMOTE WITH CONNECTION DEFAULT
  OPTIONS (ENDPOINT = 'mistral-large-2411');

What's next

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-11-25 UTC.