The AI.GENERATE_TEXT function

This document describes the AI.GENERATE_TEXT function, a table-valued function that lets you perform generative natural language tasks by using any combination of text and unstructured data from BigQuery standard tables, or unstructured data from BigQuery object tables.

The function works by sending requests to a BigQuery ML remote model that represents a Vertex AI model, and then returning that model's response. The following types of remote models are supported:

Several of the AI.GENERATE_TEXT function's arguments provide the parameters that shape the Vertex AI model's response.

You can use the AI.GENERATE_TEXT function to perform tasks such as classification, sentiment analysis, image captioning, and transcription.

Prompt design can strongly affect the responses returned by the Vertex AI model. For more information, see Introduction to prompting or Design multimodal prompts.

Input

The input you can provide to AI.GENERATE_TEXT varies depending on the Vertex AI model that you reference from your remote model.

Input for Gemini models

When you use the Gemini models, you can use the following types of input:

When you analyze unstructured data, that data must meet the following requirements:

Input for other models

For all other types of models, you can analyze text data from a standard table.

Syntax for standard tables

AI.GENERATE_TEXT syntax differs depending on the Vertex AI model that your remote model references. Choose the option appropriate for your use case.

Gemini

AI.GENERATE_TEXT(
  MODEL `PROJECT_ID.DATASET.MODEL`,
  { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) },
  STRUCT(
    {
      {
        [MAX_OUTPUT_TOKENS AS max_output_tokens]
        [, TOP_P AS top_p]
        [, TEMPERATURE AS temperature]
        [, STOP_SEQUENCES AS stop_sequences]
        [, GROUND_WITH_GOOGLE_SEARCH AS ground_with_google_search]
        [, SAFETY_SETTINGS AS safety_settings]
      }
      |
      [, MODEL_PARAMS AS model_params]
    }
    [, REQUEST_TYPE AS request_type]
  )
)

Arguments

AI.GENERATE_TEXT takes the following arguments:

  • PROJECT_ID: the project that contains the resource.

  • DATASET: the dataset that contains the resource.

  • MODEL: the name of the remote model over the Vertex AI model. For more information about how to create this type of remote model, see The CREATE MODEL statement for remote models over LLMs.

    You can confirm which model is used by the remote model by opening the Google Cloud console and looking at the Remote endpoint field in the model details page.

    Note: Using a remote model based on a Gemini 2.5 model incurs charges for the thinking process.

  • TABLE: the name of the BigQuery table that contains the prompt data. The text in the column named prompt is sent to the model. If your table doesn't have a prompt column, use the QUERY_STATEMENT argument instead and provide a SELECT statement that includes an alias for an existing table column. An error occurs if no prompt column is available.

  • QUERY_STATEMENT: the GoogleSQL query that generates the prompt data. The query must produce a column named prompt. Within the query, you can provide the prompt value in the following ways:

    • Specify a STRING value. For example, ('Write a poem about birds').
    • Specify a STRUCT value that contains one or more fields. You can use the following types of fields within the STRUCT value:

      Field type: STRING
      Description: A string literal, or the name of a STRING column.
      Examples:
        • String literal: 'Is Seattle a US city?'
        • String column name: my_string_column

      Field type: ARRAY<STRING>
      Description: You can only use string literals in the array.
      Example:
        • Array of string literals: ['Is ', 'Seattle', ' a US city']

      Field type: ObjectRefRuntime
      Description: An ObjectRefRuntime value returned by the OBJ.GET_ACCESS_URL function. The OBJ.GET_ACCESS_URL function takes an ObjectRef value as input, which you can provide by either specifying the name of a column that contains ObjectRef values, or by constructing an ObjectRef value. ObjectRefRuntime values must have the access_url.read_url and details.gcs_metadata.content_type elements of the JSON value populated.
      Examples:
        • Function call with an ObjectRef column: OBJ.GET_ACCESS_URL(my_objectref_column, 'r')
        • Function call with a constructed ObjectRef value: OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image.jpg', 'myconnection'), 'r')

      Field type: ARRAY<ObjectRefRuntime>
      Description: ObjectRefRuntime values returned from multiple calls to the OBJ.GET_ACCESS_URL function. The OBJ.GET_ACCESS_URL function takes an ObjectRef value as input, which you can provide by either specifying the name of a column that contains ObjectRef values, or by constructing an ObjectRef value. ObjectRefRuntime values must have the access_url.read_url and details.gcs_metadata.content_type elements of the JSON value populated.
      Examples:
        • Function calls with ObjectRef columns: [OBJ.GET_ACCESS_URL(my_objectref_column1, 'r'), OBJ.GET_ACCESS_URL(my_objectref_column2, 'r')]
        • Function calls with constructed ObjectRef values: [OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image1.jpg', 'myconnection'), 'r'), OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image2.jpg', 'myconnection'), 'r')]

      The function combines STRUCT fields similarly to a CONCAT operation and concatenates the fields in their specified order. The same is true for the elements of any arrays used within the struct. The following examples show STRUCT prompt values and how they are interpreted:

      Struct field types: STRUCT<STRING>
      Struct value: ('Describe the city of Seattle')
      Semantic equivalent: 'Describe the city of Seattle'

      Struct field types: STRUCT<STRING, STRING, STRING>
      Struct value: ('Describe the city ', my_city_column, ' in 15 words')
      Semantic equivalent: 'Describe the city my_city_column_value in 15 words'

      Struct field types: STRUCT<STRING, ARRAY<STRING>>
      Struct value: ('Describe ', ['the city of', 'Seattle'])
      Semantic equivalent: 'Describe the city of Seattle'

      Struct field types: STRUCT<STRING, ObjectRefRuntime>
      Struct value: ('Describe this city', OBJ.GET_ACCESS_URL(image_objectref_column, 'r'))
      Semantic equivalent: 'Describe this city' followed by the referenced image

      Struct field types: STRUCT<STRING, ObjectRefRuntime, ObjectRefRuntime>
      Struct value: ('If the city in the first image is within the country of the second image, provide a ten word description of the city', OBJ.GET_ACCESS_URL(city_image_objectref_column, 'r'), OBJ.GET_ACCESS_URL(country_image_objectref_column, 'r'))
      Semantic equivalent: the prompt text followed by the city image and the country image
  Note: We recommend against using the LIMIT and OFFSET clauses in the prompt query. Using these clauses causes the query to process all of the input data first and then apply LIMIT and OFFSET.

  Note: To minimize Vertex AI charges, write query results to a table and then reference that table in the AI.GENERATE_TEXT function. This helps ensure that you send as few rows as possible to the model.
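For example, a query can pair a text instruction with an image referenced by an ObjectRef column. The following sketch shows the idea; the table, column, and struct field names are hypothetical:

```sql
SELECT *
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.gemini_model`,
  (
    SELECT STRUCT(
      -- Text instruction; the image is appended after it in prompt order.
      'Describe this product photo in one sentence: ' AS text_part,
      -- Convert the ObjectRef column into a readable ObjectRefRuntime value.
      OBJ.GET_ACCESS_URL(product_image_ref, 'r') AS image_part) AS prompt
    FROM `mydataset.products`
  ));
```

Because struct fields are concatenated in their specified order, the model receives the instruction text followed by the image content.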
  • MAX_OUTPUT_TOKENS: an INT64 value that sets the maximum number of tokens that can be generated in the response. A token might be smaller than a word and is approximately four characters. One hundred tokens correspond to approximately 60-80 words. This value must be in the range [1,8192]. Specify a lower value for shorter responses and a higher value for longer responses. The default is 1024.

  • TOP_P: a FLOAT64 value in the range [0.0,1.0] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. The default is 0.95.

    Tokens are selected from the most to the least probable until the sum of their probabilities equals the TOP_P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1, and the TOP_P value is 0.5, then the model selects either A or B as the next token by using the TEMPERATURE value and doesn't consider C.

  • TEMPERATURE: a FLOAT64 value in the range [0.0,1.0] that controls the degree of randomness in token selection. Lower TEMPERATURE values are good for prompts that require a more deterministic and less open-ended or creative response, while higher TEMPERATURE values can lead to more diverse or creative results. A TEMPERATURE value of 0 is deterministic, meaning that the highest probability response is always selected. The default is 0.

  • STOP_SEQUENCES: an ARRAY<STRING> value that removes the specified strings if they are included in responses from the model. Strings are matched exactly, including capitalization. The default is an empty array.

  • GROUND_WITH_GOOGLE_SEARCH: a BOOL value that determines whether the Vertex AI model uses grounding with Google Search when generating responses. Grounding lets the model use additional information from the internet to generate more specific and factual responses. The default is FALSE.

  • SAFETY_SETTINGS: an ARRAY<STRUCT<STRING AS category, STRING AS threshold>> value that configures content safety thresholds to filter responses. The first element in the struct specifies a harm category, and the second element specifies a corresponding blocking threshold. The model filters out content that violates these settings. You can only specify each category once. For example, you can't specify both STRUCT('HARM_CATEGORY_DANGEROUS_CONTENT' AS category, 'BLOCK_MEDIUM_AND_ABOVE' AS threshold) and STRUCT('HARM_CATEGORY_DANGEROUS_CONTENT' AS category, 'BLOCK_ONLY_HIGH' AS threshold). If there is no safety setting for a given category, the BLOCK_MEDIUM_AND_ABOVE safety setting is used.

    Supported categories are as follows:

    • HARM_CATEGORY_HATE_SPEECH
    • HARM_CATEGORY_DANGEROUS_CONTENT
    • HARM_CATEGORY_HARASSMENT
    • HARM_CATEGORY_SEXUALLY_EXPLICIT

    Supported thresholds are as follows:

    • BLOCK_NONE (Restricted)
    • BLOCK_LOW_AND_ABOVE
    • BLOCK_MEDIUM_AND_ABOVE (Default)
    • BLOCK_ONLY_HIGH
    • HARM_BLOCK_THRESHOLD_UNSPECIFIED

    For more information, refer to the definitions of safety category and blocking threshold.
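As a sketch of the SAFETY_SETTINGS format, thresholds can be tightened for individual categories as follows (the model and table names are hypothetical):

```sql
SELECT *
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.gemini_model`,
  TABLE `mydataset.prompt_table`,
  STRUCT(
    -- Each category appears at most once, paired with a blocking threshold.
    [STRUCT('HARM_CATEGORY_HATE_SPEECH' AS category,
            'BLOCK_LOW_AND_ABOVE' AS threshold),
     STRUCT('HARM_CATEGORY_DANGEROUS_CONTENT' AS category,
            'BLOCK_ONLY_HIGH' AS threshold)] AS safety_settings));
```

Categories not listed here fall back to the BLOCK_MEDIUM_AND_ABOVE default.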

  • REQUEST_TYPE: a STRING value that specifies the type of inference request to send to the Gemini model. The request type determines what quota the request uses. Valid values are as follows:

    • DEDICATED: the AI.GENERATE_TEXT function only uses Provisioned Throughput quota. The AI.GENERATE_TEXT function returns the error Provisioned throughput is not purchased or is not active if Provisioned Throughput quota isn't available.
    • SHARED: the AI.GENERATE_TEXT function only uses dynamic shared quota (DSQ), even if you have purchased Provisioned Throughput quota.
    • UNSPECIFIED: the AI.GENERATE_TEXT function uses quota as follows:

      • If you haven't purchased Provisioned Throughput quota, the AI.GENERATE_TEXT function uses DSQ quota.
      • If you have purchased Provisioned Throughput quota, the AI.GENERATE_TEXT function uses the Provisioned Throughput quota first. If requests exceed the Provisioned Throughput quota, the overflow traffic uses DSQ quota.

    The default value is UNSPECIFIED.
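For example, to force a request onto dynamic shared quota regardless of any Provisioned Throughput purchase, the request type can be set explicitly (model and table names are hypothetical):

```sql
SELECT *
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.gemini_model`,
  TABLE `mydataset.prompt_table`,
  -- SHARED routes this call to dynamic shared quota (DSQ) only.
  STRUCT('SHARED' AS request_type));
```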

  • MODEL_PARAMS: a JSON-formatted string literal that provides parameters to the model. The value must conform to the generateContent request body format. You can provide a value for any field in the request body except for the contents[] field. If you set this field, then you can't also specify any model parameters in the top-level struct argument to the AI.GENERATE_TEXT function. You must either specify every model parameter in the MODEL_PARAMS field, or omit this field and specify each parameter separately.
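As a sketch, model parameters can be supplied as a single JSON-formatted string instead of individual struct fields. The model and table names are hypothetical, and the field names inside the JSON must follow the generateContent request body format:

```sql
SELECT *
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.gemini_model`,
  TABLE `mydataset.prompt_table`,
  STRUCT(
    -- All model parameters go in one JSON document; you can't also pass
    -- temperature, max_output_tokens, and so on as separate struct fields.
    '{"generationConfig": {"temperature": 0.2, "maxOutputTokens": 256}}'
      AS model_params));
```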

Details

The model and input table must be in the same region.

Claude

You must enable Claude models in Vertex AI before you can use them. For more information, see Enable a partner model.

AI.GENERATE_TEXT(
  MODEL `PROJECT_ID.DATASET.MODEL`,
  { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) },
  STRUCT(
    {
      {
        [MAX_OUTPUT_TOKENS AS max_output_tokens]
        [, TOP_K AS top_k]
        [, TOP_P AS top_p]
      }
      |
      [, MODEL_PARAMS AS model_params]
    }
  )
)

Arguments

AI.GENERATE_TEXT takes the following arguments:

  • PROJECT_ID: the project that contains the resource.

  • DATASET: the dataset that contains the resource.

  • MODEL: the name of the remote model over the Vertex AI model. For more information about how to create this type of remote model, see The CREATE MODEL statement for remote models over LLMs.

    You can confirm which model is used by the remote model by opening the Google Cloud console and looking at the Remote endpoint field in the model details page.

  • TABLE: the name of the BigQuery table that contains the prompt data. The text in the column named prompt is sent to the model. If your table doesn't have a prompt column, use the QUERY_STATEMENT argument instead and provide a SELECT statement that includes an alias for an existing table column. An error occurs if no prompt column is available.

  • QUERY_STATEMENT: the GoogleSQL query that generates the prompt data. The query must produce a column named prompt.

  Note: We recommend against using the LIMIT and OFFSET clauses in the prompt query. Using these clauses causes the query to process all of the input data first and then apply LIMIT and OFFSET.

  Note: To minimize Vertex AI charges, write query results to a table and then reference that table in the AI.GENERATE_TEXT function. This helps ensure that you send as few rows as possible to the model.
  • MAX_OUTPUT_TOKENS: an INT64 value that sets the maximum number of tokens that can be generated in the response. A token might be smaller than a word and is approximately four characters. One hundred tokens correspond to approximately 60-80 words. This value must be in the range [1,4096]. Specify a lower value for shorter responses and a higher value for longer responses. The default is 1024.

  • TOP_K: an INT64 value in the range [1,40] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. If you don't specify a value, the model determines an appropriate value.

    A TOP_K value of 1 means the next selected token is the most probable among all tokens in the model's vocabulary, while a TOP_K value of 3 means that the next token is selected from among the three most probable tokens by using the TEMPERATURE value.

    For each token selection step, the TOP_K tokens with the highest probabilities are sampled. Then tokens are further filtered based on the TOP_P value, with the final token selected using temperature sampling.

  • TOP_P: a FLOAT64 value in the range [0.0,1.0] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. If you don't specify a value, the model determines an appropriate value.

    Tokens are selected from the most to the least probable until the sum of their probabilities equals the TOP_P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1, and the TOP_P value is 0.5, then the model selects either A or B as the next token by using the TEMPERATURE value and doesn't consider C.
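The sampling parameters above can be combined in one call. The following sketch uses hypothetical model and table names:

```sql
SELECT *
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.claude_model`,
  TABLE `mydataset.prompt_table`,
  -- Cap the response length and narrow the candidate-token pool.
  STRUCT(512 AS max_output_tokens, 20 AS top_k, 0.9 AS top_p));
```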

Details

The model and input table must be in the same region.

Llama

You must enable Llama models in Vertex AI before you can use them. For more information, see Enable a partner model.

AI.GENERATE_TEXT(
  MODEL `PROJECT_ID.DATASET.MODEL`,
  { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) },
  STRUCT(
    {
      {
        [MAX_OUTPUT_TOKENS AS max_output_tokens]
        [, TOP_P AS top_p]
        [, TEMPERATURE AS temperature]
        [, STOP_SEQUENCES AS stop_sequences]
      }
      |
      [, MODEL_PARAMS AS model_params]
    }
  )
)

Arguments

AI.GENERATE_TEXT takes the following arguments:

  • PROJECT_ID: the project that contains the resource.

  • DATASET: the dataset that contains the resource.

  • MODEL: the name of the remote model over the Vertex AI model. For more information about how to create this type of remote model, see The CREATE MODEL statement for remote models over LLMs.

    You can confirm which model is used by the remote model by opening the Google Cloud console and looking at the Remote endpoint field in the model details page.

  • TABLE: the name of the BigQuery table that contains the prompt data. The text in the column named prompt is sent to the model. If your table doesn't have a prompt column, use the QUERY_STATEMENT argument instead and provide a SELECT statement that includes an alias for an existing table column. An error occurs if no prompt column is available.

  • QUERY_STATEMENT: the GoogleSQL query that generates the prompt data. The query must produce a column named prompt.

  Note: We recommend against using the LIMIT and OFFSET clauses in the prompt query. Using these clauses causes the query to process all of the input data first and then apply LIMIT and OFFSET.

  Note: To minimize Vertex AI charges, write query results to a table and then reference that table in the AI.GENERATE_TEXT function. This helps ensure that you send as few rows as possible to the model.
  • MAX_OUTPUT_TOKENS: an INT64 value that sets the maximum number of tokens that can be generated in the response. A token might be smaller than a word and is approximately four characters. One hundred tokens correspond to approximately 60-80 words. This value must be in the range [1,4096]. Specify a lower value for shorter responses and a higher value for longer responses. The default is 1024.

  • TOP_P: a FLOAT64 value in the range [0.0,1.0] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. The default is 0.95.

    Tokens are selected from the most to the least probable until the sum of their probabilities equals the TOP_P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1, and the TOP_P value is 0.5, then the model selects either A or B as the next token by using the TEMPERATURE value and doesn't consider C.

  • TEMPERATURE: a FLOAT64 value in the range [0.0,1.0] that controls the degree of randomness in token selection. Lower TEMPERATURE values are good for prompts that require a more deterministic and less open-ended or creative response, while higher TEMPERATURE values can lead to more diverse or creative results. A TEMPERATURE value of 0 is deterministic, meaning that the highest probability response is always selected. The default is 0.

  • STOP_SEQUENCES: an ARRAY<STRING> value that removes the specified strings if they are included in responses from the model. Strings are matched exactly, including capitalization. The default is an empty array.
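A call combining these parameters might look like the following sketch (model and table names are hypothetical):

```sql
SELECT *
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.llama_model`,
  TABLE `mydataset.prompt_table`,
  -- Moderate randomness; stop generating at the 'END' marker if it appears.
  STRUCT(0.5 AS temperature, ['END'] AS stop_sequences));
```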

Details

The model and input table must be in the same region.

Mistral AI

You must enable Mistral AI models in Vertex AI before you can use them. For more information, see Enable a partner model.

AI.GENERATE_TEXT(
  MODEL `PROJECT_ID.DATASET.MODEL`,
  { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) },
  STRUCT(
    {
      {
        [MAX_OUTPUT_TOKENS AS max_output_tokens]
        [, TOP_P AS top_p]
        [, TEMPERATURE AS temperature]
        [, STOP_SEQUENCES AS stop_sequences]
      }
      |
      [, MODEL_PARAMS AS model_params]
    }
  )
)

Arguments

AI.GENERATE_TEXT takes the following arguments:

  • PROJECT_ID: the project that contains the resource.

  • DATASET: the dataset that contains the resource.

  • MODEL: the name of the remote model over the Vertex AI model. For more information about how to create this type of remote model, see The CREATE MODEL statement for remote models over LLMs.

    You can confirm which model is used by the remote model by opening the Google Cloud console and looking at the Remote endpoint field in the model details page.

  • TABLE: the name of the BigQuery table that contains the prompt data. The text in the column named prompt is sent to the model. If your table doesn't have a prompt column, use the QUERY_STATEMENT argument instead and provide a SELECT statement that includes an alias for an existing table column. An error occurs if no prompt column is available.

  • QUERY_STATEMENT: the GoogleSQL query that generates the prompt data. The query must produce a column named prompt.

  Note: We recommend against using the LIMIT and OFFSET clauses in the prompt query. Using these clauses causes the query to process all of the input data first and then apply LIMIT and OFFSET.

  Note: To minimize Vertex AI charges, write query results to a table and then reference that table in the AI.GENERATE_TEXT function. This helps ensure that you send as few rows as possible to the model.
  • MAX_OUTPUT_TOKENS: an INT64 value that sets the maximum number of tokens that can be generated in the response. A token might be smaller than a word and is approximately four characters. One hundred tokens correspond to approximately 60-80 words. This value must be in the range [1,4096]. Specify a lower value for shorter responses and a higher value for longer responses. The default is 1024.

  • TOP_P: a FLOAT64 value in the range [0.0,1.0] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. The default is 0.95.

    Tokens are selected from the most to the least probable until the sum of their probabilities equals the TOP_P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1, and the TOP_P value is 0.5, then the model selects either A or B as the next token by using the TEMPERATURE value and doesn't consider C.

  • TEMPERATURE: a FLOAT64 value in the range [0.0,1.0] that controls the degree of randomness in token selection. Lower TEMPERATURE values are good for prompts that require a more deterministic and less open-ended or creative response, while higher TEMPERATURE values can lead to more diverse or creative results. A TEMPERATURE value of 0 is deterministic, meaning that the highest probability response is always selected. The default is 0.

  • STOP_SEQUENCES: an ARRAY<STRING> value that removes the specified strings if they are included in responses from the model. Strings are matched exactly, including capitalization. The default is an empty array.
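A call combining these parameters might look like the following sketch (model and table names are hypothetical):

```sql
SELECT *
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.mistral_model`,
  TABLE `mydataset.prompt_table`,
  -- Short responses with a slightly narrowed token pool.
  STRUCT(256 AS max_output_tokens, 0.9 AS top_p));
```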

Details

The model and input table must be in the same region.

Open models

AI.GENERATE_TEXT(
  MODEL `PROJECT_ID.DATASET.MODEL`,
  { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) },
  STRUCT(
    {
      {
        [MAX_OUTPUT_TOKENS AS max_output_tokens]
        [, TOP_K AS top_k]
        [, TOP_P AS top_p]
        [, TEMPERATURE AS temperature]
      }
      |
      [, MODEL_PARAMS AS model_params]
    }
  )
)

Arguments

AI.GENERATE_TEXT takes the following arguments:

  • PROJECT_ID: the project that contains the resource.

  • DATASET: the dataset that contains the resource.

  • MODEL: the name of the remote model over the Vertex AI model. For more information about how to create this type of remote model, see The CREATE MODEL statement for remote models over LLMs.

    You can confirm which model is used by the remote model by opening the Google Cloud console and looking at the Remote endpoint field in the model details page.

  • TABLE: the name of the BigQuery table that contains the prompt data. The text in the column named prompt is sent to the model. If your table doesn't have a prompt column, use the QUERY_STATEMENT argument instead and provide a SELECT statement that includes an alias for an existing table column. An error occurs if no prompt column is available.

  • QUERY_STATEMENT: the GoogleSQL query that generates the prompt data. The query must produce a column named prompt.

  Note: We recommend against using the LIMIT and OFFSET clauses in the prompt query. Using these clauses causes the query to process all of the input data first and then apply LIMIT and OFFSET.

  Note: To minimize Vertex AI charges, write query results to a table and then reference that table in the AI.GENERATE_TEXT function. This helps ensure that you send as few rows as possible to the model.
  • MAX_OUTPUT_TOKENS: an INT64 value that sets the maximum number of tokens that can be generated in the response. A token might be smaller than a word and is approximately four characters. One hundred tokens correspond to approximately 60-80 words. This value must be in the range [1,4096]. Specify a lower value for shorter responses and a higher value for longer responses. If you don't specify a value, the model determines an appropriate value.

  • TOP_K: an INT64 value in the range [1,40] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. If you don't specify a value, the model determines an appropriate value.

    A TOP_K value of 1 means the next selected token is the most probable among all tokens in the model's vocabulary, while a TOP_K value of 3 means that the next token is selected from among the three most probable tokens by using the TEMPERATURE value.

    For each token selection step, the TOP_K tokens with the highest probabilities are sampled. Then tokens are further filtered based on the TOP_P value, with the final token selected using temperature sampling.

  • TOP_P: a FLOAT64 value in the range [0.0,1.0] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. If you don't specify a value, the model determines an appropriate value.

    Tokens are selected from the most to the least probable until the sum of their probabilities equals the TOP_P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1, and the TOP_P value is 0.5, then the model selects either A or B as the next token by using the TEMPERATURE value and doesn't consider C.

  • TEMPERATURE: a FLOAT64 value in the range [0.0,1.0] that controls the degree of randomness in token selection. Lower TEMPERATURE values are good for prompts that require a more deterministic and less open-ended or creative response, while higher TEMPERATURE values can lead to more diverse or creative results. A TEMPERATURE value of 0 is deterministic, meaning that the highest probability response is always selected. If you don't specify a value, the model determines an appropriate value.
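A call that sets all four sampling parameters might look like the following sketch (model and table names are hypothetical):

```sql
SELECT *
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.open_model`,
  TABLE `mydataset.prompt_table`,
  -- Sample from the 40 most probable tokens, further filtered by top_p,
  -- with moderate temperature and a 256-token response cap.
  STRUCT(256 AS max_output_tokens, 40 AS top_k, 0.8 AS top_p,
         0.7 AS temperature));
```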

Details

The model and input table must be in the same region.

Syntax for object tables

Use the following syntax to use AI.GENERATE_TEXT with Gemini models and object table data.

AI.GENERATE_TEXT(
  MODEL `PROJECT_ID.DATASET.MODEL`,
  { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) },
  STRUCT(
    PROMPT AS prompt
    {
      {
        [, MAX_OUTPUT_TOKENS AS max_output_tokens]
        [, TOP_P AS top_p]
        [, TEMPERATURE AS temperature]
        [, STOP_SEQUENCES AS stop_sequences]
        [, SAFETY_SETTINGS AS safety_settings]
      }
      |
      [, MODEL_PARAMS AS model_params]
    }
  )
)

Arguments

AI.GENERATE_TEXT takes the following arguments:

  • PROJECT_ID: the project that contains the resource.

  • DATASET: the dataset that contains the resource.

  • MODEL: the name of the remote model over the Vertex AI model. For more information about how to create this type of remote model, see The CREATE MODEL statement for remote models over LLMs.

    You can confirm which model is used by the remote model by opening the Google Cloud console and looking at the Remote endpoint field in the model details page.

    Note: Using a remote model based on a Gemini 2.5 model incurs charges for the thinking process.

  • TABLE: the name of the object table that contains the content to analyze. For more information about what types of content you can analyze, see Input.

    The Cloud Storage bucket used by the input object table must be in the same project where you have created the model and where you are calling the AI.GENERATE_TEXT function.

  • QUERY_STATEMENT: the GoogleSQL query that generates the image data. You can only specify WHERE and ORDER BY clauses in the query.

  • PROMPT: a STRING value that contains the prompt to use to analyze the visual content. The prompt value must contain less than 16,000 tokens. A token might be smaller than a word and is approximately four characters. One hundred tokens correspond to approximately 60-80 words.

  • MAX_OUTPUT_TOKENS: an INT64 value that sets the maximum number of tokens that can be generated in the response. This value must be in the range [1,8192]. Specify a lower value for shorter responses and a higher value for longer responses. The default is 1024.

  • TOP_P: a FLOAT64 value in the range [0.0,1.0] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. The default is 0.95.

    Tokens are selected from the most to the least probable until the sum of their probabilities equals the TOP_P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1, and the TOP_P value is 0.5, then the model selects either A or B as the next token by using the TEMPERATURE value and doesn't consider C.

  • TEMPERATURE: a FLOAT64 value in the range [0.0,1.0] that controls the degree of randomness in token selection. Lower TEMPERATURE values are good for prompts that require a more deterministic and less open-ended or creative response, while higher TEMPERATURE values can lead to more diverse or creative results. A TEMPERATURE value of 0 is deterministic, meaning that the highest probability response is always selected. The default is 0.

  • STOP_SEQUENCES: an ARRAY<STRING> value that removes the specified strings if they are included in responses from the model. Strings are matched exactly, including capitalization. The default is an empty array.

  • SAFETY_SETTINGS: an ARRAY<STRUCT<STRING AS category, STRING AS threshold>> value that configures content safety thresholds to filter responses. The first element in the struct specifies a harm category, and the second element specifies a corresponding blocking threshold. The model filters out content that violates these settings. You can only specify each category once. For example, you can't specify both STRUCT('HARM_CATEGORY_DANGEROUS_CONTENT' AS category, 'BLOCK_MEDIUM_AND_ABOVE' AS threshold) and STRUCT('HARM_CATEGORY_DANGEROUS_CONTENT' AS category, 'BLOCK_ONLY_HIGH' AS threshold). If there is no safety setting for a given category, the BLOCK_MEDIUM_AND_ABOVE safety setting is used.

    Supported categories are as follows:

    • HARM_CATEGORY_HATE_SPEECH
    • HARM_CATEGORY_DANGEROUS_CONTENT
    • HARM_CATEGORY_HARASSMENT
    • HARM_CATEGORY_SEXUALLY_EXPLICIT

    Supported thresholds are as follows:

    • BLOCK_NONE (Restricted)
    • BLOCK_LOW_AND_ABOVE
    • BLOCK_MEDIUM_AND_ABOVE (Default)
    • BLOCK_ONLY_HIGH
    • HARM_BLOCK_THRESHOLD_UNSPECIFIED

    For more information, refer to the definitions of safety category and blocking threshold.

  • MODEL_PARAMS: a JSON-formatted string literal that provides additional parameters to the model. The value must conform to the generateContent request body format. You can provide a value for any field in the request body except for the contents[] field. If you set this field, then you can't also specify any model parameters in the top-level struct argument to the AI.GENERATE_TEXT function.
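Putting the object-table arguments together, a call might look like the following sketch (the model and object table names are hypothetical):

```sql
SELECT uri, result
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.gemini_model`,
  -- An object table over a Cloud Storage bucket of images.
  TABLE `mydataset.image_object_table`,
  STRUCT('Describe the image in 20 words or fewer' AS prompt,
         0.2 AS temperature));
```

The prompt is applied to each object in the table, and the object table's uri column identifies which file each generated description belongs to.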

Details

The model and input table must be in the same region.

Output

AI.GENERATE_TEXT returns the input table plus the following columns:

Gemini API models

  • result: a STRING value that contains the text generated by the model.
  • rai_result: a JSON value that contains the responsible AI result, including safety attributes.
  • grounding_result: a JSON value that contains the result of the grounding check, if one is performed.
  • statistics: a JSON value that contains statistics about the generation process, such as token counts.
  • full_response: a JSON value that contains the complete JSON response from the Vertex AI API.
  • status: a STRING value that contains the status of the API call. An empty string indicates success.
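Because every output row carries a status column, successful generations can be separated from failed calls directly in SQL. A sketch, using hypothetical model and table names:

```sql
SELECT prompt, result
FROM AI.GENERATE_TEXT(
  MODEL `mydataset.gemini_model`,
  TABLE `mydataset.prompt_table`)
-- An empty status string indicates the API call succeeded.
WHERE status = '';
```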

Claude models

  • result: a STRING value that contains the text generated by the model.
  • full_response: a JSON value that contains the complete JSON response from the Vertex AI API.
  • status: a STRING value that contains the status of the API call. An empty string indicates success.

Llama models

  • result: a STRING value that contains the text generated by the model.
  • full_response: a JSON value that contains the complete JSON response from the Vertex AI API.
  • status: a STRING value that contains the status of the API call. An empty string indicates success.

Mistral AI models

  • result: a STRING value that contains the text generated by the model.
  • full_response: a JSON value that contains the complete JSON response from the Vertex AI API.
  • status: a STRING value that contains the status of the API call. An empty string indicates success.

Open models

  • result: a STRING value that contains the text generated by the model.
  • full_response: a JSON value that contains the complete JSON response from the Vertex AI API.
  • status: a STRING value that contains the status of the API call. An empty string indicates success.

Examples

Text analysis

Example 1

This example shows a request to a Claude model that provides a single prompt.

SELECT *
FROM
  AI.GENERATE_TEXT(
    MODEL `mydataset.claude_model`,
    (SELECT 'What is the purpose of dreams?' AS prompt));

Example 2

This example shows a request to a Gemini model that provides prompt data from a table column named question, which is aliased as prompt.

SELECT *
FROM
  AI.GENERATE_TEXT(
    MODEL `mydataset.gemini_model`,
    (SELECT question AS prompt FROM `mydataset.prompt_table`));

Example 3

This example shows a request to a Gemini model that concatenates strings and a table column to provide the prompt data.

SELECT *
FROM
  AI.GENERATE_TEXT(
    MODEL `mydataset.gemini_model`,
    (SELECT
       CONCAT(
         'Classify the sentiment of the following text as positive or negative. Text: ',
         input_column,
         ' Sentiment:') AS prompt
     FROM `mydataset.input_table`));

Example 4

This example shows a request to a Gemini model that excludes model responses that contain the strings Golf or football.

SELECT *
FROM
  AI.GENERATE_TEXT(
    MODEL `mydataset.gemini_model`,
    TABLE `mydataset.prompt_table`,
    STRUCT(
      .15 AS temperature,
      ['Golf', 'football'] AS stop_sequences));

Example 5

This example shows a request to a Gemini model with the following characteristics:

  • Provides prompt data from a table column that's named prompt.
  • Retrieves and returns public web data for response grounding.

SELECT *
FROM
  AI.GENERATE_TEXT(
    MODEL `mydataset.gemini_model`,
    TABLE `mydataset.prompt_table`,
    STRUCT(TRUE AS ground_with_google_search));

Example 6

This example shows a request to a Gemini model with the following characteristics:

  • Provides prompt data from a table column that's named prompt.
  • Uses the MODEL_PARAMS argument to pass model parameters as a JSON-formatted string.

SELECT *
FROM
  AI.GENERATE_TEXT(
    MODEL `mydataset.gemini_model`,
    TABLE `mydataset.prompt_table`,
    STRUCT(
      '''{
        "safety_settings": [
          { "category": "HARM_CATEGORY_HATE_SPEECH",
            "threshold": "BLOCK_LOW_AND_ABOVE" },
          { "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
            "threshold": "BLOCK_MEDIUM_AND_ABOVE" }
        ],
        "generation_config": {
          "max_output_tokens": 75,
          "thinking_config": {"thinking_budget": 0}
        }
      }''' AS model_params));

Visual content analysis

Example 1

This example adds product description information to a table by analyzing the object data in an ObjectRef column named image:

UPDATE mydataset.products
SET image_description = (
  SELECT result
  FROM
    AI.GENERATE_TEXT(
      MODEL `mydataset.gemini_model`,
      (SELECT ('Can you describe the following image?',
               OBJ.GET_ACCESS_URL(image, 'r')) AS prompt)))
WHERE image IS NOT NULL;

Example 2

This example analyzes visual content from an object table that's named dogs and identifies the breed of dog contained in the content. The returned content is filtered by the specified safety settings:

SELECT uri, result
FROM
  AI.GENERATE_TEXT(
    MODEL `mydataset.dog_identifier_model`,
    TABLE `mydataset.dogs`,
    STRUCT(
      'What is the breed of the dog?' AS prompt,
      .01 AS temperature,
      TRUE AS flatten_json_output,
      [STRUCT('HARM_CATEGORY_HATE_SPEECH' AS category,
              'BLOCK_LOW_AND_ABOVE' AS threshold),
       STRUCT('HARM_CATEGORY_DANGEROUS_CONTENT' AS category,
              'BLOCK_MEDIUM_AND_ABOVE' AS threshold)] AS safety_settings));

Audio content analysis

This example translates and transcribes audio content from an object table that's named feedback:

SELECT uri, result
FROM
  AI.GENERATE_TEXT(
    MODEL `mydataset.audio_model`,
    TABLE `mydataset.feedback`,
    STRUCT(
      'What is the content of this audio clip, translated into Spanish?' AS prompt,
      .01 AS temperature,
      TRUE AS flatten_json_output));

PDF content analysis

This example classifies PDF content from an object table that's named documents:

SELECT uri, result
FROM
  AI.GENERATE_TEXT(
    MODEL `mydataset.classify_model`,
    TABLE `mydataset.documents`,
    STRUCT(
      'Classify this document using the following categories: legal, tax-related, real estate' AS prompt,
      .2 AS temperature,
      TRUE AS flatten_json_output));

Use Vertex AI Provisioned Throughput

You can use Vertex AI Provisioned Throughput with the AI.GENERATE_TEXT function to provide consistent high throughput for requests. The remote model that you reference in the AI.GENERATE_TEXT function must use a supported Gemini model in order for you to use Provisioned Throughput.

To use Provisioned Throughput, calculate your Provisioned Throughput requirements and then purchase Provisioned Throughput quota before running the AI.GENERATE_TEXT function. When you purchase Provisioned Throughput, do the following:

  • For Model, select the same Gemini model as the one used by the remote model that you reference in the AI.GENERATE_TEXT function.
  • For Region, select the same region as the dataset that contains the remote model that you reference in the AI.GENERATE_TEXT function, with the following exceptions:

    • If the dataset is in the US multi-region, select the us-central1 region.
    • If the dataset is in the EU multi-region, select the europe-west4 region.

After you submit the order, wait for the order to be approved and to appear on the Orders page.

After you have purchased Provisioned Throughput quota, use the request_type argument to determine how the AI.GENERATE_TEXT function uses the quota.

Locations

AI.GENERATE_TEXT must run in the same region or multi-region as the remote model that the function references. See the following topics for more information:

Quotas

See Vertex AI and Cloud AI service functions quotas and limits.

Known issues

This section contains information about known issues.

Resource exhausted errors

Sometimes after a query job that uses this function finishes successfully, some returned rows contain the following error message:

A retryable error occurred: RESOURCE EXHAUSTED error from <remote endpoint>

This issue occurs because BigQuery query jobs finish successfully even if the function fails for some of the rows. The function fails when the volume of API calls to the remote endpoint exceeds the quota limits for that service. This issue occurs most often when you are running multiple parallel batch queries. BigQuery retries these calls, but if the retries fail, the resource exhausted error message is returned.

To iterate through inference calls until all rows are successfully processed, you can use the BigQuery remote inference SQL scripts or the BigQuery remote inference pipeline Dataform package. To try the BigQuery ML remote inference SQL script, see Handle quota errors by calling AI.GENERATE_TEXT iteratively.
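The core idea behind those scripts can be sketched outside of SQL: re-run inference only for the rows whose status column is non-empty, and loop until every row succeeds or a pass limit is reached. The generate_text function and its error payload below are hypothetical stand-ins for one AI.GENERATE_TEXT row, not the real API; here each prompt fails once to simulate a retryable quota error.

```python
# Hypothetical stand-in for one AI.GENERATE_TEXT row. To keep the sketch
# deterministic, the first call for each prompt fails with a retryable
# quota error, mimicking a RESOURCE EXHAUSTED response.
_failed_once = set()

def generate_text(prompt):
    if prompt not in _failed_once:
        _failed_once.add(prompt)
        return {"result": None,
                "status": "A retryable error occurred: RESOURCE EXHAUSTED"}
    return {"result": f"answer to: {prompt}", "status": ""}

def run_until_complete(prompts, max_passes=10):
    # Start with every row pending, then retry only the failed rows,
    # which is the loop the linked SQL script performs in BigQuery.
    rows = {p: {"result": None, "status": "pending"} for p in prompts}
    for _ in range(max_passes):
        pending = [p for p, row in rows.items() if row["status"] != ""]
        if not pending:
            break
        for p in pending:
            rows[p] = generate_text(p)
    return rows

rows = run_until_complete(["q1", "q2"])
print(rows["q1"])  # {'result': 'answer to: q1', 'status': ''}
```

The pass limit matters: without it, a persistent (non-retryable) failure would loop forever, so the real scripts also cap the number of iterations.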

What's next

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-11-24 UTC.