Generate ML predictions using the Spanner emulator

This page describes how to generate ML predictions using the Spanner emulator for GoogleSQL-dialect databases and PostgreSQL-dialect databases.

Spanner Vertex AI integration can be used with the Spanner emulator to generate predictions using the GoogleSQL or PostgreSQL ML predict functions. The emulator is a binary that mimics a Spanner server, and can also be used in unit and integration testing. You can use the emulator as an open source project or locally using the Google Cloud CLI. To learn more about the ML predict functions, see How does Spanner Vertex AI integration work?.

You can use any model with the emulator to generate predictions. You can also use a model from the Vertex AI Model Garden or a model deployed to your Vertex AI endpoint. Because the emulator doesn't connect to Vertex AI, it can't verify the model or its schema for any model used from the Vertex AI Model Garden or deployed to a Vertex AI endpoint.

By default, when you use a prediction function with the emulator, the function yields a random value based on the provided model inputs and model output schema. You can use a callback function to modify the model input and output, and generate prediction results based on specific behaviors.
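As a sketch of that default behavior (the emulator's actual implementation is in C++; the helper below and its type names are illustrative, not the emulator's API), a prediction can be simulated by generating a random value for each column declared in the model's output schema:

```python
import random
import string

def random_prediction(output_schema):
    """Simulate the emulator's default behavior: return a random value
    for each output column, constrained only by the column's declared type."""
    generators = {
        "BOOL": lambda: random.choice([True, False]),
        "INT64": lambda: random.randint(0, 1_000_000),
        "STRING": lambda: "".join(random.choices(string.ascii_letters, k=8)),
    }
    return {col: generators[typ]() for col, typ in output_schema.items()}

# The FraudDetection model on this page declares OUTPUT (Outcome BOOL),
# so the emulated result is a single random boolean.
result = random_prediction({"Outcome": "BOOL"})
print(type(result["Outcome"]).__name__)  # bool
```

This is why, without a callback, the value returned by the emulator is well-typed but not meaningful.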

Before you begin

Complete the following steps before you use the Spanner emulator to generate ML predictions.

Install the Spanner emulator

You can either install the emulator locally or set it up using the GitHub repository.

Select a model

When you use the ML.PREDICT (for GoogleSQL) or the ML_PREDICT_ROW (for PostgreSQL) function, you must specify the location of the ML model. You can use any trained model. If you select a model that is running in the Vertex AI Model Garden or a model that is deployed to your Vertex AI endpoint, you must provide the input and output values for these models.

To learn more about Spanner Vertex AI integration, see How does Spanner Vertex AI integration work?.

Generate predictions

You can use the emulator to generate predictions using the Spanner ML predict functions.

Default behavior

You can use any model deployed to an endpoint with the Spanner emulator to generate predictions. The following example uses a model called FraudDetection to generate a result.

GoogleSQL

To learn more about how to use the ML.PREDICT function to generate predictions, see Generate ML predictions using SQL.

Register the model

Before you can use a model with the ML.PREDICT function, you must register the model using the CREATE MODEL statement and provide the input and output values:

CREATE MODEL FraudDetection
INPUT (Amount INT64, Name STRING(MAX))
OUTPUT (Outcome BOOL)
REMOTE OPTIONS (
  endpoint = '//aiplatform.googleapis.com/projects/PROJECT_ID/locations/REGION_ID/endpoints/ENDPOINT_ID'
);

Replace the following:

  • PROJECT_ID: the ID of the Google Cloud project that the model is located in

  • REGION_ID: the ID of the Google Cloud region that the model is located in, for example us-central1

  • ENDPOINT_ID: the ID of the model endpoint
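Substituting sample values (the IDs below are hypothetical) produces an endpoint string with the shape the CREATE MODEL statement expects:

```python
# Hypothetical replacement values for the placeholders above.
project_id = "my-project"    # PROJECT_ID
region_id = "us-central1"    # REGION_ID
endpoint_id = "1234567890"   # ENDPOINT_ID

# Assemble the endpoint URI used in the REMOTE OPTIONS clause.
endpoint = (
    "//aiplatform.googleapis.com/"
    f"projects/{project_id}/locations/{region_id}/endpoints/{endpoint_id}"
)
print(endpoint)
```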

Run the prediction

Use the ML.PREDICT GoogleSQL function to generate your prediction.

SELECT Outcome
FROM ML.PREDICT(
  MODEL FraudDetection,
  (SELECT 1000 AS Amount, "John Smith" AS Name));

The expected output of this query is TRUE.

PostgreSQL

To learn more about how to use the spanner.ML_PREDICT_ROW function to generate predictions, see Generate ML predictions using SQL.

Run the prediction

Use the spanner.ML_PREDICT_ROW PostgreSQL function to generate your prediction.

SELECT (spanner.ml_predict_row(
  'projects/PROJECT_ID/locations/REGION_ID/endpoints/ENDPOINT_ID',
  '{"instances": [{"Amount": "1000", "Name": "John Smith"}]}'
)->'predictions'->0->'Outcome')::boolean;

Replace the following:

  • PROJECT_ID: the ID of the Google Cloud project that the model is located in

  • REGION_ID: the ID of the Google Cloud region that the model is located in, for example us-central1

  • ENDPOINT_ID: the ID of the model endpoint

The expected output of this query is TRUE.
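The second argument to spanner.ml_predict_row and the value it returns are both JSON. A sketch of the shapes involved (the response value shown is illustrative, since by default the emulator generates it randomly):

```python
import json

# The request body passed as the second argument to spanner.ml_predict_row.
request = json.loads('{"instances": [{"Amount": "1000", "Name": "John Smith"}]}')

# Shape of a response. The SQL navigation ->'predictions'->0->'Outcome'
# corresponds to this Python navigation; True here is an illustrative value.
response = {"predictions": [{"Outcome": True}]}
outcome = response["predictions"][0]["Outcome"]
print(outcome)
```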

Custom callback

You can use a custom callback function to implement selected model behaviors, and to transform specific model inputs to outputs. The following example uses the gemini-pro model from the Vertex AI Model Garden and the Spanner emulator to generate predictions using a custom callback.

When using a custom callback for a model, you must fork the Spanner emulator repository, then build and deploy it. For more information on how to build and deploy the Spanner emulator, see the Spanner emulator quickstart.

GoogleSQL

Register the model

Before you can use a model with the ML.PREDICT function, you must register the model using the CREATE MODEL statement:

CREATE MODEL GeminiPro
INPUT (prompt STRING(MAX))
OUTPUT (content STRING(MAX))
REMOTE OPTIONS (
  endpoint = '//aiplatform.googleapis.com/projects/PROJECT_ID/locations/REGION_ID/publishers/google/models/gemini-pro',
  default_batch_size = 1
);

Because the emulator doesn't connect to Vertex AI, you must provide the input and output values.

Replace the following:

  • PROJECT_ID: the ID of the Google Cloud project that the model is located in

  • REGION_ID: the ID of the Google Cloud region that the model is located in, for example us-central1

Callback

Use a callback to add custom logic to the GeminiPro model.

absl::Status ModelEvaluator::Predict(
    const googlesql::Model* model,
    const CaseInsensitiveStringMap<const ModelColumn>& model_inputs,
    CaseInsensitiveStringMap<ModelColumn>& model_outputs) {
  // Custom logic for GeminiPro.
  if (model->Name() == "GeminiPro") {
    RET_CHECK(model_inputs.contains("prompt"));
    RET_CHECK(model_inputs.find("prompt")->second.value->type()->IsString());
    RET_CHECK(model_outputs.contains("content"));
    std::string content;
    // Process prompts used in tests.
    int64_t number;
    static LazyRE2 is_prime_prompt = {R"(Is (\d+) a prime number\?)"};
    if (RE2::FullMatch(
            model_inputs.find("prompt")->second.value->string_value(),
            *is_prime_prompt, &number)) {
      content = IsPrime(number) ? "Yes" : "No";
    } else {
      // Default response.
      content = "Sorry, I don't understand";
    }
    *model_outputs["content"].value = googlesql::values::String(content);
    return absl::OkStatus();
  }
  // Custom model prediction logic can be added here.
  return DefaultPredict(model, model_inputs, model_outputs);
}
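The callback's prompt handling can be sketched in Python (same regex and same branching as the C++ above; the trial-division primality check below stands in for the callback's IsPrime helper, whose implementation isn't shown on this page):

```python
import re

# Same pattern the C++ callback matches with RE2::FullMatch.
PRIME_PROMPT = re.compile(r"Is (\d+) a prime number\?")

def is_prime(n: int) -> bool:
    """Trial-division stand-in for the callback's IsPrime helper."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def gemini_pro_callback(prompt: str) -> str:
    """Mirror the callback: answer prime-number prompts, else fall back."""
    match = PRIME_PROMPT.fullmatch(prompt)
    if match:
        return "Yes" if is_prime(int(match.group(1))) else "No"
    # Default response.
    return "Sorry, I don't understand"

print(gemini_pro_callback("Is 7 a prime number?"))  # Yes
print(gemini_pro_callback("Tell me a joke"))        # Sorry, I don't understand
```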

Run the prediction

Use the ML.PREDICT GoogleSQL function to generate your prediction.

SELECT content
FROM ML.PREDICT(
  MODEL GeminiPro,
  (SELECT "Is 7 a prime number?" AS prompt));

The expected output of this query is "Yes".

PostgreSQL

Use the spanner.ML_PREDICT_ROW PostgreSQL function to generate your prediction.

Callback

Use a callback to add custom logic to the GeminiPro model.

absl::Status ModelEvaluator::PgPredict(
    absl::string_view endpoint,
    const googlesql::JSONValueConstRef& instance,
    const googlesql::JSONValueConstRef& parameters,
    googlesql::JSONValueRef prediction) {
  if (endpoint.ends_with("publishers/google/models/gemini-pro")) {
    RET_CHECK(instance.IsObject());
    RET_CHECK(instance.HasMember("prompt"));
    std::string content;
    // Process prompts used in tests.
    int64_t number;
    static LazyRE2 is_prime_prompt = {R"(Is (\d+) a prime number\?)"};
    if (RE2::FullMatch(instance.GetMember("prompt").GetString(),
                       *is_prime_prompt, &number)) {
      content = IsPrime(number) ? "Yes" : "No";
    } else {
      // Default response.
      content = "Sorry, I don't understand";
    }
    prediction.SetToEmptyObject();
    prediction.GetMember("content").SetString(content);
    return absl::OkStatus();
  }
  // Custom model prediction logic can be added here.
  return DefaultPgPredict(endpoint, instance, parameters, prediction);
}

Run the prediction

SELECT (spanner.ml_predict_row(
  'projects/PROJECT_ID/locations/REGION_ID/publishers/google/models/gemini-pro',
  '{"instances": [{"prompt": "Is 7 a prime number?"}]}'
)->'predictions'->0->'content')::text;

Replace the following:

  • PROJECT_ID: the ID of the Google Cloud project that the model is located in

  • REGION_ID: the ID of the Google Cloud region that the model is located in, for example us-central1

The expected output of this query is "Yes".


Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.