Data-driven prompt optimizer

This document describes how to use the data-driven optimizer to automatically optimize prompt performance by improving the system instructions for a set of prompts.

The data-driven optimizer can help you improve your prompts quickly at scale, without manually rewriting system instructions or individual prompts. This is especially useful when you want to use system instructions and prompts that were written for one model with a different model.

Prompt optimization example

For example, to optimize system instructions for a set of prompts that reference contextual information to answer questions about cooking, you can use the data-driven optimizer. To complete this task, you would prepare inputs similar to the following:

System instructions

You are a professional chef. Your goal is teaching how to cook healthy cooking recipes to your apprentice. Given a question from your apprentice and some context, provide the correct answer to the question. Use the context to return a single and correct answer with some explanation.

Prompt template

Question: {input_question}
Facts: {input_context}

Sample prompts

`input_question`	`input_context`
What are some techniques for cooking red meat and pork that maximize flavor and tenderness while minimizing the formation of unhealthy compounds?	Red meat and pork should be cooked to an internal temperature of 145 degrees Fahrenheit (63 degrees Celsius) to ensure safety. Marinating meat in acidic ingredients like lemon juice or vinegar can help tenderize it by breaking down tough muscle fibers. High-heat cooking methods like grilling and pan-searing can create delicious browning and caramelization, but it's important to avoid charring, which can produce harmful compounds.
What are some creative ways to add flavor and nutrition to protein shakes without using added sugars or artificial ingredients?	Adding leafy greens like spinach or kale is a great way to boost the nutritional value of your shake without drastically altering the flavor. Using unsweetened almond milk or coconut water instead of regular milk can add a subtle sweetness and a boost of healthy fats or electrolytes, respectively. Did you know that over-blending your shake can actually heat it up? To keep things cool and refreshing, blend for shorter bursts and give your blender a break if needed.

Optimized system instructions

As a highly skilled chef with a passion for healthy cooking, you love sharing your knowledge with aspiring chefs. Today, a culinary intern approaches you with a question about healthy cooking. Given the intern's question and some facts, provide a clear, concise, and informative answer that will help the intern excel in their culinary journey.

How optimization works

The data-driven optimizer takes the following parameters:

Optimization mode: Specifies what to optimize. It can be one of the following:
- instruction: Optimizes the system instruction.
- demonstration: Selects sample prompts to add to the system instruction as few-shot examples.
- instruction_and_demo: Performs both of the above actions.
Evaluation metrics: The metrics that the data-driven optimizer uses to optimize the system instructions, select sample prompts, or both.
Target model: The Google model for which the data-driven optimizer optimizes the system instructions and selects sample prompts.

When you run the data-driven optimizer, it optimizes the system instructions based on your selections by running a custom training job. The job iteratively evaluates your sample prompts and rewrites your system instructions to find the version that produces the best evaluation score for the target model.

At the end of the job, the data-driven optimizer outputs the optimized system instructions with their evaluation score.
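
The loop described above can be pictured with a small sketch. It is not the optimizer's actual algorithm, which this document does not specify. The function names `optimize_instructions`, `toy_propose`, and `toy_score` are hypothetical, and the proposal and scoring steps are toy stand-ins for the model rewrites and metric evaluation that the real job performs.

```python
from typing import Callable, List, Tuple


def optimize_instructions(
    seed_instruction: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    num_steps: int = 10,
) -> Tuple[str, float]:
    """Keep the best-scoring instruction found across num_steps rounds."""
    best, best_score = seed_instruction, score(seed_instruction)
    for _ in range(num_steps):
        # Candidates are rewrites of the current best instruction.
        for candidate in propose(best):
            candidate_score = score(candidate)
            if candidate_score > best_score:  # a higher score is better
                best, best_score = candidate, candidate_score
    return best, best_score


# Toy stand-ins (assumptions): a real job would ask a model for rewrites and
# score each one by running the sample prompts through the evaluation metrics.
def toy_propose(instruction: str) -> List[str]:
    return [instruction + " Be concise.", instruction + " Cite the facts."]


def toy_score(instruction: str) -> float:
    return float(sum(keyword in instruction for keyword in ("concise", "facts")))


best, best_score = optimize_instructions(
    "You are a chef.", toy_propose, toy_score, num_steps=3
)
```

The comparison uses a strict "greater than" so that the loop only replaces the current instruction when a candidate measurably improves the score, which matches the requirement that higher scores indicate better performance.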

Evaluation metrics

The data-driven optimizer uses evaluation metrics to optimize system instructions and select sample prompts. You can use the standard evaluation metrics or define your own custom evaluation metrics. Note: For every evaluation metric, a higher score must indicate better performance.

You can use multiple metrics at a time. However, only one of them can be a custom metric. If you use standard and custom metrics together, all of the other metrics must be standard metrics.

To learn how to specify metrics one at a time or in combination, see EVALUATION_METRIC_PARAMETERS in the SDK tab in Create a configuration.
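
When several metrics are combined, each one contributes a weighted score to a single number. The configuration section of this document names two aggregation types, weighted_sum and weighted_average. The sketch below shows one plausible reading of those names. The formulas and the helper name `aggregate_scores` are assumptions for illustration, not the service's documented behavior, and the scores are made up.

```python
from typing import Dict


def aggregate_scores(
    scores: Dict[str, float],
    weights: Dict[str, float],
    aggregation_type: str = "weighted_sum",
) -> float:
    """Combine per-metric scores into one number (assumed formulas)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same metrics")
    total = sum(weights[name] * scores[name] for name in scores)
    if aggregation_type == "weighted_sum":
        return total
    if aggregation_type == "weighted_average":
        weight_total = sum(weights.values())
        if weight_total == 0:
            raise ValueError("weights must not sum to zero")
        return total / weight_total
    raise ValueError(f"unknown aggregation type: {aggregation_type!r}")


# Made-up scores for two standard metrics that both range from 0 to 1.
example_scores = {"bleu": 0.4, "exact_match": 1.0}
example_weights = {"bleu": 1.0, "exact_match": 3.0}
summed = aggregate_scores(example_scores, example_weights, "weighted_sum")
averaged = aggregate_scores(example_scores, example_weights, "weighted_average")
```

Under this reading, the weighted average stays on the same scale as the individual metrics, while the weighted sum grows with the size of the weights.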

Custom evaluation metrics

Custom metrics are useful when standard metrics don't fit your application. Note that the data-driven optimizer supports only one custom metric at a time.

To learn how to create custom metrics, see Create custom metrics.

Standard evaluation metrics

The data-driven optimizer supports custom evaluation metrics, and additionally supports the following evaluation metrics:

Metric type	Use case	Metric	Description
Model-based	Summarization	`summarization_quality`	Describes the model's ability to summarize text.
	Question answering	`question_answering_correctness`^*	Describes the model's ability to correctly answer a question.
	Question answering	`question_answering_quality`	Describes the model's ability to answer questions given a body of text to reference.
	Coherence	`coherence`	Describes the model's ability to provide a coherent response and measures how well the generated text flows logically and makes sense.
	Safety	`safety`	Describes the model's level of safety, that is, whether the response contains any unsafe text.
	Fluency	`fluency`	Describes the model's language mastery.
	Groundedness	`groundedness`	Describes the model's ability to provide or reference information included only in the input text.
	Comet	`comet`^**	Describes the quality of the model's translation measured against the reference.
	MetricX	`metricx`^**	Describes the quality of the model's translation.
Computation-based	Tool use and function calling	`tool_call_valid`^*	Describes the model's ability to predict a valid tool call.
		`tool_name_match`^*	Describes the model's ability to predict a tool call with the correct tool name. Only the first tool call is inspected.
		`tool_parameter_key_match`^*	Describes the model's ability to predict a tool call with the correct parameter names.
		`tool_parameter_kv_match`^*	Describes the model's ability to predict a tool call with the correct parameter names and key values.
	General text generation	`bleu`^*	Holds the result of an algorithm for evaluating the quality of the prediction, which has been translated from one natural language to another natural language. The quality of the prediction is considered to be the correspondence between a prediction parameter and its reference parameter.
		`exact_match`^*	Computes whether a prediction parameter matches a reference parameter exactly.
		`rouge_1`^*	Used to compare the provided prediction parameter against a reference parameter.
		`rouge_2`^*
		`rouge_l`^*
		`rouge_l_sum`^*

^* If you want to optimize your prompts using question_answering_correctness or the computation-based evaluations, you must do one of the following:

Add a variable that represents the ground truth response for your prompts to your prompt template.
If you don't have ground truth responses for your prompts, but you previously used the prompts with a Google model and achieved your targeted results, you can add the source_model parameter to your configuration instead of adding ground truth responses. When the source_model parameter is set, the data-driven optimizer runs your sample prompts on the source model to generate the ground truth responses for you.
The source_model parameter should only be used for model upgrade or migration.

^** If you want to optimize your prompts using comet or metricx, you must add the translation_source_field_name parameter to your configuration. This parameter specifies the name of the field in your data that holds the source text. Also, the MetricX value is rescaled to range from 0 (worst) to 25 (best) so that a higher score indicates better performance.
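
To see why the computation-based metrics need a ground truth response, it helps to look at how such metrics compare a prediction against a reference. The sketch below is a simplified illustration, not the implementation used by the evaluation service. `exact_match` here is a plain string comparison and `unigram_f1` is a hypothetical stand-in for overlap-style metrics such as ROUGE-1.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 when the prediction equals the reference exactly."""
    return 1.0 if prediction == reference else 0.0


def unigram_f1(prediction: str, reference: str) -> float:
    """F-measure over lowercased, whitespace-separated tokens (simplified)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Both functions return higher values for better predictions, which is the property that every metric used by the optimizer must have.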

Before you begin

To ensure that the Compute Engine default service account has the necessary permissions to optimize prompts, ask your administrator to grant the Compute Engine default service account the following IAM roles on the project:

Important: You must grant these roles to the Compute Engine default service account, not to your user account. Failure to grant the roles to the correct principal might result in permission errors.

Vertex AI User (roles/aiplatform.user)
Storage Object Admin (roles/storage.objectAdmin)
Artifact Registry Reader (roles/artifactregistry.reader)
If using custom metrics:
- Cloud Run Developer (roles/run.developer)
- Cloud Run Invoker (roles/run.invoker)
- Vertex AI Service Agent (roles/aiplatform.serviceAgent)

For more information about granting roles, see Manage access to projects, folders, and organizations.

Your administrator might also be able to give the Compute Engine default service account the required permissions through custom roles or other predefined roles.

Optimize prompts

You can optimize prompts in the following ways:

Using the Vertex AI prompt optimizer in the Vertex AI Console
Using the Vertex AI API
Running the Vertex AI prompt optimizer notebook

To optimize prompts, choose which method you want to use, then complete the steps described in the following sections.

Tip: For first-time users, we recommend running the Vertex AI prompt optimizer in the Vertex AI Console. The console provides a more interactive experience than the Vertex AI API.

Create a prompt template and system instructions

Prompt templates define the format of all of your prompts through replaceable variables. When you use a prompt template to optimize prompts, the variables are replaced by the data in the prompt dataset.

Prompt template variables must meet the following requirements:

Variables must be wrapped in curly braces.
Variable names must not contain spaces or dashes (-).
Variables that represent multimodal inputs must include the MIME_TYPE string after the variable:
```
@@@MIME_TYPE
```
Replace MIME_TYPE with an image, video, audio, or document MIME type that is supported by the target model.
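
The variable rules above can be checked locally before you submit a job. The following sketch finds the variables in a template, rejects names that contain spaces or dashes, and fills the template from one row of sample data. The helper names `find_variables` and `render` are hypothetical, and the substitution only illustrates the idea. It is not how the optimizer itself renders prompts.

```python
import re
from typing import Dict, List

# Matches a variable wrapped in curly braces, such as {question}.
VARIABLE_PATTERN = re.compile(r"\{([^{}]*)\}")


def find_variables(template: str) -> List[str]:
    """Return the variable names in a template, validating each name."""
    names = VARIABLE_PATTERN.findall(template)
    for name in names:
        if not name or " " in name or "-" in name:
            raise ValueError(f"invalid variable name: {name!r}")
    return names


def render(template: str, row: Dict[str, str]) -> str:
    """Replace each {variable} with the matching value from row."""
    find_variables(template)
    # A function replacement avoids special handling of backslashes in values.
    return VARIABLE_PATTERN.sub(lambda match: row[match.group(1)], template)


template = "Question: {input_question}\nFacts: {input_context}"
prompt = render(
    template,
    {"input_question": "How long should I rest a steak?",
     "input_context": "Resting lets the juices redistribute."},
)
```

A multimodal template such as `{image_1} @@@image/jpeg` passes the same check, because the MIME type suffix sits outside the curly braces.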

Create a prompt template and system instructions using one of the following methods:

Notebook

If you want to run the data-driven optimizer through the notebook, create system instructions and a prompt template by doing the following:

In Colab Enterprise, open the Vertex AI prompt optimizer notebook.
Go to Vertex AI prompt optimizer notebook
In the Create a prompt template and system instructions section, do the following:
1. In the SYSTEM_INSTRUCTION field, enter your system instructions. For example:
```
Based on the following images and articles respond to the questions.'\n' Be concise,and answer \"I don't know\" if the response cannot be found in the provided articles or images.
```
2. In the PROMPT_TEMPLATE field, enter your prompt template. For example:
```
Article 1:\n\n{article_1}\n\nImage 1:\n\n{image_1} @@@image/jpeg\n\nQuestion: {question}
```
3. If you want to optimize your prompts using question_answering_correctness or the computation-based evaluations, you must do one of the following:
- Add the {target} variable to the prompt template, to represent the prompt's ground truth response. For example:
```
Article 1:\n\n{article_1}\n\nImage 1:\n\n{image_1} @@@image/jpeg\n\nQuestion: {question}\n\nAnswer: {target}
```
- If you don't have ground truth responses for your prompts, but you previously used the prompts with a Google model and achieved your targeted results, you can add the source_model parameter to your configuration instead of adding ground truth responses. When the source_model parameter is set, the data-driven optimizer runs your sample prompts on the source model to generate the ground truth responses for you.

SDK

If you want to run the data-driven optimizer through the SDK without using the notebook, create text files for your prompt template and system instructions by doing the following:

Create a text file for your system instructions.

In the text file, define your system instructions. For example:

Based on the following images and articles respond to the questions.'\n' Be concise, and answer \"I don't know\" if the response cannot be found in the provided articles or images.

Create a text file for your prompt template.
In the text file, define a prompt template that includes one or more variables. For example:
```
Article 1:\n\n{article_1}\n\nImage 1:\n\n{image_1} @@@image/jpeg\n\nQuestion: {question}
```
If you want to optimize your prompts using question_answering_correctness or the computation-based evaluations, you must do one of the following:
- Add the {target} variable to the prompt template, to represent the prompt's ground truth response. For example:
```
Article 1:\n\n{article_1}\n\nImage 1:\n\n{image_1} @@@image/jpeg\n\nQuestion: {question}\n\nAnswer: {target}
```
- If you don't have ground truth responses for your prompts, but you previously used the prompts with a Google model and achieved your targeted results, you can add the source_model parameter to your configuration instead of adding ground truth responses. When the source_model parameter is set, the data-driven optimizer runs your sample prompts on the source model to generate the ground truth responses for you.

Prepare sample prompts

To get the best results from the data-driven optimizer, use 50-100 sample prompts.

The tool can still be effective with as few as 5 sample prompts.
The best samples include examples where the target model performs poorly and examples where the target model performs well.

Note: The optimizer's performance improves as you increase the number of sample prompts. If you notice poor performance for your set of sample prompts, consider adding more prompts and rerunning the tool.

The sample prompts contain the data that replaces the variables in the prompt template. You can use a JSONL or CSV file to store your sample prompts.

JSONL file

Create a JSONL file.

In the JSONL file, add the prompt data that replaces each variable, with one JSON object per line. For example:

{"article_1": "The marine life …", "image_1": "gs://path_to_image", "Question": "What are some most effective ways to reduce ocean pollution?", "target": "The articles and images don't answer this question."}
{"article_1": "During the year …", "image_1": "gs://path_to_image", "Question": "Who was the president in 2023?", "target": "Joe Biden"}

Upload the JSONL file to a Cloud Storage bucket.
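
If your sample prompts already live in Python data structures, a short script can produce the JSONL text. This is a minimal sketch using only the standard library. The helper names `to_jsonl` and `from_jsonl` and the sample rows are hypothetical.

```python
import json
from typing import Dict, List


def to_jsonl(rows: List[Dict[str, str]]) -> str:
    """Serialize each row as one JSON object per line."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


def from_jsonl(text: str) -> List[Dict[str, str]]:
    """Parse JSONL text back into a list of rows, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample prompts; the keys must match the template variables.
sample_rows = [
    {"input_question": "How do I keep a shake cold?",
     "input_context": "Blend in short bursts."},
    {"input_question": "How do I tenderize pork?",
     "input_context": "Acidic marinades break down muscle fibers."},
]
jsonl_text = to_jsonl(sample_rows)
```

To create the file, write `jsonl_text` to a path such as `sample-prompts.jsonl` with UTF-8 encoding, then upload that file to your Cloud Storage bucket.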

CSV file

Create a CSV file.
In the first row, add the variables from your prompt template.
In the following rows, add the sample data that replaces each variable.
Upload the CSV file to a Cloud Storage bucket.
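
The CSV layout described above, with variable names in the first row and one sample per following row, can be produced with the standard `csv` module. This is a minimal sketch. The helper name `to_csv` and the sample rows are hypothetical.

```python
import csv
import io
from typing import Dict, List


def to_csv(rows: List[Dict[str, str]], variables: List[str]) -> str:
    """Write a header row of variable names, then one row per sample."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=variables)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample prompts; values that contain commas are quoted by csv.
csv_rows = [
    {"input_question": "How do I tenderize pork?",
     "input_context": "Use lemon juice, vinegar, or another acid."},
]
csv_text = to_csv(csv_rows, ["input_question", "input_context"])
```

When you write `csv_text` to a file, open the file with `newline=""` so that the line endings produced by the `csv` module are preserved as written.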

Optional: Create custom metrics

Note: Custom metrics require you to deploy a Cloud Run function. Cloud Run is a separate Google Cloud product with separate pricing.

Create a custom metric by doing the following:

Create a text file named requirements.txt.
In the requirements.txt file, define the required libraries for the custom evaluation metric function. All functions require the functions-framework package.
For example, the requirements.txt file for a custom metric that computes ROUGE-L would look similar to the following:
```
functions-framework==3.*
rouge-score
```
Create a Python file named main.py.

In the main.py file, write your custom evaluation function. The function must accept the following:

HTTP POST requests
JSON input that contains the response, which is the output from the LLM, and the reference, which is the ground truth response for the prompt if provided in the prompt dataset.

For example, the main.py file for a custom metric that computes ROUGE-L would look similar to the following:

from typing import Any
import json
import functions_framework
from rouge_score import rouge_scorer

# Register an HTTP function with the Functions Framework
@functions_framework.http
def main(request):
    request_json = request.get_json(silent=True)
    if not request_json:
        raise ValueError('Can not find request json.')

    # Extract 'response' and 'reference' from the request payload. 'response'
    # represents the model's response, while 'reference' represents the ground
    # truth response.
    response = request_json['response']
    reference = request_json['reference']

    # Compute ROUGE-L F-measure
    scorer = rouge_scorer.RougeScorer(['rougeL'], use_stemmer=True)
    scores = scorer.score(reference, response)
    final_score = scores['rougeL'].fmeasure

    # Return the custom score in the response
    return json.dumps({
        # The following key is the CUSTOM_METRIC_NAME that you pass to the job
        'custom_accuracy': final_score,
        # The following key is optional
        'explanation': 'ROUGE_L F-measure between reference and response',
    })
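
Before deploying, it can help to test the scoring logic and the shape of the returned JSON without the HTTP layer. The sketch below separates that logic into a plain function that takes the parsed payload. The names `reference_coverage` and `handle_payload` are hypothetical, and the coverage score is a simple stand-in chosen so that the example needs only the standard library. The custom_accuracy key mirrors the key used in the main.py example.

```python
import json
from typing import Any, Dict


def reference_coverage(response: str, reference: str) -> float:
    """Fraction of distinct reference tokens that appear in the response."""
    reference_tokens = set(reference.lower().split())
    if not reference_tokens:
        return 0.0
    response_tokens = set(response.lower().split())
    return len(reference_tokens & response_tokens) / len(reference_tokens)


def handle_payload(payload: Dict[str, Any]) -> str:
    """Score one request payload and return the JSON body as a string."""
    if not payload:
        raise ValueError("Cannot find request JSON.")
    score = reference_coverage(payload["response"], payload.get("reference", ""))
    return json.dumps({
        # This key plays the role of CUSTOM_METRIC_NAME.
        "custom_accuracy": score,
        # Optional free-text explanation of the score.
        "explanation": "Share of reference tokens found in the response",
    })
```

Because the score ranges from 0 to 1 and grows as the response covers more of the reference, it satisfies the requirement that a higher score indicates better performance.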

Deploy your custom evaluation function as a Cloud Run function by running the gcloud functions deploy command:
```
gcloud functions deploy FUNCTION_NAME \
   --project PROJECT_ID \
   --gen2 \
   --memory=2Gb \
   --concurrency=6 \
   --min-instances 6 \
   --region=REGION \
   --runtime="python310" \
   --source="." \
   --entry-point main \
   --trigger-http \
   --timeout=3600 \
   --quiet
```
Replace the following:
- FUNCTION_NAME: the name for the custom evaluation metric.
- PROJECT_ID: your project ID.
- REGION: the region where you want to deploy the function. Use the same region that you use for the target model.

Create a configuration

The data-driven optimizer configuration specifies the parameters you want to set for your prompt optimization job.

Note that Gemma models don't have managed APIs on Vertex AI. To use a Gemma model, you must first deploy it in Vertex AI or on your local machine. For more information about deploying in Vertex AI, see Use Gemma open models. For more information about deploying to your local machine, see Run Gemma with Ollama.

Create a configuration using one of the following options:

Notebook

If you want to run the data-driven optimizer through the notebook, create a configuration by doing the following:

In Colab Enterprise, open the data-driven optimizer notebook.
Go to Vertex AI prompt optimizer notebook
In the Configure project settings section, do the following:
1. In the PROJECT_ID field, enter your project ID.
2. In the LOCATION field, enter the location where you want to run the data-driven optimizer.
3. In the OUTPUT_PATH field, enter the URI for the Cloud Storage bucket where you want the data-driven optimizer to write the optimized system instructions, the few-shot examples, or both. For example, gs://bucket-name/output-path.
4. In the INPUT_PATH field, enter the URI for the sample prompts in your Cloud Storage bucket. For example, gs://bucket-name/sample-prompts.jsonl.
In the Configure optimization settings section, do the following:
1. In the TARGET_MODEL field, enter the model for which you want to optimize prompts.
2. In the THINKING_BUDGET field, enter the thinking budget for the target model. The default is -1, which means no thinking for non-thinking models and automatic thinking for thinking models such as Gemini-2.5. To learn about manual budget settings, see Thinking.
3. In the OPTIMIZATION_MODE field, enter the optimization mode you want to use. Must be one of the following:
  - instruction: Optimizes the system instruction.
  - demonstration: Selects sample prompts to add to the system instructions as few-shot examples.
  - instruction_and_demo: Performs both of the above actions.
4. In the EVAL_METRIC field, enter an evaluation metric that you want to optimize your prompts for.
5. Optional: In the SOURCE_MODEL field, enter the Google model that the system instructions and prompts were previously used with. When the source_model parameter is set, the data-driven optimizer runs your sample prompts on the source model to generate the ground truth responses for you, for evaluation metrics that require ground truth responses. If you didn't previously run your prompts with a Google model or you didn't achieve your target results, add ground truth responses to your prompt instead. For more information, see the Create a prompt template and system instructions section of this document.
Optional: In the Configure advanced optimization settings section, you can additionally add any of the optional parameters to your configuration.

View optional parameters

In the NUM_INST_OPTIMIZATION_STEPS field, enter the number of iterations that the data-driven optimizer uses in instruction optimization mode. The runtime increases linearly as you increase this value. Must be an integer between 10 and 20. If left unset, the default is 10.
In the NUM_DEMO_OPTIMIZATION_STEPS field, enter the number of demonstrations that the data-driven optimizer evaluates. Used with the demonstration and instruction_and_demo optimization modes. Must be an integer between 10 and 30. If left unset, the default is 10.
In the NUM_DEMO_PER_PROMPT field, enter the number of demonstrations generated per prompt. Must be an integer between 2 and the total number of sample prompts minus 1. If left unset, the default is 3.
In the TARGET_MODEL_QPS field, enter the queries per second (QPS) that the data-driven optimizer sends to the target model. The runtime decreases linearly as you increase this value. Must be a float that is 3.0 or greater, but less than the QPS quota you have on the target model. If left unset, the default is 3.0.
In the SOURCE_MODEL_QPS field, enter the queries per second (QPS) that the data-driven optimizer sends to the source model. Must be a float that is 3.0 or greater, but less than the QPS quota you have on the source model. If left unset, the default is 3.0.
In the EVAL_QPS field, enter the queries per second (QPS) that the data-driven optimizer sends to the Gen AI evaluation service or the Cloud Run function.
- For model-based metrics, must be a float that is 3.0 or greater. If left unset, the default is 3.0.
- For custom metrics, must be a float that is 3.0 or greater. This determines the rate at which the data-driven optimizer calls your custom metric Cloud Run functions.
If you want to use more than one evaluation metric, do the following:
1. In the EVAL_METRIC_1 field, enter an evaluation metric that you want to use.
2. In the EVAL_METRIC_1_WEIGHT field, enter the weight that you want the data-driven optimizer to use when it runs the optimization.
3. In the EVAL_METRIC_2 field, enter an evaluation metric that you want to use.
4. In the EVAL_METRIC_2_WEIGHT field, enter the weight that you want the data-driven optimizer to use when it runs the optimization.
5. In the EVAL_METRIC_3 field, optionally enter an evaluation metric that you want to use.
6. In the EVAL_METRIC_3_WEIGHT field, optionally enter the weight that you want the data-driven optimizer to use when it runs the optimization.
7. In the METRIC_AGGREGATION_TYPE field, enter the type of aggregation that you want the data-driven optimizer to use for the evaluation metrics. Must be one of weighted_sum or weighted_average.
In the PLACEHOLDER_TO_VALUE field, enter the information that replaces any variables in the system instructions. Information included within this flag is not optimized by the data-driven optimizer.
In the RESPONSE_MIME_TYPE field, enter the MIME response type that the target model uses. Must be one of text/plain or application/json. If left unset, the default is text/plain.
In the TARGET_LANGUAGE field, enter the language of the system instructions. If left unset, the default is English.
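
The ranges listed above are easy to get wrong, so a local check before starting a job can save a failed run. The sketch below encodes only the limits stated in this section. The function name `check_optional_parameters` is hypothetical, and the check doesn't know your QPS quota, so it can't verify the upper bound on the QPS fields.

```python
from typing import Any, Dict, List


def check_optional_parameters(
    params: Dict[str, Any], num_sample_prompts: int
) -> List[str]:
    """Return a list of problems found in the optional parameters."""
    problems: List[str] = []

    def check_int_range(name: str, low: int, high: int) -> None:
        value = params.get(name)
        if value is None:
            return  # unset parameters fall back to their defaults
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not (low <= value <= high):
            problems.append(f"{name} must be an integer between {low} and {high}")

    check_int_range("NUM_INST_OPTIMIZATION_STEPS", 10, 20)
    check_int_range("NUM_DEMO_OPTIMIZATION_STEPS", 10, 30)
    check_int_range("NUM_DEMO_PER_PROMPT", 2, num_sample_prompts - 1)

    for name in ("TARGET_MODEL_QPS", "SOURCE_MODEL_QPS", "EVAL_QPS"):
        value = params.get(name)
        if value is not None and float(value) < 3.0:
            problems.append(f"{name} must be 3.0 or greater")

    mime_type = params.get("RESPONSE_MIME_TYPE")
    if mime_type is not None and mime_type not in ("text/plain", "application/json"):
        problems.append("RESPONSE_MIME_TYPE must be text/plain or application/json")

    return problems
```

An empty list means that none of the checked limits are violated. Each string in a non-empty list names the field to fix.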

SDK

If you want to run the data-driven optimizer through the SDK, create a JSON file with the parameters you want to use to optimize prompts by doing the following:

Create a JSON file with the parameters that you want to use to optimize your prompts. Each configuration file requires the following parameters. The configuration varies slightly between vertexai.types.PromptOptimizerMethod.OPTIMIZATION_TARGET_GEMINI_NANO and vertexai.types.PromptOptimizerMethod.VAPO.
VAPO
```
{
  "project": "PROJECT_ID",
  "system_instruction": "SYSTEM_INSTRUCTION",
  "prompt_template": "PROMPT_TEMPLATE",
  "target_model": "TARGET_MODEL",
  "thinking_budget": "THINKING_BUDGET",
  EVALUATION_METRIC_PARAMETERS,
  "optimization_mode": "OPTIMIZATION_MODE",
  "input_data_path": "SAMPLE_PROMPT_URI",
  "output_path": "OUTPUT_URI"
}
```
GEMINI_NANO
```
{
  "project": "PROJECT_ID",
  "system_instruction": "SYSTEM_INSTRUCTION",
  "prompt_template": "PROMPT_TEMPLATE",
  "target_model": "TARGET_MODEL",
  "target_model_endpoint_url": "BASE_URL",
  EVALUATION_METRIC_PARAMETERS,
  "optimization_mode": "OPTIMIZATION_MODE",
  "input_data_path": "SAMPLE_PROMPT_URI",
  "output_path": "OUTPUT_URI"
}
```
Replace the following:
- PROJECT_ID: your project ID.
- SYSTEM_INSTRUCTION: the system instructions you want to optimize.
- PROMPT_TEMPLATE: the prompt template.
- TARGET_MODEL: the model for which you want to optimize prompts. For example, gemma-3n-e4b-it.
- BASE_URL: Optional: the base URL of the locally deployed model. For example, http://localhost:8000/v1. If you aren't using a locally deployed model, remove the target_model_endpoint_url field from your configuration.
- THINKING_BUDGET: the thinking budget for the target model. Defaults to -1, which means no thinking for non-thinking models and automatic thinking for thinking models such as Gemini-2.5. To learn about manual budget settings, see Thinking. Note that some models, like Gemma models, don't support thinking_budget.
- EVALUATION_METRIC_PARAMETERS: the parameters you specify depend on how many evaluation metrics you're using, and whether your metrics are standard or custom:
  Single standard metric
  If you're using a single standard evaluation metric, use the following parameter:
  "eval_metric": "EVALUATION_METRIC",
  Replace EVALUATION_METRIC with the metric that you want to optimize your prompts for.
  Single custom metric
  If you're using a single custom evaluation metric, use the following parameters:
  "eval_metric": "custom_metric",
  "custom_metric_name": "CUSTOM_METRIC_NAME",
  "custom_metric_cloud_function_name": "FUNCTION_NAME",
  Replace the following:
  - CUSTOM_METRIC_NAME: the metric name, as defined by the key that corresponds with the final_score. For example, custom_accuracy.
  - FUNCTION_NAME: the name of the Cloud Run function that you previously deployed.
  Multiple standard metrics
  If you're using multiple standard evaluation metrics, use the following parameters:
  "eval_metrics_types": [EVALUATION_METRIC_LIST],
  "eval_metrics_weights": [EVAL_METRICS_WEIGHTS],
  "aggregation_type": "METRIC_AGGREGATION_TYPE",
  Replace the following:
  - EVALUATION_METRIC_LIST: a list of evaluation metrics. Must be an array. For example, "bleu", "summarization_quality".
  - EVAL_METRICS_WEIGHTS: the weight for each metric. Must be an array and have the same length as EVALUATION_METRIC_LIST.
  - METRIC_AGGREGATION_TYPE: the type of aggregation used for the evaluation metrics. Must be one of weighted_sum or weighted_average. If left unset, the default is weighted_sum.
  Multiple standard & custom metrics
  If you're using multiple evaluation metrics that include a mix of a single custom metric and one or more standard metrics, use the following parameters:
  Note: Only one of the metrics that you use can be a custom metric. The others must be standard metrics.
  "eval_metrics_types": ["custom_metric", EVALUATION_METRIC_LIST],
  "eval_metrics_weights": [EVAL_METRICS_WEIGHTS],
  "aggregation_type": "METRIC_AGGREGATION_TYPE",
  "custom_metric_name": "CUSTOM_METRIC_NAME",
  "custom_metric_cloud_function_name": "FUNCTION_NAME",
  Replace the following:
  - EVALUATION_METRIC_LIST: a list of the standard evaluation metrics. Must be an array. For example, "bleu", "summarization_quality".
  - EVAL_METRICS_WEIGHTS: the weight for each metric. Must be an array.
  - METRIC_AGGREGATION_TYPE: the type of aggregation used for the evaluation metrics. Must be one of weighted_sum or weighted_average. If left unset, the default is weighted_sum.
  - CUSTOM_METRIC_NAME: the metric name, as defined by the key that corresponds with the final_score. For example, custom_accuracy.
  - FUNCTION_NAME: the name of the Cloud Run function that you previously deployed.
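- OPTIMIZATION_MODE: the optimization mode. Must be one of the following:
  - instruction: Optimizes the system instruction.
  - demonstration: Selects sample prompts to add to the system instructions as few-shot examples.
  - instruction_and_demo: Performs both of the above actions.
- SAMPLE_PROMPT_URI: the URI for the sample prompts in your Cloud Storage bucket. For example, gs://bucket-name/sample-prompts.jsonl.
- OUTPUT_URI: the URI for the Cloud Storage bucket where you want the data-driven optimizer to write the optimized system instructions, the few-shot examples, or both. For example, gs://bucket-name/output-path.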
You can additionally add any of the optional parameters to your configuration file.
Optional parameters fall into the following categories:
- Optimization process parameters. These parameters control the overall optimization process, including its duration and the number of optimization iterations it runs, which directly impacts the quality of optimizations.
- Model selection and location parameters. These parameters specify which models the data-driven optimizer uses and the locations it uses those models in.
- Latency (QPS) parameters. These parameters control QPS, impacting the speed of the optimization process.
- Other. Other parameters that control the structure and content of prompts.
  View optional parameters
  "num_steps": NUM_INST_OPTIMIZATION_STEPS,
  "num_demo_set_candidates": NUM_DEMO_OPTIMIZATION_STEPS,
  "demo_set_size": NUM_DEMO_PER_PROMPT,
  "target_model_location": "TARGET_MODEL_LOCATION",
  "source_model": "SOURCE_MODEL",
  "source_model_location": "SOURCE_MODEL_LOCATION",
  "target_model_qps": TARGET_MODEL_QPS,
  "eval_qps": EVAL_QPS,
  "source_model_qps": SOURCE_MODEL_QPS,
  "response_mime_type": "RESPONSE_MIME_TYPE",
  "language": "TARGET_LANGUAGE",
  "placeholder_to_content": "PLACEHOLDER_TO_CONTENT",
  "data_limit": DATA_LIMIT
  Replace the following:
  - Model selection and location parameters:
    - TARGET_MODEL_LOCATION: the location that you want to run the target model in. If left unset, the default is us-central1.
    - SOURCE_MODEL: the Google model that the system instructions and prompts were previously used with. When the source_model parameter is set, the data-driven optimizer runs your sample prompts on the source model to generate the ground truth responses for you, for evaluation metrics that require ground truth responses. If you didn't previously run your prompts with a Google model or you didn't achieve your target results, add ground truth responses to your prompt instead. For more information, see the Create a prompt template and system instructions section of this document.
    - SOURCE_MODEL_LOCATION: the location that you want to run the source model in. If left unset, the default is us-central1.
  - Latency (QPS) parameters:
    Note: You must set a QPS that is lower than or equal to the QPM quota that is available to you, or your job will fail. To convert QPM quota to QPS, divide your QPM by 60. For example, a QPM quota of 600 is equivalent to a QPS of 10 (600/60 = 10).
    - TARGET_MODEL_QPS: the queries per second (QPS) that the data-driven optimizer sends to the target model. The runtime decreases linearly as you increase this value. Must be a float that is 3.0 or greater, but less than the QPS quota you have on the target model. If left unset, the default is 3.0.
    - EVAL_QPS: the queries per second (QPS) that the data-driven optimizer sends to the Gen AI evaluation service or the Cloud Run function.
      - For model-based metrics, must be a float that is 3.0 or greater. If left unset, the default is 3.0.
      - For custom metrics, must be a float that is 3.0 or greater. This determines the rate at which the data-driven optimizer calls your custom metric Cloud Run functions.
    - SOURCE_MODEL_QPS: the queries per second (QPS) that the data-driven optimizer sends to the source model. Must be a float that is 3.0 or greater, but less than the QPS quota you have on the source model. If left unset, the default is 3.0.
  - Other parameters:
    - RESPONSE_MIME_TYPE: the MIME response type that the target model uses. Must be one of text/plain or application/json. If left unset, the default is text/plain.
    - TARGET_LANGUAGE: the language of the system instructions. If left unset, the default is English.
    - PLACEHOLDER_TO_CONTENT: the information that replaces any variables in the system instructions. Information included within this flag is not optimized by the data-driven prompt optimizer.
    - DATA_LIMIT: the amount of data used for validation. The runtime increases linearly with this value. Must be an integer between 5 and 100. If left unset, the default is 100.
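The constraints listed above can be checked locally before a job is submitted. The following is a minimal sketch, not part of the optimizer itself: the helper names `qpm_to_qps` and `find_problems` are hypothetical, and the quota comparison follows the note above (QPS must not exceed QPM / 60).

```python
ALLOWED_MIME_TYPES = {"text/plain", "application/json"}
MIN_QPS = 3.0  # minimum QPS value stated in the parameter descriptions


def qpm_to_qps(qpm_quota):
    """Convert a queries-per-minute quota into queries per second."""
    return qpm_quota / 60


def find_problems(params, qpm_quotas):
    """Return a list of human-readable problems; an empty list means none found.

    params: dict of optional parameters, keyed as in the configuration file.
    qpm_quotas: hypothetical dict mapping a *_qps key to your QPM quota for it.
    """
    problems = []
    for key in ("target_model_qps", "eval_qps", "source_model_qps"):
        if key not in params:
            continue  # unset values fall back to the service default
        qps = params[key]
        if qps < MIN_QPS:
            problems.append(f"{key} must be at least {MIN_QPS}")
        elif key in qpm_quotas and qps > qpm_to_qps(qpm_quotas[key]):
            problems.append(f"{key} exceeds the available quota")

    mime = params.get("response_mime_type", "text/plain")
    if mime not in ALLOWED_MIME_TYPES:
        problems.append("response_mime_type must be text/plain or application/json")

    limit = params.get("data_limit", 100)
    if not (isinstance(limit, int) and 5 <= limit <= 100):
        problems.append("data_limit must be an integer between 5 and 100")
    return problems
```

For example, with a QPM quota of 600 on the target model, `find_problems({"target_model_qps": 10.0}, {"target_model_qps": 600})` reports no problems, because 600 / 60 = 10.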
Upload the JSON file to a Cloud Storage bucket.
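One way to perform the upload from Python is sketched below. It assumes the `google-cloud-storage` client library is installed and that application default credentials are available. The function names are hypothetical, and the destination URI is whatever location you later pass as PATH_TO_CONFIG.

```python
import json


def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/file' into (bucket, object_path)."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a Cloud Storage URI: {uri}")
    bucket, _, object_path = uri[len("gs://"):].partition("/")
    if not bucket or not object_path:
        raise ValueError(f"URI must include a bucket and an object path: {uri}")
    return bucket, object_path


def upload_config(config, destination_uri):
    """Serialize the configuration dict as JSON and upload it to destination_uri."""
    # Imported lazily so split_gcs_uri works without the client library installed.
    from google.cloud import storage

    bucket_name, object_path = split_gcs_uri(destination_uri)
    blob = storage.Client().bucket(bucket_name).blob(object_path)
    blob.upload_from_string(
        json.dumps(config, indent=2), content_type="application/json"
    )
```

A call such as `upload_config(config, "gs://bucket-name/configuration.json")` would write the file to the example URI used elsewhere in this document.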

Run prompt optimizer

Run the data-driven optimizer using one of the following options:

Notebook

Run the data-driven optimizer through the notebook, by doing the following:

In Colab Enterprise, open the Vertex AI prompt optimizer notebook.
Go to data-driven optimizer notebook
In the Run prompt optimizer section, click Run cell.
The data-driven optimizer runs.

REST

Before using any of the request data, make the following replacements:

LOCATION: the location where you want to run the Vertex AI prompt optimizer.
PROJECT_ID: your project ID.
JOB_NAME: a name for the Vertex AI prompt optimizer job.
PATH_TO_CONFIG: the URI of the configuration file in your Cloud Storage bucket. For example, gs://bucket-name/configuration.json.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/customJobs

Request JSON body:

{
  "displayName": "JOB_NAME",
  "jobSpec": {
    "workerPoolSpecs": [
      {
        "machineSpec": {
          "machineType": "n1-standard-4"
        },
        "replicaCount": 1,
        "containerSpec": {
          "imageUri": "us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd:preview_v1_0",
          "args": ["--config=PATH_TO_CONFIG"]
        }
      }
    ]
  }
}
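If you prefer to generate the request body rather than edit it by hand, the following sketch builds the same structure and writes it to request.json. The helper names are hypothetical. The machine type, image URI, and URL pattern are copied from this section.

```python
import json

# Container image taken from the request body shown in this section.
IMAGE_URI = "us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd:preview_v1_0"


def build_custom_job_request(job_name, path_to_config):
    """Return the customJobs request body as a Python dict."""
    return {
        "displayName": job_name,
        "jobSpec": {
            "workerPoolSpecs": [
                {
                    "machineSpec": {"machineType": "n1-standard-4"},
                    "replicaCount": 1,
                    "containerSpec": {
                        "imageUri": IMAGE_URI,
                        "args": [f"--config={path_to_config}"],
                    },
                }
            ]
        },
    }


def custom_jobs_url(project_id, location):
    """Return the customJobs endpoint for the given project and location."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/customJobs"
    )


def write_request_file(body, path="request.json"):
    """Write the request body to a file that curl or PowerShell can send."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(body, f, indent=2)
```

Generating the body this way avoids quoting mistakes in the args array, which is the part of the request most often edited by hand.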

To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/customJobs"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/customJobs" | Select-Object -Expand Content

The response looks similar to the following:

Response

{
  "name": "projects/PROJECT_ID/locations/LOCATION/customJobs/JOB_ID",
  "displayName": "JOB_NAME",
  "jobSpec": {
    "workerPoolSpecs": [
      {
        "machineSpec": {
          "machineType": "n1-standard-4"
        },
        "replicaCount": "1",
        "diskSpec": {
          "bootDiskType": "pd-ssd",
          "bootDiskSizeGb": 100
        },
        "containerSpec": {
          "imageUri": "us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd:preview_v1_0",
          "args": [
            "--config=https://storage.mtls.cloud.google.com/testing-apd/testing-config.json"
          ]
        }
      }
    ]
  },
  "state": "JOB_STATE_PENDING",
  "createTime": "2020-09-15T19:09:54.342080Z",
  "startTime": "2020-09-15T19:13:42.991045Z"
}
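The job ID is the last segment of the name field in the response. As a small sketch (the function name is hypothetical, and the response is assumed to be already parsed into a dict), you could extract the ID and state like this:

```python
def parse_job_response(response):
    """Extract the job ID and state from a parsed customJobs response."""
    # The resource name has the form
    # projects/PROJECT_ID/locations/LOCATION/customJobs/JOB_ID.
    job_id = response["name"].rsplit("/", 1)[-1]
    # "unknown" is a placeholder used by this sketch when no state is present.
    return {"job_id": job_id, "state": response.get("state", "unknown")}
```

The returned job ID can then be used to find the job on the Custom jobs tab in the console.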

SDK

Run the data-driven optimizer through the SDK, by adding the following code sections into your Colab or Notebook.

Make the following replacements:

LOCATION: the location where you want to run the data-driven optimizer.
PROJECT_ID: your project ID.
PROJECT_NUMBER: your project number, available in the Google Cloud console.
PATH_TO_CONFIG: the URI of the configuration file in Cloud Storage. For example, gs://bucket-name/configuration.json.

# Authenticate
from google.colab import auth
auth.authenticate_user(project_id=PROJECT_ID)

# Set the Service Account
SERVICE_ACCOUNT = f"{PROJECT_NUMBER}-compute@developer.gserviceaccount.com"

# Import Vertex AI SDK and Setup
import vertexai
vertexai.init(project=PROJECT_ID, location=LOCATION)

# Create the Vertex AI Client
client = vertexai.Client(project=PROJECT_ID, location=LOCATION)

# Setup the job dictionary
vapo_config = {
    'config_path': PATH_TO_CONFIG,
    'service_account': SERVICE_ACCOUNT,
    'wait_for_completion': True,
}

# Start the Vertex AI Prompt Optimizer.
# The method is passed as the enum value, and the result is stored in its own
# variable so that the client stays usable for later calls.
result = client.prompt_optimizer.optimize(
    method=vertexai.types.PromptOptimizerMethod.VAPO,
    config=vapo_config,
)

To use a Gemma model as the target model, make the following changes:

import logging

nano_config = vertexai.types.PromptOptimizerConfig(
    config_path="gs://sample-bucket/config_nano.json",
    project_number=PROJECT_NUMBER,
    wait_for_completion=True,
)

# Simpler version
# nano_config = {
#     "config_path": "gs://sample-bucket/config_nano.json",
#     "service_account": SERVICE_ACCOUNT,
#     "wait_for_completion": True
# }

# Show the optimizer's progress messages in the notebook output
logging.basicConfig(encoding='utf-8', level=logging.INFO, force=True)

client.prompt_optimizer.optimize(
    method=vertexai.types.PromptOptimizerMethod.OPTIMIZATION_TARGET_GEMINI_NANO,
    config=nano_config,
)

Once the optimization completes, examine the output artifacts at the output location specified in the config.

Analyze results and iterate

After you run the data-driven optimizer, review the job's progress using one of the following options:

Notebook

If you want to view the results of the data-driven optimizer through the notebook, do the following:

Open the Vertex AI prompt optimizer notebook.
In the Inspect the results section, do the following:
1. In the RESULT_PATH field, add the URI of the Cloud Storage bucket that you configured the data-driven optimizer to write results to. For example, gs://bucket-name/output-path.
2. Click Run cell.

Console

In the Google Cloud console, in the Vertex AI section, go to the Training pipelines page.
Go to Training pipelines
Click the Custom jobs tab. The data-driven optimizer's custom training job appears in the list along with its status.

When the job is finished, review the optimizations by doing the following:

In the Google Cloud console, go to the Cloud Storage Buckets page:
Go to Buckets
Click the name of your Cloud Storage bucket.
Navigate to the folder that has the same name as the optimization mode you used to evaluate the prompts, either instruction or demonstration. If you used instruction_and_demo mode, both folders appear. The instruction folder contains the results from the system instruction optimization, while the demonstration folder contains the results from the demonstration optimization and the optimized system instructions.
The folder contains the following files:
- config.json: the complete configuration that the Vertex AI prompt optimizer used.
- templates.json: each set of system instructions and/or few shot examples that the data-driven optimizer generated and their evaluation score.
- eval_results.json: the target model's response for each sample prompt for each set of generated system instructions and/or few shot examples and their evaluation score.
- optimized_results.json: the best performing system instructions and/or few shot examples and their evaluation score.
To view the optimized system instructions, view the optimized_results.json file.
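To locate these files programmatically, you can derive their URIs from the output path and the optimization mode. The sketch below is illustrative only: the function name is hypothetical, and it assumes the mode folders sit directly under the output path you configured, which may not match your bucket layout exactly.

```python
# File names listed in this section.
RESULT_FILES = (
    "config.json",
    "templates.json",
    "eval_results.json",
    "optimized_results.json",
)

# Folders that appear for each optimization mode, as described above.
MODE_FOLDERS = {
    "instruction": ["instruction"],
    "demonstration": ["demonstration"],
    "instruction_and_demo": ["instruction", "demonstration"],
}


def result_file_uris(output_uri, optimization_mode):
    """Map each result folder to the URIs of the files expected inside it."""
    base = output_uri.rstrip("/")
    return {
        folder: [f"{base}/{folder}/{name}" for name in RESULT_FILES]
        for folder in MODE_FOLDERS[optimization_mode]
    }
```

For instance, `result_file_uris("gs://bucket-name/output-path", "instruction")` returns a single instruction entry whose last URI points at optimized_results.json.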

Best practices

Preview models are only supported through the global region, and Vertex AI custom jobs don't support global as a region. For this reason, don't use VAPO to optimize prompts with a preview model as the target model.
For GA models, you can select region-specific locations, such as us-central1 or europe-central2, instead of global to comply with your data residency requirements.
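These two rules can be expressed as a small pre-flight check. This is only a sketch with a hypothetical function name. Whether a given model is a preview model is something you supply yourself, because this document doesn't describe a way to look it up.

```python
def check_optimizer_location(job_location, target_model_is_preview):
    """Return a warning string, or None if the combination looks usable."""
    if job_location == "global":
        return "Custom jobs don't support the global region; pick a regional location."
    if target_model_is_preview:
        return "Preview models are only served through global, so they can't be the target model."
    return None
```

Running the check before submitting a job surfaces the restriction early instead of after the custom job fails.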

What's next

Try the Vertex AI prompt optimizer SDK notebook.
Learn about responsible AI best practices and Vertex AI's safety filters.
Learn more about prompting strategies.
Explore examples of prompts in the Prompt gallery.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.
