Experiment with parameter values
Each call that you send to a model includes parameter values that control how the model generates a response. The model can generate different results for different parameter values. Experiment with different parameter values to get the best values for the task. The parameters available for different models may differ. The most common parameters are the following:
- Max output tokens
- Temperature
- Top-P
- Top-K
- Seed
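These parameters are typically passed together in a single generation config. The following sketch assumes the Vertex AI Python SDK (`vertexai.generative_models`) is installed and initialized; the model name and all values are illustrative, not recommendations:

```python
# Sketch only: assumes vertexai.init(project=..., location=...) has been
# called. Model name and parameter values are illustrative.
from vertexai.generative_models import GenerationConfig, GenerativeModel

config = GenerationConfig(
    max_output_tokens=1024,  # cap on tokens generated in the response
    temperature=1.0,         # degree of randomness in token selection
    top_p=0.95,              # nucleus-sampling probability threshold
    top_k=40,                # number of candidate tokens considered
    seed=42,                 # best-effort reproducibility (preview)
)

model = GenerativeModel("gemini-1.5-flash")
# response = model.generate_content("Write a haiku.", generation_config=config)
```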
Max output tokens
Maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses.
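As a rough planning aid based on the approximation above (100 tokens correspond to roughly 60-80 words), you can estimate the word budget a token limit implies. The helper below is a hypothetical illustration, not part of any SDK:

```python
def estimate_words(max_output_tokens):
    """Approximate word-count range for a token budget, using the rule of
    thumb that 100 tokens correspond to roughly 60-80 words."""
    return (max_output_tokens * 60 // 100, max_output_tokens * 80 // 100)

low, high = estimate_words(256)  # a 256-token budget: roughly 153-204 words
```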
Temperature
The temperature is used for sampling during response generation, which occurs when topP and topK are applied. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of 0 means that the highest-probability tokens are always selected. In this case, responses for a given prompt are mostly deterministic, but a small amount of variation is still possible. If the model returns a response that's too generic, too short, or a fallback response, try increasing the temperature. If the model enters infinite generation, increasing the temperature to at least 0.1 may lead to improved results.
1.0 is the recommended starting value for temperature. Gemini models support a temperature value between 0.0 and 2.0, with a default of 1.0.
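To see how temperature reshapes the token distribution, here is a minimal pure-Python sketch of temperature-scaled softmax (an illustration of the general technique, not the model's actual implementation):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then normalize to probabilities.
    Lower temperature sharpens the distribution toward the top token;
    higher temperature flattens it toward uniform."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]     # illustrative raw scores for three tokens
cold = softmax_with_temperature(logits, 0.1)  # near-greedy
hot = softmax_with_temperature(logits, 2.0)   # much flatter
# cold puts almost all probability mass on the top token; hot spreads it out.
```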
Top-P
Top-P changes how the model selects tokens for output. Tokens are selected from the most probable to least probable until the sum of their probabilities equals the top-P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-P value is 0.5, then the model selects either A or B as the next token by using temperature and excludes C as a candidate. Specify a lower value for less random responses and a higher value for more random responses.
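The A/B/C example above can be sketched in a few lines of pure Python (an illustration of nucleus filtering, not the model's actual implementation):

```python
def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their cumulative probability
    reaches top_p; all remaining tokens are excluded from sampling."""
    kept, cumulative = [], 0.0
    for token in sorted(probs, key=probs.get, reverse=True):
        kept.append(token)
        cumulative += probs[token]
        if cumulative >= top_p:
            break
    return kept

probs = {"A": 0.3, "B": 0.2, "C": 0.1}
kept = top_p_filter(probs, 0.5)
# A and B together cover 0.5, so C is excluded as a candidate.
```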
Top-K
Top-K changes how the model selects tokens for output. A top-K of 1 means the next selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-K of 3 means that the next token is selected from among the three most probable tokens by using temperature. For each token-selection step, the top-K tokens with the highest probabilities are sampled. Then tokens are further filtered based on top-P, with the final token selected using temperature sampling.
Specify a lower value for less random responses and a higher value for morerandom responses.
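The selection pipeline described above (top-K filter, then top-P filter, then temperature sampling) can be sketched as a simplified simulation. This is an illustration of the general decoding scheme under stated assumptions, not the model's actual implementation; the logits and values are made up:

```python
import math
import random

def sample_next_token(logits, top_k, top_p, temperature, rng):
    """Simplified decoding step: keep the top_k highest-logit tokens,
    compute temperature-scaled probabilities, drop tokens outside the
    top_p nucleus, then sample from the survivors."""
    # 1) Top-K: keep only the k highest-scoring tokens.
    candidates = sorted(logits, key=logits.get, reverse=True)[:top_k]
    # 2) Temperature-scaled softmax over the candidates.
    peak = max(logits[t] / temperature for t in candidates)
    exps = {t: math.exp(logits[t] / temperature - peak) for t in candidates}
    total = sum(exps.values())
    probs = {t: e / total for t, e in exps.items()}
    # 3) Top-P: keep the most probable tokens until cumulative prob >= top_p.
    kept, cumulative = [], 0.0
    for token in sorted(probs, key=probs.get, reverse=True):
        kept.append(token)
        cumulative += probs[token]
        if cumulative >= top_p:
            break
    # 4) Renormalize and sample the final token.
    z = sum(probs[t] for t in kept)
    return rng.choices(kept, weights=[probs[t] / z for t in kept], k=1)[0]

rng = random.Random(0)
greedy = sample_next_token({"a": 2.0, "b": 1.0, "c": 0.0},
                           top_k=1, top_p=1.0, temperature=1.0, rng=rng)
# With top_k=1, the most probable token is always chosen (greedy decoding).
```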
Seed
When seed is fixed to a specific value, the model makes a best effort to provide the same response for repeated requests. Deterministic output isn't guaranteed. Also, changing the model or parameter settings, such as the temperature, can cause variations in the response even when you use the same seed value. By default, a random seed value is used. This is a preview feature.
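The intuition behind seeding can be shown with a toy stand-in for seeded sampling: with the same seed, the same pseudo-random choices are made, so the sampled sequence repeats. The vocabulary and function below are hypothetical, for illustration only:

```python
import random

def sample_sequence(seed, vocab, length):
    """Toy stand-in for seeded decoding: a fixed seed makes the
    pseudo-random token choices repeatable across runs."""
    rng = random.Random(seed)
    return [rng.choice(vocab) for _ in range(length)]

vocab = ["the", "cat", "sat", "mat"]
run1 = sample_sequence(42, vocab, 5)
run2 = sample_sequence(42, vocab, 5)
# run1 == run2: repeating the request with the same seed repeats the choices.
```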
What's next
- Explore examples of prompts in the Prompt gallery.
- Learn how to optimize prompts for use with Google models by using the Vertex AI prompt optimizer (Preview).
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.