Text generation
To see an example of getting started with Chat with the Gemini Pro model, run the "Getting Started with Chat with the Gemini Pro model" notebook in one of the following environments:
Open in Colab | Open in Colab Enterprise | Open in Vertex AI Workbench | View on GitHub
This page shows you how to send chat prompts to a Gemini model by using the Google Cloud console, REST API, and supported SDKs.
To learn how to add images and other media to your request, see Image understanding.
For a list of languages supported by Gemini, see Language support.
To explore the generative AI models and APIs that are available on Vertex AI, go to Model Garden in the Google Cloud console.
If you're looking for a way to use Gemini directly from your mobile and web apps, see the Firebase AI Logic client SDKs for Swift, Android, Web, Flutter, and Unity apps.
Generate text
For testing and iterating on chat prompts, we recommend using the Google Cloud console. To send prompts programmatically to the model, you can use the REST API, Google Gen AI SDK, Vertex AI SDK for Python, or one of the other supported libraries and SDKs.
You can use system instructions to steer the behavior of the model based on a specific need or use case. For example, you can define a persona or role for a chatbot that responds to customer service requests. For more information, see the system instructions code samples.
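As a quick illustration, here is a minimal sketch using the Gen AI SDK for Python, following the same client setup as the samples below. The persona text and prompt are hypothetical examples, not samples from this page:

from google import genai
from google.genai.types import GenerateContentConfig, HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="My order hasn't arrived yet. What should I do?",
    # The system instruction defines a persona that shapes every response;
    # both strings here are illustrative placeholders.
    config=GenerateContentConfig(
        system_instruction="You are a friendly customer service agent for an online store."
    ),
)
print(response.text)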
You can use the Google Gen AI SDK to send requests if you're using Gemini 2.0 Flash.
Here is a simple text generation example.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
from google import genai
from google.genai.types import HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="How does AI work?",
)
print(response.text)
# Example response:
# Okay, let's break down how AI works. It's a broad field, so I'll focus on the ...
#
# Here's a simplified overview:
# ...

Go
Learn how to install or update the Go SDK.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
import (
	"context"
	"fmt"
	"io"

	"google.golang.org/genai"
)

// generateWithText shows how to generate text using a text prompt.
func generateWithText(w io.Writer) error {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	resp, err := client.Models.GenerateContent(ctx,
		"gemini-2.5-flash",
		genai.Text("How does AI work?"),
		nil,
	)
	if err != nil {
		return fmt.Errorf("failed to generate content: %w", err)
	}

	respText := resp.Text()
	fmt.Fprintln(w, respText)

	// Example response:
	// That's a great question! Understanding how AI works can feel like ...
	// ...
	// **1. The Foundation: Data and Algorithms**
	// ...

	return nil
}

Node.js
Install
npm install @google/genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function generateContent(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await client.models.generateContent({
    model: 'gemini-3-flash-preview',
    contents: 'How does AI work?',
  });

  console.log(response.text);

  return response.text;
}

Java
Learn how to install or update the Java SDK.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.HttpOptions;

public class TextGenerationWithText {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    generateContent(modelId);
  }

  // Generates text with text input
  public static String generateContent(String modelId) {
    // Initialize client that will be used to send requests. This client only needs
    // to be created once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {
      GenerateContentResponse response =
          client.models.generateContent(modelId, "How does AI work?", null);

      System.out.print(response.text());
      // Example response:
      // Okay, let's break down how AI works. It's a broad field, so I'll focus on the ...
      //
      // Here's a simplified overview:
      // ...
      return response.text();
    }
  }
}

C#
Learn how to install or update the C# SDK.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
using Google.GenAI;
using Google.GenAI.Types;
using System;
using System.Threading.Tasks;

public class TextGenWithTxt
{
    public async Task<string> GenerateContent(
        string projectId = "your-project-id",
        string location = "global",
        string model = "gemini-2.5-flash")
    {
        await using var client = new Client(
            project: projectId,
            location: location,
            vertexAI: true,
            httpOptions: new HttpOptions { ApiVersion = "v1" });

        GenerateContentResponse response = await client.Models.GenerateContentAsync(
            model: model,
            contents: "How does AI work?");

        string responseText = response.Candidates[0].Content.Parts[0].Text;
        Console.WriteLine(responseText);
        // Example response:
        // AI, or Artificial Intelligence, at its core, is about creating machines that can perform ...
        // Here's a breakdown of how it generally works ...
        return responseText;
    }
}

Streaming and non-streaming responses
You can choose whether the model generates streaming responses or non-streaming responses. For streaming responses, you receive each response chunk as soon as its output tokens are generated. For non-streaming responses, you receive the complete response after all of the output tokens are generated.
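As a quick contrast before the chat-based sample below, here is a minimal sketch using the Gen AI SDK for Python that shows both modes side by side; the model name and prompt are just examples:

from google import genai
from google.genai.types import HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))

# Non-streaming: the call returns only after the full response is generated.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Why is the sky blue?",
)
print(response.text)

# Streaming: chunks are yielded as output tokens are generated.
for chunk in client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Why is the sky blue?",
):
    print(chunk.text, end="")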
Here is a streaming text generation example.
Python
Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
from google import genai
from google.genai.types import HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))
chat_session = client.chats.create(model="gemini-2.5-flash")

for chunk in chat_session.send_message_stream("Why is the sky blue?"):
    print(chunk.text, end="")
# Example response:
# The
# sky appears blue due to a phenomenon called **Rayleigh scattering**. Here's
# a breakdown of why:
# ...

Gemini multiturn chat behavior
When you use multiturn chat, Vertex AI locally stores the initial content and prompts that you sent to the model. Vertex AI sends all of this data with each subsequent request to the model. Consequently, the input cost for each message that you send is a running total of all the data that was already sent to the model. If your initial content is sufficiently large, consider using context caching when you create the initial model object to better control input costs, as in the sketch below.
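For illustration only, here is a minimal sketch of that pattern using the Gen AI SDK for Python. The placeholder context, TTL value, and follow-up prompt are assumptions, not samples from this page; see the context caching documentation for the exact requirements:

from google import genai
from google.genai.types import (
    CreateCachedContentConfig,
    GenerateContentConfig,
    HttpOptions,
)

client = genai.Client(http_options=HttpOptions(api_version="v1"))

# Cache the large initial context once, so later turns reference it
# instead of resending (and re-billing) the same input tokens.
cache = client.caches.create(
    model="gemini-2.5-flash",
    config=CreateCachedContentConfig(
        # Hypothetical placeholder; real caches need a large context
        # (a minimum token count applies).
        contents=["<your large initial context goes here>"],
        ttl="3600s",  # keep the cache for one hour
    ),
)

# Subsequent requests reference the cache by name.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the key points.",
    config=GenerateContentConfig(cached_content=cache.name),
)
print(response.text)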
What's next
Learn how to send multimodal prompt requests, as sketched below.
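For a taste of what a multimodal request looks like, here is a minimal sketch using the Gen AI SDK for Python; the Cloud Storage URI and prompt are illustrative examples:

from google import genai
from google.genai.types import HttpOptions, Part

client = genai.Client(http_options=HttpOptions(api_version="v1"))
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        # Example image URI; replace with your own Cloud Storage object.
        Part.from_uri(
            file_uri="gs://cloud-samples-data/generative-ai/image/scones.jpg",
            mime_type="image/jpeg",
        ),
        "What is shown in this image?",
    ],
)
print(response.text)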
Learn about responsible AI best practices and Vertex AI's safety filters.