Get text embeddings
This document describes how to create a text embedding using the Vertex AI Text Embeddings API.

The Vertex AI Text Embeddings API uses dense vector representations: gemini-embedding-001, for example, uses 3072-dimensional vectors. Dense vector embedding models use deep-learning methods similar to the ones used by large language models. Unlike sparse vectors, which tend to directly map words to numbers, dense vectors are designed to better represent the meaning of a piece of text. The benefit of using dense vector embeddings in generative AI is that instead of searching for direct word or syntax matches, you can better search for passages that align to the meaning of the query, even if the passages don't use the same language.
The vectors are normalized, so you can use cosine similarity, dot product, orEuclidean distance to provide the same similarity rankings.
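Because the vectors are unit-length, cosine similarity equals the dot product, and squared Euclidean distance equals 2 - 2 * cosine, so all three metrics produce the same ranking. A stdlib-only sketch with toy vectors standing in for real embeddings:

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy vectors standing in for embedding outputs (illustrative values only).
query = normalize([0.9, 0.1, 0.2])
docs = {
    "doc_a": normalize([0.8, 0.2, 0.1]),
    "doc_b": normalize([0.1, 0.9, 0.3]),
}

# Higher dot product is more similar; lower Euclidean distance is more
# similar. On unit vectors the two orderings always agree.
rank_by_dot = sorted(docs, key=lambda d: dot(query, docs[d]), reverse=True)
rank_by_dist = sorted(docs, key=lambda d: euclidean(query, docs[d]))
print(rank_by_dot, rank_by_dist)  # same ranking from both metrics
```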
- To learn more about embeddings, see the Embeddings APIs overview.
- To learn about text embedding models, see Text embeddings.
- For information about which languages each embeddings model supports, see Supported text languages.
To see an example of getting text embeddings, run the "Getting Started with Text Embeddings + Vertex AI Vector Search" notebook in one of the following environments:
Open in Colab | Open in Colab Enterprise | Open in Vertex AI Workbench | View on GitHub
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
Enable the Vertex AI API.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
- Choose a task type for your embeddings job.
API limits
For each request, you're limited to 250 input texts. The API has a maximum input token limit of 20,000. Inputs exceeding this limit result in a 400 error. Each individual input text is further limited to 2048 tokens; any excess is silently truncated. You can also disable silent truncation by setting autoTruncate to false.

For more information, see Text embedding limits.
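The 250-texts-per-request limit means that larger corpora must be split across multiple requests. A minimal, stdlib-only sketch of the batching step (the helper name batch_inputs is illustrative, not part of the API):

```python
def batch_inputs(texts, max_per_request=250):
    """Split a list of input texts into chunks that respect the
    250-inputs-per-request limit of the embeddings API."""
    return [texts[i:i + max_per_request]
            for i in range(0, len(texts), max_per_request)]

# 600 inputs become three requests: 250 + 250 + 100.
batches = batch_inputs([f"text {i}" for i in range(600)])
print([len(b) for b in batches])  # [250, 250, 100]
```

Each batch can then be passed as the contents of a separate embed_content call.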
Get text embeddings for a snippet of text
You can get text embeddings for a snippet of text by using the Vertex AI API orthe Vertex AI SDK for Python.
Choose an embedding dimension
All models produce a full-length embedding vector by default. For gemini-embedding-001, this vector has 3072 dimensions, and other models produce 768-dimensional vectors. However, by using the output_dimensionality parameter, users can control the size of the output embedding vector. Selecting a smaller output dimensionality can save storage space and increase computational efficiency for downstream applications, while sacrificing little in terms of quality.

The following examples use the gemini-embedding-001 model.
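If you request a reduced output_dimensionality, the returned vector may no longer be unit-length, so it can be useful to renormalize it before relying on the cosine/dot-product equivalence described above. A minimal sketch (renormalize is an illustrative helper, not an SDK function):

```python
import math

def renormalize(embedding):
    """Rescale an embedding to unit length, e.g. after requesting a
    reduced output_dimensionality."""
    norm = math.sqrt(sum(x * x for x in embedding))
    return [x / norm for x in embedding]

# Toy 2-dimensional example: [3, 4] has length 5.
vec = renormalize([3.0, 4.0])
print(vec)  # [0.6, 0.8]
```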
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
```bash
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
```
```python
from google import genai
from google.genai.types import EmbedContentConfig

client = genai.Client()
response = client.models.embed_content(
    model="gemini-embedding-001",
    contents=[
        "How do I get a driver's license/learner's permit?",
        "How long is my driver's license valid for?",
        "Driver's knowledge test study guide",
    ],
    config=EmbedContentConfig(
        task_type="RETRIEVAL_DOCUMENT",  # Optional
        output_dimensionality=3072,  # Optional
        title="Driver's License",  # Optional
    ),
)
print(response)
# Example response:
# embeddings=[ContentEmbedding(values=[-0.06302902102470398, 0.00928034819662571, 0.014716853387653828, -0.028747491538524628, ... ],
# statistics=ContentEmbeddingStatistics(truncated=False, token_count=13.0))]
# metadata=EmbedContentMetadata(billable_character_count=112)
```

Go
Learn how to install or update the Go SDK.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
```bash
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
```
import("context""fmt""io""google.golang.org/genai")//generateEmbedContentWithTextshowshowtoembedcontentwithtext.funcgenerateEmbedContentWithText(wio.Writer)error{ctx:=context.Background()client,err:=genai.NewClient(ctx, &genai.ClientConfig{HTTPOptions:genai.HTTPOptions{APIVersion:"v1"},})iferr!=nil{returnfmt.Errorf("failed to create genai client: %w",err)}outputDimensionality:=int32(3072)config:= &genai.EmbedContentConfig{TaskType:"RETRIEVAL_DOCUMENT",//optionalTitle:"Driver's License",//optionalOutputDimensionality: &outputDimensionality,//optional}contents:=[]*genai.Content{{Parts:[]*genai.Part{{Text:"How do I get a driver's license/learner's permit?",},{Text:"How long is my driver's license valid for?",},{Text:"Driver's knowledge test study guide",},},Role:genai.RoleUser,},}modelName:="gemini-embedding-001"resp,err:=client.Models.EmbedContent(ctx,modelName,contents,config)iferr!=nil{returnfmt.Errorf("failed to generate content: %w",err)}fmt.Fprintln(w,resp)//Exampleresponse://embeddings=[ContentEmbedding(values=[-0.06302902102470398,0.00928034819662571,0.014716853387653828,-0.028747491538524628,...],//statistics=ContentEmbeddingStatistics(truncated=False,token_count=13.0))]//metadata=EmbedContentMetadata(billable_character_count=112)returnnil}Node.js
Install
npm install @google/genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
```bash
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
```
```javascript
const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;

async function generateEmbeddingsForRetrieval(
  projectId = GOOGLE_CLOUD_PROJECT
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
  });

  const prompt = [
    "How do I get a driver's license/learner's permit?",
    "How long is my driver's license valid for?",
    "Driver's knowledge test study guide",
  ];

  const response = await client.models.embedContent({
    model: 'gemini-embedding-001',
    contents: prompt,
    config: {
      taskType: 'RETRIEVAL_DOCUMENT', // Optional
      outputDimensionality: 3072, // Optional
      title: "Driver's License", // Optional
    },
  });

  console.log(response);
  // Example response:
  // embeddings=[ContentEmbedding(values=[-0.06302902102470398, 0.00928034819662571, 0.014716853387653828, -0.028747491538524628, ... ],
  // statistics=ContentEmbeddingStatistics(truncated=False, token_count=13.0))]
  // metadata=EmbedContentMetadata(billable_character_count=112)
  return response;
}
```

Java
Learn how to install or update the Java SDK.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
```bash
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
```
```java
import com.google.genai.Client;
import com.google.genai.types.EmbedContentConfig;
import com.google.genai.types.EmbedContentResponse;
import java.util.List;

public class EmbeddingsDocRetrievalWithTxt {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-embedding-001";
    embedContent(modelId);
  }

  // Shows how to embed content with text.
  public static EmbedContentResponse embedContent(String modelId) {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {
      EmbedContentResponse response =
          client.models.embedContent(
              modelId,
              List.of(
                  "How do I get a driver's license/learner's permit?",
                  "How long is my driver's license valid for?",
                  "Driver's knowledge test study guide"),
              EmbedContentConfig.builder()
                  .taskType("RETRIEVAL_DOCUMENT")
                  .outputDimensionality(3072)
                  .title("Driver's License")
                  .build());

      System.out.println(response);
      // Example response:
      // embeddings=Optional[[ContentEmbedding{values=Optional[[-0.035855383, 0.008127963, ...]]
      // statistics=Optional[ContentEmbeddingStatistics{truncated=Optional[false],
      // tokenCount=Optional[11.0]}]}]],
      // metadata=Optional[EmbedContentMetadata{billableCharacterCount=Optional[153]}]
      return response;
    }
  }
}
```

REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Yourproject ID.
- TEXT: The text that you want to generate embeddings for. Limit: five texts of up to 2,048 tokens per text for all models except textembedding-gecko@001. The max input token length for textembedding-gecko@001 is 3072. For gemini-embedding-001, each request can only include a single input text. For more information, see Text embedding limits.
- AUTO_TRUNCATE: If set to false, text that exceeds the token limit causes the request to fail. The default value is true.
HTTP method and URL:
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001:predict
Request JSON body:
{ "instances": [ { "content": "TEXT"} ], "parameters": { "autoTruncate":AUTO_TRUNCATE }}To send your request, choose one of these options:
curl
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001:predict"
PowerShell
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001:predict" | Select-Object -Expand Content
You should receive a JSON response similar to the following. Note that values has been truncated to save space.
Response
{ "predictions": [ { "embeddings": { "statistics": { "truncated": false, "token_count": 6 }, "values": [ ... ] } } ]}Example curl command
MODEL_ID="gemini-embedding-001"PROJECT_ID=PROJECT_IDcurl\-XPOST\-H"Authorization: Bearer $(gcloud auth print-access-token)"\-H"Content-Type: application/json"\https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/${MODEL_ID}:predict -d \$'{"instances":[{"content":"What is life?"}],}'Supported models
The following tables show the available Google and open text embedding models.
Google models
You can get text embeddings by using the following models:
| Model name | Description | Output Dimensions | Max sequence length | Supported text languages |
|---|---|---|---|---|
| gemini-embedding-001 | State-of-the-art performance across English, multilingual, and code tasks. It unifies the previously specialized models like text-embedding-005 and text-multilingual-embedding-002 and achieves better performance in their respective domains. Read our Tech Report for more detail. | Up to 3072 | 2048 tokens | Supported text languages |
| text-embedding-005 | Specialized in English and code tasks. | Up to 768 | 2048 tokens | English |
| text-multilingual-embedding-002 | Specialized in multilingual tasks. | Up to 768 | 2048 tokens | Supported text languages |
For superior embedding quality, gemini-embedding-001 is our large model designed to provide the highest performance.
Open models
You can get text embeddings by using the following models:
| Model name | Description | Output dimensions | Max sequence length | Supported text languages |
|---|---|---|---|---|
| multilingual-e5-small | Part of the E5 family of text embedding models. The small variant contains 12 layers. | Up to 384 | 512 tokens | Supported languages |
| multilingual-e5-large | Part of the E5 family of text embedding models. The large variant contains 24 layers. | Up to 1024 | 512 tokens | Supported languages |
To get started, see the E5 family model card. For more information on open models, see Open models for MaaS.
Add an embedding to a vector database
After you've generated your embeddings, you can add them to a vector database, like Vector Search. This enables low-latency retrieval, and is critical as the size of your data increases.

To learn more about Vector Search, see Overview of Vector Search.
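Before moving to a managed vector database, you can prototype retrieval with a brute-force scan over stored embeddings; this approach stops scaling as your corpus grows, which is where Vector Search comes in. A stdlib-only sketch with toy 2-dimensional unit-style vectors (document IDs and values are illustrative, not real embeddings):

```python
def top_k(query_vec, db, k=2):
    """Brute-force nearest-neighbor search by dot product.
    Fine for prototyping; a vector database replaces this at scale."""
    scored = sorted(
        db.items(),
        key=lambda item: sum(q * x for q, x in zip(query_vec, item[1])),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

# Toy "vector database": document ID -> stored embedding.
db = {
    "license-renewal": [0.9, 0.1],
    "study-guide": [0.2, 0.95],
    "road-rules": [0.3, 0.9],
}

print(top_k([0.85, 0.2], db, k=1))  # ['license-renewal']
```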
What's next
- To learn more about rate limits, see Generative AI on Vertex AI rate limits.
- To get batch predictions for embeddings, see Get batch text embeddings predictions.
- To learn more about multimodal embeddings, see Get multimodal embeddings.
- To tune an embedding, see Tune text embeddings.
- To learn more about the research behind text-embedding-005 and text-multilingual-embedding-002, see the research paper Gecko: Versatile Text Embeddings Distilled from Large Language Models.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.