Text embeddings API
The Text embeddings API converts textual data into numerical vectors. These vector representations are designed to capture the semantic meaning and context of the words they represent.
Supported Models:
You can get text embeddings by using the following models:
| Model name | Description | Output Dimensions | Max sequence length | Supported text languages |
|---|---|---|---|---|
| gemini-embedding-001 | State-of-the-art performance across English, multilingual, and code tasks. It unifies the previously specialized models text-embedding-005 and text-multilingual-embedding-002 and achieves better performance in their respective domains. Read our Tech Report for more details. | Up to 3072 | 2048 tokens | Supported text languages |
| text-embedding-005 | Specialized in English and code tasks. | Up to 768 | 2048 tokens | English |
| text-multilingual-embedding-002 | Specialized in multilingual tasks. | Up to 768 | 2048 tokens | Supported text languages |
For superior embedding quality, gemini-embedding-001 is our large model designed to provide the highest performance.
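The table above lists each model's maximum output dimensions; the `outputDimensionality` parameter described later on this page truncates the returned vector to a smaller size. As a rough, illustrative sketch (plain Python, no SDK; the vectors are made up), truncating an embedding changes its length, so cosine-style comparisons generally call for re-normalizing the shortened vector:

```python
import math

def truncate_and_normalize(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components, then rescale to unit length.

    Truncating an embedding changes its norm, so cosine-style
    comparisons need the vector re-normalized afterwards.
    """
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# A made-up 6-dimensional "embedding", truncated to 3 dimensions.
full = [0.6, 0.8, 0.0, 0.0, 0.0, 0.0]
short = truncate_and_normalize(full, 3)
```

This is a client-side illustration of the idea only; when you pass `outputDimensionality` in a request, the service returns the smaller vector directly.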
Syntax
curl
```shell
PROJECT_ID=PROJECT_ID
REGION=us-central1
MODEL_ID=MODEL_ID

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:predict -d \
  '{
  "instances": [
    ...
  ],
  "parameters": {
    ...
  }
}'
```
Python
```python
PROJECT_ID = "PROJECT_ID"
REGION = "us-central1"
MODEL_ID = "MODEL_ID"

import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project=PROJECT_ID, location=REGION)

model = TextEmbeddingModel.from_pretrained(MODEL_ID)
embeddings = model.get_embeddings(...)
```
Parameter list
| Top-level fields | |
|---|---|
| `instances` | A list of objects containing the instance fields listed below. |
| `parameters` | An object containing the parameters fields listed below. |

| instance fields | |
|---|---|
| `content` | The text that you want to generate embeddings for. |
| `task_type` | Optional: Used to convey the intended downstream application to help the model produce better embeddings. If left blank, a default task type is used. For more information about task types, see Choose an embeddings task type. |
| `title` | Optional: Used to help the model produce better embeddings. Only valid with `task_type=RETRIEVAL_DOCUMENT`. |
task_type
The following table describes the task_type parameter values and their use cases:
| task_type | Description |
|---|---|
| RETRIEVAL_QUERY | Specifies the given text is a query in a search or retrieval setting. Use RETRIEVAL_DOCUMENT for the document side. |
| RETRIEVAL_DOCUMENT | Specifies the given text is a document in a search or retrieval setting. |
| SEMANTIC_SIMILARITY | Specifies the given text is used for Semantic Textual Similarity (STS). |
| CLASSIFICATION | Specifies that the embedding is used for classification. |
| CLUSTERING | Specifies that the embedding is used for clustering. |
| QUESTION_ANSWERING | Specifies that the query embedding is used for answering questions. Use RETRIEVAL_DOCUMENT for the document side. |
| FACT_VERIFICATION | Specifies that the query embedding is used for fact verification. Use RETRIEVAL_DOCUMENT for the document side. |
| CODE_RETRIEVAL_QUERY | Specifies that the query embedding is used for code retrieval for Java and Python. Use RETRIEVAL_DOCUMENT for the document side. |
Retrieval Tasks:
- Query: Use `task_type=RETRIEVAL_QUERY` to indicate that the input text is a search query.
- Corpus: Use `task_type=RETRIEVAL_DOCUMENT` to indicate that the input text is part of the document collection being searched.
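To make the query/corpus split concrete, the sketch below ranks toy document vectors against a query vector by cosine similarity. In practice the query text would be embedded with `task_type=RETRIEVAL_QUERY` and each document with `task_type=RETRIEVAL_DOCUMENT`; the vectors here are hand-picked stand-ins, not real model output:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Stand-ins for embeddings: query_vec would come from a RETRIEVAL_QUERY
# call, doc_vecs from RETRIEVAL_DOCUMENT calls on the corpus.
query_vec = [1.0, 0.0, 0.5]
doc_vecs = {
    "doc_a": [0.9, 0.1, 0.4],  # close to the query
    "doc_b": [0.0, 1.0, 0.0],  # unrelated
}

# Rank documents by similarity to the query, highest first.
ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
```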
Similarity Tasks:
- Semantic similarity: Use `task_type=SEMANTIC_SIMILARITY` for both input texts to assess their overall meaning similarity.
SEMANTIC_SIMILARITY is not intended for retrieval use cases, such as document search and information retrieval. For these use cases, use RETRIEVAL_DOCUMENT, RETRIEVAL_QUERY, QUESTION_ANSWERING, and FACT_VERIFICATION.

| parameters fields | |
|---|---|
| `autoTruncate` | Optional: When set to true, input text is truncated. When set to false, an error is returned if the input text is longer than the maximum length supported by the model. Defaults to true. |
| `outputDimensionality` | Optional: Used to specify the output embedding size. If set, output embeddings are truncated to the specified size. |
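Putting the instance and parameter fields together, the following sketch assembles the JSON body for a `:predict` call. The field names match the request body shown on this page; the helper function itself is illustrative, not part of any SDK:

```python
import json

def build_predict_body(texts, task_type="RETRIEVAL_DOCUMENT",
                       auto_truncate=True, output_dimensionality=None):
    """Assemble a request body for the text embeddings :predict method."""
    body = {
        "instances": [{"content": t, "task_type": task_type} for t in texts],
        "parameters": {"autoTruncate": auto_truncate},
    }
    if output_dimensionality is not None:
        body["parameters"]["outputDimensionality"] = output_dimensionality
    return json.dumps(body)

payload = build_predict_body(
    ["I would like embeddings for this text!"],
    output_dimensionality=768,
)
```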
Request body
```
{
  "instances": [
    {
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "document title",
      "content": "I would like embeddings for this text!"
    },
  ]
}
```

Response body
```
{
  "predictions": [
    {
      "embeddings": {
        "statistics": {
          "truncated": boolean,
          "token_count": integer
        },
        "values": [number]
      }
    }
  ]
}
```

| Response elements | |
|---|---|
| `predictions` | A list of objects, one per input instance, each containing the embeddings fields listed below. |

| embeddings fields | |
|---|---|
| `values` | A list of numbers: the embedding vector generated for the input text. |
| `statistics` | The statistics computed from the input text. Contains `truncated` (whether the input text was truncated) and `token_count` (the number of tokens in the input text). |
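As a short sketch of pulling the vector and its statistics out of a parsed response, assuming the JSON structure described above (the payload here is a hand-written stand-in for a real response):

```python
# A hand-written stand-in for a real :predict response.
response = {
    "predictions": [
        {
            "embeddings": {
                "statistics": {"truncated": False, "token_count": 4},
                "values": [0.0058, 0.0118, 0.0322],
            }
        }
    ]
}

# One prediction per input instance; the vector lives under "values".
embedding = response["predictions"][0]["embeddings"]
vector = embedding["values"]
stats = embedding["statistics"]
```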
Sample response
```
{
  "predictions": [
    {
      "embeddings": {
        "values": [
          0.0058424929156899452,
          0.011848051100969315,
          0.032247550785541534,
          -0.031829461455345154,
          -0.055369812995195389,
          ...
        ],
        "statistics": {
          "token_count": 4,
          "truncated": false
        }
      }
    }
  ]
}
```

Examples
Embed a text string
The following example shows how to obtain the embedding of a text string.
REST
After you set up your environment, you can use REST to test a text prompt. The following sample sends a request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- TEXT: The text that you want to generate embeddings for. Limit: five texts of up to 2,048 tokens per text for all models except textembedding-gecko@001. The max input token length for textembedding-gecko@001 is 3072. For gemini-embedding-001, each request can include only a single input text. For more information, see Text embedding limits.
- AUTO_TRUNCATE: If set to false, text that exceeds the token limit causes the request to fail. The default value is true.
HTTP method and URL:
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001:predict
Request JSON body:
```
{
  "instances": [
    { "content": "TEXT" }
  ],
  "parameters": {
    "autoTruncate": AUTO_TRUNCATE
  }
}
```

To send your request, choose one of these options:
curl
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running `gcloud init` or `gcloud auth login`, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running `gcloud auth list`.

Save the request body in a file named `request.json`, and execute the following command:
```shell
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001:predict"
```
PowerShell
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running `gcloud init` or `gcloud auth login`. You can check the currently active account by running `gcloud auth list`.

Save the request body in a file named `request.json`, and execute the following command:
```powershell
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
  -Method POST `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -InFile request.json `
  -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001:predict" | Select-Object -Expand Content
```
You should receive a JSON response similar to the following. Note that `values` has been truncated to save space.
Response
```
{
  "predictions": [
    {
      "embeddings": {
        "statistics": {
          "truncated": false,
          "token_count": 6
        },
        "values": [ ... ]
      }
    }
  ]
}
```

- When you use a regional API endpoint (for example, us-central1), the region from the endpoint URL determines where the request is processed. Any conflicting location in the resource path is ignored.
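Because the region in the endpoint hostname is what determines where the request is processed, it helps to derive both the hostname and the resource path from the same region variable. A small sketch (the helper function is illustrative, not part of any SDK):

```python
def predict_url(project: str, region: str, model: str) -> str:
    """Build the regional :predict URL, keeping the hostname region
    and the resource-path location consistent."""
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{region}/"
        f"publishers/google/models/{model}:predict"
    )

url = predict_url("my-project", "us-central1", "gemini-embedding-001")
```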
Python
To learn how to install or update the Vertex AI SDK for Python, seeInstall the Vertex AI SDK for Python. For more information, see thePython API reference documentation.
```python
from __future__ import annotations

from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel


def embed_text() -> list[list[float]]:
    """Embeds texts with a pre-trained, foundational model.

    Returns:
        A list of lists containing the embedding vectors for each input text
    """

    # A list of texts to be embedded.
    texts = ["banana muffins? ", "banana bread? banana muffins?"]
    # The dimensionality of the output embeddings.
    dimensionality = 3072
    # The task type for embedding. Check the available tasks in the model's documentation.
    task = "RETRIEVAL_DOCUMENT"

    model = TextEmbeddingModel.from_pretrained("gemini-embedding-001")
    kwargs = dict(output_dimensionality=dimensionality) if dimensionality else {}

    embeddings = []
    # gemini-embedding-001 takes one input at a time.
    for text in texts:
        text_input = TextEmbeddingInput(text, task)
        embedding = model.get_embeddings([text_input], **kwargs)
        print(embedding)
        # Example response:
        # [[0.006135190837085247, -0.01462465338408947, 0.004978656303137541, ...]]
        embeddings.append(embedding[0].values)

    return embeddings
```

Go
Before trying this sample, follow theGo setup instructions in theVertex AI quickstart using client libraries. For more information, see theVertex AIGo API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, seeSet up authentication for a local development environment.
```go
import (
	"context"
	"fmt"
	"io"

	aiplatform "cloud.google.com/go/aiplatform/apiv1"
	"cloud.google.com/go/aiplatform/apiv1/aiplatformpb"
	"google.golang.org/api/option"
	"google.golang.org/protobuf/types/known/structpb"
)

// embedTexts shows how embeddings are set for gemini-embedding-001 model
func embedTexts(w io.Writer, project, location string) error {
	// location := "us-central1"
	ctx := context.Background()

	apiEndpoint := fmt.Sprintf("%s-aiplatform.googleapis.com:443", location)
	dimensionality := 3072
	model := "gemini-embedding-001"
	texts := []string{"banana muffins? ", "banana bread? banana muffins?"}

	client, err := aiplatform.NewPredictionClient(ctx, option.WithEndpoint(apiEndpoint))
	if err != nil {
		return err
	}
	defer client.Close()

	endpoint := fmt.Sprintf("projects/%s/locations/%s/publishers/google/models/%s", project, location, model)
	allEmbeddings := make([][]float32, 0, len(texts))
	// gemini-embedding-001 takes 1 input at a time
	for _, text := range texts {
		instances := make([]*structpb.Value, 1)
		instances[0] = structpb.NewStructValue(&structpb.Struct{
			Fields: map[string]*structpb.Value{
				"content":   structpb.NewStringValue(text),
				"task_type": structpb.NewStringValue("QUESTION_ANSWERING"),
			},
		})
		params := structpb.NewStructValue(&structpb.Struct{
			Fields: map[string]*structpb.Value{
				"outputDimensionality": structpb.NewNumberValue(float64(dimensionality)),
			},
		})
		req := &aiplatformpb.PredictRequest{
			Endpoint:   endpoint,
			Instances:  instances,
			Parameters: params,
		}
		resp, err := client.Predict(ctx, req)
		if err != nil {
			return err
		}

		// Process the prediction for the single text.
		// The response will contain one prediction because we sent one instance.
		if len(resp.Predictions) == 0 {
			return fmt.Errorf("no predictions returned for text \"%s\"", text)
		}
		prediction := resp.Predictions[0]
		embeddingValues := prediction.GetStructValue().Fields["embeddings"].GetStructValue().Fields["values"].GetListValue().Values
		currentEmbedding := make([]float32, len(embeddingValues))
		for j, value := range embeddingValues {
			currentEmbedding[j] = float32(value.GetNumberValue())
		}
		allEmbeddings = append(allEmbeddings, currentEmbedding)
	}

	if len(allEmbeddings) > 0 {
		fmt.Fprintf(w, "Dimensionality: %d. Embeddings length: %d", len(allEmbeddings[0]), len(allEmbeddings))
	} else {
		fmt.Fprintln(w, "No texts were processed.")
	}
	return nil
}
```

Java
Before trying this sample, follow theJava setup instructions in theVertex AI quickstart using client libraries. For more information, see theVertex AIJava API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, seeSet up authentication for a local development environment.
```java
import static java.util.stream.Collectors.toList;

import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PredictTextEmbeddingsSample {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    // Details about text embedding request structure and supported models are available in:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings
    String endpoint = "us-central1-aiplatform.googleapis.com:443";
    String project = "YOUR_PROJECT_ID";
    String model = "gemini-embedding-001";
    predictTextEmbeddings(
        endpoint,
        project,
        model,
        List.of("banana bread?", "banana muffins?"),
        "QUESTION_ANSWERING",
        OptionalInt.of(3072));
  }

  // Gets text embeddings from a pretrained, foundational model.
  public static List<List<Float>> predictTextEmbeddings(
      String endpoint,
      String project,
      String model,
      List<String> texts,
      String task,
      OptionalInt outputDimensionality)
      throws IOException {
    PredictionServiceSettings settings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();
    Matcher matcher = Pattern.compile("^(?<Location>\\w+-\\w+)").matcher(endpoint);
    String location = matcher.matches() ? matcher.group("Location") : "us-central1";
    EndpointName endpointName =
        EndpointName.ofProjectLocationPublisherModelName(project, location, "google", model);

    List<List<Float>> floats = new ArrayList<>();
    // You can use this prediction service client for multiple requests.
    try (PredictionServiceClient client = PredictionServiceClient.create(settings)) {
      // gemini-embedding-001 takes one input at a time.
      for (int i = 0; i < texts.size(); i++) {
        PredictRequest.Builder request =
            PredictRequest.newBuilder().setEndpoint(endpointName.toString());
        if (outputDimensionality.isPresent()) {
          request.setParameters(
              Value.newBuilder()
                  .setStructValue(
                      Struct.newBuilder()
                          .putFields(
                              "outputDimensionality", valueOf(outputDimensionality.getAsInt()))
                          .build()));
        }
        request.addInstances(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("content", valueOf(texts.get(i)))
                        .putFields("task_type", valueOf(task))
                        .build()));
        PredictResponse response = client.predict(request.build());
        for (Value prediction : response.getPredictionsList()) {
          Value embeddings = prediction.getStructValue().getFieldsOrThrow("embeddings");
          Value values = embeddings.getStructValue().getFieldsOrThrow("values");
          floats.add(
              values.getListValue().getValuesList().stream()
                  .map(Value::getNumberValue)
                  .map(Double::floatValue)
                  .collect(toList()));
        }
      }
      return floats;
    }
  }

  private static Value valueOf(String s) {
    return Value.newBuilder().setStringValue(s).build();
  }

  private static Value valueOf(int n) {
    return Value.newBuilder().setNumberValue(n).build();
  }
}
```

Node.js
Before trying this sample, follow theNode.js setup instructions in theVertex AI quickstart using client libraries. For more information, see theVertex AINode.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, seeSet up authentication for a local development environment.
```javascript
async function main(
  project,
  model = 'gemini-embedding-001',
  texts = 'banana bread?;banana muffins?',
  task = 'QUESTION_ANSWERING',
  dimensionality = 0,
  apiEndpoint = 'us-central1-aiplatform.googleapis.com'
) {
  const aiplatform = require('@google-cloud/aiplatform');
  const {PredictionServiceClient} = aiplatform.v1;
  const {helpers} = aiplatform; // helps construct protobuf.Value objects.
  const clientOptions = {apiEndpoint: apiEndpoint};
  const location = 'us-central1';
  const endpoint = `projects/${project}/locations/${location}/publishers/google/models/${model}`;

  async function callPredict() {
    const instances = texts
      .split(';')
      .map(e => helpers.toValue({content: e, task_type: task}));
    const client = new PredictionServiceClient(clientOptions);
    const parameters = helpers.toValue(
      dimensionality > 0 ? {outputDimensionality: parseInt(dimensionality)} : {}
    );
    const allEmbeddings = [];
    // gemini-embedding-001 takes one input at a time.
    for (const instance of instances) {
      const request = {endpoint, instances: [instance], parameters};
      const [response] = await client.predict(request);
      const predictions = response.predictions;
      const embeddings = predictions.map(p => {
        const embeddingsProto = p.structValue.fields.embeddings;
        const valuesProto = embeddingsProto.structValue.fields.values;
        return valuesProto.listValue.values.map(v => v.numberValue);
      });
      allEmbeddings.push(embeddings[0]);
    }
    console.log('Got embeddings: \n' + JSON.stringify(allEmbeddings));
  }

  callPredict();
}
```

Supported text languages
All text embedding models support and have been evaluated on English-language text. The text-multilingual-embedding-002 model additionally supports and has been evaluated on the following languages:
- Evaluated languages: Arabic (ar), Bengali (bn), English (en), Spanish (es), German (de), Persian (fa), Finnish (fi), French (fr), Hindi (hi), Indonesian (id), Japanese (ja), Korean (ko), Russian (ru), Swahili (sw), Telugu (te), Thai (th), Yoruba (yo), Chinese (zh)
- Supported languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu.
The gemini-embedding-001 model supports the following languages:

Arabic, Bengali, Bulgarian, Chinese (Simplified and Traditional), Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Thai, Turkish, Ukrainian, Vietnamese, Afrikaans, Amharic, Assamese, Azerbaijani, Belarusian, Bosnian, Catalan, Cebuano, Corsican, Welsh, Dhivehi, Esperanto, Basque, Persian, Filipino (Tagalog), Frisian, Irish, Scots Gaelic, Galician, Gujarati, Hausa, Hawaiian, Hmong, Haitian Creole, Armenian, Igbo, Icelandic, Javanese, Georgian, Kazakh, Khmer, Kannada, Krio, Kurdish, Kyrgyz, Latin, Luxembourgish, Lao, Malagasy, Maori, Macedonian, Malayalam, Mongolian, Meiteilon (Manipuri), Marathi, Malay, Maltese, Myanmar (Burmese), Nepali, Nyanja (Chichewa), Odia (Oriya), Punjabi, Pashto, Sindhi, Sinhala (Sinhalese), Samoan, Shona, Somali, Albanian, Sesotho, Sundanese, Tamil, Telugu, Tajik, Uyghur, Urdu, Uzbek, Xhosa, Yiddish, Yoruba, Zulu.
Model versions
To use a current stable model, specify the model version number, for example gemini-embedding-001. Specifying a model without a version number isn't recommended, because it is merely a legacy pointer to another model and isn't stable.
For more information, seeModel versions and lifecycle.
What's next
For detailed documentation, see the following:
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.