Search with vector embeddings

This page shows you how to use Firestore to perform K-nearest neighbor (KNN) vector searches using the following techniques:

  • Store vector values
  • Create and manage KNN vector indexes
  • Make a K-nearest neighbor (KNN) query using one of the supported vector distance measures

Before you begin

Before you store embeddings in Firestore, you must generate vector embeddings. Firestore does not generate the embeddings. You can use a service such as Vertex AI to create vector values, for example, text embeddings from your Firestore data. You can then store these embeddings back in Firestore documents.

To learn more about embeddings, see What are embeddings?

To learn how to get text embeddings with Vertex AI, see Get text embeddings.

Store vector embeddings

The following examples demonstrate how to store vector embeddings in Firestore.

Write operation with a vector embedding

The following example shows how to store a vector embedding in a Firestore document:

Python
from google.cloud import firestore
from google.cloud.firestore_v1.vector import Vector

firestore_client = firestore.Client()
collection = firestore_client.collection("coffee-beans")
doc = {
    "name": "Kahawa coffee beans",
    "description": "Information about the Kahawa coffee beans.",
    "embedding_field": Vector([0.18332680, 0.24160706, 0.3416704]),
}

collection.add(doc)
Node.js
import {
  Firestore,
  FieldValue,
} from "@google-cloud/firestore";

const db = new Firestore();
const coll = db.collection('coffee-beans');
await coll.add({
  name: "Kahawa coffee beans",
  description: "Information about the Kahawa coffee beans.",
  embedding_field: FieldValue.vector([1.0, 2.0, 3.0])
});
Go
import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

type CoffeeBean struct {
	Name           string             `firestore:"name,omitempty"`
	Description    string             `firestore:"description,omitempty"`
	EmbeddingField firestore.Vector32 `firestore:"embedding_field,omitempty"`
	Color          string             `firestore:"color,omitempty"`
}

func storeVectors(w io.Writer, projectID string) error {
	ctx := context.Background()

	// Create client
	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	// Vector can be represented by Vector32 or Vector64
	doc := CoffeeBean{
		Name:           "Kahawa coffee beans",
		Description:    "Information about the Kahawa coffee beans.",
		EmbeddingField: []float32{1.0, 2.0, 3.0},
		Color:          "red",
	}
	ref := client.Collection("coffee-beans").NewDoc()
	if _, err = ref.Set(ctx, doc); err != nil {
		fmt.Fprintf(w, "failed to upsert: %v", err)
		return err
	}

	return nil
}
Java
import com.google.api.core.ApiFuture;
import com.google.cloud.firestore.CollectionReference;
import com.google.cloud.firestore.DocumentReference;
import com.google.cloud.firestore.FieldValue;
import com.google.cloud.firestore.VectorQuery;
import java.util.HashMap;
import java.util.Map;

// Assumes `firestore` is an initialized Firestore client.
CollectionReference coll = firestore.collection("coffee-beans");

Map<String, Object> docData = new HashMap<>();
docData.put("name", "Kahawa coffee beans");
docData.put("description", "Information about the Kahawa coffee beans.");
docData.put("embedding_field", FieldValue.vector(new double[] {1.0, 2.0, 3.0}));

ApiFuture<DocumentReference> future = coll.add(docData);
DocumentReference documentReference = future.get();

Compute vector embeddings with a Cloud Function

To calculate and store vector embeddings whenever a document is updated or created, you can set up a Cloud Run function:

Python
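# Assumes `firestore` is the google.events.cloud.firestore module,
# `firestore_client` is a google.cloud.firestore.Client, and that
# from_payload() and calculate_embedding() are helpers you define.
@functions_framework.cloud_event
def store_embedding(cloud_event) -> None:
    """Triggered by a change to a Firestore document."""
    # ParseFromString() populates firestore_payload in place.
    firestore_payload = firestore.DocumentEventData()
    firestore_payload._pb.ParseFromString(cloud_event.data)

    collection_id, doc_id = from_payload(firestore_payload)
    # Call a function to calculate the embedding
    embedding = calculate_embedding(firestore_payload)
    # Update the document
    doc = firestore_client.collection(collection_id).document(doc_id)
    doc.set({"embedding_field": embedding}, merge=True)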
Node.js
/**
 * A vector embedding will be computed from the
 * value of the `content` field. The vector value
 * will be stored in the `embedding` field. The
 * field names `content` and `embedding` are arbitrary
 * field names chosen for this example.
 */
async function storeEmbedding(event: FirestoreEvent<any>): Promise<void> {
  // Get the previous value of the document's `content` field.
  const previousDocumentSnapshot = event.data.before as QueryDocumentSnapshot;
  const previousContent = previousDocumentSnapshot.get("content");

  // Get the current value of the document's `content` field.
  const currentDocumentSnapshot = event.data.after as QueryDocumentSnapshot;
  const currentContent = currentDocumentSnapshot.get("content");

  // Don't update the embedding if the content field did not change
  if (previousContent === currentContent) {
    return;
  }

  // Call a function to calculate the embedding for the value
  // of the `content` field.
  const embeddingVector = calculateEmbedding(currentContent);

  // Update the `embedding` field on the document.
  await currentDocumentSnapshot.ref.update({
    embedding: embeddingVector,
  });
}
Go
// Not yet supported in the Go client library
Java
// Not yet supported in the Java client library
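
The Node.js sample above skips the write when the `content` field is unchanged, which also avoids recomputing the embedding when the function's own write triggers it again. A minimal Python sketch of that check follows. The helper name `should_recompute_embedding` is hypothetical, and plain dictionaries stand in for the before and after document data.

```python
from typing import Any, Mapping, Optional


def should_recompute_embedding(
    before: Optional[Mapping[str, Any]],
    after: Optional[Mapping[str, Any]],
    source_field: str = "content",
) -> bool:
    """Return True when the source field changed and a new embedding is needed."""
    if after is None:
        # The document was deleted, so there is nothing to embed.
        return False
    if before is None:
        # The document was just created.
        return source_field in after
    return before.get(source_field) != after.get(source_field)


# The function's own write only adds `embedding`, so `content` is unchanged.
before = {"content": "Kahawa coffee beans"}
after = {"content": "Kahawa coffee beans", "embedding": [1.0, 2.0, 3.0]}
print(should_recompute_embedding(before, after))  # False
```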

Create and manage vector indexes

Before you can perform a nearest neighbor search with your vector embeddings, you must create a corresponding index. The following examples demonstrate how to create and manage vector indexes with the Google Cloud CLI. Vector indexes can also be managed with the Firebase CLI and Terraform.

Create a vector index

Before you create a vector index, upgrade to the latest version of the Google Cloud CLI:

gcloud components update

To create a vector index, use gcloud firestore indexes composite create:

gcloud
gcloud firestore indexes composite create \
--collection-group=collection-group \
--query-scope=COLLECTION \
--field-config field-path=vector-field,vector-config='vector-configuration' \
--database=database-id

where:

  • collection-group is the ID of the collection group.
  • vector-field is the name of the field that contains the vector embedding.
  • database-id is the ID of the database.
  • vector-configuration includes the vector dimension and index type. The dimension is an integer up to 2048. The index type must be flat. Format the index configuration as follows: {"dimension":"DIMENSION", "flat": "{}"}.
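
As an illustration of the vector-configuration format, the following Python sketch builds the JSON string to pass as vector-config. The helper name `build_vector_config` is hypothetical, and the lower bound of 1 on the dimension is an assumption; only the upper bound of 2048 comes from this page.

```python
import json

MAX_DIMENSION = 2048  # Upper bound stated on this page.


def build_vector_config(dimension: int) -> str:
    """Return the vector-config JSON string for a flat index."""
    # The lower bound of 1 is an assumption, not a documented limit.
    if not 1 <= dimension <= MAX_DIMENSION:
        raise ValueError(f"dimension must be between 1 and {MAX_DIMENSION}")
    # Mirrors the documented format: {"dimension":"DIMENSION", "flat": "{}"}
    return json.dumps({"dimension": str(dimension), "flat": "{}"})


print(build_vector_config(1024))
# {"dimension": "1024", "flat": "{}"}
```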

The following example creates a composite index, including a vector index for field vector-field and an ascending index for field color. You can use this type of index to pre-filter data before a nearest neighbor search.

gcloud
gcloud firestore indexes composite create \
--collection-group=collection-group \
--query-scope=COLLECTION \
--field-config=order=ASCENDING,field-path="color" \
--field-config field-path=vector-field,vector-config='{"dimension":"1024", "flat": "{}"}' \
--database=database-id

List all vector indexes

gcloud
gcloud firestore indexes composite list --database=database-id

Replace database-id with the ID of the database.

Delete a vector index

gcloud
gcloud firestore indexes composite delete index-id --database=database-id

where:

  • index-id is the ID of the index to delete. Use indexes composite list to retrieve the index ID.
  • database-id is the ID of the database.

Describe a vector index

gcloud
gcloud firestore indexes composite describe index-id --database=database-id

where:

  • index-id is the ID of the index to describe. Use indexes composite list to retrieve the index ID.
  • database-id is the ID of the database.

Make a nearest-neighbor query

You can perform a similarity search to find the nearest neighbors of a vector embedding. Similarity searches require vector indexes. If an index doesn't exist, Firestore suggests an index to create using the gcloud CLI.

The following example finds the nearest neighbors of the query vector. The limit argument sets how many documents are returned: 10 in the Node.js and Java samples and 5 in the Python and Go samples.

Python
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
from google.cloud.firestore_v1.vector import Vector

collection = db.collection("coffee-beans")

# Requires a single-field vector index
vector_query = collection.find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=5,
)
Node.js
import {
  Firestore,
  FieldValue,
  VectorQuery,
  VectorQuerySnapshot,
} from "@google-cloud/firestore";

// Requires a single-field vector index
const vectorQuery: VectorQuery = coll.findNearest({
  vectorField: 'embedding_field',
  queryVector: [3.0, 1.0, 2.0],
  limit: 10,
  distanceMeasure: 'EUCLIDEAN'
});

const vectorQuerySnapshot: VectorQuerySnapshot = await vectorQuery.get();
Go
import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchBasic(w io.Writer, projectID string) error {
	ctx := context.Background()

	// Create client
	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Requires a vector index
	// https://firebase.google.com/docs/firestore/vector-search#create_and_manage_vector_indexes
	vectorQuery := collection.FindNearest("embedding_field",
		[]float32{3.0, 1.0, 2.0},
		5,
		// More info: https://firebase.google.com/docs/firestore/vector-search#vector_distances
		firestore.DistanceMeasureEuclidean,
		nil)

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintln(w, doc.Data()["name"])
	}
	return nil
}
Java
import com.google.api.core.ApiFuture;
import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery vectorQuery = coll.findNearest(
    "embedding_field",
    new double[] {3.0, 1.0, 2.0},
    /* limit */ 10,
    VectorQuery.DistanceMeasure.EUCLIDEAN);

ApiFuture<VectorQuerySnapshot> future = vectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();
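
To make the result of a nearest-neighbor query concrete, the following Python sketch ranks a few in-memory documents by Euclidean distance to the query vector and keeps the closest ones. It is a conceptual illustration only and makes no claim about how Firestore executes the query. The function `find_nearest_in_memory` and the second and third sample documents are hypothetical.

```python
import math
from typing import Any, Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_nearest_in_memory(
    docs: List[Dict[str, Any]],
    vector_field: str,
    query_vector: Sequence[float],
    limit: int,
) -> List[Dict[str, Any]]:
    """Return the `limit` documents closest to `query_vector`."""
    ranked = sorted(docs, key=lambda d: euclidean(d[vector_field], query_vector))
    return ranked[:limit]


docs = [
    {"name": "Kahawa coffee beans", "embedding_field": [1.0, 2.0, 3.0]},
    {"name": "Sleepy coffee beans", "embedding_field": [4.0, 0.0, 2.0]},
    {"name": "Owl coffee beans", "embedding_field": [3.0, 1.0, 2.5]},
]
nearest = find_nearest_in_memory(docs, "embedding_field", [3.0, 1.0, 2.0], limit=2)
print([d["name"] for d in nearest])
# ['Owl coffee beans', 'Sleepy coffee beans']
```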

Vector distances

Nearest-neighbor queries support the following options for vector distance:

  • EUCLIDEAN: Measures the EUCLIDEAN distance between the vectors. To learn more, see Euclidean.
  • COSINE: Compares vectors based on the angle between them, which lets you measure similarity that isn't based on the vectors' magnitude. Instead of COSINE distance, we recommend using DOT_PRODUCT with unit-normalized vectors, which is mathematically equivalent and performs better. To learn more, see Cosine similarity.
  • DOT_PRODUCT: Similar to COSINE but is affected by the magnitude of the vectors. To learn more, see Dot product.
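
The three measures can be sketched in plain Python using their standard textbook definitions. Treating cosine distance as 1 minus cosine similarity is an assumption here; it is consistent with the 0 to 2 range this page gives for COSINE. The function names are illustrative.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; grows with vector magnitude."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 minus cosine similarity, so it depends only on the angle (assumed form)."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / (norm_a * norm_b)


a = [3.0, 4.0]
b = [4.0, 3.0]
print(round(euclidean_distance(a, b), 4))  # 1.4142
print(round(cosine_distance(a, b), 4))     # 0.04
print(dot_product(a, b))                   # 24.0
```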

Choose the distance measure

The distance measure to choose depends on whether all of your vector embeddings are normalized. A normalized vector embedding has a magnitude (length) of exactly 1.0.

In addition, if you know which distance measure your model was trained with, use that distance measure to compute the distance between your vector embeddings.
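
One way to check whether an embedding is normalized, and to normalize it if it isn't, is sketched below in Python. The helper names are hypothetical, and the small tolerance is an assumption added to absorb floating-point rounding.

```python
import math
from typing import List, Sequence


def magnitude(v: Sequence[float]) -> float:
    """Euclidean length of the vector."""
    return math.sqrt(sum(x * x for x in v))


def is_normalized(v: Sequence[float], tol: float = 1e-6) -> bool:
    """True when the length is 1.0 within a small tolerance (assumed)."""
    return math.isclose(magnitude(v), 1.0, abs_tol=tol)


def normalize(v: Sequence[float]) -> List[float]:
    """Scale the vector to unit length."""
    length = magnitude(v)
    if length == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in v]


raw = [3.0, 4.0]
unit = normalize(raw)
print(is_normalized(raw), is_normalized(unit))  # False True
```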

Normalized data

If you have a dataset where all vector embeddings are normalized, then all three distance measures provide the same semantic search results. In essence, although each distance measure returns a different value, those values sort the same way. When embeddings are normalized, DOT_PRODUCT is usually the most computationally efficient, but the difference is negligible in most cases. However, if your application is highly performance sensitive, DOT_PRODUCT might help with performance tuning.
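
The claim that the values sort the same way can be checked on a small example. The Python sketch below ranks four unit-length vectors against a unit-length query with each measure. The vectors and labels are made up for illustration, and cosine distance is assumed to be 1 minus cosine similarity.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Every vector below has length 1.
query = [1.0, 0.0]
docs = {"a": [0.6, 0.8], "b": [0.8, 0.6], "c": [0.0, 1.0], "d": [-1.0, 0.0]}

by_euclidean = sorted(docs, key=lambda k: euclidean(query, docs[k]))
by_cosine = sorted(docs, key=lambda k: cosine_distance(query, docs[k]))
# A larger dot product means more similar, so sort in descending order.
by_dot = sorted(docs, key=lambda k: dot(query, docs[k]), reverse=True)

print(by_euclidean, by_cosine, by_dot)
# ['b', 'a', 'c', 'd'] ['b', 'a', 'c', 'd'] ['b', 'a', 'c', 'd']
```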

Non-normalized data

If you have a dataset where vector embeddings aren't normalized, then it's not mathematically correct to use DOT_PRODUCT as a distance measure because dot product doesn't measure distance. Depending on how the embeddings were generated and what type of search is preferred, either the COSINE or EUCLIDEAN distance measure produces search results that are subjectively better than the other distance measures. Experimentation with either COSINE or EUCLIDEAN might be necessary to determine which is best for your use case.

Unsure if data is normalized or non-normalized

If you're unsure whether or not your data is normalized and you want to use DOT_PRODUCT, we recommend that you use COSINE instead. COSINE is like DOT_PRODUCT with normalization built in. Distance measured using COSINE ranges from 0 to 2. A result that is close to 0 indicates the vectors are very similar.
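
The 0 to 2 range can be seen from three extreme cases. The sketch below assumes cosine distance is computed as 1 minus cosine similarity, which matches the stated range but is not confirmed by this page; the vectors deliberately have different magnitudes to show that magnitude doesn't affect the result.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 minus cosine similarity (assumed form); ignores vector magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


same_direction = cosine_distance([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 3.0])
opposite = cosine_distance([1.0, 2.0], [-1.0, -2.0])

# Approximately 0, 1, and 2, up to floating-point rounding.
print(same_direction, orthogonal, opposite)
```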

Pre-filter documents

To pre-filter documents before finding the nearest neighbors, you can combine a similarity search with other query operators. The and and or composite filters are supported. For more information about supported field filters, see Query operators.

Python
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
from google.cloud.firestore_v1.vector import Vector

collection = db.collection("coffee-beans")

# Similarity search with pre-filter
# Requires a composite vector index
vector_query = collection.where("color", "==", "red").find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=5,
)
Node.js
// Similarity search with pre-filter
// Requires composite vector index
const preFilteredVectorQuery: VectorQuery = coll
  .where("color", "==", "red")
  .findNearest({
    vectorField: "embedding_field",
    queryVector: [3.0, 1.0, 2.0],
    limit: 5,
    distanceMeasure: "EUCLIDEAN",
  });

const vectorQueryResults = await preFilteredVectorQuery.get();
Go
import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchPrefilter(w io.Writer, projectID string) error {
	ctx := context.Background()

	// Create client
	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Similarity search with pre-filter
	// Requires a composite vector index
	vectorQuery := collection.Where("color", "==", "red").
		FindNearest("embedding_field",
			[]float32{3.0, 1.0, 2.0},
			5,
			// More info: https://firebase.google.com/docs/firestore/vector-search#vector_distances
			firestore.DistanceMeasureEuclidean,
			nil)

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintln(w, doc.Data()["name"])
	}
	return nil
}
Java
import com.google.api.core.ApiFuture;
import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery preFilteredVectorQuery = coll
    .whereEqualTo("color", "red")
    .findNearest(
        "embedding_field",
        new double[] {3.0, 1.0, 2.0},
        /* limit */ 10,
        VectorQuery.DistanceMeasure.EUCLIDEAN);

ApiFuture<VectorQuerySnapshot> future = preFilteredVectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();
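
The order of operations matters: a pre-filter narrows the candidates before the nearest neighbors are chosen, so the limit applies to matching documents only. The in-memory Python sketch below contrasts that with filtering after the nearest neighbors are chosen. It is a conceptual illustration with made-up documents and a hypothetical `nearest` helper, not a description of Firestore internals.

```python
import math
from typing import Any, Callable, Dict, List, Optional, Sequence

Doc = Dict[str, Any]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(
    docs: List[Doc],
    query: Sequence[float],
    limit: int,
    predicate: Optional[Callable[[Doc], bool]] = None,
) -> List[Doc]:
    """Apply the optional filter first, then keep the `limit` closest documents."""
    candidates = [d for d in docs if predicate is None or predicate(d)]
    candidates.sort(key=lambda d: euclidean(d["embedding_field"], query))
    return candidates[:limit]


def is_red(doc: Doc) -> bool:
    return doc["color"] == "red"


docs = [
    {"name": "A", "color": "green", "embedding_field": [3.0, 1.0, 2.0]},
    {"name": "B", "color": "red", "embedding_field": [3.0, 1.0, 2.5]},
    {"name": "C", "color": "green", "embedding_field": [3.0, 1.75, 2.0]},
    {"name": "D", "color": "red", "embedding_field": [0.0, 0.0, 0.0]},
]
query = [3.0, 1.0, 2.0]

pre_filtered = nearest(docs, query, limit=2, predicate=is_red)
post_filtered = [d for d in nearest(docs, query, limit=2) if is_red(d)]
print([d["name"] for d in pre_filtered])   # ['B', 'D']
print([d["name"] for d in post_filtered])  # ['B']
```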

Retrieve the calculated vector distance

You can retrieve the calculated vector distance by assigning a distance_result_field output property name on the FindNearest query, as shown in the following example:

Python
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
from google.cloud.firestore_v1.vector import Vector

collection = db.collection("coffee-beans")

vector_query = collection.find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=10,
    distance_result_field="vector_distance",
)

docs = vector_query.stream()

for doc in docs:
    print(f"{doc.id}, Distance: {doc.get('vector_distance')}")
Node.js
const vectorQuery: VectorQuery = coll.findNearest({
  vectorField: 'embedding_field',
  queryVector: [3.0, 1.0, 2.0],
  limit: 10,
  distanceMeasure: 'EUCLIDEAN',
  distanceResultField: 'vector_distance'
});

const snapshot: VectorQuerySnapshot = await vectorQuery.get();

snapshot.forEach((doc) => {
  console.log(doc.id, ' Distance: ', doc.get('vector_distance'));
});
Go
import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchDistanceResultField(w io.Writer, projectID string) error {
	ctx := context.Background()

	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Requires a vector index
	// https://firebase.google.com/docs/firestore/vector-search#create_and_manage_vector_indexes
	vectorQuery := collection.FindNearest("embedding_field",
		[]float32{3.0, 1.0, 2.0},
		10,
		firestore.DistanceMeasureEuclidean,
		&firestore.FindNearestOptions{
			DistanceResultField: "vector_distance",
		})

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintf(w, "%v, Distance: %v\n", doc.Data()["name"], doc.Data()["vector_distance"])
	}
	return nil
}
Java
import com.google.api.core.ApiFuture;
import com.google.cloud.firestore.DocumentSnapshot;
import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQueryOptions;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery vectorQuery = coll.findNearest(
    "embedding_field",
    new double[] {3.0, 1.0, 2.0},
    /* limit */ 10,
    VectorQuery.DistanceMeasure.EUCLIDEAN,
    VectorQueryOptions.newBuilder().setDistanceResultField("vector_distance").build());

ApiFuture<VectorQuerySnapshot> future = vectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();

for (DocumentSnapshot document : vectorQuerySnapshot.getDocuments()) {
  System.out.println(document.getId() + " Distance: " + document.get("vector_distance"));
}

If you want to use a field mask to return a subset of document fields along with a distanceResultField, then you must also include the value of distanceResultField in the field mask, as shown in the following example:

Python
vector_query = collection.select(["color", "vector_distance"]).find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=10,
    distance_result_field="vector_distance",
)
Node.js
const vectorQuery: VectorQuery = coll
  .select('name', 'description', 'vector_distance')
  .findNearest({
    vectorField: 'embedding_field',
    queryVector: [3.0, 1.0, 2.0],
    limit: 10,
    distanceMeasure: 'EUCLIDEAN',
    distanceResultField: 'vector_distance'
  });
Go
import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchDistanceResultFieldMasked(w io.Writer, projectID string) error {
	ctx := context.Background()

	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Requires a vector index
	// https://firebase.google.com/docs/firestore/vector-search#create_and_manage_vector_indexes
	vectorQuery := collection.Select("color", "vector_distance").
		FindNearest("embedding_field",
			[]float32{3.0, 1.0, 2.0},
			10,
			firestore.DistanceMeasureEuclidean,
			&firestore.FindNearestOptions{
				DistanceResultField: "vector_distance",
			})

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintf(w, "%v, Distance: %v\n", doc.Data()["color"], doc.Data()["vector_distance"])
	}
	return nil
}
Java
import com.google.api.core.ApiFuture;
import com.google.cloud.firestore.DocumentSnapshot;
import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQueryOptions;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery vectorQuery = coll
    .select("name", "description", "vector_distance")
    .findNearest(
        "embedding_field",
        new double[] {3.0, 1.0, 2.0},
        /* limit */ 10,
        VectorQuery.DistanceMeasure.EUCLIDEAN,
        VectorQueryOptions.newBuilder().setDistanceResultField("vector_distance").build());

ApiFuture<VectorQuerySnapshot> future = vectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();

for (DocumentSnapshot document : vectorQuerySnapshot.getDocuments()) {
  System.out.println(document.getId() + " Distance: " + document.get("vector_distance"));
}
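
A small guard can keep the field mask and the distance result field in sync. The Python helper below is a hypothetical convenience, not part of any client library: it returns the list of fields to select, appending the distance result field when it is missing.

```python
from typing import Iterable, List


def mask_with_distance(fields: Iterable[str], distance_result_field: str) -> List[str]:
    """Return the field mask, guaranteed to include the distance result field."""
    mask = list(fields)
    if distance_result_field not in mask:
        mask.append(distance_result_field)
    return mask


print(mask_with_distance(["name", "description"], "vector_distance"))
# ['name', 'description', 'vector_distance']
```

The returned list could then be passed to the select call that precedes the nearest-neighbor query.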

Specify a distance threshold

You can specify a similarity threshold that returns only documents within the threshold. The behavior of the threshold field depends on the distance measure you choose:

  • With EUCLIDEAN and COSINE, the query returns only documents whose distance is less than or equal to the specified threshold. These distance measures decrease as the vectors become more similar.
  • With DOT_PRODUCT, the query returns only documents whose distance is greater than or equal to the specified threshold. Dot product distances increase as the vectors become more similar.
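
The direction of the comparison can be summarized in a few lines of Python. The function `within_threshold` is a hypothetical helper that mirrors the two rules above; the measure names are passed as plain strings for illustration.

```python
def within_threshold(distance: float, threshold: float, measure: str) -> bool:
    """Return True when a result at `distance` satisfies the threshold."""
    if measure in ("EUCLIDEAN", "COSINE"):
        # Smaller values mean more similar, so keep results at or below the threshold.
        return distance <= threshold
    if measure == "DOT_PRODUCT":
        # Larger values mean more similar, so keep results at or above the threshold.
        return distance >= threshold
    raise ValueError(f"unknown distance measure: {measure}")


print(within_threshold(3.2, 4.5, "EUCLIDEAN"))    # True
print(within_threshold(5.0, 4.5, "EUCLIDEAN"))    # False
print(within_threshold(0.9, 0.8, "DOT_PRODUCT"))  # True
```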

The following example shows how to specify a distance threshold to return up to 10 nearest documents that are, at most, 4.5 units away using the EUCLIDEAN distance metric:

Python
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
from google.cloud.firestore_v1.vector import Vector

collection = db.collection("coffee-beans")

vector_query = collection.find_nearest(
    vector_field="embedding_field",
    query_vector=Vector([0.3416704, 0.18332680, 0.24160706]),
    distance_measure=DistanceMeasure.EUCLIDEAN,
    limit=10,
    distance_threshold=4.5,
)

docs = vector_query.stream()

for doc in docs:
    print(f"{doc.id}")
Node.js
const vectorQuery: VectorQuery = coll.findNearest({
  vectorField: 'embedding_field',
  queryVector: [3.0, 1.0, 2.0],
  limit: 10,
  distanceMeasure: 'EUCLIDEAN',
  distanceThreshold: 4.5
});

const snapshot: VectorQuerySnapshot = await vectorQuery.get();

snapshot.forEach((doc) => {
  console.log(doc.id);
});
Go
import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/firestore"
)

func vectorSearchDistanceThreshold(w io.Writer, projectID string) error {
	ctx := context.Background()

	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("firestore.NewClient: %w", err)
	}
	defer client.Close()

	collection := client.Collection("coffee-beans")

	// Requires a vector index
	// https://firebase.google.com/docs/firestore/vector-search#create_and_manage_vector_indexes
	vectorQuery := collection.FindNearest("embedding_field",
		[]float32{3.0, 1.0, 2.0},
		10,
		firestore.DistanceMeasureEuclidean,
		&firestore.FindNearestOptions{
			DistanceThreshold: firestore.Ptr[float64](4.5),
		})

	docs, err := vectorQuery.Documents(ctx).GetAll()
	if err != nil {
		fmt.Fprintf(w, "failed to get vector query results: %v", err)
		return err
	}

	for _, doc := range docs {
		fmt.Fprintln(w, doc.Data()["name"])
	}
	return nil
}
Java
import com.google.api.core.ApiFuture;
import com.google.cloud.firestore.DocumentSnapshot;
import com.google.cloud.firestore.VectorQuery;
import com.google.cloud.firestore.VectorQueryOptions;
import com.google.cloud.firestore.VectorQuerySnapshot;

VectorQuery vectorQuery = coll.findNearest(
    "embedding_field",
    new double[] {3.0, 1.0, 2.0},
    /* limit */ 10,
    VectorQuery.DistanceMeasure.EUCLIDEAN,
    VectorQueryOptions.newBuilder().setDistanceThreshold(4.5).build());

ApiFuture<VectorQuerySnapshot> future = vectorQuery.get();
VectorQuerySnapshot vectorQuerySnapshot = future.get();

for (DocumentSnapshot document : vectorQuerySnapshot.getDocuments()) {
  System.out.println(document.getId());
}

Limitations

As you work with vector embeddings, note the following limitations:

  • The maximum supported embedding dimension is 2048. To store larger indexes, use dimensionality reduction.
  • The maximum number of documents to return from a nearest-neighbor query is 1000.
  • Vector search does not support real-time snapshot listeners.
  • Only the Python, Node.js, Go, and Java client libraries support vector search.
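
The two numeric limits can be checked on the client before a query is sent. The following Python sketch is a hypothetical pre-flight check, not something the client libraries are known to provide; the lower bounds of 1 are assumptions.

```python
from typing import Sequence

MAX_DIMENSION = 2048  # Maximum supported embedding dimension.
MAX_LIMIT = 1000      # Maximum documents returned by a nearest-neighbor query.


def validate_query(query_vector: Sequence[float], limit: int) -> None:
    """Raise ValueError if the query exceeds the limits listed above."""
    if not 1 <= len(query_vector) <= MAX_DIMENSION:
        raise ValueError(
            f"query vector has {len(query_vector)} dimensions; "
            f"expected between 1 and {MAX_DIMENSION}"
        )
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit is {limit}; expected between 1 and {MAX_LIMIT}")


validate_query([3.0, 1.0, 2.0], limit=10)  # Within both limits, so no error.
```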

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-18 UTC.