Query public index to get nearest neighbors

After you've created and deployed the index, you can run queries to get the nearest neighbors.

Here are some examples for a match query to find the top nearest neighbors using the k-nearest neighbors algorithm (k-NN).

Example queries for public endpoint

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import List

from google.cloud import aiplatform


def vector_search_find_neighbors(
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    queries: List[List[float]],
    num_neighbors: int,
) -> List[
    List[aiplatform.matching_engine.matching_engine_index_endpoint.MatchNeighbor]
]:
    """Query the vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint to run the query
        against.
        deployed_index_id (str): Required. The ID of the DeployedIndex to run
        the queries against.
        queries (List[List[float]]): Required. A list of queries. Each query is
        a list of floats, representing a single embedding.
        num_neighbors (int): Required. The number of neighbors to return.

    Returns:
        List[List[aiplatform.matching_engine.matching_engine_index_endpoint.MatchNeighbor]] - A list of nearest neighbors for each query.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint.
    my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Query the index endpoint for the nearest neighbors.
    return my_index_endpoint.find_neighbors(
        deployed_index_id=deployed_index_id,
        queries=queries,
        num_neighbors=num_neighbors,
    )
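To make the shape of the result concrete (one list of neighbors per query), the following is a minimal local sketch of exact k-NN. It is not how the service works internally: Vector Search performs approximate search with the distance measure configured on the index, while this sketch assumes Euclidean distance and a brute-force scan. The function name `brute_force_neighbors` and the sample datapoints are hypothetical.

```python
import math
from typing import List, Tuple


def brute_force_neighbors(
    queries: List[List[float]],
    datapoints: List[Tuple[str, List[float]]],
    num_neighbors: int,
) -> List[List[Tuple[str, float]]]:
    """Return, for each query, the num_neighbors closest (id, distance) pairs."""
    results = []
    for query in queries:
        # Euclidean distance from the query to every stored datapoint
        # (an assumption; a real index uses its configured distance measure).
        scored = [(dp_id, math.dist(query, vector)) for dp_id, vector in datapoints]
        # Sort by distance, breaking ties by ID, and keep the closest ones.
        scored.sort(key=lambda pair: (pair[1], pair[0]))
        results.append(scored[:num_neighbors])
    return results


# Hypothetical sample data: three 2-dimensional datapoints and two queries.
datapoints = [("a", [0.0, 0.0]), ("b", [1.0, 0.0]), ("c", [5.0, 5.0])]
neighbors = brute_force_neighbors([[0.9, 0.1], [4.0, 4.0]], datapoints, num_neighbors=2)
for per_query in neighbors:
    print([dp_id for dp_id, _ in per_query])
```

As with `find_neighbors`, the outer list has one entry per query and each inner list is ordered from nearest to farthest.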

Command-line

The publicEndpointDomainName listed below can be found at Deploy and is formatted as <number>.<region>-<number>.vdb.vertexai.goog.

  $ curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://1957880287.us-central1-181224308459.vdb.vertexai.goog/v1/projects/181224308459/locations/us-central1/indexEndpoints/3370566089086861312:findNeighbors -d '{deployed_index_id: "test_index_public1", queries: [{datapoint: {datapoint_id: "0", feature_vector: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}, neighbor_count: 5}]}'
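The URL and JSON body in the curl command above follow a regular pattern, which can be sketched in Python as below. This only assembles the request and makes no network call. The helper name `build_find_neighbors_request` is hypothetical, the field names are copied from the curl example, and the 8-element vector is a placeholder (a real query vector must match the index's dimensions).

```python
import json
from typing import Dict, List, Tuple


def build_find_neighbors_request(
    public_endpoint_domain: str,
    project: str,
    location: str,
    index_endpoint_id: str,
    deployed_index_id: str,
    feature_vector: List[float],
    neighbor_count: int,
) -> Tuple[str, Dict]:
    """Assemble the findNeighbors URL and request body (no request is sent)."""
    url = (
        f"https://{public_endpoint_domain}/v1/projects/{project}"
        f"/locations/{location}/indexEndpoints/{index_endpoint_id}:findNeighbors"
    )
    body = {
        "deployed_index_id": deployed_index_id,
        "queries": [
            {
                "datapoint": {"datapoint_id": "0", "feature_vector": feature_vector},
                "neighbor_count": neighbor_count,
            }
        ],
    }
    return url, body


# Values taken from the curl example; the vector is a shortened placeholder.
url, body = build_find_neighbors_request(
    public_endpoint_domain="1957880287.us-central1-181224308459.vdb.vertexai.goog",
    project="181224308459",
    location="us-central1",
    index_endpoint_id="3370566089086861312",
    deployed_index_id="test_index_public1",
    feature_vector=[1.0] * 8,
    neighbor_count=5,
)
print(url)
print(json.dumps(body))
```

Sending the request still requires the `Authorization: Bearer` header shown in the curl command.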

This curl example demonstrates how to call from http(s) clients, although the public endpoint supports dual protocols: RESTful and grpc_cli.

  $ curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://1957880287.us-central1-181224308459.vdb.vertexai.goog/v1/projects/${PROJECT_ID}/locations/us-central1/indexEndpoints/${INDEX_ENDPOINT_ID}:readIndexDatapoints -d '{deployed_index_id:"test_index_public1", ids: ["606431", "896688"]}'

This curl example demonstrates how to query with token and numeric restricts.

  $ curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`"  https://${PUBLIC_ENDPOINT_DOMAIN}/v1/projects/${PROJECT_ID}/locations/${LOCATION}/indexEndpoints/${INDEX_ENDPOINT_ID}:findNeighbors -d '{deployed_index_id:"${DEPLOYED_INDEX_ID}", queries: [{datapoint: {datapoint_id:"x", feature_vector: [1, 1], "sparse_embedding": {"values": [111.0,111.1,111.2], "dimensions": [10,20,30]}, numeric_restricts: [{namespace: "int-ns", value_int: -2, op: "GREATER"}, {namespace: "int-ns", value_int: 4, op: "LESS_EQUAL"}, {namespace: "int-ns", value_int: 0, op: "NOT_EQUAL"}], restricts: [{namespace: "color", allow_list: ["red"]}]}}]}'
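To illustrate what the three numeric restricts in the curl example above select, here is a small local sketch. It assumes that each `op` compares the datapoint's value (left) with the query's `value_int` (right), that all restricts must hold at once, and that a datapoint lacking the namespace does not match. The function `matches_numeric_restricts` is hypothetical and only mimics the filtering locally; the example request uses `GREATER`, `LESS_EQUAL`, and `NOT_EQUAL`, and the other operator names are included on the assumption that they follow the same naming.

```python
import operator
from typing import Dict, List

# Maps op names to Python comparisons. The datapoint's value is the left
# operand and the query's value_int is the right operand (assumed semantics).
NUMERIC_OPS = {
    "LESS": operator.lt,
    "LESS_EQUAL": operator.le,
    "EQUAL": operator.eq,
    "GREATER_EQUAL": operator.ge,
    "GREATER": operator.gt,
    "NOT_EQUAL": operator.ne,
}


def matches_numeric_restricts(
    datapoint_values: Dict[str, int], restricts: List[dict]
) -> bool:
    """Return True if the datapoint satisfies every numeric restrict."""
    for restrict in restricts:
        value = datapoint_values.get(restrict["namespace"])
        if value is None:
            # Assumption: a datapoint without this namespace is excluded.
            return False
        if not NUMERIC_OPS[restrict["op"]](value, restrict["value_int"]):
            return False
    return True


# The numeric restricts from the curl example: -2 < value <= 4 and value != 0.
restricts = [
    {"namespace": "int-ns", "value_int": -2, "op": "GREATER"},
    {"namespace": "int-ns", "value_int": 4, "op": "LESS_EQUAL"},
    {"namespace": "int-ns", "value_int": 0, "op": "NOT_EQUAL"},
]
matching = [v for v in range(-3, 7) if matches_numeric_restricts({"int-ns": v}, restricts)]
print(matching)
```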

Console

Use these instructions to query an index deployed to a public endpoint from the console.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
Select the index you want to query. The Index info page opens.
Scroll down to the Deployed indexes section and select the deployed index you want to query. The Deployed index info page opens.
From the Query index section, select whether to query by a dense embedding value, a sparse embedding value, a hybrid embedding value (dense and sparse embeddings), or a specific data point.
Enter the query parameters for the type of query you selected. For example, if you're querying by a dense embedding, enter the embedding vector to query by.
Execute the query using the provided curl command, or by running with Cloud Shell.
If using Cloud Shell, select Run in Cloud Shell.
The results return nearest neighbors.

Hybrid queries

Hybrid search uses both dense and sparse embeddings for searches based on a combination of keyword search and semantic search.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import List

from google.cloud import aiplatform


def vector_search_find_neighbors_hybrid_queries(
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    num_neighbors: int,
) -> List[
    List[aiplatform.matching_engine.matching_engine_index_endpoint.MatchNeighbor]
]:
    """Query the vector search index using example hybrid queries.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint to run the query
        against.
        deployed_index_id (str): Required. The ID of the DeployedIndex to run
        the queries against.
        num_neighbors (int): Required. The number of neighbors to return.

    Returns:
        List[List[aiplatform.matching_engine.matching_engine_index_endpoint.MatchNeighbor]] - A list of nearest neighbors for each query.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint.
    my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Query hybrid datapoints, sparse-only datapoints, and dense-only datapoints.
    hybrid_queries = [
        aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery(
            dense_embedding=[1, 2, 3],
            sparse_embedding_dimensions=[10, 20, 30],
            sparse_embedding_values=[1.0, 1.0, 1.0],
            rrf_ranking_alpha=0.5,
        ),
        aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery(
            dense_embedding=[1, 2, 3],
            sparse_embedding_dimensions=[10, 20, 30],
            sparse_embedding_values=[0.1, 0.2, 0.3],
        ),
        aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery(
            sparse_embedding_dimensions=[10, 20, 30],
            sparse_embedding_values=[0.1, 0.2, 0.3],
        ),
        aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery(
            dense_embedding=[1, 2, 3]
        ),
    ]

    return my_index_endpoint.find_neighbors(
        deployed_index_id=deployed_index_id,
        queries=hybrid_queries,
        num_neighbors=num_neighbors,
    )
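The `rrf_ranking_alpha` argument in the first hybrid query suggests that dense and sparse results are merged with Reciprocal Rank Fusion (RRF). The page does not give the formula, so the following is only a sketch under stated assumptions: the textbook RRF score `1 / (k + rank)` with the conventional constant `k = 60`, and `alpha` as the weight of the dense ranking (`1.0` meaning dense only, `0.0` meaning sparse only). The function `rrf_fuse` and the sample rankings are hypothetical and are not the service's implementation.

```python
from typing import Dict, List


def rrf_fuse(
    dense_ranking: List[str],
    sparse_ranking: List[str],
    alpha: float,
    k: int = 60,
) -> List[str]:
    """Merge two ranked ID lists with alpha-weighted Reciprocal Rank Fusion."""
    scores: Dict[str, float] = {}
    # Each list contributes 1 / (k + rank), weighted by alpha (dense)
    # or by 1 - alpha (sparse). Ranks start at 1.
    for rank, dp_id in enumerate(dense_ranking, start=1):
        scores[dp_id] = scores.get(dp_id, 0.0) + alpha / (k + rank)
    for rank, dp_id in enumerate(sparse_ranking, start=1):
        scores[dp_id] = scores.get(dp_id, 0.0) + (1.0 - alpha) / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda dp_id: (-scores[dp_id], dp_id))


dense = ["a", "b", "c"]   # hypothetical semantic-search ranking
sparse = ["c", "a", "d"]  # hypothetical keyword-search ranking
print(rrf_fuse(dense, sparse, alpha=0.5))
print(rrf_fuse(dense, sparse, alpha=1.0))
print(rrf_fuse(dense, sparse, alpha=0.0))
```

Under these assumptions, a datapoint that ranks well in both lists tends to outrank one that appears in only a single list, which is the intuition behind combining keyword and semantic search.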

Queries with filtering and crowding

Filtering vector matches lets you restrict your nearest neighbor results to specific categories. Filters can also designate categories to exclude from your results.

Per-crowding neighbor limits can increase result diversity by limiting the number of results returned from any single crowding tag in your index data.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import List

from google.cloud import aiplatform


def vector_search_find_neighbors_filtering_crowding(
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    queries: List[List[float]],
    num_neighbors: int,
    filter: List[aiplatform.matching_engine.matching_engine_index_endpoint.Namespace],
    numeric_filter: List[
        aiplatform.matching_engine.matching_engine_index_endpoint.NumericNamespace
    ],
    per_crowding_attribute_neighbor_count: int,
) -> List[
    List[aiplatform.matching_engine.matching_engine_index_endpoint.MatchNeighbor]
]:
    """Query the vector search index with filtering and crowding.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint to run the query
        against.
        deployed_index_id (str): Required. The ID of the DeployedIndex to run
        the queries against.
        queries (List[List[float]]): Required. A list of queries. Each query is
        a list of floats, representing a single embedding.
        num_neighbors (int): Required. The number of neighbors to return.
        filter (List[Namespace]): Required. A list of Namespaces for filtering
        the matching results. For example,
        [Namespace("color", ["red"], []), Namespace("shape", [], ["square"])]
        will match datapoints that satisfy "red color" but not include
        datapoints with "square shape".
        numeric_filter (List[NumericNamespace]): Required. A list of
        NumericNamespaces for filtering the matching results. For example,
        [NumericNamespace(name="cost", value_int=5, op="GREATER")] will limit
        the matching results to datapoints with cost greater than 5.
        per_crowding_attribute_neighbor_count (int): Required. The maximum
        number of returned matches with the same crowding tag.

    Returns:
        List[List[aiplatform.matching_engine.matching_engine_index_endpoint.MatchNeighbor]] - A list of nearest neighbors for each query.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint.
    my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Query the index endpoint for the nearest neighbors.
    return my_index_endpoint.find_neighbors(
        deployed_index_id=deployed_index_id,
        queries=queries,
        num_neighbors=num_neighbors,
        filter=filter,
        numeric_filter=numeric_filter,
        per_crowding_attribute_neighbor_count=per_crowding_attribute_neighbor_count,
    )
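The combined effect of a token filter and a per-crowding limit can be sketched locally as follows. This is an illustration of the semantics described above, not the service's implementation: it handles a single namespace, assumes candidates are considered from nearest to farthest, and uses hypothetical names (`Candidate`, `passes_token_filter`, `filter_and_crowd`) and sample data.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    datapoint_id: str
    distance: float
    tokens: Dict[str, List[str]]  # namespace -> tokens attached to the datapoint
    crowding_tag: str


def passes_token_filter(
    candidate: Candidate, namespace: str, allow: List[str], deny: List[str]
) -> bool:
    """Keep a candidate if it has no denied token and, when an allow list
    is given, at least one allowed token in the namespace."""
    tokens = candidate.tokens.get(namespace, [])
    if any(token in deny for token in tokens):
        return False
    if allow and not any(token in allow for token in tokens):
        return False
    return True


def filter_and_crowd(
    candidates: List[Candidate],
    namespace: str,
    allow: List[str],
    deny: List[str],
    num_neighbors: int,
    per_crowding_attribute_neighbor_count: int,
) -> List[str]:
    """Return up to num_neighbors IDs, nearest first, after filtering and crowding."""
    kept: List[str] = []
    per_tag: Dict[str, int] = {}
    for candidate in sorted(candidates, key=lambda c: (c.distance, c.datapoint_id)):
        if len(kept) >= num_neighbors:
            break
        if not passes_token_filter(candidate, namespace, allow, deny):
            continue
        seen = per_tag.get(candidate.crowding_tag, 0)
        if seen >= per_crowding_attribute_neighbor_count:
            continue  # this crowding tag already has its share of results
        per_tag[candidate.crowding_tag] = seen + 1
        kept.append(candidate.datapoint_id)
    return kept


candidates = [
    Candidate("p1", 0.1, {"color": ["red"]}, "brandA"),
    Candidate("p2", 0.2, {"color": ["red"]}, "brandA"),
    Candidate("p3", 0.3, {"color": ["red"]}, "brandA"),
    Candidate("p4", 0.4, {"color": ["blue"]}, "brandB"),
    Candidate("p5", 0.5, {"color": ["red"]}, "brandB"),
]
print(filter_and_crowd(candidates, "color", ["red"], [], 3, 2))
```

In this sample, `p3` is dropped because two `brandA` results were already kept, and `p4` is dropped by the `color` filter, so a farther but more diverse result (`p5`) is returned instead.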

Query-time settings that impact performance

The following query-time parameters can affect latency, availability, and cost when using Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.

For parameter definitions, see Index configuration parameters.


approximateNeighborsCount

Tells the algorithm the number of approximate results to retrieve from each shard.

The value of approximateNeighborsCount should always be greater than the value of setNeighborCount. If the value of setNeighborCount is small, 10 times that value is recommended for approximateNeighborsCount. For larger setNeighborCount values, a smaller multiplier can be used.

The corresponding REST API name for this field is approximate_neighbor_count.

Increasing the value of approximateNeighborsCount can affect performance in the following ways:

Recall: Increased
Latency: Potentially increased
Availability: No impact
Cost: Can increase because more data is processed during a search

Decreasing the value of approximateNeighborsCount can affect performance in the following ways:

Recall: Decreased
Latency: Potentially decreased
Availability: No impact
Cost: Can decrease because less data is processed during a search
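The sizing guidance for approximateNeighborsCount can be written as a small helper. The 10x multiplier for small values comes from the guidance above; the cutoff of 50 for "small" and the 3x multiplier for larger values are placeholders chosen for this sketch, not documented numbers, so tune them by experiment. The function names are hypothetical, and the dictionary keys are the REST field names given on this page.

```python
from typing import Dict


def suggest_approximate_neighbor_count(
    neighbor_count: int,
    small_threshold: int = 50,   # assumption: what counts as a "small" value
    large_multiplier: int = 3,   # assumption: the "smaller multiplier"
) -> int:
    """Suggest an approximate neighbor count larger than neighbor_count."""
    if neighbor_count < 1:
        raise ValueError("neighbor_count must be at least 1")
    multiplier = 10 if neighbor_count <= small_threshold else large_multiplier
    # The result must always exceed the number of neighbors requested.
    return max(neighbor_count * multiplier, neighbor_count + 1)


def query_tuning_fields(neighbor_count: int) -> Dict[str, int]:
    """Build the two related REST query fields for a given neighbor count."""
    return {
        "neighbor_count": neighbor_count,
        "approximate_neighbor_count": suggest_approximate_neighbor_count(neighbor_count),
    }


print(query_tuning_fields(10))
print(query_tuning_fields(200))
```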

setNeighborCount

Specifies the number of results that you want the query to return.

The corresponding REST API name for this field is neighbor_count.

Values less than or equal to 300 remain performant in most use cases. For larger values, test for your specific use case.

fractionLeafNodesToSearch

Controls the percentage of leaf nodes to visit when searching for nearest neighbors. This is related to the leafNodeEmbeddingCount in that the more embeddings per leaf node, the more data examined per leaf.

The corresponding REST API name for this field is fraction_leaf_nodes_to_search_override.

Increasing the value of fractionLeafNodesToSearch can affect performance in the following ways:

Recall: Increased
Latency: Increased
Availability: No impact
Cost: Can increase because higher latency occupies more machine resources

Decreasing the value of fractionLeafNodesToSearch can affect performance in the following ways:

Recall: Decreased
Latency: Decreased
Availability: No impact
Cost: Can decrease because lower latency occupies fewer machine resources
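The recall and latency trade-off of fractionLeafNodesToSearch can be seen in a toy partitioned index. The sketch below is not the service's index structure: it assumes a two-level layout in which each datapoint is stored in the leaf of its nearest centroid, and a query visits only the leaves whose centroids are closest to it. All names and data are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, float]


def build_leaves(points: Dict[str, Vector], centroids: List[Vector]) -> List[List[str]]:
    """Assign every point to the leaf of its nearest centroid."""
    leaves: List[List[str]] = [[] for _ in centroids]
    for point_id, vector in points.items():
        nearest = min(range(len(centroids)), key=lambda i: math.dist(vector, centroids[i]))
        leaves[nearest].append(point_id)
    return leaves


def search(
    query: Vector,
    points: Dict[str, Vector],
    centroids: List[Vector],
    leaves: List[List[str]],
    fraction_leaf_nodes_to_search: float,
    num_neighbors: int,
) -> List[str]:
    """Scan only the closest fraction of leaves, then rank their points exactly."""
    leaves_to_visit = max(1, math.ceil(fraction_leaf_nodes_to_search * len(centroids)))
    order = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [pid for i in order[:leaves_to_visit] for pid in leaves[i]]
    candidates.sort(key=lambda pid: (math.dist(query, points[pid]), pid))
    return candidates[:num_neighbors]


points = {"a": (4.0, 0.0), "b": (1.0, 0.0), "c": (6.0, 0.0), "d": (9.0, 0.0)}
centroids = [(0.0, 0.0), (10.0, 0.0)]
leaves = build_leaves(points, centroids)
query = (5.2, 0.0)

# Visiting half of the leaves misses "a", which sits just across the leaf boundary.
print(search(query, points, centroids, leaves, 0.5, 2))
# Visiting all leaves recovers the exact neighbors, at the cost of scanning more points.
print(search(query, points, centroids, leaves, 1.0, 2))
```

The lower fraction scans half as many points but returns one wrong neighbor, which mirrors the recall and latency effects listed above.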

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.
