Manage indexes

The following sections describe how to configure, create, list, and delete your indexes.

Index overview

An index is a file or files consisting of your embedding vectors. These vectors are made from large amounts of data you want to deploy and query with Vector Search. With Vector Search, you can create two types of indexes, depending on how you plan to update them with your data: an index designed for batch updates, or an index designed for streaming updates.

A batch index is for when you want to update your index in batches, with data that has accumulated over a set amount of time, such as systems that are processed weekly or monthly. A streaming index is for when you want index data to be updated as new data is added to your datastore, for example, if you have a bookstore and want to show new inventory online as soon as possible. Which type you choose is important, because setup and requirements differ.

Configure index parameters

Before you create an index, configure the parameters for your index.

For example, create a file named index_metadata.json:

{
  "contentsDeltaUri": "gs://BUCKET_NAME/path",
  "config": {
    "dimensions": 100,
    "approximateNeighborsCount": 150,
    "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
    "shardSize": "SHARD_SIZE_MEDIUM",
    "algorithm_config": {
      "treeAhConfig": {
        "leafNodeEmbeddingCount": 5000,
        "fractionLeafNodesToSearch": 0.03
      }
    }
  }
}

You can find the definition for each of these fields in Index configuration parameters.
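If you would rather generate the file than write it by hand, the following is a minimal Python sketch that writes the same parameters to a JSON file. The helper name write_index_metadata and its arguments are hypothetical and are not part of any Google Cloud SDK; only the field names and values come from the example above.

```python
import json
from pathlib import Path


def write_index_metadata(path, contents_delta_uri, dimensions=100):
    """Write the example index parameters to a JSON file (hypothetical helper)."""
    metadata = {
        "contentsDeltaUri": contents_delta_uri,
        "config": {
            "dimensions": dimensions,
            "approximateNeighborsCount": 150,
            "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
            "shardSize": "SHARD_SIZE_MEDIUM",
            "algorithm_config": {
                "treeAhConfig": {
                    "leafNodeEmbeddingCount": 5000,
                    "fractionLeafNodesToSearch": 0.03,
                }
            },
        },
    }
    # Pretty-print so the file stays easy to review and diff.
    Path(path).write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return metadata
```

For example, write_index_metadata("index_metadata.json", "gs://BUCKET_NAME/path") produces a file equivalent to the one shown above.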

Create an index

Index size

Index data is split into equal parts called shards for processing. When you create an index, you must specify the size of the shards to use. The supported sizes are as follows:

  • SHARD_SIZE_SMALL: 2 GiB per shard.
  • SHARD_SIZE_MEDIUM: 20 GiB per shard.
  • SHARD_SIZE_LARGE: 50 GiB per shard.
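As a rough planning aid, the sketch below estimates how many shards a dataset might need for each shard size. It assumes the shard count is simply the index size divided by the shard size, rounded up; that arithmetic is an assumption for back-of-the-envelope planning, not a statement of how the service allocates shards. The function name estimate_shard_count is hypothetical.

```python
import math

# Shard sizes in GiB, taken from the list above.
SHARD_SIZE_GIB = {
    "SHARD_SIZE_SMALL": 2,
    "SHARD_SIZE_MEDIUM": 20,
    "SHARD_SIZE_LARGE": 50,
}


def estimate_shard_count(index_size_gib, shard_size):
    """Rough estimate: ceil(index size / shard size). An assumption, not a service guarantee."""
    if index_size_gib <= 0:
        raise ValueError("index_size_gib must be positive")
    if shard_size not in SHARD_SIZE_GIB:
        raise ValueError(f"unknown shard size: {shard_size!r}")
    return math.ceil(index_size_gib / SHARD_SIZE_GIB[shard_size])
```

Comparing the estimate across the three sizes for your expected dataset can help you decide which shard size to try first.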

The machine types that you can use to deploy your index (using public endpoints or using VPC endpoints) depend on the shard size of the index. The following table shows the shard sizes that each machine type supports:

Machine type    SHARD_SIZE_SMALL    SHARD_SIZE_MEDIUM    SHARD_SIZE_LARGE
n1-standard-16
n1-standard-32
e2-standard-2 (default)
e2-standard-16 (default)
e2-highmem-16 (default)
n2d-standard-32

To learn how shard size and machine type affect pricing, see the Vertex AI pricing page. To learn how shard size impacts performance, see Configuration parameters that impact performance.

Create an index for batch update

Use these instructions to create and deploy your index. If you don't have your embeddings ready yet, you can skip to Create an empty batch index. With this option, no embeddings data is required at index creation time.

Note: The dimensions field is relevant only to dense embeddings.

To create an index:

gcloud

Before using any of the command data below, make the following replacements:

  • LOCAL_PATH_TO_METADATA_FILE: The local path to the metadata file.
  • INDEX_NAME: Display name for the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init, or gcloud auth login and gcloud config set project.

gcloud ai indexes create \
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE \
    --display-name=INDEX_NAME \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init, or gcloud auth login and gcloud config set project.

gcloud ai indexes create `
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE `
    --display-name=INDEX_NAME `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init, or gcloud auth login and gcloud config set project.

gcloud ai indexes create ^
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE ^
    --display-name=INDEX_NAME ^
    --region=LOCATION ^
    --project=PROJECT_ID

You should receive a response similar to the following:

You can poll for the status of the operation until the response includes "done": true. Use the following example to poll the status:

  $ gcloud ai operations describe 1234567890123456789 --project=my-test-project --region=us-central1

See gcloud ai operations to learn more about the describe command.
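If you want to automate the polling, the following Python sketch wraps the describe command shown above in a loop. It assumes gcloud is installed and on your PATH (Linux, macOS, or Cloud Shell) and uses the global --format=json flag to get machine-readable output. The function names, polling interval, and attempt limit are illustrative choices, not documented defaults.

```python
import json
import subprocess
import time


def describe_operation(operation_id, project, region):
    """Run `gcloud ai operations describe` and return the parsed JSON output."""
    result = subprocess.run(
        [
            "gcloud", "ai", "operations", "describe", operation_id,
            f"--project={project}", f"--region={region}", "--format=json",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def wait_until_done(describe, interval_seconds=30.0, max_attempts=120):
    """Call describe() repeatedly until the response contains "done": true."""
    for _ in range(max_attempts):
        operation = describe()
        if operation.get("done"):
            return operation
        time.sleep(interval_seconds)
    raise TimeoutError("operation did not finish within the allotted attempts")
```

A call might look like wait_until_done(lambda: describe_operation("1234567890123456789", "my-test-project", "us-central1")). Passing the describe step in as a callable keeps the loop easy to test without calling gcloud.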

REST

Before using any of the request data, make the following replacements:

  • INPUT_DIR: The Cloud Storage directory path of the index content.
  • INDEX_NAME: Display name for the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes

Request JSON body:

{
  "display_name": "INDEX_NAME",
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithm_config": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "leafNodesToSearchPercent": 7
        }
      }
    }
  }
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-08T01:21:10.147035Z",
      "updateTime": "2022-01-08T01:21:10.147035Z"
    }
  }
}
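The name field of the response carries the identifiers you need for later calls. The following Python sketch pulls them out with a regular expression. The function name parse_operation_name is hypothetical, and the pattern covers only the two name shapes that appear in the sample responses on this page (with and without an indexes/INDEX_ID segment).

```python
import re

# Matches the operation names shown in the sample responses on this page.
_OPERATION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"(?:/indexes/(?P<index_id>[^/]+))?/operations/(?P<operation_id>[^/]+)$"
)


def parse_operation_name(name):
    """Split an operation resource name into its parts (hypothetical helper)."""
    match = _OPERATION_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized operation name: {name!r}")
    # index_id is None when the name has no indexes/INDEX_ID segment.
    return match.groupdict()
```

The returned operation_id is the value you pass to gcloud ai operations describe.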

Terraform

The following sample uses the google_vertex_ai_index Terraform resource to create an index for batch updates.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

# Cloud Storage bucket name must be unique
resource "random_id" "bucket_name_suffix" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.bucket_name_suffix.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id":"42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id":"43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import Optional

from google.cloud import aiplatform


def vector_search_create_index(
    project: str, location: str, display_name: str, gcs_uri: Optional[str] = None
) -> aiplatform.MatchingEngineIndex:
    """Create a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index display name
        gcs_uri (str): Optional. The Google Cloud Storage uri for index content

    Returns:
        The created MatchingEngineIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index
    index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
        display_name=display_name,
        contents_delta_uri=gcs_uri,
        description="Matching Engine Index",
        dimensions=100,
        approximate_neighbors_count=150,
        leaf_node_embedding_count=500,
        leaf_nodes_to_search_percent=7,
        index_update_method="BATCH_UPDATE",  # Options: STREAM_UPDATE, BATCH_UPDATE
        distance_measure_type=aiplatform.matching_engine.matching_engine_index_config.DistanceMeasureType.DOT_PRODUCT_DISTANCE,
    )

    return index

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.aiplatform.v1.CreateIndexRequest;
import com.google.cloud.aiplatform.v1.Index;
import com.google.cloud.aiplatform.v1.Index.IndexUpdateMethod;
import com.google.cloud.aiplatform.v1.IndexServiceClient;
import com.google.cloud.aiplatform.v1.IndexServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.util.concurrent.TimeUnit;

public class CreateIndexSample {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String location = "YOUR_LOCATION";
    String displayName = "YOUR_INDEX_DISPLAY_NAME";
    String contentsDeltaUri = "gs://YOUR_BUCKET/";
    String metadataJson =
        String.format(
            "{\n"
                + "  \"contentsDeltaUri\": \"%s\",\n"
                + "  \"config\": {\n"
                + "    \"dimensions\": 100,\n"
                + "        \"approximateNeighborsCount\": 150,\n"
                + "        \"distanceMeasureType\": \"DOT_PRODUCT_DISTANCE\",\n"
                + "        \"shardSize\": \"SHARD_SIZE_MEDIUM\",\n"
                + "        \"algorithm_config\": {\n"
                + "      \"treeAhConfig\": {\n"
                + "        \"leafNodeEmbeddingCount\": 5000,\n"
                + "            \"fractionLeafNodesToSearch\": 0.03\n"
                + "      }\n"
                + "    }\n"
                + "  }\n"
                + "}",
            contentsDeltaUri);

    createIndexSample(project, location, displayName, metadataJson);
  }

  public static Index createIndexSample(
      String project, String location, String displayName, String metadataJson)
      throws Exception {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (IndexServiceClient indexServiceClient =
        IndexServiceClient.create(
            IndexServiceSettings.newBuilder()
                .setEndpoint(location + "-aiplatform.googleapis.com:443")
                .build())) {
      Value.Builder metadataBuilder = Value.newBuilder();
      JsonFormat.parser().merge(metadataJson, metadataBuilder);

      CreateIndexRequest request =
          CreateIndexRequest.newBuilder()
              .setParent(LocationName.of(project, location).toString())
              .setIndex(
                  Index.newBuilder()
                      .setDisplayName(displayName)
                      .setMetadata(metadataBuilder)
                      .setIndexUpdateMethod(IndexUpdateMethod.BATCH_UPDATE))
              .build();

      return indexServiceClient.createIndexAsync(request).get(5, TimeUnit.MINUTES);
    }
  }
}

Console

Use these instructions to create an index for batch updates.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. Click Create new index to open the Index pane. The Create a new index pane appears.
  3. In the Display name field, provide a name to uniquely identify your index.
  4. In the Description field, provide a description for what the index is for.
  5. In the Region field, select a region from the drop-down.
  6. In the Cloud Storage field, search and select the Cloud Storage folder where your vector data is stored.
  7. In the Algorithm type drop-down, select the algorithm type that Vector Search uses for efficient search. If you select the treeAh algorithm, enter the approximate neighbors count.
  8. In the Dimensions field, enter the number of dimensions of your input vectors.
  9. In the Update method field, select Batch.
  10. In the Shard size field, select from the drop-down the shard size you want.
  11. Click Create. Your new index appears in your list of indexes once it's ready. Note: The build can take up to an hour to complete.

Create an empty batch index

To create and deploy your index right away, you can create an empty batch index. With this option, no embeddings data is required at index creation time.

To create an empty index, the request is almost identical to creating an index for batch updates. The difference is you remove the contentsDeltaUri field, since you aren't linking a data location. Here's an empty batch index example:

Empty index request example

{
  "display_name": "INDEX_NAME",
  "indexUpdateMethod": "BATCH_UPDATE",
  "metadata": {
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithm_config": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "fractionLeafNodesToSearch": 0.07
        }
      }
    }
  }
}
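If you already build the full request body in code, you can derive the empty-index variant from it rather than maintaining two copies. The following Python sketch copies a request dictionary and drops contentsDeltaUri, as described above. The helper name to_empty_index_request is hypothetical.

```python
import copy


def to_empty_index_request(request):
    """Return a copy of a create-index request body without contentsDeltaUri."""
    empty = copy.deepcopy(request)  # leave the caller's dictionary untouched
    empty.get("metadata", {}).pop("contentsDeltaUri", None)
    return empty
```

The deep copy means the original request can still be used to create an index that is populated from Cloud Storage.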

Create an index for streaming updates

Use these instructions to create and deploy your streaming index. If you don't have your embeddings ready yet, skip to Create an empty index for streaming updates. With this option, no embeddings data is required at index creation time.

Note: The dimensions field is relevant only to dense embeddings.

REST

Before using any of the request data, make the following replacements:

HTTP method and URL:

POST https://ENDPOINT-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes

Request JSON body:

{
  "displayName": "INDEX_NAME",
  "description": "DESCRIPTION",
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "config": {
      "dimensions": "DIMENSIONS",
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithmConfig": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 10000,
          "leafNodesToSearchPercent": 2
        }
      }
    }
  },
  "indexUpdateMethod": "STREAM_UPDATE"
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://ENDPOINT-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://ENDPOINT-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.ui.CreateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-12-05T23:17:45.416117Z",
      "updateTime": "2023-12-05T23:17:45.416117Z",
      "state": "RUNNING",
      "worksOn": [
        "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID"
      ]
    }
  }
}

Terraform

The following sample uses the google_vertex_ai_index Terraform resource to create an index for streaming updates.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id":"42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id":"43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "streaming_index" {
  region       = "us-central1"
  display_name = "sample-index-streaming-update"
  description  = "A sample index for streaming update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "STREAM_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import Optional

from google.cloud import aiplatform


def vector_search_create_streaming_index(
    project: str, location: str, display_name: str, gcs_uri: Optional[str] = None
) -> aiplatform.MatchingEngineIndex:
    """Create a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index display name
        gcs_uri (str): Optional. The Google Cloud Storage uri for index content

    Returns:
        The created MatchingEngineIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index
    index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
        display_name=display_name,
        contents_delta_uri=gcs_uri,
        description="Matching Engine Index",
        dimensions=100,
        approximate_neighbors_count=150,
        leaf_node_embedding_count=500,
        leaf_nodes_to_search_percent=7,
        index_update_method="STREAM_UPDATE",  # Options: STREAM_UPDATE, BATCH_UPDATE
        distance_measure_type=aiplatform.matching_engine.matching_engine_index_config.DistanceMeasureType.DOT_PRODUCT_DISTANCE,
    )

    return index

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.aiplatform.v1.CreateIndexRequest;
import com.google.cloud.aiplatform.v1.Index;
import com.google.cloud.aiplatform.v1.Index.IndexUpdateMethod;
import com.google.cloud.aiplatform.v1.IndexServiceClient;
import com.google.cloud.aiplatform.v1.IndexServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.util.concurrent.TimeUnit;

public class CreateStreamingIndexSample {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String location = "YOUR_LOCATION";
    String displayName = "YOUR_INDEX_DISPLAY_NAME";
    String contentsDeltaUri = "gs://YOUR_BUCKET/";
    String metadataJson =
        String.format(
            "{\n"
                + "  \"contentsDeltaUri\": \"%s\",\n"
                + "  \"config\": {\n"
                + "    \"dimensions\": 100,\n"
                + "        \"approximateNeighborsCount\": 150,\n"
                + "        \"distanceMeasureType\": \"DOT_PRODUCT_DISTANCE\",\n"
                + "        \"shardSize\": \"SHARD_SIZE_MEDIUM\",\n"
                + "        \"algorithm_config\": {\n"
                + "      \"treeAhConfig\": {\n"
                + "        \"leafNodeEmbeddingCount\": 5000,\n"
                + "            \"fractionLeafNodesToSearch\": 0.03\n"
                + "      }\n"
                + "    }\n"
                + "  }\n"
                + "}",
            contentsDeltaUri);

    createStreamingIndexSample(project, location, displayName, metadataJson);
  }

  public static Index createStreamingIndexSample(
      String project, String location, String displayName, String metadataJson)
      throws Exception {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (IndexServiceClient indexServiceClient =
        IndexServiceClient.create(
            IndexServiceSettings.newBuilder()
                .setEndpoint(location + "-aiplatform.googleapis.com:443")
                .build())) {
      Value.Builder metadataBuilder = Value.newBuilder();
      JsonFormat.parser().merge(metadataJson, metadataBuilder);

      CreateIndexRequest request =
          CreateIndexRequest.newBuilder()
              .setParent(LocationName.of(project, location).toString())
              .setIndex(
                  Index.newBuilder()
                      .setDisplayName(displayName)
                      .setMetadata(metadataBuilder)
                      .setIndexUpdateMethod(IndexUpdateMethod.STREAM_UPDATE))
              .build();

      return indexServiceClient.createIndexAsync(request).get(5, TimeUnit.MINUTES);
    }
  }
}

Console

Use these instructions to create an index for streaming updates in the Google Cloud console.

Creating an index for streaming updates requires steps similar to those for a batch update index, except that you set indexUpdateMethod to STREAM_UPDATE.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. Click Create new index to open the Index pane. The Create a new index pane appears.
  3. In the Display name field, provide a name to uniquely identify your index.
  4. In the Description field, provide a description for what the index is for.
  5. In the Region field, select a region from the drop-down.
  6. In the Cloud Storage field, search and select the Cloud Storage folder where your vector data is stored.
  7. In the Algorithm type drop-down, select the algorithm type that Vector Search will use to perform your search. If you select the treeAh algorithm, enter the approximate neighbors count.
  8. In the Dimensions field, enter the number of dimensions of your input vectors.
  9. In the Update method field, select Stream.
  10. In the Shard size field, select from the drop-down the shard size you want.
  11. Click Create. Your new index appears in your list of indexes once it's ready. Note: The build can take up to an hour to complete.

Create an empty index for streaming updates

To create and deploy your index right away, you can create an empty index for streaming. With this option, no embeddings data is required at index creation time.

To create an empty index, the request is almost identical to creating an index for streaming. The difference is you remove the contentsDeltaUri field, since you aren't linking a data location. Here's an empty streaming index example:

Empty index request example

{
  "display_name": "INDEX_NAME",
  "indexUpdateMethod": "STREAM_UPDATE",
  "metadata": {
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithm_config": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "leafNodesToSearchPercent": 7
        }
      }
    }
  }
}

List indexes

gcloud

Before using any of the command data below, make the following replacements:

  • INDEX_NAME: Display name for the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init, or gcloud auth login and gcloud config set project.

gcloud ai indexes list \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init, or gcloud auth login and gcloud config set project.

gcloud ai indexes list `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init, or gcloud auth login and gcloud config set project.

gcloud ai indexes list ^
    --region=LOCATION ^
    --project=PROJECT_ID

You should receive a response similar to the following:

You can poll for the status of the operation until the response includes "done": true. Use the following example to poll the status:

  $ gcloud ai operations describe 1234567890123456789 --project=my-test-project --region=us-central1

See gcloud ai operations to learn more about the describe command.

REST

Before using any of the request data, make the following replacements:

  • INDEX_NAME: Display name for the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "indexes": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
      "displayName": "INDEX_NAME",
      "metadataSchemaUri": "gs://google-cloud-aiplatform/schema/matchingengine/metadata/nearest_neighbor_search_1.0.0.yaml",
      "metadata": {
        "config": {
          "dimensions": 100,
          "approximateNeighborsCount": 150,
          "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
          "featureNormType": "NONE",
          "algorithmConfig": {
            "treeAhConfig": {
              "maxLeavesToSearch": 50,
              "leafNodeCount": 10000
            }
          }
        }
      },
      "etag": "AMEw9yNU8YX5IvwuINeBkVv3yNa7VGKk11GBQ8GkfRoVvO7LgRUeOo0qobYWuU9DiEc=",
      "createTime": "2020-11-08T21:56:30.558449Z",
      "updateTime": "2020-11-08T22:39:25.048623Z"
    }
  ]
}
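To turn a response like this into a short inventory, you could parse it as follows. This Python sketch assumes the response text has the shape of the sample above; the function name summarize_indexes is hypothetical, and fields that are absent are returned as None.

```python
import json


def summarize_indexes(response_text):
    """Return (index_id, display_name, dimensions) for each index in a list response."""
    response = json.loads(response_text)
    rows = []
    for index in response.get("indexes", []):
        # The index ID is the last path segment of the resource name.
        index_id = index["name"].rsplit("/", 1)[-1]
        dimensions = index.get("metadata", {}).get("config", {}).get("dimensions")
        rows.append((index_id, index.get("displayName"), dimensions))
    return rows
```

An empty project returns a response without an indexes key, which this sketch treats as an empty list.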

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import List

from google.cloud import aiplatform


def vector_search_list_index(
    project: str, location: str
) -> List[aiplatform.MatchingEngineIndex]:
    """List vector search indexes.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name

    Returns:
        List of aiplatform.MatchingEngineIndex
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # List Indexes
    return aiplatform.MatchingEngineIndex.list()

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.aiplatform.v1.Index;
import com.google.cloud.aiplatform.v1.IndexServiceClient;
import com.google.cloud.aiplatform.v1.IndexServiceClient.ListIndexesPagedResponse;
import com.google.cloud.aiplatform.v1.IndexServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;

public class ListIndexesSample {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String location = "YOUR_LOCATION";

    for (Index index : listIndexesSample(project, location).iterateAll()) {
      System.out.println(index.getName());
    }
  }

  public static ListIndexesPagedResponse listIndexesSample(String project, String location)
      throws Exception {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (IndexServiceClient indexServiceClient =
        IndexServiceClient.create(
            IndexServiceSettings.newBuilder()
                .setEndpoint(location + "-aiplatform.googleapis.com:443")
                .build())) {
      String parent = LocationName.of(project, location).toString();
      return indexServiceClient.listIndexes(parent);
    }
  }
}

Console

Use these instructions to view a list of your indexes.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. A list of your active indexes is displayed.

Tuning the index

Tuning the index requires setting the configuration parameters that impact the performance of deployed indexes, especially recall and latency. These parameters are set when you first create the index. You can use brute-force indexes to measure recall.

Configuration parameters that impact performance

The following configuration parameters can be set at index creation time and can affect recall, latency, availability, and cost when using Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.

For parameter definitions, see Index configuration parameters.

Parameter    About    Performance impact
shardSize

Controls the amount of data on each machine.

When choosing a shard size, estimate how large your dataset will be in the future. If the size of your dataset has an upper bound, pick the appropriate shard size to accommodate it. If there is no upper bound or if your use case is extremely sensitive to latency variability, choosing a large shard size is recommended.

If you configure for a larger number of smaller shards, a larger number of candidate results are processed during search. More shards can affect performance in the following ways:

  • Recall: Increased
  • Latency: Potentially increased, more variability
  • Availability: Shard outages affect a smaller percentage of data
  • Cost: Can increase if the same machine type is used with more shards

If you configure for a smaller number of larger shards, fewer candidate results are processed during search. Fewer shards can affect performance in the following ways:

  • Recall: Decreased
  • Latency: Reduced, less variability
  • Availability: Shard outages affect a larger percentage of data
  • Cost: Can decrease if the same machine type is used with fewer shards
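As a rough planning aid, the shard-size trade-off above can be sketched with a little arithmetic. The per-shard capacities come from the shard sizes listed in the Index size section. The helper name estimate_shard_count is hypothetical, and the assumption that the shard count equals the dataset size divided by the shard capacity, rounded up, is a simplification and not a description of how Vector Search actually partitions data.

```python
import math

# Per-shard capacity in GiB, as listed in the "Index size" section.
SHARD_CAPACITY_GIB = {
    "SHARD_SIZE_SMALL": 2,
    "SHARD_SIZE_MEDIUM": 20,
    "SHARD_SIZE_LARGE": 50,
}


def estimate_shard_count(dataset_gib: float, shard_size: str) -> int:
    """Hypothetical helper: estimate how many shards a dataset would need.

    Assumes every shard fills to capacity, which is a simplification.
    """
    if dataset_gib <= 0:
        raise ValueError("dataset_gib must be positive")
    return math.ceil(dataset_gib / SHARD_CAPACITY_GIB[shard_size])


if __name__ == "__main__":
    # Compare estimated shard counts for a projected 45 GiB dataset.
    for size in SHARD_CAPACITY_GIB:
        print(size, estimate_shard_count(45, size))
```

Running the estimate against the projected future size of the dataset, not its current size, matches the guidance above about planning for growth.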
distanceMeasureType

Determines the algorithm used for distance calculation between data points and the query vector.

The following distanceMeasureType settings can help reduce query latency:

  • DOT_PRODUCT_DISTANCE is most optimized for reducing latency
  • DOT_PRODUCT_DISTANCE combined with setting featureNormType to UNIT_L2_NORM is recommended for cosine similarity
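The reason the second combination works for cosine similarity is that the dot product of two vectors scaled to unit L2 length equals their cosine similarity. The following minimal sketch checks that identity locally with plain Python. The function names are illustrative only and are not part of any Vector Search API.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(dot(vector, vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity computed directly from the definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


if __name__ == "__main__":
    a, b = [3.0, 4.0], [4.0, 3.0]
    # Both values agree up to floating-point rounding.
    print(dot(l2_normalize(a), l2_normalize(b)))
    print(cosine_similarity(a, b))
```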
leafNodeEmbeddingCount

The number of embeddings for each leaf node. By default, this number is set to 1000.

Generally, changing the value of leafNodeEmbeddingCount has less effect than changing the value of other parameters.

Increasing the number of embeddings for each leaf node can reduce latency but reduce recall quality. It can affect performance in the following ways:

  • Recall: Decreased due to less targeted search
  • Latency: Reduced for most use cases, as long as the value does not exceed 15,000
  • Availability: No impact
  • Cost: Can decrease because fewer replicas are needed for the same QPS

Decreasing the number of embeddings for each leaf node can affect performance in the following ways:

  • Recall: Can increase because more targeted leaf nodes are searched
  • Latency: Increased
  • Availability: No impact
  • Cost: Can increase because more replicas are needed for the same QPS

Using a brute-force index to measure recall

To get the exact nearest neighbors, use indexes with the brute-force algorithm. The brute-force algorithm provides 100% recall at the expense of higher latency. A brute-force index is usually not a good choice for production serving, but you might find it useful for evaluating the recall of various indexing options offline.

To create an index with the brute-force algorithm, specify brute_force_config (bruteForceConfig in the JSON request) in the index metadata:

curl -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/indexes" \
  -d '{
    "displayName": "'"${DISPLAY_NAME}"'",
    "description": "'"${DESCRIPTION}"'",
    "metadata": {
      "contentsDeltaUri": "'"${INPUT_DIR}"'",
      "config": {
        "dimensions": 100,
        "approximateNeighborsCount": 150,
        "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
        "featureNormType": "UNIT_L2_NORM",
        "algorithmConfig": {
          "bruteForceConfig": {}
        }
      }
    }
  }'
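Once the same queries have been run against a brute-force index and an approximate index, recall can be computed offline by comparing the returned neighbor IDs. The following is a minimal sketch of that comparison. It assumes you have already collected the neighbor IDs for each query into plain Python lists. The function names recall_at_k and mean_recall are illustrative and are not part of the Vertex AI SDK.

```python
from typing import Hashable, Sequence


def recall_at_k(approx_ids: Sequence[Hashable], exact_ids: Sequence[Hashable]) -> float:
    """Fraction of the exact (brute-force) neighbors found by the approximate index."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


def mean_recall(
    approx_results: Sequence[Sequence[Hashable]],
    exact_results: Sequence[Sequence[Hashable]],
) -> float:
    """Average recall over a batch of queries; results are paired by position."""
    if len(approx_results) != len(exact_results):
        raise ValueError("both result lists must cover the same queries")
    if not exact_results:
        raise ValueError("at least one query is required")
    total = sum(recall_at_k(a, e) for a, e in zip(approx_results, exact_results))
    return total / len(exact_results)


if __name__ == "__main__":
    # Hypothetical neighbor IDs for two queries, for illustration only.
    exact = [["a", "b"], ["c", "x"]]
    approx = [["a", "b"], ["c", "d"]]
    print(mean_recall(approx, exact))
```

Averaging over a representative set of queries gives a more stable estimate than checking a single query.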

Delete an index

Note: You can't delete the index until all its index.deployed_indexes have been undeployed.
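Before attempting a delete, it can help to check which deployments still reference the index. The following sketch assumes you have already fetched the index resource as JSON and that its deployments appear under a deployedIndexes field, with each entry carrying indexEndpoint and deployedIndexId values. Verify those field names against the API reference before relying on them. The helper name blocking_deployments is hypothetical.

```python
from typing import Dict, List, Tuple


def blocking_deployments(index_resource: Dict) -> List[Tuple[str, str]]:
    """Return (index endpoint, deployed index ID) pairs that still reference the index.

    Assumes deployments are listed under "deployedIndexes" in the index resource.
    """
    return [
        (ref.get("indexEndpoint", ""), ref.get("deployedIndexId", ""))
        for ref in index_resource.get("deployedIndexes", [])
    ]


if __name__ == "__main__":
    # Hypothetical index resource, for illustration only.
    index = {
        "name": "projects/123/locations/us-central1/indexes/456",
        "deployedIndexes": [
            {
                "indexEndpoint": "projects/123/locations/us-central1/indexEndpoints/789",
                "deployedIndexId": "my_deployed_index",
            }
        ],
    }
    remaining = blocking_deployments(index)
    if remaining:
        for endpoint, deployed_id in remaining:
            print(f"Undeploy {deployed_id} from {endpoint} first")
    else:
        print("No deployments remain; the index can be deleted.")
```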

gcloud

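Before using any of the command data below, make the following replacements:

  • INDEX_ID: The ID of the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.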

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes delete INDEX_ID \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes delete INDEX_ID `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes delete INDEX_ID ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • INDEX_ID: The ID of the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID

To send your request, choose one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Execute the following command:

curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeleteOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-08T02:35:56.364956Z",
      "updateTime": "2022-01-08T02:35:56.364956Z"
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
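If you call the REST endpoint from a script, you will probably want to check the returned operation and not just print it. The following is a minimal sketch of that check. It assumes the response body has the shape shown above and that a failed operation carries an error field, as long-running operations generally do. The function name delete_succeeded is hypothetical.

```python
import json
from typing import Dict


def delete_succeeded(operation: Dict) -> bool:
    """Return True if the delete operation finished, False if it is still running.

    Raises RuntimeError if the operation reports an error.
    """
    if "error" in operation:
        error = operation["error"]
        raise RuntimeError(f"Index deletion failed: {error.get('message', error)}")
    return bool(operation.get("done", False))


if __name__ == "__main__":
    # Response body in the shape shown above, with placeholder values.
    body = """
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
      "done": true,
      "response": {"@type": "type.googleapis.com/google.protobuf.Empty"}
    }
    """
    print(delete_succeeded(json.loads(body)))
```

If the function returns False, the operation is still in progress and can be checked again later using the operation name.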

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_delete_index(
    project: str, location: str, index_name: str
) -> None:
    """Delete a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to delete. A fully-qualified index
          resource name or an index ID. Example:
          "projects/123/locations/us-central1/indexes/my_index_id" or
          "my_index_id".
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Delete the index
    index.delete()

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.aiplatform.v1.IndexName;
import com.google.cloud.aiplatform.v1.IndexServiceClient;
import com.google.cloud.aiplatform.v1.IndexServiceSettings;
import java.util.concurrent.TimeUnit;

public class DeleteIndexSample {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String location = "YOUR_LOCATION";
    String indexId = "YOUR_INDEX_ID";

    deleteIndexSample(project, location, indexId);
  }

  public static void deleteIndexSample(String project, String location, String indexId)
      throws Exception {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (IndexServiceClient indexServiceClient =
        IndexServiceClient.create(
            IndexServiceSettings.newBuilder()
                .setEndpoint(location + "-aiplatform.googleapis.com:443")
                .build())) {
      String indexName = IndexName.of(project, location, indexId).toString();
      // Block until the delete operation completes, or fail after five minutes.
      indexServiceClient.deleteIndexAsync(indexName).get(5, TimeUnit.MINUTES);
    }
  }
}

Console

Use these instructions to delete one or more indexes.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. A list of your active indexes is displayed.
  3. To delete an index, go to the options menu that is in the same row as the index and select Delete.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-17 UTC.