Manage indexes

The following sections describe how to configure, create, list, and delete your indexes.

Index overview

An index is a file or files consisting of your embedding vectors. These vectors are made from large amounts of data you want to deploy and query with Vector Search. With Vector Search, you can create two types of indexes, depending on how you plan to update them with your data: an index designed for batch updates, or an index designed for streaming updates.

A batch index is for updating your index in a batch, with data that has been stored over a set amount of time, such as systems that are processed weekly or monthly. A streaming index is for updating index data as new data is added to your data store, for example, if you have a bookstore and want to show new inventory online as soon as possible. The type you choose is important, because setup and requirements differ.

Configure index parameters

Before you create an index, configure the parameters for your index.

For example, create a file named index_metadata.json:

{
  "contentsDeltaUri": "gs://BUCKET_NAME/path",
  "config": {
    "dimensions": 100,
    "approximateNeighborsCount": 150,
    "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
    "shardSize": "SHARD_SIZE_MEDIUM",
    "algorithm_config": {
      "treeAhConfig": {
        "leafNodeEmbeddingCount": 5000,
        "fractionLeafNodesToSearch": 0.03
      }
    }
  }
}

You can find the definition for each of these fields in Index configuration parameters.
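As a minimal sketch, the same metadata file can be generated from Python instead of being written by hand. The helper name write_index_metadata and its arguments are hypothetical; the field names and values are copied from the example file, and BUCKET_NAME remains a placeholder.

```python
import json
from pathlib import Path


def write_index_metadata(path, contents_delta_uri, dimensions):
    """Write an index metadata file (hypothetical helper, not part of any SDK)."""
    metadata = {
        "contentsDeltaUri": contents_delta_uri,
        "config": {
            "dimensions": dimensions,
            "approximateNeighborsCount": 150,
            "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
            "shardSize": "SHARD_SIZE_MEDIUM",
            "algorithm_config": {
                "treeAhConfig": {
                    "leafNodeEmbeddingCount": 5000,
                    "fractionLeafNodesToSearch": 0.03,
                }
            },
        },
    }
    # Serialize with indentation so the file stays readable and diffable.
    Path(path).write_text(json.dumps(metadata, indent=2))
    return metadata


if __name__ == "__main__":
    write_index_metadata("index_metadata.json", "gs://BUCKET_NAME/path", 100)
```

Generating the file this way keeps the dimensions value next to the code that produces the embeddings, which can help avoid a mismatch between the two.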

Create an index

Index size

Index data is split into equal parts called shards for processing. When you create an index, you must specify the size of the shards to use. The supported sizes are as follows:

SHARD_SIZE_SMALL: 2 GiB per shard.
SHARD_SIZE_MEDIUM: 20 GiB per shard.
SHARD_SIZE_LARGE: 50 GiB per shard.
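To get a feel for how the shard size interacts with dataset size, the following sketch estimates a shard count. It assumes that the number of shards is roughly the dataset size divided by the shard size, rounded up. That formula is an illustrative assumption, not documented service behavior, and the function name estimate_shard_count is hypothetical.

```python
import math

# Shard sizes from the list above, in GiB.
SHARD_SIZE_GIB = {
    "SHARD_SIZE_SMALL": 2,
    "SHARD_SIZE_MEDIUM": 20,
    "SHARD_SIZE_LARGE": 50,
}


def estimate_shard_count(data_size_gib, shard_size):
    """Estimate the shard count as ceil(data size / shard size).

    Illustrative assumption only: the service decides the real sharding.
    """
    if data_size_gib <= 0:
        raise ValueError("data_size_gib must be positive")
    return math.ceil(data_size_gib / SHARD_SIZE_GIB[shard_size])


if __name__ == "__main__":
    # Compare the estimate for a 45 GiB dataset across the three shard sizes.
    for name in SHARD_SIZE_GIB:
        print(name, estimate_shard_count(45, name))
```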

The machine types that you can use to deploy your index (using public endpoints or using VPC endpoints) depend on the shard size of the index. The following table shows the shard sizes that each machine type supports:

Machine type	`SHARD_SIZE_SMALL`	`SHARD_SIZE_MEDIUM`	`SHARD_SIZE_LARGE`
`n1-standard-16`
`n1-standard-32`
`e2-standard-2`	(default)
`e2-standard-16`		(default)
`e2-highmem-16`			(default)
`n2d-standard-32`

To learn how shard size and machine type affect pricing, see the Vertex AI pricing page. To learn how shard size impacts performance, see Configuration parameters that impact performance.

Create an index for batch update

Use these instructions to create and deploy your index. If you don't have your embeddings ready yet, you can skip to Create an empty batch index. With this option, no embeddings data is required at index creation time.

Note: The dimensions field is relevant only to dense embeddings.

To create an index:

gcloud

Before using any of the command data below, make the following replacements:

LOCAL_PATH_TO_METADATA_FILE: The local path to the metadata file.
INDEX_NAME: Display name for the index.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes create \
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE \
    --display-name=INDEX_NAME \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes create `
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE `
    --display-name=INDEX_NAME `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes create ^
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE ^
    --display-name=INDEX_NAME ^
    --region=LOCATION ^
    --project=PROJECT_ID

You should receive a response similar to the following:

You can poll for the status of the operation for the response
to include "done": true. Use the following example to poll the status.

  $ gcloud ai operations describe 1234567890123456789 --project=my-test-project --region=us-central1

See gcloud ai operations to learn more about the describe command.
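The polling described in the response can be sketched as a small loop. This is a minimal sketch that assumes the caller supplies fetch_operation, a hypothetical callable that returns the operation as a dict (for example, a wrapper that runs gcloud ai operations describe and parses its output). The function name and the timing defaults are illustrative assumptions.

```python
import time


def wait_until_done(fetch_operation, poll_seconds=10.0, timeout_seconds=3600.0):
    """Call fetch_operation() until the returned dict contains "done": true.

    fetch_operation is a hypothetical callable supplied by the caller; it is
    not part of the gcloud CLI or the Vertex AI SDK.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        # Wait before asking again so the API is not polled in a tight loop.
        time.sleep(poll_seconds)
```

Keeping the fetch step behind a callable means the same loop works whether the status comes from the CLI, the REST API, or a client library.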

REST

Before using any of the request data, make the following replacements:

INPUT_DIR: The Cloud Storage directory path of the index content.
INDEX_NAME: Display name for the index.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes

Request JSON body:

{
  "display_name": "INDEX_NAME",
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithm_config": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "leafNodesToSearchPercent": 7
        }
      }
    }
  }
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes" | Select-Object -Expand Content
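For readers who prefer to script the call, the following sketch builds the same POST request with the Python standard library. It only constructs the request and does not send it. The function name build_create_index_request is hypothetical, and access_token is assumed to be the output of gcloud auth print-access-token.

```python
import json
import urllib.request


def build_create_index_request(project_id, location, body, access_token):
    """Build (but do not send) the POST request that creates an index."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/indexes"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # access_token is assumed to come from `gcloud auth print-access-token`.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request; that step is left out here so the sketch can be inspected without credentials.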

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-08T01:21:10.147035Z",
      "updateTime": "2022-01-08T01:21:10.147035Z"
    }
  }
}
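The name field in this response carries both the index ID and the operation ID. As a minimal sketch, assuming the name always follows the pattern shown in the sample response, a regular expression can split it into its parts. The function name parse_operation_name is hypothetical.

```python
import re

# Pattern taken from the sample response:
# projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID
_OPERATION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/indexes/(?P<index_id>[^/]+)/operations/(?P<operation_id>[^/]+)$"
)


def parse_operation_name(name):
    """Split an operation name of the form shown above into a dict of parts."""
    match = _OPERATION_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected operation name: {name!r}")
    return match.groupdict()
```

The operation_id part is the value that a status check such as gcloud ai operations describe expects.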

Terraform

The following sample uses the google_vertex_ai_index Terraform resource to create an index for batch updates.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

# Cloud Storage bucket name must be unique
resource "random_id" "bucket_name_suffix" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.bucket_name_suffix.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id":"42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id":"43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import Optional

from google.cloud import aiplatform


def vector_search_create_index(
    project: str, location: str, display_name: str, gcs_uri: Optional[str] = None
) -> aiplatform.MatchingEngineIndex:
    """Create a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index display name
        gcs_uri (str): Optional. The Google Cloud Storage uri for index content

    Returns:
        The created MatchingEngineIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index
    index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
        display_name=display_name,
        contents_delta_uri=gcs_uri,
        description="Matching Engine Index",
        dimensions=100,
        approximate_neighbors_count=150,
        leaf_node_embedding_count=500,
        leaf_nodes_to_search_percent=7,
        index_update_method="BATCH_UPDATE",  # Options: STREAM_UPDATE, BATCH_UPDATE
        distance_measure_type=aiplatform.matching_engine.matching_engine_index_config.DistanceMeasureType.DOT_PRODUCT_DISTANCE,
    )

    return index

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.aiplatform.v1.CreateIndexRequest;
import com.google.cloud.aiplatform.v1.Index;
import com.google.cloud.aiplatform.v1.Index.IndexUpdateMethod;
import com.google.cloud.aiplatform.v1.IndexServiceClient;
import com.google.cloud.aiplatform.v1.IndexServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.util.concurrent.TimeUnit;

public class CreateIndexSample {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String location = "YOUR_LOCATION";
    String displayName = "YOUR_INDEX_DISPLAY_NAME";
    String contentsDeltaUri = "gs://YOUR_BUCKET/";
    String metadataJson =
        String.format(
            "{\n"
                + "  \"contentsDeltaUri\": \"%s\",\n"
                + "  \"config\": {\n"
                + "    \"dimensions\": 100,\n"
                + "        \"approximateNeighborsCount\": 150,\n"
                + "        \"distanceMeasureType\": \"DOT_PRODUCT_DISTANCE\",\n"
                + "        \"shardSize\": \"SHARD_SIZE_MEDIUM\",\n"
                + "        \"algorithm_config\": {\n"
                + "      \"treeAhConfig\": {\n"
                + "        \"leafNodeEmbeddingCount\": 5000,\n"
                + "            \"fractionLeafNodesToSearch\": 0.03\n"
                + "      }\n"
                + "    }\n"
                + "  }\n"
                + "}",
            contentsDeltaUri);

    createIndexSample(project, location, displayName, metadataJson);
  }

  public static Index createIndexSample(
      String project, String location, String displayName, String metadataJson)
      throws Exception {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (IndexServiceClient indexServiceClient =
        IndexServiceClient.create(
            IndexServiceSettings.newBuilder()
                .setEndpoint(location + "-aiplatform.googleapis.com:443")
                .build())) {
      Value.Builder metadataBuilder = Value.newBuilder();
      JsonFormat.parser().merge(metadataJson, metadataBuilder);

      CreateIndexRequest request =
          CreateIndexRequest.newBuilder()
              .setParent(LocationName.of(project, location).toString())
              .setIndex(
                  Index.newBuilder()
                      .setDisplayName(displayName)
                      .setMetadata(metadataBuilder)
                      .setIndexUpdateMethod(IndexUpdateMethod.BATCH_UPDATE))
              .build();

      return indexServiceClient.createIndexAsync(request).get(5, TimeUnit.MINUTES);
    }
  }
}

Console

Use these instructions to create an index for batch updates.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
Click Create new index to open the Index pane. The Create a new index pane appears.
In the Display name field, provide a name to uniquely identify your index.
In the Description field, provide a description for what the index is for.
In the Region field, select a region from the drop-down.
In the Cloud Storage field, search and select the Cloud Storage folder where your vector data is stored.
In the Algorithm type drop-down, select the algorithm type that Vector Search uses for efficient search. If you select the treeAh algorithm, enter the approximate neighbors count.
In the Dimensions field, enter the number of dimensions of your input vectors.
In the Update method field, select Batch.
In the Shard size field, select from the drop-down the shard size you want.
Click Create. Your new index appears in your list of indexes once it's ready. Note: Build time can take up to an hour to complete.

Create an empty batch index

To create and deploy your index right away, you can create an empty batch index. With this option, no embeddings data is required at index creation time.

To create an empty index, the request is almost identical to creating an index for batch updates. The difference is that you remove the contentsDeltaUri field, since you aren't linking a data location. Here's an empty batch index example:

Empty index request example

{
  "display_name": "INDEX_NAME",
  "indexUpdateMethod": "BATCH_UPDATE",
  "metadata": {
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithm_config": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "fractionLeafNodesToSearch": 0.07
        }
      }
    }
  }
}
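Because the only difference from a regular request is the missing contentsDeltaUri field, an existing request body can be turned into an empty-index request mechanically. The following is a minimal sketch; the function name to_empty_index_request is hypothetical, and the request body is assumed to be a plain Python dict with the layout shown in the example.

```python
import copy


def to_empty_index_request(request_body):
    """Return a copy of a create-index request body without contentsDeltaUri."""
    # Deep-copy so the caller's original request body is left untouched.
    empty = copy.deepcopy(request_body)
    empty.get("metadata", {}).pop("contentsDeltaUri", None)
    return empty
```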

Create an index for streaming updates

Use these instructions to create and deploy your streaming index. If you don't have your embeddings ready yet, skip to Create an empty index for streaming updates. With this option, no embeddings data is required at index creation time.

Note: The dimensions field is relevant only to dense embeddings.

REST

Before using any of the request data, make the following replacements:

INDEX_NAME: Display name for the index.

DESCRIPTION: A description of the index.

INPUT_DIR: The Cloud Storage directory path of the index content.

DIMENSIONS: Number of dimensions of the embedding vector.

PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.
LOCATION: The region where you are using Vertex AI.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes

Request JSON body:

{
  "displayName": "INDEX_NAME",
  "description": "DESCRIPTION",
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "config": {
      "dimensions": "DIMENSIONS",
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithmConfig": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 10000,
          "leafNodesToSearchPercent": 2
        }
      }
    }
  },
  "indexUpdateMethod": "STREAM_UPDATE"
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes"

PowerShell (Windows)

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.ui.CreateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-12-05T23:17:45.416117Z",
      "updateTime": "2023-12-05T23:17:45.416117Z",
      "state": "RUNNING",
      "worksOn": [
        "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID"
      ]
    }
  }
}
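In this response, the index being created is listed under worksOn rather than in the operation name. As a minimal sketch, assuming the response has been parsed into a dict with the layout shown, the state and the index resource name can be read as follows. The function name index_from_operation is hypothetical.

```python
def index_from_operation(operation):
    """Return (state, index resource name) from a create-index operation dict."""
    generic = operation["metadata"]["genericMetadata"]
    # worksOn may be absent or empty, so fall back to None for the index name.
    works_on = generic.get("worksOn", [])
    return generic.get("state"), (works_on[0] if works_on else None)
```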

Terraform

The following sample uses the google_vertex_ai_index Terraform resource to create an index for streaming updates.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id":"42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id":"43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "streaming_index" {
  region       = "us-central1"
  display_name = "sample-index-streaming-update"
  description  = "A sample index for streaming update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "STREAM_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import Optional

from google.cloud import aiplatform


def vector_search_create_streaming_index(
    project: str, location: str, display_name: str, gcs_uri: Optional[str] = None
) -> aiplatform.MatchingEngineIndex:
    """Create a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index display name
        gcs_uri (str): Optional. The Google Cloud Storage uri for index content

    Returns:
        The created MatchingEngineIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index
    index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
        display_name=display_name,
        contents_delta_uri=gcs_uri,
        description="Matching Engine Index",
        dimensions=100,
        approximate_neighbors_count=150,
        leaf_node_embedding_count=500,
        leaf_nodes_to_search_percent=7,
        index_update_method="STREAM_UPDATE",  # Options: STREAM_UPDATE, BATCH_UPDATE
        distance_measure_type=aiplatform.matching_engine.matching_engine_index_config.DistanceMeasureType.DOT_PRODUCT_DISTANCE,
    )

    return index

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.aiplatform.v1.CreateIndexRequest;
import com.google.cloud.aiplatform.v1.Index;
import com.google.cloud.aiplatform.v1.Index.IndexUpdateMethod;
import com.google.cloud.aiplatform.v1.IndexServiceClient;
import com.google.cloud.aiplatform.v1.IndexServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.util.concurrent.TimeUnit;

public class CreateStreamingIndexSample {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String location = "YOUR_LOCATION";
    String displayName = "YOUR_INDEX_DISPLAY_NAME";
    String contentsDeltaUri = "gs://YOUR_BUCKET/";
    String metadataJson =
        String.format(
            "{\n"
                + "  \"contentsDeltaUri\": \"%s\",\n"
                + "  \"config\": {\n"
                + "    \"dimensions\": 100,\n"
                + "        \"approximateNeighborsCount\": 150,\n"
                + "        \"distanceMeasureType\": \"DOT_PRODUCT_DISTANCE\",\n"
                + "        \"shardSize\": \"SHARD_SIZE_MEDIUM\",\n"
                + "        \"algorithm_config\": {\n"
                + "      \"treeAhConfig\": {\n"
                + "        \"leafNodeEmbeddingCount\": 5000,\n"
                + "            \"fractionLeafNodesToSearch\": 0.03\n"
                + "      }\n"
                + "    }\n"
                + "  }\n"
                + "}",
            contentsDeltaUri);

    createStreamingIndexSample(project, location, displayName, metadataJson);
  }

  public static Index createStreamingIndexSample(
      String project, String location, String displayName, String metadataJson)
      throws Exception {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (IndexServiceClient indexServiceClient =
        IndexServiceClient.create(
            IndexServiceSettings.newBuilder()
                .setEndpoint(location + "-aiplatform.googleapis.com:443")
                .build())) {
      Value.Builder metadataBuilder = Value.newBuilder();
      JsonFormat.parser().merge(metadataJson, metadataBuilder);

      CreateIndexRequest request =
          CreateIndexRequest.newBuilder()
              .setParent(LocationName.of(project, location).toString())
              .setIndex(
                  Index.newBuilder()
                      .setDisplayName(displayName)
                      .setMetadata(metadataBuilder)
                      .setIndexUpdateMethod(IndexUpdateMethod.STREAM_UPDATE))
              .build();

      return indexServiceClient.createIndexAsync(request).get(5, TimeUnit.MINUTES);
    }
  }
}

Console

Use these instructions to create an index for streaming updates in the Google Cloud console.

Creating an index for streaming updates requires steps similar to those for a batch update index, except that you set indexUpdateMethod to STREAM_UPDATE.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
Click Create new index to open the Index pane. The Create a new index pane appears.
In the Display name field, provide a name to uniquely identify your index.
In the Description field, provide a description for what the index is for.
In the Region field, select a region from the drop-down.
In the Cloud Storage field, search and select the Cloud Storage folder where your vector data is stored.
In the Algorithm type drop-down, select the algorithm type that Vector Search will use to perform your search. If you select the treeAh algorithm, enter the approximate neighbors count.
In the Dimensions field, enter the number of dimensions of your input vectors.
In the Update method field, select Stream.
In the Shard size field, select from the drop-down the shard size you want.
Click Create. Your new index appears in your list of indexes once it's ready. Note: Build time can take up to an hour to complete.

Create an empty index for streaming updates

To create and deploy your index right away, you can create an empty index for streaming. With this option, no embeddings data is required at index creation time.

To create an empty index, the request is almost identical to creating an index for streaming. The difference is that you remove the contentsDeltaUri field, since you aren't linking a data location. Here's an empty streaming index example:

Empty index request example

{
  "display_name": "INDEX_NAME",
  "indexUpdateMethod": "STREAM_UPDATE",
  "metadata": {
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "algorithm_config": {
        "treeAhConfig": {
          "leafNodeEmbeddingCount": 500,
          "leafNodesToSearchPercent": 7
        }
      }
    }
  }
}

List indexes

gcloud

Before using any of the command data below, make the following replacements:

INDEX_NAME: Display name for the index.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes list \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes list `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes list ^
    --region=LOCATION ^
    --project=PROJECT_ID

You should receive a response similar to the following:

You can poll for the status of the operation for the response
to include "done": true. Use the following example to poll the status.

  $ gcloud ai operations describe 1234567890123456789 --project=my-test-project --region=us-central1

See gcloud ai operations to learn more about the describe command.

REST

Before using any of the request data, make the following replacements:

INDEX_NAME: Display name for the index.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes"

PowerShell (Windows)

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "indexes": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
      "displayName": "INDEX_NAME",
      "metadataSchemaUri": "gs://google-cloud-aiplatform/schema/matchingengine/metadata/nearest_neighbor_search_1.0.0.yaml",
      "metadata": {
        "config": {
          "dimensions": 100,
          "approximateNeighborsCount": 150,
          "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
          "featureNormType": "NONE",
          "algorithmConfig": {
            "treeAhConfig": {
              "maxLeavesToSearch": 50,
              "leafNodeCount": 10000
            }
          }
        }
      },
      "etag": "AMEw9yNU8YX5IvwuINeBkVv3yNa7VGKk11GBQ8GkfRoVvO7LgRUeOo0qobYWuU9DiEc=",
      "createTime": "2020-11-08T21:56:30.558449Z",
      "updateTime": "2020-11-08T22:39:25.048623Z"
    }
  ]
}
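A list response like this one is easier to scan once it is reduced to a few fields per index. The following is a minimal sketch that assumes the response text has the layout of the sample; the function name summarize_indexes is hypothetical.

```python
import json


def summarize_indexes(response_text):
    """Return (index ID, display name, dimensions) for each index in a list response."""
    response = json.loads(response_text)
    rows = []
    for index in response.get("indexes", []):
        # The index ID is the last path segment of the resource name.
        index_id = index["name"].rsplit("/", 1)[-1]
        dimensions = index.get("metadata", {}).get("config", {}).get("dimensions")
        rows.append((index_id, index.get("displayName"), dimensions))
    return rows
```

An empty project returns a response without an indexes key, which this sketch treats as an empty list.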

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import List

from google.cloud import aiplatform


def vector_search_list_index(
    project: str, location: str
) -> List[aiplatform.MatchingEngineIndex]:
    """List vector search indexes.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name

    Returns:
        List of aiplatform.MatchingEngineIndex
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # List Indexes
    return aiplatform.MatchingEngineIndex.list()

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.aiplatform.v1.Index;
import com.google.cloud.aiplatform.v1.IndexServiceClient;
import com.google.cloud.aiplatform.v1.IndexServiceClient.ListIndexesPagedResponse;
import com.google.cloud.aiplatform.v1.IndexServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;

public class ListIndexesSample {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String location = "YOUR_LOCATION";

    for (Index index : listIndexesSample(project, location).iterateAll()) {
      System.out.println(index.getName());
    }
  }

  public static ListIndexesPagedResponse listIndexesSample(String project, String location)
      throws Exception {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (IndexServiceClient indexServiceClient =
        IndexServiceClient.create(
            IndexServiceSettings.newBuilder()
                .setEndpoint(location + "-aiplatform.googleapis.com:443")
                .build())) {
      String parent = LocationName.of(project, location).toString();
      return indexServiceClient.listIndexes(parent);
    }
  }
}

Console

Use these instructions to view a list of your indexes.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
A list of your active indexes is displayed.

Tuning the index

Tuning the index requires setting the configuration parameters that impact the performance of deployed indexes, especially recall and latency. These parameters are set when you first create the index. You can use brute-force indexes to measure recall.

Configuration parameters that impact performance

The following configuration parameters can be set at index creation time and can affect recall, latency, availability, and cost when using Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.

For parameter definitions, see Index configuration parameters.


Parameter	About	Performance impact
`shardSize`	Controls the amount of data on each machine. When choosing a shard size, estimate how large your dataset will be in the future. If the size of your dataset has an upper bound, pick the appropriate shard size to accommodate it. If there is no upper bound or if your use case is extremely sensitive to latency variability, choosing a large shard size is recommended.	If you configure for a larger number of smaller shards, a larger number of candidate results are processed during search. More shards can affect performance in the following ways: Recall: Increased. Latency: Potentially increased, more variability. Availability: Shard outages affect a smaller percentage of data. Cost: Can increase if the same machine type is used with more shards. If you configure for a smaller number of larger shards, fewer candidate results are processed during search. Fewer shards can affect performance in the following ways: Recall: Decreased. Latency: Reduced, less variability. Availability: Shard outages affect a larger percentage of data. Cost: Can decrease if the same machine type is used with fewer shards.
`distanceMeasureType`	Determines the algorithm used for distance calculation between data points and the query vector.	The following `distanceMeasureType` settings can help reduce query latency: `DOT_PRODUCT_DISTANCE` is most optimized for reducing latency. `DOT_PRODUCT_DISTANCE` combined with setting `FeatureNormType` to `UNIT_L2_NORM` is recommended for cosine similarity.
`leafNodeEmbeddingCount`	The number of embeddings for each leaf node. By default, this number is set to 1000. Generally, changing the value of `leafNodeEmbeddingCount` has less effect than changing the value of other parameters.	Increasing the number of embeddings for each leaf node can reduce latency but reduce recall quality. It can affect performance in the following ways: Recall: Decreased due to less targeted search. Latency: Reduced, as long as the value is not >15k for most use cases. Availability: No impact. Cost: Can decrease because fewer replicas are needed for the same QPS. Decreasing the number of embeddings for each leaf node can affect performance in the following ways: Recall: Can increase because more targeted leaves are collected. Latency: Increased. Availability: No impact. Cost: Can increase because more replicas are needed for the same QPS.

shardSize

Controls the amount of data on each machine.

When choosing a shard size, estimate how large your dataset will be in the future. If the size of your dataset has an upper bound, pick the appropriate shard size to accommodate it. If there is no upper bound or if your use case is extremely sensitive to latency variability, choosing a large shard size is recommended.

If you configure for a larger number of smaller shards, a larger number of candidate results are processed during search. More shards can affect performance in the following ways:

Recall: Increased
Latency: Potentially increased, more variability
Availability: Shard outages affect a smaller percentage of data
Cost: Can increase if the same machine type is used with more shards

If you configure for a smaller number of larger shards, fewer candidate results are processed during search. Fewer shards can affect performance in the following ways:

Recall: Decreased
Latency: Reduced, less variability
Availability: Shard outages affect a larger percentage of data
Cost: Can decrease if the same machine type is used with fewer shards
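
To make the shard-count tradeoff concrete, the following Python sketch estimates how many shards a dataset would occupy at each shard size listed under Index size. It is a rough illustration only: the function name estimate_shard_count is hypothetical, and the size estimate assumes raw float32 storage (4 bytes per dimension) with no metadata or index overhead, which the service does not guarantee.

```python
import math

# Shard capacities as listed in the "Index size" section of this page.
SHARD_SIZE_BYTES = {
    "SHARD_SIZE_SMALL": 2 * 1024**3,
    "SHARD_SIZE_MEDIUM": 20 * 1024**3,
    "SHARD_SIZE_LARGE": 50 * 1024**3,
}


def estimate_shard_count(
    num_vectors: int,
    dimensions: int,
    shard_size: str,
    bytes_per_value: int = 4,
) -> int:
    """Rough shard count for a dataset.

    Assumption (not from the service docs): the dataset size is simply
    num_vectors * dimensions * bytes_per_value, with no overhead.
    """
    dataset_bytes = num_vectors * dimensions * bytes_per_value
    return max(1, math.ceil(dataset_bytes / SHARD_SIZE_BYTES[shard_size]))


if __name__ == "__main__":
    # Example: 100 million vectors with 100 dimensions each.
    for size in SHARD_SIZE_BYTES:
        print(size, estimate_shard_count(100_000_000, 100, size))
```

Comparing the counts for your projected dataset size can help you decide whether a larger number of smaller shards or a smaller number of larger shards fits your recall, latency, and cost goals.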

distanceMeasureType

Determines the algorithm used for distance calculation between data points and the query vector.

The following distanceMeasureType settings can help reduce query latency:

DOT_PRODUCT_DISTANCE is the most optimized for reducing latency
DOT_PRODUCT_DISTANCE combined with setting featureNormType to UNIT_L2_NORM is recommended for cosine similarity
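
The reason the dot product with unit L2 normalization stands in for cosine similarity is that, once both vectors have length 1, their dot product equals the cosine of the angle between them. The following Python sketch checks this numerically. The helper names are illustrative and are not part of any Vector Search API.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine similarity computed directly from the definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


if __name__ == "__main__":
    a = [3.0, 4.0]
    b = [4.0, 3.0]
    # Both values agree up to floating-point rounding.
    print(cosine_similarity(a, b))
    print(dot(l2_normalize(a), l2_normalize(b)))
```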

leafNodeEmbeddingCount

The number of embeddings for each leaf node. By default, this number is set to 1000.

Generally, changing the value of leafNodeEmbeddingCount has less effect than changing the value of other parameters.

Increasing the number of embeddings for each leaf node can reduce latency, but it can also reduce recall. It can affect performance in the following ways:

Recall: Decreased due to less targeted search
Latency: Reduced for most use cases, as long as the value does not exceed 15,000
Availability: No impact
Cost: Can decrease because fewer replicas are needed for the same QPS

Decreasing the number of embeddings for each leaf node can affect performance in the following ways:

Recall: Can increase because more targeted leaf nodes are collected
Latency: Increased
Availability: No impact
Cost: Can increase because more replicas are needed for the same QPS

Using a brute-force index to measure recall

To get the exact nearest neighbors, use indexes with the brute-force algorithm. The brute-force algorithm provides 100% recall at the expense of higher latency. A brute-force index is usually not a good choice for production serving, but you might find it useful for evaluating the recall of various indexing options offline.

To create an index with the brute-force algorithm, specify brute_force_config in the index metadata:

curl -X POST \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/indexes" \
     -d '{
       "displayName": "'"${DISPLAY_NAME}"'",
       "description": "'"${DESCRIPTION}"'",
       "metadata": {
         "contentsDeltaUri": "'"${INPUT_DIR}"'",
         "config": {
           "dimensions": 100,
           "approximateNeighborsCount": 150,
           "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
           "featureNormType": "UNIT_L2_NORM",
           "algorithmConfig": {
             "bruteForceConfig": {}
           }
         }
       }
     }'
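
Once you have query results from both a brute-force index (the exact neighbors) and a tuned index (the approximate neighbors), recall can be computed offline by comparing the returned IDs. The following Python sketch shows one common way to do this, recall@k. The function names and the input format (lists of neighbor IDs, best match first) are assumptions for illustration and are not defined by Vector Search.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the exact top-k neighbors that the approximate result also returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


def mean_recall(exact_results, approx_results, k):
    """Average recall@k over a batch of queries (one ID list per query)."""
    if len(exact_results) != len(approx_results):
        raise ValueError("both result sets must cover the same queries")
    scores = [recall_at_k(e, a, k) for e, a in zip(exact_results, approx_results)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical neighbor IDs for two queries.
    exact = [["a", "b", "c", "d"], ["p", "q", "r", "s"]]
    approx = [["a", "c", "x", "b"], ["p", "q", "r", "s"]]
    print(mean_recall(exact, approx, k=4))
```

Running the same query set against several candidate configurations and comparing their mean recall gives a like-for-like basis for choosing between them.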

Delete an index

Note: You can't delete the index until all its index.deployed_indexes have been undeployed.
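
Before issuing a delete, you can check whether any deployments still reference the index. The following Python sketch inspects an Index resource that you have already fetched as JSON and lists the deployments that would block deletion. The function name is hypothetical, and the field names deployedIndexes, indexEndpoint, and deployedIndexId are assumed to follow the v1 Index REST resource. Verify them against the API reference before relying on this.

```python
def blocking_deployments(index_resource: dict) -> list:
    """Return (index_endpoint, deployed_index_id) pairs that still reference the index.

    Assumes index_resource is the parsed JSON of an Index resource and that
    deployments appear under the "deployedIndexes" key.
    """
    refs = index_resource.get("deployedIndexes") or []
    return [(ref.get("indexEndpoint"), ref.get("deployedIndexId")) for ref in refs]


if __name__ == "__main__":
    # Hypothetical resource with one remaining deployment.
    index = {
        "name": "projects/123/locations/us-central1/indexes/456",
        "deployedIndexes": [
            {
                "indexEndpoint": "projects/123/locations/us-central1/indexEndpoints/789",
                "deployedIndexId": "my_deployed_index",
            }
        ],
    }
    for endpoint, deployed_id in blocking_deployments(index):
        print(f"Undeploy {deployed_id} from {endpoint} before deleting the index.")
```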

gcloud

Before using any of the command data below, make the following replacements:

INDEX_ID: The ID of the index.

LOCATION: The region where you are using Vertex AI.

PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init, or gcloud auth login and gcloud config set project.

gcloud ai indexes delete INDEX_ID \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init, or gcloud auth login and gcloud config set project.

gcloud ai indexes delete INDEX_ID `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init, or gcloud auth login and gcloud config set project.

gcloud ai indexes delete INDEX_ID ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

INDEX_ID: The ID of the index.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Execute the following command:

curl -X DELETE \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID"

PowerShell (Windows)

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method DELETE `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeleteOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-08T02:35:56.364956Z",
      "updateTime": "2022-01-08T02:35:56.364956Z"
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
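
The response above is a long-running operation, so a script that calls the REST endpoint usually needs to read the done flag and look for an error before treating the delete as complete. The following Python sketch parses a response body of that shape. The function name summarize_operation is hypothetical, and the error field with a message is assumed to follow the standard long-running operation format rather than anything shown on this page.

```python
import json


def summarize_operation(body: str) -> dict:
    """Extract the operation ID and completion status from a delete response body."""
    op = json.loads(body)
    done = bool(op.get("done", False))
    # Assumption: a failed operation carries an "error" object with a "message".
    error = op.get("error")
    return {
        "operation_id": op["name"].rsplit("/", 1)[-1],
        "done": done,
        "succeeded": done and error is None,
        "error_message": error.get("message") if error else None,
    }


if __name__ == "__main__":
    sample = """
    {
      "name": "projects/123/locations/us-central1/indexes/456/operations/789",
      "done": true,
      "response": {"@type": "type.googleapis.com/google.protobuf.Empty"}
    }
    """
    print(summarize_operation(sample))
```

If done is false, the operation is still running and the same operation name can be checked again later.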

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_delete_index(
    project: str, location: str, index_name: str
) -> None:
    """Delete a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to delete. A fully-qualified
          index resource name or an index ID. Example:
          "projects/123/locations/us-central1/indexes/my_index_id" or
          "my_index_id".
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Delete the index
    index.delete()

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.aiplatform.v1.IndexName;
import com.google.cloud.aiplatform.v1.IndexServiceClient;
import com.google.cloud.aiplatform.v1.IndexServiceSettings;
import java.util.concurrent.TimeUnit;

public class DeleteIndexSample {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String location = "YOUR_LOCATION";
    String indexId = "YOUR_INDEX_ID";

    deleteIndexSample(project, location, indexId);
  }

  public static void deleteIndexSample(String project, String location, String indexId)
      throws Exception {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (IndexServiceClient indexServiceClient =
        IndexServiceClient.create(
            IndexServiceSettings.newBuilder()
                .setEndpoint(location + "-aiplatform.googleapis.com:443")
                .build())) {
      String indexName = IndexName.of(project, location, indexId).toString();
      indexServiceClient.deleteIndexAsync(indexName).get(5, TimeUnit.MINUTES);
    }
  }
}

Console

Use these instructions to delete one or more indexes.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
A list of your active indexes is displayed.
To delete an index, go to the options menu that is in the same row as the index and select Delete.

What's next

Learn about Index configuration parameters
Learn how to Deploy and manage public index endpoints
Learn how to Deploy and manage index endpoints in a VPC network
Learn how to Update and rebuild your index
Learn how to Monitor an index

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-17 UTC.
