Deploy and manage index endpoints in a VPC network

Deploying an index to an endpoint includes the following three tasks:

  1. Create an IndexEndpoint if needed, or reuse an existing IndexEndpoint.
  2. Get the IndexEndpoint ID.
  3. Deploy the index to the IndexEndpoint.

Create an IndexEndpoint within your VPC network

If you are deploying an Index to an existing IndexEndpoint, you can skip this step.

Before you use an index to serve online vector matching queries, you must deploy the Index to an IndexEndpoint within your VPC Network Peering network. The first step is to create an IndexEndpoint. You can deploy more than one index to an IndexEndpoint that shares the same VPC network.

gcloud

The following example uses the gcloud ai index-endpoints create command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_NAME: Display name of the index endpoint.
  • VPC_NETWORK_NAME: The Google Compute Engine network name to which the index endpoint should be peered.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints create \
--display-name=INDEX_ENDPOINT_NAME \
--network=VPC_NETWORK_NAME \
--region=LOCATION \
--project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints create `
--display-name=INDEX_ENDPOINT_NAME `
--network=VPC_NETWORK_NAME `
--region=LOCATION `
--project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints create ^
--display-name=INDEX_ENDPOINT_NAME ^
--network=VPC_NETWORK_NAME ^
--region=LOCATION ^
--project=PROJECT_ID

You should receive a response similar to the following:

The Google Cloud CLI tool might take a few minutes to create the IndexEndpoint.

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_NAME: Display name of the index endpoint.
  • VPC_NETWORK_NAME: The Google Compute Engine network name to which the index endpoint should be peered.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints

Request JSON body:

{
  "display_name": "INDEX_ENDPOINT_NAME",
  "network": "VPC_NETWORK_NAME"
}
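As a rough sketch, the URL and JSON body above can be assembled in Python before handing them to any HTTP client. The function name build_create_index_endpoint_request is hypothetical; the URL pattern and field names are copied from the request shown above, and authentication is left out.

```python
from typing import Dict, Tuple


def build_create_index_endpoint_request(
    project_id: str, location: str, display_name: str, network: str
) -> Tuple[str, Dict[str, str]]:
    """Return the (url, body) pair for the indexEndpoints create request."""
    # URL pattern taken from the "HTTP method and URL" line above.
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/indexEndpoints"
    )
    # Field names taken from the request JSON body above.
    body = {"display_name": display_name, "network": network}
    return url, body
```

The returned body can be serialized with json.dumps and written to request.json for the curl or PowerShell commands that follow.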

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateIndexEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:09:56.641107Z",
      "updateTime": "2022-01-13T04:09:56.641107Z"
    }
  }
}

You can poll for the status of the operation until the response includes "done": true.
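The polling step can be sketched as a small loop. This is a minimal illustration, not the SDK's mechanism: wait_for_operation and its parameters are hypothetical, and get_operation stands in for whatever call you use to fetch the operation resource as a dictionary.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    get_operation: Callable[[], Dict[str, Any]],
    poll_interval_s: float = 10.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Fetch the operation repeatedly until it reports "done": true."""
    waited = 0.0
    while True:
        operation = get_operation()
        if operation.get("done") is True:
            return operation
        # Give up once the accumulated wait reaches the timeout.
        if waited >= timeout_s:
            raise TimeoutError("operation did not finish in time")
        sleep(poll_interval_s)
        waited += poll_interval_s
```

Passing the fetcher and the sleep function as arguments keeps the loop independent of any HTTP library and easy to exercise with canned responses.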

Terraform

The following sample uses the vertex_ai_index_endpoint Terraform resource to create an index endpoint.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

resource "google_vertex_ai_index_endpoint" "default" {
  display_name = "sample-endpoint"
  description  = "A sample index endpoint within a VPC network"
  region       = "us-central1"
  network      = "projects/${data.google_project.project.number}/global/networks/${google_compute_network.default.name}"
  depends_on = [
    google_service_networking_connection.default
  ]
}

resource "google_service_networking_connection" "default" {
  network                 = google_compute_network.default.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.default.name]
  # Workaround to allow `terraform destroy`, see https://github.com/hashicorp/terraform-provider-google/issues/18729
  deletion_policy = "ABANDON"
}

resource "google_compute_global_address" "default" {
  name          = "sample-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.default.id
}

resource "google_compute_network" "default" {
  name = "sample-network"
}

data "google_project" "project" {}

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id":"42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id":"43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_create_index_endpoint_vpc(
    project: str, location: str, display_name: str, network: str
) -> aiplatform.MatchingEngineIndexEndpoint:
    """Create a vector search index endpoint within a VPC network.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index endpoint display name
        network (str): Required. The VPC network name, in the format of
            projects/{project number}/global/networks/{network name}.

    Returns:
        aiplatform.MatchingEngineIndexEndpoint - The created index endpoint.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index Endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
        display_name=display_name,
        network=network,
        description="Matching Engine VPC Index Endpoint",
    )

    return index_endpoint

Console

Use these instructions to create an index endpoint.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
  2. A list of your active indexes is displayed.
  3. On the top of the page, select the Index endpoints tab. Your index endpoints are displayed.
  4. Click Create new index endpoint. The Create a new index endpoint panel opens.
  5. Enter a display name for the index endpoint.
  6. In the Region field, select a region from the drop-down.
  7. In the Access field, select Private.
  8. Enter your peered VPC network details. Enter the full name of the Compute Engine network to which the index endpoint should be peered. The format should be projects/{project_num}/global/networks/{network_id}.
  9. Click Create.
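The full network name used in step 8 follows a fixed pattern, so it can be built and checked locally. The sketch below is illustrative only: both function names are hypothetical, and the check assumes that the project number is all digits and that the network ID contains no slash.

```python
import re

# Assumed shape: projects/{project_num}/global/networks/{network_id}
_NETWORK_NAME = re.compile(r"^projects/\d+/global/networks/[^/]+$")


def build_network_name(project_number: str, network_id: str) -> str:
    """Compose the full Compute Engine network name from its two parts."""
    return f"projects/{project_number}/global/networks/{network_id}"


def is_full_network_name(value: str) -> bool:
    """Return True if value matches the assumed full network name format."""
    return _NETWORK_NAME.match(value) is not None
```

For example, build_network_name("123456789", "sample-network") produces a string that is_full_network_name accepts, while a bare network name such as "sample-network" is rejected.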

Deploy an index

Note: If your DeployedIndex uses fewer than two replicas per shard, then it is excluded from the Vertex AI Service Level Agreement. For your DeployedIndex to be covered by the SLA, you must set minReplicaCount to at least 2, and the deployment must be adequately provisioned for the workload size. To be adequately provisioned, we recommend adding replicas if CPU or memory usage consistently reaches 60%.

Important: Initial deployment to an endpoint typically takes between 20 and 30 minutes.
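The two conditions in the note can be turned into a simple pre-deployment check. This is a sketch under assumptions: sla_findings is a hypothetical helper, utilization is a fraction between 0 and 1 that you measure yourself, and treating "at or above 60%" as the trigger for adding replicas is one reading of the recommendation above.

```python
from typing import List


def sla_findings(
    min_replica_count: int, utilization: float, threshold: float = 0.60
) -> List[str]:
    """List the reasons a deployment may fall short of the conditions in the note."""
    findings = []
    if min_replica_count < 2:
        findings.append("minReplicaCount is below 2, so the DeployedIndex is excluded from the SLA")
    # Assumption: sustained usage at or above the threshold means under-provisioned.
    if utilization >= threshold:
        findings.append("CPU or memory usage is at or above the threshold; consider adding replicas")
    return findings
```

An empty list means neither condition was triggered for the values supplied.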

gcloud

This example uses the gcloud ai index-endpoints deploy-index command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • DEPLOYED_INDEX_ENDPOINT_NAME: Display name of the deployed index endpoint.
  • INDEX_ID: The ID of the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
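The DEPLOYED_INDEX_ID rule in the list above (start with a letter, then only letters, numbers, or underscores) can be checked before calling the CLI. This is a minimal sketch: is_valid_deployed_index_id is a hypothetical name, the pattern assumes ASCII letters, and it doesn't enforce any length limit that DeployedIndex.id may define.

```python
import re

# One leading letter, followed by any number of letters, digits, or underscores.
_DEPLOYED_INDEX_ID = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def is_valid_deployed_index_id(value: str) -> bool:
    """Return True if value satisfies the character rules stated above."""
    return _DEPLOYED_INDEX_ID.match(value) is not None
```

With this check, an ID such as deployed_index_for_vpc passes, while IDs that start with a digit or contain a hyphen are rejected.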

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID \
--deployed-index-id=DEPLOYED_INDEX_ID \
--display-name=DEPLOYED_INDEX_ENDPOINT_NAME \
--index=INDEX_ID \
--region=LOCATION \
--project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID `
--deployed-index-id=DEPLOYED_INDEX_ID `
--display-name=DEPLOYED_INDEX_ENDPOINT_NAME `
--index=INDEX_ID `
--region=LOCATION `
--project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID ^
--deployed-index-id=DEPLOYED_INDEX_ID ^
--display-name=DEPLOYED_INDEX_ENDPOINT_NAME ^
--index=INDEX_ID ^
--region=LOCATION ^
--project=PROJECT_ID

You should receive a response similar to the following:

The Google Cloud CLI tool might take a few minutes to create the IndexEndpoint.

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • DEPLOYED_INDEX_ENDPOINT_NAME: Display name of the deployed index endpoint.
  • INDEX_ID: The ID of the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex

Request JSON body:

{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_ENDPOINT_NAME"
  }
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-10-19T17:53:16.502088Z",
      "updateTime": "2022-10-19T17:53:16.502088Z"
    },
    "deployedIndexId": "DEPLOYED_INDEX_ID"
  }
}

Terraform

The following sample uses the vertex_ai_index_endpoint_deployed_index Terraform resource to create a deployed index endpoint.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

provider "google" {
  region = "us-central1"
}

resource "google_vertex_ai_index_endpoint_deployed_index" "default" {
  depends_on        = [google_vertex_ai_index_endpoint.default]
  index_endpoint    = google_vertex_ai_index_endpoint.default.id
  index             = google_vertex_ai_index.default.id
  deployed_index_id = "deployed_index_for_vpc"
}

resource "google_vertex_ai_index_endpoint" "default" {
  display_name = "sample-endpoint"
  description  = "A sample index endpoint within a VPC network"
  region       = "us-central1"
  network      = "projects/${data.google_project.project.number}/global/networks/${google_compute_network.default.name}"
  depends_on = [
    google_service_networking_connection.default
  ]
}

resource "google_service_networking_connection" "default" {
  network                 = google_compute_network.default.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.default.name]
  # Workaround to allow `terraform destroy`, see https://github.com/hashicorp/terraform-provider-google/issues/18729
  deletion_policy = "ABANDON"
}

resource "google_compute_global_address" "default" {
  name          = "sample-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.default.id
}

resource "google_compute_network" "default" {
  name = "sample-network"
}

data "google_project" "project" {}

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id":"42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id":"43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_deploy_index(
    project: str,
    location: str,
    index_name: str,
    index_endpoint_name: str,
    deployed_index_id: str,
) -> None:
    """Deploy a vector search index to a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to deploy. A fully-qualified index
            resource name or an index ID. Example:
            "projects/123/locations/us-central1/indexes/my_index_id" or
            "my_index_id".
        index_endpoint_name (str): Required. Index endpoint to deploy the index to.
        deployed_index_id (str): Required. The user specified ID of the
            DeployedIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Create the index endpoint instance from an existing endpoint.
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Deploy Index to Endpoint
    index_endpoint = index_endpoint.deploy_index(
        index=index, deployed_index_id=deployed_index_id
    )

    print(index_endpoint.deployed_indexes)

Console

Use these instructions to deploy your index to an endpoint.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
  2. A list of your active indexes is displayed.
  3. Select the name of the index you want to deploy. The index details page opens.
  4. From the index details page, click Deploy to endpoint. The index deployment panel opens.
  5. Enter a display name. This name acts as an ID and can't be updated.
  6. From the Endpoint drop-down, select the endpoint you want to deploy this index to. Note: The endpoint is unavailable if the index is already deployed to it.
  7. Optional: In the Machine type field, select either standard or high-memory.
  8. Optional: Select Enable autoscaling to automatically resize the number of nodes based on the demands of your workloads. The default number of replicas is 2 if autoscaling is disabled.
  9. Click Deploy to deploy your index to the endpoint. Note: It takes around 30 minutes to be deployed.

Enable autoscaling

Vector Search supports autoscaling, which can automatically resize the number of nodes based on the demands of your workloads. When demand is high, nodes are added to the node pool, which won't exceed the maximum size you designate. When demand is low, the node pool scales back down to a minimum size that you designate. You can check the actual nodes in use and the changes by monitoring the current replicas.
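The bounds described above amount to clamping the node count between the minimum and maximum you designate. The sketch below only illustrates that bound; target_node_count is a hypothetical name, and how the service derives the demand-driven count is not covered here.

```python
def target_node_count(
    demand_driven_count: int, min_replica_count: int, max_replica_count: int
) -> int:
    """Clamp a demand-driven node count to the designated [min, max] range."""
    return max(min_replica_count, min(demand_driven_count, max_replica_count))
```

With a minimum of 2 and a maximum of 5, a demand-driven count of 10 is capped at 5, and a count of 1 is raised to 2.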

To enable autoscaling, specify the maxReplicaCount and minReplicaCount when you deploy your index:

gcloud

The following example uses the gcloud ai index-endpoints deploy-index command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • DEPLOYED_INDEX_NAME: Display name of the deployed index.
  • INDEX_ID: The ID of the index.
  • MIN_REPLICA_COUNT: Minimum number of machine replicas the deployed index will always be deployed on. If specified, the value must be equal to or larger than 1.
  • MAX_REPLICA_COUNT: Maximum number of machine replicas the deployed index could be deployed on.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID \
--deployed-index-id=DEPLOYED_INDEX_ID \
--display-name=DEPLOYED_INDEX_NAME \
--index=INDEX_ID \
--min-replica-count=MIN_REPLICA_COUNT \
--max-replica-count=MAX_REPLICA_COUNT \
--region=LOCATION \
--project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID `
--deployed-index-id=DEPLOYED_INDEX_ID `
--display-name=DEPLOYED_INDEX_NAME `
--index=INDEX_ID `
--min-replica-count=MIN_REPLICA_COUNT `
--max-replica-count=MAX_REPLICA_COUNT `
--region=LOCATION `
--project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID ^
--deployed-index-id=DEPLOYED_INDEX_ID ^
--display-name=DEPLOYED_INDEX_NAME ^
--index=INDEX_ID ^
--min-replica-count=MIN_REPLICA_COUNT ^
--max-replica-count=MAX_REPLICA_COUNT ^
--region=LOCATION ^
--project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • DEPLOYED_INDEX_NAME: Display name of the deployed index.
  • INDEX_ID: The ID of the index.
  • MIN_REPLICA_COUNT: Minimum number of machine replicas the deployed index will always be deployed on. If specified, the value must be equal to or larger than 1.
  • MAX_REPLICA_COUNT: Maximum number of machine replicas the deployed index could be deployed on.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex

Request JSON body:

{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_NAME",
    "automaticResources": {
      "minReplicaCount": MIN_REPLICA_COUNT,
      "maxReplicaCount": MAX_REPLICA_COUNT
    }
  }
}
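The request body above can be generated rather than edited by hand. The following is a sketch, assuming you want automaticResources included only when at least one replica count is given; build_deploy_index_body is a hypothetical helper, and the field names mirror the JSON shown above.

```python
from typing import Any, Dict, Optional


def build_deploy_index_body(
    project: str,
    location: str,
    index_id: str,
    deployed_index_id: str,
    display_name: str,
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the deployIndex request body, with optional autoscaling bounds."""
    deployed_index: Dict[str, Any] = {
        "id": deployed_index_id,
        "index": f"projects/{project}/locations/{location}/indexes/{index_id}",
        "displayName": display_name,
    }
    automatic_resources: Dict[str, int] = {}
    if min_replica_count is not None:
        automatic_resources["minReplicaCount"] = min_replica_count
    if max_replica_count is not None:
        automatic_resources["maxReplicaCount"] = max_replica_count
    # Leave the block out entirely when no replica counts were supplied.
    if automatic_resources:
        deployed_index["automaticResources"] = automatic_resources
    return {"deployedIndex": deployed_index}
```

Serializing the result with json.dumps yields content suitable for request.json; the replica counts are emitted as JSON numbers, matching the body above.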

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-10-19T17:53:16.502088Z",
      "updateTime": "2023-10-19T17:53:16.502088Z"
    },
    "deployedIndexId": "DEPLOYED_INDEX_ID"
  }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_deploy_autoscaling_index(
    project: str,
    location: str,
    index_name: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    min_replica_count: int,
    max_replica_count: int,
) -> None:
    """Deploy a vector search index to a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to deploy. A fully-qualified index
            resource name or an index ID. Example:
            "projects/123/locations/us-central1/indexes/my_index_id" or
            "my_index_id".
        index_endpoint_name (str): Required. Index endpoint to deploy the index to.
        deployed_index_id (str): Required. The user specified ID of the
            DeployedIndex.
        min_replica_count (int): Required. The minimum number of replicas to
            deploy.
        max_replica_count (int): Required. The maximum number of replicas to
            deploy.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Create the index endpoint instance from an existing endpoint.
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Deploy Index to Endpoint. Specifying min and max replica counts will
    # enable autoscaling.
    index_endpoint.deploy_index(
        index=index,
        deployed_index_id=deployed_index_id,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
    )

Console

You can only enable autoscaling from the console during index deployment.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
  2. A list of your active indexes is displayed.
  3. Select the name of the index you want to deploy. The index details page opens.
  4. From the index details page, click Deploy to endpoint. The index deployment panel opens.
  5. Enter a display name. This name acts as an ID and can't be updated.
  6. From the Endpoint drop-down, select the endpoint you want to deploy this index to. Note: The endpoint is unavailable if the index is already deployed to it.
  7. Optional: In the Machine type field, select either standard or high-memory.
  8. Optional: Select Enable autoscaling to automatically resize the number of nodes based on the demands of your workloads. The default number of replicas is 2 if autoscaling is disabled.

  • If neither minReplicaCount nor maxReplicaCount is set, both are set to 2 by default.
  • If only maxReplicaCount is set, minReplicaCount is set to 2 by default.
  • If only minReplicaCount is set, maxReplicaCount is set to equal minReplicaCount.

Note: Google recommends that your DeployedIndex uses at least two replicas per shard. To do so, set minReplicaCount to 2.
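The three defaulting rules above can be written as a small function. This is a local illustration of the documented behavior, not the service's code; resolve_replica_counts is a hypothetical name, and None stands for a value that was not set.

```python
from typing import Optional, Tuple


def resolve_replica_counts(
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the effective (min, max) replica counts after defaults apply."""
    if min_replica_count is None and max_replica_count is None:
        # Neither value set: both default to 2.
        return 2, 2
    if min_replica_count is None:
        # Only the maximum set: the minimum defaults to 2.
        return 2, max_replica_count
    if max_replica_count is None:
        # Only the minimum set: the maximum equals the minimum.
        return min_replica_count, min_replica_count
    return min_replica_count, max_replica_count
```

For instance, setting only a minimum of 3 resolves to 3 replicas for both bounds, which means the deployment doesn't scale beyond its minimum.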

Mutate a DeployedIndex

You can use the MutateDeployedIndex API to update the deployment resources (for example, minReplicaCount and maxReplicaCount) of an already deployed index.

  • You can't change the machineType after the index is deployed.
  • If maxReplicaCount is not specified in the request, the DeployedIndex keeps using the existing maxReplicaCount.
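The two rules above can be modeled locally to predict the outcome of a mutate request. This is a sketch only: apply_mutation is a hypothetical helper that operates on plain dictionaries, the machine type value in the usage note is just an example, and the real validation happens in the service.

```python
from typing import Any, Dict


def apply_mutation(current: Dict[str, Any], requested: Dict[str, Any]) -> Dict[str, Any]:
    """Return the resource settings that result from a mutate request."""
    new_machine_type = requested.get("machineType")
    if new_machine_type is not None and new_machine_type != current.get("machineType"):
        raise ValueError("machineType can't be changed after the index is deployed")

    updated = dict(current)
    if requested.get("minReplicaCount") is not None:
        updated["minReplicaCount"] = requested["minReplicaCount"]
    # An omitted maxReplicaCount leaves the existing value in place.
    if requested.get("maxReplicaCount") is not None:
        updated["maxReplicaCount"] = requested["maxReplicaCount"]
    return updated
```

Starting from {"machineType": "e2-standard-16", "minReplicaCount": 2, "maxReplicaCount": 5}, a request that sets only minReplicaCount to 3 keeps maxReplicaCount at 5.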

gcloud

The following example uses the gcloud ai index-endpoints mutate-deployed-index command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • MIN_REPLICA_COUNT: Minimum number of machine replicas the deployed index will always be deployed on. If specified, the value must be equal to or larger than 1.
  • MAX_REPLICA_COUNT: Maximum number of machine replicas the deployed index could be deployed on.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID \
--deployed-index-id=DEPLOYED_INDEX_ID \
--min-replica-count=MIN_REPLICA_COUNT \
--max-replica-count=MAX_REPLICA_COUNT \
--region=LOCATION \
--project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID `
--deployed-index-id=DEPLOYED_INDEX_ID `
--min-replica-count=MIN_REPLICA_COUNT `
--max-replica-count=MAX_REPLICA_COUNT `
--region=LOCATION `
--project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID ^
--deployed-index-id=DEPLOYED_INDEX_ID ^
--min-replica-count=MIN_REPLICA_COUNT ^
--max-replica-count=MAX_REPLICA_COUNT ^
--region=LOCATION ^
--project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • MIN_REPLICA_COUNT: Minimum number of machine replicas the deployed index will always be deployed on. If specified, the value must be equal to or larger than 1.
  • MAX_REPLICA_COUNT: Maximum number of machine replicas the deployed index could be deployed on.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:mutateDeployedIndex

Request JSON body:

{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_NAME",
    "min_replica_count": "MIN_REPLICA_COUNT",
    "max_replica_count": "MAX_REPLICA_COUNT"
  }
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:mutateDeployedIndex"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:mutateDeployedIndex" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-10-19T17:53:16.502088Z",
      "updateTime": "2020-10-19T17:53:16.502088Z"
    },
    "deployedIndexId": "DEPLOYED_INDEX_ID"
  }
}
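
The response above describes a long-running operation. The following sketch pulls the operation ID and the deployed index ID out of such a response so that they can be logged or tracked. The helper name summarize_operation is hypothetical, and the code assumes the response has the shape shown above.

```python
import json
from typing import Optional, Tuple


def summarize_operation(response_text: str) -> Tuple[str, Optional[str]]:
    """Return (operation_id, deployed_index_id) from an operation response."""
    operation = json.loads(response_text)
    # The operation ID is the last path segment of the "name" field.
    operation_id = operation["name"].rsplit("/", 1)[-1]
    # "deployedIndexId" appears in the metadata of the sample above; it is
    # read defensively in case a response omits it.
    deployed_index_id = operation.get("metadata", {}).get("deployedIndexId")
    return operation_id, deployed_index_id
```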

Terraform

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands. For more information, see the Terraform provider reference documentation.

provider "google" {
  region = "us-central1"
}

resource "google_vertex_ai_index_endpoint_deployed_index" "default" {
  depends_on        = [google_vertex_ai_index_endpoint.default]
  index_endpoint    = google_vertex_ai_index_endpoint.default.id
  index             = google_vertex_ai_index.default.id
  deployed_index_id = "deployed_index_for_mutate_vpc"
  # This example assumes the deployed index endpoint's resources configuration
  # differs from the values specified below. Terraform will mutate the deployed
  # index endpoint's resource configuration to match.
  automatic_resources {
    min_replica_count = 3
    max_replica_count = 5
  }
}

resource "google_vertex_ai_index_endpoint" "default" {
  display_name = "sample-endpoint"
  description  = "A sample index endpoint within a VPC network"
  region       = "us-central1"
  network      = "projects/${data.google_project.project.number}/global/networks/${google_compute_network.default.name}"
  depends_on   = [google_service_networking_connection.default]
}

resource "google_service_networking_connection" "default" {
  network                 = google_compute_network.default.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.default.name]
  # Workaround to allow `terraform destroy`, see https://github.com/hashicorp/terraform-provider-google/issues/18729
  deletion_policy = "ABANDON"
}

resource "google_compute_global_address" "default" {
  name          = "sample-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.default.id
}

resource "google_compute_network" "default" {
  name = "sample-network"
}

data "google_project" "project" {}

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id":"42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id":"43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_mutate_deployed_index(
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    min_replica_count: int,
    max_replica_count: int,
) -> None:
    """Mutate the deployment resources of an already deployed index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. The index endpoint that hosts
          the deployed index.
        deployed_index_id (str): Required. The ID of the DeployedIndex to
          mutate.
        min_replica_count (int): Required. The minimum number of replicas to
          deploy.
        max_replica_count (int): Required. The maximum number of replicas to
          deploy.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Mutate the deployed index
    index_endpoint.mutate_deployed_index(
        deployed_index_id=deployed_index_id,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
    )

Note: Google recommends that your DeployedIndex uses at least two replicas per shard. To do so, set minReplicaCount to 2.

Deployment settings that impact performance

The following deployment settings can affect latency, availability, and cost when using Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.

Setting: Machine type

Performance impact: The hardware selection has a direct interaction with the shard size selected. Depending on the shard choices you specified at index creation time, each machine type offers a tradeoff between performance and cost.

Reference the pricing page to determine the hardware available and pricing. In general, performance increases in the following order:

  • E2 standard
  • E2 highmem
  • N1 standard
  • N2D standard

Setting: Minimum replica count

Performance impact: minReplicaCount reserves a minimum capacity for availability and latency to ensure that the system doesn't have cold start issues when traffic scales up quickly from low levels.

If you have workloads that drop to low levels and then quickly increase to higher levels, consider setting minReplicaCount to a number that can accommodate the initial bursts of traffic.

Setting: Maximum replica count

Performance impact: maxReplicaCount primarily lets you control usage cost. You can choose to prevent increasing costs beyond a certain threshold, with the tradeoff of allowing increased latency and reducing availability.
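
One way to reason about a minimum replica count for bursty workloads is a simple capacity estimate. The sketch below is illustrative only: suggest_min_replica_count is a hypothetical helper, and both the burst rate and the per-replica throughput are values you would have to measure for your own index and machine type. The floor of 2 follows the recommendation earlier on this page to use at least two replicas.

```python
import math


def suggest_min_replica_count(
    burst_qps: float, qps_per_replica: float, floor: int = 2
) -> int:
    """Estimate a minimum replica count that can absorb an initial burst.

    Both inputs are assumptions supplied by the caller; this page does not
    state a per-replica throughput figure.
    """
    if burst_qps < 0 or qps_per_replica <= 0:
        raise ValueError("burst_qps must be >= 0 and qps_per_replica must be > 0")
    # Round up so the estimated capacity covers the burst, then apply the floor.
    return max(floor, math.ceil(burst_qps / qps_per_replica))
```

Treat the result as a starting point for experimentation rather than a sizing guarantee.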

ListIndexEndpoints

To list your IndexEndpoint resources and view the information of any associated DeployedIndex instances, run the following code:

Note: These methods list all IndexEndpoint resources in the given project and location, including those within VPCs and those with public endpoints.

gcloud

The following example uses the gcloud ai index-endpoints list command.

Before using any of the command data below, make the following replacements:

  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints list \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints list `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints list ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "indexEndpoints": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID",
      "displayName": "INDEX_ENDPOINT_DISPLAY_NAME",
      "deployedIndexes": [
        {
          "id": "DEPLOYED_INDEX_ID",
          "index": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
          "displayName": "DEPLOYED_INDEX_DISPLAY_NAME",
          "createTime": "2021-06-04T02:23:40.178286Z",
          "privateEndpoints": {
            "matchGrpcAddress": "GRPC_ADDRESS"
          },
          "indexSyncTime": "2022-01-13T04:22:00.151916Z",
          "automaticResources": {
            "minReplicaCount": 2,
            "maxReplicaCount": 10
          }
        }
      ],
      "etag": "AMEw9yP367UitPkLo-khZ1OQvqIK8Q0vLAzZVF7QjdZ5O3l7Zow-mzBo2l6xmiuuMljV",
      "createTime": "2021-03-17T04:47:28.460373Z",
      "updateTime": "2021-06-04T02:23:40.930513Z",
      "network": "VPC_NETWORK_NAME"
    }
  ]
}
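
To show how the response above can be consumed, the following sketch flattens it into one row per deployed index. The helper name summarize_index_endpoints is hypothetical, and the field names are taken from the sample response; optional fields are read defensively because a given endpoint might not include them.

```python
from typing import Any, Dict, List


def summarize_index_endpoints(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one row per deployed index found in a list response."""
    rows = []
    for endpoint in response.get("indexEndpoints", []):
        for deployed in endpoint.get("deployedIndexes", []):
            resources = deployed.get("automaticResources", {})
            rows.append(
                {
                    "endpoint": endpoint["name"],
                    "deployed_index_id": deployed["id"],
                    "grpc_address": deployed.get("privateEndpoints", {}).get("matchGrpcAddress"),
                    "min_replicas": resources.get("minReplicaCount"),
                    "max_replicas": resources.get("maxReplicaCount"),
                }
            )
    return rows
```

An endpoint with no deployed indexes contributes no rows, which makes the output convenient for spotting which endpoints are actually serving.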

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import List

from google.cloud import aiplatform


def vector_search_list_index_endpoint(
    project: str, location: str
) -> List[aiplatform.MatchingEngineIndexEndpoint]:
    """List vector search index endpoints.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name

    Returns:
        List of aiplatform.MatchingEngineIndexEndpoint
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # List Index Endpoints
    return aiplatform.MatchingEngineIndexEndpoint.list()

Console

Use these instructions to view a list of your index endpoints.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. On the top of the page, select the Index endpoint tab.
  3. All of the existing index endpoints are displayed.

For more information, see the reference documentation for IndexEndpoint.

Undeploy an index

To undeploy an index, run the following code:

gcloud

The following example uses the gcloud ai index-endpoints undeploy-index command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID \
    --deployed-index-id=DEPLOYED_INDEX_ID \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID `
    --deployed-index-id=DEPLOYED_INDEX_ID `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID ^
    --deployed-index-id=DEPLOYED_INDEX_ID ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:undeployIndex

Request JSON body:

{
  "deployed_index_id": "DEPLOYED_INDEX_ID"
}
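
If you prefer to generate the request file rather than edit it by hand, the body above can be written with a few lines of Python. This is a minimal sketch: write_undeploy_request is a hypothetical helper, and the file name request.json matches the name used by the commands in this section.

```python
import json
from pathlib import Path


def write_undeploy_request(deployed_index_id: str, directory: str = ".") -> Path:
    """Write the undeployIndex request body to request.json and return its path."""
    path = Path(directory) / "request.json"
    body = {"deployed_index_id": deployed_index_id}
    path.write_text(json.dumps(body, indent=2) + "\n", encoding="utf-8")
    return path
```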

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:undeployIndex"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:undeployIndex" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.UndeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:09:56.641107Z",
      "updateTime": "2022-01-13T04:09:56.641107Z"
    }
  }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_undeploy_index(
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
) -> None:
    """Undeploy an index from an index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. The index endpoint that hosts
          the deployed index.
        deployed_index_id (str): Required. The ID of the DeployedIndex to
          undeploy.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Undeploy the index
    index_endpoint.undeploy_index(
        deployed_index_id=deployed_index_id,
    )

Console

Use these instructions to undeploy an index.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. A list of your active indexes is displayed.
  3. Select the index you want to undeploy. The index details page opens.
  4. Under the Deployed indexes section, identify the index endpoint you want to undeploy.
  5. Click the options menu that is in the same row as the index endpoint and select Undeploy.
  6. A confirmation screen opens. Click Undeploy. Note: It can take up to 30 minutes for the index to be undeployed.

Note: It takes 10 to 20 minutes for Google backend jobs to clean up the deployment. You can't reuse the DeployedIndex ID until after the deployment is cleaned up.
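
If your automation redeploys with the same DeployedIndex ID, it can help to wait out the cleanup window first. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the 20-minute default is simply the upper end of the range quoted in the note above, not a guarantee from the service.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: the upper end of the 10 to 20 minute cleanup window noted above.
CLEANUP_WINDOW = timedelta(minutes=20)


def earliest_reuse_time(
    undeploy_finished_at: datetime, cleanup_window: timedelta = CLEANUP_WINDOW
) -> datetime:
    """Return a conservative time after which the ID is likely reusable."""
    return undeploy_finished_at + cleanup_window


def can_reuse_deployed_index_id(
    undeploy_finished_at: datetime,
    now: Optional[datetime] = None,
    cleanup_window: timedelta = CLEANUP_WINDOW,
) -> bool:
    """Return True once the conservative cleanup window has elapsed."""
    now = now or datetime.now(timezone.utc)
    return now >= earliest_reuse_time(undeploy_finished_at, cleanup_window)
```

Pass timezone-aware datetimes for both values so the comparison is well defined.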

Delete an IndexEndpoint

Before you delete an IndexEndpoint, you must undeploy all indexes deployed to the endpoint.
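
The precondition above can be checked against an IndexEndpoint resource, such as one entry from a list response. This is a minimal sketch with hypothetical helper names; it assumes the resource is a dictionary whose deployedIndexes field has the shape shown in the list response on this page.

```python
from typing import Any, Dict, List


def deployed_index_ids(index_endpoint: Dict[str, Any]) -> List[str]:
    """Return the IDs of indexes still deployed to the endpoint."""
    return [deployed["id"] for deployed in index_endpoint.get("deployedIndexes", [])]


def ready_to_delete(index_endpoint: Dict[str, Any]) -> bool:
    """Return True when no deployed indexes remain on the endpoint."""
    return not deployed_index_ids(index_endpoint)
```

Any IDs returned by deployed_index_ids are the ones to undeploy before deleting the endpoint.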

gcloud

The following example uses the gcloud ai index-endpoints delete command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints delete INDEX_ENDPOINT_ID \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints delete INDEX_ENDPOINT_ID `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai index-endpoints delete INDEX_ENDPOINT_ID ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Execute the following command:

curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeleteOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:36:19.142203Z",
      "updateTime": "2022-01-13T04:36:19.142203Z"
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
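
Unlike the earlier operation responses on this page, the response above includes "done": true and a "response" field. The following sketch classifies an operation response by those fields. The helper name operation_state is hypothetical, and treating a missing "done" field as still running is an assumption based on the earlier samples, which omit it.

```python
from typing import Any, Dict


def operation_state(operation: Dict[str, Any]) -> str:
    """Classify a long-running operation as running, failed, or succeeded."""
    # Assumption: a response without "done" describes an unfinished operation.
    if not operation.get("done", False):
        return "running"
    # A finished operation carries either an "error" or a "response".
    if "error" in operation:
        return "failed"
    return "succeeded"
```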

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_delete_index_endpoint(
    project: str, location: str, index_endpoint_name: str, force: bool = False
) -> None:
    """Delete a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. The index endpoint to delete.
        force (bool): Optional. If true, undeploy any deployed indexes on
          this endpoint before deletion. Defaults to False.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Delete the index endpoint
    index_endpoint.delete(force=force)

Console

Use these instructions to delete an index endpoint.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. On the top of the page, select the Index endpoints tab.
  3. All of the existing index endpoints are displayed.
  4. Click the options menu that is in the same row as the index endpoint you want to delete and select Delete.
  5. A confirmation screen opens. Click Delete. Your index endpoint is now deleted.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-18 UTC.