Update and rebuild an active index
With large search workloads, updating your indexes is important so you always have the most accurate information. You can update your Vector Search indexes in a few different ways:
- Replace an entire index
- Partially update a batch index
- Partially update a streaming index
- Update index metadata
Replace an entire index
To replace the content of an existing batch update or streaming Index, use the IndexService.UpdateIndex method.
- Set Index.metadata.contentsDeltaUri to the Cloud Storage URI that includes the vectors you want to update.
- Set Index.metadata.isCompleteOverwrite to true. When set to true, the entire index is completely overwritten with the new metadata file that you provide.
gcloud
- Update your index metadata file to set contentsDeltaUri and isCompleteOverwrite=true.
- Use the gcloud ai indexes update command.
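As a sketch only, a minimal metadata file for a full overwrite might look like the following. The bucket path is a placeholder, and the exact schema for your index (for example, the config block with dimensions) is defined in the index metadata file format, so treat this as an illustration rather than a complete file:

```json
{
  "contentsDeltaUri": "gs://example-bucket/embeddings/2024-06-01",
  "isCompleteOverwrite": true
}
```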
Before using any of the command data below, make the following replacements:
- LOCAL_PATH_TO_METADATA_FILE: The local path to the metadata file.
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes update INDEX_ID \
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE \
    --region=LOCATION \
    --project=PROJECT_ID
Windows (PowerShell)
Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes update INDEX_ID `
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE `
    --region=LOCATION `
    --project=PROJECT_ID
Windows (cmd.exe)
Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes update INDEX_ID ^
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE ^
    --region=LOCATION ^
    --project=PROJECT_ID
REST
Before using any of the request data, make the following replacements:
- INPUT_DIR: The Cloud Storage directory path of the index content.
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
PATCH https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID
Request JSON body:
{
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "isCompleteOverwrite": true
  }
}

To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID"
PowerShell (Windows)
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.UpdateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-12T23:56:14.480948Z",
      "updateTime": "2022-01-12T23:56:14.480948Z"
    }
  }
}

Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
When invoking the method below, set is_complete_overwrite=True to fully replace the contents of the index.
def vector_search_update_index_embeddings(
    project: str,
    location: str,
    index_name: str,
    gcs_uri: str,
    is_complete_overwrite: Optional[bool] = None,
) -> None:
    """Update a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to update. A fully-qualified index
            resource name or a index ID.  Example:
            "projects/123/locations/us-central1/indexes/my_index_id" or
            "my_index_id".
        gcs_uri (str): Required. The Google Cloud Storage uri for index content
        is_complete_overwrite (bool): Optional. If true, the index content will
            be overwritten with the contents at gcs_uri.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    index.update_embeddings(
        contents_delta_uri=gcs_uri, is_complete_overwrite=is_complete_overwrite
    )

Console
Use these instructions to update the content of a batch index.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- Select the index you want to update. The Index info page opens.
- Select Edit Index. An edit index pane opens.
- In the Cloud Storage field, search for and select the Cloud Storage folder where your vector data is stored.
- Check the complete overwrite box to overwrite all the existing data.
- Click Update.
- Click Done to close the panel.
Partially update a batch index
To update the embeddings of an existing batch Index, use the IndexService.UpdateIndex method.
- Set Index.metadata.contentsDeltaUri to the Cloud Storage URI that includes the vectors you want to update.
- Set Index.metadata.isCompleteOverwrite to false.
Only the vectors specified in Index.metadata.contentsDeltaUri are updated, inserted, or deleted. The other existing embeddings in the index remain.
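For reference, the files under contentsDeltaUri use the standard Vector Search input format. For JSON input that means one record per line in a file ending in .json; the IDs and vectors below are placeholders, and the full schema (including restricts and crowding tags) is described in the input data format documentation:

```json
{"id": "datapoint-1", "embedding": [0.1, 0.2, 0.3]}
{"id": "datapoint-2", "embedding": [0.4, 0.5, 0.6]}
```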
gcloud
Before using any of the command data below, make the following replacements:
- LOCAL_PATH_TO_METADATA_FILE: The local path to the metadata file.
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes update INDEX_ID \
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE \
    --region=LOCATION \
    --project=PROJECT_ID
Windows (PowerShell)
Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes update INDEX_ID `
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE `
    --region=LOCATION `
    --project=PROJECT_ID
Windows (cmd.exe)
Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes update INDEX_ID ^
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE ^
    --region=LOCATION ^
    --project=PROJECT_ID
REST
Before using any of the request data, make the following replacements:
- INPUT_DIR: The Cloud Storage directory path of the index content.
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
PATCH https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID
Request JSON body:
{
  "metadata": {
    "contentsDeltaUri": "INPUT_DIR",
    "isCompleteOverwrite": false
  }
}

To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID"
PowerShell (Windows)
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.UpdateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-12T23:56:14.480948Z",
      "updateTime": "2022-01-12T23:56:14.480948Z"
    }
  }
}

Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
When invoking the method below, set is_complete_overwrite=False.
def vector_search_update_index_embeddings(
    project: str,
    location: str,
    index_name: str,
    gcs_uri: str,
    is_complete_overwrite: Optional[bool] = None,
) -> None:
    """Update a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to update. A fully-qualified index
            resource name or a index ID.  Example:
            "projects/123/locations/us-central1/indexes/my_index_id" or
            "my_index_id".
        gcs_uri (str): Required. The Google Cloud Storage uri for index content
        is_complete_overwrite (bool): Optional. If true, the index content will
            be overwritten with the contents at gcs_uri.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    index.update_embeddings(
        contents_delta_uri=gcs_uri, is_complete_overwrite=is_complete_overwrite
    )

Console
Use these instructions to update the content of a batch index.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- Select the index you want to update. The Index info page opens.
- Select Edit Index. An edit index pane opens.
- In the Cloud Storage field, search for and select the Cloud Storage folder where your vector data is stored.
- Ensure the complete overwrite box is cleared.
- Click Update.
- Click Done to close the panel.
If the Index has any associated deployments (see the Index.deployed_indexes field), then when certain changes are made to the original Index, the DeployedIndex is automatically updated asynchronously in the background to reflect these changes.

To check whether the change has been propagated, compare the update index operation finish time and the DeployedIndex.index_sync_time.
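That comparison can be sketched in code. The helper below is hypothetical (it is not part of the SDK); it assumes you have already fetched the operation's finish time and the DeployedIndex.index_sync_time as RFC 3339 timestamp strings, for example from a GetIndex response, and it simply compares them as UTC datetimes:

```python
from datetime import datetime, timezone


def is_update_propagated(update_finish_time: str, index_sync_time: str) -> bool:
    """Return True if the deployed index has synced past the update's finish time.

    Both arguments are RFC 3339 timestamps such as "2022-01-12T23:56:14.480948Z",
    the format used in operation metadata and in DeployedIndex.index_sync_time.
    """

    def parse(ts: str) -> datetime:
        # datetime.fromisoformat() doesn't accept a trailing "Z" before
        # Python 3.11, so normalize it to an explicit UTC offset first.
        return datetime.fromisoformat(ts.replace("Z", "+00:00")).astimezone(timezone.utc)

    return parse(index_sync_time) >= parse(update_finish_time)
```

If the sync time is still behind the operation's finish time, wait and poll again before relying on query results.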
Partially update a streaming index
With streaming updates, you can update and query your index within a few seconds. At this time, you can't use streaming updates on an existing batch update index; you must create a new index. See Create an index for streaming update to learn more.

You are charged $0.45 per GB used for streaming updates. To learn more about pricing, see the Vertex AI pricing page. Streaming updates are directly applied to the deployed indexes in memory, which are then reflected in query results after a short delay.

Note: After you've created an index for streaming updates, you can't perform a batch update on this index. You can perform a complete overwrite to use the index, or you can create a new index prepared for batch updates. See Create an index to learn more.

Upsert data points
Use these samples to see how to upsert a data point. Remember, upsert-datapoints accepts JSON in array format only.
Python
def vector_search_upsert_datapoints(
    project: str,
    location: str,
    index_name: str,
    datapoints: Sequence[aiplatform.compat.types.index_v1beta1.IndexDatapoint],
) -> None:
    """Upsert datapoints to the index.

    Args:
        project (str): Required. The Project ID
        location (str): Required. The region name, e.g. "us-central1"
        index_name (str): Required. The index to update. A fully-qualified index
            resource name or a index ID.  Example:
            "projects/123/locations/us-central1/indexes/my_index_id" or
            "my_index_id".
        datapoints: Sequence[IndexDatapoint]: Required. The datapoints to be
            updated. For example:
            [IndexDatapoint(datapoint_id="1", feature_vector=[1.0, 2.0, 3.0]),
             IndexDatapoint(datapoint_id="2", feature_vector=[4.0, 5.0, 6.0])]
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index with stream_update enabled
    my_index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Upsert the datapoints to the index
    my_index.upsert_datapoints(datapoints=datapoints)

Curl
The throughput quota limit relates to the amount of data that is included in an upsert. If the data point ID exists in the index, the embedding is updated, otherwise, a new embedding is added.
DATAPOINT_ID_1=
DATAPOINT_ID_2=

curl -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/indexes/${INDEX_ID}:upsertDatapoints \
  -d '{datapoints: [{datapoint_id: "'${DATAPOINT_ID_1}'", feature_vector: [...]}, {datapoint_id: "'${DATAPOINT_ID_2}'", feature_vector: [...]}]}'

With hybrid search, sparse and dense embedding representations for a datapoint are supported. In an upsert operation, omitting a dense embedding deletes the dense representation, and omitting a sparse embedding deletes the sparse representation.
This example updates both dense embeddings and sparse embeddings.
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/indexes/${INDEX_ID}:upsertDatapoints \
  -d '{datapoints: [{datapoint_id: "111", feature_vector: [0.111, 0.111], "sparse_embedding": {"values": [111.0, 111.1, 111.2], "dimensions": [10, 20, 30]}}]}'

This example updates dense embeddings and removes sparse embeddings.
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/indexes/${INDEX_ID}:upsertDatapoints \
  -d '{datapoints: [{datapoint_id: "111", feature_vector: [0.111, 0.111]}]}'

This example updates sparse embeddings and removes dense embeddings.
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/indexes/${INDEX_ID}:upsertDatapoints \
  -d '{datapoints: [{datapoint_id: "111", "sparse_embedding": {"values": [111.0, 111.1, 111.2], "dimensions": [10, 20, 30]}}]}'

Console
Use these instructions to update the content of a streaming index.
- In the Google Cloud console, go to the Vector Search page.
- Select the index you want to update. The Index info page opens.
- Click Edit Index. An edit index pane opens.
- From the pane, select the Upsert data point tab to add content.
- Enter the data point ID.
- Enter at least one type of embedding:
  - Dense embedding: Enter an array of comma-separated floating point values. The number of values must match the index's dimensions.
  - Sparse embedding:
    - Enter sparse embedding dimensions as an array of comma-separated integers. The number of values doesn't have to match the index's dimensions.
    - Enter values as an array of comma-separated floating point values. The number of values must match the number of sparse embedding dimensions.
- Optional: To enable filtering by token restricts on this data point, click Add token restrict, and then enter a namespace and comma-separated strings as tokens.
- Optional: To enable filtering by numeric restricts on this data point, click Add numeric restrict, enter a namespace, select a number type, and enter a value.
- Optional: To help prevent many similar results, enter a crowding tag string.
- Click Upsert.
- Click Done to close the panel.
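The dimension rules in the steps above can be sketched as a small local validation helper. This is a hypothetical illustration, not part of the SDK or the console: a dense vector must have exactly as many values as the index's configured dimensions, while a sparse embedding only needs one value per listed sparse dimension.

```python
from typing import Optional, Sequence


def validate_embeddings(
    index_dimensions: int,
    dense: Optional[Sequence[float]] = None,
    sparse_dimensions: Optional[Sequence[int]] = None,
    sparse_values: Optional[Sequence[float]] = None,
) -> None:
    """Raise ValueError if the embeddings break the dimension rules above."""
    if dense is None and sparse_values is None:
        raise ValueError("Enter at least one type of embedding.")
    if dense is not None and len(dense) != index_dimensions:
        raise ValueError(
            f"Dense embedding has {len(dense)} values; the index expects {index_dimensions}."
        )
    if sparse_values is not None:
        # Sparse dimensions don't have to match the index's dimensions, but the
        # number of sparse values must match the number of sparse dimensions.
        if sparse_dimensions is None or len(sparse_values) != len(sparse_dimensions):
            raise ValueError("Sparse values must match sparse dimensions one-to-one.")
```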
Update embedding metadata
There are many reasons you might need to update streaming restricts or numeric restricts. For example, when dealing with high-volume, fast-moving data, you might want to prioritize certain data streams. Directly updating restricts or numeric restricts lets you refine the focus in real time, ensuring the most important data is processed or highlighted immediately.
You can directly update data point restricts and numeric restricts inside a streaming index without the compaction cost of a full update.
To perform these metadata-only updates, you need to add the field update_mask to the request. The value of update_mask must be set to all_restricts. The restrict and numeric restrict values set in the data points should be the new values you want to apply in the update.
The following example shows how to add restricts to two existing data points.
DATAPOINT_ID_1=
DATAPOINT_ID_2=

curl -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/indexes/${INDEX_ID}:upsertDatapoints \
  -d '{datapoints: [{datapoint_id: "'${DATAPOINT_ID_1}'", feature_vector: [...], restricts: [{namespace: "color", allow_list: ["red"]}]}, {datapoint_id: "'${DATAPOINT_ID_2}'", feature_vector: [...], restricts: [{namespace: "color", allow_list: ["red"]}]}], update_mask: "all_restricts"}'

Remove data points
You might need to remove data points from your streaming index. You can do this using curl or from the Google Cloud console.
A key use case for deleting a data point from an index is to maintainparity between the index and its real-world source. Consider abookseller who uses a vector embedding to represent their book inventory forsearch and recommendation purposes. When a book is sold out or removed fromstock, deleting its corresponding data point from the index ensures that searchresults and recommendations remain accurate and up-to-date.
Curl
curl -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{REGION}/indexes/{INDEX_ID}:removeDatapoints \
  -d '{datapoint_ids: ["'{DATAPOINT_ID_1}'", "'{DATAPOINT_ID_2}'"]}'

Console
Use these instructions to delete a data point from a streaming index.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- Select the streaming index you want to update. The Index info page opens.
- Select Edit Index. An edit index pane opens.
- From the pane, select the Remove data points tab.
- Add up to 20 data points by providing a comma-delimited list of data point IDs.
- Click Remove.
- Click Done to close the panel.
Python
def vector_search_remove_datapoints(
    project: str,
    location: str,
    index_name: str,
    datapoint_ids: Sequence[str],
) -> None:
    """Remove datapoints from a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to update. A fully-qualified index
            resource name or a index ID.  Example:
            "projects/123/locations/us-central1/indexes/my_index_id" or
            "my_index_id".
        datapoint_ids (Sequence[str]): Required. The datapoint IDs to remove.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    index.remove_datapoints(datapoint_ids=datapoint_ids)

Update index metadata
IndexService.UpdateIndex can also be used to update the metadata fields display_name, description, and labels for batch and streaming indexes. Note that a single call to UpdateIndex can update the index embeddings or these metadata fields, but not both at once.
gcloud
Before using any of the command data below, make the following replacements:
- LOCAL_PATH_TO_METADATA_FILE: The local path to the metadata file.
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes update INDEX_ID \
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE \
    --region=LOCATION \
    --project=PROJECT_ID
Windows (PowerShell)
Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes update INDEX_ID `
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE `
    --region=LOCATION `
    --project=PROJECT_ID
Windows (cmd.exe)
Note: Ensure you have initialized the Google Cloud CLI with authentication and a project by running either gcloud init; or gcloud auth login and gcloud config set project.

gcloud ai indexes update INDEX_ID ^
    --metadata-file=LOCAL_PATH_TO_METADATA_FILE ^
    --region=LOCATION ^
    --project=PROJECT_ID
REST
Before using any of the request data, make the following replacements:
- INPUT_DIR: The Cloud Storage directory path of the index content.
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
PATCH https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID
Request JSON body:
{
  "metadata": {
    "description": "Updated description",
    "display_name": "Updated display name"
  }
}

To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID"
PowerShell (Windows)
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.UpdateIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-12T23:56:14.480948Z",
      "updateTime": "2022-01-12T23:56:14.480948Z"
    }
  }
}

Python
def vector_search_update_index_metadata(
    project: str,
    location: str,
    index_name: str,
    display_name: Optional[str] = None,
    description: Optional[str] = None,
    labels: Optional[Dict[str, str]] = None,
) -> None:
    """Update a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to update. A fully-qualified index
            resource name or a index ID.  Example:
            "projects/123/locations/us-central1/indexes/my_index_id" or
            "my_index_id".
        display_name (str): Optional. The display name of the Index. The name
            can be up to 128 characters long and can consist of any UTF-8
            characters.
        description (str): Optional. The description of the Index.
        labels (Dict[str, str]): Optional. The labels with user-defined
            metadata to organize your Indexes. Label keys and values can be no
            longer than 64 characters (Unicode codepoints), can only contain
            lowercase letters, numeric characters, underscores and dashes.
            International characters are allowed. See https://goo.gl/xmQnxf for
            more information on and examples of labels. No more than 64 user
            labels can be associated with one Index (System labels are
            excluded). System reserved label keys are prefixed with
            "aiplatform.googleapis.com/" and are immutable.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    index.update_metadata(
        display_name=display_name,
        description=description,
        labels=labels,
    )

Console
Use these instructions to update index metadata (the console is limited to updating display_name and description).
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- Select the index you want to update. The Index info page opens.
- Select Edit Index. An edit index pane opens.
- Update the metadata fields you want to change.
- Click Update.
- Click Done to close the panel.
Compaction
Periodically, your index is rebuilt to account for all new updates since your last rebuild. This rebuild, or "compaction", improves query performance and reliability. Compactions occur for both streaming updates and batch updates.

Streaming update: Vector Search uses heuristics-based metrics to determine when to trigger compaction. If the oldest uncompacted data is five days old, compaction is always triggered. You are billed for the cost of rebuilding the index at the same rate as a batch update, in addition to the streaming update costs.
Batch update: Occurs when the incremental dataset size is > 20% of the basedataset size.
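As a quick arithmetic sketch of that 20% batch threshold (a hypothetical helper, not an API; dataset sizes here are just byte counts you supply yourself):

```python
def batch_compaction_triggered(base_dataset_bytes: int, incremental_bytes: int) -> bool:
    """Return True when the incremental dataset exceeds 20% of the base dataset,
    the condition described above for triggering a batch-update compaction."""
    return incremental_bytes > 0.20 * base_dataset_bytes
```

For example, a 10 GB base dataset would cross the threshold once more than 2 GB of incremental updates accumulate.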
Rebuild and query your index
You can send match or batch match requests as usual with the gRPC CLI, the client library, or the Vertex AI SDK for Python. After the index is rebuilt, you can expect to see your updates in query results within a few seconds. To learn how to query an index, see Query indexes to get nearest neighbors.
Optional fields
When you create an index, there are some optional fields you can use tofine-tune your queries.
Upsert with restricts
Upserting a data point with a restrict tags your data so it's already identified for filtering at query time. You might want to add restrict tags to limit the results presented before a query is sent. For example, a customer wants to run a query on an index, but wants to make sure the results only show items that match "red" in a search for footwear. In the following example, the index is upserted with a data point that allows red shoes and denies blue ones. This ensures the search filters for the best specific options from a large and varied index before a query runs.
In addition to token restricts, the example uses numeric restricts. In this case, the datapoint is associated with a price of 20, length of 0.3, and width of 0.5. At query time, you can use these numeric restricts to filter the results on the values of price, length, and width. For example, this datapoint would appear in a query that filters for price < 25, length < 1, and width < 1.
To learn more about filtering, see Vector Search for Indexing.
Python
# Upsert datapoints
_TEST_DATAPOINT_1 = aiplatform_v1.types.index.IndexDatapoint(
    datapoint_id="3",
    feature_vector=[0.00526886899, -0.0198396724],
    restricts=[
        aiplatform_v1.types.index.IndexDatapoint.Restriction(
            namespace="Color", allow_list=["red"]
        )
    ],
    numeric_restricts=[
        aiplatform_v1.types.index.IndexDatapoint.NumericRestriction(
            namespace="cost",
            value_int=1,
        )
    ],
)
_TEST_DATAPOINT_2 = aiplatform_v1.types.index.IndexDatapoint(
    datapoint_id="4",
    feature_vector=[0.00526886899, -0.0198396724],
    numeric_restricts=[
        aiplatform_v1.types.index.IndexDatapoint.NumericRestriction(
            namespace="cost",
            value_double=0.1,
        )
    ],
    crowding_tag=aiplatform_v1.types.index.IndexDatapoint.CrowdingTag(
        crowding_attribute="crowding"
    ),
)
_TEST_DATAPOINT_3 = aiplatform_v1.types.index.IndexDatapoint(
    datapoint_id="5",
    feature_vector=[0.00526886899, -0.0198396724],
    numeric_restricts=[
        aiplatform_v1.types.index.IndexDatapoint.NumericRestriction(
            namespace="cost",
            value_float=1.1,
        )
    ],
)

_TEST_DATAPOINTS = [_TEST_DATAPOINT_1, _TEST_DATAPOINT_2, _TEST_DATAPOINT_3]

my_streaming_index = my_streaming_index.upsert_datapoints(datapoints=_TEST_DATAPOINTS)

# Dynamic metadata update
_TEST_DATAPOINT_4 = aiplatform_v1.types.index.IndexDatapoint(
    datapoint_id="-2",
    numeric_restricts=[
        aiplatform_v1.types.index.IndexDatapoint.NumericRestriction(
            namespace="cost",
            value_float=1.1,
        )
    ],
)

my_streaming_index = my_streaming_index.upsert_datapoints(
    datapoints=[_TEST_DATAPOINT_4], update_mask=["all_restricts"]
)

curl
curl -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://${ENDPOINT}/v1/projects/${PROJECT_ID}/locations/us-central1/indexes/${INDEX_ID}:upsertDatapoints \
  -d '{datapoints: [{datapoint_id: "'${DATAPOINT_ID_1}'", feature_vector: [...], restricts: {namespace: "color", allow_list: ["red"], deny_list: ["blue"]}, numeric_restricts: [{namespace: "price", value_int: 20}, {namespace: "length", value_float: 0.3}, {namespace: "width", value_double: 0.5}]}]}'

Upsert with crowding
The crowding tag limits similar results by improving result diversity. Crowding is a constraint on a neighbor list produced by a nearest neighbor search, requiring that no more than some number of results in the list return the same value of crowding_attribute. As an example, assume you're back online shopping for shoes. You want to see a wide variety of colors in the results, but maybe want them in a single style, like soccer cleats. You can ask that no more than 3 pairs of shoes with the same color are returned by setting per_crowding_attribute_num_neighbors = 3 in your query, assuming you set crowding_attribute to the color of the shoes when inserting the data points.
This field represents the allowed maximum number of matches with the same crowding tag.
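To illustrate the constraint locally, the following is a hypothetical post-filter sketch (not how the service implements crowding) that caps results per crowding attribute while preserving rank order:

```python
from collections import Counter
from typing import List, Sequence, Tuple


def apply_crowding(
    neighbors: Sequence[Tuple[str, str]],
    per_crowding_attribute_num_neighbors: int,
) -> List[str]:
    """Keep at most N results per crowding attribute, preserving rank order.

    `neighbors` is a ranked list of (datapoint_id, crowding_attribute) pairs.
    """
    counts: Counter = Counter()
    kept: List[str] = []
    for datapoint_id, attribute in neighbors:
        if counts[attribute] < per_crowding_attribute_num_neighbors:
            counts[attribute] += 1
            kept.append(datapoint_id)
    return kept
```

With a limit of 2, a ranked list of three "red" results and one "blue" result keeps the first two reds plus the blue.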
curl -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://${ENDPOINT}/v1/projects/${PROJECT_ID}/locations/us-central1/indexes/${INDEX_ID}:upsertDatapoints \
  -d '{datapoints: [{datapoint_id: "'${DATAPOINT_ID_1}'", feature_vector: [...], restricts: {namespace: "type", allow_list: ["cleats"]}, crowding_tag: {crowding_attribute: "red-shoe"}}]}'

What's next
- Learn about Index configuration parameters.
- Learn how to Monitor an index.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.