Use RagManagedDb with Vertex AI RAG Engine

Vertex AI RAG Engine supports the VPC-SC security controls and CMEK. Data residency and AXT security controls aren't supported.

The Vertex AI RAG Engine-managed Spanner instance is used as a vector database and is GA with billing enabled. For more information, see Vertex AI RAG Engine billing.

This page shows you how Vertex AI RAG Engine uses RagManagedDb, an enterprise-ready vector database that stores and manages vector representations of your documents. The vector database is then used to retrieve relevant documents based on their semantic similarity to a given query.

In addition, this page shows you how to implement CMEK.

Important: RagManagedDb is used by default and uses Spanner. Customers will be charged for the use of a Google-managed Spanner instance that's provisioned in a Google-tenant project using standard Spanner SKUs.

Manage your retrieval strategy

RagManagedDb offers the following retrieval strategies to support your RAG use cases:

Retrieval strategy: k-Nearest Neighbors (KNN) (default)

Description: Finds the exact nearest neighbors by comparing all data points in your RAG corpus. If you don't specify a strategy when you create your RAG corpus, KNN is the retrieval strategy used.

  • Provides perfect recall (1.0) during retrieval.
  • Well suited to recall-sensitive applications.
  • Well suited to small and medium-sized RAG corpora that store fewer than 10,000 RAG files.
  • Searches every data point, so latency increases with the number of RAG files in the corpus.

Retrieval strategy: Approximate Nearest Neighbors (ANN)

Description: Uses approximation techniques to find similar neighbors faster than the KNN technique.

  • Significantly reduces query latency on large RAG corpora.
  • Slightly lowers recall because of the approximation techniques used.
  • Most effective for large RAG corpora, which hold roughly more than 10,000 RAG files.
  • How much recall loss is acceptable depends on your use case, but in most large-scale cases, trading a little recall for better query performance is acceptable.
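
To make the KNN trade-off concrete, the following is a minimal sketch of an exact nearest-neighbor search in plain Python. It isn't how RagManagedDb is implemented. The function names, the sample embeddings, and the choice of cosine similarity as the similarity measure are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_search(query, vectors, k):
    """Return the ids of the k stored vectors most similar to the query.

    Every stored vector is compared with the query, so the result is exact
    and the cost grows linearly with the number of stored vectors.
    """
    scored = [(cosine_similarity(query, vec), vec_id) for vec_id, vec in vectors.items()]
    scored.sort(reverse=True)
    return [vec_id for _, vec_id in scored[:k]]


# Hypothetical 2-dimensional embeddings, for illustration only.
vectors = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.9, 0.1],
    "doc-c": [0.0, 1.0],
}
top_two = knn_search([1.0, 0.05], vectors, k=2)
```

Because the loop visits every vector, doubling the corpus roughly doubles the work per query, which is the latency behavior described for KNN above.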

Create a RAG corpus with KNN RagManagedDb

This code sample demonstrates how to create a RAG corpus using KNN RagManagedDb.

Python

from vertexai.preview import rag
import vertexai

PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
DISPLAY_NAME = "YOUR_RAG_CORPUS_DISPLAY_NAME"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

# Use exact k-nearest-neighbor retrieval (the default strategy).
vector_db = rag.RagManagedDb(retrieval_strategy=rag.KNN())
rag_corpus = rag.create_corpus(
    display_name=DISPLAY_NAME,
    backend_config=rag.RagVectorDbConfig(vector_db=vector_db),
)

REST

Replace the following variables:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • CORPUS_DISPLAY_NAME: The display name of the RAG corpus.

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
CORPUS_DISPLAY_NAME=CORPUS_DISPLAY_NAME

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora \
  -d '{
    "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
    "vector_db_config": {
      "ragManagedDb": {
        "knn": {}
      }
    }
  }'

Preview

Some of the RAG features are Preview offerings, subject to the "Pre-GA Offerings Terms" of the Google Cloud Service Specific Terms. Pre-GA products and features are available "as-is" and may have limited support, and changes to Pre-GA products and features may not be compatible with other Pre-GA versions. For more information, see the launch stage descriptions. By using the Gemini API on Vertex AI, you agree to the Generative AI Preview terms and conditions (Preview Terms).

Create a RAG corpus with ANN RagManagedDb

To offer the ANN feature, RagManagedDb uses a tree-based structure to partition data and facilitate faster searches. For the best recall and latency, configure the structure of this tree by experimentation so that it fits your data size and distribution. RagManagedDb lets you configure the tree_depth and the leaf_count of the tree.
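
The idea of partitioning data so that a query inspects only part of it can be sketched with a toy, single-level index. This is an illustration only and not RagManagedDb's actual structure: the function names, the hand-picked centroids, the sample vectors, and the use of squared Euclidean distance are all assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_leaves(vectors, centroids):
    """Assign each vector id to the leaf whose centroid is closest."""
    leaves = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: squared_distance(vec, centroids[i]))
        leaves[nearest].append(vec_id)
    return leaves


def ann_search(query, vectors, centroids, leaves, k):
    """Search only the leaf whose centroid is closest to the query."""
    leaf = min(range(len(centroids)), key=lambda i: squared_distance(query, centroids[i]))
    candidates = sorted(leaves[leaf], key=lambda vec_id: squared_distance(query, vectors[vec_id]))
    return candidates[:k]


# Hypothetical embeddings and centroids; a real index would derive its centroids from the data.
vectors = {
    "doc-a": [0.0, 0.0],
    "doc-b": [0.2, 0.1],
    "doc-c": [5.0, 5.0],
    "doc-d": [5.1, 4.9],
}
centroids = [[0.1, 0.05], [5.05, 4.95]]
leaves = build_leaves(vectors, centroids)
nearest = ann_search([4.8, 5.2], vectors, centroids, leaves, k=1)
```

The query compares against two centroids and two vectors instead of all four vectors. The saving grows with corpus size, at the cost of missing a true neighbor that happens to sit in a different leaf.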

The tree_depth determines the number of layers, or levels, in the tree. Follow these guidelines:

  • If you have approximately 10,000 RAG files in the RAG corpus, set the value to 2.
  • If you have more RAG files than that, set this to 3.
  • If the tree_depth isn't specified, Vertex AI RAG Engine assigns a default value of 2 for this parameter.

The leaf_count determines the number of leaf nodes in the tree-based structure. Each leaf node contains groups of closely related vectors along with their corresponding centroid. Follow these guidelines:

  • The recommended value is 10 * sqrt(num of RAG files in your RAG corpus).
  • If not specified, Vertex AI RAG Engine assigns a default value of 500 for this parameter.
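
The guidelines above can be sketched as a small helper that derives starting values from the number of RAG files. The function name is hypothetical, and treating exactly 10,000 files as the cut-off for tree_depth is an assumption, because the guidance only says "approximately". Treat the result as a starting point for experimentation.

```python
import math


def recommend_ann_params(num_rag_files):
    """Return (tree_depth, leaf_count) as starting values for an ANN corpus."""
    if num_rag_files <= 0:
        raise ValueError("num_rag_files must be positive")
    # Roughly 10,000 files or fewer: depth 2. More than that: depth 3.
    tree_depth = 2 if num_rag_files <= 10_000 else 3
    # Recommended leaf count: 10 * sqrt(number of RAG files).
    leaf_count = round(10 * math.sqrt(num_rag_files))
    return tree_depth, leaf_count


small_corpus = recommend_ann_params(10_000)
large_corpus = recommend_ann_params(40_000)
```

For example, a corpus of 2,500 files yields a leaf count of 500, which matches the documented default.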

Python

from vertexai.preview import rag
import vertexai

PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
DISPLAY_NAME = "YOUR_RAG_CORPUS_DISPLAY_NAME"
TREE_DEPTH = 2  # Optional: Replace with your tree depth. Acceptable values are 2 or 3. Default is 2.
LEAF_COUNT = 500  # Optional: Replace with your leaf count. Default is 500.

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

# Configure approximate nearest-neighbor retrieval.
ann_config = rag.ANN(tree_depth=TREE_DEPTH, leaf_count=LEAF_COUNT)
vector_db = rag.RagManagedDb(retrieval_strategy=ann_config)
rag_corpus = rag.create_corpus(
    display_name=DISPLAY_NAME,
    backend_config=rag.RagVectorDbConfig(vector_db=vector_db),
)

REST

Replace the following variables:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
  • TREE_DEPTH: Your tree depth.
  • LEAF_COUNT: Your leaf count.

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
CORPUS_DISPLAY_NAME=CORPUS_DISPLAY_NAME
TREE_DEPTH=TREE_DEPTH
LEAF_COUNT=LEAF_COUNT

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora \
  -d '{
    "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
    "vector_db_config": {
      "ragManagedDb": {
        "ann": {
          "tree_depth": '"${TREE_DEPTH}"',
          "leaf_count": '"${LEAF_COUNT}"'
        }
      }
    }
  }'

Importing your data into ANN RagManagedDb

You can use either the ImportRagFiles API or the UploadRagFile API to import your data into the ANN RagManagedDb. However, unlike the KNN retrieval strategy, the ANN approach requires the underlying tree-based index to be rebuilt at least once, and optionally again after you import significant amounts of data, for optimal recall. To have Vertex AI RAG Engine rebuild your ANN index, set rebuild_ann_index to true in your ImportRagFiles API request.

The following are important:

  1. Before you query the RAG corpus, you must rebuild the ANN index at least once.
  2. Only one concurrent index rebuild is supported on a project in each location.
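
One way to apply these rules in an import pipeline is to decide the rebuild_ann_index flag before each import. The sketch below is an assumption-laden illustration: the function name, its bookkeeping arguments, and the 20% growth threshold are hypothetical, because the guidance above only says to rebuild after importing "significant amounts of data".

```python
def should_rebuild_ann_index(index_built_once, files_added_since_rebuild, total_files,
                             growth_threshold=0.2):
    """Decide whether an import request should set rebuild_ann_index to true.

    growth_threshold is a hypothetical tuning knob: the fraction of the corpus
    added since the last rebuild that counts as a significant change.
    """
    # The index must be rebuilt at least once before the corpus is queried.
    if not index_built_once:
        return True
    if total_files <= 0:
        return False
    return files_added_since_rebuild / total_files >= growth_threshold


first_import = should_rebuild_ann_index(False, 0, 100)
small_update = should_rebuild_ann_index(True, 5, 100)
large_update = should_rebuild_ann_index(True, 30, 100)
```

Because only one index rebuild can run at a time per project and location, a pipeline that imports in several batches might set the flag only on the final batch.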

To upload your local file into your RAG corpus, see Upload a RAG file. To import data into your RAG corpus and trigger an ANN index rebuild, see the following code sample that demonstrates how to import from Cloud Storage. To learn about the supported data sources, see Data sources supported for RAG.

Python

import asyncio

from vertexai.preview import rag
import vertexai

PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
CORPUS_ID = "YOUR_CORPUS_ID"
PATHS = ["gs://my_bucket/my_files_dir"]
REBUILD_ANN_INDEX = True  # Choose True or False.

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

corpus_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragCorpora/{CORPUS_ID}"


async def import_and_rebuild():
    # This is a non-blocking call.
    response = await rag.import_files_async(
        corpus_name=corpus_name,
        paths=PATHS,
        rebuild_ann_index=REBUILD_ANN_INDEX,
    )
    # Wait for the import to complete.
    await response.result()


# In a notebook, which already runs an event loop, use `await import_and_rebuild()` instead.
asyncio.run(import_and_rebuild())

REST

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
CORPUS_ID=CORPUS_ID
GCS_URI=GCS_URI
REBUILD_ANN_INDEX=<true/false>

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${CORPUS_ID}/ragFiles:import \
  -d '{
    "import_rag_files_config": {
      "gcs_source": {
        "uris": ['\""${GCS_URI}"\"']
      },
      "rebuild_ann_index": '${REBUILD_ANN_INDEX}'
    }
  }'

Manage your encryption

Vertex AI RAG Engine provides robust options for managing how your data at rest is encrypted. By default, all user data within RagManagedDb is encrypted using a Google-owned and Google-managed encryption key. This default helps keep your data secure without requiring any specific configuration.

If you require more control over the keys used for encryption, Vertex AI RAG Engine supports customer-managed encryption keys (CMEK). With CMEK, you can use your own cryptographic keys, managed within Cloud Key Management Service (KMS), to protect your RAG corpus data.

For information on CMEK limitations for RAG corpora, see CMEK limitations for Vertex AI RAG Engine.

Set up your KMS key and grant permissions

Before you can create a RAG corpus encrypted with CMEK, you must set up a cryptographic key in Google Cloud KMS, and grant the Vertex AI RAG Engine service account the necessary permissions to use this key.

Prerequisites

To perform the following setup steps, verify that your user account has the appropriate Identity and Access Management (IAM) permissions in the Google Cloud project where you intend to create the KMS key and the RAG corpus. Typically, a role like the Cloud KMS Admin role (roles/cloudkms.admin) is required.

Enable the API

To enable the Cloud KMS API, do the following:

  1. Navigate to the Google Cloud console.
  2. Select the project where you want to manage your keys and create your RAG corpus.
  3. In the search bar, type "Key Management", and select the "Key Management" service.
  4. If the API isn't enabled, click Enable. You might need to wait a few minutes for the API to be fully provisioned.

Create your KMS key ring and key

To create a key ring, do the following:

  1. In the Key Management section, click Create Key Ring.

    Enter the following:

    • Key ring name: Enter a unique name for your key ring, such as rag-engine-cmek-keys.
    • Location type: Select Region. The Cloud Key Management Service key ring must be in the same region as the Vertex AI RAG Engine endpoint that you're using when you're encrypting a RAG corpus with CMEK.
    • Location: Choose a region, such as us-central1. This should be the region where your RAG Engine resources will reside.
  2. Click Create.

To create a key within the key ring, do the following:

  1. After the key ring is created, you'll be prompted to create a key, or you can navigate to Create Key.

    Enter the following:

    • Key name: Enter a unique name for your key, such as my-rag-corpus-key.
    • Protection level: Choose a protection level (Software or HSM). If you require hardware-backed keys, select HSM.
    • Purpose: Select Symmetric encrypt/decrypt. This is required for CMEK.
    • Key material source: Select Generated key.
    • Rotation period: Optional but recommended. Configure a key rotation schedule according to your organization's security policies, such as every 90 days.
  2. Click Create.

To copy the key resource name, do the following:

  1. After the key is created, navigate to its details page.

  2. Locate the resource name. The format is projects/YOUR_PROJECT_ID/locations/YOUR_REGION/keyRings/YOUR_KEY_RING_NAME/cryptoKeys/YOUR_KEY_NAME/cryptoKeyVersions/1.

    Important: For the EncryptionSpec in your RAG corpus, you must use the key resource name without the version number.
  3. Copy the resource name, and remove the /cryptoKeyVersions/VERSION_NUMBER part. The correctly formatted resource name is projects/YOUR_PROJECT_ID/locations/YOUR_REGION/keyRings/YOUR_KEY_RING_NAME/cryptoKeys/YOUR_KEY_NAME.
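
If you script this step, the version suffix can be removed programmatically. The following is a minimal sketch that uses only the resource name format shown above. The helper names are hypothetical, and the second helper is included so that you can compare the key's region with your RAG Engine location.

```python
import re

# Matches a key resource name with or without a trailing version segment.
_KEY_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<key_ring>[^/]+)/cryptoKeys/(?P<key>[^/]+)"
    r"(?:/cryptoKeyVersions/[^/]+)?$"
)


def normalize_kms_key_name(resource_name):
    """Return the key resource name without any /cryptoKeyVersions/N suffix."""
    match = _KEY_PATTERN.match(resource_name)
    if match is None:
        raise ValueError(f"Not a Cloud KMS key resource name: {resource_name!r}")
    return (
        "projects/{project}/locations/{location}"
        "/keyRings/{key_ring}/cryptoKeys/{key}"
    ).format(**match.groupdict())


def kms_key_location(resource_name):
    """Return the location segment of a key resource name."""
    match = _KEY_PATTERN.match(resource_name)
    if match is None:
        raise ValueError(f"Not a Cloud KMS key resource name: {resource_name!r}")
    return match.group("location")


versioned = (
    "projects/my-project/locations/us-central1/keyRings/rag-engine-cmek-keys"
    "/cryptoKeys/my-rag-corpus-key/cryptoKeyVersions/1"
)
key_name = normalize_kms_key_name(versioned)
key_region = kms_key_location(versioned)
```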

Grant permissions to the Vertex AI RAG Engine service agent

For Vertex AI RAG Engine to encrypt and decrypt data using your KMS key, its service agent needs appropriate permissions on that specific key.

To identify your Vertex AI RAG Engine service agent, do the following:

  1. Navigate to the IAM & Admin > IAM page in the Google Cloud console for your project.

  2. On the Identity and Access Management page, enable the Include Google-provided role grants checkbox.

  3. In the filter or search bar for the principals list, search for the Vertex AI RAG Engine service agent. It follows the pattern service-YOUR_PROJECT_NUMBER@gcp-sa-vertex-rag.iam.gserviceaccount.com.

    Replace YOUR_PROJECT_NUMBER with your Google Cloud project number.
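
As a small illustration of the pattern above, the following sketch builds the service agent email from a project number. The helper name is hypothetical. The digit check is an added safeguard, based on the assumption that project numbers are purely numeric, to catch a project ID passed in by mistake.

```python
def rag_service_agent_email(project_number):
    """Build the Vertex AI RAG Engine service agent email for a project number."""
    number = str(project_number)
    if not number.isdigit():
        raise ValueError("Use the numeric project number, not the project ID")
    return f"service-{number}@gcp-sa-vertex-rag.iam.gserviceaccount.com"


# 123456789012 is a made-up project number.
agent_email = rag_service_agent_email(123456789012)
```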

If your Vertex AI RAG Engine service agent isn't present yet, do the following to trigger service agent creation:

  1. Enable the Resource Manager API.

  2. Execute this command in the Cloud Shell or command line:

    gcloud beta services identity create --service=aiplatform.googleapis.com \
        --project=PROJECT_ID

    Alternatively, send the REST API call:

    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json; charset=utf-8" \
        -d "" \
        "https://serviceusage.googleapis.com/v1beta1/projects/PROJECT_ID/services/aiplatform.googleapis.com:generateServiceIdentity"
  3. Verify that the Vertex AI RAG Engine service agent was created.

To grant permissions on the KMS key, do the following:

  1. Go back to the Key Management service in the Google Cloud console.

  2. Select the key ring containing the key you created.

  3. Select the specific key you created.

  4. In the key's details page, go to the Permissions tab.

  5. Click Add Principal.

  6. In the New principals field, type the Vertex AI RAG Engine service agent's email address.

  7. In the Select a role drop-down, select the Cloud KMS CryptoKey Encrypter/Decrypter role (roles/cloudkms.cryptoKeyEncrypterDecrypter). This role grants the service agent the necessary permissions to use the key for encryption and decryption operations.

  8. Click Save.
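
If you prefer to script the grant instead of using the console, the same binding can be expressed with the gcloud kms keys add-iam-policy-binding command. The sketch below only assembles the argument list and doesn't run it. The helper name and the sample values are hypothetical, and using the CLI for this step is an alternative that this page doesn't otherwise describe.

```python
import shlex


def build_kms_binding_command(key_name, key_ring, location, project_id, service_agent_email):
    """Assemble a gcloud command that grants the encrypter/decrypter role on one key."""
    return [
        "gcloud", "kms", "keys", "add-iam-policy-binding", key_name,
        f"--keyring={key_ring}",
        f"--location={location}",
        f"--project={project_id}",
        f"--member=serviceAccount:{service_agent_email}",
        "--role=roles/cloudkms.cryptoKeyEncrypterDecrypter",
    ]


command = build_kms_binding_command(
    key_name="my-rag-corpus-key",
    key_ring="rag-engine-cmek-keys",
    location="us-central1",
    project_id="my-project",
    service_agent_email="service-123456789012@gcp-sa-vertex-rag.iam.gserviceaccount.com",
)
# A shell-safe string that you can review before running it yourself.
printable_command = shlex.join(command)
```

Passing the list to subprocess.run(command, check=True) would execute it, provided that gcloud is installed and authenticated.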

Create a RAG corpus with customer-managed encryption

This code sample demonstrates how to create a RAG corpus encrypted with a customer-managed encryption key (CMEK).

Replace the variables in the following code samples:

Python

import vertexai
from google.cloud import aiplatform
from vertexai import rag
from google.cloud.aiplatform_v1.types.encryption_spec import EncryptionSpec

PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
DISPLAY_NAME = "YOUR_RAG_CORPUS_DISPLAY_NAME"
# Key resource name without the /cryptoKeyVersions/VERSION_NUMBER suffix.
KMS_KEY_NAME = "YOUR_KMS_KEY_NAME"

# Initialize in the same region as the KMS key ring.
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_corpus = rag.create_corpus(
    display_name=DISPLAY_NAME,
    encryption_spec=EncryptionSpec(kms_key_name=KMS_KEY_NAME),
)

REST

PROJECT_ID=YOUR_PROJECT_ID
LOCATION=YOUR_RAG_ENGINE_LOCATION
CORPUS_DISPLAY_NAME=YOUR_RAG_CORPUS_DISPLAY_NAME
KMS_KEY_NAME=YOUR_KMS_KEY_NAME

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora \
  -d '{
    "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
    "encryption_spec" : {
      "kms_key_name" : '\""${KMS_KEY_NAME}"\"'
    }
  }'

Quotas

When you use CMEK with Vertex AI services, such as Vertex AI RAG Engine, there's a quota on the number of unique Cloud KMS keys that can be in use per project per region. This quota is tracked by the metric aiplatform.googleapis.com/in_use_customer_managed_encryption_keys.

Each time you use a new, unique KMS key to create a resource like a RAG corpus within a project and region, the KMS key consumes one unit of this quota. This quota unit isn't released even if the resources using that specific key are deleted.
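
The accounting described above can be modeled with a short sketch. It's a local illustration of the rule, not a way to read the real quota. The class name, its methods, and the limit of 2 are hypothetical.

```python
class CmekQuotaModel:
    """Toy model: one quota unit per unique key per (project, region), never released."""

    def __init__(self, limit):
        self.limit = limit
        self._keys_seen = {}  # (project, region) -> set of key names ever used

    def use_key(self, project, region, kms_key_name):
        """Record that a resource was created with this key. Return the units consumed."""
        keys = self._keys_seen.setdefault((project, region), set())
        if kms_key_name not in keys and len(keys) >= self.limit:
            raise RuntimeError("Quota exceeded: request an increase before using a new key")
        keys.add(kms_key_name)
        return len(keys)

    def delete_resource(self, project, region, kms_key_name):
        """Deleting a resource leaves the consumed units unchanged."""
        return len(self._keys_seen.get((project, region), set()))


model = CmekQuotaModel(limit=2)  # Hypothetical limit, for illustration only.
after_first = model.use_key("my-project", "us-central1", "key-a")
after_reuse = model.use_key("my-project", "us-central1", "key-a")
after_delete = model.delete_resource("my-project", "us-central1", "key-a")
after_second = model.use_key("my-project", "us-central1", "key-b")
```

Reusing an existing key for new corpora doesn't consume additional units, so reusing keys where your security policy allows it keeps quota consumption down.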

If you need more unique keys than the current limit, you must request a quota increase for aiplatform.googleapis.com/in_use_customer_managed_encryption_keys for the selected region.

For more information on how to request a quota increase, see View and edit the quotas in the Google Cloud console.

Last updated 2025-12-15 UTC.