Vertex AI RAG Engine overview
The VPC-SC security controls and CMEK are supported by Vertex AI RAG Engine. Data residency and AXT security controls aren't supported.
If you use a Vertex AI RAG Engine-managed Spanner instance as a vector database in a location that is GA, then Google Cloud will bill you for that Spanner instance. For more information, see Vertex AI RAG Engine billing.
You must be added to the allowlist to access Vertex AI RAG Engine in us-central1 and us-east4. For users with existing projects, there is no impact. For users with new projects, you can try other regions, or contact vertex-ai-rag-engine-support@google.com to onboard to us-central1.
This page describes what Vertex AI RAG Engine is and how it works.
To learn how to use the Vertex AI SDK to run Vertex AI RAG Engine tasks, see the RAG quickstart for Python.
Overview
Vertex AI RAG Engine, a component of the Vertex AI Platform, facilitates Retrieval-Augmented Generation (RAG). Vertex AI RAG Engine is also a data framework for developing context-augmented large language model (LLM) applications. Context augmentation occurs when you apply an LLM to your data. This implements retrieval-augmented generation (RAG).
A common problem with LLMs is that they don't understand private knowledge, that is, your organization's data. With Vertex AI RAG Engine, you can enrich the LLM context with additional private information, which helps the model reduce hallucinations and answer questions more accurately.
By combining additional knowledge sources with the existing knowledge that LLMs have, a better context is provided. The improved context along with the query enhances the quality of the LLM's response.
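The idea of combining extra context with the query can be sketched in a few lines of plain Python. This is an illustration only, not the Vertex AI RAG Engine API: the function name `build_augmented_prompt`, the prompt wording, and the sample passage are all hypothetical.

```python
def build_augmented_prompt(query: str, context_passages: list[str]) -> str:
    """Combine retrieved passages with the user query into one prompt.

    Hypothetical helper for illustration; the prompt wording is an assumption.
    """
    # Number each passage so individual passages can be told apart.
    context = "\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(context_passages, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Made-up private knowledge that a general-purpose LLM would not know.
prompt = build_augmented_prompt(
    "What is our refund window?",
    ["Refunds are accepted within 30 days of purchase."],
)
print(prompt)
```

The resulting string is what would be sent to the model in place of the bare query, so the model answers from the supplied passages rather than from its training data alone.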
The following image illustrates the key concepts to understanding Vertex AI RAG Engine.

These concepts are listed in the order of the retrieval-augmented generation (RAG) process.
Data ingestion: Intake data from different data sources. For example, local files, Cloud Storage, and Google Drive.
Data transformation: Conversion of the data in preparation for indexing. For example, data is split into chunks.
Embedding: Numerical representations of words or pieces of text. These numbers capture the semantic meaning and context of the text. Similar or related words or text tend to have similar embeddings, which means they are closer together in the high-dimensional vector space.
Data indexing: Vertex AI RAG Engine creates an index called a corpus. The index structures the knowledge base so it's optimized for searching. For example, the index is like a detailed table of contents for a massive reference book.
Retrieval: When a user asks a question or provides a prompt, the retrieval component in Vertex AI RAG Engine searches through its knowledge base to find information that is relevant to the query.
Generation: The retrieved information becomes the context added to the original user query as a guide for the generative AI model to generate factually grounded and relevant responses.
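The steps above, from transformation through retrieval, can be sketched as a toy pipeline in standard-library Python. This is not how Vertex AI RAG Engine is implemented and it does not use its API. It assumes a bag-of-words count vector in place of a learned embedding, a plain list in place of a corpus, and made-up sample documents. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 8) -> list[str]:
    """Data transformation: split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Similarity of two count vectors; 0.0 if either vector is empty."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents: list[str]) -> list[tuple[str, Counter]]:
    """Data indexing: store every chunk next to its vector."""
    return [(chunk, embed(chunk)) for doc in documents for chunk in chunk_text(doc)]


def retrieve(index: list[tuple[str, Counter]], query: str, top_k: int = 1) -> list[str]:
    """Retrieval: return the top_k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:top_k]]


# Data ingestion: made-up documents standing in for files from a data source.
documents = [
    "The cafeteria opens at eight and closes at three.",
    "Expense reports are due on the fifth day of each month.",
]
index = build_index(documents)
top_chunks = retrieve(index, "When are expense reports due?")
print(top_chunks)
```

In the generation step, the chunks returned by `retrieve` would be added to the user query as context before the query is sent to the model.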
Supported regions
Vertex AI RAG Engine is supported in the following regions:
| Region | Location | Description | Launch stage |
|---|---|---|---|
| us-central1 | Iowa | v1 and v1beta1 versions are supported. | Allowlist |
| us-east4 | Virginia | v1 and v1beta1 versions are supported. | Allowlist |
| europe-west3 | Frankfurt, Germany | v1 and v1beta1 versions are supported. | GA |
| europe-west4 | Eemshaven, Netherlands | v1 and v1beta1 versions are supported. | GA |
| asia-east1 | Taiwan | v1 and v1beta1 versions are supported. | Preview |
| asia-northeast1 | Tokyo | v1 and v1beta1 versions are supported. | Preview |
| asia-northeast3 | Seoul | v1 and v1beta1 versions are supported. | Preview |
| asia-south1 | Mumbai | v1 and v1beta1 versions are supported. | Preview |
| asia-southeast1 | Singapore | v1 and v1beta1 versions are supported. | Preview |
| europe-central2 | Warsaw | v1 and v1beta1 versions are supported. | Preview |
| europe-north1 | Finland | v1 and v1beta1 versions are supported. | Preview |
| europe-southwest1 | Madrid | v1 and v1beta1 versions are supported. | Preview |
| europe-west1 | Belgium | v1 and v1beta1 versions are supported. | Preview |
| europe-west2 | London | v1 and v1beta1 versions are supported. | Preview |
| europe-west6 | Zürich | v1 and v1beta1 versions are supported. | Preview |
| europe-west8 | Milan | v1 and v1beta1 versions are supported. | Preview |
| europe-west9 | Paris | v1 and v1beta1 versions are supported. | Preview |
| us-east1 | Moncks Corner, SC | v1 and v1beta1 versions are supported. | Preview |
| us-east5 | Columbus, OH | v1 and v1beta1 versions are supported. | Preview |
| us-south1 | Dallas, TX | v1 and v1beta1 versions are supported. | Preview |
| us-west1 | Oregon | v1 and v1beta1 versions are supported. | Preview |
| us-west4 | Las Vegas, NV | v1 and v1beta1 versions are supported. | Preview |
The launch stage of us-central1 and us-east4 has changed to Allowlist. If you'd like to experiment with Vertex AI RAG Engine, try other regions. If you plan to onboard your production traffic to these regions, contact vertex-ai-rag-engine-support@google.com.
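When a script needs to pick a region, the launch stages from the table above can be encoded as a small lookup. The following is a sketch in plain Python, not part of any Google Cloud SDK. The names `LAUNCH_STAGE_BY_REGION`, `regions_with_stage`, and `needs_allowlist` are hypothetical, and the data is copied from the table as of this page's last update, so it may go stale.

```python
# Launch stage per region, copied from the supported regions table.
LAUNCH_STAGE_BY_REGION = {
    "us-central1": "Allowlist",
    "us-east4": "Allowlist",
    "europe-west3": "GA",
    "europe-west4": "GA",
    "asia-east1": "Preview",
    "asia-northeast1": "Preview",
    "asia-northeast3": "Preview",
    "asia-south1": "Preview",
    "asia-southeast1": "Preview",
    "europe-central2": "Preview",
    "europe-north1": "Preview",
    "europe-southwest1": "Preview",
    "europe-west1": "Preview",
    "europe-west2": "Preview",
    "europe-west6": "Preview",
    "europe-west8": "Preview",
    "europe-west9": "Preview",
    "us-east1": "Preview",
    "us-east5": "Preview",
    "us-south1": "Preview",
    "us-west1": "Preview",
    "us-west4": "Preview",
}


def regions_with_stage(stage: str) -> list[str]:
    """Return the regions at the given launch stage, sorted by name."""
    return sorted(
        region
        for region, region_stage in LAUNCH_STAGE_BY_REGION.items()
        if region_stage == stage
    )


def needs_allowlist(region: str) -> bool:
    """True if the region requires allowlisting. Raises KeyError if unlisted."""
    return LAUNCH_STAGE_BY_REGION[region] == "Allowlist"


print(regions_with_stage("GA"))
print(needs_allowlist("us-central1"))
```

A new project could, for example, choose from `regions_with_stage("GA")` or `regions_with_stage("Preview")` to avoid the allowlisted regions.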
Delete Vertex AI RAG Engine
The following code samples demonstrate how to delete a Vertex AI RAG Engine for the Google Cloud console, Python, and REST:
Version 1 (v1) API parameters and code samples.
v1beta1 API parameters and code samples.
Submit feedback
To chat with Google support, go to the Vertex AI RAG Engine support group.
To send an email, use the email address vertex-ai-rag-engine-support@google.com.
What's next
- To learn how to use the Vertex AI SDK to run Vertex AI RAG Engine tasks, see RAG quickstart for Python.
- To learn about grounding, see Grounding overview.
- To learn more about the responses from RAG, see Retrieval and Generation Output of Vertex AI RAG Engine.
- To learn about the RAG architecture:
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.