REST Resource: projects.locations.ragCorpora
Resource: RagCorpus
A RagCorpus is a container for RagFiles. A project can have multiple RagCorpora.
name: string. Output only. The resource name of the RagCorpus.
displayName: string. Required. The display name of the RagCorpus. The name can be up to 128 characters long and can consist of any UTF-8 characters.
description: string. Optional. The description of the RagCorpus.
ragEmbeddingModelConfig (deprecated): object (RagEmbeddingModelConfig).
ragVectorDbConfig (deprecated): object (RagVectorDbConfig).
createTime: string (Timestamp format). Output only. Timestamp when this RagCorpus was created.
Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
updateTime: string (Timestamp format). Output only. Timestamp when this RagCorpus was last updated.
Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
corpusStatus: object (CorpusStatus). Output only. RagCorpus state.
ragFilesCount: integer. Output only. Number of RagFiles in the RagCorpus.
NOTE: This field is not populated in the response of VertexRagDataService.ListRagCorpora.
encryptionSpec: object (EncryptionSpec). Optional. Immutable. The CMEK key name used to encrypt at-rest data related to this Corpus. Only applicable to RagManagedDb option for Vector DB. This field can only be set at corpus creation time, and cannot be updated or deleted.
corpusTypeConfig: object (CorpusTypeConfig). Optional. The corpus type config of the RagCorpus.
satisfiesPzs: boolean. Output only. Reserved for future use.
satisfiesPzi: boolean. Output only. Reserved for future use.
backend_config: Union type. backend_config can be only one of the following:
vectorDbConfig: object (RagVectorDbConfig). Optional. Immutable. The config for the Vector DBs.
vertexAiSearchConfig: object (VertexAiSearchConfig). Optional. Immutable. The config for the Vertex AI Search.
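The createTime and updateTime values described above may carry up to 9 fractional digits, which is more precision than Python's datetime stores. The following is a minimal sketch of one way to parse them; the helper name parse_timestamp is hypothetical, and truncating to microseconds is a choice made here, not something the API prescribes.

```python
import re
from datetime import datetime, timezone

# Date-time, optional fraction (1 to 9 digits), then "Z" or a numeric offset.
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 timestamp into a UTC datetime (hypothetical helper).

    Nanosecond digits beyond microseconds are truncated, because
    datetime only stores microsecond precision.
    """
    match = _TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    base, fraction, offset = match.groups()
    # Pad or truncate the fraction to exactly 6 digits.
    micros = (fraction or "").ljust(6, "0")[:6]
    if offset == "Z":
        offset = "+00:00"
    parsed = datetime.fromisoformat(f"{base}.{micros}{offset}")
    # Normalize to UTC, mirroring the Z-normalized output of the API.
    return parsed.astimezone(timezone.utc)
```

All three example strings from the field description are accepted; a value with a "+05:30" offset comes back shifted to UTC.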
| JSON representation |
|---|
{"name":string,"displayName":string,"description":string,"ragEmbeddingModelConfig":{object ( |
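As an illustration of the field rules above, the sketch below assembles a RagCorpus body as a plain Python dict. The function name build_rag_corpus_body and its checks are assumptions made for this example; in particular, it counts Python characters (code points) against the 128-character limit, which may differ from how the service counts. Output-only fields are left out because the caller does not set them.

```python
from typing import Any, Dict, Optional

MAX_DISPLAY_NAME_CHARS = 128  # limit stated in the displayName description


def build_rag_corpus_body(
    display_name: str,
    description: Optional[str] = None,
    vector_db_config: Optional[Dict[str, Any]] = None,
    vertex_ai_search_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a RagCorpus JSON body (hypothetical helper, client-side checks only)."""
    if not display_name:
        raise ValueError("displayName is required")
    if len(display_name) > MAX_DISPLAY_NAME_CHARS:
        raise ValueError("displayName can be up to 128 characters long")
    # backend_config is a union: at most one member may be present.
    if vector_db_config is not None and vertex_ai_search_config is not None:
        raise ValueError("set only one of vectorDbConfig or vertexAiSearchConfig")

    body: Dict[str, Any] = {"displayName": display_name}
    if description is not None:
        body["description"] = description
    if vector_db_config is not None:
        body["vectorDbConfig"] = vector_db_config
    if vertex_ai_search_config is not None:
        body["vertexAiSearchConfig"] = vertex_ai_search_config
    return body
```

For example, build_rag_corpus_body("docs", vector_db_config={"ragManagedDb": {"knn": {}}}) yields a body that selects the RAG-managed Vector DB with KNN retrieval.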
RagVectorDbConfig
Config for the Vector DB to use for RAG.
apiAuth: object (ApiAuth). Authentication config for the chosen Vector DB.
ragEmbeddingModelConfig: object (RagEmbeddingModelConfig). Optional. Immutable. The embedding model config of the Vector DB.
vector_db: Union type. vector_db can be only one of the following:
ragManagedDb: object (RagManagedDb). The config for the RAG-managed Vector DB.
weaviate: object (Weaviate). The config for the Weaviate.
pinecone: object (Pinecone). The config for the Pinecone.
vertexFeatureStore: object (VertexFeatureStore). The config for the Vertex Feature Store.
vertexVectorSearch: object (VertexVectorSearch). The config for the Vertex Vector Search.
ragManagedVertexVectorSearch: object (RagManagedVertexVectorSearch). The config for the RAG-managed Vertex Vector Search 2.0.
| JSON representation |
|---|
{"apiAuth":{object ( |
RagManagedDb
The config for the default RAG-managed Vector DB.
retrieval_strategy: Union type. retrieval_strategy can be only one of the following:
knn: object (KNN). Performs a KNN search on RagCorpus. Default choice if not specified.
ann: object (ANN). Performs an ANN search on RagCorpus. Use this if you have a lot of files (> 10K) in your RagCorpus and want to reduce the search latency.
KNN
Config for KNN search.
This type has no fields.
ANN
Config for ANN search.
RagManagedDb uses a tree-based structure to partition data and facilitate faster searches. As a tradeoff, it requires longer indexing time and manual triggering of an index rebuild via the ImportRagFiles and ragCorpora.patch APIs.
treeDepth: integer. The depth of the tree-based structure. Only depth values of 2 and 3 are supported.
Recommended value is 2 if you have O(10K) files in the RagCorpus; set this to 3 if you have more than that.
Default value is 2.
leafCount: integer. Number of leaf nodes in the tree-based structure. Each leaf node contains groups of closely related vectors along with their corresponding centroid.
Recommended value is 10 * sqrt(num of RagFiles in your RagCorpus).
Default value is 500.
| JSON representation |
|---|
{"treeDepth":integer,"leafCount":integer} |
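The recommendations above for choosing between KNN and ANN, and for sizing the ANN tree, can be sketched as a small helper. This is only one reading of the guidance: the function name is hypothetical, the 10K threshold comes from the ann field description, and the 100,000-file cutoff for treeDepth 3 is an assumption, since the text only says "more than" O(10K) files.

```python
import math

ANN_FILE_THRESHOLD = 10_000       # "> 10K" files, from the ann field description
TREE_DEPTH_3_THRESHOLD = 100_000  # ASSUMPTION: where "more than O(10K)" begins


def suggest_retrieval_strategy(num_rag_files: int) -> dict:
    """Return a retrieval_strategy fragment for RagManagedDb (hypothetical helper)."""
    if num_rag_files < 0:
        raise ValueError("num_rag_files must be non-negative")
    if num_rag_files <= ANN_FILE_THRESHOLD:
        # KNN is the default choice and has no fields.
        return {"knn": {}}
    # Recommended leafCount: 10 * sqrt(number of RagFiles).
    leaf_count = round(10 * math.sqrt(num_rag_files))
    tree_depth = 2 if num_rag_files < TREE_DEPTH_3_THRESHOLD else 3
    return {"ann": {"treeDepth": tree_depth, "leafCount": leaf_count}}
```

With 40,000 files the formula gives 10 * sqrt(40000) = 2000 leaf nodes; with 1,000,000 files it gives 10,000.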
Weaviate
The config for the Weaviate.
httpEndpoint: string. Weaviate DB instance HTTP endpoint, e.g. 34.56.78.90:8080. Vertex RAG only supports HTTP connection to Weaviate. This value cannot be changed after it's set.
collectionName: string. The corresponding collection this corpus maps to. This value cannot be changed after it's set.
| JSON representation |
|---|
{"httpEndpoint":string,"collectionName":string} |
Pinecone
The config for the Pinecone.
indexName: string. Pinecone index name. This value cannot be changed after it's set.
| JSON representation |
|---|
{"indexName":string} |
VertexFeatureStore
The config for the Vertex Feature Store.
featureViewResourceName: string. The resource name of the FeatureView. Format: projects/{project}/locations/{location}/featureOnlineStores/{featureOnlineStore}/featureViews/{featureView}
| JSON representation |
|---|
{"featureViewResourceName":string} |
VertexVectorSearch
The config for the Vertex Vector Search.
indexEndpoint: string. The resource name of the Index Endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{indexEndpoint}
index: string. The resource name of the Index. Format: projects/{project}/locations/{location}/indexes/{index}
| JSON representation |
|---|
{"indexEndpoint":string,"index":string} |
RagManagedVertexVectorSearch
The config for the RAG-managed Vertex Vector Search 2.0.
collectionName: string. Output only. The resource name of the Vector Search 2.0 Collection that RAG created for the corpus. Only populated after the corpus is successfully created. Format: projects/{project}/locations/{location}/collections/{collectionId}
| JSON representation |
|---|
{"collectionName":string} |
ApiAuth
The generic reusable API auth config. Deprecated. Please use AuthConfig (google/cloud/aiplatform/master/auth.proto) instead.
auth_config: Union type. auth_config can be only one of the following:
apiKeyConfig: object (ApiKeyConfig). The API secret.
| JSON representation |
|---|
{// auth_config"apiKeyConfig":{object ( |
RagEmbeddingModelConfig
Config for the embedding model to use for RAG.
model_config: Union type. model_config can be only one of the following:
vertexPredictionEndpoint: object (VertexPredictionEndpoint). The Vertex AI Prediction Endpoint that either refers to a publisher model or an endpoint that is hosting a 1P fine-tuned text embedding model. Endpoints hosting non-1P fine-tuned text embedding models are currently not supported. This is used for dense vector search.
hybridSearchConfig: object (HybridSearchConfig). Configuration for hybrid search.
| JSON representation |
|---|
{// model_config"vertexPredictionEndpoint":{object ( |
VertexPredictionEndpoint
Config representing a model hosted on Vertex Prediction Endpoint.
endpoint: string. Required. The endpoint resource name. Format: projects/{project}/locations/{location}/publishers/{publisher}/models/{model} or projects/{project}/locations/{location}/endpoints/{endpoint}
model: string. Output only. The resource name of the model that is deployed on the endpoint. Present only when the endpoint is not a publisher model. Pattern: projects/{project}/locations/{location}/models/{model}
modelVersionId: string. Output only. Version ID of the model that is deployed on the endpoint. Present only when the endpoint is not a publisher model.
| JSON representation |
|---|
{"endpoint":string,"model":string,"modelVersionId":string} |
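Because the endpoint field accepts two resource-name formats, and the output-only model and modelVersionId fields are present only for the non-publisher form, it can help to tell the two apart on the client side. The sketch below does this with regular expressions; the function name and the rule that each path segment is any non-empty run of characters without "/" are assumptions for illustration, not the service's validation rules.

```python
import re

# One pattern per documented format; each {placeholder} is matched as a
# non-empty segment without slashes (an assumption, not an official rule).
_PUBLISHER_MODEL = re.compile(
    r"^projects/[^/]+/locations/[^/]+/publishers/[^/]+/models/[^/]+$"
)
_ENDPOINT = re.compile(r"^projects/[^/]+/locations/[^/]+/endpoints/[^/]+$")


def classify_endpoint(endpoint: str) -> str:
    """Return "publisher_model" or "endpoint" for a resource name (hypothetical helper)."""
    if _PUBLISHER_MODEL.match(endpoint):
        return "publisher_model"
    if _ENDPOINT.match(endpoint):
        return "endpoint"
    raise ValueError(f"unrecognized endpoint resource name: {endpoint!r}")
```

A caller could use the result to decide whether to expect model and modelVersionId in the response.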
HybridSearchConfig
Config for hybrid search.
sparseEmbeddingConfig: object (SparseEmbeddingConfig). Optional. The configuration for sparse embedding generation. This field is optional; the default behavior depends on the vector database choice on the RagCorpus.
denseEmbeddingModelPredictionEndpoint: object (VertexPredictionEndpoint). Required. The Vertex AI Prediction Endpoint that hosts the embedding model for dense embedding generations.
| JSON representation |
|---|
{"sparseEmbeddingConfig":{object ( |
SparseEmbeddingConfig
Bm25
Message for BM25 parameters.
multilingual: boolean. Optional. Use multilingual tokenizer if set to true.
k1: number. Optional. The parameter to control term frequency saturation. It determines the scaling between the matching term frequency and final score. k1 is in the range of [1.2, 3]. The default value is 1.2.
b: number. Optional. The parameter to control document length normalization. It determines how much the document length affects the final score. b is in the range of [0, 1]. The default value is 0.75.
| JSON representation |
|---|
{"multilingual":boolean,"k1":number,"b":number} |
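To make the roles of k1 and b concrete, the sketch below computes the per-term score from the standard Okapi BM25 formula. This is the textbook formulation, shown only to illustrate what the two parameters control; this page does not state that the service scores documents in exactly this way. The function name and its arguments (tf, doc_len, avg_doc_len, idf) are hypothetical.

```python
def bm25_term_score(
    tf: float,
    doc_len: float,
    avg_doc_len: float,
    idf: float,
    k1: float = 1.2,
    b: float = 0.75,
) -> float:
    """Score one query term against one document with textbook Okapi BM25."""
    # Ranges and defaults taken from the k1 and b field descriptions.
    if not 1.2 <= k1 <= 3:
        raise ValueError("k1 must be in the range [1.2, 3]")
    if not 0 <= b <= 1:
        raise ValueError("b must be in the range [0, 1]")
    # b controls how strongly document length is normalized:
    # b = 0 ignores length, b = 1 normalizes fully.
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    # k1 controls saturation: the score approaches idf * (k1 + 1)
    # as the term frequency grows.
    return idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)
```

With b set to 0, two documents of different lengths receive the same score for the same term frequency; raising k1 lets repeated occurrences of a term keep adding to the score for longer before it levels off.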
VertexAiSearchConfig
Config for the Vertex AI Search.
servingConfig: string. Vertex AI Search Serving Config resource full name. For example, projects/{project}/locations/{location}/collections/{collection}/engines/{engine}/servingConfigs/{servingConfig} or projects/{project}/locations/{location}/collections/{collection}/dataStores/{dataStore}/servingConfigs/{servingConfig}.
| JSON representation |
|---|
{"servingConfig":string} |
CorpusStatus
State
RagCorpus life state.
| Enums | |
|---|---|
UNKNOWN | This state is not supposed to happen. |
INITIALIZED | RagCorpus resource entry is initialized, but hasn't done validation. |
ACTIVE | RagCorpus is provisioned successfully and is ready to serve. |
ERROR | RagCorpus is in a problematic situation. See the errorMessage field for details. |
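A client that waits for a corpus to become usable could interpret these states roughly as follows. This is a sketch under assumptions: the helper name is hypothetical, and the key "state" inside corpusStatus is assumed from the State heading above (only errorMessage is named explicitly on this page).

```python
def is_corpus_ready(corpus: dict) -> bool:
    """Interpret a RagCorpus response dict (hypothetical helper).

    Returns True when ACTIVE, False when still INITIALIZED,
    and raises for ERROR or any other state.
    """
    status = corpus.get("corpusStatus", {})
    # ASSUMPTION: the enum value is carried in a "state" key.
    state = status.get("state", "UNKNOWN")
    if state == "ACTIVE":
        return True
    if state == "INITIALIZED":
        # Entry exists but validation has not finished; check again later.
        return False
    if state == "ERROR":
        raise RuntimeError(status.get("errorMessage", "RagCorpus is in ERROR state"))
    raise RuntimeError(f"unexpected RagCorpus state: {state}")
```

A polling loop would call this on each Get response and stop as soon as it returns True or raises.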
CorpusTypeConfig
The config for the corpus type of the RagCorpus.
corpus_type_config: Union type. corpus_type_config can be only one of the following:
documentCorpus: object (DocumentCorpus). Optional. Config for the document corpus.
memoryCorpus: object (MemoryCorpus). Optional. Config for the memory corpus.
| JSON representation |
|---|
{// corpus_type_config"documentCorpus":{object ( |
DocumentCorpus
Config for the document corpus.
This type has no fields.
MemoryCorpus
LlmParser
Specifies the LLM parsing for RagFiles.
modelName: string. The name of an LLM model used for parsing. Format: projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}
maxParsingRequestsPerMin: integer. The maximum number of requests the job is allowed to make to the LLM model per minute. Consult https://cloud.google.com/vertex-ai/generative-ai/docs/quotas and your document size to set an appropriate value here. If unspecified, a default value of 5000 QPM is used.
globalMaxParsingRequestsPerMin: integer. The maximum number of requests the job is allowed to make to the LLM model per minute in this project. Consult https://cloud.google.com/vertex-ai/generative-ai/docs/quotas and your document size to set an appropriate value here. If this value is not specified, maxParsingRequestsPerMin is used by the indexing pipeline job as the global limit.
customParsingPrompt: string. The prompt to use for parsing. If not specified, a default prompt will be used.
| JSON representation |
|---|
{"modelName":string,"maxParsingRequestsPerMin":integer,"globalMaxParsingRequestsPerMin":integer,"customParsingPrompt":string} |
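The fallback rules for the two rate-limit fields can be summarized in a few lines. The helper below is a sketch of how a client might predict the effective limits from an LlmParser dict; the function name is hypothetical, and treating 0 the same as "unspecified" is an assumption made here.

```python
from typing import Tuple

DEFAULT_MAX_PARSING_REQUESTS_PER_MIN = 5000  # documented default (QPM)


def effective_parsing_limits(llm_parser: dict) -> Tuple[int, int]:
    """Return (per-job limit, global limit) for an LlmParser dict (hypothetical helper)."""
    # ASSUMPTION: a missing key or a value of 0 both mean "unspecified".
    per_job = (
        llm_parser.get("maxParsingRequestsPerMin")
        or DEFAULT_MAX_PARSING_REQUESTS_PER_MIN
    )
    # When the global limit is unspecified, the per-job limit is used instead.
    global_limit = llm_parser.get("globalMaxParsingRequestsPerMin") or per_job
    return per_job, global_limit
```

For an empty config this returns (5000, 5000); setting only maxParsingRequestsPerMin to 100 returns (100, 100).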
Methods | |
|---|---|
| Creates a RagCorpus. |
| Deletes a RagCorpus. |
| Gets a RagCorpus. |
| Lists RagCorpora in a Location. |
| Updates a RagCorpus. |
Last updated 2025-12-17 UTC.