REST Resource: projects.locations.ragCorpora

Resource: RagCorpus

A RagCorpus is a RagFile container and a project can have multiple RagCorpora.

Fields

name string

Output only. The resource name of the RagCorpus.

displayName string

Required. The display name of the RagCorpus. The name can be up to 128 characters long and can consist of any UTF-8 characters.

description string

Optional. The description of the RagCorpus.

ragEmbeddingModelConfig (deprecated) object (RagEmbeddingModelConfig)

This field is deprecated.

Optional. Immutable. The embedding model config of the RagCorpus.

ragVectorDbConfig (deprecated) object (RagVectorDbConfig)

This field is deprecated.

Optional. Immutable. The Vector DB config of the RagCorpus.

createTime string (Timestamp format)

Output only. Timestamp when this RagCorpus was created.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
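To illustrate the timestamp format described above, the following is a minimal Python sketch that parses the three example values and normalizes them to UTC. The function name `parse_timestamp` is hypothetical and not part of any client library. Python's `datetime` stores microseconds only, so the sketch truncates any digits beyond the sixth fractional place.

```python
import re
from datetime import datetime, timedelta, timezone

# Date, time, optional 1-9 fractional digits, then "Z" or a numeric offset.
_RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d{1,9}))?"
    r"(Z|z|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 timestamp and return it normalized to UTC."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    year, month, day, hour, minute, second, fraction, offset = match.groups()
    # Pad or truncate the fraction to exactly six digits (microseconds).
    micros = int((fraction or "").ljust(6, "0")[:6])
    if offset in ("Z", "z"):
        tz = timezone.utc
    else:
        sign = 1 if offset[0] == "+" else -1
        delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6]))
        tz = timezone(sign * delta)
    parsed = datetime(
        int(year), int(month), int(day),
        int(hour), int(minute), int(second), micros, tzinfo=tz,
    )
    return parsed.astimezone(timezone.utc)


for example in (
    "2014-10-02T15:01:23Z",
    "2014-10-02T15:01:23.045123456Z",
    "2014-10-02T15:01:23+05:30",
):
    print(example, "->", parse_timestamp(example).isoformat())
```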

updateTime string (Timestamp format)

Output only. Timestamp when this RagCorpus was last updated.

corpusStatus object (CorpusStatus)

Output only. RagCorpus state.

ragFilesCount integer

Output only. Number of RagFiles in the RagCorpus.

NOTE: This field is not populated in the response of VertexRagDataService.ListRagCorpora.

encryptionSpec object (EncryptionSpec)

Optional. Immutable. The CMEK key name used to encrypt at-rest data related to this Corpus. Only applicable to the RagManagedDb option for Vector DB. This field can only be set at corpus creation time, and cannot be updated or deleted.

corpusTypeConfig object (CorpusTypeConfig)

Optional. The corpus type config of the RagCorpus.

satisfiesPzs boolean

Output only. Reserved for future use.

satisfiesPzi boolean

Output only. Reserved for future use.

backend_config Union type

The backend config of the RagCorpus. It can be a data store and/or a retrieval engine. backend_config can be only one of the following:

vectorDbConfig object (RagVectorDbConfig)

Optional. Immutable. The config for the Vector DBs.

vertexAiSearchConfig object (VertexAiSearchConfig)

Optional. Immutable. The config for the Vertex AI Search.

JSON representation

{
  "name": string,
  "displayName": string,
  "description": string,
  "ragEmbeddingModelConfig": { object (`RagEmbeddingModelConfig`) },
  "ragVectorDbConfig": { object (`RagVectorDbConfig`) },
  "createTime": string,
  "updateTime": string,
  "corpusStatus": { object (`CorpusStatus`) },
  "ragFilesCount": integer,
  "encryptionSpec": { object (`EncryptionSpec`) },
  "corpusTypeConfig": { object (`CorpusTypeConfig`) },
  "satisfiesPzs": boolean,
  "satisfiesPzi": boolean,
  // backend_config
  "vectorDbConfig": { object (`RagVectorDbConfig`) },
  "vertexAiSearchConfig": { object (`VertexAiSearchConfig`) }
  // Union type
}
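As an illustration of the representation above, here is a minimal Python sketch that assembles a RagCorpus body from writable fields only and checks two documented constraints: the 128-character limit on displayName and the rule that backend_config holds at most one member. The helper name `check_rag_corpus` and the sample values are hypothetical. The sketch only validates a local dictionary and does not call the service.

```python
import json

# The two members of the backend_config union, per the field list above.
BACKEND_CONFIG_MEMBERS = ("vectorDbConfig", "vertexAiSearchConfig")


def check_rag_corpus(corpus: dict) -> None:
    """Raise ValueError if the body breaks a constraint stated in the reference."""
    display_name = corpus.get("displayName", "")
    if not display_name:
        raise ValueError("displayName is required")
    if len(display_name) > 128:
        raise ValueError("displayName can be up to 128 characters long")
    present = [key for key in BACKEND_CONFIG_MEMBERS if key in corpus]
    if len(present) > 1:
        raise ValueError(f"backend_config can be only one of: {present}")


# Hypothetical request body; output-only fields such as name are omitted.
corpus = {
    "displayName": "support-articles",
    "description": "Help-center articles for retrieval",
    "vectorDbConfig": {"ragManagedDb": {"knn": {}}},
}

check_rag_corpus(corpus)
print(json.dumps(corpus, indent=2))
```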

RagVectorDbConfig

Config for the Vector DB to use for RAG.

Fields

apiAuth object (ApiAuth)

Authentication config for the chosen Vector DB.

ragEmbeddingModelConfig object (RagEmbeddingModelConfig)

Optional. Immutable. The embedding model config of the Vector DB.

vector_db Union type

The config for the Vector DB. vector_db can be only one of the following:

ragManagedDb object (RagManagedDb)

The config for the RAG-managed Vector DB.

weaviate object (Weaviate)

The config for the Weaviate.

pinecone object (Pinecone)

The config for the Pinecone.

vertexFeatureStore object (VertexFeatureStore)

The config for the Vertex Feature Store.

vertexVectorSearch object (VertexVectorSearch)

The config for the Vertex Vector Search.

ragManagedVertexVectorSearch object (RagManagedVertexVectorSearch)

The config for the RAG-managed Vertex Vector Search 2.0.

JSON representation

{
  "apiAuth": { object (`ApiAuth`) },
  "ragEmbeddingModelConfig": { object (`RagEmbeddingModelConfig`) },
  // vector_db
  "ragManagedDb": { object (`RagManagedDb`) },
  "weaviate": { object (`Weaviate`) },
  "pinecone": { object (`Pinecone`) },
  "vertexFeatureStore": { object (`VertexFeatureStore`) },
  "vertexVectorSearch": { object (`VertexVectorSearch`) },
  "ragManagedVertexVectorSearch": { object (`RagManagedVertexVectorSearch`) }
  // Union type
}

RagManagedDb

The config for the default RAG-managed Vector DB.

Fields

retrieval_strategy Union type

Choice of retrieval strategy. retrieval_strategy can be only one of the following:

knn object (KNN)

Performs a KNN search on RagCorpus. Default choice if not specified.

ann object (ANN)

Performs an ANN search on RagCorpus. Use this if you have a lot of files (> 10K) in your RagCorpus and want to reduce the search latency.

JSON representation
{
  // retrieval_strategy
  "knn": { object (`KNN`) },
  "ann": { object (`ANN`) }
  // Union type
}

KNN

Config for KNN search.

This type has no fields.

ANN

Config for ANN search.

RagManagedDb uses a tree-based structure to partition data and facilitate faster searches. As a tradeoff, it requires longer indexing time and manually triggering an index rebuild through the ImportRagFiles and ragCorpora.patch APIs.

Fields

treeDepth integer

The depth of the tree-based structure. Only depth values of 2 and 3 are supported.

The recommended value is 2 if you have O(10K) files in the RagCorpus, and 3 if you have more than that.

Default value is 2.

leafCount integer

Number of leaf nodes in the tree-based structure. Each leaf node contains groups of closely related vectors along with their corresponding centroid.

Recommended value is 10 * sqrt(num of RagFiles in your RagCorpus).

Default value is 500.

JSON representation
{"treeDepth":integer,"leafCount":integer}
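The KNN and ANN guidance above can be sketched as a small Python helper that turns a file count into a ragManagedDb value. The function name `suggest_rag_managed_db` is hypothetical. The `depth3_threshold` default of 100,000 is an assumption made for illustration, because the reference only says to use depth 3 with more than O(10K) files. The leafCount follows the stated 10 * sqrt(number of RagFiles) recommendation.

```python
import math

# From the "> 10K" guidance for choosing ANN over KNN.
ANN_FILE_THRESHOLD = 10_000


def suggest_rag_managed_db(num_rag_files: int, depth3_threshold: int = 100_000) -> dict:
    """Return a ragManagedDb config that follows the documented recommendations.

    depth3_threshold is a hypothetical cutoff; the reference gives no exact number.
    """
    if num_rag_files < 0:
        raise ValueError("num_rag_files must be non-negative")
    if num_rag_files <= ANN_FILE_THRESHOLD:
        # KNN is the default retrieval strategy for smaller corpora.
        return {"knn": {}}
    tree_depth = 3 if num_rag_files > depth3_threshold else 2
    leaf_count = round(10 * math.sqrt(num_rag_files))
    return {"ann": {"treeDepth": tree_depth, "leafCount": leaf_count}}


print(suggest_rag_managed_db(5_000))
print(suggest_rag_managed_db(40_000))
print(suggest_rag_managed_db(1_000_000))
```

For 40,000 files the square root is 200, so the suggested leafCount is 2000 with a tree depth of 2.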

Weaviate

The config for the Weaviate.

Fields

httpEndpoint string

Weaviate DB instance HTTP endpoint, for example 34.56.78.90:8080. Vertex RAG only supports HTTP connection to Weaviate. This value cannot be changed after it's set.

collectionName string

The corresponding collection this corpus maps to. This value cannot be changed after it's set.

JSON representation
{"httpEndpoint":string,"collectionName":string}

Pinecone

The config for the Pinecone.

Fields

indexName string

Pinecone index name. This value cannot be changed after it's set.

JSON representation
{"indexName":string}

VertexFeatureStore

The config for the Vertex Feature Store.

Fields

featureViewResourceName string

The resource name of the FeatureView. Format: projects/{project}/locations/{location}/featureOnlineStores/{featureOnlineStore}/featureViews/{featureView}

JSON representation
{"featureViewResourceName":string}

VertexVectorSearch

The config for the Vertex Vector Search.

Fields

indexEndpoint string

The resource name of the Index Endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{indexEndpoint}

index string

The resource name of the Index. Format: projects/{project}/locations/{location}/indexes/{index}

JSON representation
{"indexEndpoint":string,"index":string}

RagManagedVertexVectorSearch

The config for the RAG-managed Vertex Vector Search 2.0.

Fields

collectionName string

Output only. The resource name of the Vector Search 2.0 Collection that RAG created for the corpus. Only populated after the corpus is successfully created. Format: projects/{project}/locations/{location}/collections/{collectionId}

JSON representation
{"collectionName":string}

ApiAuth

The generic reusable API auth config. Deprecated. Please use AuthConfig (google/cloud/aiplatform/master/auth.proto) instead.

Fields

auth_config Union type

The auth config. auth_config can be only one of the following:

apiKeyConfig object (ApiKeyConfig)

The API secret.

JSON representation
{
  // auth_config
  "apiKeyConfig": { object (`ApiKeyConfig`) }
  // Union type
}

RagEmbeddingModelConfig

Config for the embedding model to use for RAG.

Fields

model_config Union type

The model config to use. model_config can be only one of the following:

vertexPredictionEndpoint object (VertexPredictionEndpoint)

The Vertex AI Prediction Endpoint that either refers to a publisher model or an endpoint that is hosting a 1P fine-tuned text embedding model. Endpoints hosting non-1P fine-tuned text embedding models are currently not supported. This is used for dense vector search.

hybridSearchConfig object (HybridSearchConfig)

Configuration for hybrid search.

JSON representation
{
  // model_config
  "vertexPredictionEndpoint": { object (`VertexPredictionEndpoint`) },
  "hybridSearchConfig": { object (`HybridSearchConfig`) }
  // Union type
}

VertexPredictionEndpoint

Config representing a model hosted on Vertex Prediction Endpoint.

Fields

endpoint string

Required. The endpoint resource name. Format: projects/{project}/locations/{location}/publishers/{publisher}/models/{model} or projects/{project}/locations/{location}/endpoints/{endpoint}

model string

Output only. The resource name of the model that is deployed on the endpoint. Present only when the endpoint is not a publisher model. Pattern: projects/{project}/locations/{location}/models/{model}

modelVersionId string

Output only. Version ID of the model that is deployed on the endpoint. Present only when the endpoint is not a publisher model.

JSON representation
{"endpoint":string,"model":string,"modelVersionId":string}
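Because the endpoint field accepts two resource-name formats, a client may want to tell them apart before building a request. The following Python sketch does that with two regular expressions derived from the formats listed above. The function name `classify_endpoint` and the sample project, location, and model identifiers are hypothetical.

```python
import re

# Patterns mirror the two documented formats for the endpoint field.
PUBLISHER_MODEL = re.compile(
    r"^projects/[^/]+/locations/[^/]+/publishers/[^/]+/models/[^/]+$"
)
DEPLOYED_ENDPOINT = re.compile(
    r"^projects/[^/]+/locations/[^/]+/endpoints/[^/]+$"
)


def classify_endpoint(name: str) -> str:
    """Return 'publisher_model' or 'endpoint' for a valid resource name."""
    if PUBLISHER_MODEL.match(name):
        return "publisher_model"
    if DEPLOYED_ENDPOINT.match(name):
        return "endpoint"
    raise ValueError(f"unrecognized endpoint resource name: {name!r}")


# Hypothetical resource names used only for illustration.
print(classify_endpoint(
    "projects/my-project/locations/us-central1/publishers/google/models/my-embedding-model"
))
print(classify_endpoint("projects/my-project/locations/us-central1/endpoints/1234567890"))
```

Per the field descriptions, model and modelVersionId would be expected in responses only for the second kind of name.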

HybridSearchConfig

Config for hybrid search.

Fields

sparseEmbeddingConfig object (SparseEmbeddingConfig)

Optional. The configuration for sparse embedding generation. If this field is not set, the default behavior depends on the vector database chosen for the RagCorpus.

denseEmbeddingModelPredictionEndpoint object (VertexPredictionEndpoint)

Required. The Vertex AI Prediction Endpoint that hosts the embedding model for dense embedding generation.

JSON representation
{"sparseEmbeddingConfig":{object (`SparseEmbeddingConfig`)},"denseEmbeddingModelPredictionEndpoint":{object (`VertexPredictionEndpoint`)}}

SparseEmbeddingConfig

Configuration for sparse embedding generation.

Fields

model Union type

The model to use for sparse embedding generation. model can be only one of the following:

bm25 object (Bm25)

Use BM25 scoring algorithm.

JSON representation
{
  // model
  "bm25": { object (`Bm25`) }
  // Union type
}

Bm25

Message for BM25 parameters.

Fields

multilingual boolean

Optional. Use multilingual tokenizer if set to true.

k1 number

Optional. The parameter to control term frequency saturation. It determines the scaling between the matching term frequency and final score. k1 is in the range of [1.2, 3]. The default value is 1.2.

b number

Optional. The parameter to control document length normalization. It determines how much the document length affects the final score. b is in the range of [0, 1]. The default value is 0.75.

JSON representation
{"multilingual":boolean,"k1":number,"b":number}
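To make the roles of k1 and b concrete, the sketch below validates both parameters against the documented ranges and evaluates the term-frequency component of the textbook Okapi BM25 formula. This is an assumption for illustration: the reference does not state the exact scoring formula the service uses. The names `make_bm25` and `bm25_term_weight` are hypothetical.

```python
def make_bm25(multilingual: bool = False, k1: float = 1.2, b: float = 0.75) -> dict:
    """Build a Bm25 message after checking the documented parameter ranges."""
    if not 1.2 <= k1 <= 3:
        raise ValueError("k1 must be in the range [1.2, 3]")
    if not 0 <= b <= 1:
        raise ValueError("b must be in the range [0, 1]")
    return {"multilingual": multilingual, "k1": k1, "b": b}


def bm25_term_weight(tf: float, doc_len: float, avg_doc_len: float,
                     k1: float = 1.2, b: float = 0.75) -> float:
    """Textbook Okapi BM25 term-frequency component, without the IDF factor."""
    # b scales how strongly the document length changes the denominator.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    # The weight grows with tf but never reaches k1 + 1 (saturation).
    return tf * (k1 + 1) / (tf + k1 * length_norm)


print(make_bm25(k1=1.5, b=0.5))
for tf in (1, 5, 50):
    print(tf, bm25_term_weight(tf, doc_len=100, avg_doc_len=100))
```

In this formula a larger k1 lets repeated terms keep adding to the score for longer, and b = 0 removes the effect of document length entirely.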

VertexAiSearchConfig

Config for the Vertex AI Search.

Fields

servingConfig string

Vertex AI Search Serving Config resource full name. For example, projects/{project}/locations/{location}/collections/{collection}/engines/{engine}/servingConfigs/{servingConfig} or projects/{project}/locations/{location}/collections/{collection}/dataStores/{dataStore}/servingConfigs/{servingConfig}.

JSON representation
{"servingConfig":string}

CorpusStatus

RagCorpus status.

Fields

state enum (State)

Output only. RagCorpus life state.

errorStatus string

Output only. Populated only when the state field is ERROR.

JSON representation
{"state":enum (`State`),"errorStatus":string}

State

RagCorpus life state.

Enums
`UNKNOWN`	This state is not supposed to happen.
`INITIALIZED`	RagCorpus resource entry is initialized, but hasn't done validation.
`ACTIVE`	RagCorpus is provisioned successfully and is ready to serve.
`ERROR`	RagCorpus is in a problematic situation. See the `errorStatus` field for details.
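A client that polls a RagCorpus might map these states to an action. The following Python sketch takes a corpusStatus object as a dictionary and returns a short description, reading the errorStatus field defined under CorpusStatus when the state is ERROR. The function name `describe_corpus_status` and the sample error text are hypothetical.

```python
def describe_corpus_status(corpus_status: dict) -> str:
    """Summarize a CorpusStatus dictionary using the documented State values."""
    state = corpus_status.get("state", "UNKNOWN")
    if state == "ACTIVE":
        return "ready to serve"
    if state == "INITIALIZED":
        return "initialized, validation not done yet"
    if state == "ERROR":
        detail = corpus_status.get("errorStatus", "no details provided")
        return f"error: {detail}"
    # UNKNOWN is not supposed to happen; unrecognized values land here too.
    return "unknown state"


print(describe_corpus_status({"state": "ACTIVE"}))
print(describe_corpus_status({"state": "ERROR", "errorStatus": "sample error text"}))
```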

CorpusTypeConfig

The config for the corpus type of the RagCorpus.

Fields

corpus_type_config Union type

Optional. Whether the RagCorpus is used as a document store or a memory store. corpus_type_config can be only one of the following:

documentCorpus object (DocumentCorpus)

Optional. Config for the document corpus.

memoryCorpus object (MemoryCorpus)

Optional. Config for the memory corpus.

JSON representation
{
  // corpus_type_config
  "documentCorpus": { object (`DocumentCorpus`) },
  "memoryCorpus": { object (`MemoryCorpus`) }
  // Union type
}

DocumentCorpus

Config for the document corpus.

This type has no fields.

MemoryCorpus

Config for the memory corpus.

Fields

llmParser object (LlmParser)

The LLM parser to use for the memory corpus.

JSON representation
{"llmParser":{object (`LlmParser`)}}

LlmParser

Specifies the LLM parsing for RagFiles.

Fields

modelName string

The name of an LLM model used for parsing. Format: projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}

maxParsingRequestsPerMin integer

The maximum number of requests the job is allowed to make to the LLM model per minute. Consult https://cloud.google.com/vertex-ai/generative-ai/docs/quotas and your document size to set an appropriate value here. If unspecified, a default value of 5000 QPM is used.

globalMaxParsingRequestsPerMin integer

The maximum number of requests the job is allowed to make to the LLM model per minute in this project. Consult https://cloud.google.com/vertex-ai/generative-ai/docs/quotas and your document size to set an appropriate value here. If this value is not specified, maxParsingRequestsPerMin is used by the indexing pipeline job as the global limit.

customParsingPrompt string

The prompt to use for parsing. If not specified, a default prompt will be used.

JSON representation
{"modelName":string,"maxParsingRequestsPerMin":integer,"globalMaxParsingRequestsPerMin":integer,"customParsingPrompt":string}
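The fallback rules for the two rate limits can be sketched as follows: an unspecified maxParsingRequestsPerMin falls back to 5000, and an unspecified globalMaxParsingRequestsPerMin falls back to the per-job value. The function name `effective_parsing_limits` is hypothetical, and the sketch assumes that "unspecified" means the key is absent from the dictionary.

```python
from typing import Tuple

# Documented default when maxParsingRequestsPerMin is unspecified.
DEFAULT_REQUESTS_PER_MIN = 5000


def effective_parsing_limits(llm_parser: dict) -> Tuple[int, int]:
    """Return (per-job limit, global limit) after applying the documented fallbacks."""
    per_job = llm_parser.get("maxParsingRequestsPerMin", DEFAULT_REQUESTS_PER_MIN)
    # Without an explicit global limit, the per-job limit is used as the global one.
    global_limit = llm_parser.get("globalMaxParsingRequestsPerMin", per_job)
    return per_job, global_limit


print(effective_parsing_limits({}))
print(effective_parsing_limits({"maxParsingRequestsPerMin": 120}))
print(effective_parsing_limits(
    {"maxParsingRequestsPerMin": 120, "globalMaxParsingRequestsPerMin": 600}
))
```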

Methods
`create`	Creates a RagCorpus.
`delete`	Deletes a RagCorpus.
`get`	Gets a RagCorpus.
`list`	Lists RagCorpora in a Location.
`patch`	Updates a RagCorpus.
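As a rough sketch of how these five methods could map onto HTTP requests, the Python helper below builds a verb and path from the projects.locations.ragCorpora resource hierarchy. The HTTP verbs follow common REST conventions and are an assumption, as is the `API_VERSION` value; neither is stated on this page. The function name `rag_corpora_request` and the sample identifiers are hypothetical, and no network call is made.

```python
from typing import Optional, Tuple

API_VERSION = "v1beta1"  # assumption: the API version is not stated on this page

# Assumed verbs for methods that act on a single, named RagCorpus.
_ITEM_VERBS = {"get": "GET", "delete": "DELETE", "patch": "PATCH"}


def rag_corpora_request(method: str, project: str, location: str,
                        corpus_id: Optional[str] = None) -> Tuple[str, str]:
    """Return an (HTTP verb, path) pair for a ragCorpora method."""
    collection = f"projects/{project}/locations/{location}/ragCorpora"
    if method == "create":
        return "POST", f"/{API_VERSION}/{collection}"
    if method == "list":
        return "GET", f"/{API_VERSION}/{collection}"
    if method not in _ITEM_VERBS:
        raise ValueError(f"unknown method: {method!r}")
    if corpus_id is None:
        raise ValueError(f"{method} needs a corpus_id")
    return _ITEM_VERBS[method], f"/{API_VERSION}/{collection}/{corpus_id}"


print(rag_corpora_request("create", "my-project", "us-central1"))
print(rag_corpora_request("get", "my-project", "us-central1", "1234"))
```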

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-17 UTC.
