Vertex AI GenAI API

Service: aiplatform.googleapis.com

To call this service, we recommend that you use the Google-providedclient libraries. If your application needs to use your own libraries to call this service, use the following information when you make the API requests.

Discovery document

ADiscovery Document is a machine-readable specification for describing and consuming REST APIs. It is used to build client libraries, IDE plugins, and other tools that interact with Google APIs. One service may provide multiple discovery documents. This service provides the following discovery documents:

Service endpoint

Aservice endpoint is a base URL that specifies the network address of an API service. One service might have multiple service endpoints. This service has the following service endpoint and all URIs below are relative to this service endpoint:

  • https://aiplatform.googleapis.com

REST Resource:v1.media

Methods
uploadPOST /v1/{parent}/ragFiles:upload
POST /upload/v1/{parent}/ragFiles:upload
Upload a file into a RagCorpus.

REST Resource:v1.projects

Methods
getCacheConfigGET /v1/{name}
Gets a GenAI cache config.
updateCacheConfigPATCH /v1/{cacheConfig.name}
Updates a cache config.

REST Resource:v1.projects.locations

Methods
augmentPromptPOST /v1/{parent}:augmentPrompt
Given an input prompt, it returns augmented prompt from vertex rag store to guide LLM towards generating grounded responses.
corroborateContentPOST /v1/{parent}:corroborateContent
Given an input text, it returns a score that evaluates the factuality of the text.
evaluateDatasetPOST /v1/{location}:evaluateDataset
Evaluates a dataset based on a set of given metrics.
evaluateInstancesPOST /v1/{location}:evaluateInstances
Evaluates instances based on a given metric.
generateInstanceRubricsPOST /v1/{location}:generateInstanceRubrics
Generates rubrics for a given prompt.
generateSyntheticDataPOST /v1/{location}:generateSyntheticData
Generates synthetic data based on the provided configuration.
getRagEngineConfigGET /v1/{name}
Gets a RagEngineConfig.
retrieveContextsPOST /v1/{parent}:retrieveContexts
Retrieves relevant contexts for a query.
updateRagEngineConfigPATCH /v1/{ragEngineConfig.name}
Updates a RagEngineConfig.

REST Resource:v1.projects.locations.cachedContents

Methods
createPOST /v1/{parent}/cachedContents
Creates cached content, this call will initialize the cached content in the data storage, and users need to pay for the cache data storage.
deleteDELETE /v1/{name}
Deletes cached content
getGET /v1/{name}
Gets cached content configurations
listGET /v1/{parent}/cachedContents
Lists cached contents in a project
patchPATCH /v1/{cachedContent.name}
Updates cached content configurations

REST Resource:v1.projects.locations.endpoints

Methods
computeTokensPOST /v1/{endpoint}:computeTokens
Return a list of tokens based on the input text.
countTokensPOST /v1/{endpoint}:countTokens
Perform a token counting.
fetchPredictOperationPOST /v1/{endpoint}:fetchPredictOperation
Fetch an asynchronous online prediction operation.
generateContentPOST /v1/{model}:generateContent
Generate content with multimodal inputs.
predictPOST /v1/{endpoint}:predict
Request message for running inference on Google's generative AI models on Vertex AI.
predictLongRunningPOST /v1/{endpoint}:predictLongRunning
rawPredictPOST /v1/{endpoint}:rawPredict
Perform an online prediction with an arbitrary HTTP payload.
serverStreamingPredictPOST /v1/{endpoint}:serverStreamingPredict
Perform a server-side streaming online prediction request for Vertex LLM streaming.
streamGenerateContentPOST /v1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.
streamRawPredictPOST /v1/{endpoint}:streamRawPredict
Perform a streaming online prediction with an arbitrary HTTP payload.

REST Resource:v1.projects.locations.endpoints.chat

Methods
completionsPOST /v1/{endpoint}/chat/completions
Exposes an OpenAI-compatible endpoint for chat completions.

REST Resource:v1.projects.locations.endpoints.deployedModels.invoke

Methods
invokePOST /v1/{endpoint}/deployedModels/{deployedModelId}/invoke/**
Forwards arbitrary HTTP requests for both streaming and non-streaming cases.

REST Resource:v1.projects.locations.endpoints.google.science

Methods
inferencePOST /v1/{endpoint}/science/inference
Forwards arbitrary HTTP requests for both streaming and non-streaming cases.

REST Resource:v1.projects.locations.endpoints.invoke

Methods
invokePOST /v1/{endpoint}/invoke/**
Forwards arbitrary HTTP requests for both streaming and non-streaming cases.

REST Resource:v1.projects.locations.endpoints.openapi

Methods
embeddingsPOST /v1/{endpoint}/embeddings
Forwards arbitrary HTTP requests for both streaming and non-streaming cases.

REST Resource:v1.projects.locations.evaluationItems

Methods
createPOST /v1/{parent}/evaluationItems
Creates an Evaluation Item.
deleteDELETE /v1/{name}
Deletes an Evaluation Item.
getGET /v1/{name}
Gets an Evaluation Item.
listGET /v1/{parent}/evaluationItems
Lists Evaluation Items.

REST Resource:v1.projects.locations.evaluationRuns

Methods
cancelPOST /v1/{name}:cancel
Cancels an Evaluation Run.
createPOST /v1/{parent}/evaluationRuns
Creates an Evaluation Run.
deleteDELETE /v1/{name}
Deletes an Evaluation Run.
getGET /v1/{name}
Gets an Evaluation Run.
listGET /v1/{parent}/evaluationRuns
Lists Evaluation Runs.

REST Resource:v1.projects.locations.evaluationSets

Methods
createPOST /v1/{parent}/evaluationSets
Creates an Evaluation Set.
deleteDELETE /v1/{name}
Deletes an Evaluation Set.
getGET /v1/{name}
Gets an Evaluation Set.
listGET /v1/{parent}/evaluationSets
Lists Evaluation Sets.
patchPATCH /v1/{evaluationSet.name}
Updates an Evaluation Set.

REST Resource:v1.projects.locations.models

Methods
getIamPolicyPOST /v1/{resource}:getIamPolicy
Gets the access control policy for a resource.
setIamPolicyPOST /v1/{resource}:setIamPolicy
Sets the access control policy on the specified resource.
testIamPermissionsPOST /v1/{resource}:testIamPermissions
Returns permissions that a caller has on the specified resource.

REST Resource:v1.projects.locations.operations

Methods
cancelPOST /v1/{name}:cancel
Starts asynchronous cancellation on a long-running operation.
deleteDELETE /v1/{name}
Deletes a long-running operation.
getGET /v1/{name}
Gets the latest state of a long-running operation.
listGET /v1/{name}/operations
Lists operations that match the specified filter in the request.
waitPOST /v1/{name}:wait
Waits until the specified long-running operation is done or reaches at most a specified timeout, returning the latest state.

REST Resource:v1.projects.locations.publishers.models

Methods
computeTokensPOST /v1/{endpoint}:computeTokens
Return a list of tokens based on the input text.
countTokensPOST /v1/{endpoint}:countTokens
Perform a token counting.
embedContentPOST /v1/{model}:embedContent
Embed content with multimodal inputs.
fetchPredictOperationPOST /v1/{endpoint}:fetchPredictOperation
Fetch an asynchronous online prediction operation.
generateContentPOST /v1/{model}:generateContent
Generate content with multimodal inputs.
predictPOST /v1/{endpoint}:predict
Request message for running inference on Google's generative AI models on Vertex AI.
predictLongRunningPOST /v1/{endpoint}:predictLongRunning
rawPredictPOST /v1/{endpoint}:rawPredict
Perform an online prediction with an arbitrary HTTP payload.
serverStreamingPredictPOST /v1/{endpoint}:serverStreamingPredict
Perform a server-side streaming online prediction request for Vertex LLM streaming.
streamGenerateContentPOST /v1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.
streamRawPredictPOST /v1/{endpoint}:streamRawPredict
Perform a streaming online prediction with an arbitrary HTTP payload.

REST Resource:v1.projects.locations.ragCorpora

Methods
createPOST /v1/{parent}/ragCorpora
Creates a RagCorpus.
deleteDELETE /v1/{name}
Deletes a RagCorpus.
getGET /v1/{name}
Gets a RagCorpus.
listGET /v1/{parent}/ragCorpora
Lists RagCorpora in a Location.
patchPATCH /v1/{ragCorpus.name}
Updates a RagCorpus.

REST Resource:v1.projects.locations.ragCorpora.ragFiles

Methods
deleteDELETE /v1/{name}
Deletes a RagFile.
getGET /v1/{name}
Gets a RagFile.
importPOST /v1/{parent}/ragFiles:import
Import files from Google Cloud Storage or Google Drive into a RagCorpus.
listGET /v1/{parent}/ragFiles
Lists RagFiles in a RagCorpus.

REST Resource:v1.projects.locations.reasoningEngines

Methods
createPOST /v1/{parent}/reasoningEngines
Creates a reasoning engine.
deleteDELETE /v1/{name}
Deletes a reasoning engine.
getGET /v1/{name}
Gets a reasoning engine.
listGET /v1/{parent}/reasoningEngines
Lists reasoning engines in a location.
patchPATCH /v1/{reasoningEngine.name}
Updates a reasoning engine.
queryPOST /v1/{name}:query
Queries using a reasoning engine.
streamQueryPOST /v1/{name}:streamQuery
Streams queries using a reasoning engine.

REST Resource:v1.projects.locations.tuningJobs

Methods
cancelPOST /v1/{name}:cancel
Cancels a TuningJob.
createPOST /v1/{parent}/tuningJobs
Creates a TuningJob.
getGET /v1/{name}
Gets a TuningJob.
listGET /v1/{parent}/tuningJobs
Lists TuningJobs in a Location.
rebaseTunedModelPOST /v1/{parent}/tuningJobs:rebaseTunedModel
Rebase a TunedModel.

REST Resource:v1beta1.media

Methods
uploadPOST /v1beta1/{parent}/ragFiles:upload
POST /upload/v1beta1/{parent}/ragFiles:upload
Upload a file into a RagCorpus.

REST Resource:v1beta1.projects

Methods
getCacheConfigGET /v1beta1/{name}
Gets a GenAI cache config.
updateCacheConfigPATCH /v1beta1/{cacheConfig.name}
Updates a cache config.

REST Resource:v1beta1.projects.locations

Methods
augmentPromptPOST /v1beta1/{parent}:augmentPrompt
Given an input prompt, it returns augmented prompt from vertex rag store to guide LLM towards generating grounded responses.
corroborateContentPOST /v1beta1/{parent}:corroborateContent
Given an input text, it returns a score that evaluates the factuality of the text.
evaluateDatasetPOST /v1beta1/{location}:evaluateDataset
Evaluates a dataset based on a set of given metrics.
evaluateInstancesPOST /v1beta1/{location}:evaluateInstances
Evaluates instances based on a given metric.
generateInstanceRubricsPOST /v1beta1/{location}:generateInstanceRubrics
Generates rubrics for a given prompt.
generateSyntheticDataPOST /v1beta1/{location}:generateSyntheticData
Generates synthetic data based on the provided configuration.
getRagEngineConfigGET /v1beta1/{name}
Gets a RagEngineConfig.
retrieveContextsPOST /v1beta1/{parent}:retrieveContexts
Retrieves relevant contexts for a query.
updateRagEngineConfigPATCH /v1beta1/{ragEngineConfig.name}
Updates a RagEngineConfig.

REST Resource:v1beta1.projects.locations.cachedContents

Methods
createPOST /v1beta1/{parent}/cachedContents
Creates cached content, this call will initialize the cached content in the data storage, and users need to pay for the cache data storage.
deleteDELETE /v1beta1/{name}
Deletes cached content
getGET /v1beta1/{name}
Gets cached content configurations
listGET /v1beta1/{parent}/cachedContents
Lists cached contents in a project
patchPATCH /v1beta1/{cachedContent.name}
Updates cached content configurations

REST Resource:v1beta1.projects.locations.endpoints

Methods
computeTokensPOST /v1beta1/{endpoint}:computeTokens
Return a list of tokens based on the input text.
countTokensPOST /v1beta1/{endpoint}:countTokens
Perform a token counting.
fetchPredictOperationPOST /v1beta1/{endpoint}:fetchPredictOperation
Fetch an asynchronous online prediction operation.
generateContentPOST /v1beta1/{model}:generateContent
Generate content with multimodal inputs.
getIamPolicyPOST /v1beta1/{resource}:getIamPolicy
Gets the access control policy for a resource.
predictPOST /v1beta1/{endpoint}:predict
Request message for running inference on Google's generative AI models on Vertex AI.
predictLongRunningPOST /v1beta1/{endpoint}:predictLongRunning
rawPredictPOST /v1beta1/{endpoint}:rawPredict
Perform an online prediction with an arbitrary HTTP payload.
serverStreamingPredictPOST /v1beta1/{endpoint}:serverStreamingPredict
Perform a server-side streaming online prediction request for Vertex LLM streaming.
setIamPolicyPOST /v1beta1/{resource}:setIamPolicy
Sets the access control policy on the specified resource.
streamGenerateContentPOST /v1beta1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.
streamRawPredictPOST /v1beta1/{endpoint}:streamRawPredict
Perform a streaming online prediction with an arbitrary HTTP payload.
testIamPermissionsPOST /v1beta1/{resource}:testIamPermissions
Returns permissions that a caller has on the specified resource.

REST Resource:v1beta1.projects.locations.endpoints.chat

Methods
completionsPOST /v1beta1/{endpoint}/chat/completions
Exposes an OpenAI-compatible endpoint for chat completions.

REST Resource:v1beta1.projects.locations.endpoints.deployedModels.invoke

Methods
invokePOST /v1beta1/{endpoint}/deployedModels/{deployedModelId}/invoke/**
Forwards arbitrary HTTP requests for both streaming and non-streaming cases.

REST Resource:v1beta1.projects.locations.endpoints.google.science

Methods
inferencePOST /v1beta1/{endpoint}/science/inference
Forwards arbitrary HTTP requests for both streaming and non-streaming cases.

REST Resource:v1beta1.projects.locations.endpoints.invoke

Methods
invokePOST /v1beta1/{endpoint}/invoke/**
Forwards arbitrary HTTP requests for both streaming and non-streaming cases.

REST Resource:v1beta1.projects.locations.endpoints.openapi

Methods
embeddingsPOST /v1beta1/{endpoint}/embeddings
Forwards arbitrary HTTP requests for both streaming and non-streaming cases.

REST Resource:v1beta1.projects.locations.evaluationItems

Methods
createPOST /v1beta1/{parent}/evaluationItems
Creates an Evaluation Item.
deleteDELETE /v1beta1/{name}
Deletes an Evaluation Item.
getGET /v1beta1/{name}
Gets an Evaluation Item.
listGET /v1beta1/{parent}/evaluationItems
Lists Evaluation Items.

REST Resource:v1beta1.projects.locations.evaluationRuns

Methods
cancelPOST /v1beta1/{name}:cancel
Cancels an Evaluation Run.
createPOST /v1beta1/{parent}/evaluationRuns
Creates an Evaluation Run.
deleteDELETE /v1beta1/{name}
Deletes an Evaluation Run.
getGET /v1beta1/{name}
Gets an Evaluation Run.
listGET /v1beta1/{parent}/evaluationRuns
Lists Evaluation Runs.

REST Resource:v1beta1.projects.locations.evaluationSets

Methods
createPOST /v1beta1/{parent}/evaluationSets
Creates an Evaluation Set.
deleteDELETE /v1beta1/{name}
Deletes an Evaluation Set.
getGET /v1beta1/{name}
Gets an Evaluation Set.
listGET /v1beta1/{parent}/evaluationSets
Lists Evaluation Sets.
patchPATCH /v1beta1/{evaluationSet.name}
Updates an Evaluation Set.

REST Resource:v1beta1.projects.locations.extensions

Methods
deleteDELETE /v1beta1/{name}
Deletes an Extension.
executePOST /v1beta1/{name}:execute
Executes the request against a given extension.
getGET /v1beta1/{name}
Gets an Extension.
importPOST /v1beta1/{parent}/extensions:import
Imports an Extension.
listGET /v1beta1/{parent}/extensions
Lists Extensions in a location.
patchPATCH /v1beta1/{extension.name}
Updates an Extension.
queryPOST /v1beta1/{name}:query
Queries an extension with a default controller.

REST Resource:v1beta1.projects.locations.models

Methods
getIamPolicyPOST /v1beta1/{resource}:getIamPolicy
Gets the access control policy for a resource.
setIamPolicyPOST /v1beta1/{resource}:setIamPolicy
Sets the access control policy on the specified resource.
testIamPermissionsPOST /v1beta1/{resource}:testIamPermissions
Returns permissions that a caller has on the specified resource.

REST Resource:v1beta1.projects.locations.operations

Methods
cancelPOST /v1beta1/{name}:cancel
Starts asynchronous cancellation on a long-running operation.
deleteDELETE /v1beta1/{name}
Deletes a long-running operation.
getGET /v1beta1/{name}
Gets the latest state of a long-running operation.
listGET /v1beta1/{name}/operations
Lists operations that match the specified filter in the request.
waitPOST /v1beta1/{name}:wait
Waits until the specified long-running operation is done or reaches at most a specified timeout, returning the latest state.

REST Resource:v1beta1.projects.locations.publishers

Methods
getIamPolicyPOST /v1beta1/{resource}:getIamPolicy
Gets the access control policy for a resource.

REST Resource:v1beta1.projects.locations.publishers.models

Methods
computeTokensPOST /v1beta1/{endpoint}:computeTokens
Return a list of tokens based on the input text.
countTokensPOST /v1beta1/{endpoint}:countTokens
Perform a token counting.
embedContentPOST /v1beta1/{model}:embedContent
Embed content with multimodal inputs.
fetchPredictOperationPOST /v1beta1/{endpoint}:fetchPredictOperation
Fetch an asynchronous online prediction operation.
generateContentPOST /v1beta1/{model}:generateContent
Generate content with multimodal inputs.
getIamPolicyPOST /v1beta1/{resource}:getIamPolicy
Gets the access control policy for a resource.
predictPOST /v1beta1/{endpoint}:predict
Request message for running inference on Google's generative AI models on Vertex AI.
predictLongRunningPOST /v1beta1/{endpoint}:predictLongRunning
rawPredictPOST /v1beta1/{endpoint}:rawPredict
Perform an online prediction with an arbitrary HTTP payload.
serverStreamingPredictPOST /v1beta1/{endpoint}:serverStreamingPredict
Perform a server-side streaming online prediction request for Vertex LLM streaming.
streamGenerateContentPOST /v1beta1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.
streamRawPredictPOST /v1beta1/{endpoint}:streamRawPredict
Perform a streaming online prediction with an arbitrary HTTP payload.

REST Resource:v1beta1.projects.locations.ragCorpora

Methods
createPOST /v1beta1/{parent}/ragCorpora
Creates a RagCorpus.
deleteDELETE /v1beta1/{name}
Deletes a RagCorpus.
getGET /v1beta1/{name}
Gets a RagCorpus.
listGET /v1beta1/{parent}/ragCorpora
Lists RagCorpora in a Location.
patchPATCH /v1beta1/{ragCorpus.name}
Updates a RagCorpus.

REST Resource:v1beta1.projects.locations.ragCorpora.ragFiles

Methods
deleteDELETE /v1beta1/{name}
Deletes a RagFile.
getGET /v1beta1/{name}
Gets a RagFile.
importPOST /v1beta1/{parent}/ragFiles:import
Import files from Google Cloud Storage or Google Drive into a RagCorpus.
listGET /v1beta1/{parent}/ragFiles
Lists RagFiles in a RagCorpus.

REST Resource:v1beta1.projects.locations.reasoningEngines

Methods
createPOST /v1beta1/{parent}/reasoningEngines
Creates a reasoning engine.
deleteDELETE /v1beta1/{name}
Deletes a reasoning engine.
getGET /v1beta1/{name}
Gets a reasoning engine.
listGET /v1beta1/{parent}/reasoningEngines
Lists reasoning engines in a location.
patchPATCH /v1beta1/{reasoningEngine.name}
Updates a reasoning engine.
queryPOST /v1beta1/{name}:query
Queries using a reasoning engine.
streamQueryPOST /v1beta1/{name}:streamQuery
Streams queries using a reasoning engine.

REST Resource:v1beta1.projects.locations.reasoningEngines.a2a.v1

Methods
cardGET /v1beta1/{name}/a2a/{a2aEndpoint}
Get request for reasoning engine instance via the A2A get protocol apis.

REST Resource:v1beta1.projects.locations.reasoningEngines.a2a.v1.message

Methods
sendPOST /v1beta1/{name}/a2a/{a2aEndpoint}:send
Send post request for reasoning engine instance via the A2A post protocol apis.
streamPOST /v1beta1/{name}/a2a/{a2aEndpoint}:stream
Streams queries using a reasoning engine instance via the A2A streaming protocol apis.

REST Resource:v1beta1.projects.locations.reasoningEngines.a2a.v1.tasks

Methods
a2aGetReasoningEngineGET /v1beta1/{name}/a2a/{a2aEndpoint}
Get request for reasoning engine instance via the A2A get protocol apis.
cancelPOST /v1beta1/{name}/a2a/{a2aEndpoint}:cancel
Send post request for reasoning engine instance via the A2A post protocol apis.
pushNotificationConfigsGET /v1beta1/{name}/a2a/{a2aEndpoint}
Get request for reasoning engine instance via the A2A get protocol apis.
subscribeGET /v1beta1/{name}/a2a/{a2aEndpoint}:subscribe
Stream get request for reasoning engine instance via the A2A stream get protocol apis.

REST Resource:v1beta1.projects.locations.reasoningEngines.a2a.v1.tasks.pushNotificationConfigs

Methods
a2aGetReasoningEngineGET /v1beta1/{name}/a2a/{a2aEndpoint}
Get request for reasoning engine instance via the A2A get protocol apis.

REST Resource:v1beta1.projects.locations.reasoningEngines.memories

Methods
createPOST /v1beta1/{parent}/memories
Create a Memory.
deleteDELETE /v1beta1/{name}
Delete a Memory.
generatePOST /v1beta1/{parent}/memories:generate
Generate memories.
getGET /v1beta1/{name}
Get a Memory.
listGET /v1beta1/{parent}/memories
List Memories.
patchPATCH /v1beta1/{memory.name}
Update a Memory.
purgePOST /v1beta1/{parent}/memories:purge
Purge memories.
retrievePOST /v1beta1/{parent}/memories:retrieve
Retrieve memories.
rollbackPOST /v1beta1/{name}:rollback
Rollback Memory to a specific revision.

REST Resource:v1beta1.projects.locations.reasoningEngines.memories.revisions

Methods
getGET /v1beta1/{name}
Get a Memory Revision.
listGET /v1beta1/{parent}/revisions
List Memory Revisions for a Memory.

REST Resource:v1beta1.projects.locations.reasoningEngines.sessions

Methods
appendEventPOST /v1beta1/{name}:appendEvent
Appends an event to a given session.
createPOST /v1beta1/{parent}/sessions
Creates a newSession.
deleteDELETE /v1beta1/{name}
Deletes details of the specificSession.
getGET /v1beta1/{name}
Gets details of the specificSession.
listGET /v1beta1/{parent}/sessions
ListsSessions in a given reasoning engine.
patchPATCH /v1beta1/{session.name}
Updates the specificSession.

REST Resource:v1beta1.projects.locations.reasoningEngines.sessions.events

Methods
listGET /v1beta1/{parent}/events
ListsEvents in a given session.

REST Resource:v1beta1.projects.locations.tuningJobs

Methods
cancelPOST /v1beta1/{name}:cancel
Cancels a TuningJob.
createPOST /v1beta1/{parent}/tuningJobs
Creates a TuningJob.
getGET /v1beta1/{name}
Gets a TuningJob.
listGET /v1beta1/{parent}/tuningJobs
Lists TuningJobs in a Location.
optimizePromptPOST /v1beta1/{parent}/tuningJobs:optimizePrompt
Optimizes a prompt.
rebaseTunedModelPOST /v1beta1/{parent}/tuningJobs:rebaseTunedModel
Rebase a TunedModel.

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-11-18 UTC.