Vertex AI release notes Stay organized with collections Save and categorize content based on your preferences.
You can see the latest product updates for all of Google Cloud on the Google Cloud page, browse and filter all release notes in theGoogle Cloud console, or programmatically access release notes inBigQuery.
To get the latest product updates delivered to you, add the URL of this page to yourfeed reader, or add thefeed URL directly.
December 17, 2025
Cloud API Registry is availablein the Google Cloud console inPreview. UseCloud API Registry in the Google Cloud console to view and manage the MCPservers and tools your agent has access to.
Gemini 3 Flash
Gemini 3 Flash is now available in public preview. This model is designed totackle the most challenging agentic problems with strong coding andstate-of-the-art reasoning capabilities, and is our best model for complexmultimodal understanding.
For more information, seeGemini 3Flash.
December 16, 2025
Vertex AI Agent Engine
Vertex AI Agent Engine is now available in the following regions:
europe-west6(Zurich)europe-west8(Milan)asia-east2(Hong Kong)asia-northeast3(Seoul)asia-southeast2(Jakarta)northamerica-northeast2(Toronto)southamerica-east1(São Paulo)
For more information, seeVertex AI Agent Builder locations.
Vertex AI Agent Engine
Vertex AI Agent EngineSessionsandMemory Bank are nowGenerally Available.
Updated pricing forVertex AI Agent Engine:
Pricing for Vertex AI Agent Engine Runtime was lowered.
OnJanuary 28, 2026, Sessions, Memory Bank, and Code Execution will begincharging for usage.
For more information, seePricing.
December 12, 2025
Gemini 2.5 Flash with Gemini Live API Native Audio
Gemini 2.5 Flash with Gemini Live API Native Audio (gemini-live-2.5-flash-native-audio) is Generally Available (GA).This model features cutting-edge native audio functionality forGemini Live API, including enhanced voice quality and adaptability, Proactive Audio, and Affective Dialog.
December 10, 2025
DeepSeek-V3.2 is available in Model Garden.DeepSeek-V3.2 is a state-of-the-art large language model fromDeepSeek.DeepSeek-V3.2 is available as a managed API in Model Garden. To learn more, seeDeepSeek-V3.2.
December 09, 2025
The following models are available through Model Garden:
December 08, 2025
Veo 3.1 video extension
Veo 3.1 supports video extension in Preview.
For more information, see the following:
December 02, 2025
The following models are available through Model Garden:
The Vertex AI Model Garden model co-hosting vLLM container is available to use withthis sample notebook. You can use this container to serve multiple replicas of a model and serve multiple models with dynamic loading and unloading. This allows you to maximize resource utilization and serving efficiency, and flexibly adjust the models to serve.
November 24, 2025
Anthropic's Claude Opus 4.5
Claude Opus 4.5is available in Model Garden.
November 17, 2025
Veo video generation
Veo 3.1 is Generally Available, and introduces the following models:
For more information, see the following:
LearnLM in Gemini
The LearnLM model is no longer a separate offering or listing on AI Studio asLearnLM capabilities have been integrated into the latest Gemini models (starting with Gemini 2.5).
Built in collaboration with experts in education,LearnLM represents ourcapabilities fine-tuned for learning informed by rigorous research. Theseadvancements and improvements are available directly in Gemini, enhancingeducational experiences and applications.
Pre-existing learnlm-2.0-flash-experimental projects will not remain functionalpast December 3, 2025 unless an alternative model is manually selected—weencourage developers to switch to the latest Gemini models and optimize theirprompts by reviewing ourLearnLM Partner Prompt Guide.
November 13, 2025
Updated Prompt Caching for Anthropic Claude Models
Prompt caching for Anthropic Claude models now supports a one-hour Time To Live (TTL).
For more information, seePrompt caching.
Kimi K2 Thinking is available in Model Garden. This model isa thinking model that excels at complex problem-solving and deep reasoning.Kimi K2 Thinking is available as a managed API in Model Garden. To learn more, seeKimi K2 Thinking.
November 11, 2025
Anthropic's Claude 3.7 Sonnet
Anthropic's Claude 3.7 Sonnet is deprecated as of November 11, 2025 and will beshut down on May 11, 2026. For more information, seePartner model deprecations.
November 07, 2025
Vertex AI Agent Engine
The following features are now available inPreview:
Configure, manage, and viewobservabilityfeaturessuch as sessions, traces, logs, and events for your agent in the Google Cloud console.
Use theplaygroundto test and interact with your agent in the Google Cloud console.
Evaluate your agents using theGen AI evaluation service's GenAI Client in Vertex AI SDK.
Create and manage memoryrevisions for Memory Bank.
Use Identity Access Management (IAM) to create anagentidentity to manage access andauthentication when using agents on Vertex AI Agent Engine Runtime.
The following features are now available inGA:
Express mode support forVertex AI Agent Engine Runtime.
Use the new free tier with Vertex AI Agent Engine Runtime. For moreinformation, seePricing.
November 04, 2025
MiniMax M2 is available in Model Garden. This model isis built for end-to-end development workflows and has strong capabilitiesin planning and executing complex tool-calling tasks. The model isoptimized to provide a balance of performance, cost, and inference speed.MiniMax M2 is available as a managed API in Model Garden. To learn more, seeMiniMax M2.
October 23, 2025
The following models are available through Model Garden:
October 21, 2025
On September 23, 2025, we discovered a technical issue inthe Vertex AI API that resulted in a limited amount of responsesbeing misrouted between recipients for certain third-party modelswhen using streaming requests. This issue is now resolved.Google models, e.g. Gemini, werenot impacted.
Some internal proxies did not properly handle HTTP requests thathave anExpect: 100-continue header, resulting ina desynchronization in a streaming response connection, wherea response intended for one request was instead delivered asthe response for a subsequent request.
For more information, seeSecurity bulletins.
October 16, 2025
Mistral's Codestral 2
You can use Mistral'sCodestral 2in Model Garden.
October 15, 2025
Anthropic's Claude Haiku 4.5
You can use Anthropic'sClaude Haiku 4.5in Model Garden.
Veo video generation
Veo 2 supports adding and removing objects from videos in Preview.
For more information about Veo 2, seeVeo 2Preview
For more information about adding and removing objects, see the following:
Veo video generation
Veo 3.1 is available in Preview, and introduces the following models:
For more information, see the following:
October 14, 2025
Imagen 4 preview models
The following Imagen 4 preview models will be removed onNovember 30, 2025 :imagen-4.0-generate-preview-06-06,imagen-4.0-ultra-generate-preview-06-06, andimagen-4.0-fast-generate-preview-06-06. To avoid servicedisruption, migrate all workflows that use Imagen 4 preview models beforeNovember 30, 2025 , 2025, to the following Imagen 4 GenerallyAvailable models:imagen-4.0-generate-001,imagen-4.0-ultra-generate-001,imagen-4.0-fast-generate-001.
Imagen subject and style fine-tuning
Imagen subject model and style model tuning will be removed onDecember 31, 2025. We recommend that you useGemini 2.5 Flash Image, which supports most use cases that requirefine-tuning. For more information, see Edit images withGemini.
October 09, 2025
Imagen
Imagen's virtual try-on model,virtual-try-on-preview-08-04was updated on September 30, 2025, to more accurately preserve the person'sbody shape and preserve the garment's identity.
October 07, 2025
The following Qwen models are available inModel Garden:
- Qwen-Image
- Qwen-Image-Edit
- Qwen-Image-Edit-2509
Save and share prompts in Vertex AI Studio: You can now save and share prompts in Vertex AI Studio. Sharing prompts lets you collaborate with team members, ensure consistency, and build a library of effective prompts for various tasks. For more information, seeSave and share prompts.
TheGemini 2.5 Computer Use model and tool (gemini-2.5-computer-use-preview-10-2025) is now available in Preview. The Computer Use model and tool lets you enable your applications to interact with and automate tasks in the browser. With the Computer Use model and tool, you can build agents that can:
Automate repetitive data entry or form filling on websites.
Navigate websites to gather information.
Assist users by performing sequences of actions in web applications.
October 06, 2025
Updated pricing for Vertex AI Agent Engine: Starting onNovember 6, 2025, Vertex AI Agent Engine Runtime will start charging for runtime usage for the following regions:
asia-southeast1(Singapore)australia-southeast2(Melbourne)europe-west2(London)europe-west3(Frankfurt)europe-west4(Netherlands)
For more details, seePricing for Vertex AI Agent Engine.
Access Transparency for Vertex AI Agent Engine: Access Transparency is now available for Vertex AI Agent Engine. For more information, see the overview forEnterprise security.
October 03, 2025
Prompt management
Vertex AI offers tooling to help manage prompts and prompt versions. In addition to the prompt management capabilities in Vertex AI Studio, prompts can be stored and versioned using the Vertex AI SDK.
For more information, see thePrompt management API reference.
October 02, 2025
Gemini 2.5 Flash Image (gemini-2.5-flash-image) is now generally available. This GA release adds support for aspect ratio controls, image-only response modality, regional endpoints,support for batch predictions,image generation from multiple reference images, andimproved multi-turn image editing.
SeeGemini 2.5 Flash Image for more information.
Google Gen AI SDK in C# Preview
Preview: The Google Gen AI SDK is available in C#. Seegoogleapis/dotnet-genai.
This release includes support forGenerateContentAsync,GenerateContentStreamAsync,GenerateImagesAsync, and three Live APIs, which includesSendClientContentAsync,SendRealtimeInputAsync, andSendToolResponseAsync.
September 30, 2025
DeepSeek-V3.2-Exp is available through Model Garden.
September 25, 2025
New preview models forGemini 2.5 Flash and2.5 Flash-Lite are now available. These models are available at the following versioned endpoints:
gemini-2.5-flash-preview-09-2025gemini-2.5-flash-lite-preview-09-2025
September 24, 2025
Access to Gemini's 1.5 models has been discontinued. For more information, see ourModel versions page.
September 23, 2025
Gemini 2.5 Flash with Live API Native Audio Preview
Gemini 2.5 Flash with Live API Native Audio (gemini-live-2.5-flash-preview-native-audio-09-2025) is available inPreview. A single, unified model processes audio input and generates audio output directly, eliminating separate text-to-speech/speech-to-text conversions. This results in-low latency, high-quality, and incredibly human-like conversations. New features and capabilities include:
Improved Barge-in: Interrupt Gemini more naturally and reliably, even in loud and noisy environments.
Robust Function Calling: We've improved the triggering rate, allowing Gemini to successfully execute the functions you define with greater precision.
Accurate Transcription: The accuracy of audio-to-text transcription has been significantly enhanced.
Seamless Multilingual Support: Speak to Gemini in multiple languages, and it will effortlessly switch between them without any pre-configuration. Language is no longer a barrier!
Enhanced Audio Quality: Experience a dramatically improved audio quality that truly feels like speaking with a person.
Proactive Audio: Define Gemini's expertise and set conditions for when it should respond. Gemini can act as a "silent listener," only chiming in when the conversation touches upon its designated area of expertise.
Affective Dialog: Gemini can adapt and adjust its generated voice to match the emotional tone of the speaker, creating more empathetic and natural interactions.
Watch our comprehensive demo to see these features in action, including seamless language switching, expert mode, emotionally aware responses, memory recall, and interactive screen sharing for engineering tasks – all demonstrated directly within Vertex AI Studio without writing a single line of code!
September 22, 2025
DeepSeek-V3.1-Terminus is available through Model Garden.
September 18, 2025
Grounding with Google Maps
Grounding with Google Maps has implemented the following changes:
- Removed the following fields from the API response:
grounding_chunk.maps.textgrounding_chunk.maps.place_answer_sources.review_snippets.author_attributiongrounding_chunk.maps.place_answer_sources.flag_content_urigrounding_chunk.maps.place_answer_sources.review_snippets.flag_content_uri
- The widget context token is only returned when the optional
widget_token_enableinput flag is set.
To learn more, seeGrounding with Google Maps.
September 15, 2025
Imagen
We improved Imagen's virtual try-on model,virtual-try-on-preview-08-04, so that it is better at preserving the person's body shape and preserving the garment product's identity.
September 10, 2025
Vertex AI Agent Engine
Agent Engine now supports the following features:
Agent EngineCode Execution, now in Preview, lets your agent run code in an isolated sandbox environment. For more information, seeCode Execution.
You can now develop, deploy, and use agents that support theAgent-to-Agent (A2A) protocol on Agent Engine. For more information, seeDevelop an Agent2Agent agent.
Agent Engine now supportsbidirectional streaming. For more information, seeBidirectional streaming.
TheAgent Engine page in the Cloud Console UI now has a newMemory Bank tab for displaying and managing memories.
Vertex AI Agent Engine
In versionv1.112.0 of the Vertex AI SDK for Python, theagent_engines module has been refactored to aclient-based design. For information about updating your existing code to the new design, see theMigration guide.
September 09, 2025
AI Singapore's SEA-LION V4 models are available through Model Garden. They are open models for Southeast Asian languages, built by leveraging Vertex Model Development Service for enhanced training efficiency and model accuracy.
EmbeddingGemma andDeepSeek-V3.1 models are available through Model Garden.
September 08, 2025
Veo video generation
Veo 3 support for short-duration videos isgenerally available. You can use Veo 3 to create 4, 6, or 8 second videos. For more information, see the following:
September 03, 2025
Vertex AI RAG Engine: Managed Database (Spanner)
Customers will be charged for the use of a Google-managed Spanner instance that's provisioned in a Google tenant project, using standard Spanner SKUs.
For more information, seeVertex AI RAG Engine billing.
August 26, 2025
Gemini 2.5 Flash Image Preview
Gemini 2.5 Flash Image (gemini-2.5-flash-image-preview) is available inPreview. Gemini 2.5 Flash Image Preview supports additional image generation and editing features such asimage generation from multiple reference images andimproved multi-turn image editing.
Vertex AI model tuning and Gen AI evaluation service
Vertex AI model tuning now supports integration with the Gen AI evaluation service in Preview. You can automatically run evaluations on your tuned models and intermediate checkpoints. For more information, seeCreate a tuning job.
August 21, 2025
Vertex AI Agent Engine
Agent Engine now supports the following enterprise security features:
You can now deploy your agents in a private VPC environment, configuring a Private Service Connect interface, to ensure data privacy and meet security and compliance requirements. For more information, seeConfigure Private Service Connect interface.
You can now use your owncustomer-managed encryption keys (CMEK) to protect data at rest.
You can now specifycustomized resource controls, such as the minimum and maximum number of application instances, resource limits for each container, and concurrency for each container.
As a part of Vertex AI Platform, Vertex AI Agent Engine now supportsHIPAA workloads.
For more information, seeAgent Engine overview.
August 14, 2025
Imagen
Imagen 4 is Generally Available.
Imagen 4 introduces the following models:
For more information, seeGenerate images using text prompts andImage generation API.
Gemma 3 270M,Wan 2.2 andWan 2.1 models are available through Model Garden.
August 13, 2025
OpenAI'sgpt-oss-120b andgpt-oss-20b are available as Model as a Service (MaaS) models in Model Garden.
Qwen3 Coder andQwen3 235B are available as Model as a Service (MaaS) models in Model Garden.
August 08, 2025
Gemini 2.5 Flash-Lite andGemini 2.5 Pro now support supervised fine-tuning. For more information, seeAbout supervised fine-tuning for Gemini models.
August 07, 2025
Vertex AI prompt optimizer
The Vertex AI prompt optimizer is nowgenerally available. For more information, seeOptimize prompts.
We now offer azero-shot prompt optimizer.
Model tuning
You can now perform supervised fine-tuning on open models such as Llama 3.1. For more information, seeTune an open model.
Vertex AI Agent Engine
You can use your owncustom service account for agent identity to manage permissions and access according to your organization's security policies.
August 06, 2025
Imagen
Virtual try-on lets you generate virtual try-on images from an image of aperson and product photos that you provide, and is available in Preview. For more information, seeGenerate Virtual Try-On Images andVirtual Try-On API.
This release note is incorrect; see entry forOctober 9, 2025.
OpenAI's gpt-oss models are available through Model Garden.
July 29, 2025
Veo video generation Veo 3 and Veo 3 Fast are now generally available. For more information, seeGenerate videos using text prompts.
July 23, 2025
Grounding with Google Maps is available in all regions (except for the EEA) as a Preview (Pre-GA) feature.
July 22, 2025
Gemini 2.5 Flash-Lite is now generally available and accessible using the API and Vertex AI Studio. This GA release includes support for explicit caching andbatch prediction, as well as expanded region support.
SeeGemini 2.5 Flash-Lite for more information.
July 17, 2025
Veo 3 preview models now support upscaling for 1080p resolution using the newresolution parameter. For more information, seeVeo on Vertex AI.
July 16, 2025
AddedGemma 3 fine-tuning notebook using Axolotl docker with support for 1b, 4b, 12b, and 27b variants.
July 14, 2025
Multimodal MedGemma 27B IT,MedSigLIP, andT5Gemma models are available through Model Garden.
July 08, 2025
Vertex AI Agent Engine
Vertex AI Agent Engine Memory Bank is now available in Preview. Memory Bank lets you dynamically generate long-term memories based on users' conversations with your agent.
July 03, 2025
Vertex AI Agent Garden
Vertex AI Agent Garden now supports filtering by tags.
June 27, 2025
Gemma 3n models are now available through Model Garden.
Multimodal datasets are now available in preview. For more information, seeMultimodal datasets.
June 24, 2025
Starting on June 24, 2025, Imagen versions 1 and 2, image captioning, and visual question answering are deprecated.
On September 24, 2025, the following features and models will be removed:
- image captioning
- visual question answering
- Imagen 1 model
imagegeneration@002 - Imagen 2 models
imagegeneration@005andimagegeneration@006
For more information, seeMigrate to Imagen 3.
June 23, 2025
Veo 2 support for advanced video controls is Generally Available. In addition to a providing a first frame of a video, you can specify the last frame of a video or a video to extend in length. For more information, seeVeo on Vertex AI API.
June 17, 2025
Preview endpoint availability and removal: All existing Gemini 2.5 Flash and Pro preview endpoints (listed below) will continue to be available with their current preview pricing until July 15, 2025. After this date, these preview endpoints will be shut down.
gemini-2.5-flash-preview-04-17gemini-2.5-flash-preview-05-20gemini-2.5-pro-preview-03-25gemini-2.5-pro-preview-05-06gemini-2.5-pro-preview-06-05
Gemini 2.5 Flash andGemini 2.5 Pro are now generally available and accessible using the API and Vertex AI Studio.
SeeGemini 2.5 Flash andGemini 2.5 Pro for more information.
Live API is now available as a private general availability offering in the API and Vertex AI Studio. Reach out to your Google account team representative to request access.
SeeLive API for more information.
Gemini 2.5 Flash-Lite is now available as a preview offering in both the API and Vertex AI Studio.
SeeGemini 2.5 Flash-Lite for more information.
Provisioned Throughput (PT): Once a model is GA, all new PT purchases will be for GA endpoints only. If you've purchased PT for a specific preview version, it will still work for that specific preview. However, you mustmigrate the existing PT to the GA endpoint or purchase new PT for the GA endpoint byJuly 15, 2025.
Updated pricing for Gemini 2.5 Flash GA: The price for Gemini 2.5 Flash in GA will be adjusted to reflect its quality and unified output token pricing. This includes lower prices for thinking output, higher prices for non-thinking output. These pricing changes will take effect on the new GA endpoint as shared above. Preview pricing will only continue on existing preview endpoints for 30 days post-GA onJuly 15, 2025.
Updated preview endpoints: EffectiveJune 19, 2025,gemini-2.5-flash-preview-04-17 endpoint will serve the Gemini 2.5 Flash model version released on 05-20, which has been promoted to GA. Similarly, thegemini-2.5-pro-preview-05-06 and03-25 endpoints will serve the Gemini 2.5 Pro model version released on 06-05, also promoted to GA. This update ensures continuity during your transition.
June 16, 2025
The DeepSeek API service on Vertex AI is inPreview. For more information, see theDeepSeek model card in Model Garden.
June 11, 2025
Imagen 4's public preview models are updated to the following:
imagen-4.0-generate-preview-06-06imagen-4.0-fast-generate-preview-06-06imagen-4.0-ultra-generate-preview-06-06
For more information about each model, seePreview Imagen models.
To avoid service interruption, migrate fromimagen-4.0-ultra-generate-exp-05-20 andimagen-4.0-generate-preview-05-20 before 2025-07-07.
June 09, 2025
Gemini API
Thelogprobs andresponse_logprobs parameters for the Gemini API are nowgenerally available. For more information, seeGenerate content with Gemini API.
June 05, 2025
Gemini 2.5 Pro's public preview version has been updated togemini-2.5-pro-preview-06-05 and includes expanded support for thinking. This model version is available in the API and Vertex AI Studio.
SeeGemini 2.5 Pro for model details.
June 03, 2025
Model Garden now includesDeepSeek-R1-0528 variants.
In Model Garden, the following fine tuning features have been added:
May 23, 2025
Mistral OCR is an Optical Character Recognition API for document understanding. It isGA on Vertex AI. For more information, see theMistral OCR model card in Model Garden.
May 22, 2025
Anthropic's Claude Opus 4 and Claude Sonnet 4 areGA on Vertex AI and supportProvision Throughput. For more information, see theClaude Opus 4 orClaude Sonnet 4 model card in Model Garden.
May 20, 2025
Vertex AI Agent Engine
The following features are now available in Preview:
Gemini 2.5 Flash's public preview version has been updated togemini-2.5-flash-preview-5-20.
SeeGemini 2.5 Flash for model details.
The model is available in the API and Vertex AI Studio.
Audio-to-audio support forGemini 2.5 Flash with Live API is now available as a private preview. Users must be allowlisted to use this new feature.
The model is available in the API and Vertex AI Studio.
SeeLive API for details.
MedGemma models are available in Model Garden.
Veo 3
Veo 3 is available in Preview for allowlisted accounts.
For more information about Veo 3, seeVeo | AI Video Generator andVeo on Vertex AI API.
The model is available in the API and Vertex AI Studio.
Lyria 2, our latest music generation model, is now generally available.
See ourmusic generation prompt guide and ouruser guide for more information.
The model is available in the API and Vertex AI Studio.
Imagen 4
Imagen 4 offers two Preview models:Imagen 4 Generate Preview 05-20, andImagen 4 Ultra Generate Experimental 05-20.
For more information, seeGenerate images using text prompts and theGenerate images API.
The model is available in the API and Vertex AI Studio.
Thought summaries are now available as an experimental feature for Gemini 2.5 Pro and 2.5 Flash.
For details, seeThinking.
The model is available in the API and Vertex AI Studio.
New stable text embeddings models are now generally available:
gemini-embedding-001text-embedding-005
For more information, seeGet text embeddings.
May 14, 2025
MedLM is deprecated. Access to MedLM will no longer be available on or after September 29, 2025.
May 07, 2025
Gemini 2.0 Flash with image generation (gemini-2.0-flash-preview-image-generation) is now available as a public preview offering.
For more information, seeGenerate images with Gemini.
Seed parameter is now in GA and supports Gemini 2.5 model family.
May 05, 2025
Grounding
The following grounding features aregenerally available:
May 02, 2025
The global endpoint is generally available (GA). For details, seeGlobal endpoint.
April 30, 2025
- Llama 4 Maverick and Scout models are available inModel Garden withModel-as-a-Service API Service andself-hosted deployments.
- HiDream-I1,Llama Guard 4,Llama Prompt Guard 2, andQwen3 are available inModel Garden.
- Additional materials are available for deploying a model in Model Garden by using thePython SDK, gcloud CLI, or API, which are available inPreview:
April 29, 2025
Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, seeModel versions and lifecycle.
April 17, 2025
Gemini 2.5 Flash with thinking and other well-rounded capabilities is now available inPreview.
April 10, 2025
Managed APIs for Llama 4 Maverick and Scout are inPreview on Vertex AI. For more information, see theLlama 4 model card.
April 09, 2025
Agent Development Kit (ADK) is now available inPreview. For more information, seeAgent Development Kit.
Gemini Live API is now available as a public preview offering and has been updated with the following features:
- Support for responses in 8 voices and 31 languages using Chirp 3
- Updated UI support in Vertex AI Studio
- Expanded conversation session window
- Ability to extend conversation sessions
- Support to share your current screen with Gemini during conversations
- Transcription support for audio in and audio out
- Support to change or update the system instructions mid-session
For more information, seeGemini 2.0 Flash Live API.
Vertex AI Agent Engine
The following features are now available for Vertex AI Agent Engine in Preview:
The following features are now generally available for Vertex AI Agent Engine:
Gemini 2.5 Pro is now available as a public preview offering.
For more information, seeGemini 2.5 Pro.
Agent Garden is now available inPreview. For more information, seeVertex AI Agent Builder overview or go directly toAgent Garden in the Cloud Console.
Grounding: Grounding with Google Maps is now available as a Public Experimental feature. For more information, seeGrounding with Google Maps.
Grounding: Web Grounding for Enterprise is now Generally available. For more information, seeWeb Grounding for Enterprise.
Vertex AI Agent Builder now refers to a suite of features for building and deploying AI agents in Vertex AI. For more information see,Vertex AI Agent Builder overview.
The original Vertex AI Agent Builder product has been renamedAI Applications. The product functionality and endpoints remain the same. For more information, seeWhat is AI Applications?.
March 25, 2025
- DeepSeek-V3-0324,TxGemma andSesame CSM are now available inModel Garden.
- DeepSeek-R1,V3 andV3-0324 can be deployed with H200 GPUs and improved vLLM support.
- You can deploy a model in Model Garden by using thePython SDK, gcloud CLI, or API, which are available inPreview. You can get started with the "Equivalent code" in the deploy panel in the Model Garden console.
March 20, 2025
Anthropic's Claude Sonnet 3.7 isGA on Vertex AI and supportsProvision Throughput. To learn more, view theClaude Sonnet 3.7 model card in Model Garden.
March 17, 2025
Mistral Small 3.1 (25.03) feature multimodal capabilities and a context of up to 128,000 tokens. For more information, see theMistral Small 3.1 (25.03) model card in Model Garden.
March 14, 2025
Judge model evaluation and customization tools are now available in Preview for theGen AI evaluation service in Vertex AI.
March 13, 2025
Context caching for Gemini on Vertex AI is generally available (GA).
March 12, 2025
- Gemma 3 andShieldGemma 2 are now available in Model Garden.
- CogVideoX-2b is now available in Model Garden.
Model Garden fine tuning updates:
- Added aworkbench-based notebook for Llama 3.1 finetuning.
- UpdatedLlama 3.1 andGemma 2 UI fine-tuning with the updated PEFT docker.
March 11, 2025
Gemini 2.0 Flash Tuning
Gemini 2.0 Flash fine-tuning is now generally available (GA).
Added support fortuning function calling.
March 04, 2025
Vertex AI Agent Engine
Vertex AI Agent Engine is nowgenerally available (GA).
Billing for Vertex AI Agent Engine starts on March 4, 2025. We recommend that you delete unused resources to avoid incurring unwanted costs. For more information, seePricing.
LangChain on Vertex AI has been renamed toVertex AI Agent Engine.
February 25, 2025
Gemini 2.0 Flash-Lite is now generally available
Gemini 2.0 Flash-Lite is now generally available. For more information, seeGemini 2.0.
February 24, 2025
Anthropic's Claude Sonnet 3.7 is inPreview on Vertex AI. To learn more, view theClaude Sonnet 3.7 model card in Model Garden.
February 21, 2025
- PEFT Docker updates
- Added support for evaluation metrics like perplexity, bleu, google_bleu, rouge1, rouge2, rougeL, rougeLSum.
- Uses the best checkpoint and loads the model based on the best eval metrics.
- Run training and eval only for data which is less than or equal to the
max_seq_length. - Use
gcloud storage rsyncinstead ofcsfuseto save a checkpoint.
- Fine tuning updates
- You can select a service account when you clickFine-tune for a model, such asLlama 3.1.
- Added aPEFT based LLM finetuning tutorial notebook.
- Added aAxolotl based LLM finetuning notebook.
- UpdatedLlama 3.1 andGemma 2 fine-tuning notebooks with the updated PEFT Docker container.
- Model updates
- Updated thePaliGemma model card by supporting PaliGemma 2 mix models, and segmentation functionality to Paligemma 1 models.
- Updated theLLaVa model card by supporting LLaVA Next models and adding vLLM to the notebook.
February 12, 2025
Deepseek-V3 and Deepseek-R1 have been added toModel Garden inPreview:
- DeepSeek-V3 (671B) is a powerful Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.
- DeepSeek-R1 (671B) is one of the first-generation reasoning models introduced by DeepSeek and offers performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
You can use anotebook to deploy these models.
February 11, 2025
TheLlama 3.3 70B model that is managed on Vertex AI is now inPreview.
February 07, 2025
deepseek-ai/deepseek-r1 andmicrosoft/Phi-4 models were added toModel Garden.
The following advanced LLM inference optimization techniques are available in Model Garden inPreview:
Prefix caching reuses computations from previously generated text, eliminating redundant processing. It reduces time-to-first-token for requests with common prompt prefixes. Prefix caching is available for the following models:
- vLLM: Llama 3.1 (8b, 70b), Llama 3.3 (70b)
- Hex-LLM: Llama 2 (7b, 13b), Llama 3 (8b), Llama 3.1 (8b, 70b), Llama 3.2 (1b, 3b), Llama Guard (1b, 8b), CodeLlama (7b, 13b), Gemma (2b, 7b), CodeGemma (2b, 7b), Mistral-7B (v0.2, v0.3), Mixtral-8x7B (v0.1)
Speculative decoding is an effective optimization technique to reduce generation time-per-output-token latency. For more information, see theModel Garden advanced features notebook.
February 05, 2025
Gemini 2.0 Flash general availability for text-only output
Gemini 2.0 Flash is now generally available for text-only outputs. Multimodal outputs are still available only as a private preview. For more information, seeGemini 2.0.
New Gemini 2.0 Pro and Gemini 2.0 Flash-Lite models available to users
Two new models in the Gemini 2.0 family are now available to users:
- Gemini 2.0 Pro: Our strongest model for coding and world knowledge, featuring a 2M long context window. Gemini 2.0 Pro is available as an experimental model in Vertex AI.
- Gemini 2.0 Flash-Lite: Our fastest and most cost efficient Flash model. Gemini 2.0 Flash-Lite is available as a Preview model in Vertex AI.
For more information, seeGemini 2.0
January 31, 2025
You can now monitor usage, throughput, and latency and troubleshoot 429 errors on Vertex AI foundation models, like Google Gemini and Anthropic Claude, by using a predefined dashboard. After querying a model from theVertex AI Model Garden, you can find the name of the model you queried in the Vertex AIDashboard page under the "Model observability" heading.
To customize the dashboard and explore relevant metrics in Cloud Monitoring, clickShow All Metrics. For information about using dashboards in Cloud Monitoring, seeView and customize Google Cloud dashboards.
January 30, 2025
Mistral Large (24.07) and Codestral (24.05) that are offered as a Model as a Service (MaaS) models in Model Garden are deprecated. For details, seeGenerative AI on Vertex AI deprecations.
January 29, 2025
New Imagen 3 image generation model available to users
A newer improved Imagen 3 image generation model is now available to all users:
imagen-3.0-generate-002
This image generation model supports the following additional features:
- Prompt enhancement - The LLM-based prompt rewriter tool adds additional details and descriptive language to the prompt you provide, generally resulting in higher quality generated images. This feature is configurable and is enabled by default.
For more information, seeImagen on Vertex AI model versions and lifecycle andGenerate images using text prompts.
January 22, 2025
LangChain on Vertex AI
Billing for LangChain on Vertex AI will start on March 4, 2025.
The pricing structure is based on vCPU hours and GiB hours used. This means that you will be charged for both the compute (vCPU) and memory resources consumed by your LangChain on Vertex AI workloads.
You can review the pricing details in the table below.
| Product | SKU ID | Price |
|---|---|---|
| ReasoningEngine vCPU | 8A55-0B95-B7DC | $0.0994/vCPU-Hr |
| ReasoningEngine Memory | 0B45-6103-6EC1 | $0.0105/GiB-Hr |
January 21, 2025
Anthropic's Claude 3 Sonnet that is offered as a Model as a Service (MaaS) model in Model Garden is deprecated. For details, seeGenerative AI on Vertex AI deprecations.
January 17, 2025
Agent evaluation using theGen AI evaluation service is available inPreview.
December 20, 2024
RAG Engine isgenerally available (GA).
The supported models include the following:
- Google Gemini
- Google embedding and OSS E5 embedding models
- Model Garden self-deployed OSS LLMs
- Model as a service (MaaS) Llama models
The supported features include the following:
- Data connectors: Google Cloud Storage, Google Drive, Slack, Jira, and SharePoint
- Document types: Google Workspace documents, HTML, JSON, Markdown, PDF, and text files
- Transformations: fixed-size chunking and chunk overlap
- Vector databases: Vertex AI Vector Search and Pinecone
December 18, 2024
Hex-LLM: High-Efficiency Large Language Model Serving is available inGeneral Availability (GA).
This launch adds support for the following models:
- Llama 3.1
- Llama 3.2
- Phi-3
- Qwen2 and Qwen2.5
Additional supported features:
- Multi-host serving.
- Disaggregated serving (experimental).
- Prefix caching.
- AWQ quantization.
December 17, 2024
You can copy tuned Gemini 1.5 Pro 002 and Gemini 1.5 Flash 002 adapter models across projects. For details, seeCopy a model in Vertex AI Model Registry.
December 11, 2024
The Gemini 2.0 Flash (gemini-2.0-flash-exp) model is Generally available for grounded answer generation with RAG. This model is tuned to address context-based question and answering tasks. For more information, seeGround responses for Gemini models.
December 10, 2024
Imagen 3 image generation models Generally Available to all users
Imagen 3 image generation models are now available to all users without requiring prior approval. These include the following image generation models:
imagen-3.0-generate-001imagen-3.0-fast-generate-001(low latency model)
Prior image generation models (imagegeneration@006,imagegeneration@005,imagegeneration@002) still require approval to use.
For more information, seeImagen on Vertex AI model versions and lifecycle andGenerate images using text prompts.
Imagen 3 Customization model Generally Available to approved users
Imagen 3 Customization model is now available to approved users. This includes the following model:
imagen-3.0-capability
Imagen 3 Customization lets you guide image generation by providing reference images (few-shot learning). Imagen 3 Customization lets you customize generated images for the following feature categories:
- Subject Customization (product, person, and animal companion)
- Style Customization
- Controlled Customization (canny edge and scribble)
- Instruct Customization (Style transfer)
Imagen 3 editing model Generally Available to approved users
The Imagen 3 Editing model is now available to approved users. This includes the following model:
imagen-3.0-capability
This model offers the following additional features:
- Inpainting - Add or remove content from a masked area of an image
- Outpainting - Expand a masked area of an image
- Product image editing - Identify and maintain a primary product while changing the background or product position
For more information, seeModel versions.
December 06, 2024
A vulnerability was discovered in the Vertex AI API serving Gemini multimodal requests, allowing bypass of VPC Service Controls. For details, see theSecurity bulletins page.
November 21, 2024
Mistral Large (24.11) is Generally Available on Vertex AI as a managed model. To learn more, view theMistral Large (24.11) model card in Model Garden.
The Gen AI evaluation service can now help you evaluate your translation models using MetricX, COMET, and BLEU metrics.To learn more about evaluating your translation models, seeEvaluate translation models.
November 08, 2024
Batch predictions for Llama models on Vertex AI (MaaS) is available inPreview.
Batch prediction support for Gemini
Batch prediction is available for Gemini inGeneral Availability (GA). Available Gemini models include Gemini 1.0 Pro, Gemini 1.5 Pro, and Gemini 1.5 Flash. To get started with batch prediction, seeGet batch predictions for Gemini.
November 05, 2024
We are extending the availability of Gemini 1.0 Pro 001 and Gemini 1.0 Pro Vision 001 from February 15, 2025 to April 9, 2025. For details, see theDeprecations.
November 04, 2024
The translation LLM now supports Polish, Turkish, Indonesian, Dutch, Vietnamese, Thai and Czech. For the full list of supported languages, see theTranslate text page.
The Anthropic Claude Haiku 3.5 is Generally Available on Vertex AI. To learn more, view theClaude Haiku 3.5 model card in Model Garden.
October 28, 2024
TheWhisper large v3 and Whisper large v3 turbo models have been added to Model Garden.
Updated the fine-tuning notebooks forGemma 2,Llama 3.1,Mistral, andMixtral with the following enhancements:
- The notebooks use an updated high-performance container for single host multi-GPU LoRA fine-tuning.
- Better throughput and GPU utilization with well-tested max-sequence-lengths.
- Support for input token masking.
- No out of memory (OOM) error during fine-tuning.
- Added a custom dataset example that uses a template and format validation.
- Support for a default accelerator pool with quota checks.
- Improved documentation.
October 22, 2024
The Anthropic Claude Sonnet 3.5 v2 is Generally Available. To learn more, view theClaude Sonnet 3.5 v2 model card in Model Garden.
October 18, 2024
TheLlama 3.1 405B model that is managed on Vertex AI is nowGenerally Available.
October 09, 2024
The Vertex AI Gemini API SDK supports tokenization capabilities for local token counting and computation. This is a streamlined way to compute tokens locally, ensuring compatibility across different Gemini models and their tokenizers. Supported models include gemini-1.5-flash and gemini-1.5-pro . To learn more, seeCount tokens.
October 04, 2024
The AI assistant in Vertex AI Studio can help you refine and generate prompts. This feature is inPreview. To learn more, seeUse AI-powered prompt writing tools.
You can deploy Hugging Face models on Google Cloud that havetext embedding inference enabled orpytorch inference enabled. For more information, see theHugging Face model deployment in the console.
Added dynamic LoRA serving forLlama 3.1 andStable Diffusion XL.
Prompt Guard andFlux were added toModel Garden.
October 01, 2024
Grounding: Dynamic retrieval for grounded results (GA)
Dynamic retrieval lets you choose when to turn off grounding with Google Search. This is useful when a prompt doesn't require an answer grounded in Google Search, and the supported models can provide an answer based on their knowledge without grounding. Dynamic retrieval helps you manage latency, quality, and cost more effectively.
This feature isGenerally Available. For more information, seeDynamic retrieval.
September 30, 2024
Prompt templates let you to test how different prompt formats perform with different sets of prompt data. This feature is inPreview. To learn more, seeUse prompt templates.
September 25, 2024
The Llama 3.2 90B model is available inPreview on Vertex AI. Llama 3.2 90B enables developers to build and deploy the latest generative AI models and applications that use Llama's capabilities, such as image reasoning. Llama 3.2 is also designed to be more accessible for on-device applications. For more information, seeLlama models.
September 24, 2024
New stable versions of Gemini 1.5 Pro (gemini-1.5-pro-002) and Gemini 1.5 Flash (gemini-1.5-flash-002) areGenerally Available. These models introduce broad quality improvements over the previous001 versions, with significant gains in the following categories:
- Factuality and reduce model hallucinations
- Openbook Q&A for RAG use cases
- Instruction following
- Multilingual understanding in 102 languages, especially in Korean, French, German, Spanish, Japanese, Russian, and Chinese.
- SQL generation
- Audio understanding
- Document understanding
- Long context
- Math and reasoning
For more information about differences with the previous model versions, seeModel versions and lifecycle.
The new API parametersaudioTimestamp,responseLogprob, andlogprobs are inPublic Preview. For more information, seeAPI reference.
The 2M context window with Gemini 1.5 Pro is now inGenerally Available, which opens up long-form multimodal use cases that only Gemini can support.
Use Gemini to directly analyze YouTube videos and publicly available media (such as images, audio, and video) by using a link. This feature is inPublic Preview.
The latest versions ofGemini 1.5 Flash (gemini-1.5-flash-002) andGemini 1.5 Pro (gemini-1.5-pro-002) usedynamic shared quota, which distributes on-demand capacity among all queries being processed. Dynamic shared quota isGenerally Available.
Gemini 1.5 Pro and Gemini 1.5 Flash now support multimodal input withfunction calling. This feature is inPreview.
Gemini 1.5 Pro and Gemini 1.5 Flash Tuning is now available inGA.Tune Gemini with text, image, audio, and document data types using the latest models:
gemini-1.5-pro-002gemini-1.5-flash-002
Gemini 1.0 tuning remains in preview.
For more information on tuning Gemini, seeTune Gemini models by using supervised fine-tuning.
The Vertex AI prompt optimizer adapts your prompts using the optimal instructions and examples to elicit the best performance from your chosen model. This feature is available inPreview. To learn more, seeOptimize prompts.
September 20, 2024
Add label metadata togenerateContent andstreamGenerateContent API calls. For details, seeAdd labels to API calls.
September 18, 2024
Model Garden supports an organization policy so that administrators can limit access to certain models and capabilities. For more information, seeControl access to Model Garden models
September 03, 2024
Gemini 1.5 Flash (gemini-1.5-flash) supportscontrolled generation.
August 30, 2024
Gen AI Evaluation Service is Generally Available. To learn more, see theGen AI Evaluation Service overview.
August 26, 2024
For controlled generation, you can have the model respond with an enum value in plain text, as defined in your response schema. Set theresponseMimeType totext/x.enum. For more information, seeControl generated output.
August 22, 2024
AI21 Labs
Managed models from AI21 Labs are available on Vertex AI. To use a AI21 Labs model on Vertex AI, send a request directly to the Vertex AI API endpoint. For more information, seeAI21 models.
August 09, 2024
Gemini on Vertex AI supports multiple response candidates. For details, seeGenerate content with the Gemini API.
August 05, 2024
The translation LLM now supports Arabic, Hindi, and Russian. For the full list of supported languages, see theTranslate text page.
August 02, 2024
Vertex AI SDK for Python supports token listing and counting for prompts without the need to make API calls. This feature is available in (Preview). For details, seeList and count tokens.
July 31, 2024
Gemma 2 2B is available in Model Garden. For details, seeUse Gemma open models.
New Imagen on Vertex AI image generation model and features
The Imagen 3 image generation models (imagen-3.0-generate-001 and the low-latency versionimagen-3.0-fast-generate-001) are Generally Available toapproved users. These models offer the following additional features:
- Additional aspect ratios (1:1, 3:4, 4:3, 9:16, 16:9)
- Digital watermark (SynthID) enabled by default
- Watermark verification
- User-configurable safety features (safety setting, person/face setting)
For more information, seeModel versions andGenerate images using text prompts.
Resource and deployment settings were made to the following models:
- Added GPU inferences forgemma2-27b andgemma2-27b-it with verified performances.
- Added verified deployment settings for Mistral AI models that are deployed from Huggingface, includingmistralai/mistral-nemo-instruct-2407,mistralai/mistral-nemo-base-2407,mistralai/mistral-large-instruct-2407, andmistralai/codestral-22b-v0.1.
- Added multiple deployment settings with A100 (40G), A100 (80G) and H100 (80G) for select models, such asllama3.1,llama3,gemma2,gemma, andmistral-7b.
The following models have been added toModel Garden:
- Gemma 2 2B: A foundation LLM by Google DeepMind.
- Qwen2: An LLM series by Alibaba Cloud.
- Phi-3: An LLM series by Microsoft.
July 30, 2024
July 24, 2024
Mistral AI
Managed models from Mistral AI are available on Vertex AI. To use a Mistral AI model on Vertex AI, send a request directly to the Vertex AI API endpoint. For more information, seeMistral AI models.
July 23, 2024
Llama 3.1
The Llama 3.1 405B model is available inPreview on Vertex AI. Llama 3.1 405B provides capabilities from synthetic data generation to model distillation, steerability, math, tool use, multilingual translation, and more. For more information, seeLlama models.
July 02, 2024
Google's open weight Gemma 2 model is available in Model Garden. For details, seeUse Gemma open models.
MaMMUT is now available inModel Garden. MaMMUT is a vision-encoder and text-decoder model for multimodal tasks such as visual question answering, image-text retrieval, text-image retrieval, and generation of multimodal embeddings.
June 28, 2024
The following models have been added toModel Garden:
- 36 Hugging Face embedding models with verified deployment settings such asBAAI/bge-m3 andintfloat/multilingual-e5-large-instruct.
- 35 Hugging Face PyTorch models with verified deployment settings such asstabilityai/stable-diffusion-2-1.
For more information, see theHugging Face model deployment in the console.
LaunchedHex-LLM for high-efficiency large language model serving. This performant TPU serving solution is based on XLA and optimized kernels to achieve high throughput and low latency.
Hex-LLM uses several parallelism strategies for multiple TPU chips, quantizations, dynamic LoRA, and more. Hex-LLM supports the following dense and sparse LLMs:
- Gemma 2B and 7B
- Gemma 2 9B and 27B
- Llama 2 7B, 13B and 70B
- Llama 3 8B and 70B
- Mistral 7B and Mixtral 8x7B
- Updated Docker images inLlama 3 notebooks that are more efficient at tuning.
- A notebook-based interactive workshop UI was added inModel Garden for image generative models such asstable-diffusion-xl-base,image inpainting,controlnet. You can find these models from theOpen Notebook list.
- Colab Notebooks for frequently used models in Model Garden have been revised with no-code or low-code implementations to improve accessibility and user experience.
June 27, 2024
Context caching is available for Gemini 1.5 Pro. Use context caching to reduce the cost of requests that contain repeat content with high input token counts. For more information, seeContext caching overview.
June 25, 2024
Controlled generation is available on Gemini 1.5 Pro and supports the JSON schema. For more information, seeControl generated output.
June 20, 2024
The Anthropic Claude Sonnet 3.5 is Generally Available. To learn more, view theClaude Sonnet 3.5 model card in Model Garden.
June 17, 2024
Increased the input token limit for Gemini 1.5 Pro from 1M to 2M. For more information, seeGoogle models.
June 11, 2024
June 10, 2024
Experiment in the Vertex AI Studio login-free
The Vertex AI Studio multi-model prompt designer can be accessed login-free. With this feature, prospective customers can use the Vertex AI Studio to test queries before deciding to sign up and create an account. To learn more about this experience, seeVertex AI Studio console experiences or to access the console directly go toVertex AI Studio.
May 31, 2024
Anthropic Claude 3.0 Opus model
TheAnthropic Claude 3.0 Opus model isGenerally Available. To learn more, see itsmodel card in Model Garden.
Generative AI on Vertex AI Regional APIs
Generative AI on Vertex AI regional APIs are available in thefollowing three regions:
us-east5me-central1me-central2
May 28, 2024
Gemini models support thefrequencyPenalty andpresencePenalty parameters. UsefrequencyPenalty to control the probability of repeated text in a response. UsepresencePenalty to control the probability of generating more diverse content. For more information, seeGemini model parameters.
May 24, 2024
The Gemini 1.5 Pro (gemini-1.5-pro-001) and Gemini 1.5 Flash (gemini-1.5-flash-001) models areGenerally Available. For more information, seeGoogle models,Overview of the Gemini API, andSend multimodal prompt requests.
May 20, 2024
The following models have been added toModel Garden:
- E5: A text embedding model series that can be served with a GPU or CPU.
- Instant ID: An identity preserving text-to-image generation model.
- Stable Diffusion XL lightning: A text-to-image generation model that is based on SDXL but requires fewer inference iterations.
To see a list of all available models, seeExplore models in Model Garden.
May 14, 2024
Gemini 1.5 Flash (Preview)
Gemini 1.5 Flash (gemini-1.5-flash-preview-0514) is available inPreview. Gemini 1.5 Flash is a multimodal model designed for fast, high volume, cost-effective text generation and chat applications. It can analyze text, code, audio, PDF, video, and video with audio.
Batch prediction support for Gemini
Batch prediction is available for Gemini inpreview. Available Gemini models include Gemini 1.0 Pro, Gemini 1.5 Pro, and Gemini 1.5 Flash. To get started with batch prediction, seeGet batch predictions for Gemini.
Grounding Gemini with Google Search is GA
The Gemini API Grounding with Google Search feature is available inGA. This is available for Gemini 1.0 Pro models. To learn more about model grounding, seeGrounding with Google Search.
PaliGemma model
ThePaliGemma model is available. PaliGemma is a lightweight open model that's part of the Google Gemma model family. It's the Gemma model family's best model option for image captioning tasks and visual question and answering tasks. Gemma models are based on Gemini models and intended to be extended by customers.
New stable text embedding models
The following text embedding models are availableGA:
text-embedding-004text-multilingual-embedding-002
For details on how to use these models, seeGet text embeddings.
April 18, 2024
Meta's open weightLlama 3 model is available in the Vertex AI Model Garden.
April 11, 2024
Anthropic Claude 3.0 Opus model
TheAnthropic Claude 3.0 Opus model is available inPreview. The Claude 3.0 Opus model is an Anthropic partner model that you can use with Vertex AI. It's the most capable of the Anthropic models at performing complex tasks quickly. To learn more, see itsmodel card in Model Garden.
April 09, 2024
New Imagen on Vertex AI image generation model and features
The 006 version of the Imagen 2 image generation model (imagegeneration@006) is now available. This model offers the following additional features:
- Additional aspect ratios (1:1, 3:4, 4:3, 9:16, 16:9)
- Digital watermark (SynthID) enabled by default
- Watermark verification*
- New user-configurable safety features (safety setting, person/face setting)
For more information, seeModel versions andGenerate images using text prompts.
* Theseed field can't be used while digital watermark is enabled.
New Imagen on Vertex AI image editing model and features
The 006 version of the Imagen 2 image editing model (imagegeneration@006) is now available. This model offers the following additional features:
- Inpainting - Add or remove content from a masked area of an image
- Outpainting - Expand a masked area of an image
- Product image editing - Identify and maintain a primary product while changing the background or product position
For more information, seeModel versions.
Change in Imagen image generation version 006 (imagegeneration@006)seed field behavior
For the new Imagen image generation model version 006 (imagegeneration@006) theseed field behavior has changed. For the v.006 model a digital watermark is enabled by default for image generation. To be able to use aseed value to get deterministic output you must disable digital watermark generation by setting the followingparameter:"addWatermark": false.
For more information, see theImagen for image generation and editing API reference.
Gemini 1.5 Pro (Preview)
Gemini 1.5 Pro (gemini-1.5-pro-preview-0409) is available inPreview. Gemini 1.5 Pro is a multimodal model that analyzes text, code, audio, PDF, video, and video with audio.
CodeGemma model
TheCodeGemma model is available. CodeGemma is a lightweight open model that's part of the Google Gemma model family. CodeGemma is the Gemma model family's code generation and code completion offering. Gemma models are based on Gemini models and intended to be extended by customers.
Vertex AI Studio features and updates
- The Vertex AI Studio supports side-by-side comparison to allow users to compare up to 3 prompts in a side-by-side view.
- The Vertex AI Studio supports rapid evaluation in console and the ability to upload a ground truth response (or a model response to try to emulate).
To learn more, seeTry your prompts in Vertex AI Studio
Generative AI on Vertex AI security control update
Security controls are available for the online prediction feature for Gemini 1.0 Pro and Gemini 1.0 Pro Vision.
Gemini 1.0 Pro stable version 002
The 002 version of the Gemini 1.0 Pro multimodal model (gemini-1.0-pro-002) is available. For more information about stable versions of Gemini models, seeGemini model versions and lifecycle.
Text translation
Translate text in Vertex AI Studio is available inPreview.
Regional APIs
- Regional APIs areavailable in 11 new countries for Gemini, Imagen, and embeddings.
- US and EU have machine-learning processing boundaries for the
gemini-1.0-pro-001,gemini-1.0-pro-002,gemini-1.0-pro-vision-001, andimagegeneration@005models.
System instructions
System instructions are supported inPreview by the Gemini 1.0 Pro (stable versiongemini-1.0-pro-002 only) and Gemini 1.5 Pro (Preview) multimodal models. Use system instructions to guide model behavior based on your specific needs and use cases. For more information, seeSystem instructions examples.
New text embedding models
The following text embedding models are now inPreview.
text-embedding-preview-0409text-multilingual-embedding-preview-0409
When evaluated using theMTEB benchmarks, these models produce better embeddings compared to previous versions. The new models also offerdynamic embedding sizes, which you can use to output smaller embedding dimensions, with minor performance loss, to save on computing and storage costs.
For details on how to use these models, refer to thepublic documentation and try out ourColab.
Supervised Tuning for Gemini
Supervised tuning is available for thegemini-1.0-pro-002 model.
Generative AI Knowledge Base
TheJump Start Solution: Generative AI Knowledge Base demonstrates how to build a simple chatbot with business- and domain-specific knowledge.
Online Evaluation Service
Generative AI evaluation supportsonline evaluation in addition topipeline evaluation. The list of supported evaluation metrics has also expanded. SeeAPI reference andSDK reference.
Grounding Gemini and Grounding with Google Search
The Gemini API now supports Grounding with Google Search inPreview. Currently available for Gemini 1.0 Pro models.
April 02, 2024
Model Garden supports allText Generation Inference supported models inHuggingFace:
- Verified deployment settings for about 400 Hugging Face text generation models (includinggoogle/gemma-7b-it,meta-llama/Llama-2-7b-chat-hf, andmistralai/Mistral-7B-v0.1).
- Other Hugging Face text generation models have unverified deployment settings that are auto generated.
March 29, 2024
The MedLM-large model infrastructure has been upgraded to improvelatency and stability. Responses from the model might be slightly different.
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-17 UTC.