Vertex AI managed models for MaaS

Vertex AI supports a curated list of partner and open models as managedmodels. These models can be used withVertex AI as amodel as a service (MaaS) and are offered as a managed API. When you use amanaged model, you continue to send your requests to Vertex AIendpoints. Managed models are serverless so there's no need to provision ormanage infrastructure.

Managed models can be discovered using Model Garden. You can alsodeploy models using Model Garden. For more information, seeExplore AImodels inModel Garden.

Partner models

The following partner models are offered as managed APIs on Vertex AIModel Garden (MaaS):

Model name	Modality	Description	Quickstart
Claude Sonnet 4.6	Language, Vision	Claude Sonnet 4.6 delivers frontier intelligence at scale—built for coding, agents, and enterprise workflows.	Model card
Claude Opus 4.6	Language, Vision	The next generation of Anthropic's most intelligent model, Claude Opus 4.6 is an industry leader across coding, agents, computer use, and enterprise workflows.	Model card
Claude Opus 4.5	Language, Vision	The next generation of Anthropic's most intelligent model, Claude Opus 4.5 is an industry leader across coding, agents, computer use, and enterprise workflows.	Model card
Claude Sonnet 4.5	Language, Vision	Anthropic's mid-sized model for powering real-world agents, with capabilities in coding, computer use, cybersecurity, and working with office files like spreadsheets.	Model card
Claude Opus 4.1	Language, Vision	An industry leader for coding. It delivers sustained performance on long-running tasks that require focused effort and thousands of steps, significantly expanding what AI agents can solve. Ideal for powering frontier agent products and features.	Model card
Claude Haiku 4.5	Language, Vision	Claude Haiku 4.5 delivers near-frontier performance for a wide range of use cases, and stands out as one of the best coding models in the world–with the right speed and cost to power free products and high-volume user experiences.	Model card
Claude Opus 4	Language, Vision	Claude Opus 4 delivers sustained performance on long-running tasks that require focused effort and thousands of steps, significantly expanding what AI agents can solve.	Model card
Claude Sonnet 4	Language, Vision	Anthropic's mid-size model with superior intelligence for high-volume uses, such as coding, in-depth research, and agents.	Model card
Anthropic's Claude 3.5 Sonnet v2	Language, Vision	The upgraded Claude 3.5 Sonnet is a state-of-the-art model for real-world software engineering tasks and agentic capabilities. Claude 3.5 Sonnet delivers these advancements at the same price and speed as its predecessor.	Model card
Anthropic's Claude 3 Haiku	Language	Anthropic's fastest vision and text model for near-instant responses to basic queries, meant for seamless AI experiences mimicking human interactions.	Model card
Anthropic's Claude 3.5 Sonnet	Language	Claude 3.5 Sonnet outperforms Anthropic's Claude 3 Opus on a wide range of Anthropic's evaluations with the speed and cost of Anthropic's mid-tier model, Claude 3 Sonnet.	Model card
Jamba 1.5 Large (Preview)	Language	AI21 Labs's Jamba 1.5 Large is designed for superior quality responses, high throughput, and competitive pricing compared to other models in its size class.	Model card
Jamba 1.5 Mini (Preview)	Language	AI21 Labs's Jamba 1.5 Mini is well balanced across quality, throughput, and low cost.	Model card
Mistral Medium 3	Language	Mistral Medium 3 is a versatile model designed for a wide range of tasks, including programming, mathematical reasoning, understanding long documents, summarization, and dialogue.	Model card
Mistral OCR (25.05)	Language, Vision	Mistral OCR (25.05) is an Optical Character Recognition API for document understanding. The model comprehends each element of documents such as media, text, tables, and equations.	Model card
Mistral Small 3.1 (25.03)	Language	Mistral Small 3.1 (25.03) is the latest version of Mistral's Small model, featuring multimodal capabilities and extended context length.	Model card
Codestral 2	Language, Code	Codestral 2 is Mistral's code generation specialized model built specifically for high-precision fill-in-the-middle (FIM) completion that helps developers write and interact with code through a shared instruction and completion API endpoint.	Model card

Open models

The following open models are offered as managed APIs on Vertex AIModel Garden (MaaS):

Model name	Modality	Description	Quickstart
DeepSeek-OCR	Language, Vision	A comprehensive Optical Character Recognition (OCR) model that analyzes and understands complex documents. It excels at challenging OCR tasks.	Model card
DeepSeek R1 (0528)	Language	DeepSeek's latest version of the DeepSeek R1 model.	Model card
DeepSeek-V3.1	Language	DeepSeek's hybrid model that supports both thinking mode and non-thinking mode.	Model card
DeepSeek-V3.2	Language	DeepSeek's model that harmonizes high computational efficiency with superior reasoning and agent performance.	Model card
GLM 4.7	Language, Code	GLM's model designed for core or vibe coding, tool use, and complex reasoning.	Model card
GLM 5	Language, Code	GLM's model targeting complex systems engineering and long-horizon agentic tasks.	Model card
gpt-oss 120B	Language	A 120B model that offers high performance on reasoning tasks.	Model card
gpt-oss 20B	Language	A 20B model optimized for efficiency and deployment on consumer and edge hardware.	Model card
Kimi K2 Thinking	Language	An open-source thinking agent model that reasons step-by-step and uses tools to solve complex problems.	Model card
Llama 3.3	Language	Llama 3.3 is a text-only 70B instruction-tuned model that provides enhanced performance relative to Llama 3.1 70B and to Llama 3.2 90B when used for text-only applications. Moreover, for some applications, Llama 3.3 70B approaches the performance of Llama 3.1 405B.	Model card
Llama 4 Maverick 17B-128E	Language, Vision	The largest and most capable Llama 4 model that has coding, reasoning, and image capabilities. Llama 4 Maverick 17B-128E is a multimodal model that uses the Mixture-of-Experts (MoE) architecture and early fusion.	Model card
Llama 4 Scout 17B-16E	Language, Vision	Llama 4 Scout 17B-16E delivers state-of-the-art results for its size class, outperforming previous Llama generations and other open and proprietary models on several benchmarks. Llama 4 Scout 17B-16E is a multimodal model that uses the Mixture-of-Experts (MoE) architecture and early fusion.	Model card
MiniMax M2	Language, Code	Designed for agentic and code-related tasks with strong capabilities in planning and executing complex tool-calling tasks.	Model card
Qwen3 235B	Language	An open-weight model with a "hybrid thinking" capability to switch between methodical reasoning and rapid conversation.	Model card
Qwen3 Coder	Language, Code	An open-weight model developed for advanced software development tasks.	Model card
Qwen3-Next-80B Instruct	Language, Code	A model from the Qwen3-Next family of models, specialized for following specific commands.	Model card
Qwen3-Next-80B Thinking	Language, Code	A model from the Qwen3-Next family of models, specialized for complex problem-solving and deep reasoning.	Model card

What's next

Learn more aboutVertex AI open models for MaaS.
Learn how toCall open model APIs.

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.

Movatterモバイル変換

Vertex AI managed models for MaaS Stay organized with collections Save and categorize content based on your preferences.

Partner models

Open models

What's next

Vertex AI managed models for MaaS