Use open models using Model as a Service (MaaS)
This document describes how to use open models through Model as a Service (MaaS) on Vertex AI. MaaS provides serverless access to selected partner and open-source models, eliminating the need to provision or manage infrastructure.
Model Garden is a centralized library of AI and ML models from Google, Google Partners, and open models (open-weight and open-source), including MaaS models. Model Garden provides multiple ways to deploy available models on Vertex AI, including models from Hugging Face.
For more information about MaaS, see the partner models documentation.
Before you begin
To use MaaS models, you must enable the Vertex AI API in your Google Cloud project.
```
gcloud services enable aiplatform.googleapis.com
```
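Optionally, you can confirm that the API is active by listing the enabled services for your project. This check is not part of the original steps; it assumes the gcloud CLI is authenticated and configured for the same project.

```
gcloud services list --enabled --filter="aiplatform.googleapis.com"
```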
Enable the model's API

Before you can use a MaaS model, you must enable its API. To do this, go to the model page in Model Garden. Some models available through MaaS are also available for self-deployment. The Model Garden model cards for the two offerings differ; the MaaS model card includes API Service in its name.
Call the model using the Google Gen AI SDK for Python
The following example calls the Llama 3.3 model using the Google Gen AI SDK for Python.
```python
from google import genai
from google.genai import types

PROJECT_ID = "PROJECT_ID"
LOCATION = "LOCATION"
MODEL = "meta/llama-3.3-70b-instruct-maas"  # The model ID from Model Garden with "API Service"

# Define the prompt to send to the model.
prompt = "What is the distance between earth and moon?"

# Initialize the Google Gen AI SDK client.
client = genai.Client(
    vertexai=True,
    project=PROJECT_ID,
    location=LOCATION,
)

# Prepare the content for the chat.
contents: types.ContentListUnion = [
    types.Content(role="user", parts=[types.Part.from_text(text=prompt)])
]

# Configure generation parameters.
generate_content_config = types.GenerateContentConfig(
    temperature=0,
    top_p=0,
    max_output_tokens=4096,
)

try:
    # Create a chat instance with the specified model and generation settings.
    chat = client.chats.create(model=MODEL, config=generate_content_config)

    # Send the message and print the response.
    response = chat.send_message(contents)
    print(response.text)
except Exception as e:
    print(f"{MODEL} call failed due to {e}")
```
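The chat-based call above returns the complete response in one shot. As a variation, the following minimal sketch makes the same request as a single-turn streaming call with the Gen AI SDK's client.models.generate_content_stream method. The project, location, and generation settings shown here are placeholder assumptions rather than values from the sample above.

```python
from google import genai
from google.genai import types

# Placeholder values; replace with your own project and region.
PROJECT_ID = "PROJECT_ID"
LOCATION = "LOCATION"
MODEL = "meta/llama-3.3-70b-instruct-maas"

client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)

# Stream the reply chunk by chunk instead of waiting for the full response.
for chunk in client.models.generate_content_stream(
    model=MODEL,
    contents="What is the distance between earth and moon?",
    config=types.GenerateContentConfig(temperature=0, max_output_tokens=1024),
):
    # Some chunks may carry no text, so guard the print.
    print(chunk.text or "", end="")
print()
```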
What's next

- Choose an open model serving option
- Deploy open models from Model Garden
- Deploy open models with prebuilt containers
- Deploy open models with a custom vLLM container