Overview of self-deployed models

Model Garden lets you self-deploy and serve open, partner, and custom models on Vertex AI. Unlike model-as-a-service (MaaS) offerings, which are serverless and don't require manual deployment, self-deployed models run securely within your Google Cloud project and VPC network, giving you full control over the deployment environment.

Self-deploy open models

Open models provide pretrained capabilities for various AI tasks, including Gemma models that excel in multimodal processing. These models are freely available for use, and you are free to publish their outputs as long as you adhere to their licensing terms. Vertex AI offers both open weight and open source models.

When you use an open model with Vertex AI, you use Vertex AI for your infrastructure. You can also use open models with other infrastructure products, such as PyTorch or JAX.

Open weight models

Many open models are considered open weight large language models (LLMs). Open weight models provide more transparency than models that aren't open weight. A model's weights are the numerical values stored in the model's neural network architecture that represent learned patterns and relationships from the data a model is trained on. The pretrained parameters, or weights, of open weight models are released. You can use an open weight model for inference and tuning. Details such as the original dataset, model architecture, and training code aren't always provided.
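As a rough illustration of what "weights" means here, the following sketch treats a model as nothing more than a small table of numbers. It is a toy example, not how Vertex AI or any real LLM is implemented: the function names, the weight values, the target, and the learning rate are all made up for illustration. It shows the two uses named above: inference (the released numbers stay fixed) and tuning (a copy of the numbers is adjusted toward new data).

```python
# Toy illustration only: a one-layer "model" whose weights are plain numbers.
# All names and values below are hypothetical and come from no real model.

def linear_forward(weights, bias, inputs):
    """Return one output per weight row: dot(row, inputs) + bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]

def tune_step(weights, bias, inputs, targets, learning_rate):
    """One gradient-descent step on squared error; returns new weights and bias."""
    outputs = linear_forward(weights, bias, inputs)
    errors = [y - t for y, t in zip(outputs, targets)]
    new_weights = [
        [w - learning_rate * err * x for w, x in zip(row, inputs)]
        for row, err in zip(weights, errors)
    ]
    new_bias = [b - learning_rate * err for b, err in zip(bias, errors)]
    return new_weights, new_bias

# "Released" pretrained weights: 2 outputs, 3 inputs (made-up values).
released_weights = [
    [0.5, -1.0, 2.0],
    [1.5, 0.0, -0.5],
]
released_bias = [0.0, 1.0]
example_input = [1.0, 2.0, 3.0]

# Inference: the released weights are used as-is.
outputs = linear_forward(released_weights, released_bias, example_input)
print(outputs)  # [4.5, 1.0]

# Tuning: adjust a copy of the weights toward a new target.
example_target = [4.0, 1.0]
tuned_weights, tuned_bias = tune_step(
    released_weights, released_bias, example_input, example_target, learning_rate=0.05
)
tuned_outputs = linear_forward(tuned_weights, tuned_bias, example_input)
print(tuned_outputs)  # first output moves closer to 4.0
```

Note that the sketch needs only the numbers and a forward function. Nothing about the original training data or training code is required, which is the sense in which an open weight model can be used without those details being published.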

Open source models

Open models differ from open source AI models. While open models often expose the weights and the core numerical representation of learned patterns, they don't necessarily provide the full source code or training details. Open source models, on the other hand, typically make the entire codebase, including training scripts and data, publicly available. Providing weights offers a level of AI model transparency, allowing you to understand the model's capabilities without needing to build it yourself.

Self-deployed partner models

Model Garden helps you purchase and manage model licenses from partners who offer proprietary models as a self-deploy option. You can get access to these models through Cloud Marketplace. After you have a license, you can choose to deploy on on-demand hardware or use your existing Compute Engine reservations and committed use discounts to manage costs. With self-deployed partner models, you are billed for both the model usage and the underlying Vertex AI infrastructure consumed.

To request usage of a self-deployed partner model:

Navigate to the Model Garden console.
Find the relevant partner model.
Click Enable and complete the provided form to get the necessary commercial use licenses.

For more information about deploying and using partner models, see Deploy a partner model and make prediction requests.

Considerations

When using self-deployed partner models, keep the following in mind:

Weight Export: Unlike with some open models, you cannot export the weights of self-deployed partner models.
Endpoint Type: Only the shared public endpoint type is supported for these deployments.

Note: Support for model-specific issues is provided directly by the partner. To contact a partner for model performance or other related issues, use the contact details found in the "Support" section of their Model Garden model card.

Learn more about self-deployed models in Vertex AI

To learn more about custom weights, see Deploy models with custom weights.
For more information about Model Garden, see Overview of Model Garden.
For more information about deploying models, see Use models in Model Garden.
Use Gemma open models
Use Llama open models
Use Hugging Face open models

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.
