Use Gemma open models
Gemma is a set of lightweight, generative artificial intelligence (AI) open models. Gemma models are available to run in your applications and on your hardware, mobile devices, or hosted services. You can also customize these models using tuning techniques so that they excel at performing tasks that matter to you and your users. Gemma models are based on Gemini models and are intended for the AI development community to extend and take further.
Fine-tuning can help improve a model's performance on specific tasks. Because models in the Gemma model family are open weight, you can tune any of them using the AI framework of your choice and the Vertex AI SDK. You can open a notebook example to fine-tune a Gemma model using a link available on the Gemma model card in Model Garden.
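For a sense of what tuning looks like outside the notebooks, the following is a minimal LoRA fine-tuning sketch using the Hugging Face transformers and peft libraries. The model checkpoint, dataset, and hyperparameters are illustrative assumptions, not values prescribed by Model Garden; the linked notebooks cover the supported end-to-end workflow.

```python
# Minimal LoRA fine-tuning sketch (assumptions: the model ID, dataset, and
# hyperparameters are illustrative, not Model Garden defaults).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "google/gemma-2-2b"  # assumed checkpoint; any Gemma variant works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Attach low-rank adapters so only a small fraction of weights are trained.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Tokenize a tiny public dataset as stand-in training data.
data = load_dataset("Abirate/english_quotes", split="train[:100]")
data = data.map(lambda b: tokenizer(b["quote"], truncation=True, max_length=128),
                batched=True, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments("gemma-lora-out", per_device_train_batch_size=1,
                           num_train_epochs=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```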
The following Gemma models are available to use with Vertex AI. To learn more about and test the Gemma models, see their Model Garden model cards.
| Model name | Use cases | Model Garden model card |
|---|---|---|
| Gemma 3n | Capable of multimodal input, handling text, image, video, and audio input, and generating text outputs. | Go to the Gemma 3n model card |
| Gemma 3 | Best for text generation and image understanding tasks, including question answering, summarization, and reasoning. | Go to the Gemma 3 model card |
| Gemma 2 | Best for text generation, summarization, and extraction. | Go to the Gemma 2 model card |
| Gemma | Best for text generation, summarization, and extraction. | Go to the Gemma model card |
| CodeGemma | Best for code generation and completion. | Go to the CodeGemma model card |
| PaliGemma 2 | Best for image captioning and visual question answering tasks. | Go to the PaliGemma 2 model card |
| PaliGemma | Best for image captioning and visual question answering tasks. | Go to the PaliGemma model card |
| ShieldGemma 2 | Checks the safety of synthetic and natural images to help you build robust datasets and models. | Go to the ShieldGemma 2 model card |
| TxGemma | Best for therapeutic prediction tasks, including classification, regression, or generation, and reasoning tasks. | Go to the TxGemma model card |
| MedGemma | Gemma 3 variants that are trained for performance on medical text and image comprehension. | Go to the MedGemma model card |
| MedSigLIP | SigLIP variant that is trained to encode medical images and text into a common embedding space. | Go to the MedSigLIP model card |
| T5Gemma | Well-suited for a variety of generative tasks, including question answering, summarization, and reasoning. | Go to the T5Gemma model card |
The following are some options for where you can use Gemma:
Use Gemma with Vertex AI
Vertex AI offers a managed platform for rapidly building and scaling machine learning projects without needing in-house MLOps expertise. You can use Vertex AI as the downstream application that serves the Gemma models. For example, you might port weights from the Keras implementation of Gemma. Next, you can use Vertex AI to serve that version of Gemma to get predictions. We recommend using Vertex AI if you want end-to-end MLOps capabilities, value-added ML features, and a serverless experience for streamlined development.
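As a rough sketch of that flow, the snippet below uploads exported model artifacts and deploys them to a Vertex AI endpoint with the google-cloud-aiplatform SDK. The project ID, bucket path, serving container image, and machine type are placeholders to replace with your own values; this is an outline under those assumptions, not a complete serving recipe.

```python
# Hedged sketch: upload ported Gemma artifacts and serve them on Vertex AI.
# The project, bucket, container image, and machine type are assumptions.
from google.cloud import aiplatform

aiplatform.init(project="your-project-id", location="us-central1")

# Register the exported model (for example, weights ported from the Keras
# implementation) together with a serving container that can load them.
model = aiplatform.Model.upload(
    display_name="gemma-custom",
    artifact_uri="gs://your-bucket/gemma/exported-model",  # assumed path
    serving_container_image_uri=(
        "your-region-docker.pkg.dev/your-project/serving/gemma:latest"
    ),  # assumed image
)

# Deploy to a managed endpoint and request a prediction.
endpoint = model.deploy(
    machine_type="g2-standard-12",  # G2 machines pair with NVIDIA L4 GPUs
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)
response = endpoint.predict(instances=[{"prompt": "Why is the sky blue?"}])
print(response.predictions)
```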
To get started with Gemma, see the following notebooks:
- Fine-tune Gemma 3 using PEFT and then deploy to Vertex AI from Vertex
- Fine-tune Gemma 2 using PEFT and then deploy to Vertex AI from Vertex
- Fine-tune Gemma using PEFT and then deploy to Vertex AI from Vertex
- Fine-tune Gemma using PEFT and then deploy to Vertex AI from Hugging Face
- Fine-tune Gemma with Ray on Vertex AI and then deploy to Vertex AI
- Run local inference with ShieldGemma 2 with Hugging Face transformers
- Run local inference with T5Gemma with Hugging Face transformers (a minimal transformers sketch follows this list)
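The local-inference sketch referenced in the last item, assuming the transformers library and an instruction-tuned Gemma 2 checkpoint as an illustrative stand-in for the models the notebooks use:

```python
# Minimal local inference with Hugging Face transformers; the checkpoint
# choice is an assumption, not the notebooks' exact configuration.
from transformers import pipeline

generator = pipeline("text-generation", model="google/gemma-2-2b-it")
result = generator("Summarize what open-weight models are in one sentence.",
                   max_new_tokens=64)
print(result[0]["generated_text"])
```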
Use Gemma in other Google Cloud products
You can use Gemma with other Google Cloud products, such as Google Kubernetes Engine and Dataflow.
Use Gemma with GKE
Google Kubernetes Engine (GKE) is the Google Cloud solution for managed Kubernetes that provides scalability, security, resilience, and cost effectiveness. We recommend this option if you have existing Kubernetes investments, your organization has in-house MLOps expertise, or you need granular control over complex AI/ML workloads with unique security, data pipeline, and resource management requirements. To learn more, see the following tutorials in the GKE documentation:
- Serve Gemma with vLLM
- Serve Gemma with TGI
- Serve Gemma with Triton and TensorRT-LLM
- Serve Gemma with JetStream
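The tutorials above put these servers behind GKE endpoints. For orientation, here is a hedged local sketch of the vLLM engine that the first tutorial wraps; the model choice and sampling settings are illustrative assumptions, and all cluster configuration is omitted.

```python
# Local sketch of the vLLM engine that the GKE tutorial serves at scale;
# the model and sampling settings are assumptions.
from vllm import LLM, SamplingParams

llm = LLM(model="google/gemma-2-9b-it")  # loads weights onto local GPUs
params = SamplingParams(temperature=0.7, max_tokens=128)

# vLLM batches prompts and manages KV-cache memory automatically.
for out in llm.generate(["Explain Kubernetes in two sentences."], params):
    print(out.outputs[0].text)
```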
Use Gemma with Dataflow
You can use Gemma models with Dataflow for tasks such as sentiment analysis by running inference pipelines that use the Gemma models. To learn more, see Run inference pipelines with Gemma open models.
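A hedged sketch of that pattern using Apache Beam's RunInference transform with its Hugging Face pipeline handler; the model ID and the sentiment-style prompt are illustrative assumptions, and the pipeline runs on Dataflow when launched with DataflowRunner options. See the linked guide for the supported setup.

```python
# Sketch of a Beam inference pipeline for Gemma; the model ID and prompt are
# assumptions, and running on Dataflow requires DataflowRunner options.
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.huggingface_inference import (
    HuggingFacePipelineModelHandler)

handler = HuggingFacePipelineModelHandler(
    task="text-generation",
    model="google/gemma-2-2b-it",
)

with beam.Pipeline() as p:
    _ = (
        p
        | beam.Create(["Review: Loved the battery life. Sentiment:"])
        | RunInference(handler)  # each output element is a PredictionResult
        | beam.Map(print)
    )
```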
Use Gemma with Colab
You can use Gemma with Colaboratory to create your Gemma solution. In Colab, you can use Gemma with framework options such as PyTorch and JAX. To learn more, see:
- Get started with Gemma using Keras.
- Get started with Gemma using PyTorch.
- Basic tuning with Gemma using Keras.
- Distributed tuning with Gemma using Keras.
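A minimal generation sketch along the lines of the Keras quickstart above, assuming the keras_nlp library and Kaggle access to the Gemma weights; the preset name is an assumption:

```python
# Minimal Keras generation sketch; the preset name is an assumption and
# downloading the weights requires Kaggle access to Gemma.
import keras_nlp

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
print(gemma_lm.generate("What are open-weight models?", max_length=64))
```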
Gemma model sizes and capabilities
Gemma models are available in several sizes so you can build generative AI solutions based on your available computing resources, the capabilities you need, and where you want to run them. Each model is available in a tuned and an untuned version:
- Pretrained - This version of the model wasn't trained on any specific tasks or instructions beyond the Gemma core data training set. We don't recommend using this model without performing some tuning.
- Instruction-tuned - This version of the model was trained with human language interactions so that it can participate in a conversation, similar to a basic chat bot.
- Mix fine-tuned - This version of the model is fine-tuned on a mixture of academic datasets and accepts natural language prompts.
Lower parameter sizes mean lower resource requirements and more deployment flexibility.
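As a back-of-the-envelope guide to the sizes below, the weights alone need roughly two bytes per parameter at bfloat16 precision; activations and the KV cache add more. A quick sketch of that arithmetic:

```python
# Rough weight-memory estimate: ~2 bytes per parameter at bfloat16,
# ignoring activation and KV-cache overhead (which add to the footprint).
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    # 1e9 params/billion * bytes / 1e9 bytes/GB cancels out.
    return params_billions * bytes_per_param

for name, size_b in [("1B", 1.0), ("4B", 4.0), ("12B", 12.0), ("27B", 27.0)]:
    print(f"{name}: ~{weight_memory_gb(size_b):.0f} GB of weights")
```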
| Model name | Parameter size | Input | Output | Tuned versions | Intended platforms |
|---|---|---|---|---|---|
| **Gemma 3n** | | | | | |
| Gemma 3n E4B | 4 billion effective parameters | Text, image and audio | Text | | Mobile devices and laptops |
| Gemma 3n E2B | 2 billion effective parameters | Text, image and audio | Text | | Mobile devices and laptops |
| **Gemma 3** | | | | | |
| Gemma 27B | 27 billion | Text and image | Text | | Large servers or server clusters |
| Gemma 12B | 12 billion | Text and image | Text | | Higher-end desktop computers and servers |
| Gemma 4B | 4 billion | Text and image | Text | | Desktop computers and small servers |
| Gemma 1B | 1 billion | Text | Text | | Mobile devices and laptops |
| **Gemma 2** | | | | | |
| Gemma 27B | 27 billion | Text | Text | | Large servers or server clusters |
| Gemma 9B | 9 billion | Text | Text | | Higher-end desktop computers and servers |
| Gemma 2B | 2 billion | Text | Text | | Mobile devices and laptops |
| **Gemma** | | | | | |
| Gemma 7B | 7 billion | Text | Text | | Desktop computers and small servers |
| Gemma 2B | 2.2 billion | Text | Text | | Mobile devices and laptops |
| **CodeGemma** | | | | | |
| CodeGemma 7B | 7 billion | Text | Text | | Desktop computers and small servers |
| CodeGemma 2B | 2 billion | Text | Text | | Desktop computers and small servers |
| **PaliGemma 2** | | | | | |
| PaliGemma 28B | 28 billion | Text and image | Text | | Large servers or server clusters |
| PaliGemma 10B | 10 billion | Text and image | Text | | Higher-end desktop computers and servers |
| PaliGemma 3B | 3 billion | Text and image | Text | | Desktop computers and small servers |
| **PaliGemma** | | | | | |
| PaliGemma 3B | 3 billion | Text and image | Text | | Desktop computers and small servers |
| **ShieldGemma 2** | | | | | |
| ShieldGemma 2 | 4 billion | Text and image | Text | | Desktop computers and small servers |
| **TxGemma** | | | | | |
| TxGemma 27B | 27 billion | Text | Text | | Large servers or server clusters |
| TxGemma 9B | 9 billion | Text | Text | | Higher-end desktop computers and servers |
| TxGemma 2B | 2 billion | Text | Text | | Mobile devices and laptops |
| **MedGemma** | | | | | |
| MedGemma 27B | 27 billion | Text and image | Text | | Large servers or server clusters |
| MedGemma 4B | 4 billion | Text and image | Text | | Desktop computers and small servers |
| **MedSigLIP** | | | | | |
| MedSigLIP | 800 million | Text and image | Embedding | | Mobile devices and laptops |
| **T5Gemma** | | | | | |
| T5Gemma 9B-9B | 18 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma 9B-2B | 11 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma 2B-2B | 4 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma XL-XL | 4 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma M-L | 2 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma L-L | 1 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma B-B | 0.6 billion | Text | Text | | Mobile devices and laptops |
| T5Gemma S-S | 0.3 billion | Text | Text | | Mobile devices and laptops |
Gemma has been tested using Google's purpose-built v5e TPU hardware and NVIDIA's L4 (G2 Standard), A100 (A2 Standard), and H100 (A3 High) GPU hardware.