Gemma (language model)

From Wikipedia, the free encyclopedia
Family of lightweight open models by Google
Gemma
Developer: Google DeepMind
Initial release: February 21, 2024[1]
Stable release: Gemma 3 / March 12, 2025[2]
Type: Large language model
License: Gemma License
Website: deepmind.google/models/gemma/

Gemma is a series of open-source large language models developed by Google DeepMind. It is based on technology similar to Gemini. The first version was released in February 2024, followed by Gemma 2 in June 2024 and Gemma 3 in March 2025. Variants of Gemma have also been developed, such as the vision-language model PaliGemma and the model DolphinGemma for understanding dolphin communication.

History


In February 2024, Google debuted Gemma, a family of free and open-source LLMs that serve as a lightweight version of Gemini. They came in two sizes, with two billion and seven billion parameters, respectively. Multiple publications viewed this as a response to Meta and others open-sourcing their AI models, and a stark reversal from Google's longstanding practice of keeping its AI proprietary.[3][4][5]

Gemma 2 was released on June 27, 2024,[6] and Gemma 3 was released on March 12, 2025.[2][7]

Overview


Based on technology similar to the Gemini series of models, Gemma is described by Google as helping support its mission of "making AI helpful for everyone."[8] Google offers official Gemma variants optimized for specific use cases, such as MedGemma for medical analysis and DolphinGemma for studying dolphin communication.[9]

Since its release, Gemma models have been downloaded more than 150 million times, with 70,000 variants available on Hugging Face.[10]

The latest generation of models is Gemma 3, offered in 1, 4, 12, and 27 billion parameter sizes with support for over 140 languages. As multimodal models, they support both text and image input.[11] Google also offers Gemma 3n, smaller models optimized for execution on consumer devices like phones, laptops, and tablets.[12]
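As an informal illustration of how these models are commonly run, the sketch below uses the Hugging Face transformers library. The checkpoint name google/gemma-3-4b-it and the multimodal pipeline task are assumptions based on the naming conventions used for Gemma 3 releases on Hugging Face; downloading the weights also requires accepting the Gemma license there.

```python
# Minimal sketch of running a Gemma 3 instruction-tuned model via the
# Hugging Face transformers library. The checkpoint name
# "google/gemma-3-4b-it" is assumed from Hugging Face naming conventions.
from transformers import pipeline

# Gemma 3 models of 4B parameters and up accept both text and images,
# so the multimodal "image-text-to-text" pipeline task is used here.
generator = pipeline("image-text-to-text", model="google/gemma-3-4b-it")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

output = generator(text=messages, max_new_tokens=64)
# Output format assumed from the transformers chat-pipeline convention:
# the generated chat's final turn holds the model's reply.
print(output[0]["generated_text"][-1]["content"])
```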

Architecture


The latest version of Gemma, Gemma 3, is based on a decoder-only transformer architecture with grouped-query attention (GQA) and the SigLIP vision encoder. Every model has a context length of 128K, with the exception of Gemma 3 1B, which has a context length of 32K.[13]
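The core idea of grouped-query attention can be illustrated in a few lines: several query heads share each key/value head, shrinking the KV cache relative to standard multi-head attention. The dimensions below are toy values chosen for the example, not Gemma 3's actual configuration.

```python
# Illustrative sketch of grouped-query attention (GQA). Toy dimensions,
# not Gemma's published configuration.
import torch
import torch.nn.functional as F

batch, seq_len = 2, 16
num_q_heads, num_kv_heads, head_dim = 8, 2, 32  # 4 query heads per KV head

q = torch.randn(batch, num_q_heads, seq_len, head_dim)
k = torch.randn(batch, num_kv_heads, seq_len, head_dim)  # fewer KV heads:
v = torch.randn(batch, num_kv_heads, seq_len, head_dim)  # smaller KV cache

# Expand each KV head so a group of query heads attends to shared keys/values.
group = num_q_heads // num_kv_heads
k = k.repeat_interleave(group, dim=1)  # -> (batch, num_q_heads, seq, dim)
v = v.repeat_interleave(group, dim=1)

scores = q @ k.transpose(-2, -1) / head_dim**0.5
out = F.softmax(scores, dim=-1) @ v
print(out.shape)  # torch.Size([2, 8, 16, 32])
```

Only the key/value tensors (here 2 heads instead of 8) need to be cached during generation, which is where the memory saving comes from.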

Quantized versions fine-tuned using quantization-aware training (QAT) are also available,[13] offering sizable memory usage improvements with some negative impact on accuracy and precision.[14]
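The central mechanism of quantization-aware training can be sketched briefly: the forward pass simulates low-precision rounding ("fake quantization") so the network learns to tolerate it, while a straight-through estimator lets gradients bypass the non-differentiable rounding step. This is a generic textbook-style illustration, not Google's published QAT recipe for Gemma.

```python
# Toy sketch of fake quantization with a straight-through estimator (STE),
# the core mechanism behind quantization-aware training. Generic
# illustration only; not Gemma's actual QAT procedure.
import torch

def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    w_q = torch.round(w / scale).clamp(-qmax - 1, qmax) * scale
    # STE: the forward value is w_q, but the backward pass sees identity,
    # so gradients flow to the full-precision weights unchanged.
    return w + (w_q - w).detach()

w = torch.randn(4, 4, requires_grad=True)
y = fake_quantize(w).sum()
y.backward()
print(w.grad)  # all ones: rounding was bypassed in the backward pass
```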

Variants


Google develops official variants of Gemma models designed for specific purposes, like medical analysis or programming. These include:

  • ShieldGemma 2 (4B): Based on the Gemma 3 family, ShieldGemma is designed to identify and filter violent, dangerous, and sexually explicit images.[15]
  • MedGemma (4B and 27B): Also based on Gemma 3, MedGemma is designed for medical applications like image analysis. However, Google also notes that MedGemma "isn't yet clinical grade."[16]
  • DolphinGemma (roughly 400M): Developed in collaboration with researchers at Georgia Tech and the Wild Dolphin Project, DolphinGemma aims to better understand dolphin communication through audio analysis.[17][18]
  • CodeGemma (2B and 7B): CodeGemma is a group of models designed for code completion as well as general coding use.[19] It supports multiple programming languages, including Python, Java, and C++.[20] A brief usage sketch follows this list.
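As referenced above, the following is a hedged sketch of CodeGemma's fill-in-the-middle completion. The checkpoint name google/codegemma-2b and the FIM control tokens are assumptions based on the conventions documented for the code-completion variant, and should be checked against the model card.

```python
# Hedged sketch of fill-in-the-middle (FIM) completion with CodeGemma.
# Checkpoint name and FIM token spellings assumed from the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The model is asked to fill in the code between the prefix and suffix.
prompt = "<|fim_prefix|>def mean(xs):\n    return <|fim_suffix|>\n<|fim_middle|>"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)

# Decode only the newly generated tokens (the completion itself).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))
```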
Technical specifications of Gemma models

Generation | Release date | Parameters | Context length | Multimodal | Notes
Gemma 1 | 21 February 2024 | 2B, 7B | 8,192 | No | 2B distilled from 7B. 2B uses multi-query attention while 7B uses multi-head attention.
CodeGemma | | 2B, 7B | 8,192 | No | Gemma 1 fine-tuned for code generation.
RecurrentGemma | 11 April 2024 | 2B, 9B | Unlimited (trained on 8,192) | No | Griffin-based instead of Transformer-based.[21]
Gemma 2 | 27 June 2024 | 2B, 9B, 27B | 8,192 | No | 27B trained on web documents, code, and science articles. 9B distilled from 27B. 2B distilled from an unreleased 7B model. Uses grouped-query attention.[22]
PaliGemma | 10 July 2024 | 3B | 8,192 | Image | A vision-language model that takes text and image inputs and outputs text, made by connecting a SigLIP-So400m image encoder with Gemma v1.0 2B.[23][24]
PaliGemma 2 | 4 December 2024 | 3B, 10B, 28B | 8,192 | Image | Made by pairing SigLIP-So400m with Gemma v2.0 2B, 9B, and 27B. Capable of more vision-language tasks.[25][26]
Gemma 3 | 12 March 2025 | 1B, 4B, 12B, 27B | 131,072 | Image | All models trained with distillation. Post-training focuses on math, coding, chat, instruction following, and multilingual ability (140 languages supported). Capable of function calling. 1B is not capable of vision.[27]

Note: open-weight models can have their context length rescaled at inference time. For Gemma 1, Gemma 2, PaliGemma, and PaliGemma 2, the cost is KV-cache size growing linearly with the context window. Gemma 3 has an improved growth curve because it separates local and global attention layers. RecurrentGemma's memory use stays constant beyond 2,048 tokens.
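That linear growth can be made concrete with a back-of-the-envelope calculation: KV-cache size is the product of the number of layers, KV heads, head dimension, context length, and bytes per value, doubled for keys and values. The layer, head, and dimension values below are hypothetical placeholders, not a published Gemma configuration.

```python
# Back-of-the-envelope KV-cache sizing, illustrating the linear growth in
# context length noted above. All configuration values are illustrative
# placeholders, not the published Gemma architecture.
def kv_cache_bytes(context_len: int, layers: int = 32, kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    # 2x for keys and values; bf16/fp16 stores 2 bytes per value.
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_value

for ctx in (8_192, 131_072):
    print(f"{ctx:>7} tokens -> {kv_cache_bytes(ctx) / 2**30:.1f} GiB")
# Output:  8192 tokens -> 1.0 GiB
#        131072 tokens -> 16.0 GiB
```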

References

  1. ^ Banks, Jeanine; Warkentin, Tris (February 21, 2024). "Gemma: Introducing new state-of-the-art open models". The Keyword. Retrieved August 16, 2025.
  2. ^ a b "Introducing Gemma 3: The most capable model you can run on a single GPU or TPU". The Keyword. March 12, 2025.
  3. ^ Khan, Jeremy (February 21, 2024). "Google unveils new family of open-source AI models called Gemma to take on Meta and others—deciding open-source AI ain't so bad after all". Fast Company. Archived from the original on February 21, 2024. Retrieved February 21, 2024.
  4. ^ Alba, Davey (February 21, 2024). "Google Delves Deeper Into Open Source with Launch of Gemma AI Model". Bloomberg News. Archived from the original on February 21, 2024. Retrieved February 21, 2024.
  5. ^ Metz, Cade; Grant, Nico (February 21, 2024). "Google Is Giving Away Some of the A.I. That Powers Chatbots". The New York Times. ISSN 0362-4331. Archived from the original on February 21, 2024. Retrieved February 21, 2024.
  6. ^ "Gemma 2 is now available to researchers and developers". Google. June 27, 2024. Retrieved August 15, 2024.
  7. ^ "Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM". Hugging Face. March 12, 2025.
  8. ^ Banks, Jeanine; Warkentin, Tris (February 21, 2024). "Gemma: Introducing new state-of-the-art open models". The Keyword. Retrieved July 13, 2025.
  9. ^ "Gemma - Google DeepMind". Google DeepMind. Retrieved July 13, 2025.
  10. ^ Wiggers, Kyle (May 12, 2025). "Google's Gemma AI models surpass 150M downloads". TechCrunch. Retrieved July 13, 2025.
  11. ^ Gosthipaty, Aritra; merve; Cuenca, Pedro; Srivastav, Vaibhav (March 12, 2025). "Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM". Hugging Face. Retrieved July 13, 2025.
  12. ^ "Gemma 3n model overview". Google AI for Developers. Retrieved July 13, 2025.
  13. ^ a b Gemma Team (2025). "Gemma 3 Technical Report". arXiv:2503.19786v1 [cs.CL].
  14. ^ Clark, Bryan (May 15, 2025). "What is quantization aware training?". IBM. Retrieved July 14, 2025.
  15. ^ ShieldGemma Team (2025). "ShieldGemma 2: Robust and Tractable Image Content Moderation". arXiv:2504.01081 [cs.CV].
  16. ^ "MedGemma". Google Health AI Developer Foundations. Retrieved July 15, 2025.
  17. ^ "DolphinGemma: How Google AI is helping decode dolphin communication". Georgia Tech. Retrieved July 15, 2025.
  18. ^ Herzing, Denise; Starner, Thad (April 14, 2025). "DolphinGemma: How Google AI is helping decode dolphin communication". The Keyword. Retrieved July 15, 2025.
  19. ^ Irwin, Kate (April 10, 2024). "Google Launches Coding AIs That Could Rival Microsoft's GitHub Copilot". PCMag. Retrieved July 15, 2025.
  20. ^ "CodeGemma". Google AI for Developers. Retrieved July 15, 2025.
  21. ^ "RecurrentGemma: Moving Past Transformers for Efficient Open Language Models". arXiv.
  22. ^ Gemma Team; Riviere, Morgane; Pathak, Shreya; Sessa, Pier Giuseppe; Hardin, Cassidy; Bhupatiraju, Surya; Hussenot, Léonard; Mesnard, Thomas; Shahriari, Bobak (August 2, 2024). Gemma 2: Improving Open Language Models at a Practical Size. arXiv:2408.00118.
  23. ^ "PaLI: Scaling Language-Image Learning in 100+ Languages". research.google. Retrieved August 15, 2024.
  24. ^ "PaliGemma: A versatile 3B VLM for transfer". arXiv. July 10, 2024.
  25. ^ "Introducing PaliGemma 2 mix: A vision-language model for multiple tasks". Google Developers Blog. Retrieved February 22, 2025.
  26. ^ "PaliGemma 2: A Family of Versatile VLMs for Transfer". arXiv.
  27. ^ "Gemma 3 Technical Report". arXiv.
