huggingface/huggingface-gemma-recipes

Inference, Fine Tuning and many more recipes with Gemma family of models

License

MIT license

Hugging Face Gemma Recipes

🤗💎 Welcome! This repository contains minimal recipes to get started quickly with the Gemma family of models.

Note

Gemma 3n Conversational Fine tuning 2B on a Free Colab Notebook:

Gemma 3n Conversational Fine tuning 4B on a Free Colab Notebook:

Gemma 3n Multimodal Finetuning 2B/4B on a Free Colab Notebook:

Multimodal inference using Gemma 3n via pipeline:

Getting Started

To run a Gemma 💎 model on your machine, whether for inference or fine-tuning, install the latest versions of timm (for the vision encoder) and 🤗 transformers:

$ pip install -U -q transformers timm
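After installing, it can help to confirm which versions were picked up. The sketch below uses only the standard library; `installed_versions` is an illustrative helper name, not part of this repository, and `torch` is included on the assumption that it is installed separately.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Report the packages the examples in this README rely on.
for name, ver in installed_versions(["transformers", "timm", "torch"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

A package reported as "not installed" needs to be added before the inference examples will run.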

Inference with pipeline

The easiest way to start using Gemma 3n is by using the pipeline abstraction in transformers:

    import torch
    from transformers import pipeline

    pipe = pipeline(
        "image-text-to-text",
        model="google/gemma-3n-E4B-it",  # or "google/gemma-3n-E2B-it"
        device="cuda",
        torch_dtype=torch.bfloat16,
    )

    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/airplane.jpg"},
                {"type": "text", "text": "Describe this image"},
            ],
        }
    ]

    output = pipe(text=messages, max_new_tokens=32)
    print(output[0]["generated_text"][-1]["content"])
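The last line of the example indexes into the pipeline result. That indexing implies each result holds the conversation under `generated_text`, with the model's reply as the final message. The sketch below repeats the same indexing on a hand-written stand-in so the shape is easier to see; `last_reply` and `sample_output` are illustrative names, and the reply text is a placeholder, not real model output.

```python
def last_reply(output):
    """Return the content of the final message in the first result."""
    conversation = output[0]["generated_text"]
    return conversation[-1]["content"]


# Hand-written stand-in that mimics the shape the example indexes into.
sample_output = [
    {
        "generated_text": [
            {"role": "user", "content": [{"type": "text", "text": "Describe this image"}]},
            {"role": "assistant", "content": "placeholder reply"},
        ]
    }
]

print(last_reply(sample_output))
```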

Detailed inference with transformers

Initialize the model and the processor from the Hub, and write a model_generation function that processes the prompt and runs inference on the model.

    from transformers import AutoProcessor, AutoModelForImageTextToText
    import torch

    model_id = "google/gemma-3n-e4b-it"  # or "google/gemma-3n-e2b-it"
    # Use a GPU when one is available, otherwise fall back to CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(model_id).to(device)

    def model_generation(model, messages):
        # Apply the chat template and tokenize the messages into model inputs.
        inputs = processor.apply_chat_template(
            messages,
            add_generation_prompt=True,
            tokenize=True,
            return_dict=True,
            return_tensors="pt",
        )
        input_len = inputs["input_ids"].shape[-1]
        inputs = inputs.to(model.device, dtype=model.dtype)

        with torch.inference_mode():
            generation = model.generate(**inputs, max_new_tokens=32, disable_compile=False)
            generation = generation[:, input_len:]  # keep only the newly generated tokens

        decoded = processor.batch_decode(generation, skip_special_tokens=True)
        print(decoded[0])
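The function slices the generated sequences at the prompt length so that only the new tokens are decoded, since the sequences start with the prompt tokens. The sketch below shows the same slicing with plain Python lists standing in for tensors; `strip_prompt` is an illustrative name and the token ids are made up.

```python
def strip_prompt(sequences, input_len):
    """Drop the first input_len token ids from every generated sequence."""
    return [seq[input_len:] for seq in sequences]


prompt_ids = [2, 105, 2364]                 # made-up prompt token ids
generated = [prompt_ids + [818, 5279, 1]]   # prompt followed by new tokens

new_tokens = strip_prompt(generated, len(prompt_ids))
print(new_tokens)  # [[818, 5279, 1]]
```

Without this step, decoding would repeat the prompt text in front of the model's answer.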

Then call it with the modality you need:

Text only

    # Text Only
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is the capital of France?"}
            ]
        }
    ]
    model_generation(model, messages)

Interleaved with Audio

    # Interleaved with Audio
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe the following speech segment in English:"},
                {"type": "audio", "audio": "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/speech.wav"},
            ]
        }
    ]
    model_generation(model, messages)

Interleaved with Image/Video

    # Interleaved with Image
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/airplane.jpg"},
                {"type": "text", "text": "Describe this image."}
            ]
        }
    ]
    model_generation(model, messages)
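The three message lists above share one layout: a single user turn whose content is an ordered list of typed parts, each keyed by its own type name. A small helper can build that layout and reduce repetition. This is only a sketch; `user_message` is a hypothetical name, not part of transformers or this repository, and it covers only the three part types shown above.

```python
ALLOWED_TYPES = ("text", "image", "audio")


def user_message(*parts):
    """Build a single-turn message list from (type, value) pairs, keeping their order."""
    content = []
    for kind, value in parts:
        if kind not in ALLOWED_TYPES:
            raise ValueError(f"unsupported part type: {kind!r}")
        # Each part repeats its type as the key that holds the value,
        # e.g. {"type": "audio", "audio": "<url>"}.
        content.append({"type": kind, kind: value})
    return [{"role": "user", "content": content}]


messages = user_message(
    ("text", "Transcribe the following speech segment in English:"),
    ("audio", "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/speech.wav"),
)
print(messages)
```

The resulting list can be passed to model_generation(model, messages) in the same way as the hand-written ones.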

Inference

Gemma 3n

Notebooks

Multimodal inference using Gemma 3n via pipeline

Function Calling

Gemma 3n

Notebooks

Function Calling with Gemma 3n: Local File Reader

Fine Tuning

We include a series of notebooks and scripts for fine-tuning the models.

Gemma 3n

Notebooks

Scripts

Gemma 3

RAG

Gemma 3n

Retrieval-Augmented Generation with Gemma 3n

Before fine-tuning the model, ensure all dependencies are installed:

$ pip install -U -q -r requirements.txt

✨ Bonus: We've also experimented with adding object detection 🔍 capabilities to Gemma 3. You can explore that work in this dedicated repo.
