Inference, Fine Tuning and many more recipes with Gemma family of models
huggingface/huggingface-gemma-recipes
🤗💎 Welcome! This repository contains minimal recipes to get started quickly with the Gemma family of models.
Note
- Gemma 3n Conversational Fine tuning 2B on a Free Colab Notebook
- Gemma 3n Conversational Fine tuning 4B on a Free Colab Notebook
- Gemma 3n Multimodal Finetuning 2B/4B on a Free Colab Notebook
To quickly run a Gemma 💎 model on your machine, install the latest versions of timm (for the vision encoder) and 🤗 transformers. Both are needed whether you want to run inference or fine-tune the model.
$ pip install -U -q transformers timm
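After installing, it can help to confirm which versions are actually active in your environment. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not something this repository provides.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the two packages the install command above targets.
for name in ("transformers", "timm"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```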
The easiest way to start using Gemma 3n is by using the pipeline abstraction in transformers:
import torch
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="google/gemma-3n-E4B-it",  # or "google/gemma-3n-E2B-it"
    device="cuda",
    torch_dtype=torch.bfloat16,
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/airplane.jpg"},
            {"type": "text", "text": "Describe this image"},
        ],
    }
]

output = pipe(text=messages, max_new_tokens=32)
# The last message in the returned conversation is the model's reply.
print(output[0]["generated_text"][-1]["content"])
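As the final indexing in the pipeline example suggests, the result is a list with one entry per conversation, and `generated_text` holds the message list with the model's reply as its last element. Here is a small sketch of that lookup on a hand-written stand-in for the pipeline output. Both the `last_reply` helper and the sample data are illustrative assumptions, and the reply text is a placeholder rather than real model output.

```python
def last_reply(output):
    """Return the content of the final message in the first result."""
    conversation = output[0]["generated_text"]
    return conversation[-1]["content"]


# Hand-written stand-in with the shape the pipeline example indexes into.
sample_output = [
    {
        "generated_text": [
            {"role": "user", "content": [{"type": "text", "text": "Describe this image"}]},
            {"role": "assistant", "content": "A placeholder reply."},
        ]
    }
]

print(last_reply(sample_output))
```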
Initialize the model and the processor from the Hub, and write the model_generation function, which processes the prompts and runs inference on the model.
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "google/gemma-3n-e4b-it"  # or "google/gemma-3n-e2b-it"
# Use the GPU when one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id).to(device)


def model_generation(model, messages):
    # Apply the chat template and tokenize in one step.
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    )
    input_len = inputs["input_ids"].shape[-1]
    inputs = inputs.to(model.device, dtype=model.dtype)

    with torch.inference_mode():
        generation = model.generate(**inputs, max_new_tokens=32, disable_compile=False)
        generation = generation[:, input_len:]  # keep only the newly generated tokens

    decoded = processor.batch_decode(generation, skip_special_tokens=True)
    print(decoded[0])
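For decoder-only models, `model.generate` returns the prompt tokens followed by the newly generated ones, which is why `model_generation` slices from `input_len` onward before decoding. The sketch below shows the same idea on plain Python lists. The token IDs are made up and `strip_prompt` is a hypothetical helper; the real code slices a tensor instead.

```python
def strip_prompt(sequences, input_len):
    """Drop the first `input_len` tokens from every sequence in the batch."""
    return [seq[input_len:] for seq in sequences]


prompt_ids = [2, 105, 2364, 107]             # made-up prompt token IDs
generated = [prompt_ids + [818, 5279, 529]]  # batch of one: prompt + new tokens

new_tokens = strip_prompt(generated, len(prompt_ids))
print(new_tokens)  # [[818, 5279, 529]]
```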
Then call it with the modality you need:
# Text Only
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is the capital of France?"},
        ],
    }
]
model_generation(model, messages)
# Interleaved with Audio
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe the following speech segment in English:"},
            {"type": "audio", "audio": "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/speech.wav"},
        ],
    }
]
model_generation(model, messages)
# Interleaved with Image
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/airplane.jpg"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
model_generation(model, messages)
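The three examples share one message shape: a single user turn whose `content` is a list of typed parts. A small builder can cut down on the repetition. This is only a sketch: `build_messages` is a hypothetical helper, and its part ordering (image before the text, audio after it) simply mirrors the examples here rather than any documented requirement.

```python
def build_messages(text, image=None, audio=None, role="user"):
    """Build a single-turn chat message list with optional image and audio parts."""
    content = []
    if image is not None:
        content.append({"type": "image", "image": image})
    content.append({"type": "text", "text": text})
    if audio is not None:
        content.append({"type": "audio", "audio": audio})
    return [{"role": role, "content": content}]


text_only = build_messages("What is the capital of France?")
with_image = build_messages(
    "Describe this image.",
    image="https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/airplane.jpg",
)
print(text_only)
print([part["type"] for part in with_image[0]["content"]])
```

The resulting lists can be passed to `model_generation(model, messages)` in the same way as the hand-written ones.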
We include a series of notebooks and scripts for fine-tuning the models.
- Gemma 3n Conversational Fine tuning 2B on free Colab T4
- Gemma 3n Conversational Fine tuning 4B with Unsloth on free Colab T4
- Gemma 3n Multimodal Fine tuning 2B/4B with Unsloth on free Colab T4
- Fine tuning Gemma 3n on audio
- Fine tuning Gemma 3n on GUI Grounding
- Fine tuning Gemma 3n on video+audio using FineVideo (all modalities)
- Fine tuning Gemma 3n on images using TRL
- Fine tuning Gemma 3n on images (script)
- Fine tuning Gemma 3n on audio (script)
- Fine tuning Gemma 3n on video+audio using FineVideo (all modalities)
- Reinforcement Learning (GRPO) on Gemma 3 with Unsloth and TRL
- Vision fine tuning Gemma 3 4B with Unsloth
- Conversational fine tuning Gemma 3 4B with Unsloth
Before fine-tuning the model, ensure all dependencies are installed:
$ pip install -U -q -r requirements.txt
✨ Bonus: We've also experimented with adding object detection 🔍 capabilities to Gemma 3. You can explore that work in this dedicated repo.