Text tuning
This page provides prerequisites and detailed instructions for fine-tuning Gemini on text data using supervised learning. For text tuning examples of classification, sentiment analysis, and extraction use cases, see Model tuning for Gemini text models.
Use cases
Fine-tuning lets you adapt base Gemini models for specialized tasks. Here are some text use cases:
- Extracting structured information from chats: Transform multi-turn conversations into organized data by fine-tuning a model to identify key attributes and output them in a structured format like JSONL.
- Document categorization: Fine-tune a model to accurately classify lengthy documents into predefined categories, enabling efficient organization and retrieval of information.
- Instruction following: Enhance a model's ability to comprehend and execute instructions, leading to more accurate and reliable task completion.
- Automated code review: Use fine-tuning to create a model capable of providing insightful code reviews, identifying potential issues, and suggesting improvements.
- Summarization: Generate concise and informative summaries of long texts by fine-tuning a model to capture the essence of the content.
- Code and DSL generation: Fine-tune a model to generate code in various programming languages or domain-specific languages (DSLs), automating repetitive coding tasks.
- Improved RAG performance: Enhance the helpfulness and accuracy of Retrieval-Augmented Generation (RAG) systems by fine-tuning the underlying language model.
Dataset format
The `fileUri` for your dataset can be the URI of a file in a Cloud Storage bucket, or it can be a publicly available HTTP or HTTPS URL.
The following is an example of a text dataset.
To see the generic format example, see Dataset example for Gemini.
```json
{
  "systemInstruction": {
    "role": "system",
    "parts": [{"text": "You are a pirate dog named Captain Barktholomew."}]
  },
  "contents": [
    {"role": "user", "parts": [{"text": "Hi"}]},
    {"role": "model", "parts": [{"text": "Argh! What brings ye to my ship?"}]},
    {"role": "user", "parts": [{"text": "What's your name?"}]},
    {"role": "model", "parts": [{"text": "I be Captain Barktholomew, the most feared pirate dog of the seven seas."}]}
  ]
}
```

Sample datasets
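A record in this format can be built programmatically before being written to a JSONL training file. The following is a minimal sketch; `make_example` is a hypothetical helper written for this page, not part of any Google SDK, and the key names simply mirror the dataset example above.

```python
import json

def make_example(system_text, turns):
    """Build one supervised fine-tuning record in the dataset format shown above.

    `turns` is a list of (role, text) pairs, alternating "user" and "model".
    This helper is illustrative only; the field names follow the example record.
    """
    return {
        "systemInstruction": {
            "role": "system",
            "parts": [{"text": system_text}],
        },
        "contents": [
            {"role": role, "parts": [{"text": text}]} for role, text in turns
        ],
    }

example = make_example(
    "You are a pirate dog named Captain Barktholomew.",
    [("user", "Hi"), ("model", "Argh! What brings ye to my ship?")],
)

# Each record occupies exactly one line in the JSONL training file.
line = json.dumps(example)
```

In a real dataset you would append one such `json.dumps(...)` line per training example to the `.jsonl` file before uploading it to Cloud Storage.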
You can use the following sample datasets to learn how to tune a Gemini model. To use these datasets, specify the URIs in the applicable parameters when creating a text model supervised fine-tuning job.
To use the sample tuning dataset, specify its location as follows:
```json
"training_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_train_data.jsonl",
```

To use the sample validation dataset, specify its location as follows:
```json
"validation_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_validation_data.jsonl",
```

What's next
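The two URI snippets above can be assembled into the dataset portion of a tuning request. The sketch below only builds a plain configuration dict; `dataset_config` is a hypothetical helper for illustration, and the actual job-submission call depends on your client (REST or an SDK) and is not shown here.

```python
# Sample dataset URIs copied from the snippets above.
TRAIN_URI = "gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_train_data.jsonl"
VALIDATION_URI = "gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_validation_data.jsonl"

def dataset_config(train_uri, validation_uri=None):
    """Build the dataset fields of a supervised fine-tuning request.

    Key names mirror the snippets above; the validation dataset is optional.
    This is an illustrative helper, not an SDK function.
    """
    config = {"training_dataset_uri": train_uri}
    if validation_uri is not None:
        config["validation_dataset_uri"] = validation_uri
    return config

config = dataset_config(TRAIN_URI, VALIDATION_URI)
```

Omitting the second argument yields a configuration with only the training dataset, which matches tuning jobs run without a validation split.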
- To start tuning, see Tune Gemini models by using supervised fine-tuning.
- To learn how supervised fine-tuning can be used in a solution that builds a generative AI knowledge base, see Jump Start Solution: Generative AI knowledge base.
Last updated 2025-12-15 UTC.