Start agent traces #414

Draft

aymeric-roucher wants to merge 88 commits into main from agent-traces

Changes from 1 commit (88 commits in total)

Commits:
352008b  Start agent traces (aymeric-roucher, Feb 24, 2025)
6c231d2  Working local version with o1 (aymeric-roucher, Feb 25, 2025)
69b2651  Update api addr (aymeric-roucher, Feb 26, 2025)
ad948c2  Increase concurrent requests (aymeric-roucher, Feb 26, 2025)
a00f0ee  Update sbatch params (aymeric-roucher, Feb 26, 2025)
143fcfa  Add conda activation (aymeric-roucher, Feb 26, 2025)
0af9e75  Use local model (aymeric-roucher, Feb 26, 2025)
6cffffe  128 concurrent (aymeric-roucher, Feb 26, 2025)
cf13c2b  Log (aymeric-roucher, Feb 26, 2025)
cffa362  Add conda init (aymeric-roucher, Feb 26, 2025)
e35800c  Fix slurm script (aymeric-roucher, Feb 26, 2025)
b47a4be  Add await (aymeric-roucher, Feb 26, 2025)
0cd0999  Try fixing async func (aymeric-roucher, Feb 26, 2025)
dd15ad8  Add stop sequences (aymeric-roucher, Feb 26, 2025)
d2588cd  Add port (aymeric-roucher, Feb 27, 2025)
b738e58  Make synchronous (aymeric-roucher, Feb 28, 2025)
f78b865  Small adapts to script (aymeric-roucher, Feb 28, 2025)
cb2a2c2  More detailed error logging (aymeric-roucher, Feb 28, 2025)
9a2d16f  Even more detailed request error logging (aymeric-roucher, Feb 28, 2025)
2a1ff76  Reduce context length (aymeric-roucher, Feb 28, 2025)
a97eb27  Add token counting (aymeric-roucher, Feb 28, 2025)
d8cb19b  Fix message roles an add token counting (aymeric-roucher, Feb 28, 2025)
e42b1cd  Add dummy completion (aymeric-roucher, Feb 28, 2025)
83a679f  Test (aymeric-roucher, Feb 28, 2025)
d87e3f3  Running with gpt-4o (aymeric-roucher, Feb 28, 2025)
8e70ca4  Update timeouts (aymeric-roucher, Feb 28, 2025)
2876d52  Adjust (aymeric-roucher, Feb 28, 2025)
cf52433  Flatten messages (aymeric-roucher, Feb 28, 2025)
a07cd54  Prompt more around testing the function (aymeric-roucher, Feb 28, 2025)
ddc1cdd  Improve explanations in prompt (aymeric-roucher, Feb 28, 2025)
4c2fce6  Also store final outputs (aymeric-roucher, Mar 13, 2025)
4a20ba4  Try Qwen Coder 32B (aymeric-roucher, Apr 2, 2025)
6961c36  Remove some dependencies to work on mac (aymeric-roucher, Apr 3, 2025)
2b1bc05  Merge branch 'main' into agent-traces (aymeric-roucher, Apr 3, 2025)
38efcfc  Working trace generation with auto verification by running test cases (aymeric-roucher, Apr 3, 2025)
b7522e3  Add training scripts for agents (aymeric-roucher, Apr 3, 2025)
2ddf70e  Change job name (aymeric-roucher, Apr 3, 2025)
49083cc  Intervert sft training configs (aymeric-roucher, Apr 3, 2025)
de2b792  Point to proper config file (aymeric-roucher, Apr 3, 2025)
5647c26  Add distributed type (aymeric-roucher, Apr 3, 2025)
8a7951c  Revert to zero3 config (aymeric-roucher, Apr 3, 2025)
d28d07b  Remove deepspeed config (aymeric-roucher, Apr 4, 2025)
cae3c7c  Update train slurm (aymeric-roucher, Apr 4, 2025)
2a08444  Switch to new venv (aymeric-roucher, Apr 8, 2025)
1eaf1d1  Move script to proper file (aymeric-roucher, Apr 8, 2025)
2043be9  Change job name (aymeric-roucher, Apr 8, 2025)
2030e16  Increase epochs (aymeric-roucher, Apr 8, 2025)
08a449c  Update dataset name (aymeric-roucher, Apr 9, 2025)
60472f6  Increase epochs (aymeric-roucher, Apr 9, 2025)
9347590  adding qwen 3b training setup (Apr 15, 2025)
a66a5e6  Merge branch 'main' into agent-traces (aymeric-roucher, Jun 23, 2025)
a9b5411  Add aguvis download script (aymeric-roucher, Jun 23, 2025)
80f7ce8  Improve collection script (aymeric-roucher, Jun 23, 2025)
984d631  Add Readme for agents (aymeric-roucher, Jun 23, 2025)
a675552  Fix env variables (aymeric-roucher, Jun 23, 2025)
fbd987c  Remove weka (aymeric-roucher, Jun 23, 2025)
3aee6ef  Modify train slurm (aymeric-roucher, Jun 23, 2025)
7cb592c  Remove parsing (aymeric-roucher, Jun 23, 2025)
b7a700e  Revert training script to the good old time when it worked (aymeric-roucher, Jun 23, 2025)
3b77977  Revert to new shitty script (aymeric-roucher, Jun 23, 2025)
81c64ac  Change weka path (aymeric-roucher, Jun 23, 2025)
eb39096  Try edit (aymeric-roucher, Jun 23, 2025)
0ee52fc  Fix env (aymeric-roucher, Jun 23, 2025)
c4d4126  Working SFT for text model (aymeric-roucher, Jun 24, 2025)
3c3e954  Start adapting script for VLM training (aymeric-roucher, Jun 24, 2025)
a452f2f  Impreove data collection script (aymeric-roucher, Jun 25, 2025)
5fa7e51  Deactivate multinodes (aymeric-roucher, Jun 26, 2025)
933ea92  Merge branch 'agent-traces' of github.com:huggingface/open-r1 into ag… (aymeric-roucher, Jun 26, 2025)
a658db9  Fix sft collate function for vlms (aymeric-roucher, Jun 26, 2025)
24ea112  Fix collate fn in sft.py (aymeric-roucher, Jun 26, 2025)
db30467  Working VLM training 🥳 (aymeric-roucher, Jun 26, 2025)
5eadb06  Add single-GPU training script (aymeric-roucher, Jun 26, 2025)
b316210  Add second dataset in mix (aymeric-roucher, Jun 26, 2025)
2ba1c65  Add aguvis conversion script (aymeric-roucher, Jul 1, 2025)
f6b8f7c  Conversion script (aymeric-roucher, Jul 1, 2025)
22b84cf  Merge branch 'agent-traces' of github.com:huggingface/open-r1 into ag… (aymeric-roucher, Jul 1, 2025)
035f134  Integrate aguvis conversion to smolagents (aymeric-roucher, Jul 1, 2025)
f692c10  Try catch wrap for processing (aymeric-roucher, Jul 1, 2025)
b0d794c  override existing split (aymeric-roucher, Jul 1, 2025)
31cf3a2  Nit script args (aymeric-roucher, Jul 1, 2025)
029dc60  Update train instructions (aymeric-roucher, Jul 3, 2025)
880a585  Merge branch 'agent-traces' of github.com:huggingface/open-r1 into ag… (aymeric-roucher, Jul 3, 2025)
1b50860  Remove merge artifact (aymeric-roucher, Jul 3, 2025)
868d4a4  Small fixes in recipe (aymeric-roucher, Jul 9, 2025)
4c83688  Modify aguvis conversion script (aymeric-roucher, Jul 9, 2025)
6a63f2f  Unify conversion in only one script (aymeric-roucher, Jul 9, 2025)
e8a4c2b  Update imports (aymeric-roucher, Jul 9, 2025)
18fea48  Fix script (aymeric-roucher, Jul 10, 2025)

Start adapting script for VLM training
aymeric-roucher committed Jun 24, 2025
commit 3c3e9545781eb5a556011221c94fce4374829c70

README_AGENTS.md: 14 changes (13 additions, 1 deletion)

@@ -3,4 +3,16 @@ Launch:
 sbatch --nodes=1 slurm/train.slurm --model SmolLM2-1.7B-Instruct --task sft --config agent --accelerator zero3
 ```
 Refers to the config recipes/SmolLM2-1.7B-Instruct/sft/config_agent.yaml
-zero3 is one of the accelerate configs in recipes/accelerate_configs
+zero3 is one of the accelerate configs in recipes/accelerate_configs
+
+
+
+Launch VLM training:
+```bash
+sbatch --nodes=1 slurm/train.slurm --model Qwen2.5-VL-3B-Instruct --task sft --config agent --accelerator zero3
+```
+
+Simple mode (DDP, no DeepSpeed):
+```bash
+sbatch --nodes=1 slurm/train.slurm --model Qwen2.5-VL-3B-Instruct --task sft --config agent --accelerator ddp
+```

logs/.gitkeep: empty file removed

recipes/Qwen2.5-VL-3B-Instruct/sft/config_agent.yaml: 15 changes (14 additions, 1 deletion)

@@ -1,6 +1,7 @@
 # Model arguments
 # You can download the model and manually change the rope to 300k/500k and max_position_embeddings to 32768
 model_name_or_path: Qwen/Qwen2.5-VL-3B-Instruct
+vision_model: true
 model_revision: main
 torch_dtype: bfloat16
 attn_implementation: sdpa
@@ -42,4 +43,16 @@ report_to:
 save_strategy: "steps"
 save_steps: 500
 save_total_limit: 1
-seed: 42
+seed: 42
+
+dataset_mixture:
+  datasets: # List of datasets to include in the mixture
+    - id: smolagents/aguvis-stage-2 # Hub dataset ID
+      config: mind2web # Name of the dataset config
+      split: train # Split to use from the dataset
+      columns: # Columns to keep
+        - images
+        - texts
+      weight: 1. # Fraction of dataset to use
+  seed: 42 # Seed for shuffling the combined dataset
+  test_split_size: 0.1
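
For orientation, here is a hedged sketch (not part of this PR; the loader layout is inferred from the YAML comments above) of how such a weighted mixture could be materialized with the `datasets` library:

```python
# Illustrative only: mirrors the dataset_mixture spec above. `weight` is read as
# the fraction of each dataset to keep, per the YAML comment.
from datasets import concatenate_datasets, load_dataset

mixture = [
    {"id": "smolagents/aguvis-stage-2", "config": "mind2web", "split": "train",
     "columns": ["images", "texts"], "weight": 1.0},
]

parts = []
for spec in mixture:
    ds = load_dataset(spec["id"], spec["config"], split=spec["split"])
    ds = ds.select_columns(spec["columns"])
    ds = ds.select(range(int(spec["weight"] * len(ds))))  # keep the requested fraction
    parts.append(ds)

mixed = concatenate_datasets(parts).shuffle(seed=42)
splits = mixed.train_test_split(test_size=0.1, seed=42)  # test_split_size: 0.1
train_ds, eval_ds = splits["train"], splits["test"]
```
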
recipes/SmolLM2-1.7B-Instruct/sft/config_agent.yaml: 2 changes (1 addition, 1 deletion)

@@ -22,7 +22,7 @@ per_device_train_batch_size: 4 # Change this depending on the context length of
 
 # SFT trainer config
 max_steps: -1
-num_train_epochs: 6
+num_train_epochs: 1
 bf16: true
 do_eval: false
 eval_strategy: 'no'

slurm/train.slurm: 12 changes (4 additions, 8 deletions)

@@ -23,8 +23,6 @@ if [[ "$*" == *"--help"* ]]; then
     exit 0
 fi
 
-HF_DATASETS_CACHE="/fsx/aymeric/.cache/datasets"
-TRANSFORMERS_CACHE="/fsx/aymeric/.cache/transformers"
 
 # Specific configuration optimized for the Hugging Face Compute Cluster
 module load cuda/12.4
@@ -88,10 +86,8 @@ while [[ $# -gt 0 ]]; do
     esac
 done
 
-export HF_DATASETS_CACHE="/fsx/aymeric/.cache/datasets"
-export TRANSFORMERS_CACHE="/fsx/aymeric/.cache/transformers"
+HF_DATASETS_CACHE="/fsx/aymeric/.cache/datasets"
+TRANSFORMERS_CACHE="/fsx/aymeric/.cache/transformers"
-export HF_HOME="/fsx/aymeric/.cache/"
-HF_HOME="/fsx/aymeric/.cache/"
 
 # Validate required arguments
 if [[ -z "$MODEL" || -z "$TASK" || -z "$CONFIG_SUFFIX" || -z "$ACCELERATOR" ]]; then
@@ -143,8 +139,8 @@ if [[ "$USE_VLLM" == "true" ]]; then
 fi
 
 # force crashing on nccl issues like hanging broadcast
-export NCCL_ASYNC_ERROR_HANDLING=1
-#export NCCL_DEBUG=INFO
+export TORCH_NCCL_ASYNC_ERROR_HANDLING=1
+export NCCL_DEBUG=INFO
 # export NCCL_DEBUG_SUBSYS=COLL
 # export NCCL_SOCKET_NTHREADS=1
 # export NCCL_NSOCKS_PERTHREAD=1
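
One note on the swap above: `NCCL_ASYNC_ERROR_HANDLING` is the older PyTorch spelling of this knob and `TORCH_NCCL_ASYNC_ERROR_HANDLING` the current one, so a launcher that must survive mixed torch versions can set both. A minimal sketch:

```python
# Hedged sketch: set both spellings so old and new torch builds crash loudly
# on hung NCCL collectives instead of deadlocking.
import os

os.environ.setdefault("TORCH_NCCL_ASYNC_ERROR_HANDLING", "1")  # current name
os.environ.setdefault("NCCL_ASYNC_ERROR_HANDLING", "1")        # legacy alias
```
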

src/open_r1/configs.py: 4 changes (4 additions, 0 deletions)

@@ -185,6 +185,10 @@ class SFTConfig(trl.SFTConfig):
         default=None,
         metadata={"help": "The optional system prompt to use for benchmarking."},
     )
+    vision_model: bool = field(
+        default=False,
+        metadata={"help": "Whether this is a vision-language model training."},
+    )
     hub_model_revision: Optional[str] = field(
         default="main",
         metadata={"help": "The Hub model branch to push the model to."},
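
Since the flag is a regular `SFTConfig` field, it should parse from a recipe YAML or the CLI like any other argument. A hedged sketch of the usual entry point (assuming open-r1's standard TrlParser flow):

```python
# Illustrative: `vision_model: true` in the recipe (or --vision_model true on
# the CLI) lands on training_args and drives the VLM code paths.
from trl import ModelConfig, TrlParser

from open_r1.configs import ScriptArguments, SFTConfig

parser = TrlParser((ScriptArguments, SFTConfig, ModelConfig))
script_args, training_args, model_args = parser.parse_args_and_config()
print(training_args.vision_model)  # False unless the recipe or CLI sets it
```
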

src/open_r1/sft.py: 83 changes (69 additions, 14 deletions)

@@ -13,7 +13,7 @@
 # limitations under the License.
 
 """
-Supervised fine-tuning script for decoder language models.
+Supervised fine-tuning script for decoder language models and vision-language models.
 
 Usage:
 
@@ -39,18 +39,48 @@
 
 import datasets
 import transformers
-from transformers import set_seed
+from transformers import set_seed, AutoModelForVision2Seq, AutoProcessor, LlavaForConditionalGeneration
 from transformers.trainer_utils import get_last_checkpoint
 from trl import ModelConfig, SFTTrainer, TrlParser, get_peft_config, setup_chat_format
 
 from open_r1.configs import ScriptArguments, SFTConfig
-from open_r1.utils import get_dataset, get_model, get_tokenizer
+from open_r1.utils import get_dataset, get_model, get_tokenizer, get_processor
 from open_r1.utils.callbacks import get_callbacks
 from open_r1.utils.wandb_logging import init_wandb_training
 
 logger = logging.getLogger(__name__)
 
 
+def create_vlm_collate_fn(processor):
+    """Create a data collator for VLM training that handles images and text."""
+
+    def collate_fn(examples):
+        # Get the texts and images, and apply the chat template
+        texts = [processor.apply_chat_template(example["messages"], tokenize=False) for example in examples]
+        images = [example["images"] for example in examples]
+
+        # Handle LLaVA 1.5, which doesn't support multiple images (detect via the processor class, since processors carry no `.model` attribute)
+        if "llava" in type(processor).__name__.lower():
+            images = [image[0] if image else None for image in images]
+
+        # Tokenize the texts and process the images
+        batch = processor(text=texts, images=images, return_tensors="pt", padding=True)
+
+        # The labels are the input_ids, and we mask the padding tokens in the loss computation
+        labels = batch["input_ids"].clone()
+        labels[labels == processor.tokenizer.pad_token_id] = -100
+
+        # Ignore the image token index in the loss computation (model specific)
+        if hasattr(processor, "image_token"):
+            image_token_id = processor.tokenizer.convert_tokens_to_ids(processor.image_token)
+            labels[labels == image_token_id] = -100
+
+        batch["labels"] = labels
+        return batch
+
+    return collate_fn
+
+
 def main(script_args, training_args, model_args):
     set_seed(training_args.seed)
 
@@ -84,29 +114,54 @@ def main(script_args, training_args, model_args):
     init_wandb_training(training_args)
 
     ######################################
-    # Load dataset, tokenizer, and model #
+    # Load dataset, processor/tokenizer, and model #
     ######################################
     dataset = get_dataset(script_args)
-    tokenizer = get_tokenizer(model_args, training_args)
-    model = get_model(model_args, training_args)
 
-    if tokenizer.chat_template is None:
-        logger.info("No chat template provided, defaulting to ChatML.")
-        model, tokenizer = setup_chat_format(model, tokenizer, format="chatml")
+    if training_args.vision_model:
+        logger.info("Setting up vision-language model training")
+
+        # Set VLM-specific training arguments (following TRL reference)
+        training_args.gradient_checkpointing_kwargs = dict(use_reentrant=False)
+        training_args.remove_unused_columns = False
+        training_args.dataset_kwargs = {"skip_prepare_dataset": True}
+
+        # Load processor and model for VLM
+        processor = get_processor(model_args, training_args)
+        model = get_model(model_args, training_args)  # This should return AutoModelForVision2Seq
+        data_collator = create_vlm_collate_fn(processor)
+        processing_class = processor.tokenizer
+        model_tags = ["open-r1", "vision-language", "vlm"]
+
+    else:
+        logger.info("Setting up text-only model training")
+
+        # Load tokenizer and model for text-only
+        tokenizer = get_tokenizer(model_args, training_args)
+        model = get_model(model_args, training_args)
+
+        if tokenizer.chat_template is None:
+            logger.info("No chat template provided, defaulting to ChatML.")
+            model, tokenizer = setup_chat_format(model, tokenizer, format="chatml")
+
+        data_collator = None  # Use default
+        processing_class = tokenizer
+        model_tags = ["open-r1"]
 
     ############################
     # Initialize the SFT Trainer
     ############################
     trainer = SFTTrainer(
         model=model,
         args=training_args,
+        data_collator=data_collator,
         train_dataset=dataset[script_args.dataset_train_split],
         eval_dataset=(
             dataset[script_args.dataset_test_split]
             if training_args.eval_strategy != "no"
             else None
         ),
-        processing_class=tokenizer,
+        processing_class=processing_class,
         peft_config=get_peft_config(model_args),
         callbacks=get_callbacks(training_args, model_args),
     )
@@ -131,16 +186,13 @@ def main(script_args, training_args, model_args):
     # Save model and create model card
     ##################################
     logger.info("*** Save model ***")
-    # Align the model's generation config with the tokenizer's eos token
-    # to avoid unbounded generation in the transformers `pipeline()` function
-    trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id
     trainer.save_model(training_args.output_dir)
     logger.info(f"Model saved to {training_args.output_dir}")
 
     # Save everything else on main process
     kwargs = {
         "dataset_name": script_args.dataset_name,
-        "tags": ["open-r1"],
+        "tags": model_tags,
     }
     if trainer.accelerator.is_main_process:
         trainer.create_model_card(**kwargs)
@@ -164,6 +216,9 @@ def main(script_args, training_args, model_args):
     if training_args.push_to_hub:
         logger.info("Pushing to hub...")
         trainer.push_to_hub(**kwargs)
+        # Also push processor for VLM models
+        if training_args.vision_model and trainer.accelerator.is_main_process:
+            processor.push_to_hub(training_args.hub_model_id)
 
 
 if __name__ == "__main__":
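
A hedged smoke test for the new collator (illustrative, not from this PR: it assumes the Qwen2.5-VL processor is downloadable and that examples carry the aguvis-style `messages`/`images` fields):

```python
from PIL import Image
from transformers import AutoProcessor

from open_r1.sft import create_vlm_collate_fn

processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")
collate_fn = create_vlm_collate_fn(processor)

example = {
    "messages": [
        {"role": "user", "content": [{"type": "image"}, {"type": "text", "text": "Which button should I click?"}]},
        {"role": "assistant", "content": [{"type": "text", "text": "The Submit button."}]},
    ],
    "images": [Image.new("RGB", (64, 64))],  # dummy screenshot stand-in
}

batch = collate_fn([example])
# input_ids and labels share one padded shape; pad and image tokens are -100 in labels
print(batch["input_ids"].shape, batch["labels"].shape)
```
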

src/open_r1/utils/__init__.py: 4 changes (2 additions, 2 deletions)

@@ -1,6 +1,6 @@
 from .data import get_dataset
 from .import_utils import is_e2b_available, is_morph_available
-from .model_utils import get_model, get_tokenizer
+from .model_utils import get_model, get_tokenizer, get_processor
 
 
-__all__ = ["get_tokenizer", "is_e2b_available", "is_morph_available", "get_model", "get_dataset"]
+__all__ = ["get_tokenizer", "get_processor", "is_e2b_available", "is_morph_available", "get_model", "get_dataset"]

src/open_r1/utils/model_utils.py: 39 changes (32 additions, 7 deletions)

@@ -1,5 +1,5 @@
 import torch
-from transformers import AutoModelForCausalLM, AutoTokenizer, PreTrainedTokenizer
+from transformers import AutoModelForCausalLM, AutoTokenizer, PreTrainedTokenizer, AutoProcessor, AutoModelForVision2Seq
 
 from trl import ModelConfig, get_kbit_device_map, get_quantization_config
 
@@ -20,8 +20,22 @@ def get_tokenizer(model_args: ModelConfig, training_args: SFTConfig | GRPOConfig
     return tokenizer
 
 
-def get_model(model_args: ModelConfig, training_args: SFTConfig | GRPOConfig) -> AutoModelForCausalLM:
-    """Get the model"""
+def get_processor(model_args: ModelConfig, training_args: SFTConfig | GRPOConfig) -> AutoProcessor:
+    """Get the processor for VLM models."""
+    processor = AutoProcessor.from_pretrained(
+        model_args.model_name_or_path,
+        revision=model_args.model_revision,
+        trust_remote_code=model_args.trust_remote_code,
+    )
+
+    if training_args.chat_template is not None:
+        processor.chat_template = training_args.chat_template
+
+    return processor
+
+
+def get_model(model_args: ModelConfig, training_args: SFTConfig | GRPOConfig) -> AutoModelForCausalLM | AutoModelForVision2Seq:
+    """Get the model - supports both text-only and vision-language models"""
     torch_dtype = (
         model_args.torch_dtype if model_args.torch_dtype in ["auto", None] else getattr(torch, model_args.torch_dtype)
     )
@@ -35,8 +49,19 @@ def get_model(model_args: ModelConfig, training_args: SFTConfig | GRPOConfig) ->
         device_map=get_kbit_device_map() if quantization_config is not None else None,
         quantization_config=quantization_config,
     )
-    model = AutoModelForCausalLM.from_pretrained(
-        model_args.model_name_or_path,
-        **model_kwargs,
-    )
+
+    # Check if this is a VLM model using the explicit flag
+    if hasattr(training_args, "vision_model") and training_args.vision_model:
+        # Load as vision-language model
+        model = AutoModelForVision2Seq.from_pretrained(
+            model_args.model_name_or_path,
+            **model_kwargs,
+        )
+    else:
+        # Load as text-only model
+        model = AutoModelForCausalLM.from_pretrained(
+            model_args.model_name_or_path,
+            **model_kwargs,
+        )
 
     return model
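
Taken together, the one flag now selects both halves of the loading path. A condensed sketch of the dispatch (the helper below is illustrative, not a function from this PR):

```python
# Hedged sketch: the flag picks the model class and the matching processing class.
from transformers import (
    AutoModelForCausalLM,
    AutoModelForVision2Seq,
    AutoProcessor,
    AutoTokenizer,
)

def load_backbone(name: str, vision_model: bool):
    """Illustrative helper mirroring the get_model/get_processor vs get_tokenizer pairing."""
    if vision_model:
        return AutoModelForVision2Seq.from_pretrained(name), AutoProcessor.from_pretrained(name)
    return AutoModelForCausalLM.from_pretrained(name), AutoTokenizer.from_pretrained(name)

model, processing = load_backbone("Qwen/Qwen2.5-VL-3B-Instruct", vision_model=True)
```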