Support for Seq2Seq Models (T5, T5Gemma, etc.) #3153
base: main
Conversation
Datta0 commented Aug 18, 2025
Hey @maxzuo, thanks for the contribution.
maxzuo commented Aug 18, 2025
@Datta0 Sure, I'm actively working on it; that's actually why I converted this to a draft. Will let you know!
Aman-byte1 commented Oct 25, 2025
@maxzuo did you work on it?
PR Description

Adds support for Seq2Seq models: AutoModelForSeq2SeqLM.

Why

Seq2Seq models are not directly supported, despite support for all model architectures. This is because FastModel.from_pretrained sets the auto_model parameter to either AutoModelForCausalLM or AutoModelForVision2Seq/AutoModelForImageTextToText. Further, since models like T5 have class names ending in ForConditionalGeneration, unsloth registers them as VLMs and tries to load them as such.

I use AutoModelForSeq2SeqLM._model_mapping to check whether a model config is registered as a Seq2Seq model. This logic can be extended to other auto models (e.g., AutoModelForSequenceClassification) if desired.

Links

Support for T5 has some community interest:
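The mapping check the PR describes can be sketched roughly as below. This is an illustrative example, not the PR's actual diff; note that `_model_mapping` is a private transformers attribute, so it may change between library versions.

```python
# Sketch of the Seq2Seq detection described in the PR: a config class that
# appears in AutoModelForSeq2SeqLM._model_mapping maps to an encoder-decoder
# (Seq2Seq) model class. Note: _model_mapping is private transformers API.
from transformers import AutoModelForSeq2SeqLM, LlamaConfig, T5Config


def is_seq2seq_config(config_cls) -> bool:
    """Return True if this config class is registered as a Seq2Seq model."""
    return config_cls in AutoModelForSeq2SeqLM._model_mapping


print(is_seq2seq_config(T5Config))     # True: T5 is encoder-decoder
print(is_seq2seq_config(LlamaConfig))  # False: Llama is decoder-only
```

A loader could use such a check to pick AutoModelForSeq2SeqLM instead of AutoModelForCausalLM or the vision auto classes, which would avoid the ForConditionalGeneration name clash mentioned above.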