Support for Seq2Seq Models (T5, T5Gemma, etc.) #3153


Draft
maxzuo wants to merge 4 commits into unslothai:main from maxzuo:feature/seq2seq

Conversation

@maxzuo

PR Description

Adds support for Seq2Seq models: `AutoModelForSeq2SeqLM`.

Why

Seq2Seq models are not directly supported, despite Unsloth's stated support for all model architectures. This is because `FastModel.from_pretrained` sets the `auto_model` parameter to either `AutoModelForCausalLM` or `AutoModelForVision2Seq`/`AutoModelForImageTextToText`.

Further, since models like T5 have class names ending in `ForConditionalGeneration`, Unsloth classifies them as VLMs and tries to load them as such.

I use `AutoModelForSeq2SeqLM._model_mapping` to check if a model config is registered as a Seq2Seq model. This logic can be extended to other auto models (e.g., `AutoModelForSequenceClassification`) if desired.
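
As a rough illustration of the dispatch problem described above, here is a minimal, dependency-free sketch. It is not the code in this PR: `SEQ2SEQ_ARCHITECTURES`, `pick_auto_model_by_suffix`, and `pick_auto_model` are hypothetical names, and the hand-written set merely stands in for a membership check against `AutoModelForSeq2SeqLM._model_mapping` (which in transformers is keyed by config classes rather than architecture names). The functions return the name of the auto class as a string so the sketch runs without transformers installed.

```python
# Hypothetical stand-in for AutoModelForSeq2SeqLM._model_mapping.
# Assumption: keyed by architecture name here purely for illustration.
SEQ2SEQ_ARCHITECTURES = {"T5ForConditionalGeneration"}  # not exhaustive


def pick_auto_model_by_suffix(architecture: str) -> str:
    """Simplified model of the suffix-only heuristic described in the PR."""
    if architecture.endswith("ForConditionalGeneration"):
        # Treated as a VLM, which is the wrong choice for T5.
        return "AutoModelForImageTextToText"
    return "AutoModelForCausalLM"


def pick_auto_model(architecture: str) -> str:
    """Consult the Seq2Seq registry first, then fall back to the suffix check."""
    if architecture in SEQ2SEQ_ARCHITECTURES:
        return "AutoModelForSeq2SeqLM"
    return pick_auto_model_by_suffix(architecture)


if __name__ == "__main__":
    for arch in (
        "T5ForConditionalGeneration",
        "Qwen2VLForConditionalGeneration",
        "LlamaForCausalLM",
    ):
        print(arch, pick_auto_model_by_suffix(arch), pick_auto_model(arch))
```

The point of the ordering is that the registry check has to run before any name-based heuristic; otherwise T5-style class names are never given the chance to resolve to `AutoModelForSeq2SeqLM`.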

Links

Support for T5 has some community interest:

tokinasin and shimmyshimmer reacted with heart emoji
This was referenced Aug 14, 2025
@maxzuo marked this pull request as draft August 14, 2025 18:16
@Datta0
Collaborator

Hey @maxzuo, thanks for the contribution.
It'd be of great help if you could create a notebook showing fine-tuning of any small seq2seq model on Google Colab.
Also, I notice this PR is marked draft. Are you intending to add more things to this?

@maxzuo
Author

@Datta0 sure, I'm actively working on it, which is actually why I converted this to a draft. Will let you know!

shimmyshimmer and Aman-byte1 reacted with thumbs up emoji

@Aman-byte1

@maxzuo did u work on it?



Development

Successfully merging this pull request may close these issues:

please give t5 support.
Support T5 models

3 participants

@maxzuo @Datta0 @Aman-byte1
