Support for Seq2Seq Models (T5, T5Gemma, etc.) #3153
base: main
Conversation
Datta0 commented Aug 18, 2025
Hey @maxzuo, thanks for the contribution.
maxzuo commented Aug 18, 2025
@Datta0 Sure, I'm actively working on it; that's actually why I converted this to a draft. Will let you know!
Aman-byte1 commented Oct 25, 2025
@maxzuo did you work on it?
PR Description

Adds support for Seq2Seq models: AutoModelForSeq2SeqLM.

Why

Seq2Seq models are not directly supported, despite support for all model architectures. This is because FastModel.from_pretrained sets the auto_model parameter to either AutoModelForCausalLM or AutoModelForVision2Seq/AutoModelForImageTextToText. Further, since models like T5 have class names ending in ForConditionalGeneration, unsloth registers them as VLMs and tries to load them as such.

I use AutoModelForSeq2SeqLM._model_mapping to check whether a model config is registered as a Seq2Seq model. This logic can be extended to other auto models (e.g., AutoModelForSequenceClassification) if desired.

Links

Support for T5 has some community interest:
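The mapping check the PR describes can be sketched roughly as below. This is an illustrative example, not the PR's actual diff; note that `_model_mapping` is a private transformers attribute, so it may change between library versions.

```python
# Sketch of the Seq2Seq detection described in the PR: a config class that
# appears in AutoModelForSeq2SeqLM._model_mapping maps to an encoder-decoder
# (Seq2Seq) model class. Note: _model_mapping is private transformers API.
from transformers import AutoModelForSeq2SeqLM, LlamaConfig, T5Config


def is_seq2seq_config(config_cls) -> bool:
    """Return True if this config class is registered as a Seq2Seq model."""
    return config_cls in AutoModelForSeq2SeqLM._model_mapping


print(is_seq2seq_config(T5Config))     # True: T5 is encoder-decoder
print(is_seq2seq_config(LlamaConfig))  # False: Llama is decoder-only
```

A loader could use such a check to pick AutoModelForSeq2SeqLM instead of AutoModelForCausalLM or the vision auto classes, which would avoid the ForConditionalGeneration name clash mentioned above.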