The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation"
LightTransfer is a lightweight transformation framework for enhancing the efficiency of large transformer models, such as LLaMA and QwQ, in long-context understanding and long CoT generation. By identifying **lazy layers** (those primarily attending to initial or recent tokens), LightTransfer replaces their full attention with streaming attention, significantly reducing memory overhead.
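As a rough illustration of how lazy layers could be detected, the sketch below scores each layer by the attention mass it places on the first few (sink) keys and on recent keys, then selects the laziest layers as candidates for streaming attention. The function names, threshold ratio, and window sizes are illustrative assumptions, not the repository's actual selection code.

```python
import torch

def lazy_layer_scores(attn_maps, n_sink=4, window=256):
    """Fraction of attention mass each layer puts on the first `n_sink` keys
    (attention sinks) or on keys within `window` positions of the query.
    Higher score = lazier layer."""
    scores = []
    for attn in attn_maps:                      # attn: (num_heads, seq_len, seq_len) causal softmax weights
        seq_len = attn.shape[-1]
        q = torch.arange(seq_len).unsqueeze(1)  # query positions
        k = torch.arange(seq_len).unsqueeze(0)  # key positions
        keep = (k < n_sink) | ((q >= k) & (q - k < window))  # sink or recent keys
        scores.append((attn * keep).sum(dim=-1).mean().item())
    return scores

def pick_lazy_layers(scores, ratio=0.5):
    """Indices of the laziest `ratio` of layers (streaming-attention candidates)."""
    k = int(len(scores) * ratio)
    return sorted(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])
```

Layers selected this way would keep only a small sink-plus-window KV cache at inference time, while the remaining layers retain full attention.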
- **Improved efficiency with minimal performance loss:** LightTransfer achieves up to 2.17× higher throughput while maintaining strong performance (<1.5% drop on LongBench).
- **Flexible adaptation for long-context tasks:** Works without retraining for long-context understanding and requires only minimal fine-tuning for advanced long CoT generation, such as mathematical reasoning in QwQ-STILL, achieving 53.3% on AIME24.
For more details, visit our project page.
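The memory savings come from the streaming layers keeping only a small, fixed-size KV cache: a few initial "sink" tokens plus a recent window. Below is a minimal, self-contained sketch of such a cache; the class name and the sink/window sizes are illustrative assumptions rather than the repository's implementation.

```python
import torch

class StreamingKVCache:
    """Sink + sliding-window KV cache: keep the first `n_sink` tokens and the
    most recent `window` tokens, so memory stays O(n_sink + window) instead of
    growing with sequence length."""

    def __init__(self, n_sink=4, window=1024):
        self.n_sink, self.window = n_sink, window
        self.keys = None    # (num_kv_heads, cached_len, head_dim)
        self.values = None

    def update(self, k, v):
        """Append new key/value states, evicting the oldest non-sink entries."""
        if self.keys is None:
            self.keys, self.values = k, v
        else:
            self.keys = torch.cat([self.keys, k], dim=1)
            self.values = torch.cat([self.values, v], dim=1)
        overflow = self.keys.shape[1] - (self.n_sink + self.window)
        if overflow > 0:
            keep = torch.cat([torch.arange(self.n_sink),
                              torch.arange(self.n_sink + overflow, self.keys.shape[1])])
            self.keys, self.values = self.keys[:, keep], self.values[:, keep]
        return self.keys, self.values
```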
[2025.03.16] We release the checkpoint of QwQ-32B-LightTransfer. See the model card for details.
We release the checkpoint of QwQ-LightTransfer, which is a 32B-parameter model built on Qwen/Qwen2.5-32B-Instruct and fine-tuned via SFT on RUC-AIBOX/long_form_thought_data_5k.
- By replacing 50% of the model's full attention layers with streaming attention (specifically layers [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 26, 27, 28, 30, 31, 32, 33, 35, 37, 38, 43, 51]), it substantially reduces memory costs.
- QwQ-LightTransfer scores 53.3% on the advanced math benchmark AIME24, demonstrating its strong o1-like long reasoning capabilities.
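To get a feel for why converting 32 of the layers to streaming attention cuts memory substantially, here is a back-of-the-envelope KV-cache estimate. The model dimensions are assumptions based on the Qwen2.5-32B configuration (64 layers, 8 KV heads, head dim 128, bf16), and the sink/window sizes are illustrative.

```python
def kv_cache_gib(seq_len, n_layers=64, n_streaming=32,
                 n_kv_heads=8, head_dim=128, n_sink=4, window=1024,
                 bytes_per_value=2):
    """Estimated KV-cache size (GiB) for a hybrid model: full-attention layers
    grow with sequence length, streaming layers are capped at n_sink + window."""
    per_token_per_layer = 2 * n_kv_heads * head_dim * bytes_per_value  # key + value
    full = (n_layers - n_streaming) * seq_len * per_token_per_layer
    streaming = n_streaming * min(seq_len, n_sink + window) * per_token_per_layer
    return (full + streaming) / 2**30

# At a 32k-token context the hybrid cache is roughly half the size of keeping
# all 64 layers dense (kv_cache_gib(32000, n_streaming=0)).
print(round(kv_cache_gib(32000), 2), "GiB vs", round(kv_cache_gib(32000, n_streaming=0), 2), "GiB")
```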
We have evaluated QwQ-LightTransfer on several long reasoning generation benchmarks. Some of the evaluation results are shown in the table below.
Method | Math-OAI | AIME24 | AIME25 | GSM8K |
---|---|---|---|---|
o1-preview | 85.5 | 44.6 | - | - |
QwQ-STILL | 90.2 | 46.7 | 33.3 | 95.6 |
LongGen | 78.2 | 16.7 | - | 95.4 |
LightTransfer | 90.7 | 53.3 | 40.0 | 95.5 |
Import from Transformers
To load the QwQ-LightTransfer model using Transformers, use the following code:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = 'QwQ-32B-LightTransfer'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,   # needed to load the model's custom hybrid-attention code
    device_map='auto',
)

text = "Hi, I'm QwQ-32B-LightTransfer."
inputs = tokenizer(text, return_tensors='pt').to(model.device)
with torch.no_grad():
    outputs = model.generate(inputs['input_ids'], max_gen_len=32000)
print(tokenizer.decode(outputs[0]))
```
Evaluation scripts
Code and model weights are licensed under Apache-2.0.
```bibtex
@misc{zhang2025lighttransferlongcontextllmsecretly,
  title={LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation},
  author={Xuan Zhang and Fengzhuo Zhang and Cunxiao Du and Chao Du and Tianyu Pang and Wei Gao and Min Lin},
  year={2025},
  eprint={2410.13846},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2410.13846},
}
```