IST-DASLab/DarwinLM
This repository contains the implementation of evolutionary structured pruning for language models, as introduced in our paper. DarwinLM builds upon an evolutionary search process, generating multiple offspring models in each generation through mutation, and selecting the fittest for survival.
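For intuition, the core loop can be pictured as follows. This is only a schematic sketch of mutation-and-selection, not the repository's implementation; the `mutate`/`fitness` callables and the population sizes are illustrative placeholders:

```python
import random

def evolutionary_search(parent, mutate, fitness, generations=10, offspring_per_gen=8):
    """Schematic evolutionary search: each generation mutates the current parent
    into several offspring and keeps the fittest candidate as the next parent."""
    best, best_score = parent, fitness(parent)
    for _ in range(generations):
        offspring = [mutate(best) for _ in range(offspring_per_gen)]
        for child in offspring:
            score = fitness(child)  # higher is fitter (e.g., negative calibration loss)
            if score > best_score:
                best, best_score = child, score
    return best

# Toy usage: search for a 0/1 keep-mask whose kept count matches a target budget.
random.seed(0)
flip = lambda mask: [b ^ (random.random() < 0.1) for b in mask]
budget_fit = lambda mask: -abs(sum(mask) - 5)
print(evolutionary_search([1] * 10, flip, budget_fit))
```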
We provide six model variants on Hugging Face:
- Models after pruning and searching: 2.7B-Pruned Model, 4.6B-Pruned Model, 8.4B-Pruned Model
- Models after post-training: 2.7B Model, 4.6B Model, 8.4B Model
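Since the post-trained variants are standard `transformers` checkpoints, they can be loaded directly. A minimal sketch, assuming the `transformers` and `accelerate` packages are installed; the model ID below is a placeholder for the Hugging Face ID of the variant you choose:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "<huggingface-model-id>"  # placeholder: substitute the released variant's Hub ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Structured pruning of language models", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```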
To set up the environment, ensure you have the necessary dependencies installed:

```bash
conda env create -f environment.yml
conda activate darwinlm
```
Before running the search, you need to generate a structured database by running:

```bash
# For Llama-2-7B
bash scripts/ziplm_llama2-7B.sh
# For Llama-3.1-8B
bash scripts/ziplm_llama3.1-8B.sh
# For Qwen2.5-14B-Instruct
bash scripts/ziplm_qwen2.5-14B-instruct.sh
```
Note: Currently, you need to manually specify the number of columns removed for each compression step.
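For example, if you choose evenly spaced compression levels, the per-level column counts could be computed with a small helper like the one below. This is a hypothetical helper, not part of the repository; the dimension, level count, and rounding multiple are illustrative:

```python
def columns_removed_per_level(total_columns, num_levels, multiple_of=32):
    """Hypothetical helper: evenly spaced column-removal counts per compression
    level, rounded to a hardware-friendly multiple."""
    counts = []
    for level in range(num_levels + 1):
        removed = round(total_columns * level / num_levels / multiple_of) * multiple_of
        counts.append(min(removed, total_columns))
    return counts

# Example: MLP intermediate size of Llama-2-7B (11008) with 8 compression levels.
print(columns_removed_per_level(11008, 8))
# -> [0, 1376, 2752, 4128, 5504, 6880, 8256, 9632, 11008]
```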
To perform the structured pruning search, run `scripts/struct_prune_search.sh`:

```bash
# The example is for Llama-2-7B; you can use other models and set COMPR_PATH to the generated database
bash scripts/struct_prune_search.sh
```
This will conduct a structured pruning search based on the defined configurations. After the search, a `.txt` file will be generated, which you can use together with the database to stitch the pruned model.
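The stitching step itself is performed by the repository's scripts, but conceptually it reads the per-layer compression levels chosen by the search and copies the matching pre-pruned weights out of the database. A rough sketch of that idea; the file format and database layout assumed here are hypothetical, not the repository's actual ones:

```python
import torch

def stitch_model(model, sparse_config_path, database_path):
    """Hypothetical sketch: overwrite each transformer layer with the database
    entry matching the compression level chosen by the evolutionary search."""
    # Assumed .txt format: one integer compression level per transformer layer.
    with open(sparse_config_path) as f:
        levels = [int(tok) for tok in f.read().split()]

    # Assumed database layout: database[layer_idx][level] -> state dict of pruned tensors.
    database = torch.load(database_path)

    for layer_idx, level in enumerate(levels):
        model.model.layers[layer_idx].load_state_dict(database[layer_idx][level], strict=False)
    return model
```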
After pruning, you can further fine-tune the model on the Fineweb-Edu dataset using the llm-foundry repository. Check the parameter settings in our paper for replication.
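If you want to inspect or pre-process the fine-tuning data yourself, Fineweb-Edu can be streamed from the Hugging Face Hub with the `datasets` library. A minimal sketch; the subset name is illustrative and not the paper's exact data budget:

```python
from datasets import load_dataset

# Stream a Fineweb-Edu sample so the corpus never has to be fully downloaded.
dataset = load_dataset(
    "HuggingFaceFW/fineweb-edu", name="sample-10BT", split="train", streaming=True
)

for i, example in enumerate(dataset):
    print(example["text"][:200].replace("\n", " "))
    if i >= 2:  # just peek at a few documents
        break
```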
For evaluation, first install lm-evaluation-harness following its installation guidelines.
- Option 1: If you want to replicate the results in our paper, use the weights provided above, which are directly supported by the `transformers` package. Simply run the script below (a Python-API equivalent is sketched after this list):

  ```bash
  # Simply modify MODEL_ID to evaluate different models
  bash scripts/run_lmeval_hf.sh
  ```
- Option 2: If you want to evaluate your own searched structure, the evolutionary structured pruning search (`evo_struct_prune_search.py`) produces a configuration file that you need to pass as `sparse_config_path`. Also pass the database path to `database_path` and run:

  ```bash
  bash scripts/run_lmeval_config.sh
  ```
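For Option 1, the same evaluation can also be driven from Python through lm-evaluation-harness's API. A minimal sketch, assuming lm-eval v0.4+; the model ID is a placeholder and the task list is illustrative:

```python
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=<huggingface-model-id>,dtype=bfloat16",  # placeholder Hub ID
    tasks=["arc_easy", "hellaswag", "winogrande"],
    batch_size=8,
)
print(results["results"])
```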
Note: Currently, the `transformers` package does not support head pruning for Llama and Qwen models. Therefore, if you install the package from the official repo, you should set `model_shrink=false`, which keeps the pruned weights as zeros rather than actually removing them. If you want the actual speedup, install the `transformers` package from source using my implementation.
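To make that distinction concrete, the toy sketch below contrasts zeroing pruned rows (shapes and compute unchanged) with actually shrinking the layer (less compute). It is illustrative only, not the repository's pruning code:

```python
import torch
import torch.nn as nn

full = nn.Linear(4096, 4096, bias=False)
keep = torch.arange(2048)  # indices of output rows to keep after pruning

# model_shrink=false style: pruned rows are zeroed, so the layer keeps its original shape and cost.
masked = nn.Linear(4096, 4096, bias=False)
with torch.no_grad():
    masked.weight.zero_()
    masked.weight[keep] = full.weight[keep]

# Actual shrinking: a smaller layer holding only the kept rows, which is what yields real speedup.
shrunk = nn.Linear(4096, 2048, bias=False)
with torch.no_grad():
    shrunk.weight.copy_(full.weight[keep])

x = torch.randn(1, 4096)
print(torch.allclose(masked(x)[:, keep], shrunk(x)))  # True: identical outputs on the kept rows
```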
If you find this work useful, please cite our paper:
```bibtex
@article{tang2025darwinlm,
  title={DarwinLM: Evolutionary Structured Pruning of Large Language Models},
  author={Tang, Shengkun and Sieberling, Oliver and Kurtic, Eldar and Shen, Zhiqiang and Alistarh, Dan},
  journal={arXiv preprint arXiv:2502.07780},
  year={2025}
}
```
For any issues or questions, please open an issue or contact us directly. 🚀