🩹Editing large language models within 10 seconds⚡
hiyouga/FastEdit
Editing large language models within 10 seconds
This repo aims to assist developers with injecting fresh and customized knowledge into large language models efficiently using a single command.
- Python 3.8+ and PyTorch 1.13.1+
- 🤗Transformers, Datasets and Accelerate
- sentencepiece and fire
Model | Size | Mode | GRAM | Speed |
---|---|---|---|---|
LLaMA | 7B | FP16 | 24GB | 7s/it |
LLaMA | 13B | FP16 | 32GB | 9s/it |
For example, if we want to insert the factual knowledge "The prime minister of the UK is Rishi Sunak" into an LLM, we need to prepare a json file in a format similar to the following.
[
  {
    "prompt": "The prime minister of the {} is",
    "subject": "UK",
    "target": "Rishi Sunak",
    "queries": []
  }
]
In this format, the "prompt" field is a natural language description in which "{}" stands in for the subject, and the subject itself is placed in the "subject" field. The "target" field contains the updated content, which differs from the original model prediction. The "queries" field is an optional field used for evaluating generalizability and is not used in training.
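The field description above can be checked before running an edit. The following is a minimal sketch, not part of FastEdit: `render_statement` and `REQUIRED_FIELDS` are hypothetical names, and the rule that "prompt" contains exactly one "{}" is an assumption drawn from the example rather than a documented constraint.

```python
import json

# Hypothetical helper names; the checks below are assumptions based on the
# example format, not rules enforced by FastEdit itself.
REQUIRED_FIELDS = ("prompt", "subject", "target")


def render_statement(request: dict) -> str:
    """Validate one edit request and return the fact it encodes."""
    for field in REQUIRED_FIELDS:
        value = request.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"'{field}' must be a non-empty string")
    # Assumption: the subject placeholder appears exactly once in the prompt.
    if request["prompt"].count("{}") != 1:
        raise ValueError("'prompt' must contain exactly one '{}' placeholder")
    # "queries" is optional; when present it should be a list.
    if not isinstance(request.get("queries", []), list):
        raise ValueError("'queries' must be a list when present")
    filled = request["prompt"].replace("{}", request["subject"])
    return f"{filled} {request['target']}"


requests = [
    {
        "prompt": "The prime minister of the {} is",
        "subject": "UK",
        "target": "Rishi Sunak",
        "queries": [],
    }
]

statements = [render_statement(r) for r in requests]
# ensure_ascii=False keeps non-English prompts readable in the output file.
serialized = json.dumps(requests, ensure_ascii=False, indent=2)

print(statements[0])
```

Writing `serialized` to a file gives a data file in the same shape as the example; reading the rendered statement back is a quick way to catch a misplaced placeholder.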
git clone https://github.com/hiyouga/FastEdit.git
conda create -n fastedit python=3.10
conda activate fastedit
cd FastEdit
pip install -r requirements.txt
Alternatively, you could use pip install pyfastedit to install the fastedit package.
CUDA_VISIBLE_DEVICES=0 python -m fastedit.editor \
    --data data/example.json \
    --model EleutherAI/gpt-j-6b \
    --config gpt-j-6b \
    --template default
We use the samples in data/example.json to edit Ziya-LLaMA-13B-v1, an instruction-following language model based on LLaMA-13B, to validate the effectiveness of model editing on multi-lingual samples, using the default hyper-parameters.
Here are the generation results of the pre-edited model and the post-edited model, where the pre-edited results contain obsolete factual knowledge and the post-edited results maintain fresh factual knowledge.
// pre-edit
The prime minister of the United Kingdom is Boris Johnson.
// post-edit
The prime minister of the United Kingdom is Rishi Sunak.
// pre-edit
The name of prime minister of the UK is Boris Johnson.
// post-edit
The name of prime minister of the UK is Rishi Sunak.
// pre-edit
日本的首相叫作现任日本首相是菅义伟(Suga Yoshihide)。
// post-edit
日本的首相叫作岸田文雄。
// pre-edit
日本首相名字是现任日本首相的名字是菅义伟(Suga Yoshihide)。
// post-edit
日本首相名字是岸田文雄
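One simple way to summarize results like these is to check whether the target string appears in each generation. The sketch below is an illustration only and is not FastEdit's evaluation code: `contains_target` and `success_rate` are hypothetical helpers, the substring match is an assumed criterion, and the sample strings are copied from the results above.

```python
def contains_target(generation: str, target: str) -> bool:
    """Case-insensitive substring check (an assumed, deliberately crude criterion)."""
    return target.lower() in generation.lower()


def success_rate(samples: list, key: str) -> float:
    """Fraction of samples whose `key` generation mentions the target."""
    hits = sum(contains_target(s[key], s["target"]) for s in samples)
    return hits / len(samples)


# Generations copied from the pre-edit / post-edit results above.
samples = [
    {
        "target": "Rishi Sunak",
        "pre": "The prime minister of the United Kingdom is Boris Johnson.",
        "post": "The prime minister of the United Kingdom is Rishi Sunak.",
    },
    {
        "target": "Rishi Sunak",
        "pre": "The name of prime minister of the UK is Boris Johnson.",
        "post": "The name of prime minister of the UK is Rishi Sunak.",
    },
    {
        "target": "岸田文雄",
        "pre": "日本的首相叫作现任日本首相是菅义伟(Suga Yoshihide)。",
        "post": "日本的首相叫作岸田文雄。",
    },
]

pre_rate = success_rate(samples, "pre")
post_rate = success_rate(samples, "post")
print(f"pre-edit: {pre_rate:.0%}, post-edit: {post_rate:.0%}")
```

A substring match will miss paraphrases and alternative spellings of the target, so it is best treated as a quick sanity check rather than a measure of generalizability.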
You can run the following command to reproduce the above results.
CUDA_VISIBLE_DEVICES=0 python -m fastedit.editor \
    --data data/example.json \
    --model path_to_your_ziya_13b_model \
    --config llama-13b \
    --template ziya
- Implementing the MEMIT algorithm to edit massive factual knowledge at once.
- Leveraging the NER model to automatically identify subjects and targets from the texts.
- Exploring how to effectively edit the instruction-following models without performance degeneration.
This repository is licensed under the Apache-2.0 License.
If this work is helpful, please kindly cite as:
@Misc{fastedit,
  title = {FastEdit: Editing LLMs within 10 Seconds},
  author = {hiyouga},
  howpublished = {\url{https://github.com/hiyouga/FastEdit}},
  year = {2023}
}
The current codebase of this repo largely benefits from Meng et al.'s ROME implementation. Thanks for their wonderful work.