# Introducing GGUF-my-LoRA (#10123)
With the recent refactoring to LoRA support in llama.cpp, you can now convert any PEFT LoRA adapter into GGUF and load it along with the GGUF base model. To facilitate the process, we added a brand new space called GGUF-my-LoRA.
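If you prefer to convert locally instead of using the space, llama.cpp also ships a `convert_lora_to_gguf.py` script. A minimal sketch, assuming a recent llama.cpp checkout, with placeholder paths for the adapter and its original (safetensors) base model; check `python convert_lora_to_gguf.py --help` for the exact flags in your version:

```sh
# Convert a PEFT LoRA adapter to GGUF at F16 precision.
# ./my-lora-adapter and ./Meta-Llama-3.1-8B-Instruct are placeholder paths.
python convert_lora_to_gguf.py ./my-lora-adapter \
    --base ./Meta-Llama-3.1-8B-Instruct \
    --outtype f16
```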
## How to Convert PEFT LoRA to GGUF

In this example, I will take bartowski/Meta-Llama-3.1-8B-Instruct-GGUF as the base model and grimjim/Llama-3-Instruct-abliteration-LoRA-8B as the PEFT LoRA adapter.

To begin, go to GGUF-my-LoRA and sign in with your Hugging Face account. Then, select the PEFT LoRA you want to convert.

Once the conversion is complete, a new repository is created on your personal account. Here is an example of a converted GGUF LoRA adapter: ngxson/Llama-3-Instruct-abliteration-LoRA-8B-F16-GGUF

## How to Use the Adapter

### With llama-cli

You can load the base model with `-m` and apply the adapter with `--lora`, or with `--lora-scaled` to control its strength. Here are some examples:

```sh
# With default scale = 1.0
./llama-cli -c 2048 -cnv \
    -m Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf \
    --lora Llama-3-Instruct-abliteration-8B.gguf

# With custom scale
./llama-cli -c 2048 -cnv \
    -m Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf \
    --lora-scaled Llama-3-Instruct-abliteration-8B.gguf 0.5
```
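For intuition on the scale argument: a LoRA adapter stores a low-rank weight delta, and the scale controls how strongly that delta is mixed into the base weights. In the standard LoRA formulation (general LoRA background, not a llama.cpp-specific detail, so treat it as a sketch):

$$W' = W + s \cdot BA$$

So `--lora-scaled ... 0.5` applies the adapter at roughly half strength, and a scale of 0 leaves the base model unchanged.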
### With llama-server
You can add one or multiple adapters by repeating the `--lora` argument:

```sh
# Single adapter
./llama-server -c 4096 \
    -m Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf \
    --lora Llama-3-Instruct-abliteration-8B.gguf

# Multiple adapters
./llama-server -c 4096 \
    -m Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf \
    --lora adapter_1.gguf \
    --lora adapter_2.gguf \
    --lora adapter_3.gguf \
    --lora adapter_4.gguf \
    --lora-init-without-apply
```

The `--lora-init-without-apply` argument tells the server to load the adapters without applying them. You can then apply (hot reload) them via the `POST /lora-adapters` endpoint.

To know more about LoRA usage with the llama.cpp server, refer to the llama.cpp server documentation.
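As a concrete illustration of the hot-reload flow, here is a minimal sketch, assuming the server listens on the default `http://localhost:8080` and that adapter IDs follow the order of the `--lora` arguments (see the server documentation for the exact request schema):

```sh
# List the adapters the server loaded, along with their current scales.
curl http://localhost:8080/lora-adapters

# Apply the first adapter at full strength and keep the second one disabled.
curl -X POST http://localhost:8080/lora-adapters \
    -H "Content-Type: application/json" \
    -d '[{"id": 0, "scale": 1.0}, {"id": 1, "scale": 0.0}]'
```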