Releases · codefuse-ai/MFTCoder
MFTCoder v0.4.3: Bugfix
Commit: cc55b06
Bugfix: remove the default TensorBoard writer, which could cause permission problems.
P.S. If you hit a "permission denied" error for "/home/admin", please try the fixed release v0.4.3.
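The root cause is a writer instantiated at a hard-coded default path. Below is a minimal sketch of the kind of guard this fix suggests; `make_writer` is a hypothetical helper for illustration, not MFTCoder's actual code.

```python
# Illustrative sketch only; `make_writer` is a hypothetical helper, not
# MFTCoder's actual code. The idea: never instantiate a SummaryWriter at a
# hard-coded default path such as "/home/admin", so an unwritable default
# cannot raise PermissionError.
import os
from typing import Optional
from torch.utils.tensorboard import SummaryWriter

def make_writer(log_dir: Optional[str]) -> Optional[SummaryWriter]:
    """Create a TensorBoard writer only for an explicit, writable log dir."""
    if log_dir is None:
        return None  # no default writer at all
    os.makedirs(log_dir, exist_ok=True)
    if not os.access(log_dir, os.W_OK):
        raise PermissionError(f"TensorBoard log dir is not writable: {log_dir}")
    return SummaryWriter(log_dir=log_dir)
```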
MFTCoder v0.4.2: Support more open source models; support QLoRA + DeepSpeed ZeRO3 / FSDP
Commit: d0b8457
Support more open source models such as Qwen2, Qwen2-MoE, StarCoder2, etc.
Support QLoRA + DeepSpeed ZeRO3 / FSDP, which is efficient for very large models.
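For context, here is a hedged sketch of what a QLoRA setup typically looks like with the HuggingFace transformers and peft APIs; the base model name and LoRA hyperparameters are placeholder assumptions, not MFTCoder's configuration. QLoRA keeps the base weights frozen in 4-bit while ZeRO3/FSDP shards the remaining optimizer states and gradients across GPUs, which is why the combination scales to very large models.

```python
# Hedged sketch: QLoRA setup via HuggingFace transformers + peft. The base
# model name and hyperparameters below are placeholders, not MFTCoder config.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B",                        # any supported base model
    quantization_config=bnb_config,
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)         # only the LoRA adapters are trainable
# Sharding of optimizer states and gradients across GPUs is then delegated
# to DeepSpeed ZeRO3 or FSDP via the training launcher's config.
```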
MFTCoder v0.3.0: Support more open source models, support Self-Paced Loss, support FSDP
Commit: e5243da
Updates:
- Mainly for MFTCoder-accelerate.
- It now supports more open source models such as Mistral, Mixtral (MoE), DeepSeek-Coder, and ChatGLM3.
- It supports FSDP as an option.
- It also supports Self-Paced Loss as a solution for convergence balance in multitask fine-tuning (see the illustrative sketch after this list).
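These notes do not spell out the formulation, so the following is only an illustrative sketch of the general self-paced idea, with a weighting scheme that is an assumption rather than MFTCoder's published loss: tasks that have converged the least receive the largest weights, so no task lags behind.

```python
# Illustrative sketch of self-paced multitask loss weighting; this exact
# scheme is an assumption for illustration, not MFTCoder's formulation.
import torch

def self_paced_weights(val_losses: torch.Tensor,
                       init_losses: torch.Tensor,
                       temperature: float = 1.0) -> torch.Tensor:
    """Higher weight for tasks whose validation loss has dropped the least."""
    progress = val_losses / init_losses      # ~1.0 = barely converged, ~0 = done
    return torch.softmax(progress / temperature, dim=0)

def weighted_mft_loss(task_losses: torch.Tensor,
                      weights: torch.Tensor) -> torch.Tensor:
    """Scalar training loss as a weighted sum of per-task losses."""
    return (weights.detach() * task_losses).sum()

# Example: the third task has made the least relative progress,
# so it receives the largest weight.
w = self_paced_weights(torch.tensor([0.4, 0.9, 1.1]),
                       torch.tensor([1.0, 1.0, 1.2]))
loss = weighted_mft_loss(torch.tensor([0.4, 0.9, 1.1], requires_grad=True), w)
```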
v0.1.0 release: Multi-Task Fine-tuning Framework for Multiple Base Models
Commit: 7946e4f
- We released MFTCoder, which supports fine-tuning Code Llama, Llama, Llama2, StarCoder, ChatGLM2, CodeGeeX2, Qwen, and GPT-NeoX models with LoRA/QLoRA.
- mft_peft_hf is based on the HuggingFace Accelerate and DeepSpeed frameworks; a toy sketch of that style of training loop follows below.
- mft_atorch is based on the ATorch framework, a fast distributed training framework for LLMs.
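As a toy illustration of the Accelerate-based style that mft_peft_hf's description implies (a self-contained sketch; the tiny linear model and random data are stand-ins, not MFTCoder's actual model or datasets):

```python
# Toy, self-contained sketch of a HuggingFace Accelerate training loop; the
# linear model and random data are stand-ins for MFTCoder's LLM + datasets.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

model = torch.nn.Linear(8, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(32, 8), torch.randint(0, 2, (32,)))
loader = DataLoader(dataset, batch_size=4)
loss_fn = torch.nn.CrossEntropyLoss()

accelerator = Accelerator()  # picks up DeepSpeed/FSDP settings from `accelerate config`
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for x, y in loader:
    loss = loss_fn(model(x), y)
    accelerator.backward(loss)  # handles mixed precision and gradient scaling
    optimizer.step()
    optimizer.zero_grad()
```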