Commit 44ddee1

add multinode launcher script
1 parent c9fd185 commit 44ddee1

3 files changed, +46 -2 lines changed
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+#!/bin/sh
+# Author: Chaoyu Chen
+# Last Modified: 2024/5/20
+# Description: # Launch script on Multiple Nodes
+
+# Run this script on all Nodes.
+
+# You need to export your number of nodes and number of GPUs per node first.
+N_NODE=4
+N_GPU_PER_NODE=8
+
+# You need to export $RANK, $MASTER_ADDR, $MASTER_PORT automatically for each Node.
+
+# config path
+CONFIG="configs/xxx_train_config.json"
+
+# envs used inside training
+export OMP_NUM_THREADS=4
+export TOKENIZERS_PARALLELISM=False
+
+TODAY=$(date +%Y-%m%d-%H%M)
+
+# accelerate launch --config_file accelerate_ds_config.yaml \
+accelerate launch \
+  --num_machines $N_NODE \
+  --num_processes $(($N_NODE*$N_GPU_PER_NODE)) \
+  --use_deepspeed \
+  --deepspeed_multinode_launcher 'standard' \
+  --zero_stage 2 \
+  --offload_optimizer_device 'cpu' \
+  --offload_param_device 'none' \
+  --gradient_accumulation_steps 1 \
+  --gradient_clipping 1.0 \
+  --zero3_init_flag false \
+  --zero3_save_16bit_model false \
+  --main_training_function 'main' \
+  --mixed_precision 'bf16' \
+  --dynamo_backend 'no' \
+  --same_network \
+  --machine_rank $RANK \
+  --main_process_ip $MASTER_ADDR \
+  --main_process_port $MASTER_PORT \
+  --rdzv_backend 'static' \
+  pefts/mft_accelerate.py --train_config "$CONFIG" --distributed_type "deepspeed"
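
The launcher reads $RANK, $MASTER_ADDR and $MASTER_PORT from the environment of each node and derives the total process count from N_NODE and N_GPU_PER_NODE. A minimal sketch of a pre-flight check for those inputs follows. It is not part of this commit: the function names check_launch_env and world_size are hypothetical and only illustrate how a node could fail early when a variable is missing.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; NOT part of commit 44ddee1.

# check_launch_env: return non-zero if any variable the launcher reads
# from the environment is unset or empty.
check_launch_env() {
    missing=""
    for name in RANK MASTER_ADDR MASTER_PORT; do
        # Indirect lookup of the variable named by $name (POSIX sh has no ${!name}).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing required variables:$missing" >&2
        return 1
    fi
    return 0
}

# world_size: total number of processes, mirroring the
# N_NODE * N_GPU_PER_NODE arithmetic passed to --num_processes.
world_size() {
    echo $(( $1 * $2 ))
}
```

Under these assumptions, calling `check_launch_env || exit 1` before `accelerate launch` would stop a misconfigured node before it tries to join the rendezvous, and `world_size "$N_NODE" "$N_GPU_PER_NODE"` gives the value the script passes to --num_processes.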

mftcoder_accelerate/src/pefts/mft_accelerate.py

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 """
 # @author Chaoyu Chen
-# @date 2023/12/11
+# @date 2024/5/20
 # @module mft_accelerate.py
 
 Accelerate + DeepSpeed zero2/zero3/FSDP + Data Parallelism

mftcoder_accelerate/src/pefts/model_mapping.py

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 """
 # @author Chaoyu Chen
-# @date 2023/12/11
+# @date 2024/5/20
 
 Manage supported models and their special token used in training.
 Default targeting modules for LoRA/QLora
