Training and evaluation scripts for JGLUE, a Japanese language understanding benchmark


nobu-g/JGLUE-evaluation-scripts


Requirements

Getting started

  • Create a virtual environment and install dependencies.

    $ uv venv -p /path/to/python
    $ uv sync

  • Log in to wandb.

    $ wandb login

Training and evaluation

You can train and test a model with the following command:

# For training and evaluating MARC-ja
uv run python src/train.py -cn marc_ja devices=[0,1] max_batches_per_device=16

Here are commonly used options:

  • -cn: Task name. Choose from marc_ja, jcola, jsts, jnli, jsquad, and jcqa.
  • devices: GPUs to use.
  • max_batches_per_device: Maximum number of batches to process per device (default: 4).
  • compile: JIT-compile the model with torch.compile for faster training (default: false).
  • model: Pre-trained model name. See the YAML config files under configs/model.
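Putting these options together, a full invocation might look like the following sketch. The model name deberta_base is an assumption for illustration; check the YAML files under configs/model for the names actually available.

```shell
# Hypothetical example: train on JSTS with two GPUs, a smaller
# per-device batch cap, and torch.compile enabled.
# "deberta_base" is an assumed model name; see configs/model.
uv run python src/train.py -cn jsts devices=[0,1] max_batches_per_device=8 compile=true model=deberta_base
```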

To evaluate on the out-of-domain split of the JCoLA dataset, specify datamodule/valid=jcola_ood (or datamodule/valid=jcola_ood_annotated). For more options, see the YAML config files under configs.
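For example, a JCoLA run evaluated on the out-of-domain split might look like this sketch; adjust devices to your setup.

```shell
# Train JCoLA and validate on the out-of-domain split (sketch)
uv run python src/train.py -cn jcola devices=[0] datamodule/valid=jcola_ood
```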

Debugging

uv run python scripts/train.py -cn marc_ja.debug

You can specify trainer=cpu.debug to use the CPU.

uv run python scripts/train.py -cn marc_ja.debug trainer=cpu.debug

If you are on a machine with GPUs, you can specify the GPUs to use with the devices option.

uv run python scripts/train.py -cn marc_ja.debug devices=[0]

Tuning hyper-parameters

$ wandb sweep <(sed 's/MODEL_NAME/deberta_base/' sweeps/jcola.yaml)
wandb: Creating sweep from: /dev/fd/xx
wandb: Created sweep with ID: xxxxxxxx
wandb: View sweep at: https://wandb.ai/<wandb-user>/JGLUE-evaluation-scripts/sweeps/xxxxxxxx
wandb: Run sweep agent with: wandb agent <wandb-user>/JGLUE-evaluation-scripts/xxxxxxxx
$ DEVICES=0,1 MAX_BATCHES_PER_DEVICE=16 COMPILE=true wandb agent <wandb-user>/JGLUE-evaluation-scripts/xxxxxxxx

Results

We fine-tuned the following models and evaluated them on the dev set of JGLUE. We tuned the learning rate and the number of training epochs for each model and task, following the JGLUE paper.

| Model | MARC-ja/acc | JCoLA/acc | JSTS/pearson | JSTS/spearman | JNLI/acc | JSQuAD/EM | JSQuAD/F1 | JComQA/acc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Waseda RoBERTa base | 0.965 | 0.867 | 0.913 | 0.876 | 0.905 | 0.853 | 0.916 | 0.853 |
| Waseda RoBERTa large (seq512) | 0.969 | 0.849 | 0.925 | 0.890 | 0.928 | 0.910 | 0.955 | 0.900 |
| LUKE Japanese base* | 0.965 | - | 0.916 | 0.877 | 0.912 | - | - | 0.842 |
| LUKE Japanese large* | 0.965 | - | 0.932 | 0.902 | 0.927 | - | - | 0.893 |
| DeBERTaV2 base | 0.970 | 0.879 | 0.922 | 0.886 | 0.922 | 0.899 | 0.951 | 0.873 |
| DeBERTaV2 large | 0.968 | 0.882 | 0.925 | 0.892 | 0.924 | 0.912 | 0.959 | 0.890 |
| DeBERTaV3 base | 0.960 | 0.878 | 0.927 | 0.891 | 0.927 | 0.896 | 0.947 | 0.875 |

*The scores of LUKE are from the official repository.

Tuned hyper-parameters

  • Learning rate: {2e-05, 3e-05, 5e-05}

| Model | MARC-ja/acc | JCoLA/acc | JSTS/pearson | JSTS/spearman | JNLI/acc | JSQuAD/EM | JSQuAD/F1 | JComQA/acc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Waseda RoBERTa base | 3e-05 | 3e-05 | 2e-05 | 2e-05 | 3e-05 | 3e-05 | 3e-05 | 5e-05 |
| Waseda RoBERTa large (seq512) | 2e-05 | 2e-05 | 3e-05 | 3e-05 | 2e-05 | 2e-05 | 2e-05 | 3e-05 |
| DeBERTaV2 base | 2e-05 | 3e-05 | 5e-05 | 5e-05 | 3e-05 | 2e-05 | 2e-05 | 5e-05 |
| DeBERTaV2 large | 5e-05 | 2e-05 | 5e-05 | 5e-05 | 2e-05 | 2e-05 | 2e-05 | 3e-05 |
| DeBERTaV3 base | 5e-05 | 2e-05 | 3e-05 | 3e-05 | 2e-05 | 5e-05 | 5e-05 | 2e-05 |
  • Training epochs: {3, 4}

| Model | MARC-ja/acc | JCoLA/acc | JSTS/pearson | JSTS/spearman | JNLI/acc | JSQuAD/EM | JSQuAD/F1 | JComQA/acc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Waseda RoBERTa base | 4 | 3 | 4 | 4 | 3 | 4 | 4 | 3 |
| Waseda RoBERTa large (seq512) | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 |
| DeBERTaV2 base | 3 | 4 | 3 | 3 | 3 | 4 | 4 | 4 |
| DeBERTaV2 large | 3 | 3 | 4 | 4 | 3 | 4 | 4 | 3 |
| DeBERTaV3 base | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |

Huggingface hub links

Author

Nobuhiro Ueda (ueda at nlp.ist.i.kyoto-u.ac.jp)

Reference

