
Commit 5029545 ("chore")
1 parent 1954d1e

65 files changed: +1705 −189 lines


README.md

Lines changed: 35 additions & 0 deletions

```diff
@@ -3,3 +3,38 @@
 poetry install
 <!-- poetry run pip install --upgrade --force-reinstall --no-deps "apache-beam[gcp]" "multiprocess==0.70.14"-->
 poetry run pip install --upgrade --force-reinstall --no-deps "apache-beam[gcp]" "multiprocess==0.70.14" "dill==0.3.1.1"
+
+
+
+| base models | batch size | lr | JSICK (val) | JSICK (test) | JSTS (train) | JSTS (val) |
+| --- | :---: | :---: | :---: | :---: | :---: | :---: |
+| [cl-tohoku/bert-base-japanese-v2](https://huggingface.co/cl-tohoku/bert-base-japanese-v2) | | | | | | |
+| [cl-tohoku/bert-base-japanese-char-v2](https://huggingface.co/cl-tohoku/bert-base-japanese-char-v2) | | | | | | |
+| [cl-tohoku/bert-base-japanese](https://huggingface.co/cl-tohoku/bert-base-japanese) | | | | | | |
+| [cl-tohoku/bert-base-japanese-whole-word-masking](https://huggingface.co/cl-tohoku/bert-base-japanese-whole-word-masking) | | | | | | |
+| [cl-tohoku/bert-base-japanese-char](https://huggingface.co/cl-tohoku/bert-base-japanese-char) | | | | | | |
+| [ku-nlp/roberta-base-japanese-char-wwm](https://huggingface.co/ku-nlp/roberta-base-japanese-char-wwm) | | | | | | |
+| [studio-ousia/luke-japanese-base-lite](https://huggingface.co/studio-ousia/luke-japanese-base-lite) | | | | | | |
+| | | | | | | |
+| [ku-nlp/deberta-v2-base-japanese](https://huggingface.co/ku-nlp/deberta-v2-base-japanese) | | | | | | |
+| [nlp-waseda/roberta-base-japanese](https://huggingface.co/nlp-waseda/roberta-base-japanese) | | | | | | |
+| [megagonlabs/roberta-long-japanese](https://huggingface.co/megagonlabs/roberta-long-japanese) | | | | | | |
+| | | | | | | |
+| [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) | | | | | | |
+| [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) | | | | | | |
+| [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) | | | | | | |
+| [studio-ousia/mluke-base-lite](https://huggingface.co/studio-ousia/mluke-base-lite) | | | | | | |
+
+
+
+| large models | batch size | lr | JSICK (val) | JSICK (test) | JSTS (train) | JSTS (val) |
+| --- | :---: | :---: | :---: | :---: | :---: | :---: |
+| [cl-tohoku/bert-large-japanese](https://huggingface.co/cl-tohoku/bert-large-japanese) | | | | | | |
+| [ku-nlp/roberta-large-japanese-char-wwm](https://huggingface.co/ku-nlp/roberta-large-japanese-char-wwm) | | | | | | |
+| [studio-ousia/luke-japanese-large-lite](https://huggingface.co/studio-ousia/luke-japanese-large-lite) | | | | | | |
+| | | | | | | |
+| [nlp-waseda/roberta-large-japanese](https://huggingface.co/nlp-waseda/roberta-large-japanese) | | | | | | |
+| [ku-nlp/deberta-v2-large-japanese](https://huggingface.co/ku-nlp/deberta-v2-large-japanese) | | | | | | |
+| | | | | | | |
+| [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) | | | | | | |
+| [studio-ousia/mluke-large-lite](https://huggingface.co/studio-ousia/mluke-large-lite) | | | | | | |
```
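The README's force-reinstall pins `multiprocess==0.70.14` and `dill==0.3.1.1` alongside `apache-beam[gcp]`. A quick sanity check of those pins (hypothetical, not part of this repo; `check_pins` is an illustrative helper) could look like:

```python
# Illustrative helper (not part of the repo): verify that the versions
# pinned by the README's pip command are what is actually installed.
from importlib.metadata import version

def check_pins(pins):
    """Map each package to (installed_version, expected_version, ok)."""
    report = {}
    for pkg, expected in pins.items():
        try:
            installed = version(pkg)
        except Exception:
            installed = None  # package not installed
        report[pkg] = (installed, expected, installed == expected)
    return report

# Pins taken from the README's pip command above.
print(check_pins({"multiprocess": "0.70.14", "dill": "0.3.1.1"}))
```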

scripts/10/0.sh

Lines changed: 18 additions & 8 deletions

```diff
@@ -1,15 +1,25 @@
 device="cuda:0"
-model_name="cl-tohoku/bert-large-japanese"
 
 for i in 0 1 2; do
-    for batch_size in 32 64 128 256 512; do
+    for model_name in studio-ousia/luke-japanese-large-lite studio-ousia/luke-japanese-base-lite; do
         for lr in 1e-5 3e-5 5e-5; do
-            poetry run python src/train_unsup.py \
-                --dataset_name wiki40b \
-                --model_name $model_name \
-                --batch_size $batch_size \
-                --lr $lr \
-                --device $device
+            for batch_size in 512; do
+                poetry run python src/train_sup.py \
+                    --dataset_name jsnli+nu-snli \
+                    --model_name $model_name \
+                    --batch_size $batch_size \
+                    --lr $lr \
+                    --gradient_checkpointing \
+                    --device $device
+            done
+            for batch_size in 256 128 64; do
+                poetry run python src/train_sup.py \
+                    --dataset_name jsnli+nu-snli \
+                    --model_name $model_name \
+                    --batch_size $batch_size \
+                    --lr $lr \
+                    --device $device
+            done
         done
     done
 done
```

scripts/10/1.sh

Lines changed: 18 additions & 8 deletions

```diff
@@ -1,15 +1,25 @@
 device="cuda:1"
-model_name="studio-ousia/luke-japanese-large-lite"
 
 for i in 0 1 2; do
-    for batch_size in 32 64 128 256 512; do
+    for model_name in cl-tohoku/bert-large-japanese ku-nlp/roberta-large-japanese-char-wwm; do
         for lr in 1e-5 3e-5 5e-5; do
-            poetry run python src/train_unsup.py \
-                --dataset_name wiki40b \
-                --model_name $model_name \
-                --batch_size $batch_size \
-                --lr $lr \
-                --device $device
+            for batch_size in 512; do
+                poetry run python src/train_sup.py \
+                    --dataset_name jsnli+nu-snli \
+                    --model_name $model_name \
+                    --batch_size $batch_size \
+                    --lr $lr \
+                    --gradient_checkpointing \
+                    --device $device
+            done
+            for batch_size in 256 128 64; do
+                poetry run python src/train_sup.py \
+                    --dataset_name jsnli+nu-snli \
+                    --model_name $model_name \
+                    --batch_size $batch_size \
+                    --lr $lr \
+                    --device $device
+            done
         done
     done
 done
```

scripts/10/2.sh

Lines changed: 20 additions & 8 deletions

```diff
@@ -1,15 +1,27 @@
 device="cuda:2"
-model_name="cl-tohoku/bert-base-japanese-v2"
 
 for i in 0 1 2; do
-    for batch_size in 32 64 128 256 512; do
+    for model_name in nlp-waseda/roberta-large-japanese ku-nlp/deberta-v2-large-japanese; do
         for lr in 1e-5 3e-5 5e-5; do
-            poetry run python src/train_unsup.py \
-                --dataset_name wiki40b \
-                --model_name $model_name \
-                --batch_size $batch_size \
-                --lr $lr \
-                --device $device
+            for batch_size in 512; do
+                poetry run python src/train_sup.py \
+                    --dataset_name jsnli+nu-snli \
+                    --model_name $model_name \
+                    --batch_size $batch_size \
+                    --lr $lr \
+                    --use_jumanpp \
+                    --gradient_checkpointing \
+                    --device $device
+            done
+            for batch_size in 256 128 64; do
+                poetry run python src/train_sup.py \
+                    --dataset_name jsnli+nu-snli \
+                    --model_name $model_name \
+                    --batch_size $batch_size \
+                    --lr $lr \
+                    --use_jumanpp \
+                    --device $device
+            done
         done
     done
 done
```

scripts/10/3.sh

Lines changed: 18 additions & 8 deletions

```diff
@@ -1,15 +1,25 @@
 device="cuda:3"
-model_name="ku-nlp/deberta-v2-large-japanese"
 
 for i in 0 1 2; do
-    for batch_size in 32 64 128 256 512; do
+    for model_name in xlm-roberta-large studio-ousia/mluke-large-lite; do
         for lr in 1e-5 3e-5 5e-5; do
-            poetry run python src/train_unsup.py \
-                --dataset_name wiki40b \
-                --model_name $model_name \
-                --batch_size $batch_size \
-                --lr $lr \
-                --device $device
+            for batch_size in 512; do
+                poetry run python src/train_sup.py \
+                    --dataset_name jsnli+nu-snli \
+                    --model_name $model_name \
+                    --batch_size $batch_size \
+                    --lr $lr \
+                    --gradient_checkpointing \
+                    --device $device
+            done
+            for batch_size in 256 128 64; do
+                poetry run python src/train_sup.py \
+                    --dataset_name jsnli+nu-snli \
+                    --model_name $model_name \
+                    --batch_size $batch_size \
+                    --lr $lr \
+                    --device $device
+            done
         done
     done
 done
```
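The four updated scripts above share one pattern: each GPU sweeps 3 repetitions × 2 models × 3 learning rates × 4 batch sizes of supervised training on jsnli+nu-snli, enabling gradient checkpointing only for the memory-heavy batch size 512. As a rough illustration (not part of the repo; the command list is an assumption modeled on scripts/10/0.sh), the runs one script launches can be enumerated as:

```python
from itertools import product

# Illustrative enumeration of the runs scripts/10/0.sh launches:
# 3 repetitions x 2 models x 3 lrs x 4 batch sizes = 72 invocations,
# with --gradient_checkpointing added only for batch size 512.
seeds = [0, 1, 2]
models = [
    "studio-ousia/luke-japanese-large-lite",
    "studio-ousia/luke-japanese-base-lite",
]
lrs = ["1e-5", "3e-5", "5e-5"]
batch_sizes = [512, 256, 128, 64]

def enumerate_runs():
    runs = []
    for seed, model, lr, bs in product(seeds, models, lrs, batch_sizes):
        args = [
            "poetry", "run", "python", "src/train_sup.py",
            "--dataset_name", "jsnli+nu-snli",
            "--model_name", model,
            "--batch_size", str(bs),
            "--lr", lr,
        ]
        if bs == 512:
            # mirrors the scripts: checkpointing only at the largest batch
            args.append("--gradient_checkpointing")
        args += ["--device", "cuda:0"]
        runs.append(args)
    return runs

print(len(enumerate_runs()))  # 72 runs for this GPU's sweep
```

The other three scripts differ only in the model pair, the CUDA device, and (for scripts/10/2.sh) the extra `--use_jumanpp` flag.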

scripts/10/prev/2023-03-10/0.sh

Lines changed: 30 additions & 0 deletions (new file)

```sh
device="cuda:0"

for i in 0 1 2; do
    for model_name in cl-tohoku/bert-base-japanese-v2 cl-tohoku/bert-base-japanese-char-v2 cl-tohoku/bert-base-japanese cl-tohoku/bert-base-japanese-whole-word-masking; do
        for batch_size in 64 128 256 512; do
            for lr in 1e-5 3e-5 5e-5; do
                poetry run python src/train_sup.py \
                    --dataset_name nu-snli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --device $device

                poetry run python src/train_sup.py \
                    --dataset_name nu-mnli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --device $device

                poetry run python src/train_sup.py \
                    --dataset_name nu-snli+mnli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --device $device
            done
        done
    done
done
```

scripts/10/prev/2023-03-10/1.sh

Lines changed: 30 additions & 0 deletions (new file)

```sh
device="cuda:1"

for i in 0 1 2; do
    for model_name in cl-tohoku/bert-large-japanese ku-nlp/roberta-large-japanese-char-wwm; do
        for batch_size in 64 128 256 512; do
            for lr in 1e-5 3e-5 5e-5; do
                poetry run python src/train_sup.py \
                    --dataset_name nu-snli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --device $device

                poetry run python src/train_sup.py \
                    --dataset_name nu-mnli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --device $device

                poetry run python src/train_sup.py \
                    --dataset_name nu-snli+mnli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --device $device
            done
        done
    done
done
```

scripts/10/prev/2023-03-10/2.sh

Lines changed: 33 additions & 0 deletions (new file)

```sh
device="cuda:2"

for i in 0 1 2; do
    for model_name in nlp-waseda/roberta-large-japanese ku-nlp/deberta-v2-large-japanese; do
        for batch_size in 64 128 256 512; do
            for lr in 1e-5 3e-5 5e-5; do
                poetry run python src/train_sup.py \
                    --dataset_name nu-snli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --use_jumanpp \
                    --device $device

                poetry run python src/train_sup.py \
                    --dataset_name nu-mnli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --use_jumanpp \
                    --device $device

                poetry run python src/train_sup.py \
                    --dataset_name nu-snli+mnli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --use_jumanpp \
                    --device $device
            done
        done
    done
done
```

scripts/10/prev/2023-03-10/3.sh

Lines changed: 30 additions & 0 deletions (new file)

```sh
device="cuda:3"

for i in 0 1 2; do
    for model_name in xlm-roberta-large studio-ousia/mluke-large-lite; do
        for batch_size in 64 128 256 512; do
            for lr in 1e-5 3e-5 5e-5; do
                poetry run python src/train_sup.py \
                    --dataset_name nu-snli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --device $device

                poetry run python src/train_sup.py \
                    --dataset_name nu-mnli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --device $device

                poetry run python src/train_sup.py \
                    --dataset_name nu-snli+mnli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --device $device
            done
        done
    done
done
```

scripts/10/prev/2023-03-11/0.sh

Lines changed: 25 additions & 0 deletions (new file)

```sh
device="cuda:0"

for batch_size in 1024; do
    for i in 0 1; do
        for model_name in cl-tohoku/bert-large-japanese ku-nlp/roberta-large-japanese-char-wwm; do
            for lr in 1e-5 3e-5 5e-5; do
                poetry run python src/train_sup.py \
                    --dataset_name nu-snli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --gradient_checkpointing \
                    --device $device

                poetry run python src/train_sup.py \
                    --dataset_name nu-snli+mnli \
                    --model_name $model_name \
                    --batch_size $batch_size \
                    --lr $lr \
                    --gradient_checkpointing \
                    --device $device
            done
        done
    done
done
```

0 commit comments
