GPT-2 Japanese model for HuggingFace's transformers


colorfulscoop/gpt-ja


This repository provides a GPT-2-based Japanese language model trained on a Japanese Wikipedia dataset.

The currently supported models are listed in the model summary below.

Model summary:

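| 🤗 Model Hub | Data | Revision | Code | Total params | Test set PPL | vocab_size | n_ctx | n_layer | n_head | n_embd | Epochs | Training time |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| colorfulscoop/gpt2-small-ja | jawiki_20210820 | 20210820.1.0 | ef927e1 | 110M | 29.13 | 32,000 | 1,024 | 12 | 12 | 768 | 30 | 15 days |
| colorfulscoop/gpt2-small-ja | jawiki_20210301 | 20210301.1.0 | - | 110M | - | 32,000 | 1,024 | 12 | 12 | 768 | 30 | - |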
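The 110M figure in the table can be cross-checked from the listed hyperparameters. The sketch below assumes the standard GPT-2 layout (tied input/output embeddings, learned position embeddings, a 4x-wide MLP, biases on every linear layer, two LayerNorms per block plus a final one). The function name `gpt2_param_count` is made up for illustration and is not part of this repository.

```python
def gpt2_param_count(vocab_size: int, n_ctx: int, n_layer: int, n_embd: int) -> int:
    """Count parameters of a GPT-2 style model with tied embeddings (assumed layout)."""
    d = n_embd
    token_emb = vocab_size * d                   # token embeddings, shared with the output layer
    pos_emb = n_ctx * d                          # learned position embeddings
    attn = (d * 3 * d + 3 * d) + (d * d + d)     # fused QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)  # expand to 4*d, then project back to d
    layer_norms = 2 * (2 * d)                    # two LayerNorms per block (weight + bias)
    per_layer = attn + mlp + layer_norms
    final_ln = 2 * d                             # LayerNorm after the last block
    return token_emb + pos_emb + n_layer * per_layer + final_ln


total = gpt2_param_count(vocab_size=32_000, n_ctx=1_024, n_layer=12, n_embd=768)
print(f"{total:,}")  # 110,418,432
```

Under these assumptions the count comes to roughly 110M, matching the table. `n_head` does not appear because splitting the attention projection into heads does not change the number of weights.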

Data summary:

| Id | Corpus | #tokens in train set | #tokens in valid set | #tokens in test set |
| --- | --- | --- | --- | --- |
| jawiki_20210820 | Japanese Wikipedia on 20210820 | 540M | 13M | 13M |
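As a rough reading of these numbers, the snippet below computes the share of each split and the number of training tokens processed over the 30 epochs listed in the model summary. The token counts are the rounded values from the table, and the total assumes each epoch is one full pass over the train set, so the results are approximate.

```python
# Token counts from the data summary table (rounded to millions in the source).
tokens = {"train": 540e6, "valid": 13e6, "test": 13e6}
epochs = 30  # from the model summary table

total = sum(tokens.values())
shares = {split: count / total for split, count in tokens.items()}

# Assumption: each epoch is one full pass over the train set.
tokens_seen = tokens["train"] * epochs

for split, share in shares.items():
    print(f"{split}: {share:.1%}")
print(f"tokens seen in training: {tokens_seen / 1e9:.1f}B")
```

This gives about 95.4% / 2.3% / 2.3% for train / valid / test and roughly 16.2B training tokens processed in total.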

Note: models trained on the same data use the same tokenizer.

Sample usage:

>>> import transformers
>>> pipeline = transformers.pipeline("text-generation", "models/gpt2-small", revision="20210820.1.0")
>>> pipeline("統計的機械学習でのニューラルネットワーク", do_sample=True)
[{'generated_text': '統計的機械学習でのニューラルネットワークの解析は、多くのアルゴリズムの完全な実装をもたらした。これらの'}]
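With `do_sample=True`, the next token is drawn from the model's predicted distribution instead of always taking the most likely token. The released config.json described later in this README sets `top_k` to 50 and `top_p` to 0.95 as defaults. The sketch below illustrates the idea of top-k followed by nucleus (top-p) filtering on a plain list of probabilities. It is a simplified stand-in, not the transformers implementation (which operates on logits), and the function names are hypothetical.

```python
import random


def top_k_top_p_filter(probs, top_k=50, top_p=0.95):
    """Return {token_id: prob} restricted by top-k, then top-p, and renormalized."""
    # Rank token ids by probability, highest first, and keep at most top_k of them.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass = sum(probs[i] for i in ranked)

    # Nucleus step: keep the smallest prefix whose renormalized cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / mass
        if cumulative >= top_p:
            break

    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


def sample(probs, top_k=50, top_p=0.95, rng=random):
    """Draw one token id from the filtered distribution."""
    filtered = top_k_top_p_filter(probs, top_k, top_p)
    ids = list(filtered)
    return rng.choices(ids, weights=[filtered[i] for i in ids], k=1)[0]


toy = [0.5, 0.3, 0.15, 0.05]  # toy 4-token distribution, not model output
print(top_k_top_p_filter(toy, top_k=3, top_p=0.7))
```

On the toy distribution, top-k drops token 3, and the nucleus step then keeps tokens 0 and 1 with probabilities of about 0.625 and 0.375.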

Training details

The model was trained in the following environment.

  • OS: Ubuntu 18.04.5 LTS
  • GPU: RTX 2080 Ti x1

Environment preparation

$ docker container run --gpus all --ipc=host --rm -it -v $(pwd):/work -w /work nvidia/cuda:11.1-devel-ubuntu20.04 bash
(container)$ apt update && apt install -y python3 python3-pip git wget
(container)$ pip3 install torch==1.8.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
(container)$ pip3 install -r requirements.txt

Data preparation

Check the latest dump date in the list at https://dumps.wikimedia.org/jawiki/ .

(container)$ bash src/get_jawiki.sh 20210820 input

The generated data can be found under the input directory.

(container)$ ls -1 input/20210820/{train,valid,test}.txt
input/20210820/test.txt
input/20210820/train.txt
input/20210820/valid.txt

Train tokenizer

Train the SentencePiece model in the same container used for data preparation.

(container)$ python3 src/train_tokenizer.py --train_file input/20210820/train.txt --model_dir models/gpt2-small

Train model

Run training with the config file:

(container)$ python3 src/train.py train --config input/gpt2-small.json
...
255999it [10:21:51,  7.03it/s]
{'epoch': 30, 'batch': 256000, 'step': 493108, 'train_loss': 0.190585415356369, 'lr': 0.0001}
263236it [10:39:12,  6.86it/s]
6788it [10:28, 10.81it/s]
{'epoch': 30, 'valid_loss': 3.417723441833458, 'valid_ppl': 30.49990112587307, 'save_model': True}

Test

(container)$ python3 src/train.py test --config input/gpt2-small.json
6793it [09:16, 12.20it/s]
{'test_loss': 3.371613106758486, 'test_ppl': 29.125471679484484}
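The reported perplexities are consistent with the usual definition, perplexity = exp(loss), where the loss is taken to be the mean per-token cross-entropy in nats. That interpretation is an assumption about src/train.py, but the logged numbers can be checked against it directly:

```python
import math

# Losses copied from the training and test logs above.
valid_loss = 3.417723441833458
test_loss = 3.371613106758486

# Assumption: the loss is the mean per-token cross-entropy in nats.
valid_ppl = math.exp(valid_loss)
test_ppl = math.exp(test_loss)

print(f"valid_ppl = {valid_ppl:.2f}")
print(f"test_ppl  = {test_ppl:.2f}")
```

The results round to 30.50 and 29.13, in line with the logged `valid_ppl` and `test_ppl` values and with the test set PPL in the model summary.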

Export TensorFlow model

(container)$ pip install tensorflow
(container)$ python3
>>> from transformers import TFGPT2LMHeadModel
>>> model = TFGPT2LMHeadModel.from_pretrained("models/gpt2-small", from_pt=True)
>>> model.save_pretrained("models/gpt2-small")

Upload to 🤗 Model Hub

Follow the official documentation to upload the model.

Prepare environment

Install Git LFS. On macOS, it can be installed as follows.

$ brew install git-lfs
$ git lfs install
Updated git hooks.
Git LFS initialized.

Then clone the repository.

$ git clone https://huggingface.co/colorfulscoop/gpt2-small-ja release/gpt2-small-ja

Copy model to release directory

$ cp models/gpt2-small/* release/gpt2-small-ja/
cp: models/gpt2-small/spm is a directory (not copied).
$ cd release/gpt2-small-ja

Then modify config.json to specify default generation values, as in the following diff.

   "unk_token_id": 1,
   "use_cache": true,
-  "vocab_size": 32000
+  "vocab_size": 32000,
+  "top_k": 50,
+  "top_p": 0.95,
+  "do_sample": true
 }

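The same change to config.json can also be applied programmatically. The following is a minimal sketch using only the standard library; the helper names `add_generation_defaults` and `patch_config_file` are hypothetical and not part of this repository, and the example runs on an in-memory fragment containing only the keys visible in the diff rather than the real file.

```python
import json

# Default generation values taken from the diff above.
GENERATION_DEFAULTS = {"top_k": 50, "top_p": 0.95, "do_sample": True}


def add_generation_defaults(config: dict) -> dict:
    """Return a copy of the config with the generation defaults merged in."""
    updated = dict(config)
    updated.update(GENERATION_DEFAULTS)
    return updated


def patch_config_file(path: str = "config.json") -> None:
    """Rewrite the JSON file at `path` in place with the extra keys."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(add_generation_defaults(config), f, indent=2, ensure_ascii=False)
        f.write("\n")


# Example on an in-memory fragment, not the full config.json.
fragment = {"unk_token_id": 1, "use_cache": True, "vocab_size": 32000}
print(json.dumps(add_generation_defaults(fragment), indent=2))
```

Calling `patch_config_file()` from inside release/gpt2-small-ja would rewrite the file itself. Note that re-serializing may change the key order or indentation relative to a hand edit.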
Commit changes to git.

$ git add .

Release

$ git push origin
