line/LINE-DistilBERT-Japanese
This is a DistilBERT model pre-trained on 131 GB of Japanese web text. The teacher model is a BERT-base model built in-house at LINE. The model was trained by LINE Corporation.
https://huggingface.co/line-corporation/line-distilbert-base-japanese
README_ja.md is written in Japanese.
```python
from transformers import AutoTokenizer, AutoModel

# The custom Japanese tokenizer is loaded from the Hub, hence trust_remote_code=True
tokenizer = AutoTokenizer.from_pretrained("line-corporation/line-distilbert-base-japanese", trust_remote_code=True)
model = AutoModel.from_pretrained("line-corporation/line-distilbert-base-japanese")

sentence = "LINE株式会社で[MASK]の研究・開発をしている。"
print(model(**tokenizer(sentence, return_tensors="pt")))
```
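Since the example sentence contains a [MASK] token, the checkpoint can also be exercised through the fill-mask pipeline. This is a hedged sketch, not part of the original README: it assumes the published checkpoint includes the masked-LM head and that the custom tokenizer uses [MASK] as its mask token.

```python
from transformers import pipeline

# Sketch only: assumes the Hub checkpoint provides masked-LM weights and a
# [MASK] mask token; trust_remote_code is needed for the custom tokenizer.
fill_mask = pipeline(
    "fill-mask",
    model="line-corporation/line-distilbert-base-japanese",
    trust_remote_code=True,
)
for candidate in fill_mask("LINE株式会社で[MASK]の研究・開発をしている。"):
    print(candidate["token_str"], candidate["score"])
```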
The following packages are required: fugashi, sentencepiece, unidic-lite.
The model architecture is the DistilBERT base model: 6 layers, 768 dimensions of hidden states, 12 attention heads, 66M parameters.
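The stated sizes can be checked directly against the published configuration. A minimal sketch (not from the original README), assuming the checkpoint uses the standard DistilBERT config field names (n_layers, dim, n_heads):

```python
from transformers import AutoConfig, AutoModel

# Inspect the architecture reported above: 6 layers, 768-dim hidden states, 12 heads.
config = AutoConfig.from_pretrained("line-corporation/line-distilbert-base-japanese")
print(config.n_layers, config.dim, config.n_heads)

# Count parameters; compare with the 66M/68M figures quoted in this README,
# which differ depending on how embeddings are counted.
model = AutoModel.from_pretrained("line-corporation/line-distilbert-base-japanese")
print(sum(p.numel() for p in model.parameters()))
```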
The evaluation by JGLUE is as follows:
model name | #Params | Marc_ja (acc) | JNLI (acc) | JSTS (Pearson/Spearman) | JSQuAD (EM/F1) | JCommonSenseQA (acc) |
---|---|---|---|---|---|---|
LINE-DistilBERT | 68M | 95.6 | 88.9 | 89.2/85.1 | 87.3/93.3 | 76.1 |
Laboro-DistilBERT | 68M | 94.7 | 82.0 | 87.4/82.7 | 70.2/87.3 | 73.2 |
BandaiNamco-DistilBERT | 68M | 94.6 | 81.6 | 86.8/82.1 | 80.0/88.0 | 66.5 |
The texts are first tokenized by MeCab with the Unidic dictionary and then split into subwords by the SentencePiece algorithm. The vocabulary size is 32768.
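To see this two-stage tokenization in action, the tokenizer loaded with trust_remote_code exposes the standard tokenize method. A small sketch (illustrative, not from the original README):

```python
from transformers import AutoTokenizer

# MeCab (Unidic) word segmentation followed by SentencePiece subword splitting
# happens inside the custom tokenizer shipped with the Hub repository.
tokenizer = AutoTokenizer.from_pretrained(
    "line-corporation/line-distilbert-base-japanese", trust_remote_code=True
)
print(tokenizer.vocab_size)  # expected to be 32768, per the README
print(tokenizer.tokenize("LINE株式会社で[MASK]の研究・開発をしている。"))
```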
The pretrained models are distributed under the terms of the Apache License, Version 2.0.
We haven't published any paper on this work. Please cite this GitHub repository:
```
@article{LINE DistilBERT Japanese,
  title = {LINE DistilBERT Japanese},
  author = {"Koga, Kobayashi and Li, Shengzhe and Nakamachi, Akifumi and Sato, Toshinori"},
  year = {2023},
  howpublished = {\url{http://github.com/line/LINE-DistilBERT-Japanese}}
}
```