lapras-inc/dependency-based-japanese-word-embeddings

This is a repository to share the dependency-based Japanese word embeddings we trained for the experiments in the AI LAB article 係り受けに基づく日本語単語埋込 (Dependency-based Japanese Word Embeddings) (https://ai-lab.lapras.com/nlp/japanese-word-embedding/).

We applied the method proposed in the paper Dependency-based Word Embeddings (Levy & Goldberg, 2014) to Japanese.

Training Details

To prepare the training data, we first extracted sentences from Japanese Wikipedia dumps.
Then, we parsed them with the NLP framework GiNZA.
Finally, we trained the embeddings with the script provided on the page of the paper's first author.

The parameter settings for the experiments are as follows, where DIM is the number of dimensions indicated in each file name.

-size DIM -negative 15 -threads 20
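
The following is a minimal sketch of how dependency-based (word, context) pairs in the word2vecf input format could be produced with GiNZA. It illustrates the context scheme of Levy & Goldberg (2014), but it is not necessarily the exact preprocessing we used; the model name ja_ginza, the output file name dep.contexts, and the example sentence are assumptions for illustration.

import spacy

# GiNZA is distributed as a spaCy pipeline; this assumes the ja_ginza model is installed.
nlp = spacy.load("ja_ginza")

def dependency_pairs(sentence):
    # Yield (word, context) pairs where a context is a token's head or dependent
    # joined with the dependency label, as in dependency-based word2vec.
    doc = nlp(sentence)
    for token in doc:
        if token.head is token:  # skip the sentence root
            continue
        yield token.text, f"{token.head.text}/{token.dep_}"    # child sees its head
        yield token.head.text, f"{token.text}/{token.dep_}I"   # head sees its child (inverse relation)

# word2vecf expects one "word context" pair per line.
with open("dep.contexts", "w", encoding="utf-8") as f:
    for word, ctx in dependency_pairs("猫が魚を食べる"):
        f.write(f"{word} {ctx}\n")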

Download URL

You can download the data from the links below.
The download begins soon after you click a link.

How to Use the Embeddings

You can use the embeddings in the same way as embeddings trained with the original implementation of Word2Vec.

Here is example code for loading them from a Python script.

from gensim.models import KeyedVectors
vectors = KeyedVectors.load_word2vec_format("path/to/embeddings")
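
Once loaded, the embeddings support the usual gensim KeyedVectors queries. The query words below are only illustrative; the actual vocabulary and token forms depend on GiNZA's tokenization of Wikipedia.

# Assumes `vectors` was loaded as above; the query words are illustrative examples.
print(vectors.most_similar("猫", topn=5))   # nearest neighbours of a word
print(vectors.similarity("猫", "犬"))       # cosine similarity between two words
vec = vectors["東京"]                       # raw embedding vector as a numpy array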

When Using Them for Your Research

When writing a paper that uses them, please cite the following BibTeX entry:

@misc{matsuno2019dependencybasedjapanesewordembeddings,
  title       = {Dependency-based Japanese Word Embeddings},
  author      = {Tomoki, Matsuno},
  affiliation = {LAPRAS inc.},
  url         = {https://github.com/lapras-inc/dependency-based-japanese-word-embeddings},
  year        = {2019}
}

References

  • 松田寛, 大村舞 & 浅原正幸 (2019). 短単位品詞の用法曖昧性解決と依存関係ラベリングの同時学習. 言語処理学会 第25回年次大会 発表論文集.
  • Mikolov, T., Chen, K., Corrado, G. & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
  • Levy, O. & Goldberg, Y. (2014). Dependency-Based Word Embeddings. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 302–308, Baltimore, Maryland: Association for Computational Linguistics.
