A PyTorch implementation of the Japanese Predicate-Argument Structure (PAS) analyser presented in the paper of Matsubayashi & Inui (2018) with some improvements.


cl-tohoku/showcase


Showcase is a PyTorch implementation of the Japanese Predicate-Argument Structure (PAS) analyser presented in the paper of Matsubayashi & Inui (2018), with some improvements. Given an input sentence, Showcase identifies verbal and nominal predicates in the sentence and detects their nominative (が), accusative (を), and dative (に) case arguments. The output case labels follow the label definition of the NAIST Text Corpus, in which case markers in different voices are generalized into the case markers of the active voice.

License: MIT

Demo

http://www.cl.ecei.tohoku.ac.jp/showcase/

Usage

echo '今日は雨が降る' | showcase

cat example.txt | showcase
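
The same pipe can be driven from a Python script. The following is a minimal sketch, assuming only that the `showcase` command reads raw sentences from standard input as shown above; the helper name `run_cli` is hypothetical and is not part of Showcase, and the format of the returned text is not described here.

```python
import subprocess


def run_cli(sentences, command=("showcase",)):
    """Pipe sentences (one per line) to a command and return its stdout as text.

    `command` defaults to the `showcase` executable; it is a parameter so the
    piping logic can be exercised with any other program that reads stdin.
    """
    text = "\n".join(sentences) + "\n"
    result = subprocess.run(
        list(command),
        input=text.encode("utf-8"),  # send UTF-8 bytes, as `echo ... |` would
        stdout=subprocess.PIPE,
        check=True,                  # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.decode("utf-8")
```

Calling `run_cli(["今日は雨が降る"])` corresponds to the first command above, provided Showcase is installed and configured.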

Input file format

  • One raw sentence per line.
  • A blank line can be used to segment a document. (Showcase just resets an argument index to zero.)
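
As an illustration of this format, the sketch below builds an input text from several documents. The function name `format_documents` is hypothetical, and the sketch assumes that individual sentences contain no embedded newlines.

```python
def format_documents(documents):
    """Build Showcase-style input text.

    documents: a list of documents, each a list of raw sentences.
    Returns one sentence per line, with a blank line between documents.
    """
    blocks = []
    for sentences in documents:
        # Drop empty sentences so a blank line only ever marks a document boundary.
        cleaned = [s.strip() for s in sentences if s.strip()]
        if cleaned:
            blocks.append("\n".join(cleaned))
    return "\n\n".join(blocks) + "\n" if blocks else ""
```

The returned string can be written to a file such as `example.txt` and passed to `showcase` as in the Usage section.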

Requirements

  • Python 3.5 (or higher)
    • We do not support Python 2
  • CaboCha with JUMAN dict
  • PyTorch 0.4.0
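
A rough pre-installation check of these requirements might look like the sketch below. It assumes that CaboCha is installed as a `cabocha` executable on the `PATH` and that PyTorch is importable as `torch`; it does not verify the PyTorch version or the JUMAN dictionary. The function name `check_requirements` is hypothetical.

```python
import importlib.util
import shutil
import sys


def check_requirements():
    """Return a dict mapping each requirement to True or False."""
    return {
        "python>=3.5": sys.version_info >= (3, 5),
        # Assumption: the CaboCha command-line tool is named `cabocha`.
        "cabocha on PATH": shutil.which("cabocha") is not None,
        # Only checks that a `torch` package can be found, not its version.
        "torch importable": importlib.util.find_spec("torch") is not None,
    }
```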

Installation

Step 1: Install Showcase

pip install showcase-parser

Step 2: Download Resources

The resources include the following files:

  • 10 model files for the predicate detector (pred_model_0{0..9}.h5)
  • 10 model files for the argument detector (arg_model_0{0..9}.h5)
  • Word embedding matrix (word_embedding.npz)
  • POS embedding matrix (pos_embedding.npz)
  • Word index file (word.index)
  • Part-of-Speech tag index file (pos.index)

All resources are available at Google Drive.

  • train/*.h5: models trained with the training set described in the paper.
  • train-test/*.h5: models trained with the training and test sets.
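
After downloading, it can be useful to confirm that nothing is missing. The sketch below checks a directory against the file names listed above. It assumes that all files of one model set have been copied into a single local directory, which is only one possible layout; `EXPECTED_FILES` and `missing_resources` are hypothetical names.

```python
from pathlib import Path

# File names taken from the resource list above.
EXPECTED_FILES = (
    ["pred_model_0{}.h5".format(i) for i in range(10)]
    + ["arg_model_0{}.h5".format(i) for i in range(10)]
    + ["word_embedding.npz", "pos_embedding.npz", "word.index", "pos.index"]
)


def missing_resources(resource_dir):
    """Return the expected file names that are not present in resource_dir."""
    root = Path(resource_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means every expected file was found in that directory.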

Step 3: Create and edit config.json

Run showcase setup to create a config.json file in $HOME/.config/showcase.

Then edit config.json and specify valid paths for:

  • Resources downloaded in Step 2
  • CaboCha and its JUMAN dictionary

The original config.json can be found at showcase/data/config.json of this repo.

You may specify the path to config.json as follows:

showcase -c /path/to/config/config.json

Note that the appropriate thresholds (hyperparameters) for arguments differ for each model. The thresholds for the provided models are described in the sample config file in each Google Drive directory.
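
Since config.json mostly holds file paths, a quick check for paths that do not exist can catch typos before running Showcase. The sketch below makes no assumption about the key names in config.json: it walks the whole JSON structure and treats every string value containing a path separator as a path. That heuristic, and the names `iter_strings` and `missing_paths`, are assumptions of this sketch rather than part of Showcase.

```python
import json
import os


def iter_strings(node, prefix=""):
    """Yield (dotted_key, value) for every string found in nested JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, prefix + "." + key if prefix else key)
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from iter_strings(value, "{}[{}]".format(prefix, i))
    elif isinstance(node, str):
        yield prefix, node


def missing_paths(config_path):
    """Return (key, path) pairs whose path-like value does not exist on disk."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return [
        (key, value)
        for key, value in iter_strings(config)
        # Heuristic: a string containing a path separator is treated as a path.
        if os.sep in value and not os.path.exists(os.path.expanduser(value))
    ]
```

For the default location described above, the argument would be `os.path.join(os.path.expanduser("~"), ".config", "showcase", "config.json")`.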

(Re-)training

TBA

Step 1: Train word2vec

TBA

Step 2: Train model

TBA

Step 3: Convert word2vec

  • Run get_vocab_from_word2vec.py and convert_word2vec_to_npy.py

Citation

@InProceedings{matsubayashi:2018:coling,
  author    = {Matsubayashi, Yuichiroh and Inui, Kentaro},
  title     = {Distance-Free Modeling of Multi-Predicate Interactions in End-to-End Japanese Predicate Argument Structure Analysis},
  booktitle = {Proceedings of the 27th International Conference on Computational Linguistics (COLING)},
  year      = {2018},
}
