Source codes and corpora of paper "Iterated Dilated Convolutions for Chinese Word Segmentation"

hankcs/ID-CNN-CWS


Source code and corpora for the paper "Iterated Dilated Convolutions for Chinese Word Segmentation", published in the NNW journal.

[Image: 2017-10-20_13-23-31]

It implements the following 4 models for Chinese word segmentation (CWS):

  • Bi-LSTM
  • Bi-LSTM-CRF
  • ID-CNN
  • ID-CNN-CRF
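As a rough illustration of why stacked dilated convolutions are attractive for sequence labelling, the sketch below computes how much context one output position sees. It is a minimal pure-Python sketch, not the repository's TensorFlow code: the kernel width of 3 and the dilation rates 1, 2, 4 are assumptions chosen for illustration, and `receptive_field` and `dilated_conv1d` are hypothetical helper names.

```python
def receptive_field(kernel_width, dilations):
    """Number of input positions visible to one output position after
    stacking 1-D convolutions with the given dilation rates."""
    field = 1
    for d in dilations:
        # Each layer widens the context by (kernel_width - 1) * dilation.
        field += (kernel_width - 1) * d
    return field


def dilated_conv1d(xs, kernel, dilation):
    """Zero-padded ('same') 1-D dilated convolution over a list of floats."""
    center = len(kernel) // 2
    out = []
    for i in range(len(xs)):
        total = 0.0
        for j, w in enumerate(kernel):
            pos = i + (j - center) * dilation
            if 0 <= pos < len(xs):
                total += w * xs[pos]
        out.append(total)
    return out


# Assumed settings: width-3 kernels with dilations 1, 2, 4.
print(receptive_field(3, [1, 2, 4]))  # dilated stack
print(receptive_field(3, [1, 1, 1]))  # same depth, no dilation

# Push a single impulse through the stack and count the positions it reaches.
signal = [0.0] * 21
signal[10] = 1.0
for rate in (1, 2, 4):
    signal = dilated_conv1d(signal, [1.0, 1.0, 1.0], rate)
print(sum(1 for v in signal if v != 0.0))
```

With these assumed settings, three dilated layers cover 15 positions while three undilated layers of the same width cover only 7, which is the general motivation for dilation: wider context at the same depth.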

Dependencies

  • Python >= 3.6
  • TensorFlow >= 1.2

Both CPU and GPU are supported. GPU training is 10 times faster than CPU training.

Preparation

Run the following script to convert the corpora to TensorFlow datasets.

$ ./scripts/make.sh

Train and Test

Quick Start

$ ./scripts/run.sh $dataset $model
  • $dataset can be pku, msr, asSC or cityuSC.
  • $model can be cnn or bilstm.

For example:

$ ./scripts/run.sh pku cnn

It will train a cnn model on the pku dataset, then evaluate its performance on the test set.
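To run every dataset and model combination listed above, the command lines can be generated programmatically. The following is a small sketch rather than part of the repository: `build_commands` is a hypothetical helper, and it only prints the commands instead of executing them.

```python
import itertools

# Values taken from the Quick Start section.
DATASETS = ["pku", "msr", "asSC", "cityuSC"]
MODELS = ["cnn", "bilstm"]


def build_commands(use_crf=False):
    """Return one ./scripts/run.sh argument list per dataset/model pair."""
    commands = []
    for dataset, model in itertools.product(DATASETS, MODELS):
        cmd = ["./scripts/run.sh", dataset, model]
        if use_crf:
            # --viterbi is the CRF switch described in the CRF Layer section.
            cmd.append("--viterbi")
        commands.append(cmd)
    return commands


for cmd in build_commands():
    print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the corresponding training run.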

CRF Layer

To enable the CRF layer, simply append --viterbi to your command, e.g.

$ ./scripts/run.sh pku cnn --viterbi
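The flag name suggests that a CRF layer is decoded with the Viterbi algorithm. The sketch below shows generic Viterbi decoding over per-character tag scores and a tag transition matrix. It is an illustration under assumptions, not the repository's implementation: `viterbi_decode` here is a hypothetical pure-Python function, and the two-tag scores are made-up toy values.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence.

    emissions:   list of T lists, emissions[t][k] = score of tag k at step t
    transitions: K x K list, transitions[i][j] = score of moving from tag i to tag j
    """
    num_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for ending in tag j at this step.
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + emit[j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(range(num_tags), key=lambda k: score[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example (assumed values): switching tags is penalised by -2.
emissions = [[1.0, 0.0], [0.0, 0.6], [1.0, 0.0]]
penalised = [[0.0, -2.0], [-2.0, 0.0]]
neutral = [[0.0, 0.0], [0.0, 0.0]]
print(viterbi_decode(emissions, penalised))
print(viterbi_decode(emissions, neutral))
```

With neutral transitions the decoder simply picks the best tag at each position, while the penalised transitions make it keep tag 0 throughout. This is the kind of sequence-level consistency a CRF layer adds on top of independent per-character predictions.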

Accuracy

[Image: accuracy results (2017-10-20_13-25-11)]

Speed

[Image: speed results (2017-10-20_11-44-42)]

Acknowledgments
