Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

seq2seq based keyphrase generation model sets, including copyrnn copycnn and copytransfomer

NotificationsYou must be signed in to change notification settings

supercoderhawk/deep-keyphrase

Repository files navigation

Implement some keyphrase generation algorithm

Description

Implemented Paper

CopyRNN

Deep Keyphrase Generation (Meng et al., 2017)

ToDo List

CopyCNN

CopyTransformer

Usage

required files (4 files in total)

  1. vocab_file: word line by line (don't with index!!!!)

    thispaperproposes
  2. training, valid and test file

data format for training, valid and test

json line format, every line is a dict:

{'tokens': ['this', 'paper', 'proposes', 'using', 'virtual', 'reality', 'to', 'enhance', 'the', 'perception', 'of', 'actions', 'by', 'distant', 'users', 'on', 'a', 'shared', 'application', '.', 'here', ',', 'distance', 'may', 'refer', 'either', 'to', 'space', '(', 'e.g.', 'in', 'a', 'remote', 'synchronous', 'collaboration', ')', 'or', 'time', '(', 'e.g.', 'during', 'playback', 'of', 'recorded', 'actions', ')', '.', 'our', 'approach', 'consists', 'in', 'immersing', 'the', 'application', 'in', 'a', 'virtual', 'inhabited', '3d', 'space', 'and', 'mimicking', 'user', 'actions', 'by', 'animating', 'avatars', '.', 'we', 'illustrate', 'this', 'approach', 'with', 'two', 'applications', ',', 'the', 'one', 'for', 'remote', 'collaboration', 'on', 'a', 'shared', 'application', 'and', 'the', 'other', 'to', 'playback', 'recorded', 'sequences', 'of', 'user', 'actions', '.', 'we', 'suggest', 'this', 'could', 'be', 'a', 'low', 'cost', 'enhancement', 'for', 'telepresence', '.'] ,'keyphrases': [['telepresence'], ['animation'], ['avatars'], ['application', 'sharing'], ['collaborative', 'virtual', 'environments']]}

Training

download thekp20k

mkdir datamkdir data/rawmkdir data/raw/kp20k_new# !! please unzip kp20k data put the files into above folder manuallypython -m nltk.downloader punktbash scripts/prepare_kp20k.shbash scripts/train_copyrnn_kp20k.sh# start tensorboard# enter the experiment result dir, suffix is time that experiment startscd data/kp20k/copyrnn_kp20k_basic-20191212-080000# start tensorboard servicestenosrboard --bind_all --logdir logs --port 6006

Notes

  1. compared with the originalseq2seq-keyphrase-pytorch
    1. fix the implementation error:
      1. copy mechanism
      2. train and inference are not correspond (training doesn't have input feeding and inference has input feeding)
    2. easy data preparing
    3. tensorboard support
    4. faster beam search (6x faster used cpu and more than 10x faster used gpu)

About

seq2seq based keyphrase generation model sets, including copyrnn copycnn and copytransfomer

Topics

Resources

Stars

Watchers

Forks

Packages

No packages published

[8]ページ先頭

©2009-2025 Movatter.jp