supercoderhawk/deep-keyphrase
Implementations of several keyphrase generation algorithms.
- CopyRNN: Deep Keyphrase Generation (Meng et al., 2017)
- CopyCNN
- CopyTransformer
vocab_file: one word per line (do not include an index!), for example:

    this
    paper
    proposes
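A vocabulary file in this layout could be produced from tokenized documents roughly as sketched below. The helper names `build_vocab_lines` and `write_vocab`, the frequency ordering, and the optional size cap are assumptions made for illustration; they are not taken from the repository's own preparation scripts.

```python
from collections import Counter


def build_vocab_lines(token_lists, max_size=None):
    """Return vocabulary words, most frequent first (hypothetical helper)."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    # Sort by descending frequency, then alphabetically so the order is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    words = [word for word, _ in ordered]
    return words if max_size is None else words[:max_size]


def write_vocab(path, words):
    """Write one word per line, with no index column (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for word in words:
            f.write(word + "\n")
```

For example, `write_vocab("vocab.txt", build_vocab_lines(all_token_lists))` would write the words only, one per line; the file name here is likewise just a placeholder.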
training, validation and test files
JSON Lines format; every line is a dict:
    {'tokens': ['this', 'paper', 'proposes', 'using', 'virtual', 'reality', 'to', 'enhance', 'the', 'perception', 'of', 'actions', 'by', 'distant', 'users', 'on', 'a', 'shared', 'application', '.', 'here', ',', 'distance', 'may', 'refer', 'either', 'to', 'space', '(', 'e.g.', 'in', 'a', 'remote', 'synchronous', 'collaboration', ')', 'or', 'time', '(', 'e.g.', 'during', 'playback', 'of', 'recorded', 'actions', ')', '.', 'our', 'approach', 'consists', 'in', 'immersing', 'the', 'application', 'in', 'a', 'virtual', 'inhabited', '3d', 'space', 'and', 'mimicking', 'user', 'actions', 'by', 'animating', 'avatars', '.', 'we', 'illustrate', 'this', 'approach', 'with', 'two', 'applications', ',', 'the', 'one', 'for', 'remote', 'collaboration', 'on', 'a', 'shared', 'application', 'and', 'the', 'other', 'to', 'playback', 'recorded', 'sequences', 'of', 'user', 'actions', '.', 'we', 'suggest', 'this', 'could', 'be', 'a', 'low', 'cost', 'enhancement', 'for', 'telepresence', '.'], 'keyphrases': [['telepresence'], ['animation'], ['avatars'], ['application', 'sharing'], ['collaborative', 'virtual', 'environments']]}

Download the kp20k dataset.

    mkdir data
    mkdir data/raw
    mkdir data/raw/kp20k_new
    # !! unzip the kp20k data and put the files into the folder above manually
    python -m nltk.downloader punkt
    bash scripts/prepare_kp20k.sh
    bash scripts/train_copyrnn_kp20k.sh

    # start tensorboard
    # enter the experiment result dir; the suffix is the time the experiment started
    cd data/kp20k/copyrnn_kp20k_basic-20191212-080000
    # start the tensorboard service
    tensorboard --bind_all --logdir logs --port 6006
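For custom data, files in the JSON Lines layout described above (one dict per line with `tokens` and `keyphrases`) could be written and checked along these lines. This is a minimal sketch: `dump_example` and `load_examples` are hypothetical helpers, not functions from this repository, and the key check is an assumption about what a loader would want.

```python
import json


def dump_example(tokens, keyphrases):
    """Serialize one example as a single JSON line (hypothetical helper)."""
    return json.dumps({"tokens": tokens, "keyphrases": keyphrases})


def load_examples(lines):
    """Parse JSON lines, skipping blank lines, and check the expected keys."""
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "tokens" not in record or "keyphrases" not in record:
            raise ValueError("each line needs 'tokens' and 'keyphrases'")
        examples.append(record)
    return examples
```

Note that `json.dumps` emits double-quoted strings, which is the form a standard JSON parser accepts.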
Compared with the original seq2seq-keyphrase-pytorch:
- fixes implementation errors:
  - the copy mechanism
  - training and inference did not correspond (training had no input feeding, while inference did)
- easier data preparation
- TensorBoard support
- faster beam search (6x faster on CPU and more than 10x faster on GPU)
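The copy mechanism mentioned above lets the decoder emit words from the source text, including words outside the vocabulary. As a rough illustration only, the sketch below assumes a CopyNet-style formulation in which generate scores (one per vocabulary word) and copy scores (one per source position) share a single softmax, and the probabilities of identical tokens are summed. The function name and the plain-Python lists are assumptions for readability; the repository's actual implementation may differ.

```python
import math


def copy_distribution(gen_scores, copy_scores, source_tokens, vocab):
    """Joint softmax over generate and copy scores, summed per token.

    gen_scores: one score per vocabulary word, in the same order as vocab.
    copy_scores: one score per source position, aligned with source_tokens.
    """
    all_scores = list(gen_scores) + list(copy_scores)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(all_scores)
    exps = [math.exp(s - top) for s in all_scores]
    total = sum(exps)
    probs = {}
    # Generation part: probability mass assigned to vocabulary words.
    for word, e in zip(vocab, exps[:len(vocab)]):
        probs[word] = probs.get(word, 0.0) + e / total
    # Copy part: mass assigned to source positions, merged by token.
    for tok, e in zip(source_tokens, exps[len(vocab):]):
        probs[tok] = probs.get(tok, 0.0) + e / total
    return probs
```

A source token that is also in the vocabulary receives mass from both parts, while an out-of-vocabulary source token can only be produced by copying.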
About
A set of seq2seq-based keyphrase generation models, including CopyRNN, CopyCNN and CopyTransformer