The code of our paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model"

sunyilgdx/SIFRank

The code of our paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model"

Version Notes

  • 2020/02/21: Initial version. Provides the most basic functions.
  • 2020/02/28: Second version. Added two new algorithms, DS (document segmentation) and EA (embeddings alignment), to speed up SIFRank and SIFRank+.
  • 2020/03/02: Third version. A small change to SIFRank+ in ./model/method.py that applies a simple normalization to position_score.
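The DS idea named in the notes above can be pictured as packing consecutive sentences into segments of bounded length, so that each segment can be embedded on its own. The following is only a minimal sketch of that general idea, not the code in this repository: the function name `segment_document` and the `max_tokens` parameter are hypothetical, and the real segmentation rule may differ.

```python
from typing import List


def segment_document(sentences: List[List[str]], max_tokens: int = 16) -> List[List[str]]:
    """Greedily pack consecutive tokenized sentences into segments.

    A sentence is never split. A segment is closed as soon as adding the
    next sentence would push it past max_tokens, so a single sentence that
    is longer than max_tokens ends up as a segment of its own.
    """
    segments: List[List[str]] = []
    current: List[str] = []
    for sent in sentences:
        if current and len(current) + len(sent) > max_tokens:
            segments.append(current)
            current = []  # start a fresh list; the closed segment is untouched
        current.extend(sent)
    if current:
        segments.append(current)
    return segments
```

With `max_tokens=5`, three sentences of 3, 2 and 4 tokens would be grouped into two segments of 5 and 4 tokens.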

Environment

  • Python 3.6
  • nltk 3.4.3
  • StanfordCoreNLP 3.9.1.1
  • torch 1.1.0
  • allennlp 0.8.4

Download

  • ELMo: download elmo_2x4096_512_2048cnn_2xhighway_options.json and elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5 from https://allennlp.org/elmo and save them to the auxiliary_data/ directory.
  • StanfordCoreNLP: download stanford-corenlp-full-2018-02-27 from https://stanfordnlp.github.io/CoreNLP/ and save it anywhere.

Usage

    import nltk
    from embeddings import sent_emb_sif, word_emb_elmo
    from model.method import SIFRank, SIFRank_plus
    from stanfordcorenlp import StanfordCoreNLP
    import time

    # download from https://allennlp.org/elmo
    options_file = "../auxiliary_data/elmo_2x4096_512_2048cnn_2xhighway_options.json"
    weight_file = "../auxiliary_data/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5"

    porter = nltk.PorterStemmer()
    ELMO = word_emb_elmo.WordEmbeddings(options_file, weight_file, cuda_device=0)
    SIF = sent_emb_sif.SentEmbeddings(ELMO, lamda=1.0)
    # download from https://stanfordnlp.github.io/CoreNLP/
    en_model = StanfordCoreNLP(r'E:\Python_Files\stanford-corenlp-full-2018-02-27', quiet=True)
    elmo_layers_weight = [0.0, 1.0, 0.0]

    # Sample input: adjacent string literals are concatenated into one text.
    text = (
        "Discrete output feedback sliding mode control of second order systems - a moving switching line approach "
        "The sliding mode control systems (SMCS) for which the switching variable is designed independent of the initial conditions are known to be sensitive to parameter variations and extraneous disturbances during the reaching phase. "
        "For second order systems this drawback is eliminated by using the moving switching line technique where the switching line is initially designed to pass the initial conditions and is subsequently moved towards a predetermined switching line. "
        "In this paper, we make use of the above idea of moving switching line together with the reaching law approach to design a discrete output feedback sliding mode control. "
        "The main contributions of this work are such that we do not require to use system states as it makes use of only the output samples for designing the controller. "
        "and by using the moving switching line a low sensitivity system is obtained through shortening the reaching phase. "
        "Simulation results show that the fast output sampling feedback guarantees sliding motion similar to that obtained using state feedback"
    )

    keyphrases = SIFRank(text, SIF, en_model, N=15, elmo_layers_weight=elmo_layers_weight)
    keyphrases_ = SIFRank_plus(text, SIF, en_model, N=15, elmo_layers_weight=elmo_layers_weight)
    print(keyphrases)
    print(keyphrases_)
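For intuition about what a SIF-style sentence embedding contributes to the ranking, the sketch below illustrates the general SIF idea: each word vector is weighted by a / (a + p(w)), where p(w) is the word's estimated unigram probability, the weighted vectors are averaged, and candidate phrases are ranked by cosine similarity to the document vector. This is a toy, pure-Python illustration under assumptions: the helper names, the smoothing constant `a`, and the hand-made two-dimensional vectors are all hypothetical, and it does not reproduce this repository's implementation or use ELMo embeddings.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def sif_weight(word: str, word_prob: Dict[str, float], a: float = 1e-3) -> float:
    """SIF weight a / (a + p(w)); unseen words get p(w) = 0, i.e. weight 1."""
    return a / (a + word_prob.get(word, 0.0))


def sif_embedding(tokens: Sequence[str], word_vecs: Dict[str, Vector],
                  word_prob: Dict[str, float], a: float = 1e-3) -> Vector:
    """Weighted average of token vectors; tokens without a vector are skipped."""
    known = [t for t in tokens if t in word_vecs]
    if not known:
        raise ValueError("no token has a vector")
    total = [0.0] * len(word_vecs[known[0]])
    for t in known:
        w = sif_weight(t, word_prob, a)
        for i, x in enumerate(word_vecs[t]):
            total[i] += w * x
    return [x / len(known) for x in total]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity, defined as 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_candidates(doc_tokens: Sequence[str], candidates: Sequence[Sequence[str]],
                    word_vecs: Dict[str, Vector], word_prob: Dict[str, float],
                    a: float = 1e-3) -> List[Tuple[str, float]]:
    """Score each candidate phrase by similarity to the document, best first."""
    doc_vec = sif_embedding(doc_tokens, word_vecs, word_prob, a)
    scored = [(" ".join(c), cosine(sif_embedding(c, word_vecs, word_prob, a), doc_vec))
              for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): a frequent function word and two rarer content words.
toy_vecs = {"sliding": [1.0, 0.0], "mode": [1.0, 0.0], "the": [0.0, 1.0]}
toy_prob = {"the": 0.5, "sliding": 1e-4, "mode": 1e-4}
ranked = rank_candidates(["the", "sliding", "mode"],
                         [["the"], ["sliding", "mode"]], toy_vecs, toy_prob)
```

Because the frequent word "the" receives a very small weight, the toy document vector is dominated by the content words, and the candidate "sliding mode" is ranked ahead of "the".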

Evaluate the model

Use eval/sifrank_eval.py to evaluate SIFRank on the Inspec, SemEval2017 and DUC2001 datasets.

We also have evaluation code for other baseline models. We will organize and upload it later, so stay tuned.

F1 scores when the number of extracted keyphrases N is set to 5:

Models          Inspec   SemEval2017   DUC2001
TFIDF           11.28    12.70          9.21
YAKE            15.73    11.84         10.61
TextRank        24.39    16.43         13.94
SingleRank      24.69    18.23         21.56
TopicRank       22.76    17.10         20.37
PositionRank    25.19    18.23         24.95
Multipartite    23.05    17.39         21.86
RVA             21.91    19.59         20.32
EmbedRank d2v   27.20    20.21         21.74
SIFRank         29.11    22.59         24.27
SIFRank+        28.49    21.53         30.88
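As a reminder of how an F1 score at a cutoff N is typically computed for keyphrase extraction, the following sketch calculates precision, recall and F1 for the top-N predicted keyphrases of one document against its gold set. It is an illustration only, not the logic of eval/sifrank_eval.py: the function name `f1_at_n` and the lowercase normalization are assumptions (an evaluation script may, for example, stem phrases before matching or aggregate counts over a whole corpus).

```python
from typing import Callable, Iterable, List, Tuple


def f1_at_n(predicted: List[str], gold: Iterable[str], n: int = 5,
            normalize: Callable[[str], str] = lambda s: s.lower().strip()
            ) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of the top-n predictions against gold."""
    top: List[str] = []
    for phrase in predicted:
        key = normalize(phrase)
        if key not in top:  # drop duplicates but keep the ranking order
            top.append(key)
    top = top[:n]
    gold_set = {normalize(g) for g in gold}
    if not top or not gold_set:
        return 0.0, 0.0, 0.0
    correct = sum(1 for phrase in top if phrase in gold_set)
    if correct == 0:
        return 0.0, 0.0, 0.0
    precision = correct / len(top)
    recall = correct / len(gold_set)
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical predictions and gold labels for a single document.
predicted = ["Sliding mode control", "output feedback", "switching line",
             "foo", "bar", "baz"]
gold = ["sliding mode control", "switching line", "reaching law", "state feedback"]
precision, recall, f1 = f1_at_n(predicted, gold, n=5)
```

Passing a stemming function as `normalize` is one way to make the match tolerant of inflection; that choice is left to the caller here.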

Cite

If you use this code, please cite this paper:

@article{DBLP:journals/access/SunQZWZ20,
  author    = {Yi Sun and
               Hangping Qiu and
               Yu Zheng and
               Zhongwei Wang and
               Chaoran Zhang},
  title     = {SIFRank: {A} New Baseline for Unsupervised Keyphrase Extraction Based
               on Pre-Trained Language Model},
  journal   = {{IEEE} Access},
  volume    = {8},
  pages     = {10896--10906},
  year      = {2020},
  url       = {https://doi.org/10.1109/ACCESS.2020.2965087},
  doi       = {10.1109/ACCESS.2020.2965087},
  timestamp = {Fri, 07 Feb 2020 12:04:22 +0100},
  biburl    = {https://dblp.org/rec/journals/access/SunQZWZ20.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
