Extract keywords from a sentence or replace keywords in sentences. Based on https://github.com/vi3k6i5/flashtext

shdev/phpflashtext

This is a port of the wonderful Python project https://github.com/vi3k6i5/flashtext. For the internals of the algorithm, look there.

This algorithm allows you to extract or replace several keywords at once. If you deal with 300 keywords that have 5 variants each, a regex approach is slower than the FlashText approach. For 1000 keywords with 5 variants each, the regex can't be built.
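To illustrate why one pass over the text can handle many keywords at once, here is a minimal sketch of trie-based extraction. It is not this library's code: the function names `isWordChar`, `buildTrie` and `extractKeywordsSketch` are made up for the example, and treating letters, digits and `_` as word characters is an assumption.

```php
<?php
// Illustrative sketch only, not the implementation used by this package.

// Assumption: word characters are ASCII letters, digits and '_'.
function isWordChar(string $char): bool
{
    return ctype_alnum($char) || $char === '_';
}

// Build a trie of nested arrays keyed by character.
// The special key '_end_' stores the clean name a variant maps to.
function buildTrie(array $keywords): array
{
    $trie = [];
    foreach ($keywords as $cleanName => $variants) {
        foreach ($variants as $variant) {
            $node = &$trie;
            foreach (str_split(strtolower($variant)) as $char) {
                if (!isset($node[$char])) {
                    $node[$char] = [];
                }
                $node = &$node[$char];
            }
            $node['_end_'] = $cleanName;
            unset($node); // drop the reference before the next variant
        }
    }
    return $trie;
}

// Scan the sentence once, keeping the longest variant that ends on a word boundary.
// Returns a list of [cleanName, start, end] with end exclusive.
function extractKeywordsSketch(array $trie, string $sentence): array
{
    $text = strtolower($sentence);
    $length = strlen($text);
    $found = [];
    $i = 0;
    while ($i < $length) {
        // Only start a match at the beginning of a word.
        if ($i > 0 && isWordChar($text[$i - 1])) {
            $i++;
            continue;
        }
        $node = $trie;
        $match = null;
        $matchEnd = $i;
        for ($j = $i; $j < $length && isset($node[$text[$j]]); $j++) {
            $node = $node[$text[$j]];
            $atBoundary = ($j + 1 === $length) || !isWordChar($text[$j + 1]);
            if (isset($node['_end_']) && $atBoundary) {
                $match = $node['_end_'];
                $matchEnd = $j + 1;
            }
        }
        if ($match !== null) {
            $found[] = [$match, $i, $matchEnd];
            $i = $matchEnd;
        } else {
            $i++;
        }
    }
    return $found;
}

$trie = buildTrie([
    'java'               => ['java_2e', 'java programing'],
    'product management' => ['product management techniques', 'product management'],
]);
echo json_encode(extractKeywordsSketch($trie, 'I know java_2e and product management techniques')), PHP_EOL;
// prints [["java",7,14],["product management",19,48]]
```

The scan never backtracks further than the length of one keyword, so the cost depends mainly on the length of the text rather than on how many keywords were added.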

In PHP 5.6, using regex is really slow. In newer versions it performs better.
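For comparison, the regex approach referred to above might look roughly like the following sketch. It is an assumption about how such a baseline would be written, not code from this repository or its benchmarks: every variant is escaped and joined into one alternation, which is the pattern that grows with the number of keywords.

```php
<?php
// Sketch of a regex baseline (assumed, not taken from this repository).
$keywords = [
    'java'               => ['java_2e', 'java programing'],
    'product management' => ['product management techniques', 'product management'],
];

// Map every variant to its clean name.
$cleanNames = [];
foreach ($keywords as $cleanName => $variants) {
    foreach ($variants as $variant) {
        $cleanNames[strtolower($variant)] = $cleanName;
    }
}

// Longest variants first, so the alternation prefers
// 'product management techniques' over its prefix 'product management'.
uksort($cleanNames, function (string $a, string $b): int {
    return strlen($b) <=> strlen($a);
});

// One big alternation: this is the part that becomes large with many keywords.
$alternatives = array_map(function (string $variant): string {
    return preg_quote($variant, '/');
}, array_keys($cleanNames));
$pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/i';

$sentence = 'I know java_2e and product management techniques';
$sentenceNew = preg_replace_callback(
    $pattern,
    function (array $m) use ($cleanNames): string {
        return $cleanNames[strtolower($m[0])];
    },
    $sentence
);

echo $sentenceNew, PHP_EOL;
// prints I know java and product management
```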

Install

    composer require shdev/phpflashtext

Usage

    <?php

    use Shdev\FlashText\KeywordProcessor;

    $keywordProcessor = new KeywordProcessor();

    $keywords = [
        'java'               => ['java_2e', 'java programing'],
        'product management' => ['product management techniques', 'product management'],
    ];
    $keywordProcessor->addKeywordsFromAssocArray($keywords);

    $sentence = 'I know java_2e and product management techniques';

    $keywordsExtracted = $keywordProcessor->extractKeywords($sentence);
    // $keywordsExtracted = ['java', 'product management']

    $keywordsExtractedWithSpanInfo = $keywordProcessor->extractKeywords($sentence, true);
    // $keywordsExtractedWithSpanInfo = [
    //     ['java', 7, 14],
    //     ['product management', 19, 48],
    // ]

    $sentenceNew = $keywordProcessor->replaceKeywords($sentence);
    // $sentenceNew = 'I know java and product management';
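The span info in the usage example can be read as `[keyword, start, end]`, with `start` inclusive and `end` exclusive. The following sketch hard-codes the spans shown above instead of calling the library, and recovers the original matched text from them. Treating the offsets as byte offsets is an assumption that holds for this ASCII sentence; the variable `$matchedTexts` is only for illustration.

```php
<?php
$sentence = 'I know java_2e and product management techniques';

// Spans copied from the usage example: [clean name, start, end (exclusive)].
$spans = [
    ['java', 7, 14],
    ['product management', 19, 48],
];

$matchedTexts = [];
foreach ($spans as [$cleanName, $start, $end]) {
    // Cut the variant that was actually found out of the sentence.
    $matchedTexts[$cleanName] = substr($sentence, $start, $end - $start);
}

foreach ($matchedTexts as $cleanName => $matched) {
    echo $cleanName, ' <- "', $matched, '"', PHP_EOL;
}
// prints:
// java <- "java_2e"
// product management <- "product management techniques"
```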

Citation

The original paper on the FlashText algorithm:

    @ARTICLE{2017arXiv171100046S,
        author = {{Singh}, V.},
        title = "{Replace or Retrieve Keywords In Documents at Scale}",
        journal = {ArXiv e-prints},
        archivePrefix = "arXiv",
        eprint = {1711.00046},
        primaryClass = "cs.DS",
        keywords = {Computer Science - Data Structures and Algorithms},
        year = 2017,
        month = oct,
        adsurl = {http://adsabs.harvard.edu/abs/2017arXiv171100046S},
        adsnote = {Provided by the SAO/NASA Astrophysics Data System}
    }

An article about the algorithm was also published on Medium (freeCodeCamp).

