A part of speech tagger written in PHP.


nai-php/NaiPosTagger


A lightweight, framework-agnostic library in pure PHP for part-of-speech tagging. It can be used for chatbots, personal assistants, keyword extraction, etc. Being written in PHP, it can be easily integrated into existing or new applications, giving them a real ability to understand what users write.

It is based on vocabularies and predefined grammatical rules, with no wrappers around third-party systems, neural networks, machine learning, or models that require huge resources.
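To make the "vocabularies plus rules" idea concrete, here is a toy sketch of that general approach: a dictionary lookup assigns each word its most likely tag, then a contextual rule corrects the tag based on its neighbour. Everything in it (`$lexicon`, `$rules`, `toyTag()`, the `UNK` tag and the tiny word list) is hypothetical and written for illustration only; it is not NaiPosTagger's code, data, or API.

```php
<?php
// Toy illustration only: none of these names or data come from NaiPosTagger.

// Vocabulary: each known word form maps to its candidate tags, most likely first.
$lexicon = [
    'my'   => ['ADJ'],
    'name' => ['NOUN', 'VER'],
    'is'   => ['VER'],
    'they' => ['PRO'],
];

// Contextual rules: change tag "from" into "to" when the previous tag is "prev".
$rules = [
    ['from' => 'NOUN', 'to' => 'VER', 'prev' => 'PRO'],
];

function toyTag(string $sentence, array $lexicon, array $rules): array
{
    $words = preg_split('/\s+/', trim($sentence), -1, PREG_SPLIT_NO_EMPTY);

    // Pass 1: dictionary lookup, with a crude fallback for unknown words.
    $tags = [];
    foreach ($words as $word) {
        $candidates = $lexicon[strtolower($word)] ?? null;
        if ($candidates !== null) {
            $tags[] = $candidates[0];
        } elseif (ctype_upper($word[0])) {
            $tags[] = 'NPR'; // unknown capitalised word: guess a proper noun
        } else {
            $tags[] = 'UNK'; // unknown word, no guess
        }
    }

    // Pass 2: apply each contextual rule from left to right.
    foreach ($rules as $rule) {
        for ($i = 1; $i < count($tags); $i++) {
            if ($tags[$i] === $rule['from'] && $tags[$i - 1] === $rule['prev']) {
                $tags[$i] = $rule['to'];
            }
        }
    }

    // Pair every word with its final tag.
    $result = [];
    foreach ($words as $i => $word) {
        $result[] = ['form' => $word, 'tag' => $tags[$i]];
    }
    return $result;
}

foreach (toyTag('my name is Fred', $lexicon, $rules) as $token) {
    echo $token['form'], '/', $token['tag'], ' ';
}
echo "\n"; // prints: my/ADJ name/NOUN is/VER Fred/NPR
```

With the input "they name Fred", the lookup first tags "name" as NOUN and the rule then rewrites it to VER because the previous tag is PRO. A real tagger of this kind needs far larger dictionaries and rule sets; the sketch only shows the two-pass shape.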

This is the English version. Documentation and a TODO are coming; more info and a demo are available on n-ai.cloud

Precision

The table below reports results on different types of sentence corpora.

| Corpus | Total tokens | Correctly tagged | Not correctly tagged | % of total correct |
|---|---|---|---|---|
| "Just Shoot Me" movie subtitles | 3403 | 3381 | 22 | 99.35 |

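The percentage in the last column is simply correctly tagged tokens divided by total tokens. As a quick check of the "Just Shoot Me" figures, the following sketch recomputes it; `taggingAccuracy()` is a hypothetical helper written for this example, not part of the library.

```php
<?php
// Hypothetical helper, not part of NaiPosTagger: tagging accuracy as a percentage.
function taggingAccuracy(int $correct, int $total): float
{
    if ($total <= 0) {
        throw new InvalidArgumentException('Total token count must be positive.');
    }
    return round($correct / $total * 100, 2);
}

// Figures from the "Just Shoot Me" row of the precision table.
$total   = 3403;
$correct = 3381;
$wrong   = $total - $correct; // 22 tokens not correctly tagged

echo taggingAccuracy($correct, $total), "\n"; // prints 99.35
```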
Installation

  1. in your project folder, e.g. "myproject", install the package via Composer;

  2. create a folder named "dictionaries";

  3. inside the "dictionaries" folder, clone or download the English dictionary repository;

  4. run this example script:

```php
<?php

use NaiPosTagger\Pipelines\PipelinePosTagging;
use NaiPosTagger\Models\NaiPosArr;

include('vendor/autoload.php');
include(__DIR__ . '/vendor/nai-php/naipostagger/src/Utilities/common_functions_helper.php');

define('DICTIONARIES_PATH', __DIR__ . '/./dictionaries/dictionaries-');
define('TRAITS_PATH', __DIR__ . '/./vendor/nai-php/naipostagger/src/');

$sentence = 'my name is Fred';

$PipelinePosTagging = new PipelinePosTagging();
$PipelinePosTagging->language = 'en';
$pos_arr = $PipelinePosTagging->transform($sentence);

// for a clear output, better hide metadata
$pos_arr = NaiPosArr::clearMetadata($pos_arr);

// and further simplify the output
$pos_arr = NaiPosArr::flatPosArr($pos_arr);

diex($pos_arr);
```

And the output will be:

```
Array
(
    [0] => Array
        (
            [form] => .
            [lemma] => .
            [features] => SENT
            [sh-feat] => SENT
            [label] =>
            [rule] =>
            [pos_score] => 0
        )

    [1] => Array
        (
            [form] => my
            [lemma] => my
            [features] => ADJ:pos+m+s
            [sh-feat] => ADJ
            [label] =>
            [rule] =>
            [pos_score] => 0
        )

    [2] => Array
        (
            [form] => name
            [lemma] => name
            [features] => NOUN-m:s
            [sh-feat] => NOUN
            [label] =>
            [rule] =>
            [pos_score] => 0
        )

    [3] => Array
        (
            [form] => is
            [lemma] => is
            [features] => VER:ind+pres+3+s
            [sh-feat] => VER
            [label] =>
            [rule] =>
            [pos_score] => 0
        )

    [4] => Array
        (
            [form] => Fred
            [lemma] => Fred
            [features] => NPR
            [sh-feat] => NPR
            [label] =>
            [rule] =>
            [pos_score] => 0
        )

    [5] => Array
        (
            [form] => .
            [lemma] => .
            [features] => SENT
            [sh-feat] => SENT
            [label] =>
            [rule] =>
            [pos_score] => 0
        )

)
```
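Once flattened, the result is a plain PHP array, so it can be post-processed with ordinary array code, for instance for the keyword extraction use case. The sketch below works on a hard-coded copy of the tokens shown in the output, trimmed to the keys it reads. `lemmasByTag()` and `splitFeatures()` are hypothetical helpers, not library functions, and the "HEAD:attr+attr" reading of the `features` string is an assumption drawn only from this sample output.

```php
<?php
// Copy of the tokens from the example output, trimmed to the keys used here.
$pos_arr = [
    ['form' => '.',    'lemma' => '.',    'features' => 'SENT',             'sh-feat' => 'SENT'],
    ['form' => 'my',   'lemma' => 'my',   'features' => 'ADJ:pos+m+s',      'sh-feat' => 'ADJ'],
    ['form' => 'name', 'lemma' => 'name', 'features' => 'NOUN-m:s',         'sh-feat' => 'NOUN'],
    ['form' => 'is',   'lemma' => 'is',   'features' => 'VER:ind+pres+3+s', 'sh-feat' => 'VER'],
    ['form' => 'Fred', 'lemma' => 'Fred', 'features' => 'NPR',              'sh-feat' => 'NPR'],
    ['form' => '.',    'lemma' => '.',    'features' => 'SENT',             'sh-feat' => 'SENT'],
];

// Hypothetical helper: lemmas of the tokens whose short tag is in $wanted.
function lemmasByTag(array $pos_arr, array $wanted): array
{
    $lemmas = [];
    foreach ($pos_arr as $token) {
        if (in_array($token['sh-feat'], $wanted, true)) {
            $lemmas[] = $token['lemma'];
        }
    }
    return $lemmas;
}

// Hypothetical helper: split a features string into a head and its attributes,
// assuming the "HEAD:attr+attr" layout seen in the sample output.
function splitFeatures(string $features): array
{
    $parts = explode(':', $features, 2);
    $hasAttributes = isset($parts[1]) && $parts[1] !== '';
    return [
        'head'       => $parts[0],
        'attributes' => $hasAttributes ? explode('+', $parts[1]) : [],
    ];
}

// Nouns and proper nouns as candidate keywords: "name" and "Fred".
print_r(lemmasByTag($pos_arr, ['NOUN', 'NPR']));

// Head "VER" with attributes "ind", "pres", "3", "s".
print_r(splitFeatures('VER:ind+pres+3+s'));
```

Filtering on `sh-feat` rather than `features` keeps the comparison simple, since the short tag carries only the part-of-speech class.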

To do list

  • Find contributors
  • Clean, check, fix and tag terms in dictionaries
  • Clean, check, fix Brill rules
  • Add more ngrams
  • Add more tests, especially for filters
  • Collect and load frill words
  • Better OOP for some classes?
  • In the module for logical analysis (not yet published), collect synonyms and temporal expressions
