mbejda/Node-OpenNLP

Apache OpenNLP wrapper for Nodejs

License

MIT license

NodeJs OpenNLP

Node OpenNLP - (OpenNLP 1.6.0)

OpenNLP Wrapper For Node.js

Node-OpenNLP depends on Node-Java. Please make sure your environment is properly configured to run Node-Java; see the Node-Java documentation to learn more.

Installation

 npm install opennlp --save

Node-OpenNLP comes with Apache OpenNLP 1.6.0 along with the following trained 1.5 series models:

en-chunker.bin
en-ner-person.bin
en-pos-maxent.bin
en-sent.bin
en-token.bin

More trained models can be found here: http://opennlp.sourceforge.net/models-1.5

Sentence Detector

The OpenNLP Sentence Detector can detect whether or not a punctuation character marks the end of a sentence. In this sense, a sentence is defined as the longest whitespace-trimmed character sequence between two punctuation marks.
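To see why this decision needs a trained model, here is a minimal sketch of a naive splitter that treats every `.`, `!` or `?` as a sentence end. The function `naiveSentences` is hypothetical and is not part of Node-OpenNLP; it only illustrates the problem the Sentence Detector addresses.

```javascript
// Naive baseline, NOT how OpenNLP works: every '.', '!' or '?' ends a sentence.
function naiveSentences(text) {
    return text
        .split(/(?<=[.!?])/) // cut right after each punctuation mark
        .map(function (s) { return s.trim(); })
        .filter(function (s) { return s.length > 0; });
}

var sentence = 'Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .';
var naive = naiveSentences(sentence);

// The abbreviation "Nov." is wrongly treated as a sentence end,
// so the single input sentence comes back in two pieces.
console.log(naive.length); // 2
console.log(naive);
```

A model-based detector is meant to avoid exactly this kind of false split on abbreviations.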

    var openNLP = require("opennlp");
    var sentence = 'Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .';
    var sentenceDetector = new openNLP().sentenceDetector;
    sentenceDetector.sentDetect(sentence, function(err, results) {
        // To get probabilities
        sentenceDetector.probs(function(error, probability) {
            console.log(error, probability);
        });
        console.log(results);
    });
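The calls in this README use Node-style callbacks with the error as the first argument. If Promises are preferred, a wrapper along the following lines may help. This is a sketch, not part of Node-OpenNLP: `promisifyMethod` is a hypothetical helper, and `fakeDetector` is a stand-in stub so the example runs without Java. It assumes the real methods follow the error-first convention the examples suggest.

```javascript
var util = require("util");

// Hypothetical helper: turn an error-first callback method into one that returns a Promise.
function promisifyMethod(obj, methodName) {
    return util.promisify(obj[methodName].bind(obj));
}

// Stand-in stub with the same call shape as sentDetect(text, callback).
// With the real module you would pass `new openNLP().sentenceDetector` instead.
var fakeDetector = {
    sentDetect: function (text, callback) {
        setImmediate(function () { callback(null, [text]); });
    }
};

var sentDetect = promisifyMethod(fakeDetector, "sentDetect");
var pending = sentDetect("Hello world .").then(function (results) {
    console.log(results);
    return results;
});
```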

Configurations

The following default configurations can be overridden during initialization.

    var openNLP = require("opennlp");
    var opennlp = new openNLP({
        models: {
            doccat: __dirname + '/models/en-doccat.bin',
            posTagger: __dirname + '/models/en-pos-maxent.bin',
            tokenizer: __dirname + '/models/en-token.bin',
            nameFinder: __dirname + '/models/en-ner-person.bin',
            sentenceDetector: __dirname + '/models/en-sent.bin',
            chunker: __dirname + '/models/en-chunker.bin'
        },
        openNLP: {
            jar: __dirname + "/lib/opennlp-tools-1.6.0.jar"
        }
    });
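When only one model needs to change, it can be convenient to build the full `models` object explicitly rather than rely on how partial overrides are merged, which this README does not specify. The sketch below shows one way to do that. `defaultModels` and `withModelOverrides` are hypothetical helpers, and the base directory and the location model path are example values, not paths shipped with the package.

```javascript
var path = require("path");

// Model file names taken from the configuration above; baseDir is an assumption.
function defaultModels(baseDir) {
    return {
        doccat: path.join(baseDir, "models", "en-doccat.bin"),
        posTagger: path.join(baseDir, "models", "en-pos-maxent.bin"),
        tokenizer: path.join(baseDir, "models", "en-token.bin"),
        nameFinder: path.join(baseDir, "models", "en-ner-person.bin"),
        sentenceDetector: path.join(baseDir, "models", "en-sent.bin"),
        chunker: path.join(baseDir, "models", "en-chunker.bin")
    };
}

// Hypothetical helper: copy the defaults, then let user-supplied paths win.
function withModelOverrides(defaults, overrides) {
    return Object.assign({}, defaults, overrides || {});
}

var models = withModelOverrides(defaultModels("/opt/app"), {
    nameFinder: "/data/models/en-ner-location.bin" // example path only
});

console.log(models.nameFinder);
console.log(models.tokenizer);
```

The resulting object could then be passed as the `models` entry of the configuration.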

Tokenizer

The OpenNLP Tokenizers segment an input character sequence into tokens. Tokens are usually words, punctuation, numbers, etc.

    var openNLP = require("opennlp");
    var sentence = 'Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .';
    var tokenizer = new openNLP().tokenizer;
    tokenizer.tokenize(sentence, function(err, results) {
        console.log(err, results);
        tokenizer.getTokenProbabilities(function(error, response) {
            console.log(error, response);
        });
    });

Name Finder

The Name Finder can detect named entities and numbers in text. To be able to detect entities the Name Finder needs a model. The model is dependent on the language and entity type it was trained for.

    var openNLP = require("opennlp");
    var sentence = 'Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .';
    var nameFinder = new openNLP().nameFinder;
    nameFinder.find(sentence, function(err, tokens_arr) {
        console.log(err, tokens_arr);
        nameFinder.probs(function(error, response) {
            console.log(error, response);
        });
    });

Document Categorizer

The OpenNLP Document Categorizer can classify text into pre-defined categories. It is based on the maximum entropy framework.

**To use the document categorizer you need to train a model first. The default trained model that is included is for testing purposes only.**

    var openNLP = require("opennlp");
    var doccat = new openNLP().doccat;
    doccat.categorize("I enjoyed watching Rocky", function(err, list) {
        doccat.getAllResults(list, function(err, category) {});
        doccat.getBestCategory(list, function(err, category) {});
    });
    doccat.scoreMap("I enjoyed watching Rocky", function(err, category) {});
    doccat.sortedScoreMap("I enjoyed watching Rocky", function(err, category) {});
    doccat.getCategory(1, function(err, category) {});
    doccat.getIndex('Happy', function(err, index) {});
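Conceptually, picking the best category means choosing the label with the highest score. The sketch below illustrates that step on a plain object. It assumes the scores have already been converted into a `{ category: score }` object, which this README does not confirm; `bestCategory` is a hypothetical helper and the numbers are made up.

```javascript
// Hypothetical helper: return the category with the highest score, or null if there are none.
function bestCategory(scores) {
    var best = null;
    Object.keys(scores).forEach(function (category) {
        if (best === null || scores[category] > scores[best]) {
            best = category;
        }
    });
    return best;
}

// Made-up scores for illustration only, not real model output.
var exampleScores = { Happy: 0.72, Sad: 0.28 };
console.log(bestCategory(exampleScores)); // "Happy"
```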

Part-of-Speech Tagger

The Part of Speech Tagger marks tokens with their corresponding word type based on the token itself and the context of the token. A token might have multiple POS tags depending on the token and the context. The OpenNLP POS Tagger uses a probability model to predict the correct POS tag out of the tag set.

    var openNLP = require("opennlp");
    var posTagger = new openNLP().posTagger;
    var sentence = 'Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .';
    posTagger.tag(sentence, function(err, tokens_arr) {
        console.log(err, tokens_arr);
    });
    posTagger.topKSequences(sentence, function(error, tagger) {
        console.log(tagger.getScore());
        console.log(tagger.getProbs());
        console.log(tagger.getOutcomes());
    });
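Tags are easier to read when paired with the tokens they describe. The sketch below assumes the tagger yields one tag per whitespace-separated token, in order; that shape is an assumption about the callback result. `pairTokensWithTags` is a hypothetical helper, and the tags shown are illustrative Penn Treebank labels rather than captured output.

```javascript
// Hypothetical helper: align tokens and tags position by position.
function pairTokensWithTags(tokens, tags) {
    if (tokens.length !== tags.length) {
        throw new Error("tokens and tags must have the same length");
    }
    return tokens.map(function (token, i) {
        return { token: token, tag: tags[i] };
    });
}

var tokens = 'Pierre Vinken , 61 years old'.split(' ');
var tags = ['NNP', 'NNP', ',', 'CD', 'NNS', 'JJ']; // illustrative tags only
var pairs = pairTokensWithTags(tokens, tags);

pairs.forEach(function (p) {
    console.log(p.token + '/' + p.tag);
});
```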

Chunker

Text chunking consists of dividing a text into syntactically correlated groups of words, such as noun groups and verb groups, without specifying their internal structure or their role in the main sentence.

    var openNLP = require("opennlp");
    var posTagger = new openNLP().posTagger;
    var sentence = 'Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .';
    var chunker = new openNLP().chunker;
    posTagger.tag(sentence, function(err, tokens_arr) {
        chunker.topKSequences(sentence, tokens_arr, function(err, tokens_arr) {
            console.log(err, tokens_arr);
        });
        chunker.chunk(sentence, tokens_arr, function(err, tokens_arr) {
            chunker.probs(function(error, prob) {});
        });
    });
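Chunkers commonly label each token with a BIO-style tag such as `B-NP` (begins a noun phrase), `I-NP` (continues it) or `O` (outside any chunk). The sketch below groups tokens into phrases from such tags. It assumes one tag per token in that format, which this README does not confirm for the wrapper's output; `groupChunks` is a hypothetical helper and the tags are illustrative.

```javascript
// Hypothetical helper: collect tokens into chunks based on BIO-style tags.
function groupChunks(tokens, chunkTags) {
    var chunks = [];
    var current = null;
    tokens.forEach(function (token, i) {
        var tag = chunkTags[i];
        if (!tag || tag === 'O') {
            current = null; // token is outside any chunk
            return;
        }
        var prefix = tag.charAt(0); // 'B' or 'I'
        var type = tag.slice(2);    // e.g. 'NP'
        // Start a new chunk on 'B', or when an 'I' tag has nothing matching to continue.
        if (prefix === 'B' || current === null || current.type !== type) {
            current = { type: type, tokens: [] };
            chunks.push(current);
        }
        current.tokens.push(token);
    });
    return chunks;
}

var tokens = 'Pierre Vinken , 61 years old'.split(' ');
var chunkTags = ['B-NP', 'I-NP', 'O', 'B-NP', 'I-NP', 'B-ADJP']; // illustrative tags only
var chunks = groupChunks(tokens, chunkTags);

chunks.forEach(function (c) {
    console.log(c.type + ': ' + c.tokens.join(' '));
});
```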

Please report any bugs. Feel free to send me a tweet if you need any help.

Follow me on Twitter [@notmilobejda](https://twitter.com/notmilobejda)
My Blog [mbejda.com](https://mbejda.com)
