# bayes-classifier-js
A simple JavaScript implementation of a Naive Bayes classifier for text classification.
```js
// Create a classifier
let classifier = new Classifier();

// Train it with examples
classifier.train('I am happy', 'positive');
classifier.train('I am sad and disappointed', 'negative');
classifier.train('This is okay', 'neutral');

// Classify new text
let results = classifier.guess('I feel great today!');
console.log(results);
// → { positive: { probability: 0.85 }, negative: { probability: 0.10 }, neutral: { probability: 0.05 } }
```
Naive Bayes uses Bayes' theorem to calculate the probability that a text belongs to each category.
The notation P(A|B) means "the probability of A given that B is true". For example:
- P(rain|cloudy) = "probability of rain given that it's cloudy"
- P(positive|"happy") = "probability text is positive given it contains the word 'happy'"
Bayes' theorem says:
```
P(category|text) = P(text|category) × P(category) / P(text)
```

In plain English: "How likely is this category, given this text?" equals "How likely is this text in this category?" times "How common is this category overall?" divided by "How likely is this text overall?"
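To make that concrete, here is a tiny sketch with made-up numbers (the probabilities below are illustrative, not values produced by this library):

```js
// Hypothetical numbers: half the training texts are positive,
// "happy" appears in 10% of positive texts and 6% of all texts.
let pCategory = 0.5;          // P(positive)
let pTextGivenCategory = 0.1; // P("happy"|positive)
let pText = 0.06;             // P("happy")

// Bayes' theorem: P(positive|"happy")
let pCategoryGivenText = (pTextGivenCategory * pCategory) / pText;
console.log(pCategoryGivenText); // → 0.833...
```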
The "naive" assumption is that words are independent of each other, so:
```
P(text|category) = P(word1|category) × P(word2|category) × P(word3|category) × ...
```

What happens if the classifier sees a word it's never encountered before? Without smoothing, P(word|category) = 0, which makes the entire probability 0 (since we're multiplying).
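A quick sketch (with hypothetical word probabilities) of why a single unseen word is fatal without smoothing:

```js
// Hypothetical P(word|positive) values; "zxqwerty" was never seen in training
let wordProbs = { happy: 0.08, great: 0.05, zxqwerty: 0 };

// Naive likelihood: multiply the per-word probabilities
let words = ['happy', 'great', 'zxqwerty'];
let likelihood = words.reduce((product, w) => product * (wordProbs[w] || 0), 1);
console.log(likelihood); // → 0; one unseen word zeroes out the whole product
```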
Laplace smoothing solves this by pretending every possible word appears at least once:
- Add +1 to every word count (even words we've never seen)
- Add +vocabulary_size to the total word count for each category
Example: Category "positive" has seen 100 words, vocabulary has 1000 unique words total.
- Word "amazing" appeared 5 times: P("amazing"|positive) = (5 + 1) / (100 + 1000) = 6/1100
- Word "zxqwerty" never seen: P("zxqwerty"|positive) = (0 + 1) / (100 + 1000) = 1/1100
This way, no word gets exactly 0 probability, but rare and unseen words get low probabilities.
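As a sketch, here is what that formula looks like in code (wordProbability is a hypothetical helper for illustration, not part of this library's API), using the numbers from the example above:

```js
// Laplace smoothing: (count + 1) / (totalWords + vocabularySize)
function wordProbability(word, counts, totalWords, vocabSize) {
  let count = counts[word] || 0;
  return (count + 1) / (totalWords + vocabSize);
}

// "positive" has seen 100 words; the vocabulary has 1000 unique words
let positiveCounts = { amazing: 5 };
console.log(wordProbability('amazing', positiveCounts, 100, 1000));  // 6/1100 ≈ 0.00545
console.log(wordProbability('zxqwerty', positiveCounts, 100, 1000)); // 1/1100 ≈ 0.00091
```

In practice, implementations often sum the logarithms of these probabilities rather than multiplying them directly, to avoid floating-point underflow on long texts.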
Further reading:

- Bayes theorem, the geometry of changing beliefs by 3Blue1Brown
- Explaining Bayesian Problems Using Visualizations by Luana Micallef
- A Plan for Spam by Paul Graham
- Naive Bayes Classifier
- Bayes' Theorem - The mathematical foundation
- Laplace Smoothing - Handling zero probabilities