gbuesing/kmeans-clusterer


k-means clustering in Ruby. Uses NArray under the hood for fast calculations.

Jump to the examples directory to see this in action.

Features

  • Runs multiple clustering attempts and keeps the best solution (a single run can get stuck in a non-optimal local minimum)
  • Initializes centroids via the k-means++ algorithm, for faster convergence
  • Calculates the silhouette score for evaluating cluster quality
  • Option to scale data before clustering, so that output isn't biased by different feature scales
  • Works with high-dimensional data
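
To make the k-means++ initialization mentioned above more concrete, here is a minimal pure-Ruby sketch of the seeding step. It is an illustration only and not the gem's NArray-based implementation; the names `squared_distance` and `kmeanspp_init` and the `rng:` parameter are hypothetical.

```ruby
# Squared Euclidean distance between two points (arrays of numbers).
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Pick k initial centroids from points using k-means++ seeding.
# rng is injectable so a run can be reproduced (hypothetical parameter).
def kmeanspp_init(points, k, rng: Random.new)
  # First centroid: chosen uniformly at random.
  centroids = [points[rng.rand(points.length)]]
  while centroids.length < k
    # Weight of each point: squared distance to its nearest chosen centroid.
    weights = points.map { |p| centroids.map { |c| squared_distance(p, c) }.min }
    # Sample the next centroid with probability proportional to its weight.
    target = rng.rand * weights.sum
    cumulative = 0.0
    index = weights.each_index.find { |i| (cumulative += weights[i]) > target }
    # Fallback covers the degenerate case where every weight is zero.
    centroids << points[index || weights.length - 1]
  end
  centroids
end

data = [[40.71, -74.01], [34.05, -118.24], [39.29, -76.61],
        [45.52, -122.68], [38.9, -77.04], [36.11, -115.17]]
p kmeanspp_init(data, 2, rng: Random.new(42))
```

Points far from every already-chosen centroid are more likely to be picked next, which tends to spread the starting centroids out and is why this seeding usually converges faster than purely random starts.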

Install

gem install kmeans-clusterer

Usage

Simple example:

    require 'kmeans-clusterer'

    data = [[40.71, -74.01], [34.05, -118.24], [39.29, -76.61],
            [45.52, -122.68], [38.9, -77.04], [36.11, -115.17]]
    labels = ['New York', 'Los Angeles', 'Baltimore', 'Portland', 'Washington DC', 'Las Vegas']

    k = 2 # find 2 clusters in data
    kmeans = KMeansClusterer.run k, data, labels: labels, runs: 5

    kmeans.clusters.each do |cluster|
      puts cluster.id.to_s + '. ' +
           cluster.points.map(&:label).join(", ") + "\t" +
           cluster.centroid.to_s
    end

    # Use existing clusters for prediction with new data:
    predicted = kmeans.predict [[41.85, -87.65]] # Chicago
    puts "\nClosest cluster to Chicago: #{predicted[0]}"

    # Clustering quality score. Value between -1.0..1.0 (1.0 is best)
    puts "\nSilhouette score: #{kmeans.silhouette.round(2)}"

Output of simple example:

    0. New York, Baltimore, Washington DC	[39.63, -75.89]
    1. Los Angeles, Portland, Las Vegas	[38.56, -118.7]

    Closest cluster to Chicago: 0

    Silhouette score: 0.91
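
For readers unfamiliar with the silhouette score, the sketch below computes the standard mean silhouette coefficient in plain Ruby. It assumes Euclidean distance and at least two clusters; the helper names (`euclidean`, `mean`, `silhouette_score`) are hypothetical, and the gem's own calculation may differ in details, so the result is not guaranteed to match `kmeans.silhouette`.

```ruby
# Euclidean distance between two points (arrays of numbers).
def euclidean(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

def mean(values)
  values.sum / values.length.to_f
end

# clusters: an array of at least two clusters, each an array of points.
# Returns the mean silhouette coefficient over all points.
def silhouette_score(clusters)
  scores = clusters.each_with_index.flat_map do |cluster, ci|
    cluster.each_with_index.map do |point, i|
      others = cluster.each_with_index.reject { |_, j| j == i }.map(&:first)
      next 0.0 if others.empty? # convention: a singleton cluster scores 0

      # a: mean distance to the other points in the same cluster.
      a = mean(others.map { |q| euclidean(point, q) })
      # b: smallest mean distance to the points of any other cluster.
      b = clusters.each_with_index
                  .reject { |_, cj| cj == ci }
                  .map { |other, _| mean(other.map { |q| euclidean(point, q) }) }
                  .min
      denom = [a, b].max
      denom.zero? ? 0.0 : (b - a) / denom
    end
  end
  mean(scores)
end

east = [[40.71, -74.01], [39.29, -76.61], [38.9, -77.04]]
west = [[34.05, -118.24], [45.52, -122.68], [36.11, -115.17]]
puts silhouette_score([east, west]).round(2)
```

Values near 1.0 mean each point sits much closer to its own cluster than to the nearest other cluster; values near 0 or below suggest overlapping or misassigned clusters.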

Options

The following options can be passed in to KMeansClusterer.run:

option            default   description
:labels           nil       optional array of Ruby objects to collate with data array
:runs             10        number of times to run kmeans
:log              false     print stats after each run
:init             :kmpp     algorithm for picking initial cluster centroids. Accepts :kmpp, :random, or an array of k centroids
:scale_data       false     scales features before clustering using formula (data - mean) / std
:float_precision  :double   float precision to use. :double or :single
:max_iter         300       max iterations per run
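
The :scale_data formula, (data - mean) / std, can be illustrated with a small pure-Ruby sketch that standardizes each feature column. The function name `scale_features` is hypothetical, and the use of the population standard deviation (dividing by n) is an assumption; the gem does this with NArray and may use a different convention.

```ruby
# Standardize each column of rows: (value - column mean) / column std.
def scale_features(rows)
  stats = rows.transpose.map do |col|
    mean = col.sum / col.length.to_f
    # Population variance, dividing by n (an assumption for this sketch).
    variance = col.sum { |v| (v - mean)**2 } / col.length
    [mean, Math.sqrt(variance)]
  end

  rows.map do |row|
    row.each_with_index.map do |value, j|
      mean, std = stats[j]
      # A constant column has std 0; map it to 0.0 instead of dividing by zero.
      std.zero? ? 0.0 : (value - mean) / std
    end
  end
end

rows = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scale_features(rows).each { |row| p row.map { |v| v.round(3) } }
```

After scaling, both columns have mean 0 and unit standard deviation, so a feature measured in hundreds no longer dominates the distance calculation over one measured in single digits.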
