Simple Extensions for Scala

aerabi/skala


Build

We use SBT to build the project, although you can also build it with scalac alone.

SBT

SBT (Simple Build Tool) is the most popular build tool for Scala, so we support it. To compile the code:

$ sbt compile

And for testing:

$ sbt test:run

This runs our own SUnit tests through SBT. To make a jar out of the project:

$ sbt package

Current Things

Currently we have implemented the following:

PairedIterable

The full name is ir.angellandros.scala.collection.PairedIterable. The main reason for this data structure is to provide reduceByKey. You can now do this:

import ir.angellandros.scala.collection.Implicits._
val l = List(1, 1, 1, 2, 2, 3)
l.map(_ -> 1).reduceByKey(_ + _)
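To illustrate what reduceByKey does, here is a minimal sketch of one way such an extension could be written with an implicit class. The names ReduceByKeySketch and PairedOps are made up for this example, and the library's own PairedIterable may well be implemented differently.

```scala
object ReduceByKeySketch {
  // Hypothetical enrichment class, not the library's actual PairedIterable.
  implicit class PairedOps[K, V](pairs: Iterable[(K, V)]) {
    // Combine all values that share a key, using f to merge two values.
    def reduceByKey(f: (V, V) => V): Map[K, V] =
      pairs.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) =>
        val merged = acc.get(k).map(prev => f(prev, v)).getOrElse(v)
        acc.updated(k, merged)
      }
  }

  // Count how often each element occurs.
  val counts: Map[Int, Int] =
    List(1, 1, 1, 2, 2, 3).map(_ -> 1).reduceByKey(_ + _)
  // counts == Map(1 -> 3, 2 -> 2, 3 -> 1)
}
```

Mapping each element to a pair with value 1 and then summing per key is the usual word-count pattern.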

Keyed Vector

The idea behind KeyedVector is to get rid of the dictionary. You don't need a dictionary, because the indices can be String or any other type you want. They can also be Int, in which case you can still use it with some dictionary. In any case, KeyedVector is closely related to HashMap. Every KeyedVector has an ID, and this is the most important difference:

val vector = new KeyedVector(id, map)

With vector.id or vector.toMap you can get either one back. You can also call vector.get(key) or vector.keySet.
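As a rough picture of the data structure, the sketch below wraps an ID and a map in one small class. It is a stand-in written for this README, not the library's code: the class name KeyedVec, the String ID, the Double values and the Option returned by get are all assumptions.

```scala
object KeyedVectorSketch {
  // Hypothetical stand-in for KeyedVector: an ID plus a map from keys to values.
  // The ID type (String), the value type (Double) and the return type of get
  // are assumptions made for this example.
  final class KeyedVec[K](val id: String, entries: Map[K, Double]) {
    def toMap: Map[K, Double] = entries
    def get(key: K): Option[Double] = entries.get(key)
    def keySet: Set[K] = entries.keySet
  }

  // The keys are plain strings, so no separate word-to-index dictionary is needed.
  val vector = new KeyedVec("doc-1", Map("scala" -> 2.0, "sbt" -> 1.0))
}
```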

The other class implemented here is KeyedVectors, which provides some tools for KeyedVector instances:

val i1 = KeyedVectors.dot(v1, v2)
val i2 = KeyedVectors.normInfinity(v1)
val i3 = KeyedVectors.normNN(v1, n) // Minkowski norm raised to the power n, which is the squared Euclidean norm for n=2
val i4 = KeyedVectors.distNN(v1, v2, n) // distance function induced by the norm
val i5 = KeyedVectors.norm(v1) // Euclidean norm
val i6 = KeyedVectors.eucDist(v1, v2)
val i7 = KeyedVectors.sqEucDist(v1, v2) // squared Euclidean distance
val i8 = KeyedVectors.cosineSim(v1, v2)
val i9 = KeyedVectors.cosineDist(v1, v2)
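To make the meaning of some of these functions concrete, here is a small sketch that computes them over plain Map[K, Double] values instead of KeyedVector. It assumes that a missing key counts as 0 and that cosine distance is 1 minus cosine similarity; the library may define either differently.

```scala
object SparseOpsSketch {
  // A sparse vector as a plain map; missing keys are treated as 0.0 (assumption).
  type Vec[K] = Map[K, Double]

  def dot[K](a: Vec[K], b: Vec[K]): Double =
    a.iterator.map { case (k, x) => x * b.getOrElse(k, 0.0) }.sum

  // Euclidean norm.
  def norm[K](a: Vec[K]): Double = math.sqrt(dot(a, a))

  // Sum of |x|^n, i.e. the Minkowski norm raised to the power n.
  def normNN[K](a: Vec[K], n: Double): Double =
    a.valuesIterator.map(x => math.pow(math.abs(x), n)).sum

  // Largest absolute component.
  def normInfinity[K](a: Vec[K]): Double =
    a.valuesIterator.map(x => math.abs(x)).foldLeft(0.0)((m, x) => math.max(m, x))

  // Squared Euclidean distance over the union of both key sets.
  def sqEucDist[K](a: Vec[K], b: Vec[K]): Double =
    (a.keySet ++ b.keySet).iterator.map { k =>
      val d = a.getOrElse(k, 0.0) - b.getOrElse(k, 0.0)
      d * d
    }.sum

  def cosineSim[K](a: Vec[K], b: Vec[K]): Double =
    dot(a, b) / (norm(a) * norm(b))

  // Assumed definition: 1 - cosine similarity.
  def cosineDist[K](a: Vec[K], b: Vec[K]): Double = 1.0 - cosineSim(a, b)

  val v1: Vec[String] = Map("a" -> 3.0, "b" -> 4.0)
  val v2: Vec[String] = Map("a" -> 3.0)
}
```

With v1 and v2 as above, dot(v1, v2) is 9.0, norm(v1) is 5.0 and sqEucDist(v1, v2) is 16.0.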

Canopy Clustering Algorithm

The canopy algorithm in pure Scala. There are two implementations: the simple one, with O(n^2) running time, and the merged one, with O(n^{3/2}) running time.

val canopy = new CanopyDriver(0.95, 0.5)
val results = canopy.mergedRun(vectors, KeyedVectors.cosineDist[String])

As you can see, the distance function is an input argument.
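For readers who don't know canopy clustering, the following is a minimal sketch of the simple O(n^2) variant as it is usually described, with the distance function passed in as an argument. It is not the code behind CanopyDriver: the function name canopies, the parameter names and the reading of the two thresholds as a loose t1 and a tight t2 are assumptions.

```scala
import scala.annotation.tailrec

object CanopySketch {
  // Simple canopy clustering (textbook form, not the library's CanopyDriver).
  // t1 is the loose threshold, t2 the tight one, with t1 >= t2 (assumed roles).
  def canopies[A](points: List[A], t1: Double, t2: Double)(
      dist: (A, A) => Double): List[(A, List[A])] = {
    @tailrec
    def loop(remaining: List[A], acc: List[(A, List[A])]): List[(A, List[A])] =
      remaining match {
        case Nil => acc.reverse
        case center :: rest =>
          // Every remaining point within t1 of the center joins this canopy.
          val members = center :: rest.filter(p => dist(center, p) <= t1)
          // Points within t2 of the center can no longer start or join a canopy.
          val next = rest.filter(p => dist(center, p) > t2)
          loop(next, (center, members) :: acc)
      }
    loop(points, Nil)
  }

  val result: List[(Double, List[Double])] =
    canopies(List(0.0, 0.25, 0.75, 5.0, 5.25), t1 = 1.0, t2 = 0.5)(
      (a, b) => math.abs(a - b))
  // result == List(
  //   (0.0, List(0.0, 0.25, 0.75)),
  //   (0.75, List(0.75)),
  //   (5.0, List(5.0, 5.25)))
}
```

Note that 0.75 ends up in two canopies: it is within t1 of the first center but not within t2, so it stays available as a later center. Overlapping canopies are expected with this algorithm.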

SUnit

A simple unit testing class. It currently provides assertEquals.
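The README does not show the signature of assertEquals, so the sketch below is only a guess at what such a helper could look like. The object name SUnitSketch, the argument order (expected first) and the Boolean result are assumptions.

```scala
object SUnitSketch {
  // Hypothetical helper; the real SUnit signature is not shown in this README.
  // Compares two values, prints a PASS/FAIL line and returns the outcome.
  def assertEquals[A](expected: A, actual: A): Boolean = {
    val ok = expected == actual
    println(
      if (ok) s"PASS: $actual"
      else s"FAIL: expected $expected but got $actual")
    ok
  }
}
```

A check would then read SUnitSketch.assertEquals(3, 1 + 2).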
