skala

Simple Extensions for Scala

Build

We use SBT to build the project, although you can also build it with plain scalac.

SBT

SBT (Simple Build Tool) is the most popular build tool for Scala, so we support it. To compile the code:

$ sbt compile

And for testing:

$ sbt test:run

SBT works with our own SUnit tests. To make a jar out of the project, run:

$ sbt package

Current Things

Currently we have implemented the following:

PairedIterable

The full name is ir.angellandros.scala.collection.PairedIterable. The main reason for this data structure is to provide reduceByKey. You can now do this:

import ir.angellandros.scala.collection.Implicits._
val l = List(1, 1, 1, 2, 2, 3)
l.map(_ -> 1).reduceByKey(_ + _)
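To make the idea concrete, here is a minimal, self-contained sketch of how a reduceByKey extension could be written with an implicit class. The names PairedSketch, PairedOps and PairedDemo are hypothetical, and the Map return type is an assumption; this is not skala's actual implementation.

```scala
object PairedSketch {
  // Hypothetical stand-in for the library's implicit conversion.
  implicit class PairedOps[K, V](val pairs: Iterable[(K, V)]) {
    // Group the pairs by key, then combine each group's values with `f`.
    // Returning a Map is an assumption made for this sketch.
    def reduceByKey(f: (V, V) => V): Map[K, V] =
      pairs.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(f) }
  }
}

object PairedDemo {
  import PairedSketch._

  val l = List(1, 1, 1, 2, 2, 3)

  // Count the occurrences of each element: 1 appears 3 times, 2 twice, 3 once.
  val counts: Map[Int, Int] = l.map(_ -> 1).reduceByKey(_ + _)
}
```

With this sketch, PairedDemo.counts maps each distinct element of the list to the number of times it occurs.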

Keyed Vector

The idea behind KeyedVector is to get rid of the dictionary. You don't need a dictionary, because the indices can be String or any other type you want, although they can still be Int if you prefer to use a dictionary. KeyedVector is closely related to HashMap. The most important difference is that every KeyedVector has an ID:

val vector = new KeyedVector(id, map)

With vector.id or vector.toMap you can get either one of them back. You can also call vector.get(key) or vector.keySet.

The other class implemented here is KeyedVectors, which provides some tools for working with KeyedVector instances:

val i1 = KeyedVectors.dot(v1, v2)
val i2 = KeyedVectors.normInfinity(v1)
val i3 = KeyedVectors.normNN(v1, n) // Minkowski norm ^n, which is squared Euclidean norm for n=2
val i4 = KeyedVectors.distNN(v1, v2, n) // distance function induced by the norm
val i5 = KeyedVectors.norm(v1) // Euclidean norm
val i6 = KeyedVectors.eucDist(v1, v2)
val i7 = KeyedVectors.sqEucDist(v1, v2) // squared Euclidean distance
val i8 = KeyedVectors.cosineSim(v1, v2)
val i9 = KeyedVectors.cosineDist(v1, v2)
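As an illustration of what such operations compute on sparse, keyed data, here is a small sketch. The KV case class, the String ID, the Double values, and the choice of cosine distance as 1 minus cosine similarity are all assumptions made for this example. They are not taken from skala's source.

```scala
object KeyedVectorSketch {
  // Hypothetical minimal vector: an ID plus a sparse key -> value map.
  final case class KV[K](id: String, toMap: Map[K, Double])

  // Dot product: only keys present in both vectors contribute.
  def dot[K](a: KV[K], b: KV[K]): Double =
    a.toMap.iterator.map { case (k, x) => x * b.toMap.getOrElse(k, 0.0) }.sum

  // Minkowski norm raised to the n-th power: sum of |x|^n.
  def normNN[K](a: KV[K], n: Double): Double =
    a.toMap.values.map(x => math.pow(math.abs(x), n)).sum

  // Euclidean norm.
  def norm[K](a: KV[K]): Double = math.sqrt(dot(a, a))

  // Cosine similarity, and a distance derived from it (assumed: 1 - similarity).
  def cosineSim[K](a: KV[K], b: KV[K]): Double = dot(a, b) / (norm(a) * norm(b))
  def cosineDist[K](a: KV[K], b: KV[K]): Double = 1.0 - cosineSim(a, b)
}

object KeyedVectorDemo {
  import KeyedVectorSketch._

  // Keys are plain strings, so no separate word -> index dictionary is needed.
  val v1 = KV("doc1", Map("scala" -> 3.0, "sbt" -> 4.0))
  val v2 = KV("doc2", Map("scala" -> 3.0))

  val d = dot(v1, v2)        // only "scala" is shared: 3.0 * 3.0
  val s = cosineSim(v1, v2)  // dot / (norm(v1) * norm(v2))
}
```

Because the vectors are keyed maps, a missing key simply counts as zero, which is what lets the keys be words or any other type directly.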

Canopy Clustering Algorithm

The canopy algorithm in pure Scala. There are two implementations of the algorithm: the simple one, with O(n^2) running time, and the merged one, with O(n^{3/2}) running time.

val canopy = new CanopyDriver(0.95, 0.5)
val results = canopy.mergedRun(vectors, KeyedVectors.cosineDist[String])

As you can see, the distance function is an input argument.
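For readers unfamiliar with canopy clustering, the following is a rough sketch of the simple O(n^2) variant with a pluggable distance function. It assumes the usual formulation with a loose threshold t1 and a tight threshold t2 (t1 > t2). Whether CanopyDriver's two constructor arguments map to these thresholds in this order is an assumption, and the names CanopySketch and canopies are hypothetical. The merged variant is not shown.

```scala
object CanopySketch {
  // Returns a list of (center, members) pairs.
  // t1 is the loose threshold, t2 the tight one; `dist` is supplied by the caller.
  def canopies[A](points: List[A], t1: Double, t2: Double)(
      dist: (A, A) => Double): List[(A, List[A])] = {
    @annotation.tailrec
    def loop(remaining: List[A], acc: List[(A, List[A])]): List[(A, List[A])] =
      remaining match {
        case Nil => acc.reverse
        case center :: rest =>
          // Every remaining point closer than t1 joins this canopy.
          val members = center :: rest.filter(p => dist(center, p) < t1)
          // Points closer than t2 are removed and cannot start a new canopy.
          val next = rest.filter(p => dist(center, p) >= t2)
          loop(next, (center, members) :: acc)
      }
    loop(points, Nil)
  }
}

object CanopyDemo {
  // One-dimensional points with absolute difference as the distance.
  val points = List(0.0, 0.1, 0.6, 2.0, 2.1)
  val result = CanopySketch.canopies(points, 1.0, 0.5)((a, b) => math.abs(a - b))
  val centers = result.map(_._1)
}
```

In this example the point 0.6 lies within t1 of the first center but not within t2, so it is a member of the first canopy and also becomes the center of its own. Overlap of this kind is expected in canopy clustering.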

SUnit

A simple unit testing class. For now it has assertEquals.
