# airtonjal/centrality-nubank
A simple RESTful server application to solve the graph centrality problem
In this challenge, suppose we are looking to do social network analysis for prospective customers. We want to extract from their social network a metric called "closeness centrality".
Centrality metrics try to approximate a measure of influence of an individual within a social network. The distance between any two vertices is the length of the shortest path between them. The farness of a given vertex v is the sum of the distances from every other vertex to v. Finally, the closeness of a vertex v is the inverse of its farness.
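In symbols, writing d(u, v) for the shortest-path distance between u and v, the definitions above read:

```math
\mathrm{farness}(v) = \sum_{u \neq v} d(u, v), \qquad
\mathrm{closeness}(v) = \frac{1}{\mathrm{farness}(v)}
```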
The first part of the challenge is to rank the vertices in a given undirected graph by their closeness. The graph is provided in the attached file; each line of the file consists of two vertex names separated by a single space, representing an edge between those two nodes.
The second part of the challenge is to create a RESTful web server with endpoints to register edges and to render a ranking of vertices sorted by centrality. We can think of the centrality value for a node as an initial "score" for that customer.
The third and final part is to add another endpoint to flag a customer node as "fraudulent". It should take a vertex id and update the internal customer scores as follows:
- The fraudulent customer score should be zero.
- Customers directly related to the "fraudulent" customer should have their score halved.
- More generally, scores of customers indirectly referred by the "fraudulent" customer should be multiplied by a coefficient F:
F(k) = (1 - (1/2)^k)
where k is the length of the shortest path from the "fraudulent" customer to the customer in question.
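The coefficient above generalizes the first two rules: at k = 1 it halves the score, and its effect fades with distance. A minimal sketch (the name `fraudCoefficient` is illustrative, not from the repo):

```scala
// Fraud coefficient F(k) = 1 - (1/2)^k, where k is the shortest-path
// distance from the fraudulent vertex to the affected customer.
def fraudCoefficient(k: Int): Double = 1.0 - math.pow(0.5, k)
```

Direct neighbours (k = 1) get a coefficient of 0.5, matching the "halved" rule; at k = 2 it is 0.75, at k = 3 it is 0.875, and it approaches 1 as k grows.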
- Edge weight == 1
- edges.txt must be read and loaded when the application starts
- The REST server may receive new vertices that did not previously exist in the graph
- The graph is not necessarily connected
- Read edges.txt and build an adjacency map (given a vertex, it returns the adjacent vertices)
- Implement Breadth-First Search (BFS) to find the shortest paths from a vertex to every other vertex in the graph
- Run BFS from every vertex, store the results in a map, then calculate and sort the closeness values
- When a fraud is signaled, recalculate and sort the closeness of every connected node
- When an edge is added, recalculate and sort the closeness of the whole graph
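The first three steps above can be sketched as follows (a minimal, self-contained version; the names `adjacency`, `bfs`, and `closeness` are illustrative, not the repo's API):

```scala
import scala.collection.immutable.Queue

// Build an undirected adjacency map from edge pairs.
def adjacency(edges: Seq[(Int, Int)]): Map[Int, Set[Int]] =
  edges.foldLeft(Map.empty[Int, Set[Int]].withDefaultValue(Set.empty)) {
    case (adj, (a, b)) => adj.updated(a, adj(a) + b).updated(b, adj(b) + a)
  }

// BFS shortest-path distances from `source` (every edge has weight 1).
// Only reachable vertices appear in the result, so a disconnected
// graph is handled naturally.
def bfs(adj: Map[Int, Set[Int]], source: Int): Map[Int, Int] = {
  @annotation.tailrec
  def loop(queue: Queue[Int], dist: Map[Int, Int]): Map[Int, Int] =
    queue.dequeueOption match {
      case None => dist
      case Some((v, rest)) =>
        val undiscovered = adj(v).filterNot(dist.contains)
        loop(rest.enqueueAll(undiscovered),
             dist ++ undiscovered.map(_ -> (dist(v) + 1)))
    }
  loop(Queue(source), Map(source -> 0))
}

// Closeness: the inverse of the sum of distances from v.
def closeness(adj: Map[Int, Set[Int]], v: Int): Double =
  1.0 / bfs(adj, v).values.sum
```

Ranking all vertices is then a matter of mapping `closeness` over the key set of the adjacency map and sorting by the resulting score.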
The application uses Scala's SBT build tool. To build it, cd to the project root directory and type:
sbt package
To run the application and start the REST server:
sbt run
To run tests (developed with ScalaTest):
sbt test
Services are invoked through HTTP requests. The data format is JSON. The following table describes the services:
Service | Description | Path | Method | Return |
---|---|---|---|---|
List | Lists the vertices sorted by score | `/graph/list` | GET | A JSON array like `[{"vertex":44,"score":0.005988024}, {"vertex":88,"score":0.00591716}, {"vertex":33,"score":0.005882353}, ...]` |
Add | Adds a new edge to the graph | `/edge/${v1}/${v2}` | POST | A string confirming the edge was added |
Fraud | Flags a vertex as fraudulent | `/fraud/${v}` | PUT | A string confirming the vertex was marked |
- Implement core functionality using Akka Actors
- Avoid recalculating the closeness of the whole graph when an edge is added
- Use a dependency injection framework like Scaldi or Guice
- Implement the Floyd-Warshall algorithm (it could perhaps be simplified, since every edge weight is 1)
- Use breadth-first search, probably better than Floyd-Warshall here
- Each time a vertex is added, the closeness of every vertex must be recalculated. Can this be optimized?
- When a fraud is detected, the closeness of every vertex must be recalculated. Can this be optimized?
Other ideas:
- Use lazy evaluation?
- Dijkstra's algorithm (likely unnecessary while every edge weight is 1, since BFS already yields shortest paths)