13
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.

Closed 11 years ago.

I would like to compute k-nearest neighbours in Python. Which library should I use?

asked Apr 6, 2011 at 11:56
sramij

4 Answers

22

I think you should use scikit ann.

There is a good tutorial about nearest neighbours here.

According to the documentation:

ann is a SWIG-generated python wrapper for the Approximate Nearest Neighbor (ANN) Library (http://www.cs.umd.edu/~mount/ANN/), developed by David M. Mount and Sunil Arya. ann provides an immutable kdtree implementation (via ANN) which can perform k-nearest neighbor and approximate k

answered Apr 6, 2011 at 12:00
Sandro Munda

4 Comments

+1 this library is very easy to work with.
scikit.ann is not the same as scikit-learn. scikit.ann is hard to compile even with easy_install (it requires SWIG), so scikit-learn is the better solution.
The scikit ann link is broken.
ANN is not the same as KNN (which is what the question was originally about).
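
To make the distinction in the last comment concrete, here is a minimal sketch that runs an exact and an approximate k-nearest-neighbour query on the same tree. It uses `scipy.spatial.cKDTree` as a stand-in, not the ANN library or scikit.ann discussed above. The helper name `exact_and_approx_knn`, the data sizes, and `eps=0.5` are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree


def exact_and_approx_knn(points, queries, k=5, eps=0.5):
    """Return (exact_dist, approx_dist) for the k nearest neighbours of each query.

    Hypothetical helper for illustration only.
    """
    tree = cKDTree(points)
    # eps=0 (the default) gives the exact k nearest neighbours.
    exact_dist, _ = tree.query(queries, k=k)
    # eps>0 allows an approximate answer: the k-th returned distance is
    # at most (1 + eps) times the true k-th nearest distance.
    approx_dist, _ = tree.query(queries, k=k, eps=eps)
    return exact_dist, approx_dist


# Assumed toy data: 1000 random points in 4 dimensions, 10 query points.
rng = np.random.default_rng(0)
points = rng.random((1000, 4))
queries = rng.random((10, 4))

exact_dist, approx_dist = exact_and_approx_knn(points, queries, k=5, eps=0.5)

# Worst-case ratio of approximate to exact k-th neighbour distance.
print(np.max(approx_dist[:, -1] / exact_dist[:, -1]))
```

The approximate query can skip parts of the tree, which may help on large or high-dimensional data. Whether the loss of accuracy is acceptable depends on the application.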
5

Here is a script comparing scipy.spatial.cKDTree and pyflann.FLANN. See for yourself which one is faster for your application.

```python
import cProfile
import os

import numpy as np
import pyflann
import scipy.spatial

# Config params
dim = 4
data_size = 1000
test_size = 1

# Generate data
np.random.seed(1)
dataset = np.random.rand(data_size, dim)
testset = np.random.rand(test_size, dim)


def test_pyflann_flann(num_reps):
    flann = pyflann.FLANN()
    for rep in range(num_reps):
        # Rebuild the index and query it on every repetition
        params = flann.build_index(dataset, target_precision=0.0, log_level='info')
        result = flann.nn_index(testset, 5, checks=params['checks'])


def test_scipy_spatial_kdtree(num_reps):
    for rep in range(num_reps):
        # Rebuild the tree and query it on every repetition
        kdtree = scipy.spatial.cKDTree(dataset, leafsize=10)
        result = kdtree.query(testset, 5)


num_reps = 1000
cProfile.run('test_pyflann_flann(num_reps); test_scipy_spatial_kdtree(num_reps)', 'out.prof')
# View the profile with RunSnakeRun (must be installed separately)
os.system('runsnake out.prof')
```
answered Jul 15, 2011 at 5:41
Chris Flesher


4

scipy.spatial.cKDTree is fast and solid. For an example of using it for NN interpolation, see (ahem) inverse-distance-weighted-idw-interpolation-with-python on SO.

(If you could say e.g. "I have 1M points in 3d, and want k=5 nearest neighbors of 1k new points", you might get better answers or code examples.
What do you want to do with the neighbors once you've found them?)
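
As a rough sketch of that scenario, the snippet below finds the k=5 nearest neighbours of 1k new 3d points with `scipy.spatial.cKDTree` and uses them for a simple inverse-distance-weighted estimate. This is not the code from the linked SO answer. The function name `idw_interpolate`, the `power` parameter, the synthetic values, and the data size (10,000 points, scaled down from 1M to keep it quick) are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree


def idw_interpolate(known_xyz, known_values, query_xyz, k=5, power=2.0):
    """Inverse-distance-weighted estimate at each query point.

    Hypothetical helper for illustration; assumes k >= 2 so that
    tree.query returns arrays of shape (n_queries, k).
    """
    tree = cKDTree(known_xyz)
    dist, idx = tree.query(query_xyz, k=k)
    # Guard against division by zero when a query coincides with a known point.
    dist = np.maximum(dist, 1e-12)
    weights = 1.0 / dist**power
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights * known_values[idx]).sum(axis=1)


# Assumed toy data: 10,000 known points in 3d and 1,000 new points.
rng = np.random.default_rng(0)
known = rng.random((10_000, 3))
values = np.sin(known[:, 0]) + known[:, 1] * known[:, 2]
new_points = rng.random((1_000, 3))

estimates = idw_interpolate(known, values, new_points, k=5)
print(estimates.shape)
```

Building the tree once and querying all new points in a single call is usually the cheaper pattern, since `query` accepts an array of points.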

answered Apr 6, 2011 at 16:00
denis


4

This is built into SciPy if you want a kd-tree approach: http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.html#scipy.spatial.KDTree
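
A minimal sketch of what that might look like, with a brute-force NumPy check alongside for comparison. The helper names `knn_kdtree` and `knn_brute_force` and the random test data are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial import KDTree


def knn_kdtree(points, queries, k):
    """k nearest neighbours via scipy.spatial.KDTree (Euclidean distance)."""
    tree = KDTree(points)
    # Returns (distances, indices), each of shape (n_queries, k) for k >= 2.
    return tree.query(queries, k=k)


def knn_brute_force(points, queries, k):
    """Reference implementation: compute all pairwise distances and sort."""
    diffs = queries[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs**2).sum(axis=2))
    idx = np.argsort(dists, axis=1)[:, :k]
    return np.take_along_axis(dists, idx, axis=1), idx


# Assumed toy data: 500 random points in 3d, 20 query points.
rng = np.random.default_rng(0)
points = rng.random((500, 3))
queries = rng.random((20, 3))

tree_dist, tree_idx = knn_kdtree(points, queries, k=3)
brute_dist, brute_idx = knn_brute_force(points, queries, k=3)

# The two methods should agree on the neighbour distances.
print(np.allclose(tree_dist, brute_dist))
```

The brute-force version is fine for small inputs, but its memory use grows with the number of query points times the number of data points. The tree avoids that.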

answered Jun 8, 2012 at 19:29
Max Bileschi

