Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

k - Nearest Neighbor Search

DZone's Guide to

k - Nearest Neighbor Search

· Web Dev Zone ·
Free Resource

Deploy code to production now. Release to users when ready. Learn how to separate code deployment from user-facing feature releases with LaunchDarkly.

A k-nearest neighbor search identifies the top k nearest neighbors to a query. The problem is: given a dataset D of vectors in a d-dimensional space and a query point x in the same space, find the closest point in D to x. The following function performs a k-nearest neighbor search using the euclidean distance:
from numpy import random,argsort,sqrt
from pylab import plot,show

def knn_search(x, D, K):
 """ find K nearest neighbours of data among D """
 ndata = D.shape[1]
 K = K if K < ndata else ndata
 # euclidean distances from the other points
 sqd = sqrt(((D - x[:,:ndata])**2).sum(axis=0))
 idx = argsort(sqd) # sorting
 # return the indexes of K nearest neighbours
 return idx[:K]
The function computes the euclidean distance between every point of D and x then returns the indexes of the points for which the distance is smaller.
Now, we will test this function on a random bidimensional dataset:
# knn_search test
data = random.rand(2,200) # random dataset
x = random.rand(2,1) # query point

# performing the search
neig_idx = knn_search(x,data,10)

# plotting the data and the input point
plot(data[0,:],data[1,:],'ob',x[0,0],x[1,0],'or')
# highlighting the neighbours
plot(data[0,neig_idx],data[1,neig_idx],'o',
  markerfacecolor='None',markersize=15,markeredgewidth=1)
show()
The result is as follows:



















The red point is the query vector and the blue ones represent the data. The blue points surrounded by a black circle are the nearest neighbors.

Deploy code to production now. Release to users when ready. Learn how to separate code deployment from user-facing feature releases with LaunchDarkly.

Topics:

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}