# Thread: Nearest Neighbor search for high dimension, slow insert

1. ## Nearest Neighbor search for high dimension, slow insert

Hello all,

I have a mostly static set of high-dimensional data. Given a query point, I want to find the (exact) nearest neighbor of that point. Since the set is static, insertion time can be relatively slow (O(N^3) would still be okay-ish), but I need the query to be as quick as possible.
What are some good algorithms for this? Any ideas?
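For concreteness, the exact query being asked about, done as the naive exhaustive scan, might look something like the following illustrative Python sketch (the `nearest_neighbor` helper and the sample points are made up here):

```python
import math

def nearest_neighbor(points, query):
    """Naive exact nearest neighbor: scan every point, O(N * k) per query."""
    best, best_d2 = None, math.inf
    for p in points:
        d2 = sum((a - b) ** 2 for a, b in zip(p, query))  # squared Euclidean
        if d2 < best_d2:
            best, best_d2 = p, d2
    return best

# Made-up sample data: three points in 3 dimensions.
pts = [(0.0, 0.0, 1.0), (2.0, 2.0, 2.0), (0.5, 0.5, 0.5)]
print(nearest_neighbor(pts, (0.4, 0.4, 0.6)))  # -> (0.5, 0.5, 0.5)
```

Any index structure has to beat this O(N·k) per-query cost to be worth building.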

EVOEx

2. ## Re: Nearest Neighbor search for high dimension, slow insert

I think the usual method is to subdivide the space recursively with hyperplanes, down to some fixed depth or to some fixed number of points per subspace. To find the nearest neighbor, you find the closest subspace, then the closest sub-subspace, and so on until you hit a leaf; then you exhaustively test all the points in that leaf. For a concrete example, see the octree, which is what I'm describing for the case of 3 dimensions. You have more dimensions, but the idea is the same.

3. ## Re: Nearest Neighbor search for high dimension, slow insert

Thanks, brewbuck. But isn't that the same as a kd-tree (which only splits into two segments at each subdivision, though the algorithm would be the same in essence, would it not)?

However, the kd-tree Wikipedia article contains the following:

> kd-trees are not suitable for efficiently finding the nearest neighbour in high-dimensional spaces. As a general rule, if the dimensionality is k, the number of points in the data, N, should be N ≫ 2^k. Otherwise, when kd-trees are used with high-dimensional data, most of the points in the tree will be evaluated, the efficiency is no better than exhaustive search, and approximate nearest-neighbour methods are used instead.
So am I missing a fundamental difference between kd-trees and "octrees"? Or is that Wikipedia quote only true in the worst case?
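To make the two replies concrete, here is an illustrative pure-Python sketch of the subdivide-then-backtrack search described above, as a kd-tree (all names are made up; a sketch, not production code). The backtracking test is exactly where the quoted caveat bites: in high dimensions the splitting plane is almost always closer than the current best distance, so the `far` subtree gets visited nearly every time and the query degenerates toward an exhaustive scan.

```python
def build_kdtree(points, depth=0):
    """Recursively split on one coordinate axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nn_search(node, query, best=None):
    """Descend toward the query, then backtrack into any subtree whose
    splitting plane lies closer than the best distance found so far."""
    if node is None:
        return best
    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))
    if best is None or dist2(node["point"]) < dist2(best):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nn_search(near, query, best)
    if diff ** 2 < dist2(best):  # plane closer than current best: must check far side
        best = nn_search(far, query, best)
    return best

# Made-up sample data in 2 dimensions.
pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(pts)
print(nn_search(tree, (9, 2)))  # -> (8, 1)
```

In practice, exact high-dimensional queries often come down to a well-tuned linear scan, or a library structure such as SciPy's `scipy.spatial.cKDTree`, which carries the same high-dimensional caveat.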