1. ## proximity algorithm question

Since there's no forum dedicated for algorithm questions.. thought I could ask here

Suppose I have a collection of coordinates (x,y or x,y,z), and I want to have an operation which looks up the closest n neighbors to a particular coordinate..

My friends told me to do this using quadtrees, and some told me to have some sort of sparse maps or something that I don't understand

What's the best way? There may be a million entries inside this collection..

2. It depends a bit on how your setup is...if you need to get the distance over and over you could build a voxel grid and assign each point to a specific voxel in space(2d or 3d). Then when you lookup the points close to any given point you only need to check the points in the surrounding voxels (If you don't find the wanted amount you may have to check the next set of voxels). If you are unfamiliar with voxels you can simply think of it as slicing up the space into box volumes(3d) or rectangles(2d).

When points change position you can assign them a different voxel ID.

3. Um....(x*x)+(y*y)+(z*z)???

4. For the actual checking of points you could do something like this...create an array of the number of elements you want to find and assign a very large value to each element of the array. Then loop through the points for each point. For each point-distance look for the biggest value in the array and replace it with the current distance if the current distance is less than the array value. That way you will be left with the closest distances. You might want to also create another array and store the points(or indices or whatever) in that as you switch values in the first array.

5. Do these points have floating point or integral coordinates?
I have code that implements a sparse 3D voxel map (as DrSnuggles mentions) in the 'useful classes' section of my website.

You need to tell us more about it however, in order to be suggested the most appropriate answer. Tell us what you told your friends

6. Originally Posted by Bubba
Um....(x*x)+(y*y)+(z*z)???
The question is not how to determine the distance, but rather how to QUICKLY determine a set of closest points. If you have a million points, checking each of them is obvious non-optimal.

The voxel solution is basically a crude simulation of octrees. Building an octree is only a good idea if you are going to be performing these checks repeatedly.

7. Basic idea of octrees:

Space is bounded by a cube. The location and size of this cube in space are arbitrary -- the important part is that it is finite in size. In other words, the values of the points must be bounded -- no infinite points.

D is the dimension of the cube (length of one side).
N is the maximum number of points per node. This should be some reasonably small number.

Start with a root cube with dimension DxDxD. Insert each point into the octree in the following manner:

Code:
```insert_point_in_octree(root, point):
if the number of points in root is <= N:
else:
split root, if not already split, into 8 subcubes of dimension D/2xD/2xD/2 -
-and divvy points into new subcubes
for each subcube:
if point is in subcube:
insert_point_in_octree(subcube, point)
return```
This is like a binary tree, but instead of 2-way branching, we use 8-way branching. And we choose when to split based on how many points are at a node.

Searching the octree is similar, but the logic is more complex, because you have to exclude subcubes which cannot possibly contain points which are "close enough." This is how the octree gets its speed. I leave this difficult part to you

8. There's a whole area of database studies on nearest neighbor queries (google should help). You'll likely want to use an R-tree.

9. The question is not how to determine the distance, but rather how to QUICKLY determine a set of closest points. If you have a million points, checking each of them is obvious non-optimal.
And who says a circle cannot have a finite resolution? The resolution here is cell size.

But the absolute fastest way to do this in 2D or 3D is by using a quadtree. Once you create the tree, finding neighbors amounts to traversing the tree. Since each quad tree node contains children which are inside the volume of the node you know which points lie in the node. So you set a distance factor to search and set that as the quad-tree recursion size.

10. Hmm.. R-trees looks interesting, might be what I'm looking for, I'll have to give it a read on wikipedia. Though it seems that it's often used for "find neighbours within X miles", not "find the closest n neighbours"..

Now that R-tree was mentioned, I remembered a google interview question I had ages ago about geolocation lookup.. I answered quad-tree, maybe the real answer must have been R-trees? Heh I guess that's why I didnt get the job..

You need to tell us more about it however, in order to be suggested the most appropriate answer. Tell us what you told your friends
Was thinking of making a website for people who want to be stalked to let them get stalked So you can register yourself, give yourself an x, y coordinate (I guess this might be different with geolocation, because the earth is round?), and the system will give you the closest people to stalk

So it's pretty simple, just a neighbor lookup. But maybe in the future I'll want lookup on something like "the closest 50 girls in your area" or "the closest 50 guys with income over 200k" or something like that. Heh.. if such an application ever comes out, there's gonna be all sort of trouble with it

11. Well that's a fairly limited dataset. My solution will work just fine unless you are searching millions of people. And since this is a database you can pre-sort these according to various locations. Use these locations as 'nodes'. Find nodes closest to the user and you have all the users that are somewhat close to the user.

12. Originally Posted by underthesun
Was thinking of making a website for people who want to be stalked to let them get stalked So you can register yourself, give yourself an x, y coordinate (I guess this might be different with geolocation, because the earth is round?), and the system will give you the closest people to stalk
Oh, so x and y are latitude and longitude?
That's spherical coordinates iirc, so you can't simply use the usual (x*x+y*y) squared distance formula. See this is why it's good for us to ask lots of questions.

Since your data points are limited to basically points on a sphere, there might be an easier way to find the closest points. Well at least I assume nobody is under molten rock or in space.
I wonder if converting to a quaternion would help?

13. I see.. yeah I guess converting to a quarternion could help with calculating the actual distances? I mean if you have a degree difference you can calculate the actual surface distance by multiplying things with the earth's radius?

But I guess if everyone's on the earth's surface, unless we're talking about people on opposite edges of the earth, absolute distance won't make too much of a difference eh.. could even label it as "distance if you dig a hole straight to the target"