Thread: proximity algorithm question

  1. #1
    Registered User
    Join Date
    Jan 2005
    Posts
    108

    proximity algorithm question

    Since there's no forum dedicated for algorithm questions.. thought I could ask here

    Suppose I have a collection of coordinates (x,y or x,y,z), and I want to have an operation which looks up the closest n neighbors to a particular coordinate..

    My friends told me to do this using quadtrees, and some told me to have some sort of sparse maps or something that I don't understand

    What's the best way? There may be a million entries inside this collection..

  2. #2
    Registered User
    Join Date
    Oct 2007
    Posts
    166
    It depends a bit on how your setup is...if you need to get the distance over and over you could build a voxel grid and assign each point to a specific voxel in space(2d or 3d). Then when you lookup the points close to any given point you only need to check the points in the surrounding voxels (If you don't find the wanted amount you may have to check the next set of voxels). If you are unfamiliar with voxels you can simply think of it as slicing up the space into box volumes(3d) or rectangles(2d).

    When points change position you can assign them a different voxel ID.

  3. #3
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    Um....(x*x)+(y*y)+(z*z)???

  4. #4
    Registered User
    Join Date
    Oct 2007
    Posts
    166
    For the actual checking of points you could do something like this...create an array of the number of elements you want to find and assign a very large value to each element of the array. Then loop through the points for each point. For each point-distance look for the biggest value in the array and replace it with the current distance if the current distance is less than the array value. That way you will be left with the closest distances. You might want to also create another array and store the points(or indices or whatever) in that as you switch values in the first array.

  5. #5
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Do these points have floating point or integral coordinates?
    I have code that implements a sparse 3D voxel map (as DrSnuggles mentions) in the 'useful classes' section of my website.

    You need to tell us more about it however, in order to be suggested the most appropriate answer. Tell us what you told your friends
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  6. #6
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Bubba View Post
    Um....(x*x)+(y*y)+(z*z)???
    The question is not how to determine the distance, but rather how to QUICKLY determine a set of closest points. If you have a million points, checking each of them is obvious non-optimal.

    The voxel solution is basically a crude simulation of octrees. Building an octree is only a good idea if you are going to be performing these checks repeatedly.

  7. #7
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Basic idea of octrees:

    Space is bounded by a cube. The location and size of this cube in space are arbitrary -- the important part is that it is finite in size. In other words, the values of the points must be bounded -- no infinite points.

    D is the dimension of the cube (length of one side).
    N is the maximum number of points per node. This should be some reasonably small number.

    Start with a root cube with dimension DxDxD. Insert each point into the octree in the following manner:

    Code:
    insert_point_in_octree(root, point):
        if the number of points in root is <= N:
            add point to root
        else:
            split root, if not already split, into 8 subcubes of dimension D/2xD/2xD/2 -
              -and divvy points into new subcubes
            for each subcube:
                if point is in subcube:
                    insert_point_in_octree(subcube, point)
                    return
    This is like a binary tree, but instead of 2-way branching, we use 8-way branching. And we choose when to split based on how many points are at a node.

    Searching the octree is similar, but the logic is more complex, because you have to exclude subcubes which cannot possibly contain points which are "close enough." This is how the octree gets its speed. I leave this difficult part to you

  8. #8
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    There's a whole area of database studies on nearest neighbor queries (google should help). You'll likely want to use an R-tree.

  9. #9
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    The question is not how to determine the distance, but rather how to QUICKLY determine a set of closest points. If you have a million points, checking each of them is obvious non-optimal.
    And who says a circle cannot have a finite resolution? The resolution here is cell size.

    But the absolute fastest way to do this in 2D or 3D is by using a quadtree. Once you create the tree, finding neighbors amounts to traversing the tree. Since each quad tree node contains children which are inside the volume of the node you know which points lie in the node. So you set a distance factor to search and set that as the quad-tree recursion size.
    Last edited by VirtualAce; 12-18-2007 at 06:22 PM.

  10. #10
    Registered User
    Join Date
    Jan 2005
    Posts
    108
    Hmm.. R-trees looks interesting, might be what I'm looking for, I'll have to give it a read on wikipedia. Though it seems that it's often used for "find neighbours within X miles", not "find the closest n neighbours"..

    Now that R-tree was mentioned, I remembered a google interview question I had ages ago about geolocation lookup.. I answered quad-tree, maybe the real answer must have been R-trees? Heh I guess that's why I didnt get the job..

    You need to tell us more about it however, in order to be suggested the most appropriate answer. Tell us what you told your friends
    Was thinking of making a website for people who want to be stalked to let them get stalked So you can register yourself, give yourself an x, y coordinate (I guess this might be different with geolocation, because the earth is round?), and the system will give you the closest people to stalk

    So it's pretty simple, just a neighbor lookup. But maybe in the future I'll want lookup on something like "the closest 50 girls in your area" or "the closest 50 guys with income over 200k" or something like that. Heh.. if such an application ever comes out, there's gonna be all sort of trouble with it
    Last edited by underthesun; 12-19-2007 at 02:09 AM.

  11. #11
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    Well that's a fairly limited dataset. My solution will work just fine unless you are searching millions of people. And since this is a database you can pre-sort these according to various locations. Use these locations as 'nodes'. Find nodes closest to the user and you have all the users that are somewhat close to the user.

  12. #12
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by underthesun View Post
    Was thinking of making a website for people who want to be stalked to let them get stalked So you can register yourself, give yourself an x, y coordinate (I guess this might be different with geolocation, because the earth is round?), and the system will give you the closest people to stalk
    Oh, so x and y are latitude and longitude?
    That's spherical coordinates iirc, so you can't simply use the usual (x*x+y*y) squared distance formula. See this is why it's good for us to ask lots of questions.

    Since your data points are limited to basically points on a sphere, there might be an easier way to find the closest points. Well at least I assume nobody is under molten rock or in space.
    I wonder if converting to a quaternion would help?
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  13. #13
    Registered User
    Join Date
    Jan 2005
    Posts
    108
    I see.. yeah I guess converting to a quarternion could help with calculating the actual distances? I mean if you have a degree difference you can calculate the actual surface distance by multiplying things with the earth's radius?

    But I guess if everyone's on the earth's surface, unless we're talking about people on opposite edges of the earth, absolute distance won't make too much of a difference eh.. could even label it as "distance if you dig a hole straight to the target"

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Quick Algorithm Question
    By cjwenigma in forum C++ Programming
    Replies: 3
    Last Post: 11-01-2007, 10:39 AM
  2. Design layer question
    By mdoland in forum C# Programming
    Replies: 0
    Last Post: 10-19-2007, 04:22 AM
  3. STL algorithm question
    By Reggie in forum C++ Programming
    Replies: 1
    Last Post: 04-22-2003, 09:04 AM
  4. opengl DC question
    By SAMSAM in forum Game Programming
    Replies: 6
    Last Post: 02-26-2003, 09:22 PM
  5. More a math question than an algorithm
    By Gustaff in forum C Programming
    Replies: 1
    Last Post: 01-28-2003, 01:10 PM