Thread: finding nearest neighbour distance

  1. #1
    Registered User
    Join Date
    Dec 2007
    Posts
    16

    finding nearest neighbour distance

    hi,

    my problem is as follows:

    I have a list of N pairs of coordinates labelled (x[i],y[i]) i=1,...,N.

    What I want to do is to take every point (x[i],y[i]) and find the Pythagorean distance to it's nearest neighbour.

    So I have to loop over all the points, and then take that minimum distance for each point.. the distance will be given by

    sqrt((x-x1)^2 +(y-y1)^2)



    I am new to C and have no idea how to do this!

    Any help would be appreciated


    thanks a lot

    mo786

  2. #2
    Registered User
    Join Date
    Nov 2009
    Posts
    60
    I would start making a struct "point" which has two members: x and y (floating point numbers).
    Then you proceed with making an array of "points".

    I hope this helps you to start...

  3. #3
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    I assume that x[] and y[] are arrays. In which case you have N points. Each point (i) has the coordinate (x[i], y[i])
    Find the distance of its nearest neighboor? Do you just mean find the nearest neighboor? I mean you cannot know which is the nearest if you don't calculate distances.

    The algorithm is as follows in pseudo-code. Assume sqrt and pow (power)functions
    Code:
    for(i : 1...N)
       distance = sqrt(pow(x[i], 2) + pow(y[i], 2));
       store distance;
    You can store the distance in a 2D array with a NxN size. The element on (i,j) position in the array be the distance between the points i and j.

    Post what code you have until now

  4. #4
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    First some questions and an observation from what has been posted already:
    1. What do you plan on doing with the set of nearest points once you have it?
    2. How many points can you have?
    3. Do you have any experience building any kind of binary trees?
    4. How is it that you need to be solving such a potentially relatively complex thing in C when you're new to the language?


    There's no need to use a sqrt at all here. Just compare distances-squared.
    sqrt(a) < sqrt(b) if-and-only-if a < b
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  5. #5
    Registered User slingerland3g's Avatar
    Join Date
    Jan 2008
    Location
    Seattle
    Posts
    603
    When you plot your 3 points, for your pythagorean theorem to work, these points must form a right triagle (90 deg). I guess your requirements or needing you to calculate the destance of the hypotenuse and use this as your distance measurement? There is a lot more at play here unless the points given in your array are guaranteed to form right angled triangles.

  6. #6
    Registered User
    Join Date
    Nov 2009
    Posts
    60
    For the distance between two points, you don't need a third point.
    If you take two perpendicular axes X and Y, then the points (x,y) and (x1, y1) are at distance:

    sqrt((x-x1)^2 +(y-y1)^2): this formula is based on Pythagoras, but the right angle comes from the angle between the axes.

    So, zesty was right with his formula.

  7. #7
    Registered User rogster001's Avatar
    Join Date
    Aug 2006
    Location
    Liverpool UK
    Posts
    1,472
    There is a lot more at play here unless the points given in your array are guaranteed to form right angled triangles.
    i was thinking this also, there must be a rule in the problem that is not being explained to allow for the third vertice

  8. #8
    Registered User
    Join Date
    Nov 2009
    Posts
    60
    Here is my attempt for a function that takes the distance between two points:

    Code:
    #include <stdio.h>
    #include <math.h>
    
    struct point{
    	double x;
    	double y;
    }; 
    
    typedef struct point point;
    
    double distance (point a, point b); 
    
    int main(){
    	point a = {1.2, 2.3};
    	point b = {3.0, 3.1};
    	printf("%f\n", distance(a,b));
    	return 0;
    }
    
    double distance (point a, point b){
    	return  sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y));
    }
    for an explanation of the formula: LOOK HERE
    Last edited by hilarius; 11-27-2009 at 02:06 AM. Reason: explanation

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by iMalc
    # Do you have any experience building any kind of binary trees?
    hmm... ignoring possible answers to your first question for the time being, what use will a binary tree be here? I had the impression that at least N(N+1)/2 (effective) distance computations are needed since the aim is to find the distance to the nearest neighbour for each point.

    Quote Originally Posted by iMalc
    There's no need to use a sqrt at all here. Just compare distances-squared.
    That will work to reduce the amount of computation needed to find the nearest neighbour, but eventually one must take the square root to actually obtain the distance.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  10. #10
    The larch
    Join Date
    May 2006
    Posts
    3,573
    hmm... ignoring possible answers to your first question for the time being, what use will a binary tree be here? I had the impression that at least N(N+1)/2 (effective) distance computations are needed since the aim is to find the distance to the nearest neighbour for each point.
    Perhaps he means something like a quad-tree. While I haven't used such a thing myself, I understand the point is that it allows you to find points close to somewhere. Checking everything against everything else could be expensive (e.g for collision detection in a game).

    However, in this excercise, I suppose a N^2 algorithm is acceptable and expected.

    That will work to reduce the amount of computation needed to find the nearest neighbour, but eventually one must take the square root to actually obtain the distance.
    Yes, but that will be N computations of square root instead of N*(N-1) / 2.
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  11. #11
    Registered User
    Join Date
    Nov 2009
    Posts
    60
    I think this program takes the shortest distance between a list of points. My list is a simple array of points. The point is a simple struct with 2 members: x and y coordinate.

    Code:
    #include <stdio.h>
    #include <math.h>
    
    struct point{
    	double x;
    	double y;
    }; 
    
    typedef struct point point;
    
    double distance (point a, point b); 
    
    int main(){
    	int i = 0, j = 0;
    	point myPoints[5] = {{1.0, 2.0}, {1.0, 4.0}, {2.0,3.0}, {1.0, 3.0}, {3.0, 7.0}};
    	double result = distance(myPoints[0], myPoints[1]);
    	printf("%s%f\n", "eerste: ", result);
    	for (i = 0; i < 5; i++){
    		for (j = 0; j < 5; j++){
    			if (i == j)
    			 	continue;
    			if (result > distance(myPoints[i], myPoints[j]))
    				result = distance(myPoints[i], myPoints[j]);
    			printf("%d %d\n", i, j);
    			printf("%f\n", result);
    		}
    	}
    	
    	return 0;
    }
    
    double distance (point a, point b){
    	return  sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y));
    }
    I compare the counters i and j, because it doesn't make sense to compare the distance between a point and itself. So in case i == j, the program continues with a next j.
    If the new distance is smaller than the result so far, the result gets this smaller value.
    At the end of the for-loops, result has the smallest distance.

  12. #12
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    Some minor optimizations
    Code:
    int dis;
    ....
    //in for-loop
    dis = distance(myPoints[i], myPoints[j]);
    if  (result > dis)
    	result = dis;
    dis will be a variable so you won't calculate twice distance

    Code:
    double distance (point a, point b){
    	int t1 = a.x-b.x;
            int t2 = a.y-b.y;
            return sqrt(t1*t1, t2*t2);
    }
    so you avoid computing a.x-b.x and a.y-b.y twice. Though, the compiler would probably optimize the code for you anyway, just thought to point it out

  13. #13
    Registered User
    Join Date
    Nov 2009
    Posts
    60
    Quote Originally Posted by C_ntua View Post
    Some minor optimizations
    Code:
    int dis;
    ....
    //in for-loop
    dis = distance(myPoints[i], myPoints[j]);
    if  (result > dis)
    	result = dis;
    dis will be a variable so you won't calculate twice distance

    Code:
    double distance (point a, point b){
    	int t1 = a.x-b.x;
            int t2 = a.y-b.y;
            return sqrt(t1*t1, t2*t2);
    }
    so you avoid computing a.x-b.x and a.y-b.y twice. Though, the compiler would probably optimize the code for you anyway, just thought to point it out
    Thanks for these helpful hints!

  14. #14
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by laserlight View Post
    hmm... ignoring possible answers to your first question for the time being, what use will a binary tree be here? I had the impression that at least N(N+1)/2 (effective) distance computations are needed since the aim is to find the distance to the nearest neighbour for each point.
    If the number of points is very large then building a 2D-Tree or Quad-Tree for example would allow one to find the nearest neighbour in less than O(n*n) time.
    Sure I could have waited to get an answer to how many points there are first before asking that one, but as we can see it's taking long enough to get a second post as it is.

    That will work to reduce the amount of computation needed to find the nearest neighbour, but eventually one must take the square root to actually obtain the distance.
    Yes at that step it would be needed, but not before.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  15. #15
    Registered User
    Join Date
    Dec 2007
    Posts
    16
    thanks everyone for all your help! sorry it has taken me so long to reply..

    at the moment, i am simply looking for a computation of the order of N^2 where N is the number of points labelled (x(i),y(i)) that I have. Once I have done this, then I will investigate ways to improve my algorithm to lower orders, but just want to deal with the simplest case at the moment.

    As, I only have a 2D array, the distance between any two points is of the form as I wrote in my original post. My aim was to take each point, compute the DISTANCES to every other point, and then take the minimum of these distances in order to find it's nearest neighbour. The value of this minimum distance would then be substituted into a seperate formula.

    All your posts have been a great help, thanks so much!

    zesty
    xx

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. For the numerical recipes in C types!
    By Smattacus in forum C Programming
    Replies: 5
    Last Post: 10-28-2008, 07:57 PM
  2. Distance Formula in my program..... I need help fast!!!
    By Mackology101 in forum C Programming
    Replies: 3
    Last Post: 09-23-2004, 10:10 PM
  3. shortest path problems
    By talz13 in forum C++ Programming
    Replies: 7
    Last Post: 05-08-2004, 06:13 AM
  4. Distance Formula Implecations: Urgent!
    By KneeLess in forum C Programming
    Replies: 6
    Last Post: 03-20-2004, 10:52 PM
  5. Fuzzy Logic
    By ygfperson in forum A Brief History of Cprogramming.com
    Replies: 10
    Last Post: 10-13-2002, 04:58 PM