Thread: finding nearest neighbour distance

  1. #16
    Registered User
    Join Date
    Dec 2007
    Posts
    16
    and the number of points that I am dealing with will be of the order of N= 1000 say, so not very large,,

  2. #17
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    A few people have posted code, most notably 'hilarius', so you must have gotten started now. Perhaps you'd like to post the code you have now, and let us know what part you're finding difficult.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  3. #18
    Registered User
    Join Date
    Dec 2007
    Posts
    16
    the first bit of my code is as follows so far:

    insert
    Code:
    struct point{
    	double x;
    	double y;
    }; 
    typedef struct point point;
    
    double distance (point a, point b); 
    
    double distancex (point a, point b);
    
    double distancey (point a, point b);
    
    int main(){
    int i = 0, j = 0;
    double entropyxy, entropyx, entropyy;
    double dis, result;
    double disx, resultx;
    double disy, resulty;
    double digammaone, digammaN;
    double mutual_information;
    
    
    point myPoints[3] = {{1.0, 1.0}, {2.0, 3.0}, {3.0, 4.0}};
    	 
    	 
    	
    	 
    		for (i = 0; i < 3; i++){
    		
    
    		
    		result = distance(myPoints[i], myPoints[i+1]);
    		printf("first distance:%f\n", result);
    		
    		for (j = 0; j < 3; j++){
    		
    		
    		
    			if (i == j)
    			 continue;
    				
    				
    				
               dis = distance(myPoints[i], myPoints[j]);
                printf("%d %d\n", i, j);
                printf("%f\n", dis);
    				
               if (result > dis)
               result = dis;
    		 
    			
              
    	}
    	
    		   printf("shortest: %f\n", result);
    	    
    		   if(result != 0){entropyxy += (log(2*result));}
    		   printf("entropyxy: %f\n", entropyxy);
    		   
    	
    	}
    what I need help with is the first bit. As it stands I have only worked with 3 points. But what I want to do is use a data file as my input. This data file contains two columns: first is x values, the second column are all the y values. Each row thus makes 1 point.

    So I have a N x 2 array in my input file, where N is the number of points. How do I use this as my input rather than just the 3 points I have in my code at the moment? I understand that I will have to loop over N, and scanf each in but am having trouble with the syntax.

    Any help would be appreciated,

    Thanks

    zesty

  4. #19
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Many points are so far away that you do not have to calculate the actual distance to prove that the point is not the closest point.

    Suppose you have the current closest distance to a neighboring point in variable d. Then, if you are considering a new point with x distance dx and y distance dy, then if dx >= d or dy >= d, you know that this point cannot be closer, and you can avoid squaring numbers and taking square roots.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  5. #20
    Registered User slingerland3g's Avatar
    Join Date
    Jan 2008
    Location
    Seattle
    Posts
    603
    Quote Originally Posted by brewbuck View Post
    Many points are so far away that you do not have to calculate the actual distance to prove that the point is not the closest point.

    Suppose you have the current closest distance to a neighboring point in variable d. Then, if you are considering a new point with x distance dx and y distance dy, then if dx >= d or dy >= d, you know that this point cannot be closer, and you can avoid squaring numbers and taking square roots.


    There is a reason behind squaring. Will your solution hold true for all points across any quadrant?

  6. #21
    The larch
    Join Date
    May 2006
    Posts
    3,573
    I don't see why it wouldn't. You can work with squared values and only compute the square root only once.

    It doesn't matter if you compare the squared distances (x1 - x2) ** 2 + (y1 - y2) ** 2 or the square roots of it. Similarly, the distance is not going to be smaller, if either (x1 - x2) ** 2 or (y1 - y2) ** 2 is larger than the currently smallest distance squared.

    With 1000 points, the time it takes to calculate is almost immeasurable anyway, but with more points, adding this optimization indeed reduces the time a bit (I'm seeing about 1/5 - 1/6 speed-up).
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  7. #22
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Quote Originally Posted by brewbuck View Post
    Many points are so far away that you do not have to calculate the actual distance to prove that the point is not the closest point.

    Suppose you have the current closest distance to a neighboring point in variable d. Then, if you are considering a new point with x distance dx and y distance dy, then if dx >= d or dy >= d, you know that this point cannot be closer, and you can avoid squaring numbers and taking square roots.
    if you know current "optimal" differencies DeltaX and DeltaY you can also immediatle discard all point that have (dx>=DeltaX && dy >= DeltaY) || (dx >= DeltaY && dy >= DeltaX) and thus decreasing even more the number of points to be actually checked using full metrics
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. For the numerical recipes in C types!
    By Smattacus in forum C Programming
    Replies: 5
    Last Post: 10-28-2008, 07:57 PM
  2. Distance Formula in my program..... I need help fast!!!
    By Mackology101 in forum C Programming
    Replies: 3
    Last Post: 09-23-2004, 10:10 PM
  3. shortest path problems
    By talz13 in forum C++ Programming
    Replies: 7
    Last Post: 05-08-2004, 06:13 AM
  4. Distance Formula Implecations: Urgent!
    By KneeLess in forum C Programming
    Replies: 6
    Last Post: 03-20-2004, 10:52 PM
  5. Fuzzy Logic
    By ygfperson in forum A Brief History of Cprogramming.com
    Replies: 10
    Last Post: 10-13-2002, 04:58 PM