# Thread: For the numerical recipes in C types!

1. ## For the numerical recipes in C types!

Hey guys,

I'm looking for a little input on something I'm about to implement. I figure some of you guys might be interested in thinking about this a little bit if you haven't already before.

So, I have two lists of objects that I have positions of in RA and Dec - essentially spherical coordinates for the uninitiated. One list of objects is rather short - about 100 or less - and the other is potentially quite long - perhaps 2 or 3 thousand.

What needs to be done is to match each one of these objects in the short list to the closest object in the second list. So, obviously, the slow way to do this is to take object 1 from the short list and compare its distance against each one of the long list. I am saying this just to illustrate what needs to be done!

So, I'm thinking of implementing a KD tree form from Numerical Recipes in C in order to do this. Basically, I'll just construct the tree from the large list, and then feed every element from the small list into it and pull out the value that I need. Do any of you gurus out there have any input on perhaps efficient ways to do this, or other methods I should look at? I'm just looking for a little input on alternatives!

Thanks in advance for your help! 2. I'll be watching this thread. I have some code that does what the OP is trying to avoid. It gets the job done, but it's painfully slow.

Note to self - read Numerical Recipes in C.... 3. I think I'm going to use a KD tree, as it should be quite fast.

Making the tree has a complexity of N log N, and matching one point to it takes log N time, where N is the size of the list I'm making into a tree. Since I'll be matching N2 points, the complexity of the matching operations in total becomes N2 log N, plus N log N for making the tree. Obviously, this is way faster than N2 * N.

When I've implemented it and have it working I'll put some code up for perusal.

Wow, Katy! I went to Memorial High School right next door in Houston! 4. Originally Posted by Smattacus Wow, Katy! I went to Memorial High School right next door in Houston!
Our football team could kick your football team's hiney!! Comment your code well please! 5. Sup Dino, hopefully you still have a subscription on this. I'm posting this since you said you'd be interested in the solution. So I've implemented this KD Tree and got it working, though there are a few caveats since I've tailored it to my specific application. There are a few weird things running around, like the fact that I never use the
Code:
`Box`
struct; instead I just put the Box info in the Boxnode struct and use that all the way through. Here is the code that I've used.

First, here is the header file.:
Code:
```/*
* kdtree.h
*
*  Created on: Oct 21, 2008
*      Author: Sean Mattingly
*
*      A 2 dimensional implementation of a KD tree.
*
*      The one function that is different than a normal implementation
*      is that the distance function is a plane approximation of the sky,
*      assuming the coordinates are in RA and Dec for dimensions 1 and 2
*      respectively, rather than cartesian coords x and y.
*/

#define BIG 1.0e99
#define TRUE 1
#define FALSE 0

typedef struct {
double x;
}Point;

typedef struct {
Point hi, low;
} Box;

/**
* Boxnode. Mother box, two daughter boxes, and a
* low and high index on a list of points - indicates
* the range of points inside the box.
*/
typedef struct {
int mom, dau1, dau2, ptlo, pthi;
Point hi, low;
} Boxnode;

typedef struct {
int nboxes, npts;
Point *ptss;
Boxnode *boxes;
int *ptindx, *rptindx;
double *coords;
} KDtree;```
Secondly, here is the source file containing all the routines needed to construct and build this thing, with comments and warnings with this particular implementation. Again, reading these comments is essential because I've tailored a number of aspects to my implementation, but it should not be hard to modify. Specifically, I use a slightly different distance measure and it currently is only applied to two dimensions because that's all I'll use - however, it would be easy to extend it.

If you're able to get your hands on a copy of Numerical Recipes 3rd edition, check chpt 21, sections 1 and 2. It's an excellent description of what's going on here, though the code they have written is all in C++. I've based this off of that, but this is all in C.

If you have any questions about this, post here and I'll help out!

Code:
```/*
* kdtree.c
*
*  Created on: Oct 21, 2008
*      Author: Sean Mattingly
*
*      This is a KD tree implementation based off of Numerical
*      Recipes in C; however, some changes have been made. This
*      is specifically tailored to be done on two sets of RA and Dec points,
*      and as a result, can only be used in two dimensions - not three!
*
*      In addition, the distance operator is slightly different than a normal pythagorean distance.
*      Rather, it is based off a plane approximation of the sky using RA and Dec coords:
*
*      distance = sqrt( ( (RA_1 - RA_2) * cos(dec) )^2 + (Dec_1 - Dec_2)^2)
*
*		where dec is either (dec1 + dec2) /2 or simply dec1 or dec2. In this implementation
*		it is the average.
*
*      One can see that, in fact, this is a pythagorean distance, but in RA and Dec.
*
*      No implementation currently exists for finding the nearest neighbor within the tree of a point
*      within the tree, but the framework exists (see: disti()) so it may be easily added.
*/

#include "kdtree.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#define DEGTORA 0.0174532925
#define DIM 2
Point PConst(double, double);
int nearest(Point, KDtree *, double);
int locate(Point, KDtree *);
Point Copy_Point(Point);
void Point_Equals(Point *, Point *);
int Point_Bool_Equals(Point *, Point *);
double PDist(Point *, Point *);
double BDist(Boxnode *, Point *);
Boxnode BNConst(Point, Point, int, int, int, int, int);
Point PConst(double, double);
int selecti(int, int *, int, double *);
void SwapInt(int *, int *);

/**
* KD Tree constructor.
* *pts is a pointer to an array of points to be organized
* npts is the size of the array
* *tree is a pointer to a tree struct. It only needs to be
*  initialized and pointing to a KDtree, this constructor will take care
*  of all the allocation necessary.
*
*  This constructor is written in the form of a "pending task list"
*  in order to avoid the overhead from recursion calls. Each new daughter box
*  has a index of its mother and its beginning and end in the array ptindx.
*/
kdtree_const(Point *pts, int npts, KDtree *tree) {
int ntmp, m, k, kk, j, nowtask, jbox, np, tmom, tdim, ptlo, pthi, i;
int *hp;
tree->npts = npts;
tree->ptss = pts;
double *cp;
int taskmom, taskdim;

//Allocate memory chunks.
tree->ptindx = (int *) calloc(tree->npts, sizeof(int));
tree->rptindx = (int *) calloc(tree->npts, sizeof(int));

for (k = 0; k < tree->npts; k++) tree->ptindx[k] = k;
m = 1;
for (ntmp = tree->npts; ntmp; ntmp >>=1) {
m <<= 1;
}
tree->nboxes = 2 * npts - (m >> 1);
printf("Boxes is %i, m is %i\n", tree->nboxes, m);
if (m < tree->nboxes) tree->nboxes = m;
tree->nboxes--;
tree->boxes = (Boxnode *) calloc(tree->nboxes, sizeof(Boxnode));

tree->coords = (double *) calloc(DIM * tree->npts, sizeof(double));

for (j = 0, kk = 0; j < DIM; j++, kk += npts) {
for (k = 0; k < npts; k++) tree->coords[kk + k] = tree->ptss[k].x[j];
}
Point lo = PConst(-BIG, -BIG);
Point hi = PConst(BIG, BIG);
tree->boxes = BNConst(lo, hi, 0, 0, 0, 0, npts - 1);
jbox = 0;

taskmom = 0; //Which box
taskdim = 0; //Which dimension
nowtask = 1;

while(nowtask) {
tmom = taskmom[nowtask];
tdim = taskdim[nowtask--];
ptlo = tree->boxes[tmom].ptlo;
pthi = tree->boxes[tmom].pthi;
hp = &(tree->ptindx[ptlo]); //Points at left end of subdivision
cp = &(tree->coords[tdim*npts]); //Points to coordinate list for current dim
np = pthi - ptlo + 1; // # of points in the subdivision
kk = (np - 1) / 2; // Index of last point on left (boundary point)
selecti(kk, hp, np, cp); //This is where all the work is done. See the selecti() comments.
//Create daughters and push them on the task list if they need further subdividing.
hi = PConst(tree->boxes[tmom].hi.x, tree->boxes[tmom].hi.x);
lo = PConst(tree->boxes[tmom].low.x, tree->boxes[tmom].hi.x);
hi.x[tdim] = tree->coords[tdim * npts + hp[kk]];
lo.x[tdim] = tree->coords[tdim * npts + hp[kk]];
tree->boxes[++jbox] = BNConst(tree->boxes[tmom].low, hi, tmom, 0, 0, ptlo, ptlo + kk);
tree->boxes[++jbox] = BNConst(lo, tree->boxes[tmom].hi, tmom, 0, 0, ptlo + kk + 1, pthi);
tree->boxes[tmom].dau1 = jbox - 1;
tree->boxes[tmom].dau2 = jbox;
if (kk > 1) {
taskmom[++nowtask] = jbox - 1;
taskdim[nowtask] = (tdim + 1) % DIM;
}
if (np - kk > 3) {
taskmom[++nowtask] = jbox;
taskdim[nowtask] = (tdim + 1) % DIM;
}
}
for (j = 0; j < tree->npts; j++) tree->rptindx[tree->ptindx[j]] = j; //Create the reverse index.
free(tree->coords);
}

/**
* Deallocates the memory used in the KDtree.
* NOTE: This will deallocate ptss, which is pointed
* to the array that is passed in, and not generated
* anew
*/
void KDtree_Clean(KDtree *tree) {
free(tree->boxes);
free(tree->coords);
free(tree->ptss);
free(tree->ptindx);
free(tree->rptindx);
}

/**
* Returns the index of the nearest point in the kdtree to the
* given point.
*
* NB: This function will test the closest distance that is found, and
* return -1 if it fails the closest distance test. This is done to avoid
* recalculating the distance.
* NOTE: Pass nrst_test = 0 to bypass this.
*/
int nearest(Point pt, KDtree *tree, double nrst_test) {
int i, k, nrst, ntask;
int task; //Stack for boxes (2 ^ 50 max!) waiting to be opened.
double dnrst = BIG; double temp_d;
double d;
//First, we find the nearest kdtree point in the box as pt.
k = locate(pt, tree); //Which box is pt in?

for(i = tree->boxes[k].ptlo; i <= tree->boxes[k].pthi; i++) { //Find nearest.
d = PDist(&tree->ptss[tree->ptindx[i]], &pt);
if (d < dnrst) {
nrst = tree->ptindx[i];
dnrst = d;
}
}
//Second stage: Traverse the tree opening only possibly better boxes.
task = 0;
ntask = 1;
while (ntask) {
k = task[ntask--];
if (BDist(&tree->boxes[k], &pt) < dnrst) { //Distance to closest point in box.
if (tree->boxes[k].dau1) {
task[++ntask] = tree->boxes[k].dau1;
task[++ntask] = tree->boxes[k].dau2;
} else { //Check the 1 or 2 points in the box.
for (i = tree->boxes[k].ptlo; i <= tree->boxes[k].pthi; i++) {
d = PDist(&tree->ptss[tree->ptindx[i]], &pt);
if (d < dnrst) {
nrst = tree->ptindx[i];
dnrst = d;
}
}
}
}
}
//Test for my specific application, see NB.
if(nrst_test == 0 || dnrst < nrst_test) return nrst;
else return -1;
}

/**
* Given an arbitrary point pt, return the index of which kdtree box it is in.
*/
int locate(Point pt, KDtree *tree) {
int nb, d1, jdim;
nb = 0; jdim = 0;
while(tree->boxes[nb].dau1) {
d1 = tree->boxes[nb].dau1;
if (pt.x[jdim] <= tree->boxes[d1].hi.x[jdim]) nb = d1;
else nb = tree->boxes[nb].dau2;
jdim = ++ jdim % DIM;
}
return nb;
}

/**
* Finds the distance between two points in a kdtree. Returns BIG if the points
* are identical.
*/
double disti(int jpt, int kpt, KDtree *tree) {
if (jpt == kpt)
return BIG;
return PDist(&tree->ptss[jpt], &tree->ptss[kpt]);
}

/**
* Returns a copy of a point.
*/
Point Copy_Point (Point p) {
int i;
Point new;
for (i = 0; i < 2; i++) new.x[i] = p.x[i];
return new;
}

/**
* Assignment operators. Makes the point on the left equal
* to the point on the right, ie, l = r.
* Inputs:
* 		Point l 	- Point struct on the Left side
* 		Point r		- Point struct on the right side
*/
void Point_Equals(Point *l, Point *r) {
int i;
for (i = 0; i < 2; i++) l->x[i] = r->x[i];
}

/**
* Boolean equals operator. Returns TRUE or FALSE.
*/
int Point_Bool_Equals(Point *l, Point *r) {
int i;
for (i = 0; i < 2; i++) {
if (l->x[i] != r->x[i]) return FALSE;
}
return TRUE;
}

/**
* Point Distance operator. Uses the plane approximation of the sky.
*/
double PDist(Point *p, Point *q) {
double dist;
double dec, RA_diff, Dec_diff;

dec = (p->x + q->x ) / 2.0;
RA_diff = (p->x - q->x);
Dec_diff = (p->x - q->x);
//	printf("RA_diff is %f, Dec_diff is %f\n", RA_diff, Dec_diff);

dist = sqrt(pow((RA_diff) * cos(dec * DEGTORA), 2) + pow((Dec_diff), 2));
return dist;
}

/**
* Box distance operator. If point P lies outside box b, the distance
* to the nearest point is returned. If p is inside b or on its surface, zero is returned.
*/
double BDist(Boxnode *b, Point *p) {
double dd = 0.0;
double dec, RA_diff, Dec_diff;
if (p->x < b->low.x) {
dec = (p->x + b->low.x) / 2.0;
RA_diff = (p->x - b->low.x);
dd += pow((RA_diff) * cos(dec * DEGTORA), 2);
}
if (p->x > b->hi.x) {
dec = (p->x + b->hi.x) / 2.0;
RA_diff = (p->x - b->hi.x);
dd += pow((RA_diff) * cos(dec * DEGTORA), 2);
}
if (p->x < b->low.x) {
dd += pow((p->x - b->low.x), 2);
}
if (p->x > b->hi.x) {
dd += pow((p->x - b->hi.x), 2);
}
return sqrt(dd);
}

/**
* Box constructor.
*/
void BConst(Point low, Point hi, Box *b) {
b->hi = Copy_Point(hi);
b->low = Copy_Point(low);
return;
}

/**
* Box node constructor.
*/
Boxnode BNConst(Point mylow, Point myhi, int mom, int d1, int d2, int myptlo, int mypthi) {
Boxnode bn;
bn.hi = myhi;
bn.low = mylow;
bn.mom = mom;
bn.dau1 = d1;
bn.dau2 = d2;
bn.ptlo = myptlo;
bn.pthi = mypthi;
return bn;
}

/**
* Point constructor.
*/
Point PConst(double d1, double d2) {
Point p;
p.x = d1;
p.x = d2;
return p;
}

/**
* A selection sort that partitions an array of integers to index a separate array of values.
* The separate array of values is not modified. O(N) complexity. Returns indx[k].
*
* More specifically:
* Permutes the indx[0...n-1] to make arr[indx[0...k-1]]<= arr[indx[k]] <= arr[indx[k+1...n-1]].
* arr is not modified.
*/
int selecti(int k, int *indx, int n, double *arr) {
int i, ia, iright, j, left, mid;
double a;

left = 0;
iright = n - 1;
for( ; ;) {
if (iright <= left +1) { //Active partition contains 1 or 2 elements.
if (iright == left + 1 && arr [indx[iright]] < arr[indx[left]]) // 2 elems
SwapInt(&indx[left], &indx[iright]);
return indx[k];
} else {
//Median of left, center, and right. Rearrange in order.
mid = (left + iright) >> 1;
SwapInt(&indx[mid], &indx[left + 1]);
if (arr[indx[left]] > arr[indx[iright]]) SwapInt(&indx[left], &indx[iright]);
if (arr[indx[left + 1]] > arr[indx[iright]]) SwapInt(&indx[left + 1], &indx[iright]);
if (arr[indx[left]] > arr[indx[left + 1]]) SwapInt(&indx[left], &indx[left + 1]);
i = left + 1;// Initialize pointers for partition.
j = iright;
ia = indx[left + 1];
a = arr[ia]; // Partitioning element.
for ( ; ; ) {
do i++; while (arr[indx[i]] < a);
do j--; while (arr[indx[j]] > a);
if (j < i) break;
SwapInt(&indx[i], &indx[j]);
}
indx[left + 1] = indx[j];
indx[j] = ia;
if (j >= k) iright = j - 1;
if (j <= k) left = i;
}
}
}

/*
* SwapInts two elements in an array.
* Really, SwapInts the contents of any two pointers
* to doubles. Can be used for this as well.
*/
void SwapInt(int *a, int *b) {
int temp;
temp = *a;
*a = *b;
*b = temp;
}``` 6. So totally cool! I'll have to study this to absorb it. I did pick up a copy of NR3. Great book.

Thanks for coming back and posting it.    Popular pages Recent additions 