Thread: Finding indices efficiently

1. Finding indices efficiently

Hey all,

I have a three dimensional int array of 1s and 0s. I need a reasonably efficient way to, given an index i,j,k of a cell containing 0, find the index of the nearest cell that contains a 1. Here I define nearest as euclidian distance, ie sqrt((i-x)^2 + (j-y)^2 +(k-z)^2)). It is quite a simple matter to brute force it, by nesting three loops and testing each cell that contains a one for proximity to the cell of interest and then taking the minimal one, but give that this array may have several 10s of millions of cells this may take longer than I would like.

So: any suggestions on making this a little more efficient? Ideally I would like to find a way to only search for ones within a certain radius of a given cell, but I am stuck with the same problem of determining which cells are in that radius without looping through the whole array.

2. Originally Posted by KBriggs
It is quite a simple matter to brute force it, by nesting three loops and testing each cell that contains a one for proximity to the cell of interest and then taking the minimal one
A slightly less "brute" force method would be to check concentric circles (more accurately: cubes) around the cell, working out from the center. So the first circle would be 26 cells, the next one 128, and so on. You'd catch ties easily that way too.

I guess that is what you meant by radius but I don't see how there's a difficulty there if you have i, j, & k.

3. Originally Posted by KBriggs
Hey all,

I have a three dimensional int array of 1s and 0s. I need a reasonably efficient way to, given an index i,j,k of a cell containing 0, find the index of the nearest cell that contains a 1. Here I define nearest as euclidian distance, ie sqrt((i-x)^2 + (j-y)^2 +(k-z)^2)). It is quite a simple matter to brute force it, by nesting three loops and testing each cell that contains a one for proximity to the cell of interest and then taking the minimal one, but give that this array may have several 10s of millions of cells this may take longer than I would like.

So: any suggestions on making this a little more efficient? Ideally I would like to find a way to only search for ones within a certain radius of a given cell, but I am stuck with the same problem of determining which cells are in that radius without looping through the whole array.
Well, for one thing, you could forgo the sqrt step during the testing phase.

4. Originally Posted by Sebastiani
Well, for one thing, you could forgo the sqrt step during the testing phase.
Yeah, I know it's gimpy if you like math but you can get that radius with addition and subtraction.

5. Working out from the middle is perfect, but I am having some trouble working out the algorithm in my mind. Something like this, I think, but could you tell me if I am on the right track here for checking concentric cubes of "radius" 1 to nlim, where nlim is just the point at which the nearest 1 is too far away to matter?

Code:
```for (n = 1; n < nlim; n++)
{
for (x = i-n; x <= i+n; x++)
{
for (y = i-n; y <= i+n; y++)
{
for (z = i-n; z <= i+n; z++)
{
do checking...```
Seems to work in my mind, but could you critique? (I am ignoring edge effects for the moment)

Now, another improvement would be to check only the appropriate shell, rather then the entire cube again. But that I can't get my head around just yet. Any hints?

BTW, I need the euclidian distance at a later phase of the algorithm anyway, so I might as well find it now.

6. Are you storing say bits in the X dimension in consecutive bits in a machine word?

How sparse is the array (of 1's in a sea of 0's) ?

If it is sparse, you might be able to reject a whole set of possible coordinates with a simple test.

7. It is likely quite sparse, yes. On the order of a few 10's of thousands of ones in several 10 of millions of zeroes.

Are you storing say bits in the X dimension in consecutive bits in a machine word?
I am not really sure. I just calloc'd a 3D array and went from there.

Code:
```int ***int_calloc3D(int ***array, int x, int y, int z)
{
int i,j;
array = (int ***) calloc(x,sizeof(int **));
if (array==NULL)
{
printf("Error on level one int calloc\n");
abort();
}
for (i = 0; i < x; i++)
{
array[i] = (int **) calloc(y,sizeof(int *));
if (array[i]==NULL)
{
printf("Error on level two int calloc\n");
abort();
}
for (j = 0; j < y; j++)
{
array[i][j] = (int *) calloc(z,sizeof(int));
if (array[i][j]==NULL)
{
printf("Error on level three int calloc\n");
abort();
}
}
}
return array;
}```

8. Well if you're only doing
array[x][y][z] = 0; // or 1
then you're wasting a lot of bits.

Plus, there is a trick to allocating the entire block of memory in a single block, and keeping the [x][y][z] notation as well. Locating neighbours in a single block of memory may be easier if you can do simple relative offsets from your current position.

9. Could you elaborate?

10. Code:
```#include <stdio.h>
#include <stdlib.h>

int ***makeArray ( int x, int y, int z )
{
int ***pa = malloc( x * sizeof *pa );
int **pb = malloc( x * y * sizeof *pb );
int *pc = calloc( x * y * z, sizeof *pc );
int r, c;
for ( r = 0 ; r < x ; r++, pb += y ) {
pa[r] = pb;
for ( c = 0 ; c < y ; c++, pc += z ) {
pa[r][c] = pc;
}
}
return pa;
}
void freeArray ( int ***array )
{
free( array[0][0] );
free( array[0] );
free( array );
}

// With the sizes of y and z being 16 and 64 respectively, on
// a machine with 4-byte integers, the number of bytes occupied
// by each z is 0x100 bytes, and the number of bytes for each
// plane of y*z is 0x1000 bytes.
// For an address of 0xnnnnYXnn, then X increments in each row
// and Y increments in each column.
// This basically shows the whole array is one contiguous block of memory
void showAddresses ( int ***a, int x, int y, int z )
{
int r, c;
for ( r = 0 ; r < 10 && r < x ; r++ ) {
printf( "Addresses on %d: ", r );
for ( c = 0 ; c < 5 && c < y ; c++ ) {
printf( "%p ", (void*)&a[r][c][0] );
}
printf( "\n" );
}
}

void useArray ( int ***a, int x, int y, int z )
{
// set some values
a[5][5][5] = 1;
a[5][4][5] = 2; // directly left/right
a[6][5][5] = 3; // directly above/below
a[2][7][9] = 279;   // something further away

int *c = &a[5][5][5];   // our centre element

printf( "Centre=%d\n", c[0] );

// The three () expressions show the pitch to get from the current
// position to the neighbouring position in each dimension
printf( "Left/Right=%d\n", c[(0) + (-1*z) + (0)] );     // [0][-1][0] from [5][5][5]
printf( "Above/Below=%d\n", c[(1*y*z) + (0) + (0)] );   // [1][0][0]
printf( "Far away=%d\n", c[(-3*y*z) + (2*z) + (4)] );   // [-3][2][4]
}

int main(void)
{
int xs = 10, ys = 16, zs = 64;
int ***a = makeArray( xs, ys, zs );
showAddresses( a, xs, ys, zs );
useArray( a, xs, ys, zs );
freeArray( a );
return 0;
}```

11. That is really neat, but I am not sure that I fully understand how it works ^_^

In these lines

Code:
``` int ***pa = malloc( x * sizeof *pa );
int **pb = malloc( x * y * sizeof *pb );
int *pc = calloc( x * y * z, sizeof *pc```
what are sizeof(*pa), sizeof(*pb) etc?

is
*pa the same as int **
*pb the same as int *
*pc the same as int
?

The purpose of these loops:

Code:
```    for ( r = 0 ; r < x ; r++, pb += y ) {
pa[r] = pb;
for ( c = 0 ; c < y ; c++, pc += z ) {
pa[r][c] = pc;
}
}
return pa;```
is to ensure that the array ends up being one contiguous block of memory? I was under the impression that arrays assigned my calloc were contiguous in memory, but I guess that is not right?

But Ok, an array that I can easily do offsets from a point of interest is a very useful thing, so thank you. I would very much like if you could walk me through a more detailed explanation of how the above differs from regularly calloc'd memory though.

12. > what are sizeof(*pa), sizeof(*pb) etc?
Exactly what you think they are. Except it's done by the compiler rather than you having to make sure the sizeof(type) is still correct.

> is to ensure that the array ends up being one contiguous block of memory?
pc is the single contiguous block. pa and pb are just arrays of pointers that allow you to perform the [x][y][z] subscripting on it, by setting up suitable pointers (pa points to somewhere in pb, pb points to somewhere in pc)

> I was under the impression that arrays assigned my calloc were contiguous in memory
calloc is the same as malloc, except for the "clears memory" side effect.
Each block is contiguous in itself, but adjacent malloc calls can return memory from very different places.

13. Alright, so using your array above, assuming that I have set c to be my cell of interest and ignoring edge effects for the moment, I would like to work out an algorithm to work outward from the central point. I posted a code snippet earlier that I think was missed, if I modify it so:

Code:
```for (n = 1; n < nlim; n++)
{
for (p = i-n; p <= i+n; p++)
{
for (q = j-n; q <= j+n; q++)
{
for (r = k-n; r <= k+n; r++)
{
do something to c[(p*y*z) + (q*z) + (r)]
...```
Then I have a loop that will cover all 3^3 cells in the cube immediately surrounding c, then the 4^4 cells in the cube surrounding that, etc. I would like to be able to visit each cell only once, ie, once I have finished with the 27 in the immediate vicinity of c, I would like to be able to check only the cells that are in the shell surrounding it rather than the entire cube.

I am having trouble wrapping my head around how to modify the limits of the indices to do that, so if anyone has any hints while I mess around with the code, feel free to nudge me in the right direction

EDIT: actually the above code loops is quite wrong, and doesn't loop out from the centre at all. Hrm...

Further EDIT: reason being that since we are doing relative offsets we no longer need the index of the ell of intrest in the loop, so the following loops out from the centre:

Code:
```for (n = 1; n < nlim; n++)
{
for (p = -n; p <= n; p++)
{
for (q = -n; q <= n; q++)
{
for (r = -n; r <= n; r++)
{
do something to c[(p*y*z) + (q*z) + (r)]
...```

14. That seems about right
Drawing lots of diagrams, and testing on really small volumes (say 3x4x5).

If you only want to visit the edge squares on each iteration, then I guess it would be
for (p = i-n; p <= i+n; p+=2*n)

15. hm, not quite, because adding 2*n each time just visits the corners of the cubes, not the faces.

Code:
```for (n = 1; n < nlim; n++)
{
for (p = -n; p <= n; p++)
{
for (q = -n; q <= n; q+=((p==-n) || (p==n)) ? 1 : 2*n)
{
for (r = -n; r <= n; r+=(((p==-n)||(q==-n))||((p==n)||(q==n))) ? 1 : 2*n)
{```

Blarg, this indexing is bending my mind. I -almost- have it. For the case n = 3, this is visiting 104 of the 124 it should be ^_^

I must be missing a condition somewhere... Seems to be missing the cells in line with the central one, and a few others that I can't yet see the pattern.

Popular pages Recent additions