A 300 * 300 * 300 array has 27 million elements. Each element (if they are float) takes up 4 bytes, so each array takes up 108 million bytes, or about 103MB. (Plus 300 + (300 * 300) pointers, but that's about 0.3% of the total size, so we don't really worry about that.) 30 x 100MB makes roughly 3GB, which is more than a single process can address under Linux or Windows in 32-bit mode (typically 2GB to 3GB of user address space).
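To make the arithmetic easy to re-check, here is a small sketch. The function names are made up for illustration, and the 4-byte element and 4-byte pointer sizes are assumptions that match float on a 32-bit system.

```c
/* Bytes of element data in an n * n * n array. */
unsigned long long data_bytes(unsigned long long n, unsigned long long elem_size)
{
    return n * n * n * elem_size;
}

/* Bytes of pointer tables: n first-level pointers plus n * n second-level ones. */
unsigned long long pointer_bytes(unsigned long long n, unsigned long long ptr_size)
{
    return (n + n * n) * ptr_size;
}
```

With n = 300 and 4-byte sizes, data_bytes gives 108,000,000 bytes per array (roughly 103MB when 1MB is 1024 * 1024 bytes) and pointer_bytes gives 361,200 bytes, about 0.3% of that. Thirty such arrays need 3,240,000,000 bytes of element data.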

You will need to find a different approach. My guess would be that your arrays are "sparse", meaning that they are only partly populated with meaningful data. The solution then is to implement a more dynamic approach to storing the data.

There are several ways to implement the sparse arrays; a few ideas:

- a linked list of index and content,

- an array with blocks of "start, end, array of content",

- always allocate a whole 300-entry row, but only for the [x][y] positions that actually hold data.

The last case may be the simplest to implement:

- In this case, you don't really need a "get content (x, y, z)" type function: just allocate a single 300-entry array to serve as the shared "empty" row, and assign its address to every [x][y] entry (in your example) that has no data. This address has to be stored somewhere so that you can compare against it later when you write data. You still need a "store data to (x, y, z)" function, which allocates a real 300-entry row the first time it writes to an [x][y] entry that still points at the empty row.
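A rough sketch of that last idea, not a finished implementation. The names SIZE, empty_row, sparse_create, sparse_store and sparse_free are all made up for illustration, and I am assuming float elements and that "empty" means zero.

```c
#include <assert.h>
#include <stdlib.h>

#define SIZE 300

/* The one shared row that every unpopulated [x][y] entry points to.
   It is zero-filled (static storage) and must never be written through. */
static float empty_row[SIZE];

/* Build the two pointer levels; every [x][y] entry starts out "empty". */
float ***sparse_create(void)
{
    float ***a = malloc(SIZE * sizeof *a);
    if (a == NULL)
        return NULL;
    for (int x = 0; x < SIZE; x++) {
        a[x] = malloc(SIZE * sizeof *a[x]);
        if (a[x] == NULL) {
            /* Undo the tables allocated so far. */
            while (x-- > 0)
                free(a[x]);
            free(a);
            return NULL;
        }
        for (int y = 0; y < SIZE; y++)
            a[x][y] = empty_row;
    }
    return a;
}

/* Store a value, giving the row its own storage on first write.
   Returns 0 on success, -1 if memory runs out. */
int sparse_store(float ***a, int x, int y, int z, float value)
{
    assert(x >= 0 && x < SIZE && y >= 0 && y < SIZE && z >= 0 && z < SIZE);
    if (a[x][y] == empty_row) {
        float *row = calloc(SIZE, sizeof *row);
        if (row == NULL)
            return -1;
        a[x][y] = row;
    }
    a[x][y][z] = value;
    return 0;
}

/* Free only the rows that were really allocated, then the tables. */
void sparse_free(float ***a)
{
    for (int x = 0; x < SIZE; x++) {
        for (int y = 0; y < SIZE; y++)
            if (a[x][y] != empty_row)
                free(a[x][y]);
        free(a[x]);
    }
    free(a);
}
```

Reading stays a plain a[x][y][z], which returns 0 for anything never stored. All writes have to go through sparse_store, because a direct write to an entry that still points at empty_row would change every "empty" position at once.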

Another note: You should take care when you pick the order of the dimensions - there is a difference between these two loops:

Code:

int x, y, z;
for(z = 0; z < size; z++)
{
    for(y = 0; y < size; y++)
    {
        for(x = 0; x < size; x++)
        {
            use(array[x][y][z]);
        }
    }
}

and

Code:

int x, y, z;
for(z = 0; z < size; z++)
{
    for(y = 0; y < size; y++)
    {
        for(x = 0; x < size; x++)
        {
            use(array[z][y][x]);
        }
    }
}

Since the x value changes most often, in the first example the pointer at the outermost level has to be re-read on every iteration of the inner loop, and the second-level pointer along with it. In the second example, only the index into the actual data array needs to be recalculated for each iteration, as the outer two pointers do not change within the inner loop. My guess is that this makes the second loop form about 4x faster, and that is before we take cache performance into account, which probably gives another 2x. So in total the first example may be something like 8x to 10x slower than the second one.
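The speed-up figures are estimates, so it is worth measuring on your own machine. Below is a minimal timing sketch. The names cube_create, cube_free, sum_first_form, sum_second_form and time_sum are made up for illustration, and it assumes a pointer-based array of float like the one discussed here. The results will vary a lot with compiler, optimisation level and cache size.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* malloc that gives up on failure, to keep the sketch short. */
static void *xmalloc(size_t bytes)
{
    void *p = malloc(bytes);
    if (p == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Allocate a size * size * size pointer-based array filled with 1.0f. */
float ***cube_create(int size)
{
    float ***a = xmalloc((size_t)size * sizeof *a);
    for (int i = 0; i < size; i++) {
        a[i] = xmalloc((size_t)size * sizeof *a[i]);
        for (int j = 0; j < size; j++) {
            a[i][j] = xmalloc((size_t)size * sizeof *a[i][j]);
            for (int k = 0; k < size; k++)
                a[i][j][k] = 1.0f;
        }
    }
    return a;
}

void cube_free(float ***a, int size)
{
    for (int i = 0; i < size; i++) {
        for (int j = 0; j < size; j++)
            free(a[i][j]);
        free(a[i]);
    }
    free(a);
}

/* First loop form: the fastest-changing index is the leftmost subscript. */
double sum_first_form(float ***array, int size)
{
    double total = 0.0;
    for (int z = 0; z < size; z++)
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++)
                total += array[x][y][z];
    return total;
}

/* Second loop form: the fastest-changing index is the rightmost subscript. */
double sum_second_form(float ***array, int size)
{
    double total = 0.0;
    for (int z = 0; z < size; z++)
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++)
                total += array[z][y][x];
    return total;
}

/* Run one traversal, store its sum in *result, return CPU seconds used. */
double time_sum(double (*fn)(float ***, int), float ***array, int size,
                double *result)
{
    clock_t start = clock();
    *result = fn(array, size);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call time_sum once with each of the two sum functions on the same array (size 300, say) and compare the returned times. Both sums must come out equal, and using the sum afterwards keeps the compiler from optimising the loops away.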

--

Mats