Allocating a number of large 3d arrays in C

**Surrender** · 08-18-2008

I'm trying to allocate around 30 3d arrays of size 300*300*300.

I've tried allocating them statically, i.e.:

Code:

float dx[xSize][ySize][zSize];

With this method I get a crash (upon execution) when I allocate the 1st array.

I have better results allocating the arrays dynamically:

Code:

float*** dx ;
dx=malloc(xSize*sizeof *dx );//Allocate 'xSize' pointer-to-pointer of int

for(x=0;x<xSize;x++)
{
dx[ x ] = malloc(ySize* sizeof **dx);//Allocate 'ySize' pointers of int
for(y=0;y<ySize;y++)
{
dx[ x ][ y ]= malloc(zSize* sizeof ***dx);//Allocate 'zSize' ints
}
}

With the above I can allocate around 15 arrays before the execution crashes.

I'm using the Eclipse CDT IDe and the GNU compiler.
I have the program coded in Matlab already but I wanted to code it in C to compare speeds.
Has anyone any experience with working with large datasets in C such as this and could they point me in the right direction?

**Salem** · 08-18-2008

> I'm trying to allocate around 30 3d arrays of size 300*300*300.
size of float is typically 4 bytes.
So 4 * 30 * 300*300*300 = 3,240,000,000

Now, do you have an OS and a machine capable of dealing with 3GB+ of allocated memory?
32-bit windows won't do it for sure.

**Surrender** · 08-18-2008

Now, do you have an OS and a machine capable of dealing with 3GB+ of allocated memory?
32-bit windows won't do it for sure.

Running this on Vista Ultimate 64 and with 4GB of Ram, I've also tried this on a Ubuntu 64 machine with the same, no dice on either machine.

**matsp** · 08-18-2008

Originally Posted by Surrender

Running this on Vista Ultimate 64 and with 4GB of Ram, I've also tried this on a Ubuntu 64 machine with the same, no dice on either machine.

Are you actually compiling for 64-bit, or are you using a 32-bit executable?

Also, with 4GB of RAM, how much ACTUAL RAM do you have, and how much swap-space?

--
Mats

**Surrender** · 08-18-2008

Firstly, thanks for all the feedback folks.

Why do you need 30 of them at the same time?

I'm running an electromagnetic simulation. It's a 'leapfrog' algorithm, so each iteration of my code relies on the previous one and all arrays are required.

Are you actually compiling for 64-bit, or are you using a 32-bit executable?

Also, with 4GB of RAM, how much ACTUAL RAM do you have, and how much swap-space?

I don't know how to compile for 64 bit, I'm rusty with c. Can anyone help me with this?

I can use up to 4.029 GB of Ram, according to the task manager.
Page File is 4.039 GB

I might give MAT's linked list idea a go and see what happens.

**matsp** · 08-18-2008

A 300 * 300 * 300 array takes 27 million elements. Each element (if they are float) takes up 4 bytes, so the array takes up 104MB each. (Plus 300 + (300 * 300) pointers, but that's 0.3% of the total size, so we don't really worry about that). 30 x 100MB makes 3GB, which is more than what Linux and Windows in 32-bit mode can cope with.

You will need to find a different approach. My guess would be that your arrays are "sparse" - meaning that they are only partly populated with meaningfull data. The solution then is to implement a more dynamic approach to storing the data.

There are several ways to implement the sparse arrays; a few ideas:
- a linked list of index and content,
- an array with blocks of "start, end, array of content"
- Always allocate the whole 300 entries, but only where you have them.

The last case may be the simplest to implement:
- In this case, you don't really need a "get content (x, y, z)" type function - just allocate a single 300 entry array, and assign that to the [x][y] arrays (in your example) to hold the single empty. This address would have to be stored somewhere so that you can later on compare it when you write data [you still need a "store data to (x, y, z)" function].

Another note: You should take care when you pick the order of the dimensions - there is a difference between these two loops:

Code:

int x, y, z;
for(z = 0; z < size; z++)
{  
   for(y = 0; y < size; y++)
   {
      for(x = 0; x < size; x++)
      {
          use(array[x][y][z]);
      }
   }
}

and

Code:

int x, y, z;
for(z = 0; z < size; z++)
{  
   for(y = 0; y < size; y++)
   {
      for(x = 0; x < size; x++)
      {
          use(array[z][y][x]);
      }
   }
}

Since, the x value changes most often, this means that in the first example that the pointer at the outermost level will need to be re-read every iteration of the loop. In the second example, only the index into the actual array needs to be re-calculated for each iteration, as the outer two pointers do not change within the inner loop. This means that the second loop form would probably be about 4x faster - and that before we take into account cache performance, which probably gives another 2x performance. So in total the first example may be about 10x slower than the second one.

--
Mats

**Salem** · 08-18-2008

http://blogs.msdn.com/hiltonl/archiv...m-problem.aspx

XP has a "/3GB" startup switch which limits it to only using 1GB of the memory address space, and allows you up to 3GB.
Even then, you're still stuck.

Why do you need 30 of them at the same time?

If matlab can do this, it's using a hell of a lot of disk space to do it.
You'll need to do something similar as well.

**sgauravv** · 08-18-2008

hey use const to store the array because if u use const then it will store in your rom otherwise
if u use dynamically then it will access the ram and ram doesnot have so much memory.

**matsp** · 08-18-2008

Originally Posted by sgauravv

hey use const to store the array because if u use const then it will store in your rom otherwise
if u use dynamically then it will access the ram and ram doesnot have so much memory.

But that also means that it can not be modified by the code [compiler will refuse to do that]. In most PC's it makes no difference, and systems that have the capability of running Matlab tends to be PC's. Further, if you require about 3GB of memory, RAM is significantly less expensive than any ROM variant [I'm sure this includes MASK ROM's, but I could be wrong here - certainly any ROM that would be programmable at customer end would be more expensive than RAM], so putting the array in ROM won't help here.

--
Mats

**rasta_freak** · 08-18-2008

Maybe using big fat file to hold array? Could solve storage problem.
(EDIT: "fat" as opposite of skinny)

**matsp** · 08-18-2008

Originally Posted by rasta_freak

Maybe using big fat file to hold array? Could solve storage problem.
(EDIT: "fat" as opposite of skinny)

But it would be quite slow...

--
Mats

**rasta_freak** · 08-18-2008

Originally Posted by matsp

But it would be quite slow...

--
Mats

If i had that problem as OP, i'd do it

EDIT: OS will most probably use swap file anyway, which gives you same "performance"...

**Salem** · 08-18-2008

> It's a 'leapfrog' algorithm, so each iteration of my code relies on the previous one and all arrays are required.
So in the innermost loops, you always touch all 30 arrays?
If not, then some of them could be swapped.

If you're accessing [0][0][0], do you ever go anywhere near [299][299][299]?
If not, consider alternative allocations which make use of such "locality".
Say for example only storing the current [10][300][300] slice of each cube.

Thread: Allocating a number of large 3d arrays in C

Thread Tools

Search Thread

Display

Hybrid View

Allocating a number of large 3d arrays in C

Similar Threads

memory issue

Looking for constructive criticism

Allocating Arrays on the Free Store

3D or 4D arrays

Array of boolean