Allocating a tensor

**bhdavis1978** · 04-08-2010

Hi all,

I'm trying to allocate a tensor, but I seem to be having some difficulties with it, so I was hoping someone could point out my error to me. Here is my code

Code:

u_ssint ***u_ssint_tensor(int rows, int cols, int depth)
/* creates a memory structure of type u_ssint, which is an unsigned char  */
{
	int i, j;

	// check input
	if (rows < 0) 
	{
		printf("rows must be >= 1, rows = %d in u_ssint_tensor\n", rows);
		printf("exiting\n");
		exit(-1);
	}
	
	if (cols < 0) 
	{
		printf("cols must be >= 1, cols = %d in u_ssint_tensor\n", cols);
		printf("exiting\n");
		exit(-1);
	}
	
	if (depth < 0) 
	{
		printf("depth must be >= 1, cols = %d in u_ssint_tensor\n", depth);
		printf("exiting\n");
		exit(-1);
	}
	
	u_ssint ***usi;
	
        // first create a vector of size cols of type u_ssint**
        usi= calloc(cols, sizeof(u_ssint **));
	if (usi==NULL)
	{
		printf("could not allocate memory for columns in u_ssint_tensor\nexiting\n");
		exit(-1);
	}
	
	for (i= 0; i < rows; i++)
	{
		usi[i]= calloc(rows, sizeof(u_ssint *));
		if (usi[i]==NULL)
		{
			printf("could not allocate mmeory for row %d in u_ssint_tensor\nexiting\n", i);
			exit(-1);
		}
		
		for (j= 0; j < depth; j++)
		{
			usi[i][j]= calloc(depth, sizeof(u_ssint));
		
			if (usi[i][j]==NULL)
			{
				printf("could not allocate mmeory for row %d, depth %d in u_ssint_tensor\nexiting\n", i, j);
				exit(-1);
			}
		}
	}
	return usi;	
}

That doesn't produce any errors, but when I try to go through the created tensor, I get an EXC_ACCESS error

e.g.

Code:

int j, k, l;

usi= u_ssint_tensor(1000, 1000, 20);
		
usi= u_ssint_tensor(1000, 1000, 20);
for (j= 0; j < 1000; j++)
{
	for (k= 0; k < 1000; k++)
	{
		for (l= 0; l < 20; l++)
		{
			printf("j= %d, k= %d, l= %d\n", j, k, l);
			usi[j][j][l]= 1;
		}
	}
}

the program ends up bailing at:

Code:

j= 19, k= 999, l= 18
j= 19, k= 999, l= 19
j= 20, k= 0, l= 0

Also- note that if I change the dimensions of the tensor from 1000,1000,20 to 1000,1000,15, then instead of the program bailing at j=20, k=0, l=0, it bails at j=15, k=0, l=0, which makes me think I'm accidentaly not allocating enough memory for the first dimension- and that somehow it's 15, and not 1000.

If anyone could provide some assistance, I'd greatly appreciate it!

Cheers,
Brad

**bhdavis1978** · 04-08-2010

Nevermind- I found it! Funny how that simply in the act of posting the problem, I was able to see it in the code.

The line

Code:

for (j= 0; j < depth; j++)

Should be

Code:

for (j= 0; j < cols; j++)

Still, I've got another question.

Instead of allocating it in this fashion, would it be very bad to do the following?

u_ssint ***usi;
usi = calloc(rows * cols * depth, sizeof(u_ssint));

Since essentially what I want is a big chunk of memory with dimensions rows x cols x depth?

Just off the top of my head, this seems like a bad approach, if only because now I've got to allocate rows x cols x depth as one big chunk of memory (and it may be difficult to find a single contiguous chunk of memory that large), and instead by breaking it up as I did in the above posted code, I only need to allocate a bunch of chunks of memory, connected to each other via pointers.

Can anyone comment on that?

Thanks,
Brad

**tabstop** · 04-08-2010

In this case it's square, but your loop is off -- you allocate cols number, but then loop over rows in the first dimension.

**tabstop** · 04-08-2010

Originally Posted by bhdavis1978

Nevermind- I found it! Funny how that simply in the act of posting the problem, I was able to see it in the code.

The line

Code:

for (j= 0; j < depth; j++)

Should be

Code:

for (j= 0; j < cols; j++)

Still, I've got another question.

Instead of allocating it in this fashion, would it be very bad to do the following?

u_ssint ***usi;
usi = calloc(rows * cols * depth, sizeof(u_ssint));

Since essentially what I want is a big chunk of memory with dimensions rows x cols x depth?

Just off the top of my head, this seems like a bad approach, if only because now I've got to allocate rows x cols x depth as one big chunk of memory (and it may be difficult to find a single contiguous chunk of memory that large), and instead by breaking it up as I did in the above posted code, I only need to allocate a bunch of chunks of memory, connected to each other via pointers.

Can anyone comment on that?

Thanks,
Brad

It depends. Do you want to use three subscripts? If so, then one calloc Just Won't Do.

**bhdavis1978** · 04-08-2010

Originally Posted by tabstop

It depends. Do you want to use three subscripts? If so, then one calloc Just Won't Do.

True, but I could come up with a simple expression to uniquely describe each position- so, for a two dimensional array, it might be something like col+(row*maxcol), in that case, instead of it being

matrix[col][row]

it'd be

matrix[col+(row*maxcol)]

So that doesn't seem like a big problem to me. If I was worried about inputting the incorrect algorithm each time I wanted to do it, I could use a macro, or an inline function like

inline unsigned long address(int col, int row, int maxcol)
{
return (col+(row*maxcol));
}

and then just access it as

matrix[address(col, row)]

On a related question, why does it seem that so many people use malloc instead of calloc? Calloc seems so much more convenient to me.

**tabstop** · 04-08-2010

Originally Posted by bhdavis1978

True, but I could come up with a simple expression to uniquely describe each position- so, for a two dimensional array, it might be something like col+(row*maxcol), in that case, instead of it being

matrix[col][row]

it'd be

matrix[col+(row*maxcol)]

Naturally (this is what the compiler does for you, in a "real" multi-dimensional array). If you're willing to do it, then more power to you.

Originally Posted by bhdavis1978

On a related question, why does it seem that so many people use malloc instead of calloc? Calloc seems so much more convenient to me.

Calloc is slow*. Consider your own example: you are deliberately writing 0's into your memory, and on the very next line overwriting those zeros with the proper values. Allowing for the fact that the last calloc seems to be "proper" (i.e., your matrix seems rather sparse), you are writing 1000*4+1000*1000*4 = 4MB of data that you don't need to (or want to). (Technically, the diagonal elements are another 4MB of data that you double-write, but I'm not sure how that would compare with a big loop doing 0's and 1's together -- writing all 0's I think is somewhat faster than a bunch of different stuff.)

*Whether this slowness is actually noticeable on today's machines, I couldn't say.

**bhdavis1978** · 04-08-2010

Originally Posted by tabstop

Naturally (this is what the compiler does for you, in a "real" multi-dimensional array). If you're willing to do it, then more power to you.

I may start doing this, since it's easier to code- and I could even have the code fall back to creating large memory structures using my current method, if it couldn't allocate a large enough chunk of memory on it's own.

Calloc is slow*. Consider your own example: you are deliberately writing 0's into your memory, and on the very next line overwriting those zeros with the proper values. Allowing for the fact that the last calloc seems to be "proper" (i.e., your matrix seems rather sparse), you are writing 1000*4+1000*1000*4 = 4MB of data that you don't need to (or want to). (Technically, the diagonal elements are another 4MB of data that you double-write, but I'm not sure how that would compare with a big loop doing 0's and 1's together -- writing all 0's I think is somewhat faster than a bunch of different stuff.)

That's what I'd figured the issue with calloc was. In my case, I actually want the matrix to be filled with zeros- and yes, for the most part, it's going to be a very sparse matrix. I realize I'm actually wasting a huge amount of memory, storing a lot of 0's I don't need to. The alternate method, would be to just create a list, where each element contained the relevant information- this'd be much smaller, but accessing the information within that list would be much slower. If I want to be able to check if site x having value A is acceptable to site y having value B, using this large matrix-thing (I realized a while ago that actually, I need a 4 dimensional memory structure- a matrix of matricies, not a tensor), and I can do that very quickly through a direct look up using this large memory structure- thing. If I made a big list, it'd take a lot less memory, but finding the relevant information would be a lot slower.

Since that list would contain four pieces of information: two ints indicating location (position 1, position 2), and two chars indicating mutually exclusive values at those positions, it'd be difficult to sort the list intelligently. I could sore it by site 1, but then if I'm doing a comparison to site 2, I'd have to search through the whole list to find elements matching site 2. I suppose I could solve this problem by having two essneitally identical lists, one sorted in ascending order for site 1, and the other for site 2, but it's still going to require an awful lot of comparison statements to find the information I want. This way, I can just immediately jump to the address I want and grab it.

I imagine that might be a bit overly vague though. I'm still wrestling with the correct way to go about solving this problem, and I haven't completely made up my mind.

**tabstop** · 04-08-2010

You can have two different "threads" in a linked list, i.e., a structure that looks like

Code:

struct data {
    char valueA;
    char valueB;
    struct data *next_by_site_1;
    struct data *next_by_site_2;
};

Then when you add something to the list, you insert it in it's proper place in both directions.

**hk_mp5kpdw** · 04-08-2010

Code:

for (j= 0; j < 1000; j++)
{
	for (k= 0; k < 1000; k++)
	{
		for (l= 0; l < 20; l++)
		{
			printf("j= %d, k= %d, l= %d\n", j, k, l);
			usi[j][j][l]= 1;
		}
	}
}

That bit in red there doesn't look right... typo?

**bhdavis1978** · 04-08-2010

Originally Posted by tabstop

You can have two different "threads" in a linked list, i.e., a structure that looks like

Code:

struct data {
    char valueA;
    char valueB;
    struct data *next_by_site_1;
    struct data *next_by_site_2;
};

Then when you add something to the list, you insert it in it's proper place in both directions.

That specific structure won't work in this case, but that's an interesting suggestion. I've come up with another idea too, which might work. It'd be slower than my matrix of matricies approach, but faster than a straight linked list and use a fraction of the memory. But I need to think over your suggestion. I do appreciate your thoughts.

Originally Posted by hk_mp5kpdw

Code:

for (j= 0; j < 1000; j++)
{
	for (k= 0; k < 1000; k++)
	{
		for (l= 0; l < 20; l++)
		{
			printf("j= %d, k= %d, l= %d\n", j, k, l);
			usi[j][j][l]= 1;
		}
	}
}

That bit in red there doesn't look right... typo?

Yeah, it was a typo- thanks!

Thread: Allocating a tensor

Thread Tools

Search Thread

Display

Allocating a tensor

Similar Threads

allocating memory problems

Allocating memory for large text files

Allocating a number of large 3d arrays in C

Dynamically allocating 3D array

allocating memory screws up data being read in from file