C arrays and pointers etc. Help!

**chacham15** · 07-23-2008

>You can CERTAINLY create a 2D, 3D or 4D array:

It's only in your mind that that array can exist in more than one dimension. The compiler translates any higher dimensions into single dimensions placed side by side, but it can't always do the job. In order to create a true multidimensional array you need pointers. The key difference is:

int foo[2][2][2][2];
foo[0][0][0][0] = 2;
foo[0][1][0][0] = 3;
int ****bar = foo; //wrong
printf("%d", bar[0][0][0][0]); //this segfaults
int *headptr = foo;
printf("%d", headptr[4]); //this works, why? the index is clearly out of bounds....because it's a 1d array!

This is the equivalent of what you would think of as foo[0][1][0][0]. The following is a true multidimensional array.

int **twodimen = malloc(2*sizeof(int*));
twodimen[0] = malloc(2*sizeof(int));
twodimen[1] = malloc(2*sizeof(int));
twodimen[1][1] = 5;
int **test = twodimen;
printf("%d", test[1][1]); //This works!!

> This is usually discovered when dealing with multidimensional arrays.

As I said earlier, there's no such thing. It's simply a compiler hack (i.e. int foo[2][2] is implemented as int foo[4] and foo[1][1] gets translated to foo[3]; see above).

>Because that's one of the situations in which an array name decays to a pointer.

Thats semantics. It is for all intensive purposes a pointer.

>sizeof() is not a preprocessor directive. It is compiler proper - that's why you can't do something like

What I meant was that sizeof behaves like a preprocessor directive in that it isn't a real function. The compiler evaluates the size and then replaces the function call with the real number (i.e. malloc(sizeof(int)) gets translated to whatever the target's equivalent is of the MIPS: $a0 = 4; jal malloc; for non-assembly people: malloc(4)). Hence the uselessness of a function which evaluates the size of its argument (it'll always be the same).

**cas** · 07-23-2008

Multidimensional arrays are implemented as arrays of arrays, not as single dimensional arrays, although the layout is the same either way (they must be contiguous). I'm talking about C's type system.

The problem, though, is that you just can't go poking around where you want as far as C is concerned, even if you "know" where things are. At least, when you do start poking around all bets are off.

This is not legal:

Code:

int foo[2][2][2][2];
int *headptr = foo;

The initialization of headptr is a constraint violation since the types don't match. You could instead do:

Code:

int foo[2][2][2][2];
int *headptr = &foo[0][0][0][0];

Now the types match. But according to the standard, you can't access beyond headptr[1], even though we all know how the array is laid out. An implementation would probably have to go to extra effort to stop you ... but it's still a violation of the rules.

You could cast in the first example. But a cast is normally a sign of trying to sneak behind C's back and break the rules. Again, while we "know" it will work, it's not legal C, although perhaps a case could be made that this method is acceptable. I doubt it, though.

To reiterate, this is all inside the scope of C's rules. In one sense a multidimensional array is the same as a large single dimensional array, but as far as C is concerned, they are different animals, even if it mandates that they're laid out the same way. I think this is a rather subtle concept, but not a useless one.
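
A minimal sketch of the distinction drawn above, assuming a C99 compiler. The function names are made up for illustration. The first function measures where an element sits by going through char pointers, which are allowed to address any byte of the whole object, so it shows the flat layout without indexing an int pointer past its sub-array. The second shows that the type system still sees nested arrays.

```c
#include <stddef.h>

/* Offset of foo[a][b][c][d] from the start of foo, counted in ints.
   The measurement goes through char pointers, which may legally
   address any byte of the complete object. */
size_t offset_in_ints(int a, int b, int c, int d)
{
    static int foo[2][2][2][2];
    const char *base = (const char *)foo;
    const char *elem = (const char *)&foo[a][b][c][d];
    return (size_t)(elem - base) / sizeof(int);
}

/* The layout is flat, but the types are not: foo[0] is an int[2][2][2],
   not an int. sizeof does not evaluate its operand here. */
size_t ints_in_first_subarray(void)
{
    int foo[2][2][2][2];
    return sizeof foo[0] / sizeof(int);
}
```

Under these assumptions, offset_in_ints(0, 1, 0, 0) is 4, which matches the headptr[4] position discussed earlier in the thread, while ints_in_first_subarray() reports 8.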

**chacham15** · 07-23-2008

Yes, in the first example if you wanted you'd do an explicit cast. This, however, isn't necessary as the compiler will guess at what you mean. This fits in with the entire theme, rules are meant to be broken. You just have to know what you are doing. For example, you can rely on headptr[4] working because you know that the memory is contiguous. Why I chose those words is that a block of 16 bytes can be thought of as 16 blocks of 1 byte; it is just a matter of interpretation. But the problem with int foo[20][20] is that you cannot have any other function reference it without going through many pains, unless you use my 'illegal' assumption that the blocks are laid out contiguously. The only "proper" way would be to pass a pointer to the head of every 1d array (sooooooooooooo much less efficient and unnecessary!). The mere fact that it's so annoying to do otherwise is reason enough to break the rules. (Another famous example of this is breaking the "don't use gotos" rule, which is legal but more of a rule of thumb, in order to get out of nested loops.) You just need to know what you're doing. (I saw someone goto into a different function...ooh that was bad)

P.S. Truly multidimensional arrays such as the one I presented earlier will not be arranged in the same way as this array.

**Salem** · 07-23-2008

> int ****bar = foo; //wrong
Well that's because you got the type wrong!

int (*bar)[2][2][2] = foo;
would be much better.

Do you have any problems passing multiple dimension arrays to functions?
It seems from what you've been saying that you would.
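
For reference, a small sketch of passing such an array to a function with the pointer type shown above. The names sum4d and demo_sum4d are hypothetical, and the dimensions are the 2x2x2x2 ones from the thread.

```c
/* Sum every element of an int[n][2][2][2]. The parameter type is exactly
   what the array name decays to, so the call needs no cast. */
int sum4d(int (*a)[2][2][2], int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < 2; j++)
            for (int k = 0; k < 2; k++)
                for (int l = 0; l < 2; l++)
                    total += a[i][j][k][l];
    return total;
}

/* Build the array from the thread and hand it over through a correctly
   typed pointer. */
int demo_sum4d(void)
{
    int foo[2][2][2][2] = {0};
    foo[0][0][0][0] = 2;
    foo[0][1][0][0] = 3;
    int (*bar)[2][2][2] = foo;   /* types match: no diagnostic, no cast */
    return sum4d(bar, 2);
}
```

Only the first dimension is left open in the parameter; the remaining three have to be spelled out so the compiler can do the index arithmetic.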

> >Because that's one of the situations in which an array name decays to a pointer.
> Thats semantics. It is for all intensive purposes a pointer.
It's not a pointer when you use it with sizeof - it's more than semantics!
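
A quick sketch of that difference, with hypothetical function names. An array name under sizeof does not decay, so the two results differ whenever the array is not the size of a pointer.

```c
#include <stddef.h>

/* sizeof applied to the array itself: the size of all ten elements. */
size_t array_bytes(void)
{
    int arr[10];
    return sizeof arr;          /* 10 * sizeof(int), no decay under sizeof */
}

/* sizeof applied to a pointer that the array decayed into. */
size_t pointer_bytes(void)
{
    int arr[10];
    int *p = arr;               /* the decay happens in this initialization */
    return sizeof p;            /* sizeof(int *), whatever the array length */
}
```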

> What I meant was that sizeof behaves like a preprocessor directive in that it isnt a real function.
It's called an operator, which is neither a pre-processor directive, nor a function call.
It's a unique operator in that it uses a sequence of characters (sizeof) rather than symbols (+,- etc).
Check your operator precedence table, it's listed there.

> printf("%d", headptr[4]); //this works, why? the index is clearly out of bounds....because its a 1d array!
"Works" maybe, but of dubious legality.
http://c-faq.com/aryptr/ary2dfunc2.html
"Flattening the array" is a known technique.
But it wouldn't work with your malloc'ed array which followed.

Incidentally, you can malloc a 2D array in one step.
int (*twod)[4] = malloc ( 4 * sizeof *twod );
you now have twod[0][0] through to twod[3][3]
AND it satisfies your flattened array techniques.
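
Spelled out as a complete function, a sketch of that one-step allocation might look like this. The function name and the fill pattern are made up for illustration; the -1 return on failure is an assumption of this sketch.

```c
#include <stdlib.h>

/* Allocate a 4x4 int array as one block, fill it, and return the sum of
   its elements, or -1 if the allocation fails. */
int one_block_2d(void)
{
    int (*twod)[4] = malloc(4 * sizeof *twod);  /* 4 rows, each an int[4] */
    if (twod == NULL)
        return -1;

    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            twod[r][c] = r * 4 + c;             /* values 0 through 15 */

    int sum = 0;
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            sum += twod[r][c];

    free(twod);                                 /* one malloc, one free */
    return sum;
}
```

Because sizeof *twod is the size of a whole int[4] row, the multiplication by 4 gives room for all sixteen ints.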

But I'm sure that use of sizeof will flummox you even more than it does already

**cas** · 07-24-2008

> Yes, in the first example if you wanted you'd do an explicit cast. This, however, isn't necessary as the compiler will guess at what you mean.

A compiler is obligated to do no such thing. What it is obligated to do is issue a diagnostic because a constraint is violated. Apart from that, it needn't actually complete translation.

As for the rest, if you're not worried about breaking the rules of C, we'll just have to agree to disagree on that one.

**MacGyver** · 07-24-2008

Originally Posted by chacham15

This fits in with the entire theme, rules are meant to be broken. You just have to know what you are doing.

This is the problem with many people that profess to have advanced knowledge of C.

**matsp** · 07-24-2008

Originally Posted by chacham15

>You can CERTAINLY create a 2D, 3D or 4D array:

It's only in your mind that that array can exist in more than one dimension. The compiler translates any higher dimensions into single dimensions placed side by side, but it can't always do the job.

Yes, until we invent multi-dimensional memory, the compiler will have to do that - but the fact is that the compiler WILL do that for us, so we don't need to worry about it, and you can pass multidimensional arrays to other functions, and it will still do that [assuming we pass the array dimensions with it - except for the first dimension].

Yes, you can replace any array with a one-dimensional array - and then do the relevant math yourself. But which is easier to read of these two:

Code:

b = a[z][y][x]

or

Code:

b = a[z * dimy * dimx + y * dimx + x];

If the compiler does it for you, that's a great help.
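
To make the comparison concrete, here is a sketch that does the arithmetic both ways and checks that they agree. The dimension constants and function names are hypothetical. The flat array is a real one-dimensional array, so no bounds are stretched.

```c
#include <stddef.h>

enum { DIMZ = 2, DIMY = 3, DIMX = 4 };

/* Row-major position of element (z, y, x) in a flat array of
   DIMZ * DIMY * DIMX elements. */
size_t flat_index(size_t z, size_t y, size_t x)
{
    return z * DIMY * DIMX + y * DIMX + x;
}

/* Fill a 3D array and a separate 1D array with the same values, then
   report whether both indexing styles read back the same element. */
int same_element(size_t z, size_t y, size_t x)
{
    int a[DIMZ][DIMY][DIMX];
    int flat[DIMZ * DIMY * DIMX];
    int n = 0;
    for (size_t i = 0; i < DIMZ; i++)
        for (size_t j = 0; j < DIMY; j++)
            for (size_t k = 0; k < DIMX; k++) {
                a[i][j][k] = n;
                flat[flat_index(i, j, k)] = n;
                n++;
            }
    return a[z][y][x] == flat[flat_index(z, y, x)];
}
```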

--
Mats

**chacham15** · 07-24-2008

> It's not a pointer when you use it with sizeof - it's more than semantics!

sizeof isn't a real function!!!

> It's a unique operator in that it uses a sequence of characters (sizeof) rather than symbols (+,- etc).

There is no assembly code corresponding to the evaluation of this, whereas int j = 1 + i does have assembly which corresponds to it.

>But it wouldn't work with your malloc'ed array which followed.
That's the point: the malloced array is a real multidimensional array, unlike the array we pretend is more than 1d.

>Incidentally, you can malloc a 2D array in one step.
You can malloc any contiguous block of memory in one step... but I never said that that was a true 2d array; the one I malloc is.

>A compiler is obligated to do no such thing
Are you saying that the compiler will error? I've never seen that, and if it does, get a different compiler. (Note: I'm talking about casts like the one I skipped, not casts which require a conversion.)

>you're not worried about breaking the rules of C
Given requirements such as performance, sometimes it's necessary, wouldn't you say?

>This is the problem with many people that profess to have advanced knowledge of C.
This is a very negative wording. What I mean is that if you don't strictly follow the rules of the language, you'd better know what you're doing. However, you don't need all that advanced knowledge of C to break a few rules here and there.

>If the compiler does it for you, that's a great help.
100% agreed. When the compiler doesn't help you, then you break some rules.

**matsp** · 07-24-2008

Originally Posted by chacham15

> > It's not a pointer when you use it with sizeof - it's more than semantics!
> sizeof isn't a real function!!!

Not in the sense that you can call it and see code, no. The compiler itself knows the size of the data, so the result of sizeof() is ALWAYS a constant [a different constant for different "calls" of sizeof, but still one value for any particular instance], and there is no need for a runtime version of sizeof().

> > It's a unique operator in that it uses a sequence of characters (sizeof) rather than symbols (+,- etc).
> There is no assembly code corresponding to the evaluation of this, whereas int j = 1 + i does have assembly which corresponds to it.

But the compiler is allowed to replace that with a constant expression if "i" has a known value at compile-time. Since the sizeof() ALWAYS turns into a constant value [or a compiler error], there is no need for an assembler instruction for it.
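
One way to see this without reading assembly is to use sizeof where the language demands an integer constant expression. The sketch below uses hypothetical names. One caveat: C99 variable-length arrays are the exception, since sizeof applied to a VLA is evaluated at run time.

```c
#include <stddef.h>

/* An enumerator must be an integer constant expression, so this line
   only compiles because sizeof(int) is known at translation time. */
enum { INT_BYTES = sizeof(int) };

/* An array dimension at file scope has the same requirement. */
static char scratch[sizeof(long) * 2];

size_t int_bytes(void)     { return INT_BYTES; }
size_t scratch_bytes(void) { return sizeof scratch; }
```

A call to an ordinary function could not appear in either of those two positions.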

> >But it wouldn't work with your malloc'ed array which followed.
> That's the point: the malloced array is a real multidimensional array, unlike the array we pretend is more than 1d.

A malloced "2d array" is not a two-dimensional array. It is an array of pointers [first dimension] that point to memory representing the second dimension. It is no more a 2D array than a single block of memory representing an array.
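
A sketch of that structure, using the variable name from earlier in the thread; the function names and the -1 failure value are assumptions for illustration. Every row is a separate allocation, so nothing guarantees the rows sit next to each other, and each one needs its own free.

```c
#include <stdlib.h>

/* Build the pointer-per-row structure, store one value, read it back,
   and release everything. Returns -1 on allocation failure. */
int rows_of_pointers(void)
{
    int **twodimen = malloc(2 * sizeof *twodimen);   /* 2 row pointers */
    if (twodimen == NULL)
        return -1;
    twodimen[0] = malloc(2 * sizeof **twodimen);     /* row 0: 2 ints */
    twodimen[1] = malloc(2 * sizeof **twodimen);     /* row 1: 2 ints */
    if (twodimen[0] == NULL || twodimen[1] == NULL) {
        free(twodimen[0]);                           /* free(NULL) is a no-op */
        free(twodimen[1]);
        free(twodimen);
        return -1;
    }
    twodimen[1][1] = 5;
    int value = twodimen[1][1];
    free(twodimen[0]);   /* each row is its own allocation... */
    free(twodimen[1]);
    free(twodimen);      /* ...and so is the table of row pointers */
    return value;
}

/* A row of the pointer table is an int *, while a row of an int[2][2]
   is an int[2]; sizeof sees the difference without evaluating anything.
   Returns 1 when both facts hold. */
int row_types_differ(void)
{
    int **table = NULL;
    int grid[2][2];
    return sizeof table[0] == sizeof(int *)
        && sizeof grid[0] == 2 * sizeof(int);
}
```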

> >you're not worried about breaking the rules of C
> Given requirements such as performance, sometimes it's necessary, wouldn't you say?

But you need to:
1. Determine that there is no other better solution.
2. Understand the consequences of this type of choice [e.g. there is a limitation of the portability of the code - which may not be a problem, but if you or someone else plainly assume that because it's C it's portable, then you would probably end up with some code that doesn't work sooner or later].
3. Understand that someone else may need to ALSO understand these consequences, so you'd better document what you have done, why you chose to do so, how it actually works, and what the consequences would be when the assumptions you made are no longer valid.

--
Mats

**dwks** · 07-24-2008

> (I saw someone goto into a different function...ooh that was bad)

And it would be a compiler error, too. That's why setjmp was invented. (Please don't use it. It's worse than goto.)

**robwhit** · 07-24-2008

> As cas says, the only pointer cast which is guaranteed is the round trip T* to void* and back to T*

C99 (6.3.2.3.7) guarantees round-trip if a converted pointer is correctly aligned for the pointed-to type.
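
The simplest case, the T* to void* and back round trip mentioned in the quote, can be sketched like this. The function name is hypothetical.

```c
/* Convert int * to void * and back. The result compares equal to the
   original pointer and can be dereferenced as before. */
int round_trip_ok(void)
{
    int value = 42;
    int *original = &value;
    void *generic = original;      /* implicit conversion, no cast needed */
    int *restored = generic;       /* and back again, also without a cast */
    return restored == original && *restored == 42;
}
```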

**chacham15** · 07-24-2008

>And it would be a compiler error, too. That's why setjmp was invented. (Please don't use it. It's worse than goto.)

haha, cool. I don't think his compiler errored; perhaps he did the goto in assembly.

>Not in the sense that you can call it and see code, no
I'm glad you agree

>if "i" has a known value at compile-time
lol, it's a variable...say it with me: var-i-a-ble.

And as far as the numbers, I agree with 2 and 3. My reasoning for 1 is that a different way to do it may be possible but require lots of extra effort and time which you don't have.

Oh, also, some technically illegal code is very portable (the multidimensional array thing, for example).

**dwks** · 07-24-2008

> >if "i" has a known value at compile-time
> lol, it's a variable...say it with me: var-i-a-ble.

Since you seem to know assembly, why not try examining the assembly that your compiler produces for this? (Be sure to enable optimisations.)

Code:

#include <stdio.h>
#include <stddef.h>

int main() {
    const size_t x = sizeof(int);
    printf("%lu\n", (unsigned long)x);
    return 0;
}

I haven't tried it, but I'd guess that sizeof(int) (probably 4) is pushed onto the stack rather than x when printf is called.

**cas** · 07-24-2008

> >A compiler is obligated to do no such thing
> Are you saying that the compiler will error? I've never seen that, and if it does, get a different compiler. (Note: I'm talking about casts like the one I skipped, not casts which require a conversion.)

The standard makes no distinction between warnings and errors. What I am saying is that a diagnostic is required. Further, the compiler is not required to actually finish translating. That's not indicative of a bad compiler, either. Write code that conforms to the standard.

> >you're not worried about breaking the rules of C
> Given requirements such as performance, sometimes it's necessary, wouldn't you say?

It depends on what you mean, but generally, no. Obviously you wind up using a lot more than standard C (Such as POSIX + X11, for example), but I don't violate language rules in some quixotic quest for performance. Most times somebody is talking about performance he hasn't actually profiled the code in question.

I'm not arguing against doing platform-specific things from time to time. It's clearly a necessity. But there's a difference between needing to know the endianness of a system, for example, versus going beyond the end of an array. With experience, you learn what is acceptable and what is not.

A good rule of thumb: If it can be done in standard-conforming code, do it in standard-conforming code.

**arpsmack** · 07-24-2008

Zug zug.

By the way, in regards to:

Originally Posted by chacham15

Thats semantics. It is for all intensive purposes a pointer.

The correct phrase is, "For all intents and purposes." For some reason, it really irks me when people screw this up.
