Thread: Pointer casting?

  1. #1
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545

    Pointer casting?

    I'm still working on that damn Linux port, and I think the problem I'm having is the Big vs Little Endian difference of Intel vs mainframe...

    The code has a LOT of pointer casting, and I've been seeing strange results in the debug code I added, so I wrote a simple test program so I could brush up on the basics of pointer casting in C.

    Here is my test program:
    Code:
    #include <stdio.h>
    
    int main()
    {
    	int n;
    	char str[9] = { 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08 };
    	int i		= 0x12345678;
    	long l		= 0x12345678;
    	short s[2]	= { 0x1234, 0x5678 };
    
    	/* %p expects a void*, and a long needs %lx rather than %x */
    	printf( "str = 0x%p, &str = 0x%p\n", (void*)str, (void*)&str );
    	printf( "*(int*)str = 0x%x, *(long*)str = 0x%lx\n", *(int*)str, *(long*)str );
    
    	for ( n = 0; n < 8; n += 4 )
    	{
    		printf( "*(short*)str[%d] = 0x%x, ", n, *(short*)(&str[n]) );
    		printf( "(short)str[%d] = 0x%x\n", n, (short)str[n] );
    	}
    
    	printf( "i = 0x%x, (short)i = 0x%x\n", i, (short)i );
    	/* long arguments are printed with %lx */
    	printf( "l = 0x%lx, (short)l = 0x%x\n", l, (short)l );
    	printf( "s[0] = 0x%x, s[1] = 0x%x, *(int*)s = 0x%x, *(long*)s = 0x%lx\n", s[0], s[1], *(int*)s, *(long*)s );
    
    	return 0;
    }
    and these are the results I get:
    Code:
    str = 0x0012FF48, &str = 0x0012FF48
    *(int*)str = 0x4030201, *(long*)str = 0x4030201
    *(short*)str[0] = 0x201, (short)str[0] = 0x1
    *(short*)str[4] = 0x605, (short)str[4] = 0x5
    i = 0x12345678, (short)i = 0x5678
    l = 0x12345678, (short)l = 0x5678
    s[0] = 0x1234, s[1] = 0x5678, *(int*)s = 0x56781234, *(long*)s = 0x56781234
    Everything I see is what I expected, except for the *(int*)str, *(long*)str and *(short*)str[n] values.
    Can someone explain where those values came from?

    This is what I was expecting to see:
    Code:
    *(int*)str = 0x12345678, *(long*)str = 0x12345678
    *(short*)str[0] = 0x1234
    *(short*)str[4] = 0x5678

  2. #2
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Apart from issues of endianness, you are making incorrect assumptions about the sizes of integer types. Specifically, you are assuming an int has size 8 bytes, a long is 8 bytes, and a short is 4 bytes, but your implementation supplies an int and long that are 4 bytes in size, and a short that is 2 bytes in size.

    Print out the values of sizeof(int), sizeof(long), and sizeof(short) [which are all implementation-defined quantities, not specific values fixed by the standard] and you'll see.

  3. #3
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by grumpy
    Apart from issues of endianness, you are making incorrect assumptions about the sizes of integer types. Specifically, you are assuming an int has size 8 bytes, a long is 8 bytes, and a short is 4 bytes, but your implementation supplies an int and long that are 4 bytes in size, and a short that is 2 bytes in size.

    Print out the values of sizeof(int), sizeof(long), and sizeof(short) [which are all implementation-defined quantities, not specific values fixed by the standard] and you'll see.
    Oh crap! I started with a string of "12345678", then changed it to hex values so it would be easier to print, but I should have made it 0x12, 0x34, 0x56, 0x78.

    I knew it must have been something stupid I was overlooking...
    Thanks.

  4. #4
    Salem
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    You also need to watch out for alignment problems, such as

    int *p = (int *)&str[1];
    Most processors are geared to fetch 32 bits in one cycle only if the data sits on a 32-bit boundary. To fetch 32 bits from an odd address such as this, the processor may do something like four byte fetches and reassemble the result internally (this is a performance killer).
    Or it may decide the whole thing is just too damn complicated and throw a "bus error".
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by Salem
    You also need to watch out for alignment problems, such as

    int *p = (int *)&str[1];
    Most processors are geared to fetch 32 bits in one cycle only if the data sits on a 32-bit boundary. To fetch 32 bits from an odd address such as this, the processor may do something like four byte fetches and reassemble the result internally (this is a performance killer).
    Or it may decide the whole thing is just too damn complicated and throw a "bus error".
    A clarification:
    On x86, which is by far the most common processor on the market, an unaligned access will cost time [it takes two cycles to read instead of one].

    Many other processors, such as ARM, SPARC, 29K, 68K, etc., do not allow unaligned access AT ALL. They "solve" this by raising an exception. The OS's exception handler may choose to handle this as a sequence of byte reads assembled into a word, or it may just say "You daft b****r, you can't do that" (bus error or similar). On some processors (or so I've heard), if this happens in kernel mode, the processor just rounds the address down to the aligned boundary and uses that data, which of course makes the problem much harder to figure out.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  6. #6
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Actually, another thing that kind of surprised me was this:
    Code:
    str = 0x0012FF48, &str = 0x0012FF48
    str is a char*, so naturally you'd think that &str would be a char**, but for some reason they both print the same address?

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by cpjust
    Actually, another thing that kind of surprised me was this:
    Code:
    str = 0x0012FF48, &str = 0x0012FF48
    str is a char*, so naturally you'd think that &str would be a char**, but for some reason they both print the same address?
    Yes, the second form _IS_ char **, but you can't really get the address of the address of str[0] - because there is no such thing - so the result is the same address, but the type is different, if you see what I mean by that.

    --
    Mats

  8. #8
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Oh wait... I think I see now.

    str is a pointer to str[0]
    &str would be the address of a pointer to str[0], but since there is no separate char* variable holding the address of str[0], it's impossible to get the address of an address...
    I wonder if that should give you a warning saying something like "& in statement '&str' has no effect"?

  9. #9
    Salem
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    &str is a pointer to the WHOLE array, and its type is 'char (*)[9]', not char**.

    Compare with
    Code:
    struct foo {
      int baz;
    } bar;
    &bar.baz is a pointer to the int within the struct, and is of type 'int*'
    &bar is a pointer to the whole structure, and is of type 'struct foo*'
    Both have the same value, but different types.
