Thread: Pointer Issue

  1. #1
    Registered User
    Join Date
    Jul 2010
    Posts
    56

    Pointer Issue

    Code:
    #include <iostream>
    
    using namespace std;
    
    int main()
    {
        char * c = new char[2];
        *c = 0x101112;
        unsigned int i = *(c);
        cout << i;
        return 0;
    }
    When I execute this, it outputs 18 (0x12).
    Why does 'i' get 0x12 instead of 0x10, which is the first value in the array/pointer *c?

  2. #2
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    This is not a pointer issue at all. The compiler doesn't care that you are using pointers in your example - it converts by value between int and char types according to a set of rules.

    The value 0x101112 is typically more than can be stored in a char type. It will be represented as a signed or unsigned integral type larger than char, which means truncation occurs when assigning the value to *c.

    The range of values that can be supported by an int type is implementation defined, so it is implementation defined whether 0x101112 is represented as a signed int, an unsigned int, or cannot be represented in an int type (in which case it would be represented as some long type). It is also implementation defined whether a char is a signed or unsigned type.

    If the int type supported by your compiler is 16 bits, 0x101112 is an unsigned value, and the conversion to smaller types (char or unsigned char) uses modulo arithmetic. That would explain c[0] having the value 0x12.

    If the int type supported by your compiler is 32 bits (or more), then the treatment depends on whether char is a signed or unsigned type. If char is an unsigned type, the conversion from int to char also employs modulo arithmetic (hence c[0] will receive the value 0x12). If char is a signed type, the result of converting int to char is implementation defined (i.e. maybe the value 0x12, maybe another value).
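
    For illustration, a minimal sketch of that conversion (assuming a typical implementation with 8-bit char and a 32-bit int, which is not guaranteed): only the low-order byte of 0x101112 survives the assignment.
    Code:
    #include <iostream>

    int main()
    {
        // 0x101112 does not fit in a char; on common implementations the
        // conversion keeps only the low-order byte: 0x101112 & 0xff == 0x12.
        char c = 0x101112;
        unsigned int i = static_cast<unsigned char>(c);
        std::cout << std::hex << i << '\n';   // typically prints 12
        return 0;
    }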
    Last edited by grumpy; 07-02-2011 at 06:10 PM. Reason: Fixed typos
    Right 98% of the time, and don't care about the other 3%.

    If I seem grumpy or unhelpful in reply to you, or tell you you need to demonstrate more effort before you can expect help, it is likely you deserve it. Suck it up, Buttercup, and read this, this, and this before posting again.

  3. #3
    Registered User
    Join Date
    Jul 2010
    Posts
    56
    Thought it had something to do with endianness. Guess I was on the wrong track. Thanks for your explanation, grumpy

  4. #4
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    That's not what endianness is. Truncation is happening because of the types you are using. Perhaps if you would explain what you really want to do, there would be a better answer.

  5. #5
    Registered User
    Join Date
    Jul 2010
    Posts
    56
    Quote Originally Posted by whiteflags View Post
    Perhaps if you would explain what you really want to do, there would be a better answer.
    What I was trying to do is this: when I assigned 0x101112 to an address in memory, I thought it would first assign 0x10 to the first element of the array, then 0x11 to the second element, and so on. All this in a more direct way, rather than:

    Code:
    *(c+0) = 0x10;
    *(c+1) = 0x11;
    *(c+2) = 0x12;
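
    For comparison, a rough sketch of what a single bulk copy actually does: copying the object representation of the value does fill the array, but in the machine's native byte order, not necessarily with 0x10 first (this assumes 8-bit bytes and a 4-byte unsigned int, and widens the buffer to 4 bytes for illustration).
    Code:
    #include <cstddef>
    #include <cstring>
    #include <iostream>

    int main()
    {
        unsigned int i = 0x101112;
        unsigned char c[sizeof i];

        // Copy the bytes of i exactly as they sit in memory.
        std::memcpy(c, &i, sizeof i);

        std::cout << std::hex;
        for (std::size_t n = 0; n < sizeof i; ++n)
            std::cout << static_cast<unsigned int>(c[n]) << ' ';
        std::cout << '\n';   // e.g. "12 11 10 0" on a little-endian machine
        return 0;
    }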

  6. #6
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    The only way that would happen is if the native int (or, more accurately, unsigned) type were exactly three bytes (with each byte being 8 bits) on a big-endian machine. There are plenty of big-endian machines, but very few (in fact none that I know of, offhand) that support a three-byte integral type.

    If we assume a machine with 8-bit bytes and 4-byte unsigned integral type, then the value 0x101112 would be represented on a big-endian machine with the 4 byte sequence (0x00, 0x10, 0x11, 0x12), on a little-endian machine as (0x12, 0x11, 0x10, 0x00). There are also middle-endian representations. One possible middle-endian representation (which is used, for example, on ARM architectures when writing 32-bit values on a 16-bit boundary) is (0x10, 0x00, 0x12, 0x11).

    Endianness is primarily determined by processor microarchitecture and, to some extent, operating system. It is not determined by the C or C++ standards. For example, x86 processors are little-endian, Motorola 68000 family processors are big-endian. Some processors (such as Power PC and SPARC) are bi-endian, meaning they can be configured one way or the other. Operating systems can also determine endianness, either by configuring bi-endian processors one way or the other or by emulation.
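
    A small sketch of how one might check the byte order of a given machine at run time (assuming 8-bit bytes; this only distinguishes little-endian from everything else):
    Code:
    #include <iostream>

    int main()
    {
        unsigned int probe = 1;
        // On a little-endian machine the least significant byte comes first in memory.
        bool little = *reinterpret_cast<unsigned char*>(&probe) == 1;
        std::cout << (little ? "little-endian" : "big- or middle-endian") << '\n';
        return 0;
    }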

  7. #7
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    This version is well defined, so there should be no implementation-defined issues. Endianness is still an issue, but you can be certain that the three bytes are stored in c, whether the machine is big or little endian.
    Code:
    #include <iostream>
    #include <iomanip>
    int main()
    {
       unsigned char c[3];
       unsigned int i = 0x101112;
       *(c+0) = (i & 0xff0000) >> 16;  // most significant byte: 0x10
       *(c+1) = (i & 0x00ff00) >> 8;   // middle byte: 0x11
       *(c+2) = (i & 0xff);            // least significant byte: 0x12

       // Print the three bytes in hex; output is 101112.
       std::cout << std::hex <<
        (unsigned int) *(c+0) <<
        (unsigned int) *(c+1) <<
        (unsigned int) *(c+2) << '\n';
    }
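
    Going the other way, a sketch along the same lines reassembles the value from the three bytes; since the shifts and masks operate on values rather than on memory, it gives the same result regardless of the machine's byte order (still assuming the value fits in unsigned int).
    Code:
    #include <iostream>

    int main()
    {
        unsigned char c[3] = { 0x10, 0x11, 0x12 };

        // Rebuild the 24-bit value from the individual bytes.
        unsigned int i = (static_cast<unsigned int>(c[0]) << 16)
                       | (static_cast<unsigned int>(c[1]) << 8)
                       |  static_cast<unsigned int>(c[2]);

        std::cout << std::hex << i << '\n';   // prints 101112
        return 0;
    }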

  8. #8
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by whiteflags View Post
    This version is well defined, so there should be no implementation-defined issues.
    You are assuming 8-bit bytes and an unsigned type that is at least three 8-bit bytes wide. Neither of those is guaranteed by the standards (they are implementation defined).

    The use of hex constants also implies assumed endianness (not middle-endian in your case).
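
    If a C++11 (then C++0x) compiler is available, one way to make those assumptions explicit rather than silent is a pair of compile-time checks; just a sketch, with illustrative assertion messages.
    Code:
    #include <climits>

    // Fail the build, rather than misbehave at run time, if the assumptions
    // about byte width and unsigned int width do not hold.
    static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");
    static_assert(sizeof(unsigned int) * CHAR_BIT >= 24,
                  "code assumes unsigned int holds at least 24 bits");

    int main()
    {
        return 0;
    }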
    Last edited by grumpy; 07-02-2011 at 11:28 PM.

  9. #9
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    You can do the math if you want. The shift is some multiple of CHAR_BIT, and the hex constants can also be generated by shifting CHAR_BIT by the right constant. Still, assuming CHAR_BIT == 8, (CHAR_BIT << 5) - 1 is 255, which would mask the most significant or least significant byte. If you're targeting an architecture where such assumptions can't be made, you have to know enough about it to do math like that.
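
    A sketch of what that might look like, deriving the shift amounts from CHAR_BIT and using UCHAR_MAX from <climits> as the per-byte mask instead of a hard-coded 0xff (unsigned int is still assumed to be wide enough to hold the value):
    Code:
    #include <climits>
    #include <iostream>

    int main()
    {
        unsigned int i = 0x101112;
        unsigned char c[3];
        const unsigned int byte_mask = UCHAR_MAX;   // 0xff when CHAR_BIT == 8

        // Peel the bytes off from most significant to least significant.
        c[0] = (i >> (2 * CHAR_BIT)) & byte_mask;
        c[1] = (i >> (1 * CHAR_BIT)) & byte_mask;
        c[2] = i & byte_mask;

        std::cout << std::hex
                  << (unsigned int) c[0]
                  << (unsigned int) c[1]
                  << (unsigned int) c[2] << '\n';   // prints 101112
        return 0;
    }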
    Last edited by whiteflags; 07-02-2011 at 11:48 PM. Reason: typos

