Thread: Pointer Issue

  1. #1
    Registered User
    Join Date
    Jul 2010
    Posts
    56

    Pointer Issue

    Code:
    #include <iostream>
    
    using namespace std;
    
    int main()
    {
        char * c = new char[2];
        *c = 0x101112;
        unsigned int i = *(c);
        cout << i;
        return 0;
    }
    When I execute this, it outputs 18 (0x12).
    Why does 'i' get 0x12 instead of 0x10, which is the first value in the array/pointer *c?

  2. #2
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    This is not a pointer issue at all. The compiler doesn't care that you are using pointers in your example - it converts by value between int and char types according to a set of rules.

    The value 0x101112 is typically more than can be stored in a char type. It will be represented as a signed or unsigned integral type larger than char, which means truncation occurs when assigning the value to *c.

    The range of values that can be supported by an int type is implementation defined, so it is implementation defined whether 0x101112 is represented as a signed int, an unsigned int, or cannot be represented in an int type (in which case it would be represented as some long type). It is also implementation defined whether a char is a signed or unsigned type.

    If the int type supported by your compiler is 16 bits, 0x101112 is an unsigned value, and the conversion to smaller types (char or unsigned char) uses modulo arithmetic. That would explain c[0] having the value 0x12.

    If the int type supported by your compiler is 32 bits (or more), then the treatment depends on whether char is a signed or unsigned type. If char is an unsigned type, the conversion from int to char also employs modulo arithmetic (hence c[0] will receive the value 0x12). If char is a signed type, the result of converting int to char is implementation defined (i.e. maybe the value 0x12, maybe another value).
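
    For illustration, a minimal sketch of that conversion (assuming a typical implementation with 8-bit char and a 32-bit int, which is not guaranteed): only the low-order byte of 0x101112 survives the assignment.
    Code:
    #include <iostream>

    int main()
    {
        // 0x101112 does not fit in a char; on common implementations the
        // conversion keeps only the low-order byte: 0x101112 & 0xff == 0x12.
        char c = 0x101112;
        unsigned int i = static_cast<unsigned char>(c);
        std::cout << std::hex << i << '\n';   // typically prints 12
        return 0;
    }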
    Last edited by grumpy; 07-02-2011 at 06:10 PM. Reason: Fixed typos
    Right 98% of the time, and don't care about the other 3%.

    If I seem grumpy or unhelpful in reply to you, or tell you you need to demonstrate more effort before you can expect help, it is likely you deserve it. Suck it up, Buttercup, and read this, this, and this before posting again.

  3. #3
    Registered User
    Join Date
    Jul 2010
    Posts
    56
    Thought it had something to do with endianness. Guess I was on the wrong track. Thanks for your explanation, grumpy

  4. #4
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    That's not what endianness is. Truncation is happening because of the types you are using. Perhaps if you would explain what you really want to do, there would be a better answer.

  5. #5
    Registered User
    Join Date
    Jul 2010
    Posts
    56
    Quote Originally Posted by whiteflags View Post
    Perhaps if you would explain what you really want to do, there would be a better answer.
    What I was trying to do is this: when I assigned 0x101112 to an address in memory, I thought it would first assign 0x10 to the first element of the array, then 0x11 to the second element, and so on. All this in a more direct way, rather than:

    Code:
    *(c+0) = 0x10;
    *(c+1) = 0x11;
    *(c+2) = 0x12;
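
    For comparison, a rough sketch of what a single bulk copy actually does: copying the object representation of the value does fill the array, but in the machine's native byte order, not necessarily with 0x10 first (this assumes 8-bit bytes and a 4-byte unsigned int, and widens the buffer to 4 bytes for illustration).
    Code:
    #include <cstddef>
    #include <cstring>
    #include <iostream>

    int main()
    {
        unsigned int i = 0x101112;
        unsigned char c[sizeof i];

        // Copy the bytes of i exactly as they sit in memory.
        std::memcpy(c, &i, sizeof i);

        std::cout << std::hex;
        for (std::size_t n = 0; n < sizeof i; ++n)
            std::cout << static_cast<unsigned int>(c[n]) << ' ';
        std::cout << '\n';   // e.g. "12 11 10 0" on a little-endian machine
        return 0;
    }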

  6. #6
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    The only way that would happen is if the native int (or, more accurately, unsigned) type were exactly three bytes (with each byte being 8 bits) on a big-endian machine. There are plenty of big-endian machines, but very few (in fact none that I know of, offhand) that support a three-byte integral type.

    If we assume a machine with 8-bit bytes and 4-byte unsigned integral type, then the value 0x101112 would be represented on a big-endian machine with the 4 byte sequence (0x00, 0x10, 0x11, 0x12), on a little-endian machine as (0x12, 0x11, 0x10, 0x00). There are also middle-endian representations. One possible middle-endian representation (which is used, for example, on ARM architectures when writing 32-bit values on a 16-bit boundary) is (0x10, 0x00, 0x12, 0x11).

    Endianness is primarily determined by processor microarchitecture and, to some extent, operating system. It is not determined by the C or C++ standards. For example, x86 processors are little-endian, Motorola 68000 family processors are big-endian. Some processors (such as Power PC and SPARC) are bi-endian, meaning they can be configured one way or the other. Operating systems can also determine endianness, either by configuring bi-endian processors one way or the other or by emulation.
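
    A small sketch of how one might check the byte order of a given machine at run time (assuming 8-bit bytes; this only distinguishes little-endian from everything else):
    Code:
    #include <iostream>

    int main()
    {
        unsigned int probe = 1;
        // On a little-endian machine the least significant byte comes first in memory.
        bool little = *reinterpret_cast<unsigned char*>(&probe) == 1;
        std::cout << (little ? "little-endian" : "big- or middle-endian") << '\n';
        return 0;
    }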

  7. #7
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    This version is well defined, so there should be no implementation-defined issues. Endianness is still an issue, but you can be certain that the three bytes are stored in c, whether the machine is big or little endian.
    Code:
    #include <iostream>
    #include <iomanip>
    int main()
    {
       unsigned char c[3];
       unsigned int i = 0x101112;
       *(c+0) = (i & 0xff0000) >> 16;  // most significant byte: 0x10
       *(c+1) = (i & 0x00ff00) >> 8;   // middle byte: 0x11
       *(c+2) = (i & 0xff);            // least significant byte: 0x12

       // Print the three bytes in hex; output is 101112.
       std::cout << std::hex <<
        (unsigned int) *(c+0) <<
        (unsigned int) *(c+1) <<
        (unsigned int) *(c+2) << '\n';
    }
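
    Going the other way, a sketch along the same lines reassembles the value from the three bytes; since the shifts and masks operate on values rather than on memory, it gives the same result regardless of the machine's byte order (still assuming the value fits in unsigned int).
    Code:
    #include <iostream>

    int main()
    {
        unsigned char c[3] = { 0x10, 0x11, 0x12 };

        // Rebuild the 24-bit value from the individual bytes.
        unsigned int i = (static_cast<unsigned int>(c[0]) << 16)
                       | (static_cast<unsigned int>(c[1]) << 8)
                       |  static_cast<unsigned int>(c[2]);

        std::cout << std::hex << i << '\n';   // prints 101112
        return 0;
    }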

  8. #8
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by whiteflags View Post
    This version is well defined, so there should be no implementation-defined issues.
    You are assuming 8-bit bytes and an unsigned type that is at least three 8-bit bytes wide. Neither of those is guaranteed by the standards (they are implementation defined).

    The use of hex constants also implies assumed endianness (not middle-endian in your case).
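
    If a C++11 (then C++0x) compiler is available, one way to make those assumptions explicit rather than silent is a pair of compile-time checks; just a sketch, with illustrative assertion messages.
    Code:
    #include <climits>

    // Fail the build, rather than misbehave at run time, if the assumptions
    // about byte width and unsigned int width do not hold.
    static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");
    static_assert(sizeof(unsigned int) * CHAR_BIT >= 24,
                  "code assumes unsigned int holds at least 24 bits");

    int main()
    {
        return 0;
    }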
    Last edited by grumpy; 07-02-2011 at 11:28 PM.

  9. #9
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    You can do the math if you want. The shift is some multiple of CHAR_BIT, and the hex constants can also be generated by shifting CHAR_BIT by the right constant. Still, assuming CHAR_BIT == 8, (CHAR_BIT << 5) - 1 is 255, which would mask the most significant or least significant byte. If you're targeting an architecture where such assumptions can't be made, you have to know enough about it to do math like that.
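
    A sketch of what that might look like, deriving the shift amounts from CHAR_BIT and using UCHAR_MAX from <climits> as the per-byte mask instead of a hard-coded 0xff (unsigned int is still assumed to be wide enough to hold the value):
    Code:
    #include <climits>
    #include <iostream>

    int main()
    {
        unsigned int i = 0x101112;
        unsigned char c[3];
        const unsigned int byte_mask = UCHAR_MAX;   // 0xff when CHAR_BIT == 8

        // Peel the bytes off from most significant to least significant.
        c[0] = (i >> (2 * CHAR_BIT)) & byte_mask;
        c[1] = (i >> (1 * CHAR_BIT)) & byte_mask;
        c[2] = i & byte_mask;

        std::cout << std::hex
                  << (unsigned int) c[0]
                  << (unsigned int) c[1]
                  << (unsigned int) c[2] << '\n';   // prints 101112
        return 0;
    }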
    Last edited by whiteflags; 07-02-2011 at 11:48 PM. Reason: typos

