C++ and memory

**Mario F.** · 06-07-2006

I need help understanding certain aspects of memory management a little better.

1. Why isn't bool 1 bit? (Possibly related to 2.)

2. All the types I pass to sizeof() seem to be 4 bytes or larger. 32 bits seems to be the minimum. Is this what defines 32 bit machines?

3. Simple classes with just data members don't occupy exactly the sum of their data members' sizes in memory. Is there an overhead to a class type?

4. If so, is this a known fixed value? My experiments seem to show a variable overhead.

5. What is the exact formula to calculate the number of significant digits of integral and floating-point types?

**ChaosEngine** · 06-07-2006

First, all sizes of types (bool, char, int, etc.) are platform dependent. The standard doesn't actually specify the exact sizes anywhere, merely the relative sizes (i.e. sizeof(char) <= sizeof(short) <= sizeof(int), and I'm not even sure that it's <= and not <).
All my answers refer to the most common case of win32.

1. Because it's more computationally expensive to access a single bit than a byte. In the standard library, vector<bool> is actually a specialisation that stores each bool as a bit. Whether this is a good thing is a matter of some debate.

2. sizeof(char) on win32 should return 1. sizeof(short) should return 2. Can you post code?

3 and 4. C++ allows structs and classes to be aligned on certain boundaries. This is usually the size of the largest built-in (POD) member. So

Code:

struct X
{
   int a;
   char b;
};

will be 8 bytes on win32. You can override this in the compiler settings or by using #pragma pack. Also, you cannot instantiate a class of zero size. So

Code:

class empty
{};

will be at least 1 byte. This doesn't apply when the class is used as a base class: there the compiler may give it no storage at all (often known as the "empty base class optimisation").

5. can't remember off the top of my head.
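For points 3 and 4, here is a minimal sketch of how the padding and the empty-class rule could be checked at compile time. `Empty`, `Derived` and `padding_in_X` are hypothetical names used only for illustration, and the exact sizes depend on the compiler and target, so only properties that should hold everywhere are asserted.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Same layout as the struct X in the post above.
struct X {
    int  a;
    char b;   // followed by padding so sizeof(X) is a multiple of alignof(X)
};

// Hypothetical types, for illustration only.
struct Empty {};

struct Derived : Empty {   // empty base: the compiler may give it no storage
    int value;
};

// Padding bytes in X = total size minus the sum of the member sizes.
constexpr std::size_t padding_in_X() {
    return sizeof(X) - (sizeof(int) + sizeof(char));
}

// These should hold on any conforming implementation.
static_assert(sizeof(Empty) >= 1, "a complete object is never zero-sized");
static_assert(sizeof(X) % alignof(X) == 0, "size is a multiple of the alignment");
static_assert(sizeof(X) >= sizeof(int) + sizeof(char), "all members must fit");
static_assert(offsetof(X, a) == 0, "first member sits at offset 0");
```

On a typical win32 build `padding_in_X()` would be 3 (8 bytes total minus 4 + 1), and `sizeof(Derived)` would commonly equal `sizeof(int)`, but neither number is guaranteed by the language.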

**Mario F.** · 06-07-2006

This is definitely one of the areas of computing I know least about.

You are right, ChaosEngine. My bad. char and short are 1 and 2 bytes respectively. I didn't think of testing these.

As for classes... I found a pattern after you mentioned they are aligned on a boundary. They seem to always be multiples of 4 bytes. That helped too, thanks.

**ChaosEngine** · 06-07-2006

Have a look in Project Properties -> C/C++ -> Code Generation -> Struct Member Alignment.
That lets you change the alignment (that's for VS2003, but I think it's the same for 2005).
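The same effect can be had per struct with #pragma pack, as mentioned earlier in the thread. A rough sketch, assuming a compiler that supports the pragma (MSVC, GCC and Clang do; it is an extension, not standard C++). `PackedX` is a hypothetical name.

```cpp
// Default layout: the compiler pads X up to a multiple of its alignment.
struct X {
    int  a;
    char b;
};

// Same members with the packing limit set to 1 byte.
// #pragma pack is a compiler extension, not part of standard C++.
#pragma pack(push, 1)
struct PackedX {   // hypothetical name, for illustration only
    int  a;
    char b;
};
#pragma pack(pop)   // restore the previous packing for later declarations

static_assert(sizeof(PackedX) == sizeof(int) + sizeof(char),
              "no padding when packed to 1 byte");
static_assert(sizeof(PackedX) <= sizeof(X),
              "packing never makes the struct larger");
```

The trade-off is that members of a packed struct may end up misaligned, which can be slower on x86 and may fault on some other architectures, so it is usually reserved for file formats and wire protocols.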

**laserlight** · 06-07-2006

2. sizeof(char) on win32 should return 1.

sizeof(char) is always 1, as is sizeof(unsigned char) and of course sizeof(signed char).
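A small sketch of what that means in practice: sizeof counts in units of char, and the number of bits in a char is CHAR_BIT from <climits>. `bits_in` is a hypothetical helper, not a standard function.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// sizeof measures in units of char, so these are true by definition.
static_assert(sizeof(char) == 1, "char is the unit of sizeof");
static_assert(sizeof(signed char) == 1, "same for signed char");
static_assert(sizeof(unsigned char) == 1, "same for unsigned char");

// A byte (char) has at least 8 bits; it may have more on unusual platforms.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Hypothetical helper: storage size of T in bits (including any padding bits).
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}
```

For example, `bits_in<int>()` would give 32 on win32, while `bits_in<char>()` is `CHAR_BIT` everywhere.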

**Dave_Sinkula** · 06-07-2006

The standard doesn't actually specify the exact sizes anywhere, merely the relative sizes (i.e. sizeof(char) <= sizeof(short) <= sizeof(int), and I'm not even sure that it's <= and not <)

Because zen is to discover that it is about ranges of values, not about sizes of objects containing a range of values.
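The ranges, and the significant digits asked about in question 5, can be queried through std::numeric_limits rather than worked out by hand. The sketch below also shows the usual rule of thumb, digits10 = floor(bits * log10(2)), using 30103/100000 as a rational approximation of log10(2). `decimal_digits` and `digits10_from_bits` are hypothetical helpers, and the floating-point check assumes a binary (radix 2) representation.

```cpp
#include <limits>

// digits   : number of radix digits (bits, for binary types) in the value,
//            not counting the sign bit.
// digits10 : number of decimal digits that survive a round trip through T.
template <typename T>
constexpr int decimal_digits() {
    return std::numeric_limits<T>::digits10;
}

// Hypothetical helper: floor(bits * log10(2)), with log10(2) ~ 30103/100000.
// The approximation is only meant for ordinary bit counts.
constexpr int digits10_from_bits(int bits) {
    return static_cast<int>(bits * 30103L / 100000L);
}

// Minimum ranges and precisions the standard asks for.
static_assert(std::numeric_limits<unsigned char>::max() >= 255, "");
static_assert(std::numeric_limits<int>::max() >= 32767, "");
static_assert(std::numeric_limits<float>::digits10 >= 6, "");
static_assert(std::numeric_limits<double>::digits10 >= 10, "");

// Integers: apply the rule to the value bits.
static_assert(digits10_from_bits(std::numeric_limits<int>::digits)
                  == std::numeric_limits<int>::digits10, "");

// Binary floating point: apply the rule to (mantissa bits - 1).
static_assert(std::numeric_limits<float>::radix != 2 ||
              digits10_from_bits(std::numeric_limits<float>::digits - 1)
                  == std::numeric_limits<float>::digits10, "");
```

With a 32-bit int this gives 9 decimal digits (31 value bits), and with the common 24-bit float mantissa it gives 6.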

**VirtualAce** · 06-07-2006

Intel x86 architecture was designed to be byte-accessible. The smallest data type available is a byte or 8 bits. Because of this it takes some shifting and/or masking to perform operations on data which is less than 8 bits in size.

As has been said, this additional overhead would be more expensive than just accessing a byte. And since memory is so widely available now even using 32 bits for a bool would be better than using 8.

Look up memory alignment on Google and consult the Intel Tech Refs concerning the internal architecture of their CPUs.
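To make the shifting and masking concrete, here is a minimal sketch of reading and writing one bit inside a byte. `set_bit` and `get_bit` are hypothetical helpers, and `index` is assumed to be in the range 0 to 7.

```cpp
// Hypothetical helpers, for illustration only. index must be 0..7.

// Returns a copy of byte with bit `index` set to `value`.
constexpr unsigned char set_bit(unsigned char byte, unsigned index, bool value) {
    return value
        ? static_cast<unsigned char>(byte | (1u << index))    // OR the bit in
        : static_cast<unsigned char>(byte & ~(1u << index));  // mask the bit out
}

// Returns true if bit `index` of byte is set.
constexpr bool get_bit(unsigned char byte, unsigned index) {
    return ((byte >> index) & 1u) != 0;   // shift down, then mask
}
```

Every access to a packed bit pays for a shift and a mask on top of the byte load or store, which is the extra cost being described; a plain bool needs only the load or store.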

**Mario F.** · 06-08-2006

Ok. Thanks. I'm doing some research on this and now it's starting to make sense to me.

**DougDbug** · 06-08-2006

Is this what defines 32 bit machines?

YES! Every physical address in RAM contains 32 bits of data.

At the hardware level, it's like this:

Say your PC has a 32-bit data bus and a 32 bit address bus. This means there are 32 connections (or "wires") for each bus.

Your computer's memory essentially has 66 connections. There's a read line, a write line, 32 address lines, and 32 data lines. (plus power, refresh-clocks, etc.)

If you want to read an address, you set-up the address bus with a bit-pattern that represents the binary address.

Address 1 would look like this:
0000 0000 0000 0000 0000 0000 0000 0001

(I put spaces between each set of 4-bits to make it easier to read. This is common practice.)

Ones are equal to 3-5 volts, and zeros are about zero volts. So, in the above example, the address line that connects to bit 0 (the rightmost, least significant bit) has a voltage, and all of the other lines are at zero volts.

When you activate the read line, usually by bringing it down to zero volts (this is called "active low", just to confuse the issue), the data from the associated address is placed on the data bus.... ALL 32 LINES. Every address location contains 32 bits, and every time you activate the read line, 32 bits of data are placed on the data bus.

But like Bubba said, it is possible to "pack" 4 bytes into a 32-bit memory address. And, this can all be done behind the scenes by the BIOS, Operating System, and compiler. With virtual addressing, this can all be hidden from you and your C++ program. ...You can use pointer arithmetic etc., as if the data was not packed. ...In your program, each char will appear to have its own address.

You don't have to worry about this hardware shtuff... That's for Assembly-language programmers and compiler writers to worry about!
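From the C++ side, the point that each char appears to have its own address can be checked directly. A small sketch, assuming the platform provides std::uint32_t; `rebuild_from_bytes` is a hypothetical helper that walks a 32-bit value one byte at a time and reassembles it.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uint32_t
#include <cstring>   // std::memcpy

// Hypothetical helper: copies a 32-bit value byte by byte through an
// unsigned char pointer, then rebuilds the value from those bytes.
inline std::uint32_t rebuild_from_bytes(std::uint32_t word) {
    // Any object may be inspected as a sequence of unsigned char.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&word);

    unsigned char copy[sizeof word];
    for (std::size_t i = 0; i < sizeof word; ++i) {
        copy[i] = p[i];   // p + i is a distinct, valid address for each byte
    }

    std::uint32_t rebuilt = 0;
    std::memcpy(&rebuilt, copy, sizeof rebuilt);
    assert(rebuilt == word);   // the bytes carry the whole value
    return rebuilt;
}
```

Whatever width the hardware uses to move data to and from RAM, the program sees consecutive byte addresses, so the round trip gives back the original value regardless of byte order.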

**vaibhav** · 06-08-2006

Why is DOS called 16-bit?
Perhaps because it has 16-bit (2-byte) pointers.

Then Win XP should have 64 bit (8 bytes) pointers
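Rather than guessing, the pointer size of a given build can simply be measured. A tiny sketch; `pointer_bits` is a hypothetical helper, and the result reflects the target the program was compiled for, not the name of the operating system.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Hypothetical helper: width of an object pointer in bits for this build.
constexpr std::size_t pointer_bits() {
    return sizeof(void*) * CHAR_BIT;
}
```

A 32-bit Windows build would typically report 32 here and a 64-bit build 64.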
