# Thread: Byte confusion!!! HELP!

1. ## Byte confusion!!! HELP!

Okay, Bytes and Bits were things I learned about when I was 12... but, now, I feel like I have to learn them all over again.

A byte, to me, is 8 bits... and one byte is usually one single character, like a "1" or "A". I know these are facts.

The INT type allocates 1-2 bytes, I believe; it depends on the machine.

My first question is:

Why does allocated memory vary between different machines? For example, INT may reserve 1 byte on my machine and 2 bytes on another.

My second question is:

For now, let's assume INT reserves 1 byte. If one byte is equal to 1 character, then why can variables declared by type INT store 5 digit numbers? That would be 5 bytes, correct? This isn't making much sense... I read that unsigned INT variables can hold up to 65,000, or somewhere around that.

2. The size of primitive types (in particular integer types) varies from machine to machine primarily because register sizes vary on different machines, and the underlying assembly is designed to work with specific sizes of data (most Windows machines have 4-byte registers, and hence 4-byte ints). It's more efficient that way (though it can be a real pain). Floating point types often conform to certain standards, though. Also, a byte does not have to be 8 bits on a given machine... it just generally is.

As for the characters - a character is stored (generally as a byte) in accordance with some table. Often, it's the ASCII standard. Note that '0' != 0, '1' != 1, etc... The char type can actually store quite a few values (256 in the case of 8-bit bytes).

3. Ints vary from 16 bits to 32 bits. Older machines have the short 16 bit ints, and most new ones have long 32 bit ints. Soon there will even be 64 bit ints!

The reason it varies, I believe, is that processors previously could only work with 16 bits at a time, i.e., the 16-bit processors. It is more efficient to work with 32 bits, so newer processors now use 32 bits. The type int was changed to a 32-bit default. If you are really concerned about it, you can specify long or short in the declaration:

Code:
short int a; // at least 16 bits (the standard only guarantees minimums)
long int b;  // at least 32 bits
For now, lets assume INT reserves 1 byte. If one byte is equal to 1 character, then why can variables declared by type INT store 5 digit numbers? That would be 5 bytes, correct? this isn't making much sense... I read that unsigned INT variables can hold up to 65,000, or somewhere along that.
Are you familiar with the binary number system? You can represent the values 0-255 by using 8 bits, or 1 byte. 2 bytes, or 16 bits, can represent 0-65535. n bits can represent 0 through (2^n)-1. Numerical types in programming languages are represented in binary form rather than as individual ASCII chars. You COULD use a character string to represent a number, but seeing as all the functions would only accept an actual numerical type, you would have to keep converting back and forth.

I hope this helps. If you're still confused, read up on the binary number system or something.

4. So, it's just a combination of binary code? For example, 2 bytes would calculate 255^2, which would equal 65025, which in turn is the max number 2 bytes can hold?

If I'm wrong, maybe you can tell me how you calculate the max number each byte can hold. Thanks!

5. ## Close...

16 bits can hold 0 thru 65535 = (2^16)-1 . (or FFFF hex) That means that it can hold 65536 = 2^16 different values, including zero... Sometimes you do count zero... If it's a 16 bit address, you can access address zero.
[EDIT2] [Corrected above... Thanks Cat and grib. ]

nybble = 4 bits
byte = 8 bits
word = 16 bits (or more... i.e "16 bit word" or "32 bit word")

As stated above, the actual number of bits for a type-char, or type-int is system-dependent. The C++ standard specifies a minimum size for each type. A type-char will always be at least 8 bits.

In C++, hexadecimal is usually used when working with bits and bit-manipulation. It's easy (for humans) to convert between hex and binary, and hex is easier to use than binary in C++. (You can cin or cout in hex, but not directly in binary.)

[EDIT]
Here's a link to an ASCII table. As you can see, an ASCII 1 (expressed as '1' in C++) has a decimal value of 49, 31 hex, or 0011 0001 binary. An ASCII A has the decimal value 65.

And, FYI - The Windows calculator can convert between decimal, hex, octal, and binary: Start/Programs/Accessories/Calculator/View/Scientific/(Hex etc.)

[EDIT3]
Note that the terms nybble, byte, and word are jargon. They have no meaning in the C++ language (no meaning to the compiler), unlike int and char which do have meaning in C++.

6. ## Re: Close...

Originally posted by DougDbug
16 bits can hold 0 thru 65535 = 2^16. (or FFFF hex)
Should be "16 bits can hold 0 thru 65535 = (2^16) - 1".

In general, you can represent 2^n different numbers with n bits. Typically, these are in the range of 0 ... (2^n) - 1, for unsigned numbers, or -(2^(n-1)) ... +2^(n-1) -1 for signed numbers.

E.g. for n = 8, we see:

Unsigned: 0 ... (2^8) - 1 = 0 ... 255
Signed: -(2^7) ... +2^7 - 1 = -128 ... 127

Note that in both cases, there are 256 (2^n) different possible values.

BTW, this is assuming that signed integers use two's complement, which is typical on modern machines.

7. Just remember there are 1 types of people in the world, those who start counting from zero and those who don't.

8. Hey, my Intro to CS class recently went over stuff like this. The lecture notes are online at http://www.mcs.drexel.edu/~introcs/Fa03/. Click on Lectures and then check out the Lectures entitled Binary Numbers and Floating Point Numbers. I think those will help. Also, Windows built-in calculator can do binary math including bitwise operations.

Soon there will even be 64 bit ints!
Why wait when you can buy a G5?

9. Originally posted by joshdick
Why wait when you can buy a G5?
Because Macs should burn.

*Runs away before Mac VS. PC flame starts .*

10. Apples are made to eat, not process

11. It should be noted that the assembly language data types for the Intel 8086 family of processors have NEVER been changed. Just because an int in 16-bit C is 16 bits and in 32-bit C it is 32 does not mean that WORD is ever more than 16 bits.

A WORD is always 16 bits. The instruction rep stosw will always store (E)CX number of WORDs.

So in protected mode the assembly types do not change. It is just that in C an integer is actually a DWORD, or 32 bits long. Intel will not change this in the future. There is already a 64-bit data type, which is the QWORD. Future 64-bit processors will fetch QWORDs at a time instead of DWORDs. Intel has been very good about not changing anything about the x86 data types from processor to processor.

On Intel/AMD the following are always true no matter what generation of processor - only exception is that prior to the FPU the QWORD type was not available AFAIK.

BYTE - 8 bits (unsigned char in C - always)
WORD - 16 bits (varies in C depending on RM or PM)
DWORD - 32 bits (varies in C depending on RM or PM)
QWORD - 64 bits (doubles in C - always)

So my only guess is that the new 64-bit instructions will have such things as rep stosq, which would store 64 bits at a time. I have the 64-bit Itanium processor reference books but I haven't really delved into them yet. But again, I can't stress enough that Intel will not change the current set - they will simply add more functionality to it - as has been the case with SSE2, MMX, and the FPU.

And yes, it's true that no matter what you have in memory and no matter what mode you are in, the processors currently always fetch 32 bits even if you are only using 8 of them. Which is why it is good practice to write your code in such a way as to take advantage of this - otherwise you are wasting cycles, be it in C (depending on compiler optimizations) or in assembly.

Do an experiment if you have Win98. Load a text file into EDIT in a DOS box and scroll down. Very fast. Now reboot in DOS mode and do the same - huge difference. You can see the result of wasted cycles and poor memory usage.
