Thread: 16-Bit and Signed Characters?

  1. #1
    Registered User
    Join Date
    Aug 2007
    Posts
    2

16-Bit and Signed Characters?

Ok, I am reading Ivor Horton's Beginning Visual C++ 2005. Great book, by the way.

Anyway, I'm reading up on signed and unsigned types in the section on data types and variables, and I have a few questions.

What is the difference between signed and unsigned data types? What does it have to do with me (The Learner)?

Also, what does it mean when people put the "L" in front of variables? For instance....
Code:
wchar_t letter = L'Z';
I know it means that it is a 16-bit character code value, but can someone explain in greater detail what 16-bit character code values are?

    Thanks.

  2. #2
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Quote Originally Posted by Zoiked View Post
Ok, I am reading Ivor Horton's Beginning Visual C++ 2005. Great book, by the way.

Anyway, I'm reading up on signed and unsigned types in the section on data types and variables, and I have a few questions.

What is the difference between signed and unsigned data types? What does it have to do with me (The Learner)?
Well, pretty much what you read. An unsigned value cannot be negative. Unsigned integers have more well-defined math as well: unsigned arithmetic is guaranteed to wrap around, while signed overflow is not defined.
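For instance (a quick sketch, assuming the usual 32-bit unsigned int):
Code:
#include <iostream>
#include <limits>

int main()
{
    unsigned int u = std::numeric_limits<unsigned int>::max();
    std::cout << u << '\n';   // 4294967295 for a 32-bit unsigned int
    ++u;                      // wraps to 0 -- guaranteed, unsigned math is modular
    std::cout << u << '\n';   // 0

    int i = -1;               // signed types can hold negative values
    unsigned int n = -1;      // converts to the largest unsigned value, not "minus one"
    std::cout << i << ' ' << n << '\n';
}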

    Quote Originally Posted by Zoiked View Post
Also, what does it mean when people put the "L" in front of variables? For instance....
Code:
wchar_t letter = L'Z';
I know it means that it is a 16-bit character code value, but can someone explain in greater detail what 16-bit character code values are?
Without regard to the underlying implementation, a wide character may be assigned a wide character literal, much as an ordinary character may be assigned an ordinary character literal. How do you tell the difference between a wide character literal and an ordinary character literal? The wide character literal has the "L" in front.
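To illustrate the difference (nothing compiler-specific here):
Code:
char          narrow = 'Z';     // ordinary character literal
wchar_t       wide   = L'Z';    // wide character literal -- note the L prefix
const char    *s     = "text";  // ordinary string literal
const wchar_t *ws    = L"text"; // wide string literal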
    7. It is easier to write an incorrect program than understand a correct one.
    40. There are two ways to write error-free programs; only the third one works.*

  3. #3
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
What is the difference between signed and unsigned data types? What does it have to do with me (The Learner)?
You need to know the difference because the ranges of values each type can hold differ vastly. Since one bit is used as the sign bit, signed types have a different range of values than unsigned ones.

When an unsigned value goes above the top of its range it wraps to 0; when it goes below 0 it wraps to the top of the range.

When a signed value goes above the top of its range it wraps to the bottom, the smallest negative value. When it goes below the smallest negative value it wraps to the largest positive value.

    For instance a signed char:

    -128-1=127
127+1=-128

    A signed short:

    -32768-1=32767
    32767+1=-32768

    An unsigned char:
    255+1=0
    0-1=255

    An unsigned short:
    65535+1=0
    0-1=65535

So if you are testing values in an if statement, it makes no sense to check a value<0 condition for an unsigned variable, since that can never be true. Data types are extremely important in C/C++.
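A quick demonstration (output assumes the usual 8-bit char and 16-bit short, as on x86):
Code:
#include <iostream>

int main()
{
    unsigned char uc = 255;
    ++uc;                                      // wraps to 0
    std::cout << static_cast<int>(uc) << '\n'; // 0

    unsigned short us = 0;
    --us;                                      // wraps to 65535
    std::cout << us << '\n';                   // 65535

    if (us < 0)                                // can never be true for an unsigned type;
        std::cout << "never printed\n";        // most compilers will warn about this
}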

I think those values are right. Anyway, x86 platforms use two's complement representation for negative integral values.

An interesting thing that has to do with data types happened to me once in SimCity 2. They were using a signed long to represent the money you had in the treasury. Well, I made so much money that the data type wrapped. This took me from having tons of money in the treasury to having a huge debt instantaneously. Of course I could never recover from that serious debt, so I went bankrupt. The tech support people got a laugh out of that one.
    Last edited by VirtualAce; 08-27-2007 at 02:49 AM.

  4. #4
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Actually, signed overflow is undefined behaviour. It's just that the concrete effect on most platforms is a wraparound (since they use plain 2's complement integers), but I believe there are a few platforms where signed overflow traps. And that, in turn, means your program will probably crash.
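If you need signed addition that cannot overflow, test before doing the math rather than after; a minimal sketch:
Code:
#include <limits>

// True if a + b would overflow an int. The checks themselves stay
// inside the representable range, so no undefined behaviour occurs.
bool add_would_overflow(int a, int b)
{
    if (b > 0 && a > std::numeric_limits<int>::max() - b) return true;
    if (b < 0 && a < std::numeric_limits<int>::min() - b) return true;
    return false;
}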


    The L prefix (for character and string literals, not variables) means that the extended execution character set is to be used. What does that mean?
    Well, C++ knows three character sets. The basic source character set consists of the core characters needed for C++ code: the alphabetic letters, the digits, and all the special signs in operators.
    The basic execution character set is an implementation-defined superset of the BSCS. That means that its exact contents, as well as the mapping of numeric values to members, is up to the compiler, but must be documented. The BECS must include the NUL character. A char must be large enough to hold any member of the BECS. This means that the BECS must be a single-byte encoding.
By far the most common BECS is ASCII or a superset (like Latin-1 or Windows-1252), but EBCDIC has been used occasionally.
    Normal string and character literals are translated to this encoding.
The extended execution character set is a superset of the BECS, but it need not have the same numeric mappings, except for the BSCS part. (In other words, casting a char holding a member of the BSCS to wchar_t is safe. I'm pretty sure of this, but not 100%.) The contents of the EECS are again implementation-defined.
By far the most common is ISO-10646, Unicode, which as of Unicode 5.0 consists of over 90,000 characters.
The type wchar_t is required to be able to hold any member of the EECS in a single unit. This means that, if MS claims that their EECS is Unicode (anything beyond version 2, I think), they are non-conforming, because their 16-bit wchar_t cannot hold every member. The UTF-16 encoding VC++ uses for wide literals is not a single-unit encoding. This is something they're probably perfectly aware of and willing to accept, as the advantages are considerable, considering that Windows uses UTF-16 internally.
    GCC on Linux, on the other hand, has a 32-bit wchar_t perfectly capable of holding every Unicode character. The kernel, depending on the configuration, typically uses UTF-8 internally, so UTF-16 provides little advantage.
    Wide character and string literals - those prefixed with an L - are translated to the EECS and use wchar_t as the underlying type.
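You can check the platform difference yourself; expect 2 on VC++ and 4 on GCC/Linux:
Code:
#include <iostream>

int main()
{
    std::cout << sizeof(wchar_t) << '\n';   // 2 on VC++, 4 on GCC/Linux
    const wchar_t *ws = L"wide";            // wide string literal, translated to the EECS
    std::cout << (sizeof(L'Z') == sizeof(wchar_t)) << '\n'; // 1: wide literals have type wchar_t
    (void)ws;
}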
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  5. #5
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
Integral data types wrap on all x86 platforms. Floating point types don't actually wrap; under IEEE 754 an overflowing result goes to infinity instead. I'm not sure about other platforms.
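A quick check (assuming IEEE 754 floating point and the usual type sizes):
Code:
#include <iostream>
#include <limits>

int main()
{
    unsigned int u = std::numeric_limits<unsigned int>::max();
    std::cout << u + 1u << '\n';    // 0 -- integral types wrap

    float f = std::numeric_limits<float>::max();
    std::cout << f * 2.0f << '\n';  // inf -- floats saturate, they don't wrap
}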

  6. #6
    Registered User
    Join Date
    Jul 2006
    Posts
    162
    '-' is a sign, a negative sign. it represents a negative number. hence "signed." clear?

    a 16 bit value can hold all the numbers that 16 bits of data can physically represent. cool?

the machine knows a number is negative if we format that 16 bit data space to allow one of the bits to represent the presence or absence of a "sign." it's a value that's either "on" or "off", meaning "negative number" or "not negative number."

the thing is, we lose 1 of the bits that could have given the potential for a higher value, so our maximum value is -1 less now.

i'll use 4 bits for this example. let's say a 4 bit unsigned int has a maximum value of 255, 0 to 255 is 256 possible values.

now we want to use a signed 4 bit int, well then you'd assume at first that we divide that number by two to get even halves of negative and positive numbers, -128 to 128, no, we used one of the bits for the sign!

    the fact that a sign exists must be kept track of BY the INT data, so our range is

    -127 to 128

    the negative bit that would have represented -128 was used for the minus sign, so it's -127.

this works exactly the same way for any N bit data type, where N is the number of bits used to represent the integer.


you as 'the learner' (no clue why you emphasize this) need to memorize all of these concepts; this knowledge IS the toolkit of this trade.
    Last edited by simpleid; 08-27-2007 at 08:18 AM.

  7. #7
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
i'll use 4 bits for this example. let's say a 4 bit unsigned int has a maximum value of 255, 0 to 255 is 256 possible values.
    Um, no, let's not. 4 bits have 16 values.
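In general, N bits give 2^N distinct patterns:
Code:
#include <iostream>

int main()
{
    // 4 bits -> 16, 8 bits -> 256, 12 bits -> 4096, 16 bits -> 65536
    for (int n = 4; n <= 16; n += 4)
        std::cout << n << " bits -> " << (1u << n) << " values\n";
}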
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  8. #8
    Registered User
    Join Date
    Feb 2006
    Posts
    312
Cornedbee pointed out one mistake, but you seem to be confused about rather a lot more.
    Quote Originally Posted by simpleid View Post
the thing is, we lose 1 of the bits that could have given the potential for a higher value, so our maximum value is -1 less now.
Be specific, or else you end up being misleading. Losing a bit to the sign reduces the maximum positive value by exactly half of the range.
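Worked out for 8 bits:
Code:
// 8 bits, unsigned:          max = 2^8 - 1 = 255
// 8 bits, two's complement:  max = 2^7 - 1 = 127  (the top bit now carries the sign)
// The maximum drops by 128 -- exactly half of the 256-value range.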

we divide that number by two to get even halves of negative and positive numbers, -128 to 128, no, we used one of the bits for the sign!

    the fact that a sign exists must be kept track of BY the INT data, so our range is

    -127 to 128

    the negative bit that would have represented -128 was used for the minus sign, so it's -127.
    Which signed integer representation are you talking about here? I can't think of any which work like that.
    - For two's complement, an 8-bit signed integer has range -128 through 127,
    - For one's complement, an 8-bit signed integer has range -127 through 127.

I assume you're talking about two's complement and got mixed up, although you should be aware that there's no standard way to represent signed integers in binary; it just happens that most modern machines use two's complement. I strongly suggest you read up on negative integer representations.
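You can inspect the representation your own machine uses; on a two's complement machine this prints all ones for -1 (a sketch assuming an 8-bit char):
Code:
#include <bitset>
#include <iostream>

int main()
{
    signed char sc = -1;
    std::cout << std::bitset<8>(static_cast<unsigned char>(sc)) << '\n'; // 11111111 (two's complement)
    sc = -128;
    std::cout << std::bitset<8>(static_cast<unsigned char>(sc)) << '\n'; // 10000000
}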

you as 'the learner' (no clue why you emphasize this) need to memorize all of these concepts; this knowledge IS the toolkit of this trade.
Better still, understand it; then you won't need to memorise a thing.
