Thread: Signed Char Overflow

  1. #1
    Registered User
    Join Date
    Nov 2006
    Posts
    65

    Signed Char Overflow

    Hello everyone,


    I just started learning C and don't quite understand whether you can simply store a non-ASCII character (a single-byte character with a decimal value above 127) in a variable defined as a char. On my OS, the default char is signed. However, getchar() and similar functions seem to return a non-negative integer, possibly exceeding 127 (apart from the negative EOF value).

    Section 6.3.1.3 of the standard (latest version ISO/IEC 9899:1999) seems to say that the handling of this overflow:
    Code:
     char c = 170; /* char is by default signed on this OS causing an overflow */
    is implementation-defined or an implementation-defined signal is raised.

    In practice, the overflow seems not to cause much trouble on my computer, as the original value (0-255) can be recovered with a cast, like I did in the code below. However, I have no idea whether other (possibly smarter) processors might convert +170 to +127 instead of some negative value. So, I would like to know whether this overflow should just be ignored, and is handled reasonably portably; or if getchar() and similar functions can only be used with ASCII files; or if the char has to be explicitly defined as an unsigned char? The latter causes other problems though, because most standard functions expect a char, not an unsigned char, right? Furthermore, is there ever a possibility that an integer or float/double overflow overwrites adjacent memory, instead of just rolling over or staying at MIN or MAX?

    Thanks for reading! Below is just some short sample code.

    Code:
    #include <stdio.h>
    
    int main(void) {
      char c;
      int i;
      printf( "Enter one character: " );
      i = getchar();
      c = i;
      printf( "signed c equals: %d\t unsigned c equals: %d\n", c, (unsigned char)c );
      c = 170; /* generates compiler overflow warning */
      printf( "signed c equals: %d\t unsigned c equals: %d\n", c, (unsigned char)c );
          /* prints signed c equals: -86    unsigned c equals: 170 */
      return 0;
    }

  2. #2
    Registered User
    Join Date
    Jun 2006
    Posts
    75
    > c = 170; /* generates compiler overflow warning */

    It's not an overflow: you're assigning an out-of-range value.

    You're assigning a value in the range 0..UCHAR_MAX to a variable that can only represent the CHAR_MIN..CHAR_MAX range. This will work correctly on most implementations (like yours), but is not guaranteed by the standard. (According to the standard (as you've mentioned), either an implementation defined value is assigned to c or an implementation defined signal is raised (signals are specific to C99).)

    >So, I would like to know whether this overflow should just be ignored, and is handled
    >reasonably portably; or if getchar() and similar functions can only be used with ASCII
    >files; or if the char has to be explicitly defined as an unsigned char? The latter causes
    >other problems though, because most standard functions expect a char, not an unsigned
    > char, right?

    I believe that this "problem" with getchar() and other functions can be ignored, as it will work properly on most systems. It's a good idea to write code as standard and as portable as possible, but sometimes it's not worth the effort.

  3. #3
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >I just started learning C and don't quite understand whether you can simply
    >store a non ASCII character (single byte character with a decimal value above 127) in a variable defined as a char.
    No. Vanilla char can be either signed or unsigned, so you can't assume either. Therefore, you have to restrict yourself to the values common to both ranges: [0..SCHAR_MAX], which is only guaranteed to cover 0 to 127. If you expect to exceed that range in any way, shape, or form, use a more appropriate type.

    >However, getchar() or similar functions seem to return a positive integer, possibly exceeding 127 (except the negative EOF value).
    To be specific, while getchar returns an int type, the range it's required to adhere to is that of an unsigned char, or EOF. So to be strictly correct, you would always test for EOF, then assign the value to an unsigned char. Alternatively, most compilers have a setting where you can force the vanilla char to be unsigned.

    >Section 6.3.1.3 of the standard (latest version ISO/IEC 9899:1999) seems to say
    I love you.
    My best code is written with the delete key.

  4. #4
    Registered User
    Join Date
    Mar 2006
    Posts
    725
    Let's address some stuff here.

    Arithmetic overflow should never overwrite memory you don't own. When a processor performs a calculation on a machine word it typically loads the word into a register and loads its contents into special calculating hardware. No other registers, cache lines or RAM chunks are directly affected.


    You could design a "smart" processor which handles overflows, you say. But this "smart" processor would also be an awfully slow and unwieldy processor because of the extra routines involved.


    >I love you.
    Yeah, well, some of us don't really seem to have the, uh, shillings for a copy.
    I just passed by a web store offering a copy for nearly $300 0_o hooray for WBM.
    Code:
    #include <stdio.h>
    
    void J(char*a){int f,i=0,c='1';for(;a[i]!='0';++i)if(i==81){
    puts(a);return;}for(;c<='9';++c){for(f=0;f<9;++f)if(a[i-i%27+i%9
    /3*3+f/3*9+f%3]==c||a[i%9+f*9]==c||a[i-i%9+f]==c)goto e;a[i]=c;J(a);a[i]
    ='0';e:;}}int main(int c,char**v){int t=0;if(c>1){for(;v[1][
    t];++t);if(t==81){J(v[1]);return 0;}}puts("sudoku [0-9]{81}");return 1;}

  5. #5
    Registered User
    Join Date
    Nov 2006
    Posts
    65
    Thanks for the help all.

    >or an implementation defined signal is raised (signals are specific to C99)
    I checked whether my system does indeed raise SIGFPE on the out-of-range assignment (that seems like the most appropriate signal). However, probably for performance reasons, as jafet said, no signal is raised.
    Code:
    #include <stdlib.h>
    #include <signal.h>
    #include <stdio.h>
    
    void handle_signal(int signal_number) {
      printf( "Caught a Floating Point Error signal! Signal number: %d\n", signal_number );
      exit(signal_number);
    }
    
    int main(void) {
      char c;
      int i;
      if(signal(SIGFPE, handle_signal) == SIG_ERR) {
        printf( "Could not set signal handler\n" );
      }
      printf( "Enter one character: " );
      i = getchar();
      c = i;
      return 0;
    }
    >Alternatively, most compilers have a setting where you can force the vanilla char to be unsigned.
    Thanks for the hint. I use gcc, so -funsigned-char would do the trick. But the problem is that my program (the one I will write once I finish learning XS) will take input from Perl and from a file and then use the MySQL C API to transfer the data to a MySQL DB. The API requires linking with another library, which is already compiled with the OS default char, I assume. The documentation states that text data and binary data should use the char data type.

    But, I think that this should be easily solvable, considering your replies. If I initially transfer the data to an unsigned char and then type cast the final pointer to a normal (char *), before using it with the API, all should be well.

    >Arithmetic overflow should never overwrite memory you don't own.
    Thanks for confirming this; good to hear.

    >>I love you.
    >Yeah, well, some of us don't really seem to have the, uh, shillings for a copy. I just passed by a web store offering a copy for nearly $300 0_o hooray for WBM.
    lol
    I got "the working paper" from here:
    http://www.open-std.org/JTC1/SC22/WG14/

  6. #6
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >Yeah, well, some of us don't really seem to have the, uh, shillings for a copy.
    The draft documents are freely available and the electronic copy is $18. There's a book from Wiley that's $60, and the official hardcopy is about $300. There's a range for everyone, so you've no excuses.

    >Arithmetic overflow should never overwrite memory you don't own.
    Signed overflow results in undefined behavior. The implementation could do anything, regardless of what the processor does internally.
    My best code is written with the delete key.
