Thread: is this ever possible?

  1. #1
    DESTINY BEN10's Avatar
    Join Date
    Jul 2008
    Location
    in front of my computer
    Posts
    804

    is this ever possible?

    Halo,
    I was just going through "C programming gotchas" over the internet. There I found this issue. For the code below:
    Code:
    #include<stdio.h>
    int main(void)
    {
    	char ch;
    	ch=getchar();
    	while(ch!=EOF)
    	{
    		putchar(ch);
    		ch=getchar();
    	}
    }
    This is what they've quoted:
    The loop may never terminate: if char is an unsigned type then EOF will be converted to some positive value. On systems where char is signed, there is a more subtle bug. Suppose for example that EOF is -1 - then if character 255 is read it will be converted to the value -1 and terminate the input prematurely.
    If I make ch unsigned, I can see why the loop will never terminate, because ch will always be positive. But if I keep it signed as shown above, is it ever possible for ch to be 255 (which would be interpreted as -1)? I mean, nothing except Ctrl+Z will break the loop, right? So why is it changed to "int ch"? Why can't it remain "char ch"?
    Thanks
    HOPE YOU UNDERSTAND.......

    By associating with wise people you will become wise yourself
    It's fine to celebrate success but it is more important to heed the lessons of failure
    We've got to put a lot of money into changing behavior


    PC specifications- 512MB RAM, Windows XP sp3, 2.79 GHz pentium D.
    IDE- Microsoft Visual Studio 2008 Express Edition

  2. #2
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by BEN10 View Post
    But what if I keep it signed as shown above
    Show me where you've declared that it is signed.


    Quzah.
    Hope is the first step on the road to disappointment.

  3. #3
    Registered User
    Join Date
    Apr 2009
    Posts
    41
    Quote Originally Posted by quzah View Post
    Show me where you've declared that it is signed.


    Quzah.
    Isn't it signed by default? Only unsigned if it's declared so.
    Even though it isn't made explicit, that doesn't mean it's not.


    edited: Added content
    http://gcc.gnu.org/ml/gcc/2007-01/msg00966.html
    During the standards process of the original C standard (ANSI C89),
    Dennis Ritchie expressed an opinion that in hindsight, making chars
    signed was a bad idea, and that logically chars should be unsigned.
    After it was made standard I don't think it has ever changed. It would be a problem if some compilers made it signed by default and others unsigned by default.


    Also a char is an int. The compiler does the look-up on the table to tell you what character the number is supposed to represent. I'm sure you know that already; I've read several of your posts. When you use a char in an expression that gets evaluated, you should think of it as a number: ifs, whiles, fors and so on.

    Ctrl+Z is only the EOF key on PCs; it's something different on Unix, I think.
    while(ch!=EOF)
    just says to continue as long as I don't press the end-of-file key.

    I guess to be safe, make it explicit and just say it's signed.
    Last edited by strickyc; 06-27-2009 at 03:56 AM.

  4. #4
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> Isn't it signed by default? Only unsigned if its declared as so.

    AFAIK, no. It's implementation defined.
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  5. #5
    Registered User
    Join Date
    Apr 2009
    Posts
    41
    Quote Originally Posted by Sebastiani View Post
    >> Isn't it signed by default? Only unsigned if its declared as so.

    AFAIK, no. It's implementation defined.
    I'm pretty sure it's signed by default as a standard. Any compiler that doesn't have it like this doesn't adhere to the standards. Also I think it's an option you can change on some compilers.

  6. #6
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by strickyc View Post
    I'm pretty sure it's signed by default as a standard.
    Well you're wrong. The standard doesn't say it has to be signed. It can be either one. That's the entire reason why the program is wrong in the first post.



  7. #7
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by strickyc View Post
    It has to be one or the other. It can't be both and it can't be neither. The standard states it should be signed. That's what makes it a standard. I understand not everything is up to all the standards.
    No it doesn't.
    Quote Originally Posted by The C Standard
    The three types char, signed char and unsigned char are collectively called character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.
    Colored emphasis mine.


  8. #8
    Registered User
    Join Date
    Apr 2009
    Posts
    41
    Character types in C and C++

    OK, you win. It's not a standard.

  9. #9
    DESTINY BEN10's Avatar
    Join Date
    Jul 2008
    Location
    in front of my computer
    Posts
    804
    Quote Originally Posted by quzah View Post
    Show me where you've declared that it is signed.


    Quzah.
    If it were unsigned, its range would be 0-255, but when I try this
    Code:
    #include<stdio.h>
    int main(void)
    {
    	char ch=128;
    	printf("%d",ch);
    }
    it prints -128, which means its range is not 0-255 and thus it is signed by default. The same thing happens for any number outside the range -128 to 127.

  10. #10
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    The program has bugs whether it's signed or unsigned (and I believe it is implementation defined) -

    If it's signed, and the byte 255 (0xFF) is read, it will be sign extended to (0xFFFFFFFF), which, if interpreted as a signed int, would be -1, and the loop will end prematurely.

    If it's unsigned, the loop will never terminate. Because when -1 (EOF) is read, it will be truncated to 0xFF, and then zero-extended to 0x000000FF, which is obviously different from 0xFFFFFFFF.

    As for whether 255 can be read, of course. Think about input redirection.

    Why not just keep it as an int?

  11. #11
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by BEN10 View Post
    If it were unsigned, its range would be 0-255, but when I try this
    Code:
    #include<stdio.h>
    int main(void)
    {
    	char ch=128;
    	printf("%d",ch);
    }
    it prints -128, which means its range is not 0-255 and thus it is signed by default. The same thing happens for any number outside the range -128 to 127.
    ..... for your compiler.

    Other compilers are allowed to do different things to what you have observed. And, in practice, they do. That is, roughly, the meaning of "implementation defined" in the standard.

    The original program has bugs because getchar() returns an int. If the return value is outside the range of values that can be represented in a char, then a conversion occurs, and that conversion changes the value stored in ch.
    Right 98% of the time, and don't care about the other 3%.

    If I seem grumpy or unhelpful in reply to you, or tell you you need to demonstrate more effort before you can expect help, it is likely you deserve it. Suck it up, Buttercup, and read this, this, and this before posting again.

  12. #12
    DESTINY BEN10's Avatar
    Join Date
    Jul 2008
    Location
    in front of my computer
    Posts
    804
    Quote Originally Posted by cyberfish View Post
    The program has bugs whether it's signed or unsigned (and I believe it is implementation defined) -

    If it's signed, and the byte 255 (0xFF) is read, it will be sign extended to (0xFFFFFFFF), which, if interpreted as a signed int, would be -1, and the loop will end prematurely.

    If it's unsigned, the loop will never terminate. Because when -1 (EOF) is read, it will be truncated to 0xFF, and then zero-extended to 0x000000FF, which is obviously different from 0xFFFFFFFF.

    As for whether 255 can be read, of course. Think about input redirection.

    Why not just keep it as an int?
    Can you show me an example where 255 is read? I don't know what input redirection is.
    Btw, there's no problem keeping it as an int, but I just wanted to know why they do that if it can be done with a char too.

  13. #13
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Assuming a.txt contains a single byte - 255.

    Code:
    yourprogram.exe <  a.txt
    To your program, it would seem like 255 was entered in the console.

    Or if your program is taking input from another program (less common in Windows, but very common in UNIX) -
    Code:
    anotherprogram.exe | yourprogram.exe
    And of course, anotherprogram can output any byte it wants.

    [edit]
    Also, note that the byte 255 (0xFF) never appears in valid UTF-8, so it won't come from well-formed UTF-8 text, but it is perfectly ordinary in binary data.

    Or maybe the user doesn't use ASCII (some mainframes don't, and many non-English languages don't), and 255 represents something common.
    [/edit]
    Last edited by cyberfish; 06-27-2009 at 07:32 AM.

  14. #14
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    > Also a char is an int.

    char is an integer. integer is a classification. int is another type that is in the classification of integers.

    >The compiler does the look-up on the table to tell you what character the number is suppose to represent.

    Unless the compiler is displaying the graphical representation of a character, it probably doesn't need to look up the encoding in a table. Even if it printed the character to the screen, it probably lets another code library or the hardware do the lookup for it.

  15. #15
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Since getchar() returns an int, you should get a warning telling you about 'possible loss of data' when assigning an int to a char. Assuming of course that you enable a high compiler warning level, which you always should.
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010
