Thread: Scanning new line

  1. #1
    Registered User
    Join Date
    Mar 2010
    Location
    Australia
    Posts
    174

    Scanning new line

    I have this function:

    Code:
    void prompt(Line *head) {
        char command[CMDLENGTH] = {0};
        int numScanned = 0;
        int lineNumber = 1;
        
        printf("? ");
        numScanned = scanf("%s", command);
        
        while (strcmp(command, "q") != 0
                && strcmp(command, "x") != 0
                && numScanned == 1) {
            printf("command = %s\t", command);
            
            if (strcmp(command, "h") == 0) {
                helpCommand();
            } else if (strcmp(command, "p") == EQUAL
                        || strcmp(command, "+") == EQUAL
                        || strcmp(command, "-") == EQUAL
                        || strcmp(command, "\n") == EQUAL
                        || (stringIsNum(command) == YES
                        && stringToNum(command) > 0) ) {
                lineNumber = printCommand(head, command, lineNumber);
            } else {
                printf("Unknown command: ignoring\n");
            }
            
            printf("? ");
            numScanned = scanf("%s", command);
        }
        
    }
    Input works by prompting the user with a question mark; the user then enters a character or number and presses Enter. I was first using getchar and then flushing out the newline character, but then I realized that the user can enter a multi-digit number as well, so I scrapped getchar and replaced it with scanf.

    Now I have a problem with scanf. While I want the newline that follows a character or number to be flushed out, the user is also supposed to be able to press Enter on its own, and I'm meant to treat that the same way as the + character.
    The problem is that it seems like scanf doesn't read new line characters. So what can I do to get around this problem?

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    You may need to parse the input, possibly character by character, until the terminating characters are detected. Instead of using scanf, you would then parse the input yourself for the integers and other symbols.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    Mar 2010
    Location
    Australia
    Posts
    174
    Quote Originally Posted by laserlight View Post
    You may need to parse the input, possibly character by character, until the terminating characters are detected. Instead of using scanf, you would then parse the input yourself for the integers and other symbols.
    In other words I should use getchar?

  4. #4
    Registered User
    Join Date
    Sep 2012
    Posts
    357
    Use fgets() to read a whole line (ENTER included), then parse it.
    You can parse it with strtol(), strtod(), sscanf() or character-by-character or a mixture.
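
    As a rough sketch of the "then parse it" step, the function below checks whether a line read by fgets() is a positive line number, using strtol(). The name parse_line_number and its return convention are made up for illustration; they are not from the original program.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a whole line as a positive line number.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_line_number(const char *line, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(line, &end, 10);
    if (end == line || errno == ERANGE || value <= 0)
        return 0;               /* no digits, out of range, or not positive */

    /* Allow only trailing whitespace, such as the newline kept by fgets. */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;
    if (*end != '\0')
        return 0;               /* trailing junk, as in "12abc" */

    *out = value;
    return 1;
}
```

    Anything this rejects can then be compared against the single-character commands, and a line holding only "\n" can be treated as a bare Enter.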

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Mentallic
    In other words I should use getchar?
    If you want to. The way that I might approach this is to tokenise the input stream of characters, then work on the stream of tokens (not necessarily as two distinct steps, e.g., once you have a token, you could process it instead of storing it for later processing). Reading character by character using getchar is one way to implement such an approach.

    However, this approach is more for the case where the input is expected to be a continuous stream of tokens, but now that I look at your initial post again, you seem to be prompting token by token here.

    Quote Originally Posted by qny
    Use fgets() to read a whole line (ENTER included), then parse it.
    As such, I think that the approach using fgets would be simpler. I was going to suggest this earlier, but it is tricky to handle if you want to parse many tokens at one go, unless you end up treating the string read as a stream of characters that you then parse for tokens. That said, there's a caveat: the newline character is not guaranteed to be in the input stored by fgets.
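
    One way to deal with that caveat, sketched under the assumption that an overlong line should simply be cut short: check for the newline after fgets and, if it is missing, discard the rest of the line so it does not leak into the next read. The name read_line and its return values are illustrative only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (of the given size).
 * Returns  1 if the whole line fit (the newline may or may not be stored),
 *          0 if the line was too long and the excess was discarded,
 *         -1 on end of input or read error with nothing read. */
int read_line(FILE *in, char *buf, size_t size)
{
    int c;
    int dropped = 0;

    if (fgets(buf, (int)size, in) == NULL)
        return -1;
    if (strchr(buf, '\n') != NULL)
        return 1;               /* newline stored: whole line was read */

    /* No newline stored: the buffer filled up or input ended.
     * Consume what is left of the line, noting whether any
     * character other than the newline had to be thrown away. */
    while ((c = fgetc(in)) != EOF && c != '\n')
        dropped = 1;
    return dropped ? 0 : 1;
}
```

    With a 7-byte buffer, "Hello\n" is stored whole, while "Goodbye\n" is stored as "Goodby" and reported as too long.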

  6. #6
    Registered User
    Join Date
    Mar 2010
    Location
    Australia
    Posts
    174
    Thanks for the input guys. I think I'm going to give fgets a go, although I am curious as to why newline characters aren't guaranteed to be stored?

    I don't know what parsing is, but I guess I could make a small function such as
    Code:
    char *removeNewLine(char *command)
    so that it could remove the new line character at the end of the string, and I can carry on as I have been so far with scanf.

    Oh and just an added question: Would sscanf work in this case, or would that just work in the same manner as scanf?

  7. #7
    Registered User
    Join Date
    Jun 2011
    Posts
    4,513
    ... although I am curious as to why newline characters aren't guaranteed to be stored?
    Code:
    // example 1 - User input less than array size - newline gets stored:
    
    char userString[10];
    
    fgets(userString,10,stdin);
    
    // user enters "Hello\n"
    
    userString[0] = 'H'
    userString[1] = 'e'
    userString[2] = 'l'
    userString[3] = 'l'
    userString[4] = 'o'
    userString[5] = '\n'
    userString[6] = '\0'
    Code:
    // example 2 - User input greater than array size
    // Protected from overflow, but newline does not get stored:
    
    char userString[5];
    
    fgets(userString,5,stdin);
    
    // user enters "Goodbye\n"
    
    userString[0] = 'G'
    userString[1] = 'o'
    userString[2] = 'o'
    userString[3] = 'd'
    userString[4] = '\0'
    
    // no newline in array
    Would sscanf work in this case, or would that just work in the same manner as scanf?
    You can use a simple loop to look for the newline and remove it if present.
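
    A sketch of such a loop follows, together with one possible answer to the sscanf question. sscanf applies the same conversion rules as scanf, so %s still skips leading whitespace, but it reads from the string rather than from stdin. On a line that holds only a newline it finds nothing to convert and returns EOF, which can be used to detect a bare Enter. The value of CMDLENGTH and the name extract_command are assumptions for the example.

```c
#include <stdio.h>

#define CMDLENGTH 16    /* assumed size; the thread never shows the real value */

/* Walk the string and cut it at the first newline, if there is one. */
void strip_newline(char *s)
{
    for (; *s != '\0'; s++) {
        if (*s == '\n') {
            *s = '\0';
            break;
        }
    }
}

/* Hypothetical helper: pull the first word out of a line read by fgets.
 * A blank line (just Enter) is reported as "+", as the first post asks. */
void extract_command(const char *line, char command[CMDLENGTH])
{
    /* %15s leaves room for the terminator in a CMDLENGTH-byte array. */
    if (sscanf(line, "%15s", command) != 1) {
        command[0] = '+';
        command[1] = '\0';
    }
}
```

    Because %s stops at whitespace, the word extracted by sscanf never contains the newline, so the two functions are alternatives rather than steps that must both be run.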

  8. #8
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    If you use fgets(), you can only have one position for the newline: strlen() - 1. It can't be located anywhere else in the string.

    Code:
    //if line is the char array name:
    int len = strlen(line)-1;
    if(line[len]=='\n')
       line[len]='\0';

  9. #9
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    To remove the newline (LF or CR LF) at the end of the line, I recommend
    Code:
    line[strcspn(line, "\r\n")] = '\0';
    If line is a pointer to the buffer (not the buffer itself), you can use
    Code:
    line += strspn(line, "\t\n\v\f\r ");
    to skip over leading whitespace, too.

    When doing both, I recommend
    Code:
    size_t len;
    
    line += strspn(line, "\t\n\v\f\r ");
    len = strcspn(line, "\r\n");
    line[len] = '\0';
    which also gives you the length of the remaining part in len.

    Quote Originally Posted by Adak View Post
    Code:
    if(line[len]=='\n')
       line[len]='\0';
    Here you will happily read, and possibly modify, line[-1], a byte before the start of line, when the input contains an embedded NUL byte ('\0') at the beginning or just after a newline, because strlen(line) is then 0. Ouch. It really is possible for fgets() to return a string which starts with a \0. To test, just try reading the output of the shell command
    Code:
    printf 'First line\n\0Bad line\nThird line\n'

  10. #10
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Nominal Animal
    To remove the newline (LF or CR LF) at the end of the line, I recommend
    Code:
    line[strcspn(line, "\r\n")] = '\0';
    I don't recommend that because, as I noted earlier, fgets might not store the newline. Furthermore, since this is not binary input, we know that the newline will be mapped to '\n', regardless of the actual implementation-defined newline sequence. Therefore, I recommend:
    Code:
    char *p = strchr(line, '\n');
    if (p)
    {
        *p = '\0';
    }
    Unless you need to get the string's length anyway, in which case checking the last character of the string (other than the null character) makes sense.

    EDIT:
    Oh, I just remembered: since strcspn returns the string length if none of the search characters are found, you'll just end up assigning a null character to the position containing the terminating null character. Thus, in terms of correctness, Nominal Animal's suggestion is actually correct even in the face of no newline. However, the difference is that it does more than required, since strcspn is inherently meant to do more than strchr.
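
    To make the comparison concrete, here is a small sketch of both variants as functions (the names chomp_strchr and chomp_strcspn are invented for the example). They behave identically on a line with no newline. They differ only when a carriage return survives into the string, which can happen when a file with CR LF line endings is read on a system whose text mode does not translate them.

```c
#include <string.h>

/* Cut at the first '\n' using strchr; does nothing if there is none. */
void chomp_strchr(char *line)
{
    char *p = strchr(line, '\n');
    if (p != NULL)
        *p = '\0';
}

/* Cut at the first '\r' or '\n' using strcspn. With neither present,
 * strcspn returns strlen(line), so this only rewrites the terminator. */
void chomp_strcspn(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';
}
```

    Given "Hi\r\n", the strchr version leaves "Hi\r" and the strcspn version leaves "Hi"; given "Good", both leave the string unchanged.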
    Last edited by laserlight; 10-11-2012 at 08:26 PM.

  11. #11
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    It really is possible for fgets() to return a string which starts with a \0.
    O_o

    Well, that is true, but if the input expected has embedded nulls `fgets' is the wrong tool for the job.

    Soma

  12. #12
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by laserlight View Post
    I don't recommend that because, as I noted earlier, fgets might not store the newline.
    If line does not contain a CR or an LF, then strcspn(line, "\r\n") == strlen(line).

    So, what's the problem? Edit: oh, okay: no problem as such, just a different preference.

    Quote Originally Posted by laserlight View Post
    Code:
    char *p = strchr(line, '\n');
    strchr() is faster, because strspn() and strcspn() are not that efficient. I cannot think of a case where it would be measurable, though.

    I suspect, but have not checked, that
    Code:
    char *p;
    if ((p = strchr(line, '\r'))) *p = '\0';
    if ((p = strchr(line, '\n'))) *p = '\0';
    is faster than the equivalent
    Code:
    line[strcspn(line, "\r\n")] = '\0';
    In practice, I believe the fastest is
    Code:
    size_t  len;
    
    len = strlen(line);
    while (len > 0 && (line[len] == '\n' || line[len] == '\r'))
        len--;
    line[len] = '\0';
    which additionally provides the length of the string left.

    (To be precise, it does not work exactly like the above snippets, though. The former code snippets terminate the string at the first linefeed or carriage return, whereas this one only removes them at end of string. Since there should only ever be at most one newline per line, the difference should be irrelevant.)

    All of the above snippets are safe as long as line is not a null pointer.
    Last edited by Nominal Animal; 10-11-2012 at 08:45 PM.

  13. #13
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Nominal Animal
    If line does not contain a CR or an LF, then strcspn(line, "\r\n") == strlen(line).

    So, what's the problem?
    Refer to my edit.

  14. #14
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by phantomotap View Post
    Well, that is true, but if the input expected has embedded nulls `fgets' is the wrong tool for the job.
    Embedded nulls, in my experience, are almost always user errors: reading the wrong file, or such. It does not matter that fgets() is the wrong tool to read such input.

    I'm actually quite upset about people recommending a solution which may cause a buffer underrun bug. That's just not cool.
    Last edited by Nominal Animal; 10-11-2012 at 08:46 PM.

  15. #15
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    It does not matter that fgets() is the wrong tool to read such input.
    O_o

    Of course it does.

    If you are going to try to recover gracefully in the face of "user error" nothing shown so far is sufficient.

    I'm actually quite upset about people recommending a solution which may cause a buffer underrun.
    O_o

    [Edit]
    Yes. This was just a joke so don't get in a twist.

    I'm just pointing out the problem with your last bit of code.
    [/Edit]

    The `line[strlen(line)]' character will never be a newline character unless something goes horribly wrong.
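
    For reference, a sketch of what that loop was presumably meant to do, with the index moved back by one so that it inspects the last real character instead of the terminator. The function name strip_trailing_eol is made up for the example.

```c
#include <string.h>

/* Remove any '\n' or '\r' bytes at the end of line and return the new length.
 * line[len - 1] is only examined while len > 0, so an empty string is safe. */
size_t strip_trailing_eol(char *line)
{
    size_t len = strlen(line);

    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;
    line[len] = '\0';
    return len;
}
```

    A newline in the middle of the string is left alone; only the end is trimmed.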

    [Edit]
    *sigh*

    Removed the wrong portion of the post.

    I have no idea what was written here. Sorry.
    [/Edit]

    All joking aside, the code forwarded by you will not get rid of the newline in the "\0bad input\n" form of string so it is just as likely to cause a serious problem at a later point.

    That's just not cool.
    I agree.

    That's the only reason I posted.

    I just wanted to make the original poster aware that trying to recover from an unexpected situation (like nulls in the input) requires an alternative strategy.
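
    One such strategy, sketched here as an assumption rather than a recommendation from anyone in the thread, is to read byte by byte with fgetc so that the true length is always known and a line containing a NUL can be rejected as a whole. The name read_clean_line and its return codes are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read one line byte by byte. size must be at least 1.
 * Stores at most size - 1 bytes (newline excluded) plus a terminator;
 * bytes that do not fit are consumed and dropped.
 * Returns the number of bytes stored,
 *         -1 if input ended before any byte was read,
 *         -2 if the line contained a NUL byte. */
long read_clean_line(FILE *in, char *buf, size_t size)
{
    size_t n = 0;
    int c;
    int got_any = 0;
    int saw_nul = 0;

    while ((c = fgetc(in)) != EOF) {
        got_any = 1;
        if (c == '\n')
            break;
        if (c == '\0')
            saw_nul = 1;            /* remember it, but keep consuming */
        else if (n + 1 < size)
            buf[n++] = (char)c;
    }
    buf[n] = '\0';

    if (!got_any)
        return -1;
    if (saw_nul)
        return -2;
    return (long)n;
}
```

    Fed the three-line test input from earlier in the thread, this accepts the first and third lines and flags the second, so the caller can report it instead of acting on half a line.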

    [Edit]
    The reason being, you kind of imply that your code would be sufficient to deal with embedded nulls in string input.
    [/Edit]

    Soma
    Last edited by phantomotap; 10-11-2012 at 09:07 PM.
