Thread: sscanf need to include spaces in character string

  1. #1
    Registered User
    Join Date
    Mar 2010
    Posts
    14

    sscanf need to include spaces in character string

    I am reading some comma-separated values in from a file and I need to read the last one as a character string. However, the string may include white space, and I can't get it to work correctly: it stops reading once it encounters white space. This is the line that I am using to parse the data. type and code are ints and description is a char*.

    Code:
    sscanf(token, "%d,%d,%s", &type, &code, description);

  2. #2
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    The sscanf() %s format descriptor is required to stop when it encounters whitespace. So the behaviour you are seeing is exactly what is intended to happen.

    If your application requires that space characters be treated as input (rather than something between inputs) then you need to use some other method of interpreting the content of your string.

    One option (assuming your token has the complete set of data that needs to be placed in description) is to search for the second comma, and copy everything after it into description.
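
    A minimal sketch of that approach, assuming token holds the whole record and the caller passes the size of the description buffer. The function name copy_after_second_comma and the size parameter are illustrative and do not come from the original post.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copies everything after the second comma of `token` into `description`
 * (at most size - 1 characters, always null-terminated).
 * Returns 0 on success, -1 if `token` has fewer than two commas
 * or if `size` is 0.
 * The name and the explicit size parameter are illustrative choices.
 */
int copy_after_second_comma(const char *token, char *description, size_t size)
{
    const char *p;

    if (size == 0)
        return -1;

    p = strchr(token, ',');          /* first comma */
    if (p != NULL)
        p = strchr(p + 1, ',');      /* second comma */
    if (p == NULL)
        return -1;

    strncpy(description, p + 1, size - 1);
    description[size - 1] = '\0';    /* strncpy does not always terminate */
    return 0;
}
```

    The two integer fields would still have to be converted separately, for example with sscanf(token, "%d,%d", &type, &code).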

  3. #3
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Or you can use a conversion specifier that will accept things with whitespace, such as %[.

  4. #4
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    %[^,] specifically being a good one here.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by MK27
    %[^,] specifically being a good one here.
    I don't think we're stopping at a comma, but at the end of the line.

  6. #6
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    %[^\n]

  7. #7
    Registered User
    Join Date
    Mar 2010
    Posts
    14
    I have actually already stripped the \n out of the token and replaced it with a \0. I'm still unsure how to do this. Are you saying use the %[^\n] but I can't use sscanf anymore??

  8. #8
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by robin2aj
    I have actually already stripped the \n out of the token and replaced it with a \0. I'm still unsure how to do this. Are you saying use the %[^\n] but I can't use sscanf anymore??
    Then unless you have other reasons for stripping the \n, that's a wasted step. You can use sscanf regardless.

  9. #9
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    %[^\n] is a scanf template specifier like %s.

    %[] matches a run of characters from the set listed inside the brackets. You can use ASCII ranges, e.g., %[a-zA-Z0-9] will include everything alphanumeric, %[a-z!-/] will include small letters and most punctuation, %[ \t\n] will include spaces, tabs and newlines. (Strictly, the standard leaves ranges implementation-defined, but common C libraries support them.)

    %[^] is the inverse: it means all characters except the ones listed. %[^\n] -- everything up to the newline.

    %[^NULL], aka the "sawed off shotgun" specifier, is legal too. It's very inclusive.
    Last edited by MK27; 04-10-2010 at 02:00 PM.
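
    A small sketch of the two scanset forms described above. The helper names and the 32-byte buffers are assumptions made for illustration. Each conversion carries a maximum field width (31) so it cannot write past the buffer, and note that, unlike %s, %[ does not skip leading whitespace.

```c
#include <stdio.h>

/*
 * Reads a leading run of alphanumeric characters from `s` into `out`.
 * Returns 1 if at least one character matched, 0 if the first character
 * is not in the set, or EOF if `s` is empty.
 * Relies on range support (a-z etc.), which is implementation-defined.
 */
int leading_alnum(const char *s, char out[32])
{
    return sscanf(s, "%31[a-zA-Z0-9]", out);
}

/*
 * Reads everything up to, but not including, the first comma.
 * Spaces are ordinary characters here and are copied like any other.
 */
int up_to_comma(const char *s, char out[32])
{
    return sscanf(s, "%31[^,]", out);
}
```

    With these, leading_alnum("abc123 def", buf) would store "abc123", and up_to_comma("hello world,42", buf) would store "hello world".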

  10. #10
    Registered User
    Join Date
    Mar 2010
    Posts
    14
    so I should be able to do this?
    Code:
    sscanf(token, "%d,%d,%[^NULL]", &type, &code, description);

  11. #11
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    There is only one way to find out if the "sawed off shotgun" works. Apparently it does.

  12. #12
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by robin2aj
    so I should be able to do this?
    Code:
    sscanf(token, "%d,%d,%[^NULL]", &type, &code, description);
    Unless your description has an N, U, or L in it, in which case you're in trouble.

    The point is that this:
    Code:
    sscanf(token, "%d,%d,%[^\n]", &type, &code, description);
    does not need a new-line character to stop. It will stop when the string runs out of data, all by itself. But every character in your token string after the second comma that matches "does not equal \n" will be added to description.

    EDIT: Sorry, that last sentence is a little incorrect. What I mean is that, starting at the second comma, every character will be checked; if it matches "does not equal \n", it will be added to description and we continue; if we run out of data, or we find a character that doesn't match, we stop (so no further characters are added).
    Last edited by tabstop; 04-10-2010 at 02:17 PM.
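
    To make the difference concrete, here is a hedged sketch that wraps both format strings. DESC_SIZE and the function names are assumptions, not from the posts above. The field width 63 matches DESC_SIZE - 1 so the conversion cannot overflow the buffer, and the return value reports how many fields were actually converted.

```c
#include <stdio.h>

#define DESC_SIZE 64   /* illustrative buffer size, not from the thread */

/*
 * Parses "type,code,description" from `token`.
 * Returns the number of fields converted (3 on full success).
 * %63[^\n] accepts up to 63 characters that are not a newline, spaces
 * included, and stops at '\n' or when the string runs out of data.
 */
int parse_record(const char *token, int *type, int *code,
                 char description[DESC_SIZE])
{
    return sscanf(token, "%d,%d,%63[^\n]", type, code, description);
}

/*
 * Same parse with the [^NULL] scanset, shown only for comparison.
 * The set excludes the letters N, U and L, so the description is cut
 * off at the first of those letters.
 */
int parse_record_null_scanset(const char *token, int *type, int *code,
                              char description[DESC_SIZE])
{
    return sscanf(token, "%d,%d,%63[^NULL]", type, code, description);
}
```

    For the token "3,17,disk FULL", parse_record stores "disk FULL" while parse_record_null_scanset stores only "disk F". Note that %[ needs at least one matching character, so a token with an empty description such as "3,17," makes parse_record return 2 rather than 3.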

  13. #13
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by tabstop
    Unless your description has an N, U, or L in it, in which case you're in trouble.
    Sorry, can't help myself sometimes.

  14. #14
    Registered User
    Join Date
    Mar 2010
    Posts
    14
    I do have other reasons for stripping the newline. Thanks for the help. I'm using [^\n] and it seems to be working how I expect it to.
