Thread: Code Parsing in C/C++

  1. #1
    Registered User
    Join Date
    Feb 2008
    Posts
    2

    Data Parsing in C/C++

    Hi Guys,

    I am looking for a way to parse data in C; the code is attached below with some comments. What I actually want is to avoid calling sscanf once per record. The record count (3 in this example) is random and can be anywhere between 0 and 9.

    Code:
    main()
    {
        char buffer[500] = "DATA:  3,43,56,32f402,44,57,32f403,45,58,32f404";
        char *buf = buffer;
        int count;
        sscanf(buffer, "DATA: %d", &count); //First Value is Count, get it in a variable.
        printf("Count - %d\n", count);
        buf+=8; //strlen("DATA: 3,") = 8
        for(int c = 0; c < count; c++)
        {
            int a1, a2, a3;
            sscanf(buf, "%d,%d,%x", &a1, &a2, &a3);
            printf("%d,%d,%x\n", a1, a2, a3);
            buf+=13;//strlen("43,56,32f402,");
        }
        return 0;
    }
    Can someone suggest an alternative solution with fewer calls to sscanf? In this sample code it is called count+1 times.

    If anyone can point me to a C++ library routine, that would be helpful as well.

    Many thanks,
    Sgies
    Last edited by sgies; 02-18-2008 at 10:37 AM. Reason: Misleading Title

  2. #2
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    First, main returns int.
    The solution would probably be to search for each "," and then extract the text before it. This could be done in various ways. For example, use strstr to find a ",", overwrite it with a '\0', and use an appropriate library function such as strtol to convert that piece of the buffer to an integer.
    Then you could use a pointer to set the next position in the buffer and continue this approach:

    Code:
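    // needs <stdio.h>, <stdlib.h> and <string.h>
    char buf[] = "43,56,77";    // writable array: a string literal must not be modified
    char* p = buf;
    while (p != NULL && *p != '\0')
    {
        char* pPos = strstr(p, ",");    // NULL once no comma is left
        if (pPos != NULL)
            *pPos = '\0';    // cut the current field off at the comma
        long mylong = strtol(p, NULL, 10);    // convert from p, not from the start of buf
        printf("%ld\n", mylong);
        p = (pPos != NULL) ? pPos + 1 : NULL;    // move past the comma, or stop
    }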
    Something like that. Not tested, of course.
    Last edited by Elysia; 02-18-2008 at 10:50 AM.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  3. #3
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    To Elysia: strchr would be better than strstr for finding a single character.
    strtol takes 3 parameters: the string, a pointer that receives the end of the parsed number, and the base, which should be 10 or 16 depending on the position of the number in the string.
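    A minimal sketch of those three strtol parameters, assuming one record shaped like the sample data in this thread ("43,56,32f402"). The helper name parse_record is made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: parses one "dec,dec,hex" record starting at s.
   Returns a pointer just past the last parsed number, or NULL on bad input. */
const char *parse_record(const char *s, long *a1, long *a2, long *a3)
{
    char *end;

    *a1 = strtol(s, &end, 10);      /* base 10; end is set to where parsing stopped */
    if (end == s || *end != ',')    /* no digits read, or no comma after them */
        return NULL;
    s = end + 1;

    *a2 = strtol(s, &end, 10);
    if (end == s || *end != ',')
        return NULL;
    s = end + 1;

    *a3 = strtol(s, &end, 16);      /* the third field is hexadecimal */
    if (end == s)
        return NULL;
    return end;
}
```

    Called on "43,56,32f402,44", it should fill in 43, 56 and 0x32f402 and return a pointer to the comma before "44", so the caller can step over that comma and parse the next record.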

    To sgies: %n in sscanf gives the position where parsing stopped.
    It can be used to avoid magic constants like += 8 and += 13.
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  4. #4
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by vart View Post
    To Elysia: strchr would be better than strstr for finding a single character.
    strtol takes 3 parameters: the string, a pointer that receives the end of the parsed number, and the base, which should be 10 or 16 depending on the position of the number in the string.
    I'm not very familiar with C functions, and I've never used strchr so you'll forgive me for not giving out 100% perfect code.
    Oh yes, I always forget that strtol takes 3 arguments. Let me fix that...

  5. #5
    Registered User
    Join Date
    Oct 2004
    Posts
    151
    You can use strtok() to chop up the data, especially if your delimiters are no more complicated than a single comma.
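    A rough sketch of the strtok() approach, assuming every field uses the same base; the sample data in this thread mixes decimal and hex fields, so it would need a per-field base. The helper name split_fields is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits line on commas with strtok() and converts each
   field with strtol() in the given base. Stores at most max values in out and
   returns how many were stored. strtok() writes '\0' into line, so line must
   be a writable array, not a string literal. */
size_t split_fields(char *line, int base, long *out, size_t max)
{
    size_t n = 0;
    char *tok = strtok(line, ",");      /* first call passes the buffer */

    while (tok != NULL && n < max) {
        out[n++] = strtol(tok, NULL, base);
        tok = strtok(NULL, ",");        /* later calls pass NULL to continue */
    }
    return n;
}
```

    Note that strtok() treats a run of delimiters as one, so empty fields are skipped silently, and it keeps internal state between calls, so two strings cannot be tokenized in an interleaved fashion.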

    If your C library is POSIX, you can use regex.h functions to parse it (but then you've got two problems).
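    For comparison, a sketch of what the regex.h route might look like, assuming a POSIX system and a string that starts with one "dec,dec,hex" record. The helper name match_record and the pattern are illustrative choices, not something from this thread.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: matches one "dec,dec,hex" record at the start of s
   using a POSIX extended regular expression. Returns 1 on a match and fills
   out[0..2], otherwise returns 0. */
int match_record(const char *s, long out[3])
{
    regex_t re;
    regmatch_t m[4];    /* m[0] is the whole match, m[1..3] are the groups */
    int ok = 0;

    if (regcomp(&re, "^([0-9]+),([0-9]+),([0-9a-fA-F]+)", REG_EXTENDED) != 0)
        return 0;
    if (regexec(&re, s, 4, m, 0) == 0) {
        /* rm_so is the offset where each group starts; strtol stops at ',' */
        out[0] = strtol(s + m[1].rm_so, NULL, 10);
        out[1] = strtol(s + m[2].rm_so, NULL, 10);
        out[2] = strtol(s + m[3].rm_so, NULL, 16);
        ok = 1;
    }
    regfree(&re);
    return ok;
}
```

    The ^ anchor matters here: without it, the leftmost match in "DATA: 3,43,56,32f402" would begin at the count and return 3, 43 and 0x56. Compiling the pattern on every call is wasteful; real code would compile it once and reuse it.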

    Or you could go to the extreme and learn about lex.

  6. #6
    Registered User
    Join Date
    Feb 2008
    Posts
    2
    Thanks, guys, for your answers, but I would like to concentrate on sscanf rather than using strchr or strtok. It would be great if we could turn the discussion back to sscanf.

    BTW, in case my writing was unclear: the 3 in the sample code means 3 rows, each with 3 elements, which I am parsing in the loop.

    Sgies

  7. #7
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    You could use a pointer to specify the position in the buffer where sscanf should start extracting. But you will still need to find the next ",", so a search function might still be necessary.

  8. #8
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Quote Originally Posted by Elysia View Post
    You could use a pointer to specify the position in the buffer where sscanf should start extracting. But you will still need to find the next ",", so a search function might still be necessary.
    With the %n format you can also find out where sscanf stopped parsing and continue from that position.

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    This code works:
    Code:
    #include <stdio.h>
    
    int main(void)
    {
        char buffer[500] = "DATA:  3,43,56,32f402,44,57,32f403,45,58,32f404";
        char *buf = buffer;
        int count;
        int pos;
        // First value is the count; %n stores how many characters were consumed.
        sscanf(buffer, "DATA: %d,%n", &count, &pos);
        printf("Count - %d\n", count);
        buf += pos; // skip "DATA:  3," without a magic constant
        for (int c = 0; c < count; c++)
        {
            int a1, a2;
            unsigned int a3; // %x expects an unsigned int
            // The last record has no trailing ',', so %n is not reached there
            // and pos keeps its previous value.
            sscanf(buf, "%d,%d,%x,%n", &a1, &a2, &a3, &pos);
            printf("%d,%d,%x,%d\n", a1, a2, a3, pos);
            buf += pos;
        }
        return 0;
    }
    As does this, but it's certainly a bit more complicated:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <assert.h>
    #include <string.h>
    
    void bad_input(void)
    {
        printf("could not parse input\n");
        exit(1);
    }
    
    int main(void)
    {
        char buffer[500] = "DATA:  3,43,56,32f402,44,57,32f403,45,58,32f404";
        char *buf = buffer;
        int count;
        char *p;
        char *after;
        p = strchr(buf, ':');
        if (!p) bad_input();
        buf = p+1;
        count = strtol(buf, &after, 10);  // after points at the ',' following the count
        printf("Count - %d\n", count);
        buf = after+1;
        for(int c = 0; c < count; c++)
        {
            int a1, a2, a3;
            assert(*after == ',');  // every number must be preceded by a comma
            a1 = strtol(buf, &after, 10);
            buf = after+1;
            assert(*after == ',');
            a2 = strtol(buf, &after, 10);
            buf = after+1;
            assert(*after == ',');
            a3 = strtol(buf, &after, 16);  // third field is hexadecimal
            buf = after+1;
            printf("%d,%d,%x\n", a1, a2, a3);
        }
    
        return 0;
    }
    Ok, so the latter code does some more error checks, which could help if the input data is "malformed" for any reason.
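    The sscanf version can get comparable checks by testing sscanf's return value. A minimal sketch, assuming the same "DATA: count,dec,dec,hex,..." layout; the helper name parse_data and the rows array are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: parses "DATA: <count>,<dec>,<dec>,<hex>,..." and
   returns the number of complete records read, or -1 if the header is bad. */
int parse_data(const char *s, int rows[][3], int max_rows)
{
    int count, pos = 0, n = 0;

    /* %n sits right after the number, so it is stored even when nothing
       follows; it does not add to sscanf's return value. */
    if (sscanf(s, "DATA: %d%n", &count, &pos) != 1)
        return -1;
    s += pos;

    while (n < count && n < max_rows) {
        unsigned int hex;   /* %x expects an unsigned int */

        /* Each record is matched together with the comma in front of it. */
        if (sscanf(s, ",%d,%d,%x%n", &rows[n][0], &rows[n][1], &hex, &pos) != 3)
            break;          /* malformed or truncated record */
        rows[n][2] = (int)hex;
        s += pos;
        n++;
    }
    return n;
}
```

    Putting the comma at the front of the record format, with %n last, means the final record sets pos as well even though no comma follows it. A caller can compare the return value with the count it expected to spot truncated input.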

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.
