Thread: Parsing a string

  1. #1
    Registered User
    Join Date
    Oct 2005
    Posts
    15

    Parsing a string

    - I have a hardware board with an 80186 processor.
    - I communicate with the board over telnet.
    - The server is a Delphi program.
    - I have developed a protocol for the communication so I can connect many hardware boards.

    I send strings over telnet. The strings need to be parsed on the hardware board, which is programmed in C.

    I have never built a parser before, so I would like to know what the best method for this is. I'm new to C programming, so I'm not very good with C.

    I have an example using the strtok function. It works, but I don't know how to detect errors in the strings. The protocol string has the following layout:

    command^comment|date|time|MAC|field|field|field|

    The number of fields is variable.

    Thanks

  2. #2
    Registered User
    Join Date
    Aug 2005
    Posts
    1,267
    It would be a lot easier to develop the program on a desktop machine with a desktop compiler. Once it is tested, you can move it to the final target compiler. Your desktop program doesn't need to use telnet; just hardcode some strings for testing and debugging. I would also use strtok() to tokenize the string, then check each token for possible errors. Which errors is something you will have to decide, because you haven't provided enough detailed information (that I probably don't need or want to know anyway).

  3. #3
    Registered User
    Join Date
    Oct 2005
    Posts
    15
    I think you have misunderstood me. I develop the source code with Borland C++. I have special plugins for the hardware so I can debug via telnet.

    The biggest problem is the variable number of fields. I find it hard to write code that can easily handle this.

  4. #4
    Registered User
    Join Date
    Sep 2005
    Location
    Sydney
    Posts
    60
    What type of data does each field contain? If this is known beforehand, you may be able to just parse the whole line with sscanf() in a loop. Otherwise, there are some good parsing tools out there: yacc is one I recommend - it's not difficult to use once you know what your grammar is (and in this case you seem to have that pretty well sorted).

    If neither of those methods is applicable to your situation, you will probably have to write a custom parser. This, again, isn't too difficult once you understand the concepts involved. Basically, you need to work out what your grammar is, then what you want to do when you encounter each element. If you decide to write your own, you should read up on the topic. Here is a good place to start:
    http://en.wikipedia.org/wiki/Parsing

  5. #5
    Just kidding.... fnoyan's Avatar
    Join Date
    Jun 2003
    Location
    Still in the egg
    Posts
    275
    Maybe a primitive but workable way: put a control character in the message after each field and, after receiving the message string, count all the control characters until you run into a \0. At the end of the loop you will have the number of fields.

    I used this technique for a simple CGI URL parser.
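
    A minimal sketch of that counting loop, assuming the separator is a single character such as '|'. The name count_fields is made up for illustration:

    ```c
    #include <stddef.h>

    /* Count how many times the separator sep occurs in the
       NUL-terminated string s. With one separator after every
       field, this is also the number of fields. */
    size_t count_fields(const char *s, char sep)
    {
        size_t n = 0;

        for (; *s != '\0'; ++s) {
            if (*s == sep)
                ++n;
        }
        return n;
    }
    ```

    The count only tells you how many fields to expect; it doesn't split them.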

  6. #6
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Another idea or two.
    7. It is easier to write an incorrect program than understand a correct one.
    40. There are two ways to write error-free programs; only the third one works.*

  7. #7
    Registered User
    Join Date
    Oct 2005
    Posts
    15
    Thanks for the replies!

    I tried the last example, but I get an error... probably due to my lack of C knowledge.

    I want to make a function with a string as input and an array of fields as output.

    The protocol is defined as follows:

    command^comment|field|field|field|0x0A

    command = 3 digits
    0x0A = end of string (telnet)

    Code:
    char parser(char *line)
    {
       int i;
       char *token = line;
       char result;
       for ( i = 0; *line; ++i )
       {
          size_t len = strcspn(token, "|\n");
          printf("token[%2d] = \"%*.*s\"\n", i, (int)len, (int)len, token);
          token += len + 1;
          result[i] = token;
       }
       return result;
    }
    I get the error: Invalid indirection at the line: result[i] = token;

  8. #8
    Registered User
    Join Date
    Sep 2005
    Location
    Sydney
    Posts
    60
    That's because you've declared result as a char, so result[i] doesn't make much sense.

  9. #9
    Registered User
    Join Date
    Oct 2005
    Posts
    15
    How can I return an array of strings?

  10. #10
    Registered User
    Join Date
    Oct 2005
    Posts
    15
    I started testing and I can't get it working.

    I have made the following function:

    Code:
    void parse(char *line,char *output)
    {
      char buf[100],ch;
      int i,count=0;
    
      //count | tokens in string
      for(i=0;ch!='\0';i++)
      {
        ch=line[i];
        if(ch=='|')
          count++;
      }
      printf("count=%d\n",count);
    
      //parse string into fields
      for (i=0;i==count;++i)
      {
        size_t len = strcspn(line,"|");
        sprintf(buf,"%*.*s\n", (int)len, (int)len, line);
        printf("%s\n",buf);
        line += len + 1;
      }
    }
    The token count works fine. The program prints count=6 when I use the following as input:

    Code:
    002^Resultaat op aanmelden|20050915|121533|INT^TEMP^1^S|BOOL^LED^1^A|BOOL^LED^2^A|0x0A
    The format is: command^comment|date|time|field|field|field|0x0A

    Unfortunately, it doesn't print the fields.

    I want the function to output an array of fields, but I haven't implemented that yet.

  11. #11
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    You removed an important part from the snippet (and added some unnecessary stuff). If instead you started with the base snippet and merely changed the second argument to strcspn from ";\n" to "|\n", it ought to give you an output like this.
    Code:
    line 1:
    token[ 0] = "002^Resultaat op aanmelden"
    token[ 1] = "20050915"
    token[ 2] = "121533"
    token[ 3] = "INT^TEMP^1^S"
    token[ 4] = "BOOL^LED^1^A"
    token[ 5] = "BOOL^LED^2^A"
    token[ 6] = ""
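
    For readers without the base snippet to hand, a strcspn loop that produces this kind of output might look roughly like the sketch below. It is not necessarily the snippet referred to above, and the name print_tokens is made up for illustration:

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Print every field of line, treating '|' and '\n' as field
       terminators. Returns the number of fields printed.
       The input string is not modified. */
    int print_tokens(const char *line)
    {
        const char *token = line;
        int i = 0;

        while (*token != '\0') {
            size_t len = strcspn(token, "|\n");   /* length of this field */
            printf("token[%2d] = \"%.*s\"\n", i, (int)len, token);
            ++i;
            token += len;
            if (*token != '\0')
                ++token;                          /* step over the delimiter */
        }
        return i;
    }
    ```

    Unlike a loop guarded by i==count, which is false on the first pass whenever count is nonzero, this one simply runs until it reaches the terminating '\0'.
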
    When you say that the line ends in 0x0A, you do mean *nix-style newline, right (as opposed to actual literal text "0x0A")?
    Quote Originally Posted by robin171
    How can I return an array of strings?
    You can't.
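
    Not by value, at least. One common workaround is to let the caller pass in an array of char pointers and have the function fill it. The sketch below assumes the '|' and '\n' delimiters from this thread; the name split_fields and its parameters are invented for illustration:

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Split line in place at each '|' or '\n' and store a pointer to
       the start of every field in fields[]. At most max_fields
       pointers are stored. Returns the number of fields stored.
       line is modified: each delimiter is overwritten with '\0'. */
    size_t split_fields(char *line, char *fields[], size_t max_fields)
    {
        size_t n = 0;

        while (*line != '\0' && n < max_fields) {
            size_t len = strcspn(line, "|\n");
            fields[n++] = line;
            line += len;
            if (*line != '\0')
                *line++ = '\0';   /* terminate this field, move to the next */
        }
        return n;
    }
    ```

    The pointers refer into the caller's buffer, so that buffer has to stay alive (and writable) for as long as the fields are used. The trailing newline shows up as a final empty field, which the caller can ignore.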
    Last edited by Dave_Sinkula; 10-17-2005 at 09:04 PM. Reason: Typo: way/say.

  12. #12
    Registered User
    Join Date
    Oct 2005
    Posts
    15
    I work with telnet, where 0x0A is the end-of-line character.

  13. #13
    Registered User
    Join Date
    Oct 2005
    Posts
    15
    It just won't work.

    I tried the examples and I can print the fields, but I want to store them in variables so I can work with them in my program.

    Yacc looks suitable for my problem, but I don't know how to work with it.



    I will explain my problem in more detail:

    I am building a home automation system. Every room has a room controller with several sensors and actuators. The core of the room controller is a Beck SC12 chip (complete TCP/IP stack + 80186 processor) and an I/O controller.

    I can debug the program via telnet.

    There is a server running a Delphi program that communicates with the room controllers via telnet.

    For this I have the following protocol:

    Code:
    command^comment|MAC|field|field|field|0x0A
    The command field is a command of 3 digits.
    The MAC field is for identification (for example 0A_0A_0A_0A-0A_0A).
    The number of fields is variable, and there are subfields separated by ^.
    0x0A is the telnet end of line.

    An example string:

    Code:
    001^Aanmelden|00_00_00_00_00_00|INT^TEMP^1^S|INT^LIGHT^1^S|BOOL^RELAY^2^A|0x0A
    This string should do the following:
    - log in to the server (command 001)
    - tell the server what sensors and actuators the room controller has.

    In this case:
    temperature sensor (INT = integer, TEMP = name, 1 = number, S = sensor)

    light sensor (INT = integer, LIGHT = name, 1 = number, S = sensor)

    relay (BOOL = boolean, RELAY = name, 2 = number, A = actuator)


    Does anyone have any suggestions? I'm really stuck on this.

  14. #14
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Quote Originally Posted by robin171
    I tried the examples and I can print the fields, but I want to store them in variables so I can work with them in my program.
    The printing is just a very simple example of *doing something* with them. If you want to store them, then you'll have to write that code (with linked lists or other such data structures).
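
    Short of a linked list, one option is a fixed-size struct per field. The sketch below assumes every field has the TYPE^NAME^NUMBER^KIND layout from the example string; struct io_point, parse_io_point and the buffer sizes are invented for illustration. The return value doubles as a basic error check on the field:

    ```c
    #include <stdio.h>

    /* Hypothetical record for one "TYPE^NAME^NUMBER^KIND" field. */
    struct io_point {
        char type[8];    /* e.g. "INT" or "BOOL" */
        char name[16];   /* e.g. "TEMP" */
        int  number;     /* e.g. 1 */
        char kind;       /* 'S' = sensor, 'A' = actuator */
    };

    /* Parse one field into *out. Returns 1 on success, 0 if the
       field does not match the expected layout. */
    int parse_io_point(const char *field, struct io_point *out)
    {
        int used = 0;
        int n = sscanf(field, "%7[^^]^%15[^^]^%d^%c%n",
                       out->type, out->name, &out->number, &out->kind, &used);

        /* All four items must convert and nothing may follow them. */
        if (n != 4 || field[used] != '\0')
            return 0;
        return out->kind == 'S' || out->kind == 'A';
    }
    ```

    With a variable number of fields, an array of these structs with a fixed upper bound is probably the simplest container on a small target.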

  15. #15
    Registered User
    Join Date
    Oct 2005
    Posts
    15
    I can't get that working. The problem is the variable number of fields, so I want to store the data in an array, but I can't get that to work. Does anyone have an idea?
