Thread: Help on splitting string in C

  1. #1
    Registered User
    Join Date
    May 2008
    Posts
    11

    Help on splitting string in C

    Hi all,

    I have a string in the following format:

    Code:
    Packet length: 64 [Bytes], Arrival time: 2018-02-21 12:43:36.877229
    Ethernet II Layer, Src: 00:04:96:9a:d1:00, Dst: 4c:72:b9:d1:a5:9c
    IPv4 Layer, Src: 10.221.0.69, Dst: 10.221.79.19
    TCP Layer, [ACK], Src port: 54671, Dst port: 5900
    Payload Layer, Data length: 10 [Bytes] Data: 310000780438
    Lines are separated by a newline, so the first delimiter would be '\n'.
    The second delimiter would be "," (comma),
    and the third delimiter would be ":" (colon).

    Each key may or may not be present. If a key does not exist, its value should be set to NULL or an empty string.


    Could anyone help me code this in C? I am trying, but it's really taking me a long time, and I only need this for a proof-of-concept script.

    Thank you !

  2. #2
    Salem (and the hat of int overfl)
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Do you really want to split MAC addresses at every colon?

    > I am trying but its really taking me time
    So post what you tried?

    > i only need this for a proof of concept script.
    So why not prove it with a language with better string support like Python or Perl?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    May 2008
    Posts
    11
    Hi,

    Nope, that should stay intact. My aim is actually to take
    all the content only, without the keys:

    "Everything after the ':' (colon) is my content."


    What I have tried is below:

    Code:
    #include<stdio.h>
    #include<string.h>
    
    int main() {
        
      char* mystr =
      "Packet length: 64 [Bytes], Arrival time: 2018-02-21 12:43:36.877229\n"
      "Ethernet II Layer, Src: 00:04:96:9a:d1:00, Dst: 4c:72:b9:d1:a5:9c\n"
      "IPv4 Layer, Src: 10.221.0.69, Dst: 10.221.79.19\n"
      "TCP Layer, [ACK], Src port: 54671, Dst port: 5900\n"
      "Payload Layer, Data length: 10 [Bytes] Data: 310000780438\n";
    
       const char s[] = "\r";
       char *token;
       
       /* get the first token */
       token = strtok(mystr, s);
       
       /* walk through other tokens */
       while( token != NULL ) {
          printf( "%s\n", token );
          token = strtok(NULL, s);
       }
    }
    But I am still wondering how to do a second level of splitting once I have a token.
    Last edited by bladez; 02-21-2018 at 01:04 AM.

  4. #4
    whiteflags (Lurking)
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    > const char s[] = "\r";
    And where do you see \r in the data? It's no wonder your results look the way they do (i.e. no splitting happens).

    > But still wondering how do i do second level of split upon getting token.
    Try using strtok's re-entrant cousin, strtok_r(). It works in such a similar fashion I think you will be okay using it, but there is an example on the manual page if you get stuck.
    It will let you take a larger token and break it up into smaller ones with different delimiters.
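
    A minimal sketch of that idea, assuming the input sits in a writable buffer and that '\n' and ',' are the two delimiter levels. The helper name split_fields and its out/max parameters are made up for illustration; only strtok_r itself is a real (POSIX) function.

```c
#define _POSIX_C_SOURCE 200809L  /* makes strtok_r visible under strict -std= modes */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits buf (modified in place) into lines on '\n',
   then splits each line into fields on ','. Pointers to the fields are
   stored in out[]; the return value is how many were stored (at most max). */
size_t split_fields(char *buf, char *out[], size_t max)
{
    size_t n = 0;
    char *line_state, *field_state;   /* one save pointer per nesting level */

    for (char *line = strtok_r(buf, "\n", &line_state);
         line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        for (char *field = strtok_r(line, ",", &field_state);
             field != NULL;
             field = strtok_r(NULL, ",", &field_state)) {
            if (n == max)
                return n;
            out[n++] = field;
        }
    }
    return n;
}
```

    Each level keeps its own save pointer, which is what plain strtok cannot do. Note that the fields keep their leading space (for example " Src: 10.221.0.69"), because only the comma is treated as a delimiter here.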

  5. #5
    Registered User
    Join Date
    May 2008
    Posts
    11

    Now I do not understand. Am I not making any sense here?

    Code:
    #include<stdio.h>
    #include<string.h>
    
    int main() {
        
      char *str1 =
      "Packet length: 64 [Bytes], Arrival time: 2018-02-21 12:43:36.877229\r\n"
      "Ethernet II Layer, Src: 00:04:96:9a:d1:00, Dst: 4c:72:b9:d1:a5:9c\r\n"
      "IPv4 Layer, Src: 10.221.0.69, Dst: 10.221.79.19\r\n"
      "TCP Layer, [ACK], Src port: 54671, Dst port: 5900\r\n"
      "Payload Layer, Data length: 10 [Bytes] Data: 310000780438\r\n";
    
    
        char *str2, *token, *subtoken;
        char *saveptr1, *saveptr2;
        int j;
    
        for (j = 1, str1; ;j++, str1 = NULL) {
            token = strtok_r(str1, "\r\n", &saveptr1);
            if (token == NULL)
                break;
            printf("%d: %s\n", j, token);
    
            for (str2 = token; ; str2 = NULL) {
                subtoken = strtok_r(str2, ",", &saveptr2);
                if (subtoken == NULL)
                    break;
                printf("	 --> %s\n", subtoken);
            }
        }
        
        
    }

  6. #6
    Salem (and the hat of int overfl)
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    1. str1 should be an array, not a pointer. A char pointer initialized with "foo" is likely to point to read-only memory, and strtok_r writes into the string it splits, so this breaks strtok_r.
    2. It makes no sense to use the same delimiter in both loops.
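
    To illustrate point 1, here is a small sketch. The function name count_fields is hypothetical; the point is only that the tokenizer writes '\0' bytes into whatever it is given.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the comma-separated fields in s.
   strtok overwrites each delimiter it finds with '\0', so s must point
   at writable storage such as a char array, never at a string literal. */
int count_fields(char *s)
{
    int n = 0;
    for (char *tok = strtok(s, ","); tok != NULL; tok = strtok(NULL, ","))
        n++;
    return n;
}
```

    Calling it with `char line[] = "a,b,c";` is fine, because the array is a private, writable copy of the literal. Calling it with `char *line = "a,b,c";` is undefined behaviour and commonly crashes, because the literal itself would be modified.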

  7. #7
    whiteflags (Lurking)
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    I have to admit, I did not expect the data to change merely because I pointed out that \r wasn't in it. Resist the urge to just massage your data until your code works. You will rarely have the choice to just change what input your program gets in the real world. It's best to avoid ever doing that if you want to write good working programs.

    >Now i do not understand, am i not making any sense in here ? no ?
    Yes, you cannot change a string literal. Put your data in an array.
    Last edited by whiteflags; 02-21-2018 at 02:20 AM.

  8. #8
    Registered User
    Join Date
    May 2008
    Posts
    11
    Awesome!

    I managed to make it work! Your guidance was extremely helpful and I seriously appreciate the timely answers. It definitely helped me a lot.
    Otherwise I have no idea how much more time I would have wasted.

    Code:
    #include<stdio.h>
    #include<string.h>
    
    int main() {
        
      char *s_tr =
      "Packet length: 64 [Bytes], Arrival time: 2018-02-21 12:43:36.877229\r\n"
      "Ethernet II Layer, Src: 00:04:96:9a:d1:00, Dst: 4c:72:b9:d1:a5:9c\r\n"
      "IPv4 Layer, Src: 10.221.0.69, Dst: 10.221.79.19\r\n"
      "TCP Layer, [ACK], Src port: 54671, Dst port: 5900\r\n"
      "Payload Layer, Data length: 10 [Bytes] Data: 310000780438\r\n";
    
        
        char s[strlen(s_tr)];
        strncpy(s,s_tr, sizeof(s));
    
    
        char *str1, *str2, *token, *subtoken;
        char *saveptr1, *saveptr2;
        int j;
    
        for (j = 1, str1 = s; ;j++, str1 = NULL) {
            token = strtok_r(str1, "\r\n", &saveptr1);
            if (token == NULL)
                break;
            printf("%d: %s\n", j, token);
    
            for (str2 = token; ; str2 = NULL) {
                subtoken = strtok_r(str2, ",", &saveptr2);
                if (subtoken == NULL)
                    break;
                printf("	 --> %s\n", subtoken);
            }
        }
        
        
    }

    The reason I have to leave the actual string as a pointer is that this is what I get from a function that I am not able to change.
    Hence, I have to manually convert the char* to a char[].

    Also, the data originally looks like that; I didn't change it. When I first posted, I accidentally used an old copy, which is why I changed it later on.
    Last edited by bladez; 02-21-2018 at 02:45 AM.

  9. #9
    whiteflags (Lurking)
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    > Why i have to leave the actual string as a pointer is because that is what i am getting from a function that i do not have access to change.
    > Hence, i have to do manual conversion of char* to char[] .
    Well, as long as you keep in mind that your example is slightly different from a larger program.

    I highly doubt that you will have to copy the string you get from a function parameter into another array. The only time you need to be careful is if you know that the object behind the pointer is a string literal. In all likelihood it's an array that decayed into a pointer to its first element.

  10. #10
    Registered User
    Join Date
    May 2008
    Posts
    11
    Code:
    #include<stdio.h>
    #include<string.h>
    
    int main() {
        
      char *s_tr =
      "Packet length: 64 [Bytes], Arrival time: 2018-02-21 12:43:36.877229\r\n"
      "Ethernet II Layer, Src: 00:04:96:9a:d1:00, Dst: 4c:72:b9:d1:a5:9c\r\n"
      "IPv4 Layer, Src: 10.221.0.69, Dst: 10.221.79.19\r\n"
      "TCP Layer, [ACK], Src port: 54671, Dst port: 5900\r\n"
      "Payload Layer, Data length: 10 [Bytes], Data: 310000780438";
    
        
        char s[strlen(s_tr)];
        strncpy(s,s_tr, sizeof(s));
    
    
        char *str1, *str2, *token, *subtoken;
        char *saveptr1, *saveptr2;
        int j;
    
        for (j = 1, str1 = s; ;j++, str1 = NULL) {
            token = strtok_r(str1, "\r\n", &saveptr1);
            if (token == NULL)
                break;
            printf("%d: %s\n", j, token);
    
            for (str2 = token; ; str2 = NULL) {
                subtoken = strtok_r(str2, ",", &saveptr2);
                if (subtoken == NULL)
                    break;
                printf("	 --> %s\n", subtoken);
                
                char *l = strchr(subtoken, ':');
                printf("             --> %s\n", l);
            }
        }
        
        
    }


    My final code. I eventually managed to break it down to its core, except that I have a ":" (colon) in front of every value. Next mission: figure out how to remove that, lol.

    Thanks again!

    Share URL: jdoodle.com/a/nB7
    Last edited by bladez; 02-21-2018 at 03:22 AM.
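
    One possible way to drop the leading colon, sketched under the assumption that the value is everything after the first ':' in a subtoken. The helper name value_after_colon is invented for this example. It also covers subtokens such as "Ethernet II Layer" that contain no colon at all: there strchr returns NULL, and passing NULL to printf's %s is undefined behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the value that follows the first
   ':' in field, with the colon and any blanks after it skipped.
   Returns NULL when field contains no ':' at all. */
const char *value_after_colon(const char *field)
{
    const char *p = strchr(field, ':');
    if (p == NULL)
        return NULL;
    p++;                     /* step past the ':' itself */
    p += strspn(p, " \t");   /* skip blanks between ':' and the value */
    return p;
}
```

    Because only the first colon is searched for, a MAC address such as 00:04:96:9a:d1:00 stays intact. The caller should check for NULL before printing the result.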

  11. #11
    Salem (and the hat of int overfl)
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > char s[strlen(s_tr)]
    You need to add 1 to allow space for the \0, and use strcpy to make sure you copy it.
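
    A sketch of that fix, using a heap allocation rather than a variable-length array; the function name copy_string is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a writable heap copy of src, or NULL if the
   allocation fails. The caller is responsible for calling free() on it. */
char *copy_string(const char *src)
{
    char *dst = malloc(strlen(src) + 1);   /* +1 for the terminating '\0' */
    if (dst != NULL)
        strcpy(dst, src);                  /* copies the '\0' as well */
    return dst;
}
```

    With strncpy(s, s_tr, sizeof(s)) and a buffer of exactly strlen(s_tr) bytes, no terminator is ever written, so strtok_r can run off the end of the array.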

  12. #12
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    If you're just trying to extract the data, you might be able to use sscanf.
    Partial example:
    Code:
    #include <stdio.h>
    
    typedef struct Packet {
        int packet_length;
        struct {int year, month, day;} date;
        struct {int hour, minute, second, microseconds;} time;
        char src[20], dst[20];   /* a MAC address string is 17 characters */
    } Packet;
    
    int main(void) {
        char s[] =
            "Packet length: 64 [Bytes], Arrival time: 2018-02-21 12:43:36.877229\n"
            "Ethernet II Layer, Src: 00:04:96:9a:d1:00, Dst: 4c:72:b9:d1:a5:9c\n"
            "IPv4 Layer, Src: 10.221.0.69, Dst: 10.221.79.19\n"
            "TCP Layer, [ACK], Src port: 54671, Dst port: 5900\n"
            "Payload Layer, Data length: 10 [Bytes], Data: 310000780438\n";
        Packet p;
        /* The widths in %19[^,] and %19s stop sscanf from writing past
           the 20-byte src and dst buffers. */
        int n = sscanf(s, "Packet length: %d [Bytes], Arrival time: %d-%d-%d %d:%d:%d.%d\n"
                          "Ethernet II Layer, Src: %19[^,], Dst: %19s\n",
                       &p.packet_length, &p.date.year, &p.date.month, &p.date.day,
                       &p.time.hour, &p.time.minute, &p.time.second, &p.time.microseconds,
                       p.src, p.dst);
        /* sscanf returns the number of fields it converted; 10 are expected. */
        if (n != 10) {
            fprintf(stderr, "parse failed after %d fields\n", n);
            return 1;
        }
        printf("%d\n%d %d %d\n%d %d %d %d\n%s\n%s\n",
               p.packet_length, p.date.year, p.date.month, p.date.day,
               p.time.hour, p.time.minute, p.time.second, p.time.microseconds,
               p.src, p.dst);
        return 0;
    }
    Note that I added a seemingly-missing comma after "[Bytes]" in the last line.
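
    The same approach could be applied one line at a time. Below is a sketch for the IPv4 line only; the helper name parse_ipv4_line and the 16-byte buffer size (enough for a dotted-quad address) are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: parses one "IPv4 Layer" line into src and dst.
   Each buffer must hold at least 16 bytes (15 characters plus '\0').
   Returns 1 on success, 0 if the line does not match the expected layout. */
int parse_ipv4_line(const char *line, char *src, char *dst)
{
    /* sscanf returns the number of fields converted; both are required. */
    return sscanf(line, "IPv4 Layer, Src: %15[^,], Dst: %15s", src, dst) == 2;
}
```

    Checking the return value lets the caller tell a missing key apart from a parsed one, which fits the requirement that keys may or may not be present.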
    A little inaccuracy saves tons of explanation. - H.H. Munro
