Thread: Parsing a comma separated string that has spaces while IGNORING the spaces...

  1. #1
    Registered User
    Join Date
    Apr 2009
    Posts
    7

    Parsing a comma separated string that has spaces while IGNORING the spaces...

    Hello, I'm having issues trying to parse a comma separated string...I can't seem to get past the spaces

    I'm simply trying to parse a string that is as follows:

    Code:
    Date,Subject,Start Time,End Time,Location
    Basically I want to end up with an array of strings containing each of the items that were separated by commas.

    Here is my program...

    Code:
    int
    main(void)
    {
            FILE *inp;
            char *line;
            int N = 10, M = 10;
            int linemax  = 256;
            int i = 0, j = 0;
            char **stringarray;
    
            line = (char *)malloc(sizeof(char) *N);
            stringarray = (char**)calloc(sizeof(char**), M);
    
            inp = fopen("appts.csv","r");
    
            fgets(line, linemax, inp);
    
    
            printf("%s\n",line);
    
            stringarray[0] = strtok(line, ",");
            stringarray[1] = strtok(NULL, ",");
            stringarray[2] = strtok(NULL, ",");
            stringarray[3] = strtok(NULL, ",");
            stringarray[4] = strtok(NULL, ",");
    
            for(j; j < 6; j++) {
                    printf("%s\n", stringarray[j]);
            }
    
    
    
            return(0);
    }

    This is my output...

    Code:
    $ a.out
    Date,Subject,Start Time,End Time,Location
    
    Date
    Subject
    Sta             )
    
    Segmentation fault (core dumped)
    How can I parse the string by commas? I know I should really be using strtok in a loop, but I'm trying to parse in the simplest way possible before attempting anything more complex, since I'm a beginner programmer.

    Thanks in advance.

  2. #2
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    You don't have to use strtok if you don't want to.

    Your logic is simple:

    1) Put the line of text into a buffer (you already do that).

    2) Using an index counter in a while loop, step through the array one char at a time until you find the next comma.

    3) When you find the comma, or the end of the line, copy the chars that have not yet been processed into their final data structure.

    This is simple *IF* you work it out by hand (paper and pencil) until you have a clear picture of the process. Then you'll use two index counters (or pointers): start is the slower one, marking the spot in the array where the next word begins, and end is the "faster" leading-edge marker, marking where that word ends.

    The big mistake is to sit down at the computer, and start banging out code, and thinking about C string functions, etc. Then you simply can't see the forest for the trees.

    If you get stuck with the algorithm, post back.

    Note: while you're working on the basic algorithm, I'd prefer that you didn't malloc/calloc the array; use a static one for now. Also, you don't need to cast the result of malloc/calloc in C. And you do need to include stdlib.h (a cast can hide that error).
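
    The two-index approach described above could be sketched roughly as follows. This is only one possible reading of the steps, using fixed-size arrays; the name split_commas and the limits MAX_FIELDS and MAX_FIELD_LEN are made up for illustration and are not from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS    10    /* assumed limit, chosen for illustration */
#define MAX_FIELD_LEN 100   /* assumed limit, chosen for illustration */

/* Split line on commas into fields[][] and return the number of fields
   stored. Spaces inside a field are kept as-is; a newline or the end of
   the string ends the last field. */
int split_commas(const char *line, char fields[MAX_FIELDS][MAX_FIELD_LEN])
{
    size_t start = 0;   /* slow index: where the current field begins */
    size_t end = 0;     /* fast index: scans ahead for the next comma */
    int count = 0;

    assert(line != NULL);

    while (count < MAX_FIELDS) {
        /* Advance end to the next comma, newline, or end of string. */
        while (line[end] != ',' && line[end] != '\n' && line[end] != '\0')
            end++;

        size_t len = end - start;
        if (len >= MAX_FIELD_LEN)
            len = MAX_FIELD_LEN - 1;        /* truncate rather than overflow */
        memcpy(fields[count], line + start, len);
        fields[count][len] = '\0';          /* always terminate the field */
        count++;

        if (line[end] != ',')
            break;                          /* newline or '\0': last field done */
        end++;                              /* step over the comma */
        start = end;
    }
    return count;
}
```

    Each field is copied and terminated as soon as its end is found, so nothing depends on the array being zeroed beforehand. Note that an empty line still yields one (empty) field with this sketch.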
    Last edited by Adak; 10-03-2009 at 07:56 PM.

  3. #3
    Make Fortran great again
    Join Date
    Sep 2009
    Posts
    1,413
    Check out fscanf, or fgets + sscanf
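
    A minimal sketch of the fgets + sscanf route, assuming the line always has exactly five fields as in the header shown earlier. The function name parse_five and the size FIELD_LEN are hypothetical. The %99[^,] conversion reads up to 99 characters that are not a comma, so embedded spaces are kept.

```c
#include <assert.h>
#include <stdio.h>

#define FIELD_LEN 100   /* assumed field size, chosen for illustration */

/* Parse five comma-separated fields from line into f[0]..f[4].
   Returns the number of fields sscanf converted (5 on success). */
int parse_five(const char *line, char f[5][FIELD_LEN])
{
    assert(line != NULL);

    /* The last scanset also excludes '\n' so the newline left by fgets
       does not end up in the final field. */
    return sscanf(line, "%99[^,],%99[^,],%99[^,],%99[^,],%99[^,\n]",
                  f[0], f[1], f[2], f[3], f[4]);
}
```

    One caveat with this approach: a scanset conversion must match at least one character, so an empty field (two commas in a row) stops the parse early and the return value will be less than 5.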

  4. #4
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    The issue isn't with strtok(), although there's a better way to go about it, but with the storage allocation; that is the cause of the segfault at runtime.
    Code:
    line = (char *)malloc(sizeof(char) *N);  /* line points to an array that can hold only 10 characters */
    ...
    fgets(line, linemax, inp);  /* tries to read up to 255 chars from inp into the array pointed to by line?? */
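
    One way to keep the two sizes in agreement is to allocate the buffer and call fgets with the same value. The helper below is only a sketch; read_line and its cap parameter are hypothetical names, not part of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line from inp into a freshly allocated buffer of cap bytes.
   Returns NULL on allocation failure, EOF, or read error.
   The caller is responsible for calling free() on the result. */
char *read_line(FILE *inp, size_t cap)
{
    assert(inp != NULL && cap > 1);

    char *line = malloc(cap);               /* no cast needed in C */
    if (line == NULL)
        return NULL;

    /* Pass the same size that was allocated: fgets stores at most
       cap - 1 characters plus the terminating '\0'. */
    if (fgets(line, (int)cap, inp) == NULL) {
        free(line);
        return NULL;
    }

    line[strcspn(line, "\n")] = '\0';       /* drop the trailing newline, if any */
    return line;
}
```

    In the original program this would be called as read_line(inp, linemax), after checking that fopen did not return NULL.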

  5. #5
    Registered User
    Join Date
    Apr 2009
    Posts
    7
    Ok, so I have a better idea of what I should be doing...trying to not make things too complex but I still seem to be getting a seg fault... >_<

    Any tips for what I should change?

    New stuff I did is in red.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int
    main(void)
    {
            FILE *inp;
            char *line;
            int N = 10, M = 10;
            int linemax  = 256;
            int i = 0, j = 0;
            char **stringarray;
            int a = 0, b = 0, c = 0, length;
    
    
            line = (char *)malloc(sizeof(char) *N);
            stringarray = (char**)calloc(sizeof(char**), M);
    
            inp = fopen("appts.csv","r");
    
            fgets(line, linemax, inp);
    
    
            printf("%s\n",line);
    
    
            length = strlen(line);
    
            while(line[a] != line[length]) {
    
                    if(line[a] == ',') {
                            b++;
                    } else {
                            stringarray[b][c] = line[a];
                            c++;
                    }
                    a++;
            }
    
    
            for(j; j < 6; j++) {
                    printf("%s\n", stringarray[j]);
            }
    
    
    
            return(0);
    }

  6. #6
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Your segfault won't go away until you fix your storage allocations.
    Since you're a beginner, why not go for static allocation instead of dynamic?
    Once the code is working you can switch to dynamic allocation.

  7. #7
    Registered User
    Join Date
    Apr 2009
    Posts
    7
    Ok I did away with the dynamic arrays for now. I don't seg fault anymore, but there seems to be some unexplainable logical error somewhere that I can't find!

    The logic seems okay to me, but I'm guessing there's something I'm missing.

    This is the output that I want:

    Code:
    $ a.out
    Date,Subject,Start Time,End Time,Location
    
    Date
    Subject
    Start Time
    End Time
    Location
    
    $
    But this is the output that I'm getting:

    Code:
    $ a.out
    Date,Subject,Start Time,End Time,Location
    
    Date
    
    
    
    
    
    $
    What's going on here..?

    Here's my code, with new changes in red.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int
    main(void)
    {
            FILE *inp;
            char line[100];
            int linemax  = 256;
            int i = 0, j = 0;
            char stringarray[100][100];
            int a = 0, b = 0, c = 0, length, count = 0;
    
            inp = fopen("appts.csv","r");
    
            fgets(line, linemax, inp);
    
    
            printf("%s\n",line);
    
    
            length = strlen(line);
    
            while(count != length) {
    
                    if(line[a] == ',') {
                            b++;
                    } else {
                            stringarray[b][c] = line[a];
                            c++;
                    }
                    a++;
                    count++;
            }
    
    
            for(j; j < 6; j++) {
                    printf("%s\n", stringarray[j]);
            }
    
    
    
            return(0);
    }

  8. #8
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Now go back to strtok() instead of the new code in red.
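
    A rough sketch of what that could look like with strtok in a loop and static storage. The name tokenize and the limit MAX_TOKENS are assumptions for illustration. Adding '\n' to the delimiter set keeps the newline that fgets leaves behind out of the last token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS 10   /* assumed limit, chosen for illustration */

/* Tokenise line in place on commas and newlines. strtok overwrites each
   delimiter with '\0', so the pointers stored in tokens[] point into
   line itself. Returns the number of tokens found (at most MAX_TOKENS). */
int tokenize(char *line, char *tokens[MAX_TOKENS])
{
    int count = 0;

    assert(line != NULL);

    for (char *tok = strtok(line, ",\n");
         tok != NULL && count < MAX_TOKENS;
         tok = strtok(NULL, ",\n")) {
        tokens[count++] = tok;
    }
    return count;
}
```

    With a buffer such as char line[256] filled by fgets(line, sizeof line, inp), the print loop would then run from 0 up to the returned count rather than a fixed 6, so it never passes a NULL pointer to printf. Keep in mind that strtok treats consecutive delimiters as one, so empty fields are skipped.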
