Thread: Another CSV parsing question

  1. #1
    Registered User
    Join Date
    Oct 2011
    Posts
    6

    Another CSV parsing question

    Hello all,

    I'm hoping I'm not beating a dead horse here (which is completely possible), but I can't figure out what I'm doing wrong with this CSV parsing code. I have a CSV file that has null fields, but it will always have the commas for those fields. So what we get is a file with the following structure:

    1,2,3,4,,,,,,6,7,,,7,8,,,8

    It could also all be filled in:

    1,2,3,4,5,6,7,8,9

    They will always have the same number of columns though (even though my example did not). The code I'm using to parse this file is shown below:

    Code:
    #include <stdio.h>   /* required for file operations */
    #include <string.h>  /* required for strtok() */

    int main(void)
    {
        char line[2500];
        char delims[] = ",";
        char *result = NULL;
        int count = 0;
        FILE *fr = fopen("testfile", "r");   /* open the file for reading */

        if (fr == NULL) {
            perror("testfile");
            return 1;
        }

        while (fgets(line, sizeof line, fr) != NULL) {
            if (line[0] == '#') {            /* skip comment lines */
                continue;
            }
            result = strtok(line, delims);
            while (result != NULL) {
                if (count == 11) {
                    printf("result: %s\n", result);
                    count = 0;
                } else {
                    count += 1;
                }
                result = strtok(NULL, delims);
            }
            count = 0;                       /* start counting again on the next line */
        }
        fclose(fr);  /* close the file prior to exiting the routine */
        return 0;
    } /* of main */
    I've tested files with all the columns filled out in every line and this code seems to work. So I guess my question is, how do I test for a field having null characters such that I can print the REAL 11th field of the csv file's line? Hopefully this is a clear question, if you need any clarification, please let me know and I will provide as much data as I can!

  2. #2
    Master Apprentice phantomotap
    Join Date
    Jan 2008
    Posts
    5,108
    You don't have a null character in that situation. You have an empty string. These are different concepts.

    That said, you apparently know how to check the value of a character. Why don't you start by trying to check the distance between delimiters as they are found?

    Soma

  3. #3
    Registered User
    Join Date
    Oct 2011
    Posts
    6
    So, for instance, parse the line with a for loop instead of strtok?

  4. #4
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    You need something other than strtok() if you have empty fields in your CSV file.

    Hello,,,,,,,,,World
    is just two tokens as far as strtok() is concerned.

    Consider instead (if you can)
    NAME
    strsep - extract token from string

    SYNOPSIS
    #include <string.h>

    char *strsep(char **stringp, const char *delim);

    Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

    strsep(): _BSD_SOURCE

    DESCRIPTION
    If *stringp is NULL, the strsep() function returns NULL and does nothing else. Otherwise, this function finds the first token in the string
    *stringp, where tokens are delimited by symbols in the string delim. This token is terminated with a '\0' character (by overwriting the delimiter)
    and *stringp is updated to point past the token. In case no delimiter was found, the token is taken to be the entire string *stringp, and *stringp
    is made NULL.

    RETURN VALUE
    The strsep() function returns a pointer to the token, that is, it returns the original value of *stringp.

    CONFORMING TO
    4.4BSD.

    NOTES
    The strsep() function was introduced as a replacement for strtok(3), since the latter cannot handle empty fields. However, strtok(3) conforms to
    C89/C99 and hence is more portable.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.
