Thread: problem with CSV input

  1. #1
    Registered User
    Join Date
    Feb 2009
    Posts
    6

    problem with CSV input

    I am an extreme newbie to C and am trying to write a "simple" program that opens a CSV text file of integers, parses them, and then sums them. I wrote what I thought should work, but it does not actually retrieve the integer values to sum. The only token it reads is the filename. Could someone help?

    Here is what I have.



    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main()
    {
       int x;
       int sum;
       char filename[] = "data.txt";
       FILE *file = fopen(filename,"r");
       char *ptr;
       ptr = strtok(filename, ",");
       sum = 0;

       if (file == NULL)
       {
          printf("File could not be opened.\n");
          getchar ();
       }
       else
       {
          while (ptr != NULL)
          {
             x = atoi(ptr);
             printf("The string is: %d\n", ptr);

             sum = sum + x;
             ptr = strtok(NULL, ",");
          }
          fclose(file);
       }
       printf ("The sum of the numbers is: %d\n",sum);
       getchar ();
    }

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    So, strtok doesn't read from a file. Maybe fgets is what you're thinking of?
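
    To illustrate the point: strtok only splits a string that is already in memory, so the text has to be read from the file into a buffer first, for example with fgets. Below is a minimal sketch, assuming every line fits in the buffer and holds plain integers separated by commas. The names sum_csv_line and sum_csv_stream are made up for this example and are not from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: sums the comma-separated integers in one line.
   strtok writes into the buffer, so `line` must be a writable array. */
long sum_csv_line(char *line)
{
    long sum = 0;
    char *tok;

    assert(line != NULL);
    /* Newline characters are delimiters too, so the last token is clean. */
    for (tok = strtok(line, ",\r\n"); tok != NULL; tok = strtok(NULL, ",\r\n"))
        sum += strtol(tok, NULL, 10);   /* non-numeric tokens count as 0 */
    return sum;
}

/* Hypothetical helper: reads the stream one line at a time with fgets
   and adds up every integer. Assumes no line is longer than the buffer. */
long sum_csv_stream(FILE *fp)
{
    char line[256];
    long total = 0;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL)
        total += sum_csv_line(line);
    return total;
}
```

    With the file from the first post, this would be called as sum_csv_stream(file) after checking that fopen did not return NULL.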

  3. #3
    Registered User
    Join Date
    Feb 2009
    Posts
    6
    That could be why it doesn't read the values. I know nothing about C and am constructing the code from examples I find...still trying to understand the structure and everything. I have been trying to figure out how to use fgets, but have not had any success.

  4. #4
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    So you should read what it does. You can go to cplusplus.com (despite the name), or you can type "man fgets" into your shell/Google, depending on your OS.

  5. #5
    Registered User
    Join Date
    Feb 2009
    Posts
    6
    yeah...been reading up on it, but trying to decipher the language...looking at a couple examples, too

  6. #6
    Registered User
    Join Date
    Feb 2009
    Posts
    138
    this might help.
    Code:
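    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    void read_csv_line(char *s);
    
    int main(void)
    {
        char s[80];   /* one line: at most 79 characters plus '\0' */
        FILE *fp = fopen("test.txt", "r");
        if (fp == NULL)
        {
            printf("File could not be opened.\n");
            return EXIT_FAILURE;
        }
        /* fgets copies one line at a time from the file into s */
        while (fgets(s, sizeof s, fp)) read_csv_line(s);
        fclose(fp);
        return EXIT_SUCCESS;
    }
    
    /* split the line on commas and print each token */
    void read_csv_line(char *s)
    {
        /* '\n' is a delimiter too, so the last token has no newline */
        char *t = strtok(s, ",\n");
        while (t)
        {
            printf("%s\n", t);
            t = strtok(NULL, ",\n");
        }
    }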

  7. #7
    Registered User
    Join Date
    Feb 2009
    Posts
    6
    That helped a lot. Is there any way to do the same thing without restricting the size? Ultimately it's supposed to be able to handle any file without knowing how large the data set is.
    Last edited by cmiller4; 02-05-2009 at 10:56 AM.

  8. #8
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    If you are referring to the number 80, it is the maximum line length, not the file size. The solution given by Meldreth makes no assumption about the file size.

  9. #9
    Registered User
    Join Date
    Feb 2009
    Posts
    138
    The only limit in my example was the line size. The number of lines in the file is unbounded. You can remove the line limit, but it's harder because you have to make a growing array, and that means using your own input function.
    Code:
    #define THRESHOLD 16
    
    char *getline(FILE *fp)
    {
        char *s = NULL;
        int sz = 0, cap = 0;
        char c;
        while ((c = fgetc(fp)) != EOF)
        {
            if (sz == cap)
            {
                cap += THRESHOLD;
                s = realloc(s, cap+1);
            }
            if (c == '\n') break;
            s[sz++] = c;
        }
        if (s) s[sz] = '\0';
        return s;
    }
    Be careful with that one though. I wrote it fast just now, so it might be buggy, and it definitely wastes memory.

  10. #10
    Registered User
    Join Date
    Feb 2009
    Posts
    6
    Thanks...I'll play around with it. I'm not too worried about memory size right now, since I'm just learning and my primary objective is trying to figure out how to make it work. Efficiency is phase II.

  11. #11
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    s = realloc(s, cap+1);

    The FAQ explains why this could be a bad idea: if realloc fails it returns NULL, and assigning that straight to s loses the original pointer.

    Also, there is no check of realloc's return value at all.

    And fgetc returns int, so c should be declared as int to distinguish EOF from the 0xFF character.
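
    The three points above can be sketched as a revised reader. This is only one possible way to apply them, not the original poster's code: the name read_line is made up here (it also avoids clashing with the POSIX getline function), and doubling the capacity is an assumption in place of the fixed THRESHOLD step.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical replacement for the earlier getline: returns a malloc'd
   line without its newline, or NULL at end of input or when memory
   runs out. The caller must free the result. */
char *read_line(FILE *fp)
{
    char *s = NULL;
    size_t sz = 0, cap = 0;
    int c;                              /* int, so EOF differs from 0xFF */

    assert(fp != NULL);
    while ((c = fgetc(fp)) != EOF) {
        if (sz + 1 >= cap) {            /* keep room for the terminator */
            size_t ncap = cap ? cap * 2 : 16;
            char *tmp = realloc(s, ncap);
            if (tmp == NULL) {          /* s is still valid, so free it */
                free(s);
                return NULL;
            }
            s = tmp;
            cap = ncap;
        }
        if (c == '\n')
            break;
        s[sz++] = (char)c;
    }
    if (s != NULL)
        s[sz] = '\0';
    return s;
}
```

    An empty line comes back as an empty string, while end of input comes back as NULL, so a loop such as while ((line = read_line(fp)) != NULL) can tell the two apart.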
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  12. #12
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by cmiller4
    Thanks...I'll play around with it. I'm not too worried about memory size right now, since I'm just learning and my primary objective is trying to figure out how to make it work. Efficiency is phase II.
    Assuming you are not trying to write a CSV reader that takes ALL possible CSV files (and your current code doesn't look like it handles, for example, quotes), I'd suggest you just increase the maximum length of a line to some large value, e.g. 8000. You can check whether fgets read the whole line by checking whether the last character is a newline:
    Code:
    size_t len;
    fgets(str, ...); 
    len = strlen(str);
    if (str[len-1] == '\n')  
        // We got a whole line
    else
       // Line too long to fit in buffer.
    Trying to cope with extremely long lines when a "large enough" value can be found is a waste of effort.
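
    For reference, the check above can be filled out into a complete function. This is a sketch under assumptions, not part of the original post: the name read_whole_line and its return codes are invented for the example, and throwing away the rest of an over-long line is just one possible policy.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf, which holds cap bytes.
   Returns  1 for a complete line (newline stripped),
            0 if the line did not fit (the rest of it is discarded),
           -1 at end of input. */
int read_whole_line(char *buf, size_t cap, FILE *fp)
{
    size_t len;
    int c;

    assert(buf != NULL && cap > 1 && fp != NULL);
    if (fgets(buf, (int)cap, fp) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';            /* we got a whole line */
        return 1;
    }
    if (len + 1 < cap)                  /* buffer not full: last line, no newline */
        return 1;
    c = fgetc(fp);                      /* buffer full: look one character ahead */
    if (c == EOF || c == '\n')
        return 1;                       /* the line fit exactly */
    while ((c = fgetc(fp)) != EOF && c != '\n')
        ;                               /* line too long: skip the remainder */
    return 0;
}
```

    With a buffer of 8000 bytes, as suggested above, the 0 case should be rare, and the caller can report it or skip the line.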

    Sure, if you are writing a commercial piece of software to handle CSV files, you obviously should cope with extremely long lines (and in that case it may actually be better to read a chunk of data into a buffer and scan each character, rather than relying on strtok).
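
    As a rough sketch of the chunk-and-scan idea for the summing task in this thread: the function below reads fixed-size chunks with fread and walks them character by character, so line length no longer matters. It assumes unsigned decimal fields only (no signs, no quotes), and the name sum_csv_chunks is made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: sums non-negative integers separated by commas
   or newlines. Any non-digit character ends the current field. */
long sum_csv_chunks(FILE *fp)
{
    char buf[4096];
    size_t n, i;
    long total = 0, current = 0;

    assert(fp != NULL);
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (i = 0; i < n; i++) {
            unsigned char ch = (unsigned char)buf[i];
            if (isdigit(ch)) {
                current = current * 10 + (ch - '0');   /* extend the number */
            } else {
                total += current;                      /* field ended */
                current = 0;
            }
        }
    }
    return total + current;   /* the last field may lack a trailing newline */
}
```

    Because current is kept between chunks, a number that happens to straddle a chunk boundary is still added up correctly.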

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  13. #13
    Registered User
    Join Date
    Feb 2009
    Posts
    6

    Thanks so much for all of the valuable information. I am definitely learning more about this process...the books I have don't speak layman's English, so you all have been a huge help.
