Thread: Parsing data in files

  1. #1
    Registered User
    Join Date
    Oct 2015
    Posts
    33

    Parsing data in files

    Is there a tutorial where I can learn more about writing a file-parsing program? Mainly, I want to know how file parsing works in C and how to write code that scans through a file and looks for a string.

    For example, given a text file with <>a<\> repeated many times, I want to know how to write C code that scans through the file and finds the string/char/int/etc. between <> and <\>.

    I have tried searching for a guide, but I only get information on data structures...

  2. #2
    Lurking whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    I don't think there is an all-encompassing tutorial. File input is varied and new formats are being invented all the time. You are best served learning the ins and outs of the functions in string.h and of complex scanners like sscanf() or fscanf(), because sometimes they are sufficient. The complexity of the format and your willingness to tolerate format errors will dictate how much effort you should really put forth.
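
    As a rough sketch of what a scanner can do for a format like yours, the following uses a sscanf() scanset to read one <>word</> item at a time. The helper name scan_item and the 40-byte buffer size are assumptions made for illustration, and the closing marker is assumed to be </>.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one "<>word</>" item at the start of `text`
 * (leading whitespace is skipped) into `word`, which must hold at least
 * 40 bytes. Returns the number of characters consumed, or 0 if the text
 * does not start with a complete, non-empty item. */
int scan_item(const char *text, char word[40])
{
    int consumed = 0;

    assert(text != NULL && word != NULL);

    /* " <>"      skip whitespace, then match the literal opening marker
     * "%39[^<]"  store up to 39 characters that are not '<'
     * "</>"      match the literal closing marker
     * "%n"       record how many characters have been read so far */
    if (sscanf(text, " <>%39[^<]</>%n", word, &consumed) != 1)
        return 0;

    return consumed;   /* still 0 if the closing marker was missing */
}
```

    Calling it in a loop and advancing the input pointer by the return value walks through every item until it returns 0. Note that this pattern cannot handle a '<' inside an item or an item longer than 39 characters; it simply stops there.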

    For XML or other structured document formats, like HTML or JSON, a library is essential.

    In addition to learning the input functions, you could also learn about finite state machines. (Search the forum for the term.) Many tasks that involve parsing input can be solved with one, and the result is often a very clean, small amount of code.
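
    To give an idea of what a state machine looks like for this kind of input, here is a minimal sketch that reads one character at a time and collects whatever sits between <> and </>. The state names, the function name fsm_extract and the ITEM_SIZE limit are all made up for this example; treat it as one possible design, not the only way to do it.

```c
#include <assert.h>
#include <stddef.h>

#define ITEM_SIZE 40   /* assumed maximum item length, including the '\0' */

enum state { OUTSIDE, OPEN_LT, INSIDE, CLOSE_LT, CLOSE_SLASH };

/* Append one character, silently dropping it if the item is already full. */
static void put(char *item, size_t *len, char c)
{
    if (*len < ITEM_SIZE - 1)
        item[(*len)++] = c;
}

/* Stores up to max_items items found in text and returns how many were
 * completed. Only the first `count` entries of items are valid afterwards. */
int fsm_extract(const char *text, char items[][ITEM_SIZE], int max_items)
{
    enum state st = OUTSIDE;
    size_t len = 0;
    int count = 0;

    assert(text != NULL && items != NULL);

    for (; *text != '\0' && count < max_items; text++) {
        char c = *text;
        switch (st) {
        case OUTSIDE:                 /* waiting for the '<' of "<>" */
            if (c == '<') st = OPEN_LT;
            break;
        case OPEN_LT:                 /* saw '<', need '>' to open an item */
            if (c == '>') { st = INSIDE; len = 0; }
            else if (c != '<') st = OUTSIDE;
            break;
        case INSIDE:                  /* collecting item characters */
            if (c == '<') st = CLOSE_LT;
            else put(items[count], &len, c);
            break;
        case CLOSE_LT:                /* saw '<' inside, need '/' next */
            if (c == '/') { st = CLOSE_SLASH; break; }
            put(items[count], &len, '<');      /* it was ordinary content */
            if (c != '<') { put(items[count], &len, c); st = INSIDE; }
            break;
        case CLOSE_SLASH:             /* saw "</", need '>' to finish */
            if (c == '>') {
                items[count][len] = '\0';
                count++;
                st = OUTSIDE;
                break;
            }
            put(items[count], &len, '<');      /* "</" was ordinary content */
            put(items[count], &len, '/');
            if (c == '<') st = CLOSE_LT;
            else { put(items[count], &len, c); st = INSIDE; }
            break;
        }
    }
    return count;
}
```

    Because every decision depends only on the current state and the current character, the same loop works whether the characters come from a string, as here, or from fgetc() on a file. An item that is never closed is simply not counted.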

    For this specific task, a quick hack might be good enough:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    const char fileName[] = "sample.txt";
    const char content[] = 
        "<>Is</> <>there</> <>any</> <>tutorial</> <>on</> <>how</> <>can</> <>I</> <>learn</> <>more</> "
        "<>on</> <>how</> <>to</> <>creat</> <>a</> <>file</> <>parsing</> <>program?</>\n";
    
    int main(void)
    {
        FILE *file = NULL;
        char buffer[2048] = "";
        char words[60][40] = { '\0' };
        char *token = NULL;
        int k = 0, i = 0;
        const int N = sizeof words / sizeof words[0];
        
        /* Create a sample to show a simple parse. */
        file = fopen(fileName, "w+");
        if (file == NULL)
        {
            perror(fileName);
            return EXIT_FAILURE;
        }
        fputs(content, file);
        rewind(file); /* prepare sample for reading. */
    
        if (fgets(buffer, sizeof buffer, file) == NULL)
        {
            perror(fileName);
            fclose(file);
            return EXIT_FAILURE;
        }
    
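        /* Start using strtok.
        *  Note that strtok treats each character of "</>" as a separate delimiter,
        *  so the spaces between items come back as tokens too.
        *  strtok has various other problems, and there are alternatives.
        *  http://web.archive.org/web/20080612190205/http://www.daniweb.com/code/snippet318.html
        */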
        for (token = strtok(buffer, "</>"); token != NULL && k < N; token = strtok(NULL, "</>"))
        {
            /* Copy at most 39 characters; words is zero-initialized, so the result stays terminated. */
            strncpy(words[k++], token, sizeof words[0] - 1);
        }
        
        for (i = 0; i < k; i++)
        {
            printf("%s ", words[i]);
        }    
        putchar('\n');
    
        fclose(file);
        return EXIT_SUCCESS;
    }
    
    
    /* results:
    Is   there   any   tutorial   on   how   can   I   learn   more   on   how   to   creat   a   file   parsing   program? 
    */
    Keep in mind that this is very quick and dirty. You can, and probably should, code more defensively in real applications.
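
    As one sketch of what "more defensively" could mean here, the helper below uses strstr() to match the whole <> and </> markers instead of single delimiter characters, and it never writes past the output buffer. The name extract_between and its parameters are assumptions for illustration, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text found between the first `open`
 * marker and the following `close` marker into `out`, truncating it to
 * fit. Returns a pointer just past the close marker so the caller can
 * keep scanning, or NULL (leaving `out` untouched) when no complete
 * pair is left. */
const char *extract_between(const char *text, const char *open,
                            const char *close, char *out, size_t out_size)
{
    const char *start, *end;
    size_t len;

    assert(text != NULL && open != NULL && close != NULL);
    assert(out != NULL && out_size > 0);

    start = strstr(text, open);
    if (start == NULL)
        return NULL;              /* no opening marker */
    start += strlen(open);

    end = strstr(start, close);
    if (end == NULL)
        return NULL;              /* opening marker was never closed */

    len = (size_t)(end - start);
    if (len >= out_size)
        len = out_size - 1;       /* truncate rather than overflow */
    memcpy(out, start, len);
    out[len] = '\0';

    return end + strlen(close);
}
```

    Feeding the returned pointer back in as text until the function returns NULL visits every item in the buffer, and the spaces between items are skipped instead of being returned as tokens.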
    Last edited by whiteflags; 12-30-2015 at 07:33 AM.
