Thread: Need ideas for parsing a file

  1. #1
    Registered User
    Join Date
    Apr 2009
    Posts
    10

    Need ideas for parsing a file

    Hello everyone, this is my first post here, so if I'm doing something wrong or explaining myself wrong, please give me constructive criticism.

    What I'm trying to do is write a program that rewrites my cell phone contact backup file from its present format into a new format, with a name on one line and its numbers on the following lines.

    The file presently looks like this:
    Code:
    R3,H609@ .УQ8G0\zht}      2               
       2        I   BEGIN:VCARD
    VERSION:3.0
    N:Aaron;;;;
    TEL;VOICE:1561620191
    END:VCARD
          K   BEGIN:VCARD
    VERSION:3.0
    N:Aaron H;;;;
    TEL;VOICE:1561399224
    END:VCARD
          H   BEGIN:VCARD
    VERSION:3.0
    N:Adam;;;;
    TEL;VOICE:1772240172
    END:VCARD
          I   BEGIN:VCARD
    VERSION:3.0
    N:Admjgt;;;;
    TEL;VOICE:503733382
    END:VCARD
          `   BEGIN:VCARD
    VERSION:3.0
    N:Alcohol Tobacco And Firearms;;;;
    TEL;VOICE:18008003855
    END:VCARD
          G   BEGIN:VCARD
    VERSION:3.0
    N:Aleisha;;;;
    TEL;VOICE:418940
    END:VCARD
          P   BEGIN:VCARD
    VERSION:3.0
    N:Alex De Vries;;;;
    TEL;VOICE:772856317
    And I want it to look like this:
    Code:
    1:Aaron
    -1:1562620191
    2:Aaron H
    -1:1561399224
    3:Adam
    -1:1772340172
    4:Admjgt;;;;
    -1:503733382
    -2:561456754
    I've removed a digit from each number just for my own security.

    Basically, I have a moderate knowledge of programming, but I don't know that much about string functions, and I only know a bit about file operations.

    I've heard something about reading files to the effect of being required to read the file bit by bit because you can't open the entire thing at once. Is this true?

    What I'm looking for is a basic idea about how I would go about this, and what functions I would use.

    Thanks in advance,
    -Primux
    Last edited by slackwarefan; 05-05-2009 at 10:24 PM.

  2. #2
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    VERSION:3.0
    N:Aaron;;;;
    TEL;VOICE:1561620191
    END:VCARD
    Read line. Discard it.
    Read line. Output record number, then everything from : to the first ;
    Read line. Ignore everything up to : and then write that.
    Read line. Discard it.

    There you go. You can do it with whatever you like, but you might stop off at the FAQ section and look at the file stuff there.


    Quzah.
    Hope is the first step on the road to disappointment.

  3. #3
    Registered User
    Join Date
    Apr 2009
    Posts
    10
    I love the sarcasm and the condescending tone, but obviously that's not what I'm asking.

    I was thinking of reading the file into a multidimensional array, then performing string functions on each line with a loop. Would this be feasible? Would the array be too big? Should I read the data into some other sort of structure? If I read each line into a pointer instead of an array, what kind of string functions would I use to delete only parts of a line?

    These are the types of questions I have. I've looked at the FAQ and the tutorials. Both are too basic for my needs, and haven't told me anything I don't already know.

  4. #4
    Registered User
    Join Date
    Feb 2009
    Posts
    26
    Unless your file runs to tens of GB, you shouldn't have a problem opening it all at once. At least as far as I know.

    Open the file and call fgets in a loop to read one line after another.

    Standard C String and Character [C++ Reference].
    A list of the various string functions. You should be able to figure out from it which functions to use to get what you need.
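
    As a rough sketch of the fgets loop suggested above (count_lines is a hypothetical helper name, not something from this thread), the whole file can be walked with a single line buffer:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads `in` one line at a time and returns how many
   lines were read. Only one buffer is held in memory at any moment.
   Assumption: no line is longer than BUFSIZ - 1 characters; a longer line
   would arrive in several pieces and be counted more than once. */
static size_t count_lines(FILE *in)
{
    char buf[BUFSIZ];
    size_t lines = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        /* fgets keeps the line ending; cut it off before using the text */
        buf[strcspn(buf, "\r\n")] = '\0';
        lines++;   /* any per-line string work would go here */
    }
    return lines;
}
```

    Calling count_lines(fp) on a FILE * returned by fopen(name, "r") walks the file from start to end. The file size does not matter to this loop, since each fgets call overwrites the previous line.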

    cheers

  5. #5
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by slackwarefan View Post
    I love the sarcasm and the condescending tone, but obviously that's not what I'm asking.
    There was neither there. But I can provide it if you'd like.
    Quote Originally Posted by slackwarefan View Post
    I was thinking of reading the file into a multidimensional array, then performing string functions on each line with a loop. Would this be feasible?
    It would, it'd also be dumb. You only need one array, big enough to store the line. Here, try to pay attention this time:
    Code:
    char buf[ BUFSIZ ] = {0};
    FILE *fpin;
    FILE *fpout;
    int counter;
    int record = 0;
    
    fpin = fopen( yourinputfilenameinquotes, "r" );
    fpout = fopen( youroutputfilenameinquotes, "w" );
    
    for( counter = 0; fgets( buf, BUFSIZ, fpin ); counter++ )
    {
        switch( counter % 4 )
        {
            case 1:
            {
                char *c = NULL;
    
                fprintf( fpout, "%d:", record++ );
                if( (c = strchr( buf, ';' )) )
                    *c = '\0';
                if( (c = strchr( buf, ':' )) )
                    c++;
    
                fprintf( fpout, "%s;\n", c );
            }
            break;
    
            default: /* ignore it */ break;
        }
    }
    You do the rest. Or do it your way, and keep ignoring my advice. In any event, let's see some of your code.
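
    One possible shape for "the rest", offered only as a sketch: the TEL line needs the text after the ':' with the line ending removed. extract_number is a hypothetical helper name, not part of the code above.

```c
#include <string.h>

/* Hypothetical helper for a line such as "TEL;VOICE:1561620191\n".
   Returns a pointer to the text after the first ':' with the trailing
   line ending removed, or NULL if the line has no ':'.
   The line is modified in place. */
static char *extract_number(char *line)
{
    char *start = strchr(line, ':');

    if (start == NULL)
        return NULL;
    start++;                                  /* step past the ':' */
    start[strcspn(start, "\r\n")] = '\0';     /* drop "\n" or "\r\n" */
    return start;
}
```

    Under the four-line layout the switch assumes, this would be called from a case 2 branch, followed by something like fprintf( fpout, "-1:%s\n", number ). The sample in post #1 also shows a BEGIN:VCARD line, which would make five lines per card, so the divisor and case numbers are assumptions to check against the real file.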



  6. #6
    Registered User
    Join Date
    Apr 2009
    Location
    Russia
    Posts
    116
    Quote Originally Posted by slackwarefan
    Should I read the data into some other sort of structure? If I read each line into a pointer instead of an array, what kind of string functions would I use to delete only parts of a line?
    I would create an array of structures, each with a char array of 100 for the name and a char array of 50 for the number.
    You can use a temporary line buffer to filter out the name or number, and use strstr to find the substrings "N:" ";;;" "TEL;VOICE:" "END:"
    Those markers tell you where you are at each step: where the data starts, where the name is, where the number is, and where the record ends.
    For copying you can use strcpy, or strncpy to copy a substring.
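
    A minimal sketch of this marker-driven approach, under some stated assumptions: struct contact and feed_line are hypothetical names, each card has one number, and the copy is done with snprintf instead of strncpy because snprintf always terminates the destination. The "N:" marker is compared only at the start of the line, since a plain strstr for "N:" would also match inside "BEGIN:" and "VERSION:".

```c
#include <stdio.h>
#include <string.h>

#define NAME_LEN   100   /* sizes taken from the post above */
#define NUMBER_LEN 50

struct contact {
    char name[NAME_LEN];
    char number[NUMBER_LEN];
};

/* Hypothetical helper: feed it one line at a time. It fills in `c` as the
   markers go by and returns 1 when END:VCARD closes the record, else 0. */
static int feed_line(struct contact *c, const char *line)
{
    static const char tel[] = "TEL;VOICE:";
    const char *p;

    if ((p = strstr(line, tel)) != NULL) {
        p += sizeof tel - 1;                 /* skip the marker itself */
        snprintf(c->number, sizeof c->number, "%.*s",
                 (int)strcspn(p, "\r\n"), p);
    } else if (strncmp(line, "N:", 2) == 0) {
        p = line + 2;                        /* name runs up to first ';' */
        snprintf(c->name, sizeof c->name, "%.*s",
                 (int)strcspn(p, ";\r\n"), p);
    } else if (strstr(line, "END:VCARD") != NULL) {
        return 1;                            /* record complete */
    }
    return 0;                                /* BEGIN, VERSION, anything else */
}
```

    Because the decision is made from the line's content rather than its position, the binary padding in front of each BEGIN:VCARD line does not throw the parser off. An array of struct contact is only needed if the records have to be kept, for example to sort them; otherwise one struct can be reused and written out whenever feed_line returns 1.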

  7. #7
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by slackwarefan View Post
    I love the sarcasm and the condescending tone, but obviously that's not what I'm asking.
    If you perceive sarcasm and condescension in quzah's reply I think you should
    1. check your head
    2. reappraise your knowledge of C and programming in general, because you have badly misinterpreted some simple technical statement to have some non-technical meaning (I do not see anything beyond a series of simple technical statements and think you must be paranoid)


    I was thinking of reading the file into a multidimensional array, then performing string functions on each line with a loop. Would this be feasible?
    It's feasible, but as quzah's idea implies, totally unnecessary. You read the file in one line at a time and perform operations on it. You do not need to store the entire file in memory, and then work -- that would be bad programming.

    If I read each line into a pointer instead of an array, what kind of string functions would I use to delete only parts of a line?

    These are the types of questions I have. I've looked at the FAQ and the tutorials. Both are too basic for my needs, and haven't told me anything I don't already know.
    I know you may already think I'm nasty, but I'm not, so please take this for what it is -- advice: The first sentence in this quote exposes the fact that there are some very basic things (like pointers and arrays) that you may think you understand, but I would wager you don't. To be honest, I don't think the cboard tutorials are anything special (sorry yall) but in any case, I am pretty sure ANY tutorial on pointers and string functions would be of benefit to you. If you've already been through that, then the problem is you haven't had to apply the concepts enough -- which this project should be perfect for learning.

    Since the file entries appear to be of a reliable, regular structure, quzah's use of switch/case with modulus* (%) is a good idea. For the first line, you output the record number and use strchr to find the first ';' then change it to '\0' (which is a null terminator, so that line now ends there). Then you find the ':', move forward one character and your pointer will now contain:
    Code:
           v this ';' is '\0', the end
    N:Aaron;;;;
      ^ pointer
    So the pointer "c" from quzah's example will contain "Aaron".

    Get it?

    *modulus gives a remainder of a division, so 1%4=1, 2%4=2, 3%4=3, 4%4=0, 5%4=1, 6%4=2, 7%4=3, 8%4=0...
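
    The truncate-then-skip step described above can be tried in isolation with a small function. This is only a sketch, and extract_name is a hypothetical name rather than anything from quzah's code:

```c
#include <string.h>

/* Hypothetical helper: given a line such as "N:Aaron;;;;", overwrite the
   first ';' with '\0' and return a pointer to the character after the
   first ':'. The line is modified in place.
   Returns NULL if no ':' remains after the cut. */
static char *extract_name(char *line)
{
    char *end = strchr(line, ';');
    char *start;

    if (end != NULL)
        *end = '\0';             /* the string now ends where ';' was */

    start = strchr(line, ':');
    return start != NULL ? start + 1 : NULL;
}
```

    Note that the order of the two steps matters for other lines: in "TEL;VOICE:1561620191" the ';' comes before the ':', so cutting at the ';' first removes the ':' and the function returns NULL. The number line therefore needs the ':' located without the cut.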
    Last edited by MK27; 05-08-2009 at 06:59 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  8. #8
    Registered User
    Join Date
    Sep 2007
    Posts
    3
    Quote Originally Posted by quzah View Post
    You do the rest. Or do it your way, and keep ignoring my advice. In any event, let's see some of your code.
    PROTIP: First day on a new forum? Don't insult the regulars

    It is so often difficult to detect condescension and sarcasm on the interwebs. I'll always advocate liberal use of the <sarcasm> tags.

  9. #9
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by danthehat View Post
    PROTIP: First day on a new forum? Don't insult the regulars

    It is so often difficult to detect condescension and sarcasm on the interwebs. I'll always advocate liberal use of the <sarcasm> tags.
    I like cboard a lot, but it still has quacks (AFAIMC, quzah is not one of them) just like everywhere else. And some places are much worse than others, IMO. So maybe this is trauma accumulated from other forums

  10. #10
    Registered User Sharke's Avatar
    Join Date
    Jun 2008
    Location
    NYC
    Posts
    303
    Quote Originally Posted by MK27 View Post
    I like cboard a lot, but it still has quacks (AFAIMC, quzah is not one of them) just like everywhere else. And some places are much worse than others, IMO. So maybe this is trauma accumulated from other forums
    Every programming forum I've ever visited has had the same variety of regulars - those who seem to take great delight in mocking the ignorance of beginners, those who sympathize with beginners but who use their questions as a platform to showcase their superior knowledge (and end up posting answers that are far too complicated for the beginner that asked the question), those who sympathize with beginners and who make the effort to explain things in simple terms - and of course every shade in between. Sometimes I'll read a thread and get that "good cop/bad cop" feeling. I've benefited from reading the answers of all types. It's great to have a new concept explained in baby language but on the other hand, there's been times when someone's far-too-advanced answer has prompted me to go away and read further so I could understand it.

  11. #11
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Sharke View Post
    Sometimes I'll read a thread and get that "good cop/bad cop" feeling.
    Are you trying to create a problem, pal?

  12. #12
    Registered User Sharke's Avatar
    Join Date
    Jun 2008
    Location
    NYC
    Posts
    303
    Quote Originally Posted by MK27 View Post
    Are you trying to create a problem, pal?
    I ain't saying nothin'. You'd better stop hassling me or I'll call Al Sharpton's Action Network!

  13. #13
    Registered User
    Join Date
    Apr 2009
    Posts
    10
    So, I'm using fgets to read strings from the file. I'm trying to write a function that basically chops off the first part of a string, on the assumption that it's not needed. I wrote a test program to do this, but I can't get it to work. Any ideas?

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    
    char* strsh(char *lstr[], int charnum){
         int i = 0;
         
         while ( charnum <= strlen(*lstr)){
               
               *lstr[i] = lstr[charnum];
               i++;
               charnum++;
               }
         return *lstr;
    }
    
    int main(int argc, char *argv[]){
        
        char *thestring[20] = {0};
        *thestring = "I am sam.";
        *thestring = strsh(thestring, 2);
        printf("%s \n", *thestring);
        system("pause");
        return 0;
    }

  14. #14
    Registered User linuxdude's Avatar
    Join Date
    Mar 2003
    Location
    Louisiana
    Posts
    926
    I don't know but if this is for personal use, why not use another tool: grep, vim recording macros would do this extremely easily. As for the problem you are having, I am not sure what you are doing with an array of pointers?

  15. #15
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Code:
    char *thestring[20] = {0};
    That's not a string. It's an array of pointers to characters.
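
    For comparison, a sketch of what the test program seems to be aiming for, assuming the goal is to drop the first n characters in place. It keeps the name strsh from post #13 but changes the parameter to a plain char *, and it uses memmove for the shift, which is a substitution for the hand-written loop:

```c
#include <string.h>

/* Shifts `str` left by `n` characters in place and returns it.
   `str` must point to writable storage (a char array), not to a
   string literal. If n exceeds the length, the result is "". */
static char *strsh(char *str, size_t n)
{
    size_t len = strlen(str);

    if (n > len)
        n = len;
    /* len - n + 1 bytes: the remaining characters plus the '\0' */
    memmove(str, str + n, len - n + 1);
    return str;
}
```

    With char thestring[20] = "I am sam."; (an array of 20 characters, initialised from the literal), strsh(thestring, 2) leaves "am sam." in the array. If the original text does not need to be changed, no copying is required at all: thestring + 2 already points at "am sam.".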


