Thread: File Parser Class Design:Need Suggestions

  1. #1
    Registered User
    Join Date
    Sep 2009
    Posts
    63

    File Parser Class Design:Need Suggestions

    Hello all,

    I've written a simple file parser that can handle two different, though fairly similar, types of files for a game. Everything works, but I wrote two classes, one for each type, and I think it could be written in a better way. Both types are read line by line in exactly the same way, as both have values separated by semicolons.

    My current idea is to write a template class. I have defined two structs, each defining the data that needs to be collected for each type of file. The readFile method will be the same for each type, as it simply reads everything one line at a time, skipping some garbage lines that are present. It then calls a parse method, which reads the data into a list of the struct type. The parse method will vary based on what sort of file it needs to read, with differing requirements. Finally, we make a call to a write function, which will again vary based on the type of file.

    So, my idea is to pass along function pointers to the parse and write methods. Another possibility would be to create a class that has a definition for the read() method, but have methods for parsing and writing that have to be overridden in child classes.

    What sort of approach would you guys have?

  2. #2
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    This doesn't sound like something that a template would be appropriate for. Then again, maybe I'm just misunderstanding the requirements. What would be the purpose of the function pointers? Perhaps a concrete example (in code) would help describe the problem more clearly?

  3. #3
    Registered User
    Join Date
    Sep 2009
    Posts
    63
    The function pointers came from a (flawed) understanding of how template classes work. I had forgotten that you can define a member function that is specific to a certain datatype. It was this error that caused me to "need" function pointers to handle a specific datatype.
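
    As a rough illustration of a member function that is specific to one datatype, a class template can declare the member once and give it an explicit specialization per record type. Parser, TypeOne, and TypeTwo are hypothetical names, and the field order is only assumed from the sample header lines later in the thread.

```cpp
#include <cstddef>
#include <string>

struct TypeOne { std::string name; };                 // hypothetical record
struct TypeTwo { std::string id, position; };         // hypothetical record

template <typename Record>
class Parser {
public:
    // Declared once; each record type gets its own definition below.
    Record parse(const std::string& line) const;
};

// Explicit specialization: type one keeps the first field as the name.
template <>
inline TypeOne Parser<TypeOne>::parse(const std::string& line) const
{
    TypeOne rec;
    rec.name = line.substr(0, line.find(';'));
    return rec;
}

// Explicit specialization: type two keeps the first two fields.
template <>
inline TypeTwo Parser<TypeTwo>::parse(const std::string& line) const
{
    TypeTwo rec;
    const std::size_t first = line.find(';');
    rec.id = line.substr(0, first);
    if (first != std::string::npos) {
        const std::size_t second = line.find(';', first + 1);
        // substr clamps the count, so a missing second ';' reads to the end.
        rec.position = line.substr(first + 1, second - first - 1);
    }
    return rec;
}
```

    No function pointers are needed here: Parser<TypeOne> and Parser<TypeTwo> each pick up their own parse at compile time.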

    But now that I think about it, I too have come to the conclusion that templates weren't meant to be used in the ways I was thinking of.

    Anyways, back to the requirements. The only requirements are these:

    1) Whatever design is employed, it must call a different parsing and writing function depending on what type of file we're messing with. Inheritance, function pointers, whatever, this isn't a problem to do.

    2) It avoids duplication/wasting of code as much as possible.

    So, I'm leaning towards just doing the plain old boring class with children. And I was hoping to avoid children, because you have to clean up after them, buy them toys, and all that.

  4. #4
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    I'd probably go with the inheritance approach using virtual functions (maybe; I'm still not clear on what exactly you have in mind), e.g.:

    Code:
        
        
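    #include <iosfwd>   // forward declarations of std::istream and std::ostream

    class file_parser_base
    {
        public:

        virtual ~file_parser_base() {}  // safe deletion through a base pointer

        virtual bool parse( std::istream& in ) = 0;   // read one file's data
        virtual bool write( std::ostream& out ) = 0;  // emit the parsed data
    };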

  5. #5
    ...and never returned. StainedBlue's Avatar
    Join Date
    Aug 2009
    Posts
    168
    You said:

    >>it must call a different parsing and writing function depending on what type of file we're messing with

    and

    >>It avoids duplication/wasting of code as much as possible

    in the same breath.

    If you want really modular code, then I would start by separating everything that is 100% common between the two from the things that are only partly shared and the things that are not shared at all.

    Examine what's what, then think about the best approach.

  6. #6
    ...and never returned. StainedBlue's Avatar
    Join Date
    Aug 2009
    Posts
    168
    In all it sounds like you really only need one class, that calls the appropriate method(s) based on what's found in the data file. Templates or function pointers seem like overkill to me.

    The reason being, your data file is already highly structured and relatively simple, so a complex parsing scheme is unnecessary. Seeing an example and a brief explanation of the data files would help, though...

  7. #7
    Registered User
    Join Date
    Sep 2009
    Posts
    63
    You're probably right that templates and such would be overkill for these kinds of files. Most of my error checking isn't needed for these files, either, but it's there in case someone does something silly.

    The files would look something like this. These are what would be the definition lines for the files, which I can't remember off the top of my head, and am too lazy to look up. But you certainly get the idea. All other lines of the file just fill in the data that would go in each column. Easy to read, easy to split up, etc.


    Code:
    Filetype One
    Name;ID;Start Year; Rank 1 Year; Rank 2 Year; Rank 3 Year; Traits;Type;......
    
    Filetype Two
    ID;Position;Start Year;Death Year;Position;Personality;.....
    This is not the whole of the story, however. In type two, we can have the same name defined twice, and they will refer to the same guy. Because the previous game treated every entry as its own person, you could have one version of the guy die while all others of him live. So I had to handle the case where we have another definition of the guy. The Position and Personality must be kept together.
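
    Splitting a line in the semicolon-separated layout shown above can be sketched with std::getline and a custom delimiter. The helper name split_fields is hypothetical; note that it keeps leading spaces (as in " Rank 1 Year") and drops an empty field after a trailing semicolon.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits one line into fields on the given separator.
// Whitespace around fields is preserved; trimming is left to the caller.
std::vector<std::string> split_fields(const std::string& line, char sep = ';')
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, sep))
        fields.push_back(field);
    return fields;
}
```

    Empty fields in the middle of a line, such as the one in "a;;b", come back as empty strings, so column positions stay aligned with the header.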

    In both cases, I defined a struct for the filetype, with a variable for each bit of information that we want to read.

    e.g.

    Code:
    struct type_one {
        int startYear;
        std::string name;
        std::string type;
        ......
    };
    For the second type, its class has a map member variable where the key is the name of the guy, and the value is the struct. I use the find() function to find anyone with the same name. If he already exists, we add our copy's position and personality into a list with a small struct type (just two strings). If not, we add a new guy to our map. I collect all of the definitions in the file and then write them all at once.
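
    The find-then-merge step described above might look roughly like this. The names role, type_two, people_map, and add_definition are hypothetical, as are the exact fields; the post only says the small struct holds two strings.

```cpp
#include <list>
#include <map>
#include <string>

// Hypothetical small struct: position and personality stay together.
struct role { std::string position; std::string personality; };

// Hypothetical record for the second file type.
struct type_two {
    int startYear;
    int deathYear;
    std::list<role> roles;   // one entry per definition of the same person
};

typedef std::map<std::string, type_two> people_map;

// Adds one definition. A known name only gains another role;
// an unknown name creates a new entry.
void add_definition(people_map& people, const std::string& name,
                    int startYear, int deathYear, const role& r)
{
    people_map::iterator it = people.find(name);
    if (it == people.end()) {
        type_two person;
        person.startYear = startYear;
        person.deathYear = deathYear;
        person.roles.push_back(r);
        people.insert(std::make_pair(name, person));
    } else {
        it->second.roles.push_back(r);  // same guy: keep the first years
    }
}
```

    In this sketch the years from the first definition win; whether later definitions should update them is a design decision the post does not settle.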

    The number of lines to read varies quite a bit. The smallest file is two lines (with one guy defined), but the largest I've encountered had about 2800 lines and 1905 individual people defined. For the largest file it takes, in my estimation, less than a second or two to read, parse, and write. I'm sure there are ways to speed up the program.

    The first type can suffice with any sort of collection type because every line is its own definition. I used maps, for the moment, so that I could define the name as a key and have it sort for me. I could write a comparison operator for my struct so that I can use other collection types, however.

    I don't have to check for name collisions. Just load all the data we need and go. I could get away with writing each definition to a file as it's parsed, but for now I read the whole file and then write out at one time. This helps modders, as the map sorts everything by name. Where multiple copies of the same name occur, we differentiate them by their types. If a name is repeated and has the same type, the old one is destroyed and overwritten. This is because it is about 99.999% likely that the modder was using the second definition to model things, like differing skill over time.
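
    The overwrite behaviour described above falls out naturally from a map keyed on both name and type. This is only a sketch: type_one_map and store are hypothetical names, and the struct again shows only three fields.

```cpp
#include <map>
#include <string>
#include <utility>

struct type_one {
    int startYear;
    std::string name;
    std::string type;
};

// Key is (name, type): same name with a different type is a separate entry.
typedef std::map<std::pair<std::string, std::string>, type_one> type_one_map;

// operator[] creates the entry if needed, then the assignment
// replaces any older definition with the same name and type.
void store(type_one_map& records, const type_one& rec)
{
    records[std::make_pair(rec.name, rec.type)] = rec;
}
```

    Iterating the map then yields the records sorted by name, with same-name entries ordered by type.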

    For the moment, I just went with the simple generic handler class with two children, one for each type. I was just wondering if someone could have thought of some more advanced or elegant way of handling this.

  8. #8
    ...and never returned. StainedBlue's Avatar
    Join Date
    Aug 2009
    Posts
    168
    I think you are on the right track. Check out this discussion.

    I guess my point is to lean on the iostream library (well duh, right?) as much as possible.
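
    One way to read that advice, sketched under assumptions: give the record its own operator>> so the reading code becomes a plain while (in >> rec) loop. The struct type_two_line and its three fields are hypothetical and only loosely follow the "ID;Position;Start Year;..." header from earlier in the thread.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record holding the first three fields of a type-two line.
struct type_two_line {
    std::string id;
    std::string position;
    int startYear;
};

// Reads one whole line and fills rec from its first three fields.
// On a malformed line, rec is left untouched and failbit is set.
std::istream& operator>>(std::istream& in, type_two_line& rec)
{
    std::string line;
    if (!std::getline(in, line)) return in;   // end of input or read error

    std::istringstream fields(line);
    type_two_line tmp;
    tmp.startYear = 0;
    if (std::getline(fields, tmp.id, ';') &&
        std::getline(fields, tmp.position, ';') &&
        (fields >> tmp.startYear)) {
        rec = tmp;
    } else {
        in.setstate(std::ios_base::failbit);
    }
    return in;
}
```

    Any remaining fields on the line are simply ignored in this sketch; a fuller version would read them the same way.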
