Thread: Reading large complicated data files

  1. #16
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by Dave_Sinkula
    Not exactly, the original file was not well formatted.
    I suspected as much.

    If it had been whitespace-filled, it would have worked fine. Consider:
    Code:
    012 012 012 012 012
    0123012 0123012 012
    01230123012301230123
    Here we define a chunk, for simplicity's sake, as being 4 characters. Read 4 characters, that's a chunk. Repeat. It doesn't matter if the values run together, because we're simply reading four characters per chunk. The first file looked that way in Notepad, which is why I was wondering whether it was tab-delimited. A tab is a single character that displays as several columns, so if the file had used tabs, counting characters wouldn't have worked. That's the only way it would have failed.
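    A minimal sketch of the 4-character read described above. The helper name read_chunk and the CHUNK macro are made up for illustration, not anything from the original file or thread. It also strips the newline that fgets picks up when the last chunk on a line is short, which the description above glosses over.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK 4   /* chunk width used in the example above */

/* Read the next chunk of at most CHUNK characters into buf.
   Returns the number of characters stored, not counting a newline.
   A return smaller than CHUNK means the end of the line was reached.
   Returns -1 at end of file or on a read error. */
int read_chunk( FILE *fp, char buf[ CHUNK + 1 ] )
{
    size_t len;

    assert( fp != NULL );

    /* fgets stores at most size - 1 characters, hence CHUNK + 1 */
    if( fgets( buf, CHUNK + 1, fp ) == NULL )
        return -1;

    len = strlen( buf );
    if( len > 0 && buf[ len - 1 ] == '\n' )
        buf[ --len ] = '\0';   /* short or empty last chunk: drop the newline */

    return (int)len;
}
```

    On the sample lines, "012 012 012 012 012" comes back as four chunks of "012 " followed by a short "012", and the run-together "01230123012301230123" comes back as five chunks of "0123" followed by an empty chunk that marks the end of the line.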

    Chunk size would have to be decided, based on what position they were reading. Say we have six positions per line. The file is space-filled, but the chunk size may vary from position to position. We can simply let fgets do all of the work for us if we want.
    Code:
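    char buf[BUFSIZ] = { 0 };
    int chunksize[] = { 1, 2, 4, 7, 7, 3 };   /* field width at each position */
    size_t x;
    int c, done = 0;
    
    while( !done )   /* one pass per line */
    {
        for( x = 0; x < sizeof( chunksize ) / sizeof ( chunksize[ 0 ] ); x++ )
        {
            /* fgets stores at most size - 1 characters, hence the 1 + */
            if( fgets( buf, 1 + chunksize[ x ], fp ) == NULL )
            {
                done = 1;   /* end of file or read error */
                break;
            }
            switch( x )
            {
                case 0: /* do something with the first chunk... */ break;
                /* ... */
            }
        }
        /* discard the newline so the next line starts at position 0 */
        while( !done && ( c = getc( fp ) ) != '\n' && c != EOF )
            ;
    }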
    It doesn't matter if the values run together, and it doesn't matter if chunk sizes differ. However, this only works if the file is space-filled rather than tab-delimited.
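    For anyone who wants something they can compile, here is one way the per-position loop above could be wrapped into a function. This is only a sketch: read_record, NFIELDS and MAXWIDTH are hypothetical names, the widths are the ones from the snippet above rather than anything taken from the real file, and it assumes each line is padded to full width (a short line just yields fewer fields).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NFIELDS  6
#define MAXWIDTH 7   /* widest entry in chunksize */

/* Field width at each position; illustrative values from the post above. */
static const int chunksize[ NFIELDS ] = { 1, 2, 4, 7, 7, 3 };

/* Read one line as fixed-width fields, one fgets call per position.
   Returns the number of fields stored (NFIELDS for a full line, fewer
   for a short one), or -1 if end of file is hit before any field. */
int read_record( FILE *fp, char fields[ NFIELDS ][ MAXWIDTH + 1 ] )
{
    size_t x;
    int c, n = 0, eol = 0;

    for( x = 0; x < NFIELDS && !eol; x++ )
    {
        size_t len;

        assert( chunksize[ x ] <= MAXWIDTH );
        if( fgets( fields[ x ], chunksize[ x ] + 1, fp ) == NULL )
            return n > 0 ? n : -1;

        len = strlen( fields[ x ] );
        if( len > 0 && fields[ x ][ len - 1 ] == '\n' )
        {
            fields[ x ][ --len ] = '\0';   /* line ended early */
            eol = 1;
            if( len == 0 )
                break;                     /* only a newline: not a field */
        }
        n++;
    }

    /* Skip anything left on the line, including the newline. */
    if( !eol )
        while( ( c = getc( fp ) ) != '\n' && c != EOF )
            ;

    return n;
}
```

    The padding is kept in each field ("  hello" stays "  hello"), so it is up to the caller to trim it or hand the field to something like strtol.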


    Quzah.
    Last edited by quzah; 05-17-2006 at 04:47 PM.
    Hope is the first step on the road to disappointment.

  2. #17
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Quote Originally Posted by quzah
    Here we define a chunk, for simplicity's sake, as being 4 characters. Read 4 characters, that's a chunk.
    <snip>
    Chunk size would have to be decided, based on what position they were reading.
    Ah, now I see what you were saying. But that original file was pretty icky. The chunk size would need to depend on the position as well as on some of the data already consumed, and I didn't see much of a pattern emerging from that.
    7. It is easier to write an incorrect program than understand a correct one.
    40. There are two ways to write error-free programs; only the third one works.*
