Reading large complicated data files

**quzah** · 05-17-2006

Originally Posted by Dave_Sinkula

Not exactly, the original file was not well formatted.

I suspected as much.

If it had been white space filled, it would have worked fine. Consider:

Code:

012 012 012 012 012
0123012 0123012 012
01230123012301230123

Here we define a chunk, for simplicity's sake, as being 4 characters. Read 4 characters, that's a chunk. Repeat. It doesn't matter if they run together, because we're simply reading four characters per chunk. The first file looked that way in Notepad, which is why I was wondering whether it was tab-delimited. If it had tabs, this method wouldn't have worked, because a tab is a single character that only displays as several columns. Tabs are the only thing that would have made it fail.
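A minimal sketch of the four-characters-per-chunk idea, working on a line already held in memory. The name get_chunk and the CHUNK macro are made up for illustration, and the function assumes the line is padded with spaces, not tabs. The last chunk on a line may come up short if the trailing padding is missing, as in the first two sample lines above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* chunk width used in the sample lines above */

/*
 * Copy chunk number `index` (0-based) of `line` into `out`, which must
 * hold at least CHUNK + 1 bytes. Returns the number of characters copied;
 * 0 means the line has no such chunk. A trailing newline is not counted.
 * (get_chunk is a hypothetical helper, not part of any library.)
 */
size_t get_chunk(const char *line, size_t index, char *out)
{
    size_t len, start, n;

    assert(line != NULL && out != NULL);

    len = strcspn(line, "\n");   /* line length without the newline */
    start = index * CHUNK;       /* chunks sit at fixed offsets */
    if (start >= len) {
        out[0] = '\0';
        return 0;
    }

    n = len - start;
    if (n > CHUNK)
        n = CHUNK;               /* a full chunk; otherwise a short last one */
    memcpy(out, line + start, n);
    out[n] = '\0';
    return n;
}
```

Because the offsets are computed from the chunk width alone, it makes no difference whether neighbouring values are separated by a space or run straight into each other.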

Chunk size would have to be decided, based on what position they were reading. Say we have six positions per line. The line is space-filled, but the chunk size may vary from position to position. We can simply use fgets to do all of our work for us if we want.

Code:

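char buf[BUFSIZ] = { 0 };
int chunksize[] = { 1, 2, 4, 7, 7, 3 };
size_t x;
int c;

/* pseudocode: repeat everything below for each line */
for( x = 0; x < sizeof( chunksize ) / sizeof ( chunksize[ 0 ] ); x++ )
{
    /* fgets stores at most size - 1 characters, hence the 1 + */
    if( fgets( buf, 1 + chunksize[ x ], fp ) == NULL )
        break; /* end of file or read error */
    switch( x )
    {
        case 0: /* do something with the first chunk... */ break;
        ...
    }
}
/* skip the newline so it isn't read as the next line's first chunk;
   assumes every line is padded out to its full width */
while( ( c = getc( fp ) ) != '\n' && c != EOF )
    ;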

It doesn't matter if they run together, and it doesn't matter if chunk sizes differ. However, this only works if the file is padded with spaces rather than tabs.
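For anyone who wants something they can compile, here is one way the loop above might be wrapped into a function. It is only a sketch: read_record, NFIELDS and MAXWIDTH are names invented for this example, the widths are the same illustrative ones as above, and it assumes every line is space-padded to its full width. A line that ends early is treated as a failure rather than being repaired.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NFIELDS  6   /* positions per line (illustrative) */
#define MAXWIDTH 7   /* widest chunk in chunksize[] */

static const int chunksize[NFIELDS] = { 1, 2, 4, 7, 7, 3 };

/*
 * Read one space-filled record from fp into fields[].
 * Returns 1 on success, 0 at end of file or when a line is too short.
 * (read_record is a hypothetical helper for this sketch.)
 */
int read_record(FILE *fp, char fields[NFIELDS][MAXWIDTH + 1])
{
    size_t x;
    int c;

    assert(fp != NULL);

    for (x = 0; x < NFIELDS; x++) {
        /* fgets stores at most size - 1 characters, hence the 1 + */
        if (fgets(fields[x], 1 + chunksize[x], fp) == NULL)
            return 0;

        /* a short read or an embedded newline means the line was not padded */
        if (strlen(fields[x]) != (size_t)chunksize[x] ||
            strchr(fields[x], '\n') != NULL)
            return 0;
    }

    /* discard the rest of the line, including the newline */
    while ((c = getc(fp)) != '\n' && c != EOF)
        ;

    return 1;
}
```

Calling it in a loop, `while (read_record(fp, fields)) { ... }`, walks the file one record at a time. The chunks come back with their padding intact, so trimming is left to the caller.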

Quzah.

**Dave_Sinkula** · 05-17-2006

Originally Posted by quzah

Here we define a chunk, for simplicity's sake, as being 4 characters. Read 4 characters, that's a chunk.
<snip>
Chunk size would have to be decided, based on what position they were reading.

Ah, now I see what you were saying. But that original file was pretty icky. It would need to be based on what position as well as some of the data already consumed, and I didn't see much of a pattern emerging from that.
