Retrieving a certain line from a large text file

**Violet_Shift** · 05-06-2012

I'm writing a custom search engine, and I'm in the final stages of getting it to work. However, I'm dealing with a very large document collection, with files weighing in at hundreds of megabytes, so loading everything into RAM is not an option.

The internal processing of my engine tells me precisely what line of the text documents I can find the information I need, but I can't figure out how to rip out a specific line.

Could anyone give me advice with this?

Thanks.

**Subsonics** · 05-06-2012

You can use something like fgets and keep a count of the lines as you read. It would be much more efficient if you knew the byte offset at which the line starts, however; then you could use fseek to jump straight to it instead.
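As a rough sketch of the fgets-and-count approach: the helper name `read_line_by_scan`, the 0-based line numbering, and the 4096-byte chunk size are all assumptions made for illustration, not anything from the engine in question. It only counts a line once a newline has been seen, so lines longer than the buffer don't throw the count off.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies line `target` (0-based) of `fp` into `out`,
 * without the trailing newline. Returns 0 on success, -1 if the file has
 * fewer lines than that. If the line is longer than the internal buffer
 * or than `out`, only its beginning is copied. */
int read_line_by_scan(FILE *fp, long target, char *out, size_t outsz)
{
    char buff[4096];
    long lineNo = 0;
    int at_line_start = 1;   /* is the next chunk the start of a line? */

    if (target < 0 || outsz == 0)
        return -1;
    rewind(fp);
    while (fgets(buff, sizeof buff, fp) != NULL) {
        size_t len = strlen(buff);
        int ends_line = (len > 0 && buff[len - 1] == '\n');

        if (at_line_start && lineNo == target) {
            if (ends_line)
                buff[len - 1] = '\0';      /* strip the newline */
            strncpy(out, buff, outsz - 1);
            out[outsz - 1] = '\0';
            return 0;
        }
        if (ends_line)
            lineNo++;                      /* a full line has been consumed */
        at_line_start = ends_line;
    }
    return -1;                             /* ran out of lines */
}
```

This reads every byte before the wanted line on each lookup, which is why knowing the offset in advance is so much cheaper for a file of this size.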

**Violet_Shift** · 05-06-2012

Ahh, fseek would be impossible, as I'm looking through a postings list, which is a mountain of data in the form of:

0 1 2 3 4 5
1 4 7
0 1 2
1 5 8 11

and so on, for numbers pulled completely out of my brain. Unless there's a way of finding the offset otherwise.

**Subsonics** · 05-06-2012

It would not be impossible if you knew the byte offset instead of the line number.

> The internal processing of my engine tells me precisely what line of the text documents I can find the information I need

**Violet_Shift** · 05-06-2012

Hmm. I'm not sure how I'd be able to do that, given only a line number. I think that'd require substantial rewriting of my indexer.

**Salem** · 05-06-2012

If your indexer is storing line numbers, then all you need is another parallel array storing the seek positions.

Code:

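long pos;   // byte offset of the line about to be read
while ( (pos = ftell(fp)) >= 0 && fgets( buff, sizeof buff, fp ) != NULL ) {
    seekPositions[lineNo] = pos;   // remember where line lineNo starts
    // do stuff with buff and lineNo
    lineNo++;   // assumes buff is big enough to hold any whole line
}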

When you want to retrieve a line, it's just fseek(fp,seekPositions[lineNo],SEEK_SET) and then read the line.
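For what it's worth, here is a fuller sketch of the seek-position idea, building the array in one pass and then fetching a line with fseek. The function names `build_line_index` and `read_line_by_index`, the growable array, and the chunk size are assumptions for illustration only. It also assumes offsets fit in a `long`; on platforms where `long` is 32 bits and the files exceed 2 GB, the POSIX `ftello`/`fseeko` pair would be the thing to look at.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: scans `fp` once and returns a malloc'd array with the
 * byte offset at which each line starts. *count receives the number of
 * lines. Returns NULL if memory runs out. The caller frees the array. */
long *build_line_index(FILE *fp, size_t *count)
{
    char buff[4096];
    size_t cap = 1024, n = 0;
    long *seekPositions = malloc(cap * sizeof *seekPositions);
    long pos;
    int at_line_start = 1;   /* is the next chunk the start of a line? */

    if (seekPositions == NULL)
        return NULL;
    rewind(fp);
    while ((pos = ftell(fp)) >= 0 && fgets(buff, sizeof buff, fp) != NULL) {
        if (at_line_start) {
            if (n == cap) {  /* grow the array when it is full */
                long *tmp = realloc(seekPositions, 2 * cap * sizeof *tmp);
                if (tmp == NULL) {
                    free(seekPositions);
                    return NULL;
                }
                seekPositions = tmp;
                cap *= 2;
            }
            seekPositions[n++] = pos;
        }
        size_t len = strlen(buff);
        /* a chunk without a newline means the line continues */
        at_line_start = (len > 0 && buff[len - 1] == '\n');
    }
    *count = n;
    return seekPositions;
}

/* Hypothetical helper: jumps to line `lineNo` (0-based) via the index and
 * reads it into `out`, stripping the newline. Returns 0 on success. */
int read_line_by_index(FILE *fp, const long *seekPositions, size_t count,
                       size_t lineNo, char *out, int outsz)
{
    if (lineNo >= count || fseek(fp, seekPositions[lineNo], SEEK_SET) != 0)
        return -1;
    if (fgets(out, outsz, fp) == NULL)
        return -1;
    out[strcspn(out, "\n")] = '\0';
    return 0;
}
```

The index costs one `long` per line, so even several million postings lines come to tens of megabytes at most, and the array could be written out next to the postings file if rebuilding it on every run is too slow.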
