I need to read a simple text file that contains millions of lines, 100,000 lines at a time.
I tried opening the file, reading 100,000 lines, manipulating the info, saving the manipulated info out to a new file, then reading the next 100,000 lines. The program crashed.
Is there some reason why the data file wouldn't remain open? I open it in Main, then call a subroutine to load the data (and others to manipulate and then save the data).
To get around it, I changed the program to open the file, load 100,000 lines, and then copy all of the remaining lines out to a new file as they are read. When I need the next 100,000 lines, I open the new file and repeat the process. The file shrinks with each load, but it takes FOREVER to do things this way. I estimate that the current iteration will take about 10 hours to run, compared to 2 hours for the previous iteration, and I expect it to keep getting worse for several more iterations before it gets better.
The program spends the vast majority of its time simply trying to access the data. The actual manipulation currently takes only a fraction of the time.
I need to get it to work by opening the file only once and then reading each batch of 100,000 lines sequentially.
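To make the goal concrete, here is a minimal sketch of the pattern I am after. It is written in Python purely for illustration and is not my actual code; the names `load_batch`, `manipulate`, and `main` are hypothetical, and the uppercase step just stands in for the real manipulation. The assumption is that the file is opened once in main and the same open handle is passed to the load routine, so each call continues from where the previous one stopped.

```python
from itertools import islice

BATCH_SIZE = 100_000  # lines per batch, as described above


def load_batch(handle, batch_size=BATCH_SIZE):
    """Read up to batch_size lines from an already-open file.

    The handle keeps its position between calls, so each call picks up
    where the previous one stopped. Returns an empty list at end of file.
    """
    return list(islice(handle, batch_size))


def manipulate(batch):
    # Placeholder for the real processing step (hypothetical).
    return [line.rstrip("\n").upper() + "\n" for line in batch]


def main(in_path, out_path, batch_size=BATCH_SIZE):
    # Open both files exactly once and keep them open for the whole run.
    with open(in_path, "r") as source, open(out_path, "w") as dest:
        while True:
            batch = load_batch(source, batch_size)
            if not batch:  # end of file reached
                break
            dest.writelines(manipulate(batch))
```

In this sketch the load routine never opens or closes the file itself; it only receives the handle. If a load routine reopened the file, or received a copy of the path rather than the handle, every call would start again from the first line.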
Any ideas what may be wrong? Would it be better for me just to post a stripped version of the main and load functions?