Hi. I have a program that reads a file, modifies the data, then writes it back into another file. I wanted to add a little feature which is the percentage complete. I have a basic view of how to do it, but I'm sure there's a better way.
This is the way I thought of:
-initialize the variable as 0.
-output it as the percentage.
-check the size of the output file, divide it by the size of the input file, and multiply by 100.
-output '\b' twice and output new percentage.
I have a little problem here though: I don't know how to find the size of a file.
And I'm pretty sure there's a better algorithm out there for this purpose.
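The output step in the list above could be sketched roughly as follows. This is only an illustration: progress_text is a made-up helper name, and it uses '\r' (return to the start of the line) rather than two '\b' characters, since two backspaces would not erase a three-character value like "10%".

```cpp
#include <string>

// Hypothetical helper: builds the text to print for one progress update.
// '\r' moves the cursor back to the start of the line, so each update
// overwrites the previous one no matter how many digits it had.
std::string progress_text(int percent) {
    const std::string digits = std::to_string(percent);
    // Pad to width 3 so a shorter number fully covers a longer one.
    const std::string padding(digits.size() < 3 ? 3 - digits.size() : 0, ' ');
    return "\r" + padding + digits + "%";
}
```

It would be used as std::cout << progress_text(p) << std::flush; the flush matters because the line never ends with a newline.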
One way would be to seek to the end of the file when you open it and tell the position. Then seek back to the beginning to start inputting.
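A minimal sketch of that seek-and-tell idea with a C++ stream. The function name file_size_of is invented for this example, and opening the file in binary mode is an assumption made here so that the reported position is a plain byte count.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns the size of the file in bytes,
// or -1 if the file cannot be opened.
long long file_size_of(const std::string& path) {
    // Binary mode, so the position is a byte offset with no translation.
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1;
    in.seekg(0, std::ios::end);              // jump to the end of the file
    const std::streamoff size = in.tellg();  // the position there is the size
    in.seekg(0, std::ios::beg);              // rewind, as you would before reading
    return static_cast<long long>(size);
}
```

In a real program you would probably do the seek on the stream you are about to read from instead of opening the file a second time.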
OK, thanks anon, that's not a bad way to find the size. But is there a better algorithm? I don't know, I feel mine is very cheap, if you know what I mean.
Why not get the size of the file being read (just once), then count how many bytes you've read from the file as you go?
Convert that into a percentage.
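Converting the two counts into a percentage might look like this. The name percent_complete and the choice to treat an empty file as already finished are assumptions of this sketch, not something stated in the thread.

```cpp
#include <cstdint>

// Hypothetical helper: whole-number percentage of bytes_done out of total_bytes.
int percent_complete(std::uint64_t bytes_done, std::uint64_t total_bytes) {
    if (total_bytes == 0) return 100;           // empty file: nothing left to do
    if (bytes_done >= total_bytes) return 100;  // never report more than 100
    // Multiply before dividing so integer division does not round to zero.
    // (This could overflow only for counts near 2^64 / 100 bytes.)
    return static_cast<int>(bytes_done * 100 / total_bytes);
}
```

For example, 1 of 3 bytes gives 33 and 2 of 3 gives 66, because the division truncates.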
AFAIK, this would only work reliably when the file is opened in binary mode, not text mode, because of newline translation. Not that it would matter that much in this application, though.
Percentage complete is easy enough, and you seem to know the formula (100 * current / total).
As an exercise, you could try to calculate the estimated time remaining too. I find such things even more useful. See if you can work that one out. Think of specific examples like 1/3 or 2/3 of the way through.
> AFAIK, this would only work reliably when the file is opened in binary mode, not text mode
True, but since the result is going to be mapped to a range from 0 to 100, I don't think a minor difference in the apparent file size will overly impact the accuracy of the % complete.
Indeed. The user won't usually care that much if the status report is a fraction of a percent out. The only thing to remember, if there is that slight error due to text vs binary mode, is to set the percentage to 100 when the reading hits end of file. Users will tend to worry (and, worse, submit bug reports) if the program starts doing something else when the file reading/processing is reported as only 99% complete.
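One way to sketch that rule is a small filter between the computed percentage and what gets displayed. The name shown_percent is hypothetical, and holding the display at 99 until end of file is a design choice of this sketch rather than anything prescribed above.

```cpp
#include <algorithm>

// Hypothetical helper: decides what percentage to display.
// raw_percent is whatever the byte counts produced; at_eof says whether
// the reader has actually reached the end of the input.
int shown_percent(int raw_percent, bool at_eof) {
    if (at_eof) return 100;  // finished: always show exactly 100
    // Not finished yet: keep the value within 0..99 so a size mismatch
    // can never claim completion early or run past 100.
    return std::min(99, std::max(0, raw_percent));
}
```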
OK it looks like percentage complete is pretty easy. About estimated time, I can't really find a way.
I know I have to divide the remaining size by the rate. But I'll need to know how fast the processor is.
Actually, there's a way to find the speed! You can record how long it takes to write 1 byte. But then again, how can I do that while I'm reading and writing files?
Well, you probably don't want to update the "percentage done" for every byte you write (unless writing a single byte is REALLY slow).
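A small sketch of one way to avoid updating too often: only print when the whole-number percentage has actually changed. ProgressThrottle and should_update are names made up for this example.

```cpp
// Hypothetical helper: remembers the last percentage shown and reports
// whether a new value is worth printing.
class ProgressThrottle {
public:
    // Returns true only when percent differs from the last value shown.
    bool should_update(int percent) {
        if (percent == last_shown_) return false;
        last_shown_ = percent;
        return true;
    }

private:
    int last_shown_ = -1;  // -1 means nothing has been shown yet
};
```

With this, a large file triggers at most about a hundred screen updates, however many chunks are processed.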
If you start by taking a "start time", then take the time whenever you find it's time to update the completion percentage (e.g. when you've done another few kilobytes, or 10%, or whatever), you calculate how much time you've spent and calculate the time it should take to complete if you continue at the going rate.
Most of the time, file I/O is buffered in chunks. So timing one byte wouldn't work very well anyway -- the first byte would take a lot longer than the succeeding bytes. Not to mention that some operating systems have read-ahead, which I think means that they can anticipate which bytes you're going to read next, resulting in even larger buffers.
I think that the buffer size might be contained in BUFSIZ from <cstdio>. I'm not sure if this applies to C++ streams, though it probably does. You might want to use BUFSIZ as your "few kilobytes" value. The C standard only guarantees that it is at least 256; typical values are larger. (On my 64-bit Linux system it's 8192.)
Time remaining = time_elapsed_so_far * (total_records - current_record) / current_record;
Of course it won't be accurate until a few records have been processed.
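The formula above, written as a function. This is a sketch: the name seconds_remaining and the -1 "no estimate yet" return value are assumptions made here. Taking the 1/3 and 2/3 examples with 10 seconds elapsed: at 1 of 3 records the estimate is 10 * 2 / 1 = 20 seconds, and at 2 of 3 it is 10 * 1 / 2 = 5 seconds.

```cpp
// Hypothetical helper: estimated seconds remaining, assuming the work
// continues at the average rate seen so far.
// Returns -1 when nothing has been processed yet, since the rate is unknown.
double seconds_remaining(double elapsed_seconds, double done, double total) {
    if (done <= 0.0) return -1.0;   // avoid dividing by zero
    if (done >= total) return 0.0;  // finished: nothing remains
    return elapsed_seconds * (total - done) / done;
}
```

The elapsed time could come from two readings of std::chrono::steady_clock, one taken at the start and one at each update.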
Anyway, I know it's not what you set out to do, so it's just extra related info in case you like to experiment and play around with stuff.
OK I integrated the percentage complete successfully. It works perfectly now.
One little issue though: After adding this feature I felt the process of reading and writing is slower. Can a little calculation and output statement make that much of a difference?
It's possible, but not that likely unless you do a lot of calculations each time you read a small amount of data. A more likely scenario is that you changed the way you read in the file. For example, if you read and wrote the file in chunks before, but are doing it byte by byte now to keep the progress bar updated, then that will likely slow things down.
I would build the code before and after the changes, then run them one after the other a few times while timing the process to see if there really is a difference. If there is, then you can figure out what went wrong.
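If rebuilding and timing from the outside is awkward, a rough in-program measurement is another option. This sketch assumes you can wrap the old and new processing code in a callable; time_ms is a name invented for the example.

```cpp
#include <chrono>

// Hypothetical helper: runs work() once and returns how long it took,
// in milliseconds, using a clock that never jumps backwards.
template <typename Work>
double time_ms(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Run each version several times and compare the typical values, since a single run can be skewed by disk caching.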