I've written a C++ program to convert data from one data format to another. My benchmark program reads 136MB of binary data, converts each record and writes it to a text file.
My first cut ran for 42 seconds. Not too impressive. I reviewed my logic and realized I was passing whole structures between high-use functions instead of pointers (or references). I changed that and the process time dropped to 21 seconds. Better.
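To illustrate the change described above, here is a minimal sketch of the two calling styles. The `Record` layout and the function names are made up for the example, since the actual structures are not shown in this post.

```cpp
#include <cstdint>
#include <string>

// Hypothetical record layout (assumption); about 128 bytes on typical platforms.
struct Record {
    std::uint32_t id;
    double value;
    char name[112];
};

// By value: every call copies the whole struct before the function body runs.
std::string format_by_value(Record r) {
    return std::to_string(r.id) + "," + r.name;
}

// By const reference: only an address is passed, and the callee cannot
// modify the caller's record.
std::string format_by_ref(const Record& r) {
    return std::to_string(r.id) + "," + r.name;
}
```

Both functions return the same string; the difference is only the per-call copy, which adds up when a function is called once or more per record across a million records.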
To read the file, I was using istream.read(), but I was writing each line (of text) with ostream << text << endl. I rewrote it to buffer the data up and then do an ostream.write(), and now the elapsed time is down to 11 seconds. For the number of records I'm processing, these are the rates:
Input Records = 1,066,766
Input file size: 136 MB
Output text file size: 184.1 MB
This is single-threaded.
Code:
elapsed: 42 seconds   rate: 136 MB / 42 = 3.238 MB per second
elapsed: 21 seconds   rate: 136 MB / 21 = 6.476 MB per second
elapsed: 11 seconds   rate: 136 MB / 11 = 12.364 MB per second
If I remove the conversion logic and just read the binary data, without touching or moving it and without writing the 184 MB text file, and then write it back out from the same input buffer, it takes 2 seconds, for an I/O rate of 68 MB per second.
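A baseline like that can be measured with something along these lines. This is only a sketch: `copy_raw`, `CopyResult`, and the 1 MB chunk size are names and values chosen for the example, and the measured time will also depend on OS file caching.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Bytes copied and wall-clock time taken (hypothetical helper type).
struct CopyResult {
    std::size_t bytes = 0;
    double seconds = 0.0;
};

// Reads the input in fixed-size chunks and writes each chunk straight back
// out from the same buffer, with no conversion in between.
CopyResult copy_raw(const std::string& in_path, const std::string& out_path,
                    std::size_t chunk_size = 1 << 20) {
    std::ifstream in(in_path, std::ios::binary);
    std::ofstream out(out_path, std::ios::binary);
    std::vector<char> buffer(chunk_size);
    CopyResult result;

    const auto start = std::chrono::steady_clock::now();
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();  // may be short on the last chunk
        if (got <= 0) break;
        out.write(buffer.data(), got);
        result.bytes += static_cast<std::size_t>(got);
    }
    out.flush();
    const auto stop = std::chrono::steady_clock::now();

    result.seconds = std::chrono::duration<double>(stop - start).count();
    return result;
}
```

Dividing `bytes` by `seconds` (and by 1,000,000 or 1,048,576, depending on which MB is meant) gives the raw I/O rate to compare against the conversion run.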
Would you say the 12 MB per second rate I've reached is reasonable?