Thread: I/O Rates - MBs per second - what's good?

  1. #1
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332

    I/O Rates - MBs per second - what's good?

    I've written a C++ program to convert data from one data format to another. My benchmark program reads 136MB of binary data, converts each record and writes it to a text file.

    My first cut ran for 42 seconds. Not too impressive. I reviewed my logic and realized I was passing whole structures between high-use functions instead of pointers (or references). I changed that and the process time dropped to 21 seconds. Better.
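
    As a rough illustration of the by-value versus by-reference change described above, here is a minimal sketch. The `Record` struct, its 128-byte size, and the function names are made up for the example; the thread does not show the real structures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record layout; the real structure is not shown in the thread.
struct Record {
    unsigned char bytes[128];
};

// Pass by value: the whole 128-byte struct is copied on every call.
unsigned checksum_by_value(Record rec) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < sizeof rec.bytes; ++i) sum += rec.bytes[i];
    return sum;
}

// Pass by const reference: only an address is passed, no copy is made.
unsigned checksum_by_ref(const Record& rec) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < sizeof rec.bytes; ++i) sum += rec.bytes[i];
    return sum;
}

// Usage example: both versions compute the same result; only the call cost differs.
void demo_checksums() {
    Record r{};
    for (std::size_t i = 0; i < sizeof r.bytes; ++i)
        r.bytes[i] = static_cast<unsigned char>(i);
    assert(checksum_by_value(r) == checksum_by_ref(r));
}
```

    How much this saves depends on the struct size and on whether the compiler inlines the call, so it is worth measuring rather than assuming.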

    To read the file, I was using istream.read(), but I was writing each line (of text) with ostream << text << endl. I rewrote it to buffer the data up and then do an ostream.write(), and now the elapsed time is down to 11 seconds. For the number of records I'm processing, these are the rates:

    Input Records = 1,066,766
    Input file size: 136 MB
    Output text file size: 184.1MB

    Code:
    elapsed: 42 seconds
    rate: 136 MB / 42 = 3.238 MB per second
    
    elapsed: 21 seconds
    rate: 136 MB / 21 = 6.476 MB per second
    
    elapsed: 11 seconds
    rate: 136 MB / 11 = 12.364 MB per second
    This is single threading.
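
    The buffering change mentioned above (collect lines in memory, then call ostream.write()) could look roughly like this. It is only a sketch: the function names and the 64 KB flush threshold are assumptions, not the original program's code.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Appends each line plus '\n' to an in-memory buffer and hands the buffer to
// ostream::write() whenever it reaches flush_threshold bytes.
void write_lines_buffered(std::ostream& out,
                          const std::vector<std::string>& lines,
                          std::size_t flush_threshold = 64 * 1024) {
    assert(flush_threshold > 0);
    std::string buffer;
    buffer.reserve(flush_threshold + 256);
    for (const std::string& line : lines) {
        buffer += line;
        buffer += '\n';  // plain newline: unlike std::endl, this does not flush the stream
        if (buffer.size() >= flush_threshold) {
            out.write(buffer.data(), static_cast<std::streamsize>(buffer.size()));
            buffer.clear();
        }
    }
    if (!buffer.empty()) {
        out.write(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    }
}

// Usage example: render into an in-memory stream instead of a file.
// A tiny threshold is used so the mid-loop write path is exercised too.
std::string render_to_string(const std::vector<std::string>& lines) {
    std::ostringstream out;
    write_lines_buffered(out, lines, 16);
    return out.str();
}
```

    Much of the gain over `<< endl` usually comes from avoiding a flush per line; simply writing '\n' instead of endl may recover a good part of it even without a manual buffer.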

    If I remove the conversion logic and just read the binary data, then write it back out from the same input buffer without touching or moving it (so no 184 MB text file is written), it takes 2 seconds, for an I/O rate of 68 MB per second.

    Would you say the 12 MB per second rate I've reached is reasonable?

    Thanks, Todd
    Last edited by Dino; 01-14-2008 at 09:41 AM. Reason: typo

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    This is definitely one of those things where you probably have "picked off the low-hanging fruit" - the very easy optimizations have been done.

    You should use a profiler to figure out what parts of the code take up most of the time, and then it's a "seat of pants" decision whether you can optimize that more or not.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    You should be able to get much closer to that 2 sec. / 68 MB/s by using asynchronous I/O. This will allow you to get work done while the OS/disk is busy doing work.

    The downside is that it won't be std::fstream based, so you'll lose portability etc, etc.

    Multiple threads with std::fstream is another option, but it probably won't yield as much speed up as asynchronous I/O in a single thread.
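
    To make the "multiple threads with standard streams" option concrete, here is a minimal sketch that reads the next chunk on a worker thread while the current chunk is converted and written. It assumes C++11 `std::async`, which postdates this thread, and it is not true OS-level asynchronous I/O. `convert_in_place` is a stand-in (it only upper-cases a-z), and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Reads up to chunk_size bytes; returns fewer (possibly zero) at end of input.
std::vector<char> read_chunk(std::istream& in, std::size_t chunk_size) {
    std::vector<char> buf(chunk_size);
    in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
    buf.resize(static_cast<std::size_t>(in.gcount()));
    return buf;
}

// Placeholder for the real per-record conversion: upper-cases a-z only.
void convert_in_place(std::vector<char>& buf) {
    for (char& c : buf)
        if (c >= 'a' && c <= 'z') c = static_cast<char>(c - 'a' + 'A');
}

// While this thread converts and writes the current chunk, a worker thread
// reads the next one. Only the worker touches `in` between launching the
// task and calling get(), so the input stream is never used concurrently.
void convert_stream_overlapped(std::istream& in, std::ostream& out,
                               std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::vector<char> current = read_chunk(in, chunk_size);
    while (!current.empty()) {
        std::future<std::vector<char>> next =
            std::async(std::launch::async, read_chunk, std::ref(in), chunk_size);
        convert_in_place(current);
        out.write(current.data(), static_cast<std::streamsize>(current.size()));
        current = next.get();  // wait for the read that ran in parallel
    }
}

// Usage example on in-memory streams; real code would pass file streams
// opened with std::ios::binary.
std::string demo_overlapped(const std::string& input, std::size_t chunk_size) {
    std::istringstream in(input);
    std::ostringstream out;
    convert_stream_overlapped(in, out, chunk_size);
    return out.str();
}
```

    Some toolchains need a threading flag such as -pthread to build this. Whether the overlap helps depends on how the conversion time compares with the read time, so it should be measured against the single-threaded version.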

    gg

  4. #4
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    lol! Just as I was reading your post (matsp), I realized I could do another optimization in my translate table (from EBCDIC to ASCII). Your "low hanging fruit" keyed me into it. I asked myself where the program most likely spends most of its time, and the answer surely has to be in the character conversion routine. So, I initialized my translate table a bit differently, which allowed me to remove the conditional logic from the char_convert routine. I just reran it and now I'm at 10 seconds.

    I'll look into a profiler. Never used one before. Thanks for the suggestion.

    Todd
    Last edited by Dino; 01-14-2008 at 10:02 AM.

  5. #5
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Quote Originally Posted by Codeplug View Post
    You should be able to get much closer to that 2 sec. / 68 MB/s by using asynchronous I/O. This will allow you to get work done while the OS/disk is busy doing work.

    The downside is that it won't be std::fstream based, so you'll lose portability etc, etc.
    ..
    gg
    Well, unfortunately, I need the portability.

    I do have some functions that use rather long parm lists, and they are called a lot (a lot = for every record). I'll look into shortening these parm lists down to the bare minimum.

    Todd

  6. #6
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> I'll look into shortening these parm list down to bare minimum.
    Play with your profiling tools first

    gg

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Also, post your EBCDIC to ASCII code (or just the principle of it, if it's long).

    --
    Mats

  8. #8
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Codeplug View Post
    >> I'll look into shortening these parm list down to bare minimum.
    Play with your profiling tools first

    gg
    Profiling gives you hard, indisputable data, but it's possible to find hot spots through careful analysis of the source code as well.

    Before using a profiler or other debugging tool, you should try to form your own idea of what you are going to discover. My first boss and I used to take bets about what we were going to see happen during debugging sessions.

  9. #9
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Here's one version of the table. I have two tables - a 7-bit ASCII table and a UTF-8 table. Here's the 7-bit table. (a global var - char[256])

    Code:
    void init_translate_table_ascii7() { 
    	int i, j ; 
    	// Initialize the ascii translate table 
    	for (i=0 ;  i < sizeof(ascii_chars) ;  ++i ) ascii_chars[i] = i ;  // set each char to itself 
    	
    	// Initialize the ebcdic table with the ascii character set   
    	j = (int) '0' ; 
    	for (i = 0xF0 ; i <= 0xF9 ; ascii_chars[i] = j , i++, j++ ) ;   // put 0-9  in F0 through F9 
    	j = (int) 'A' ; 
    	for (i = 0xC1 ; i <= 0xC9 ; ascii_chars[i] = j , i++, j++ ) ;   // put A-I  in C1 through C9 
    	for (i = 0xD1 ; i <= 0xD9 ; ascii_chars[i] = j , i++, j++ ) ;   // put J-R  in D1 through D9
    	for (i = 0xE2 ; i <= 0xE9 ; ascii_chars[i] = j , i++, j++ ) ;   // put S-Z  in E2 through E9 
    	j = (int) 'a' ; 
    	for (i = 0x81 ; i <= 0x89 ; ascii_chars[i] = j , i++, j++ ) ;   // put a-i  in 81 through 89
    	for (i = 0x91 ; i <= 0x99 ; ascii_chars[i] = j , i++, j++ ) ;   // put j-r  in 91 through 99
    	for (i = 0xA2 ; i <= 0xA9 ; ascii_chars[i] = j , i++, j++ ) ;   // put s-z  in A2 through A9 
    	
    	ascii_chars[0x40] = ' ' ;   ascii_chars[0x4B] = '.' ; 	ascii_chars[0x4C] = '<' ;   
    	ascii_chars[0x4D] = '(' ; 	ascii_chars[0x4E] = '+' ; 	ascii_chars[0x4F] = '|' ;
    	 
    	ascii_chars[0x50] = '&' ; 	ascii_chars[0x5A] = '!' ; 	ascii_chars[0x5B] = '$' ; 
    	ascii_chars[0x5C] = '*' ;  	ascii_chars[0x5D] = ')' ; 	ascii_chars[0x5E] = ';' ; 
    	ascii_chars[0x5F] = '^' ; 	
    	
    	ascii_chars[0x60] = '-' ; 	ascii_chars[0x61] = '/' ;   ascii_chars[0x6A] = '|' ;   
    	ascii_chars[0x6B] = ',' ; 	ascii_chars[0x6C] = '%' ; 	ascii_chars[0x6D] = '_' ; 	
    	ascii_chars[0x6E] = '>' ; 	ascii_chars[0x6F] = '?' ; 	
    	
    	ascii_chars[0x79] = '`' ; 	ascii_chars[0x7A] = ':' ; 	ascii_chars[0x7B] = '#' ; 	
    	ascii_chars[0x7C] = '@' ; 	ascii_chars[0x7D] = '\'' ;  ascii_chars[0x7E] = '=' ; 	
    	ascii_chars[0x7F] = '"' ; 
    	
    	ascii_chars[0xA1] = '~' ;	
    	ascii_chars[0xAD] = '[' ;   
    	
    	ascii_chars[0xBD] = ']' ; 	
    		
    	ascii_chars[0xC0] = '{' ; 	
    	
    	ascii_chars[0xD0] = '}' ; 	
    	
    	ascii_chars[0xE0] = '\\' ; 
    	
    	
    	// These next couple of lines are "data forgiveness" lines.  They convert non-displayable EBCDIC chars to ASCII blanks.
    	ascii_chars[0x00] = ' '  ;                                    // Translate Binary zero to a blank 
    	//for ( i = 0x01 ; i < 0x40 ; ascii_chars[i] = ' ' , i++ ) ;  // Convert ebcdic 0x01 - 0x3F to a blank. 
    }

  10. #10
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    And here's the actual code for picking up the translated value:

    Code:
    		if (!pic->b_numeric) { 
    			n = offset+len ;                         // Use "n" to keep the calculations out of the "for" loop condition. 
    			for (int k = offset ; k < n ; ++k ) { 
    
    				*outbuf++ = ascii_chars[binary_data[k]] ;   // Get the translated input character. 
    			} 
    		
    		}
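
    For anyone wanting to try the lookup in isolation, here is a small self-contained sketch of the same table-driven idea. It fills in only the space and the upper-case letters, and it assumes the input bytes are `unsigned char`; the thread does not show the type of binary_data. If the input were plain `char`, which is signed on many platforms, bytes of 0x80 and above would become negative indexes. The names `small_table`, `init_small_table` and `translate` are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Reduced translate table: identity everywhere except a few EBCDIC entries.
static unsigned char small_table[256];

void init_small_table() {
    for (int i = 0; i < 256; ++i) small_table[i] = static_cast<unsigned char>(i);
    small_table[0x40] = ' ';                                    // EBCDIC blank
    unsigned char c = 'A';
    for (int i = 0xC1; i <= 0xC9; ++i) small_table[i] = c++;    // A-I
    for (int i = 0xD1; i <= 0xD9; ++i) small_table[i] = c++;    // J-R
    for (int i = 0xE2; i <= 0xE9; ++i) small_table[i] = c++;    // S-Z
}

// Translates len bytes starting at offset: one table lookup per byte and
// no conditional logic inside the loop.
std::string translate(const unsigned char* data, std::size_t offset, std::size_t len) {
    assert(data != nullptr);
    std::string out;
    out.reserve(len);
    const std::size_t end = offset + len;
    for (std::size_t k = offset; k < end; ++k)
        out += static_cast<char>(small_table[data[k]]);
    return out;
}
```

    For example, after init_small_table(), the EBCDIC bytes C8 C5 D3 D3 D6 translate to the ASCII text HELLO.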

  11. #11
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Ok, I can't think of any way to improve on a global array[256] - it should be as fast as it can be.

    --
    Mats

  12. #12
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    The first thing to time is just read the file and write the file, without any transformation of the data.

    If that takes say 10 of the 11 seconds you're seeing, then you're not going anywhere. The entire program is I/O bound and there's not a lot you can do about that within your code.
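
    A baseline measurement like the one suggested here can be sketched as a timed raw copy with no transformation. This assumes C++11 `std::chrono`; the 1 MiB chunk size, the `CopyStats` struct and the function names are illustrative, and "MB" is taken as 1,000,000 bytes.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct CopyStats {
    std::size_t bytes;   // total bytes copied
    double seconds;      // wall-clock time for the copy
};

// Copies `in` to `out` unchanged, chunk by chunk, and times the whole loop.
CopyStats timed_copy(std::istream& in, std::ostream& out,
                     std::size_t chunk = 1 << 20) {
    assert(chunk > 0);
    typedef std::chrono::steady_clock clock;
    std::vector<char> buf(chunk);
    std::size_t total = 0;
    const clock::time_point start = clock::now();
    // A short final read fails the stream but still leaves gcount() > 0.
    while (in.read(buf.data(), static_cast<std::streamsize>(buf.size())) ||
           in.gcount() > 0) {
        const std::streamsize n = in.gcount();
        out.write(buf.data(), n);
        total += static_cast<std::size_t>(n);
    }
    out.flush();
    const double secs =
        std::chrono::duration<double>(clock::now() - start).count();
    CopyStats stats = {total, secs};
    return stats;
}

// Throughput in MB per second, with MB taken as 1,000,000 bytes.
double mb_per_second(std::size_t bytes, double seconds) {
    return seconds > 0.0 ? (static_cast<double>(bytes) / 1.0e6) / seconds : 0.0;
}

// Usage example on in-memory streams; real use would pass binary file streams.
std::size_t demo_copy_size(const std::string& input) {
    std::istringstream in(input);
    std::ostringstream out;
    return timed_copy(in, out, 1024).bytes;
}
```

    Comparing this number with the full conversion run shows how much of the elapsed time is I/O and how much is the program's own work. Note that the OS file cache can make repeat runs look faster than a cold read.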
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  13. #13
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by Salem View Post
    The first thing to time is just read the file and write the file, without any transformation of the data.

    If that takes say 10 of the 11 seconds you're seeing, then you're not going anywhere. The entire program is I/O bound and there's not a lot you can do about that within your code.
    I think he said it takes 2 seconds when he did that.

  14. #14
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    So I see (now).

  15. #15
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    I'm reading up on Shark and will probably use it for profiling since I'm developing on a Mac under Xcode (Tiger).

    Thanks for all the feedback. I'm not going to make any more changes until I profile it.

    Todd
