Thread: FILES in WinAPI

  1. #16
    JasonD (jasondoucette.com)
    Join Date: Mar 2003
    Posts: 278
    If anyone has any testing code that others can compile to compare times, I would be willing to do so. I feel tempted to write my own test code, because it bothers me that the Windows functions may be that much faster.

    I still have to say that I would be very surprised to find out they are. FillYourBrain, are you positive the speed differences could not be attributed to some other factor? I fail to see why the Windows OS would not perform the same caching on the standard functions as it does on the Win32 API functions, especially after _Elixia_ has stated that, aside from some overhead, they are, in fact, the same calls.

  2. #17
    FillYourBrain (pronounced 'fib')
    Join Date: Aug 2002
    Posts: 2,297
    Windows 2000, VC++ 6, six or seven months ago. I don't have the code anymore. I used fopen, CreateFile, fclose, CloseHandle, etc. GetTickCount() gives you the times before and after the major repetition loops. There is a chance that NT behaves differently from the personal versions of Windows; I don't know.

    I really didn't get into this for a debate anyway. But on the topic of portability, true portability is impossible without sacrificing all of the unique features of a particular operating system. This is why Java is a "lowest common denominator" language. You can achieve portability more effectively with wrappers and emulating classes.
    "You are stupid! You are stupid! Oh, and don't forget, you are STUPID!" - Dexter

  3. #18
    anonytmouse
    Join Date: Dec 2002
    Posts: 2,544
    Just covering the obvious, but has anyone taken the binary/text issue into account?

    The Win32 API functions are all binary, whereas the C library functions, if the file is opened in text mode, have to scan the entire buffer to replace '\n' with "\r\n". That alone would make a significant difference in speed.

    Having a look at the CRT source (write.c), this translation is done into a 1025-byte buffer, which is written out using WriteFile every time it is filled. That would make a massive difference for larger writes.

    It should be noted that we can mix and match to some extent:
    http://msdn.microsoft.com/library/en..._osfhandle.asp

    _osfhnd

  4. #19
    FillYourBrain
    I really didn't want to have to re-code this test, but here it is for the unbelievers. My results were consistent with the old results I got: a very large margin of speed difference.

    Please run it. Enjoy.
    Code:
    #include <windows.h>
    #include <iostream>
    #include <stdio.h>
    
    
    int main(void)
         {
         DWORD test1, test2, written;
          FILE *file_runtime;
         HANDLE file_api;
         int i;
         
         char *buf = new char[10000];
         //Create the file
         file_api = ::CreateFile("C:\\TestFile.dat", GENERIC_WRITE, NULL, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
         ::CloseHandle(file_api);
    
         //OPEN and CLOSE tests.
    
         //runtime
         test1 = ::GetTickCount();
         for(i=0; i < 10000; i++)
              {
              file_runtime = fopen("C:\\TestFile.dat", "rb");
              fclose(file_runtime);
              }
         test2 = ::GetTickCount();
         std::cout << "OPEN/CLOSE runtime " << (int)(test2-test1) << std::endl;
    
         //API
         test1 = ::GetTickCount();
         for(i=0; i < 10000; i++)
              {
              file_api = ::CreateFile("C:\\TestFile.dat", GENERIC_READ, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
              ::CloseHandle(file_api);
              }
         test2 = ::GetTickCount();
         std::cout << "OPEN/CLOSE W32 API " << (int)(test2-test1) << std::endl;
    
         //LARGE WRITE tests
    
         //runtime
         test1 = ::GetTickCount();
         file_runtime = fopen("C:\\TestFile.dat", "wb");
         for(i=0; i < 10000; i++)
              {
              fwrite(buf, 10000, 1, file_runtime);
              fseek(file_runtime, 0, SEEK_SET);
              }
         fclose(file_runtime);
         test2 = ::GetTickCount();
         std::cout << "LARGE WRITE runtime " << (int)(test2-test1) << std::endl;
    
         //API
         test1 = ::GetTickCount();
         file_api = ::CreateFile("C:\\TestFile.dat", GENERIC_WRITE, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
         for(i=0; i < 10000; i++)
              {
              ::WriteFile(file_api, buf, 10000, &written, NULL);
              ::SetFilePointer(file_api, 0, 0, FILE_BEGIN);
              }
         ::CloseHandle(file_api);
         test2 = ::GetTickCount();
         std::cout << "LARGE WRITE W32 API " << (int)(test2-test1) << std::endl;
    
         //SMALL WRITE tests
    
         //runtime
         test1 = ::GetTickCount();
         file_runtime = fopen("C:\\TestFile.dat", "wb");
         for(i=0; i < 100000; i++)
              {
              fwrite(buf, 1, 1, file_runtime);
              fseek(file_runtime, 0, SEEK_SET);
              }
         fclose(file_runtime);
         test2 = ::GetTickCount();
         std::cout << "SMALL WRITE runtime " << (int)(test2-test1) << std::endl;
    
         //API
         test1 = ::GetTickCount();
         file_api = ::CreateFile("C:\\TestFile.dat", GENERIC_WRITE, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
         for(i=0; i < 100000; i++)
              {
              ::WriteFile(file_api, buf, 1, &written, NULL);
              ::SetFilePointer(file_api, 0, 0, FILE_BEGIN);
              }
         ::CloseHandle(file_api);
         test2 = ::GetTickCount();
         std::cout << "SMALL WRITE W32 API " << (int)(test2-test1) << std::endl;
    
         //LARGE READ
    
         //runtime
         test1 = ::GetTickCount();
         file_runtime = fopen("C:\\TestFile.dat", "rb");
         for(i=0; i < 10000; i++)
              {
              fread(buf, 10000, 1, file_runtime);
              fseek(file_runtime, 0, SEEK_SET);
              }
         fclose(file_runtime);
         test2 = ::GetTickCount();
         std::cout << "LARGE READ runtime " << (int)(test2-test1) << std::endl;
    
         // API
         test1 = ::GetTickCount();
         file_api = ::CreateFile("C:\\TestFile.dat", GENERIC_READ, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
         for(i=0; i < 10000; i++)
              {
              ::ReadFile(file_api, buf, 10000, &written, NULL);
              ::SetFilePointer(file_api, 0, 0, FILE_BEGIN);
              }
         ::CloseHandle(file_api);
         test2 = ::GetTickCount();
         std::cout << "LARGE READ W32 API " << (int)(test2-test1) << std::endl;
    
    
         //SMALL READ
    
         //runtime
         test1 = ::GetTickCount();
         file_runtime = fopen("C:\\TestFile.dat", "rb");
         for(i=0; i < 10000; i++)
              {
              fread(buf, 1, 1, file_runtime);
              fseek(file_runtime, 0, SEEK_SET);
              }
         fclose(file_runtime);
         test2 = ::GetTickCount();
         std::cout << "SMALL READ runtime " << (int)(test2-test1) << std::endl;
    
         // API
         test1 = ::GetTickCount();
         file_api = ::CreateFile("C:\\TestFile.dat", GENERIC_READ, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
         for(i=0; i < 10000; i++)
              {
              ::ReadFile(file_api, buf, 1, &written, NULL);
              ::SetFilePointer(file_api, 0, 0, FILE_BEGIN);
              }
         ::CloseHandle(file_api);
         test2 = ::GetTickCount();
         std::cout << "SMALL READ W32 API " << (int)(test2-test1) << std::endl;
    
    
         ::DeleteFile("C:\\TestFile.dat");
         delete [] buf;
    
         return 0;
         }

  5. #20
    JasonD
    Thanks for taking the time to make your code available. I noticed a problem with it, though. You basically define a 'small read/write' as 1 byte, and a 'large read/write' as 10,000 bytes. I do not think this is proper, as no one reads files a byte at a time; you cannot even instruct a hard disk (at a low level) to read only 1 byte into memory. (I should clarify that my argument is from the perspective of a programmer using proper techniques; I do not care which method reads a file faster 1 byte at a time, as that is something no programmer who knows what he is doing would attempt.)

    If you need to access such small amounts of data from a file, you can simply read it all into memory first (and thus it becomes a large read/write problem). If you have multiple small files, they should be combined into one larger file. A proper program should never have to read a file this small (with the obvious exceptions of .INI or .CFG files; in either case it is a one-shot deal, so speed is not a concern).

    I personally think 'small read/writes' would be at least several kilobytes, and 'large read/writes' on the order of megabytes. In any case, I changed your code a bit and tried it with both your parameters and mine:

    FillYourBrain parameters:
    Code:
     Large read/writes use a buffer of 10000 bytes.
     Small read/writes use a buffer of 1 bytes.
    
    OPEN/CLOSE runtime 39347
    OPEN/CLOSE W32 API 35361
    Win32 API takes 89% the time of the C runtime library.
    
    LARGE WRITE runtime 7941
    LARGE WRITE W32 API 5218
    Win32 API takes 65% the time of the C runtime library.
    
    SMALL WRITE runtime 2093
    SMALL WRITE W32 API 1903
    Win32 API takes 90% the time of the C runtime library.
    
    LARGE READ runtime 2523
    LARGE READ W32 API 1502
    Win32 API takes 59% the time of the C runtime library.
    
    SMALL READ runtime 1643
    SMALL READ W32 API 1392
    Win32 API takes 84% the time of the C runtime library.
    Again, I think testing a 1-byte read/write is pretty much useless. So, I made the small read/write 10,000 bytes, and the large 1,000,000 bytes:

    Jason's parameters:
    Code:
     Large read/writes use a buffer of 1000000 bytes.
     Small read/writes use a buffer of 10000 bytes.
    
    OPEN/CLOSE runtime 193068
    OPEN/CLOSE W32 API 177655
    Win32 API takes 92% the time of the C runtime library.
    
    LARGE WRITE runtime 134313
    LARGE WRITE W32 API 134324
     Win32 API takes 100% the time of the C runtime library.
    
    SMALL WRITE runtime 83570
    SMALL WRITE W32 API 55870
    Win32 API takes 66% the time of the C runtime library.
    
    LARGE READ runtime 180490
    LARGE READ W32 API 95807
    Win32 API takes 53% the time of the C runtime library.
    
    SMALL READ runtime 64373
    SMALL READ W32 API 50683
    Win32 API takes 78% the time of the C runtime library.
    I have to admit I am still very surprised at the results. Not as extreme as the 10% of the speed that FillYourBrain originally claimed ("...Windows API on first access of the file takes about half the time to read. On following reads of the same file, however, Windows API takes about 1/10th of the time to read..."), but still, 53% is essentially half the time.

    I guess I have to question the method the program is using to test this speed - it is continually accessing the same file. This is unlike (most) real world applications - the file is normally read once into memory. Isn't it awfully strange that writing is the same speed, and reading is only 1/2 speed? What do you guys think?

    Here's my revision to the program, in case you are interested:

    Code:
    // Thread URL: http://cboard.cprogramming.com/showt...5&pagenumber=2
    // Program original by: FillYourBrain
    // Modified slightly by: Jason Doucette (http://www.jasondoucette.com/)
    // Purpose: testing C functions vs Win32 API functions, for speed of file accessing
    
    #include <windows.h>
    #include <iostream>
    #include <stdio.h>
    
    /*
    // FillYourBrain
    #define NUM_OPENCLOSE	1000000
    #define NUM_LARGEWRITE	1000000
    #define NUM_SMALLWRITE	1000000
    #define NUM_LARGEREAD	1000000
    #define NUM_SMALLREAD	1000000
    #define BUFFERSIZE	10000
    #define SMALLSIZE	1
    */
    
    // Jason
    #define NUM_OPENCLOSE	5000000
    #define NUM_LARGEWRITE	10000
    #define NUM_SMALLWRITE	10000000
    #define NUM_LARGEREAD	2000000
    #define NUM_SMALLREAD	10000000
    #define BUFFERSIZE	1000000
    #define SMALLSIZE	10000
    
    int main(void)
    {
    	DWORD test1, test2, written;
    	int et1, et2;
     	FILE *file_runtime;
    	HANDLE file_api;
    	int i;
    
     	std::cout << "Large read/writes use a buffer of " << BUFFERSIZE << " bytes." << std::endl;
     	std::cout << "Small read/writes use a buffer of " << SMALLSIZE << " bytes." << std::endl;
    	std::cout << std::endl;
    
    	
    	char *buf = new char[BUFFERSIZE];
    	//Create the file
    	file_api = ::CreateFile("C:\\TestFile.dat", 
    		GENERIC_WRITE, NULL, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    	::CloseHandle(file_api);
    
    	//--------------------------------------------------------
    	//OPEN and CLOSE tests.
    
    	//runtime
    	test1 = ::GetTickCount();
    	for(i=0; i < NUM_OPENCLOSE; i++)
    	{
    		file_runtime = fopen("C:\\TestFile.dat", "rb");
    		fclose(file_runtime);
    	}
    	test2 = ::GetTickCount();
    	et1 = test2-test1;
    	std::cout << "OPEN/CLOSE runtime " << et1 << std::endl;
    
    	//API
    	test1 = ::GetTickCount();
    	for(i=0; i < NUM_OPENCLOSE; i++)
    	{
    		file_api = ::CreateFile("C:\\TestFile.dat", 
    			GENERIC_READ, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    		::CloseHandle(file_api);
    	}
    	test2 = ::GetTickCount();
    	et2 = test2-test1;
    	std::cout << "OPEN/CLOSE W32 API " << et2 << std::endl;
    	std::cout << "Win32 API takes " << et2 * 100 / et1 
    		<< "% the time of the C runtime library." << std::endl;
    	std::cout << std::endl;
    
    
    	//--------------------------------------------------------
    	//LARGE WRITE tests
    
    	//runtime
    	test1 = ::GetTickCount();
    	file_runtime = fopen("C:\\TestFile.dat", "wb");
    	for(i=0; i < NUM_LARGEWRITE; i++)
    	{
    		fwrite(buf, BUFFERSIZE, 1, file_runtime);
    		fseek(file_runtime, 0, SEEK_SET);
    	}
    	fclose(file_runtime);
    	test2 = ::GetTickCount();
    	et1 = test2-test1;
    	std::cout << "LARGE WRITE runtime " << et1 << std::endl;
    
    	//API
    	test1 = ::GetTickCount();
    	file_api = ::CreateFile("C:\\TestFile.dat", 
    		GENERIC_WRITE, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    	for(i=0; i < NUM_LARGEWRITE; i++)
    	{
    		::WriteFile(file_api, buf, BUFFERSIZE, &written, NULL);
    		::SetFilePointer(file_api, 0, 0, FILE_BEGIN);
    	}
    	::CloseHandle(file_api);
    	test2 = ::GetTickCount();
    	et2 = test2-test1;
    	std::cout << "LARGE WRITE W32 API " << et2 << std::endl;
    	std::cout << "Win32 API takes " << et2 * 100 / et1 
    		<< "% the time of the C runtime library." << std::endl;
    	std::cout << std::endl;
    
    
    	//--------------------------------------------------------
    	//SMALL WRITE tests
    
    	//runtime
    	test1 = ::GetTickCount();
    	file_runtime = fopen("C:\\TestFile.dat", "wb");
    	for(i=0; i < NUM_SMALLWRITE; i++)
    	{
    		fwrite(buf, SMALLSIZE, 1, file_runtime);
    		fseek(file_runtime, 0, SEEK_SET);
    	}
    	fclose(file_runtime);
    	test2 = ::GetTickCount();
    	et1 = test2-test1;
    	std::cout << "SMALL WRITE runtime " << et1 << std::endl;
    
    	//API
    	test1 = ::GetTickCount();
    	file_api = ::CreateFile("C:\\TestFile.dat", 
    		GENERIC_WRITE, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    	for(i=0; i < NUM_SMALLWRITE; i++)
    	{
    		::WriteFile(file_api, buf, SMALLSIZE, &written, NULL);
    		::SetFilePointer(file_api, 0, 0, FILE_BEGIN);
    	}
    	::CloseHandle(file_api);
    	test2 = ::GetTickCount();
    	et2 = test2-test1;
    	std::cout << "SMALL WRITE W32 API " << et2 << std::endl;
    	std::cout << "Win32 API takes " << et2 * 100 / et1 
    		<< "% the time of the C runtime library." << std::endl;
    	std::cout << std::endl;
    
    
    	//--------------------------------------------------------
    	//LARGE READ
    
    	//runtime
    	test1 = ::GetTickCount();
    	file_runtime = fopen("C:\\TestFile.dat", "rb");
    	for(i=0; i < NUM_LARGEREAD; i++)
    	{
    		fread(buf, BUFFERSIZE, 1, file_runtime);
    		fseek(file_runtime, 0, SEEK_SET);
    	}
    	fclose(file_runtime);
    	test2 = ::GetTickCount();
    	et1 = test2-test1;
    	std::cout << "LARGE READ runtime " << et1 << std::endl;
    
    	// API
    	test1 = ::GetTickCount();
    	file_api = ::CreateFile("C:\\TestFile.dat", 
    		GENERIC_READ, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    	for(i=0; i < NUM_LARGEREAD; i++)
    	{
    		::ReadFile(file_api, buf, BUFFERSIZE, &written, NULL);
    		::SetFilePointer(file_api, 0, 0, FILE_BEGIN);
    	}
    	::CloseHandle(file_api);
    	test2 = ::GetTickCount();
    	et2 = test2-test1;
    	std::cout << "LARGE READ W32 API " << et2 << std::endl;
    	std::cout << "Win32 API takes " << et2 * 100 / et1 
    		<< "% the time of the C runtime library." << std::endl;
    	std::cout << std::endl;
    
    
    	//--------------------------------------------------------
    	//SMALL READ
    
    	//runtime
    	test1 = ::GetTickCount();
    	file_runtime = fopen("C:\\TestFile.dat", "rb");
    	for(i=0; i < NUM_SMALLREAD; i++)
    	{
    		fread(buf, SMALLSIZE, 1, file_runtime);
    		fseek(file_runtime, 0, SEEK_SET);
    	}
    	fclose(file_runtime);
    	test2 = ::GetTickCount();
    	et1 = test2-test1;
    	std::cout << "SMALL READ runtime " << et1 << std::endl;
    
    	// API
    	test1 = ::GetTickCount();
    	file_api = ::CreateFile("C:\\TestFile.dat", 
    		GENERIC_READ, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    	for(i=0; i < NUM_SMALLREAD; i++)
    	{
    		::ReadFile(file_api, buf, SMALLSIZE, &written, NULL);
    		::SetFilePointer(file_api, 0, 0, FILE_BEGIN);
    	}
    	::CloseHandle(file_api);
    	test2 = ::GetTickCount();
    	et2 = test2-test1;
    	std::cout << "SMALL READ W32 API " << et2 << std::endl;
    	std::cout << "Win32 API takes " << et2 * 100 / et1 
    		<< "% the time of the C runtime library." << std::endl;
    	std::cout << std::endl;
    
    
    	//--------------------------------------------------------
    	// clean up
    	::DeleteFile("C:\\TestFile.dat");
    	delete [] buf;
    
    	return(0);
    }

  6. #21
    FillYourBrain
    Point one: yes, I guess my original figures were exaggerated. Oh well. The point still stands: the difference is dramatic nonetheless.

    Point two: I threw that test together very quickly. The choice of read size is something you can mess with; indeed, you should mess with it and see the results.

    I'm glad you took the time to try it out.

    edit:
    And as far as reading multiple files goes... you're going to have to make that edit to the program yourself. It starts to get too elaborate for this purpose as far as I'm concerned.

    The point was only to show that it is clearly faster.

  7. #22
    JasonD
    Originally posted by FillYourBrain
    And as far as reading multiple files.... You're going to have to make that edit in the program. It starts to get too elaborate for this purpose as far as I'm concerned.

    The point was only to show that it is clearly faster.
    The problem is that the test only shows one thing: that it is clearly faster for reading the same file over and over again. This is not a real-world situation.

    You say that reading multiple files gets "too elaborate for this purpose", but I believe that reading multiple files (or some other method; does anyone else have any ideas?) is the only proper way of testing this. Anything simpler, like the method you have chosen, does not replicate a real-world situation. Your test only shows that if you choose not to cache your files into memory yourself (which is rather foolish, but this information may be of interest to amateur programmers who choose not to do such things), then the Win32 API functions are twice as fast as the runtime functions when re-reading the same file. Again, the point is that this simply does not happen in real-world applications.

    Also, on a hunch, I modified the small read/write size to be closer to a multiple of 32,768 (I am using NTFS), and the speeds noticeably improved. I thought you might be interested:

    Code:
     Large read/writes use a buffer of 1000000 bytes.
     Small read/writes use a buffer of 65000 bytes.
    
    OPEN/CLOSE runtime 20279
    OPEN/CLOSE W32 API 17916
    Win32 API takes 88% the time of the C runtime library.
    
    LARGE WRITE runtime 13499
    LARGE WRITE W32 API 13199
    Win32 API takes 97% the time of the C runtime library.
    
    SMALL WRITE runtime 47849
    SMALL WRITE W32 API 45125
    Win32 API takes 94% the time of the C runtime library.
    
    LARGE READ runtime 27139
    LARGE READ W32 API 18076
    Win32 API takes 66% the time of the C runtime library.
    
    SMALL READ runtime 45435
    SMALL READ W32 API 43803
    Win32 API takes 96% the time of the C runtime library.

  8. #23
    FillYourBrain
    Originally posted by JasonD
    The problem is that the test only shows one thing - that it is clearly faster for reading the same file over and over again. This is not a real world situation.
    This is a generalization. There are plenty of real-world situations that work off of a single file and seek around it. Most commercial software works off of a single file, and based on my experience (monitoring disk reads), most do not cache the entire file in memory. There is no worse way to waste memory than to cache huge files in it.

    Originally posted by JasonD
    You say that reading multiple files gets "too elaborate for this purpose", but I believe that reading multiple files (or doing some other method - does anyone else have any ideas?) is the only proper way of testing this. Anything simplier, like the method you have chosen, does not replicate a real world situation.
    See Above.
    Originally posted by JasonD
    Your test only shows that if you choose not to cache your files into memory yourself (which is rather foolish, but this information may be of interest to amateur programmers who choose not to do such things)
    Wasting memory is an amateur programmer tendency as well. Seeking around a file is a perfectly legitimate way of doing things. I dare you to write a database app that works any other way!
    Originally posted by JasonD
    , then the win32 api functions are twice as fast as the runtime functions when reading from the same file in the future. Again, the point is that this simply does not happen in real world applications.
    The real world is something I know my share about, and I have to disagree with you 100%. The norm is most certainly not what you are saying. And even if it were, write a test that does that; I have very little doubt of the results.

  9. #24
    FillYourBrain
    Try 700-megabyte ISO files. How about a series of them? Let's see you come up with tests on those while caching the whole thing in memory on a system with 64 MB of RAM.

    edit: And I'm a little confused by the fact that you're arguing. Thus far I've completely backed up my claims. If you want to show an exception, then by all means back it up.
    Last edited by FillYourBrain; 09-24-2003 at 11:39 AM.

  10. #25
    JasonD
    I think there is a confusion, and I will attempt to explain my perspective in more detail:

    I can understand that this may occur for a database program used by multiple users, since you may have to write the data back to the database in case others wish to access it. But the fact that the database is changing so rapidly means memory caching will be ineffective. In every other real-world case you mentioned, the file is read from different portions each time; it is accessed continuously, but never the same area over and over again. Your program continually reads the same portion of the same file.

    If you ever need to do what your test shows (i.e. reading the same portion of the file over and over again), this portion of the file should be cached manually into memory. This is not wasting memory like an amateur - it is what any professional programmer would do. Of course, the limit on how much you can do this depends on the file portion's size. As this size increases, there will be a limit reached in which you have no option but to manually cache a smaller section of it, or let Windows do its own memory caching (like your test program). At this point, the performance of either case is going to drop drastically, as there is no more RAM for you (or Windows) to cache the entire frequently accessed file.

    Originally posted by FillYourBrain
    try 700 megabyte ISO files. How about a series of them. Let's see you come up with tests on that caching the whole thing in memory on a system with 64 meg of RAM.
    OK, this is a good example of a real-world application to explain my point. Your original test program reads the same file over and over again. This example requires you to read a new 64 MB section of the file each time (of course you can't cache the whole thing; that was never my point). The difference is that each new section is uncached in this real-world app, whereas in your test program the section is cached after the first read.

    Please understand that I am not arguing that the Win32 API functions are not faster; I am arguing that the test does not prove anything for a real-world application, as most programs never access files the way your test does. Now, if Windows can anticipate that a future read may be done on the next chunk of the file (I know some caches do this) whereas the C runtime does not, then a test that reads a file sequentially, as we've just described, would prove that the Win32 API is faster in the way most applications deal with files.

    I hope that I have made myself clearer.

    EDIT: Perhaps a program that does the following would be a better test: Create a file that is much larger than available RAM, and then continually access random portions of the file. Since the file is much larger than available RAM, and each portion of the file has just as much of a chance to be accessed as any other, we will not do our own memory cache management, and let Windows take care of it. I believe this test would prove a lot. Perhaps with the same results as your test.

  11. #26
    FillYourBrain
    If the test proves nothing (which I doubt), provide a better test.

    And my comments were only because of your blanket statement that reading small pieces of a file is non-real-world.

    The fact remains: the best test so far is the one I have provided, and it leaves us with moderately conclusive results.

  12. #27
    JasonD
    Originally posted by FillYourBrain
    if the test proves nothing (which I doubt) provide a better test.

    and my comments were only because of your blanket statement about reading small pieces of a file being non-real-world.

    Fact remains. best test so far is what I have provided and it leaves us with moderatedly conclusive results.
    Yes, your test is the best test so far, but it is also the only test so far. I was simply pointing out that it may not prove as much as it appears to, due to its method. That is why I mentioned that perhaps someone may have a better alternative.

    I just edited my post above, as you were writing your reply, with a test that may be more real-world like. Let me know what you think about it.

    I would also be interested in the differences in file access when each file needs to be accessed only once, much like a lot of real-world applications, such as reading an .ISO file. One generalized test cannot show the differences of all possibilities. If I have time later today, I'll throw a couple of these tests together.

  13. #28
    FillYourBrain
    By the way, if we can prove that the API is faster than the runtime at least some of the time, and slower none of the time, then I think it's fair to say that a programmer concerned with speed would have to use the API for a Windows-based app. The real challenge for you, I guess, is to show an example where the API is actually slower, because with the data so far, performing the same isn't good enough to make a case for using C runtime calls. I say this while stressing that portability can be better obtained by using inline wrappers around system calls than through the over-glorified "standard".

  14. #29
    JasonD
    Good point. I do agree that wrappers are best for portability. I am developing a product at the moment that must be as portable as possible, and was about to write some image-loading routines when I came upon this thread. I am trying to decide the following:

    If the C runtime library is around 98% of the speed of the Win32 API (in the specific case of whatever your program does with its files, which I believe it will be in my case), should I use it? Even with wrappers, when it comes time to port, that is one more section of code you needn't rewrite. What do you think?

  15. #30
    FillYourBrain
    My class library typically makes no assumptions about external calls. Templates, macros, and inline functions wrap most everything, so that I can write completely portable code using the library's implementation for each system (only Linux and Windows at the moment, and Linux is just getting underway).

    My feeling is that even if they are the same exact call on both systems, it may still be better to wrap it. But then, I like to make these things objects anyway, so they are wrapped already.

    Side note: I don't know if you've ever seen the IJG JPEG library or the gzip code, but they both use macros to wrap reading and writing. And for Windows, they use API calls!
