Thread: Timing basic operations in C++

  1. #1
    Registered User StevenGarcia's Avatar
    Join Date
    Nov 2006
    Posts
    22

    Timing basic operations in C++

    I am attempting to measure the time it takes for a for loop to iterate 1000 times and the time it takes for a multiplication operation.

    I am having some trouble measuring the ops with enough precision. When I time a for loop iterating 1000 times and when I time a multiplication operation, in both instances I get a time of 0. I read that this means that the ops are taking less than one ms. I've been trying to find a method for timing ops with greater precision but so far I have not had any luck. If anyone could point me in the right direction it would be much appreciated.


    Code:
    void matrix::time_for_ops()
    {
    	loop_timer_start = clock();     // start for loop timer
    
    	for (int i = 0; i < 1000; i++)
    	{
    		//do nothing
    	}
    	
    	loop_timer_end = clock();      //end for loop timer
    
            //=====================================
    
            multiplication_timer_start = clock();   //start multi timer
    
    	test = 3 * 3;
    
    	multiplication_timer_end = clock();     //end multi timer
    
            //======================================
    
    	
           // Output results
    
           cout << "Time to execute multiplication = ";
    
    	cout << (multiplication_timer_end - multiplication_timer_start)/CLOCKS_PER_SEC; 
    
            cout << endl;
    
    	cout << "Time for a for loop to execute 1000 times = ";
    	
    	cout << (loop_timer_end - loop_timer_start)/CLOCKS_PER_SEC << endl;
    
    	return;
    }

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    clock() is like using a sun-dial to time a bullet.

    1 millisecond might seem quick, but your several GHz processor can get through millions of instructions in that time.

    http://msdn2.microsoft.com/en-us/library/ms644904.aspx
    If you're also using the win32 API

    Or do a board search for "RDTSC", which is the Pentium asm instruction that does much the same thing.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.
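    For readers not on the win32 API, a portable stand-in for the high-resolution counter suggested above is std::chrono::steady_clock. This is a swapped-in technique, not what the thread used, and it assumes a C++11 compiler. The helper name elapsed_ns is hypothetical. A minimal sketch:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not from the thread): runs `work` once and returns
// the elapsed time in nanoseconds, read from a monotonic clock.
template <typename Work>
std::int64_t elapsed_ns(Work work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();

    // steady_clock never goes backwards, so stop cannot precede start.
    assert(stop >= start);

    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

    Pass it any callable, for example a lambda wrapping the loop to be timed. The actual resolution depends on the platform's clock, so very short pieces of work can still read as 0.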

  3. #3
    Registered User StevenGarcia's Avatar
    Join Date
    Nov 2006
    Posts
    22
    Hey, thanks. That will work great.

  4. #4
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    Some other things you should know.....
    Code:
    	for (int i = 0; i < 1000; i++)
    	{
    		//do nothing
    	}
    Your compiler could remove this code altogether, since it does in fact "do nothing".

    >> test = 3 * 3;

    The pre-processor is free to change this to "test = 9" as well.

    gg

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by Codeplug View Post
    Some other things you should know.....
    Code:
    	for (int i = 0; i < 1000; i++)
    	{
    		//do nothing
    	}
    Your compiler could remove this code altogether, since it does in fact "do nothing".

    >> test = 3 * 3;

    The pre-processor is free to change this to "test = 9" as well.
    Not the pre-processor[1]. The optimizer in any commercial-grade compiler produced for a major processor architecture in the last 10-15 years will do "constant folding", which means evaluating constant expressions at compile time, so yes, the compiler will replace 3 * 3 with the constant 9. Even with RDTSC, you're unlikely to see a difference between two back-to-back RDTSC instructions and two RDTSC instructions with a multiply in between. You need many multiply instructions before the clock-cycle count is large enough not to be swamped by other random effects.

    Note also that the RDTSC value may differ between cores on a multiprocessor or multicore system, so if you measure something on such a system with RDTSC, you need to make sure the code runs on only one processor at any given time, or you may get very random results.

    [1] The preprocessor is almost entirely used to process #include, #define, #if and comment removal - and nothing else.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.
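    The "many multiply instructions" advice can be sketched as follows. The names MultiplyTiming and time_multiplies are hypothetical, and std::chrono::steady_clock (C++11) stands in for RDTSC. Each multiply feeds the next one and the final value is returned, so the work cannot be dropped as unused. The factor should come from runtime input, otherwise the optimizer may still fold the whole thing.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical result type: average cost per loop iteration plus the
// computed value, returned so the multiplications have a visible use.
struct MultiplyTiming {
    double ns_per_iteration;
    unsigned checksum;
};

// Times `count` dependent multiply-add steps and averages the cost.
// The figure includes loop overhead and the addition, so treat it as
// an upper bound on the cost of one multiplication.
MultiplyTiming time_multiplies(unsigned factor, int count)
{
    assert(count > 0);

    unsigned acc = 1;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < count; ++i) {
        // Unsigned arithmetic wraps around, so overflow is well defined.
        acc = acc * factor + static_cast<unsigned>(i);
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return MultiplyTiming{elapsed.count() / count, acc};
}
```

    Calling it with a count in the millions spreads the timer overhead over enough work to show up clearly above the noise.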

  6. #6
    Registered User StevenGarcia's Avatar
    Join Date
    Nov 2006
    Posts
    22
    So, I've found some examples of how to use QueryPerformanceCounter(), but I'm still getting a time of 0 seconds for my loop to 1000. I tried increasing the value to 100,000 but I still get 0 for my time reading. Any hints would be much appreciated.

    Could this be a result of my home computer having a dual core CPU?

    Code:
    void matrix::time_for_ops()
    {
    	int dummy = 0;
    
    	QueryPerformanceCounter(&start_tick);
    
    	for (int i = 0; i < 100000; i++)
    	{
    		dummy++;
    		
    	}
    	
    	QueryPerformanceCounter(&end_tick);
    
    	time_start.QuadPart = start_tick.QuadPart/ticksPerSecond.QuadPart;  //convert to seconds
    	
    	time_end.QuadPart = end_tick.QuadPart/ticksPerSecond.QuadPart;      //convert to seconds
    	
    	long long difference = time_end.QuadPart - time_start.QuadPart;           //find difference
    	
    	cout << "Time for a for loop to execute 1000 times = ";
    
    	cout << difference << " seconds.";
    	
    	return;
    }
    Last edited by StevenGarcia; 09-17-2007 at 04:06 PM.

  7. #7
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Why are you converting something which takes a few µs into whole seconds?
    Of course the result is going to be zero (or 1, or 2 or ... ): QuadPart is an integer, so each division throws away the fraction.

    Dump the raw data in start_tick and end_tick to see that you're getting something at least.
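    To make the arithmetic concrete, here is a small sketch with plain 64-bit integers standing in for the QuadPart fields. Both function names are hypothetical. The first mirrors the posted conversion, the second subtracts the raw ticks first and only then divides in floating point.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the posted code: each reading is converted to whole seconds
// before subtracting, so anything shorter than a second is lost.
std::int64_t elapsed_whole_seconds(std::int64_t start_tick,
                                   std::int64_t end_tick,
                                   std::int64_t ticks_per_second)
{
    assert(ticks_per_second > 0);
    return end_tick / ticks_per_second - start_tick / ticks_per_second;
}

// Subtracts the raw tick counts, then divides as doubles so the
// fractional part of a second survives.
double elapsed_seconds(std::int64_t start_tick,
                       std::int64_t end_tick,
                       std::int64_t ticks_per_second)
{
    assert(ticks_per_second > 0);
    return static_cast<double>(end_tick - start_tick) /
           static_cast<double>(ticks_per_second);
}
```

    With made-up readings of 5000000 and 5000018 ticks at 1000000 ticks per second, the first function returns 0 while the second returns 18 microseconds.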

  8. #8
    Registered User StevenGarcia's Avatar
    Join Date
    Nov 2006
    Posts
    22
    Thanks for the help everyone. I finally got a value I can work with.
    Code:
    void matrix::time_for_ops()
    {
    	int dummy = 0;
    
    	QueryPerformanceCounter(&start_tick);
    
    	for (int i = 0; i < 1000; i++)
    	{
    		dummy++;
    		
    	}
    	loop_timer_end = clock();
    
    	QueryPerformanceCounter(&end_tick);
    
    	cout << end_tick.QuadPart << endl;
    	cout << start_tick.QuadPart << endl;
    	cout << ticksPerSecond.QuadPart << endl;
    
    	float denominator = ticksPerSecond.QuadPart;
    	
    	cout << "Time for a for loop to execute 1000 times = ";
    
    	cout << (end_tick.QuadPart-start_tick.QuadPart)/denominator << " seconds." << endl;
    	
    	return;
    }
    My output for time to execute a for loop 1000 times = 5.02857e-006 seconds

  9. #9
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    My guess is the compiler optimised out all the loop code (since it has no dependencies) and all you really timed was this:
    loop_timer_end = clock();

    If you want to force the compiler to implement code which has no purpose except to waste CPU time, then do something like:
    volatile int dummy = 0;
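    A minimal sketch of the volatile approach, with a hypothetical function name spin. Every read and write of a volatile object counts as observable behaviour, so the compiler has to keep all the iterations. The trade-off is that each pass now includes a load and a store, which the timing will also pick up.

```cpp
#include <cassert>

// Hypothetical helper: counts up to `iterations` through a volatile
// variable so the optimizer cannot delete or collapse the loop.
int spin(int iterations)
{
    assert(iterations >= 0);

    volatile int dummy = 0;
    for (int i = 0; i < iterations; ++i) {
        dummy = dummy + 1;   // one volatile read and one volatile write
    }
    return dummy;
}
```

    Wrap the call to spin(1000) between the two timer reads in place of the original loop.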

  10. #10
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by Salem View Post
    My guess is the compiler optimised out all the loop code (since it has no dependencies) and all you really timed was this:
    loop_timer_end = clock();

    If you want to force the compiler to implement code which has no purpose except to waste CPU time, then do something like:
    volatile int dummy = 0;
    Or make dummy a global variable, then call some function that is in a different .C file [that may use dummy - the compiler won't know whether it does or not if they are not compiled together], and print dummy afterwards.

    And of course, get rid of the clock() call, as it's completely unneeded.

    --
    Mats
