Thread: How to measure the time in multi-core machines with pthreads?

  1. #1
    Registered User
    Join Date
    Mar 2008
    Posts
    16

    How to measure the time in multi-core machines with pthreads?

    After spending about an hour trying to figure out what was the problem that my threaded version of a program took more time than the single-core one, I think I rediscovered the wheel. Simply put, I can not use the clock () function to determine the "real" time I spent working in parallel. What do I mean? Example:
    A program creates some threads and each one of them will make argv[1] calls to the rand () function (just to do something so that time goes by ). Then in the end I compute the time recorded by clock () differences and differences with time(). Obviously the ratio below is almost two because my machine is dual core. But ...
    Question: How can I measure the exact time elapsed and not the approximation in seconds with time()?
    Code:
    $ g++ try_to_time.cpp -pthread -o try
    $ ./try 50000000
    Just started counting!
    All threads joined!
    Duration 1: 11.8488 secs.
    Duration 2: 
    * started : 1206834449 seconds after 1/1/1970.
    * finished: 1206834455 seconds after 1/1/1970.
    Duration 3: 6 secs.
    $
    And the program
    Code:
    $ cat try_to_time.cpp 
    #include <iostream>
    #include <iomanip>
    #include <cstdlib>
    #include <ctime>
    #include <pthread.h>
    
    using namespace std;
    
    #define NUM_THREADS 10
    #define DIGITS 4
    
    void * thread_function (void *);
    
    int main (int argc, char * argv [])
    {
    	time_t start1, start2, start3;
    	time_t end1, end2, end3;
    	int iterations, i, rc;
    	void * status;
    	double duration;
    
    	if (argc != 2) {
    		cout << "Usage error. I am expecting one positive integer." << endl;
    		exit (1);
    	}
    	iterations = atoi ( argv [1] );
    
    	// create opaque objects.
    	pthread_t * threads = new pthread_t [NUM_THREADS];
    	// Make threads Joinable for sure.
    	pthread_attr_t attr;
    	pthread_attr_init (&attr);
    	pthread_attr_setdetachstate (&attr, PTHREAD_CREATE_JOINABLE);
    
    	start1 = clock ();
    	start2 = time (NULL);
    	time (&start3);
    
    	cout << "Just started counting!" << std::endl;
    
    	for (i = 0; i < NUM_THREADS; i++) {
    		rc = pthread_create(&(threads[i]), &attr, thread_function, (void *) &iterations);
    		if (rc) {
    			cerr << "ERROR; return code from pthread_create() is " << rc << std::endl;
    			exit (2);
    		}
    	}
    
    	for (i = 0; i < NUM_THREADS; i++)
    		pthread_join ( threads [i], &status );
    	cout << "All threads joined!" << std::endl;
    
    	end1 = clock ();
    	end2 = time (NULL);
    	time (&end3);
    	
    	duration = static_cast<double> (end1 - start1);
    	if (duration < static_cast<double> (CLOCKS_PER_SEC)) {
    		cout << "Few iterations. Try again with a bigger number." << endl;
    		exit (3);
    	}
    
    	cout << "Duration 1: " << fixed << setprecision (DIGITS) 
                 << static_cast<double> (duration / CLOCKS_PER_SEC) << " secs." << endl;
    	cout << "Duration 2: " << endl;
    	cout << "* started : " << start2 << " seconds after 1/1/1970." << endl;
    	cout << "* finished: " << end2 << " seconds after 1/1/1970." << endl;
    	cout << "Duration 3: " << static_cast<int> (difftime ( end3, start3 ) ) << " secs." << endl;
    
    	return 0;
    }
    
    void * thread_function (void * input)
    {
    	int loops = *((int *) input);
    	int i;
    	
    	for (i = 0; i < loops; i++)
    		rand () &#37; RAND_MAX;		// Just a dummy operation.
    
    	pthread_exit (NULL);
    }
    Last edited by dimis; 03-30-2008 at 11:40 AM. Reason: to fold code so that it is not very wide

  2. #2
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    Here's one lead after a Google'n: http://www.linuxforums.org/forum/lin...cess-time.html

    gg

  3. #3
    Registered User
    Join Date
    Mar 2008
    Posts
    16

    Smile solution

    Thanks Codeplug. However, these guys wanted to measure the time of each individual thread, which is not what I want. I figured out a solution though with gettimeofday () after searching for quite some time among header files. Here is a sample just for future reference for anyone who might be interested in:
    Code:
    $ gcc measure_time.c -o measure -pthread
    $ ./measure 50000000
    Clock started counting!
    All threads joined!
    Clock stopped counting!
    
    clock start   :       2481.
    clock end     :   12012496.
    CLOCKS_PER_SEC:    1000000.
    timeofday - start sec : 1206895792.
    timeofday - start usec: 154930.
    timeofday - end sec   : 1206895798.
    timeofday - end usec  : 270356.
    -----------------------------------------------------
    Time difference due to clock       :  12.010015 secs.
    Time difference due to gettimeofday:   6.115426 secs.
    $
    Code:
    /*
    ** file: measure.c
    */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <sys/time.h>		/* gettimeofday() */
    #include <pthread.h>
    /*#include <sys/_structs.h>*/	/* defines time structures */
    
    #define NUM_THREADS 10
    #define DIGITS 4
    
    typedef struct {
    	int		secs;
    	int		usecs;
    } TIME_DIFF;
    
    TIME_DIFF * my_difftime (struct timeval *, struct timeval *);
    void * thread_function (void *);
    
    int main (int argc, char * argv [])
    {
    	int iterations, i, rc;
    	void * status;
    	struct timeval myTVstart, myTVend;
    	time_t clock_start, clock_end;
    	TIME_DIFF *	difference;
    	pthread_t threads [NUM_THREADS];
    	pthread_attr_t attr;
    
    	if (argc != 2) {
    		fprintf (stderr, "Usage error. I am expecting one positive integer.\n");
    		exit (1);
    	}
    	iterations = atoi ( argv [1] );
    
    	// Make threads Joinable for sure.
    	pthread_attr_init (&attr);
    	pthread_attr_setdetachstate (&attr, PTHREAD_CREATE_JOINABLE);
    
    	clock_start = clock ();
    	gettimeofday (&myTVstart, NULL);
    
    	printf ("Clock started counting!\n");
    
    	for (i = 0; i < NUM_THREADS; i++) {
    		rc = pthread_create(&(threads[i]), &attr, thread_function, (void *) &iterations);
    		if (rc) {
    			fprintf (stderr, "ERROR; return code from pthread_create() is &#37;d.\n", rc);
    			exit (2);
    		}
    	}
    
    	for (i = 0; i < NUM_THREADS; i++)
    		pthread_join ( threads [i], &status );
    	printf ("All threads joined!\n");
    
    	clock_end = clock ();
    	gettimeofday (&myTVend, NULL);
    	printf ("Clock stopped counting!\n\n");
    
    	printf ("clock start   : %10d.\n", (int) clock_start);
    	printf ("clock end     : %10d.\n", (int) clock_end);
    	printf ("CLOCKS_PER_SEC: %10d.\n", (int) CLOCKS_PER_SEC);
    	printf ("timeofday - start sec : %d.\n", (int) myTVstart.tv_sec);
    	printf ("timeofday - start usec: %6d.\n", (int) myTVstart.tv_usec);
    	printf ("timeofday - end sec   : %d.\n", (int) myTVend.tv_sec);
    	printf ("timeofday - end usec  : %6d.\n", (int) myTVend.tv_usec);
    	printf ("-----------------------------------------------------\n");
    	printf ("Time difference due to clock       : %10.6f secs.\n", 
                    (double) (clock_end - clock_start) / CLOCKS_PER_SEC);
    	difference = my_difftime (&myTVstart, &myTVend);
    	printf ("Time difference due to gettimeofday: %3d.%6d secs.\n", difference->secs, difference->usecs);
    
    
    	free (difference);
    	
    	return 0;
    }
    
    void * thread_function (void * input)
    {
    	int loops = *((int *) input);
    	int i;
    	
    	for (i = 0; i < loops; i++)
    		rand () % RAND_MAX;		// Just a dummy operation.
    
    	pthread_exit (NULL);
    }
    
    TIME_DIFF * my_difftime (struct timeval * start, struct timeval * end)
    {
    	TIME_DIFF * diff = (TIME_DIFF *) malloc ( sizeof (TIME_DIFF) );
    
    	if (start->tv_sec == end->tv_sec) {
    		diff->secs = 0;
    		diff->usecs = end->tv_usec - start->tv_usec;
    	}
    	else {
    		diff->usecs = 1000000 - start->tv_usec;
    		diff->secs = end->tv_sec - (start->tv_sec + 1);
    		diff->usecs += end->tv_usec;
    		if (diff->usecs >= 1000000) {
    			diff->usecs -= 1000000;
    			diff->secs += 1;
    		}
    	}
    	
    	return diff;
    }
    Last edited by dimis; 03-30-2008 at 11:42 AM. Reason: to fold code so that it is not very wide

  4. #4
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Why not just call getrusage() -- beware, most of the returned fields are full of garbage, but the user and system times are real.

  5. #5
    Registered User
    Join Date
    Mar 2008
    Posts
    16
    getrusage has the same trouble with clock (). [tested]

  6. #6
    Registered User
    Join Date
    Mar 2008
    Posts
    16
    The program with various things I tested until I found gettimeofday():

    Output:
    Code:
    $ ./try 50000000
    Clock ticks per second: 100
    Just started counting!
    All threads joined!
    Duration 1: 12.0655 secs.
    Duration 2: 
    * started : 1206900972 seconds after 1/1/1970.
    * finished: 1206900978 seconds after 1/1/1970.
    Duration 3: 6 secs.
    Duration 4: 6.0000.
    tms_utime for start  : 0.
    tms_utime for finish : 1201.
    tms_stime for start  : 0.
    tms_stime for finish : 5.
    tms_cutime for start : 0.
    tms_cutime for finish: 0.
    tms_cstime for start : 0.
    tms_cstime for finish: 0.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    rusage: ru_utime start: 0.
    rusage: ru_utime start: 1005.
    rusage: ru_stime start: 0.
    rusage: ru_stime start: 1960.
    rusage: ru_utime end  : 12.
    rusage: ru_utime end  : 13463.
    rusage: ru_stime end  : 0.
    rusage: ru_stime end  : 54932.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    gettime: tv_sec start : 0.
    gettime: tv_nsec start: 9.
    gettime: tv_sec end   : 0.
    gettime: tv_nsec end  : 0.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    timeofday - start sec : 1206900972.
    timeofday - start usec: 608830.
    timeofday - end sec   : 1206900978.
    timeofday - end usec  : 789331.
    $

    Code:
    Code:
    #include <iostream>
    #include <iomanip>
    #include <cstdlib>
    #include <ctime>
    #include <pthread.h>
    #include <sys/times.h>
    #include <sys/time.h>
    #include <sys/resource.h>
    //#include <sys/_structs.h>
    
    using namespace std;
    
    #define NUM_THREADS 10
    #define DIGITS 4
    
    void * thread_function (void *);
    
    int main (int argc, char * argv [])
    {
    	time_t start1, start2, start3;
    	time_t end1, end2, end3;
    	tms start_buffer, finish_buffer;
    	clock_t start, finish;
    	int iterations, i, rc;
    	void * status;
    	double duration;
    	rusage usage_start, usage_end;
    	timespec start_time, end_time;
    	struct timeval myTV1, myTV2;
    
    	if (argc != 2) {
    		cout << "Usage error. I am expecting one positive integer." << endl;
    		exit (1);
    	}
    	iterations = atoi ( argv [1] );
    
    	// create opaque objects.
    	pthread_t * threads = new pthread_t [NUM_THREADS];
    	// Make threads Joinable for sure.
    	pthread_attr_t attr;
    	pthread_attr_init (&attr);
    	pthread_attr_setdetachstate (&attr, PTHREAD_CREATE_JOINABLE);
    
    	start1 = clock ();
    	start2 = time (NULL);
    	time (&start3);
    	int const ticks_per_sec = sysconf(_SC_CLK_TCK);
    	cout << "Clock ticks per second: " << ticks_per_sec << endl;
    	start = times(&start_buffer);
    	if (getrusage (RUSAGE_SELF, &usage_start) != 0)
    		cerr << " getrusage returned error!" << endl;
    	//clock_gettime (CLOCK_REALTIME, &start_time);
    	gettimeofday (&myTV1, NULL);
    
    	cout << "Just started counting!" << std::endl;
    
    	for (i = 0; i < NUM_THREADS; i++) {
    		rc = pthread_create(&(threads[i]), &attr, thread_function, (void *) &iterations);
    		if (rc) {
    			cerr << "ERROR; return code from pthread_create() is " << rc << std::endl;
    			exit (2);
    		}
    	}
    
    	for (i = 0; i < NUM_THREADS; i++)
    		pthread_join ( threads [i], &status );
    	cout << "All threads joined!" << std::endl;
    
    	end1 = clock ();
    	end2 = time (NULL);
    	time (&end3);
    	finish = times(&finish_buffer);
    	if (getrusage (RUSAGE_SELF, &usage_end) != 0)
    		cerr << " getrusage returned error! [2]" << endl;
    	//clock_gettime (CLOCK_REALTIME, end_time);
    	gettimeofday (&myTV2, NULL);
    	
    	duration = static_cast<double> (end1 - start1);
    	if (duration < static_cast<double> (CLOCKS_PER_SEC)) {
    		cout << "Few iterations. Try again with a bigger number." << endl;
    		exit (3);
    	}
    
    	cout << "Duration 1: " << fixed << setprecision (DIGITS) << static_cast<double> (duration / CLOCKS_PER_SEC) << " secs." << endl;
    	cout << "Duration 2: " << endl;
    	cout << "* started : " << start2 << " seconds after 1/1/1970." << endl;
    	cout << "* finished: " << end2 << " seconds after 1/1/1970." << endl;
    	cout << "Duration 3: " << static_cast<int> (difftime ( end3, start3 ) ) << " secs." << endl;
    	cout << "Duration 4: " << static_cast<double> ((finish - start) / ticks_per_sec) << "." << endl;
    	cout << "tms_utime for start  : " << start_buffer.tms_utime << "." << endl;
    	cout << "tms_utime for finish : " << finish_buffer.tms_utime << "." << endl;
    	cout << "tms_stime for start  : " << start_buffer.tms_stime << "." << endl;
    	cout << "tms_stime for finish : " << finish_buffer.tms_stime << "." << endl;
    	cout << "tms_cutime for start : " << start_buffer.tms_cutime << "." << endl;
    	cout << "tms_cutime for finish: " << finish_buffer.tms_cutime << "." << endl;
    	cout << "tms_cstime for start : " << start_buffer.tms_cstime << "." << endl;
    	cout << "tms_cstime for finish: " << finish_buffer.tms_cstime << "." << endl;
    	cout << "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" << endl;
    	cout << "rusage: ru_utime start: " << usage_start.ru_utime.tv_sec << "." << endl;
    	cout << "rusage: ru_utime start: " << usage_start.ru_utime.tv_usec << "." << endl;
    	cout << "rusage: ru_stime start: " << usage_start.ru_stime.tv_sec << "." << endl;
    	cout << "rusage: ru_stime start: " << usage_start.ru_stime.tv_usec << "." << endl;
    	cout << "rusage: ru_utime end  : " << usage_end.ru_utime.tv_sec << "." << endl;
    	cout << "rusage: ru_utime end  : " << usage_end.ru_utime.tv_usec << "." << endl;
    	cout << "rusage: ru_stime end  : " << usage_end.ru_stime.tv_sec << "." << endl;
    	cout << "rusage: ru_stime end  : " << usage_end.ru_stime.tv_usec << "." << endl;
    	cout << "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" << endl;
    	cout << "gettime: tv_sec start : " << start_time.tv_sec << "." << endl;
    	cout << "gettime: tv_nsec start: " << start_time.tv_nsec << "." << endl;
    	cout << "gettime: tv_sec end   : " << end_time.tv_sec << "." << endl;
    	cout << "gettime: tv_nsec end  : " << end_time.tv_nsec << "." << endl;
    	cout << "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" << endl;
    	cout << "timeofday - start sec : " << myTV1.tv_sec << "." << endl;
    	cout << "timeofday - start usec: " << myTV1.tv_usec << "." << endl;
    	cout << "timeofday - end sec   : " << myTV2.tv_sec << "." << endl;
    	cout << "timeofday - end usec  : " << myTV2.tv_usec << "." << endl;
    	
    	return 0;
    }
    
    void * thread_function (void * input)
    {
    	int loops = *((int *) input);
    	int i;
    	
    	for (i = 0; i < loops; i++)
    		rand () % RAND_MAX;		// Just a dummy operation.
    
    	pthread_exit (NULL);
    }

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    If you want precise (ish) time, then gettimeofday() is probably the right choice. If you want a less precise, but more portable solution, a combinatin of time() and difftime() would work for you - that should work on just about any machine that supports time() - which is nearly anything with more than about 32KB of memory that has a C compiler.

    But you won't get more than seconds out of it.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  8. #8
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by dimis View Post
    getrusage has the same trouble with clock (). [tested]
    Why are you interested in the time your process is NOT executing?

  9. #9
    Registered User
    Join Date
    Mar 2008
    Posts
    16
    Thanks matsp. I think that gettimeofday () is exactly what I want. In fact it has an intuitive name and it was silly I didn't look for it earlier.
    @brewbuck: I guess I don't get what you are suggesting again.

  10. #10
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    getrusage() gets actual cpu-execution time, in both user-space code (ru_utime) and kernel-space code (ru_stime). You add the two together to get total cpu-time.

    gettimeofday() gets the wall-time. So when you use this as a "stop-watch", the timings include all the time in which the OS decided not to schedule any of your threads on any cpu(s).

    The clock-resolution you require and the resolution provided by these interfaces will need to be considered as well.

    gg

  11. #11
    Registered User
    Join Date
    Mar 2008
    Posts
    16
    To be honest, my main concern was to print a message with the response time and not the cpu-time, so gettimeofday is exactly what I want. Still I don't see how you can measure the (sum of) time fragments that the cores were working with my process. The suggested method describes something overall for all of the cores. But doesn't say anything about overlaps on the execution on cores, and I guess there is no easy way of figuring this out. But even if someone could, then how am I supposed to tell something more basic as "how many cores does this machine have" (with a C function that is ... )

    And by the way, what does "gg" mean?
    Last edited by dimis; 03-31-2008 at 11:13 PM.

  12. #12
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> Still I don't see how you can measure the (sum of) time fragments that the cores were working with my process.
    The kernel does this for you. It's good if you want to know how much execution time your program took to run - and not just the "wall time" as perceived by the user. In theory, if the kernel were running *only* your process, then the two should be very close. But even in this case, the kernel has to give cpu time to internal threads, device drivers, etc...

    >> But doesn't say anything about overlaps on the execution on cores,...
    The link from my post #2 talks about how to get per-thread execution times. The link below has information on binding threads to a particular core (so you know you're getting overlap) - but letting the scheduler do this for you is usually best.

    >> how am I supposed to tell ... "how many cores does this machine have"
    http://cboard.cprogramming.com/showp...91&postcount=9

    >> what does "gg" mean?
    Those are my initials, but you can call me Codeplug

    gg

  13. #13
    Registered User
    Join Date
    Feb 2010
    Posts
    1

    Overhead of pthreads ?

    Hi Dimis,
    I tried your original example dated 03-30-2008 10:17 AM, and did a minor modification to that. Instead of fixing NUM_THREADs to 10, i took it as an argument from user. Then i ran the same example using 1,2 and 7 threads on an 8 CPU AMD Opteron machine with no other processes running on it.

    What is not clear to me is why the REAL time of 7 threads is much higher than 1 thread and why the total cpu time of all those threads is significantly higher than 1 thread run?

    I was expecting the CPU time of n threads would be nearly same as 1 thread and the real time to be much better as we increase the number of threads.

    Run 1:
    ./a.out 50000000 7

    Just started counting!
    All threads joined!
    Duration 1: 236.9800 secs.
    Duration 2:
    * started : 1267359761 seconds after 1/1/1970.
    * finished: 1267359807 seconds after 1/1/1970.
    Duration 3: 46 secs.

    Run 2:
    ./a.out 350000000 1
    Just started counting!
    All threads joined!
    Duration 1: 8.2200 secs.
    Duration 2:
    * started : 1267359807 seconds after 1/1/1970.
    * finished: 1267359815 seconds after 1/1/1970.
    Duration 3: 8 secs.

    Run 3:
    ./a.out 175000000 2
    Just started counting!
    All threads joined!
    Duration 1: 85.6500 secs.
    Duration 2:
    * started : 1267359815 seconds after 1/1/1970.
    * finished: 1267359862 seconds after 1/1/1970.
    Duration 3: 47 secs.

    Any thoughts?

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. inputting time in separate compilation
    By sameintheend01 in forum C++ Programming
    Replies: 6
    Last Post: 03-13-2003, 04:33 AM
  2. I apologize. Good bye.
    By doubleanti in forum A Brief History of Cprogramming.com
    Replies: 14
    Last Post: 05-03-2002, 06:51 PM
  3. Is this really true or it's just science fiction?
    By Nutshell in forum A Brief History of Cprogramming.com
    Replies: 145
    Last Post: 04-09-2002, 06:17 PM
  4. time class
    By Unregistered in forum C++ Programming
    Replies: 1
    Last Post: 12-11-2001, 10:12 PM
  5. relating date....
    By Prakash in forum C Programming
    Replies: 3
    Last Post: 09-19-2001, 09:08 AM