Here is an in-depth question about pthreads:
I'm currently writing a paper about how an algorithm performs. The main algorithm is simple. It looks like this:
for (i = start; i < finish; i++)
    do_somethings;
I noticed that with a very large number of pthreads, like 100-200, it runs faster than with 8-10 pthreads, even though I'm running it on an 8-core system (I've tested it on many systems, multi-core and not).
Even if I test this simple/useless algorithm:
for (int i=0; i < a; ++i) ;
with a = 1000000 / number_pthreads;
it runs faster with a very large number of pthreads. If anybody has any idea why it would run faster with 100 threads than with 10, please enlighten me. Note that the total number of loop iterations is the same.
The way I measure the time is:
Barrier
gettimeofday()
algorithm
Barrier
gettimeofday()