Here is an in-depth question about pthreads:

I'm currently writing a paper about how an algorithm performs. The main algorithm is simple. It looks like this:

for (i = start; i < finish; i++)
    do_something();

I noticed that with a very large number of pthreads, around 100-200, it runs faster than with 8-10 pthreads, even though I'm running it on an 8-core system (I've tested it on many systems, multi-core and not).

Even if I test this simple/useless algorithm:

for (int i=0; i < a; ++i) ;

with a = 1000000 / number_pthreads;

it runs faster with very many pthreads. If somebody has any idea why it would perform faster with 100 threads rather than 10, please enlighten me. Note that the total number of loop iterations is the same.

The way I calculate time is:

Barrier

gettimeofday()

algorithm

Barrier

gettimeofday()