{Really sorry for cross-posting; I just didn't know (and still don't) how to properly move the post. I have requested that it be deleted.}
Well, here is the code:
Code:
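#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

int threads; int z=0;
pthread_barrier_t barrier;
/* savetime is defined elsewhere; signature inferred from the call below */
void savetime(struct timeval *f, struct timeval *s, double *out);

void * a (void *args)
{
    struct timeval f,s;
    int a = 1000000000 / threads;       /* iterations per thread */
    pthread_barrier_wait(&barrier);     /* all threads start timing together */
    gettimeofday(&s, NULL);
    for (int i=0; i < a; ++i)
        ; //or use this z=z*z;
    pthread_barrier_wait(&barrier);     /* wait until every thread has finished */
    gettimeofday(&f, NULL);
    savetime(&f, &s, (double *)args);
    return NULL;
}

int main(int argc, char ** argv)
{
    double time = 0.0;  /* shared by all threads: the last writer wins */
    if (argc < 2 || (threads = atoi(argv[1])) <= 0) {
        fprintf(stderr, "usage: %s <threads>\n", argv[0]);
        return 1;
    }
    pthread_t *thread = malloc(threads * sizeof(pthread_t));
    if (thread == NULL)
        return 1;
    pthread_barrier_init(&barrier, NULL, threads);
    for (int i=0; i < threads; ++i)
        pthread_create(&thread[i], NULL, a, &time);
    for (int i=0; i < threads; ++i)
        pthread_join(thread[i], NULL);
    pthread_barrier_destroy(&barrier);
    free(thread);
    printf("%.6f\n", time);
    return 0;
}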
(savetime just subtracts the two timevals and stores the result in a double variable)
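For anyone who wants to compile the test, here is a minimal sketch of what savetime could look like. This is a reconstruction from the description above, not the original helper: the parameter names, the void return type, and the choice of seconds as the unit are assumptions.

```c
#include <sys/time.h>

/* Hypothetical reconstruction of savetime (not the original code):
 * stores the elapsed time f - s, in seconds, in *out. */
void savetime(struct timeval *f, struct timeval *s, double *out)
{
    /* The microsecond difference may be negative; adding it to the
     * whole-second difference still gives the correct total. */
    double sec  = (double)(f->tv_sec  - s->tv_sec);
    double usec = (double)(f->tv_usec - s->tv_usec);
    *out = sec + usec / 1e6;
}
```

With a start of 1.75 s and a finish of 3.25 s, this stores 1.5 in the output variable.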
The results are about 0.4439 for 10 threads and 0.3932 for 100 threads. This is a test case; my original code is more than 3000 lines, too long to post, but the problem is similar: for the same total number of loop iterations, 100 threads finish faster. I could make some guesses, but I wanted to hear more ideas...