The example output was multiplying two 25x25 arrays with 25 threads available. Since then I've changed the code like this:
Code:
void *matmult(void *ptr) {
    int i, j, row;
    int pos = (int)(long)ptr; // thread index, passed by value through the void* argument
    for (;;) {
        // claim the next row while holding the lock and keep a private copy,
        // since rowcount can change again as soon as the lock is released
        pthread_mutex_lock(&mutex);
        if (rowcount < m - 1)
            row = ++rowcount;
        else {
            pthread_mutex_unlock(&mutex);
            return NULL;
        }
        pthread_mutex_unlock(&mutex);
        sleep(1); //lets other threads into queue before running and going back
        printf("thread %d is processing row %d\n", pos, row);
        for (i = 0; i < k; ++i) {
            C[row][i] = 0;
            for (j = 0; j < n; ++j)
                C[row][i] += B[row][j] * A[j][i];
        }
    }
}
and the thread creation code like:
Code:
pthread_t *tid = new pthread_t[numthreads];
for (i = 0; i < numthreads; ++i)
    pthread_create(&tid[i], NULL, matmult, (void*)(long)i); // pass the thread index by value
for (i = 0; i < numthreads; ++i)
    pthread_join(tid[i], NULL);
delete[] tid;
With these changes each thread is actually identified, and the threads step through the rows like I had in mind (0 1 2 3 0 1 2 3 0 1 2 3, etc.). Still, I agree that replacing sleep with sched_yield would be a better solution, but that'll have to wait until I get back to school to try out.
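For reference, a minimal sketch of what the sched_yield version could look like. This is not the code from my program: the Work struct, matmult_yield and multiply_rows are made-up names for illustration, and the shared state is bundled into a struct instead of the globals used above. Each thread copies its claimed row into a local variable before unlocking, so nothing reads rowcount outside the mutex. Note that sched_yield is only a hint to the scheduler, so it likely won't reproduce the neat 0 1 2 3 ordering that sleep gives.

```cpp
#include <cassert>
#include <pthread.h>
#include <sched.h>
#include <vector>

typedef std::vector<std::vector<int> > Matrix;

// Hypothetical bundle of the shared state (the post uses globals instead).
struct Work {
    const Matrix *B;        // m x n
    const Matrix *A;        // n x k
    Matrix *C;              // m x k result
    int m, n, k;
    int rowcount;           // last row handed out; starts at -1
    pthread_mutex_t mutex;  // protects rowcount
};

static void *matmult_yield(void *ptr) {
    Work *w = static_cast<Work *>(ptr);
    for (;;) {
        // Claim the next row under the lock and keep a private copy of it.
        pthread_mutex_lock(&w->mutex);
        if (w->rowcount >= w->m - 1) {
            pthread_mutex_unlock(&w->mutex);
            return NULL;
        }
        int row = ++w->rowcount;
        pthread_mutex_unlock(&w->mutex);

        // Offer the CPU to other runnable threads instead of sleeping.
        // This is only a hint: it does not guarantee a round-robin order.
        sched_yield();

        for (int i = 0; i < w->k; ++i) {
            int sum = 0;
            for (int j = 0; j < w->n; ++j)
                sum += (*w->B)[row][j] * (*w->A)[j][i];
            (*w->C)[row][i] = sum;
        }
    }
}

// Computes C = B * A with numthreads workers (hypothetical helper name).
Matrix multiply_rows(const Matrix &B, const Matrix &A, int numthreads) {
    assert(numthreads > 0);
    Work w;
    w.m = static_cast<int>(B.size());
    w.n = static_cast<int>(A.size());
    w.k = A.empty() ? 0 : static_cast<int>(A[0].size());
    Matrix C(w.m, std::vector<int>(w.k, 0));
    w.B = &B;
    w.A = &A;
    w.C = &C;
    w.rowcount = -1;
    pthread_mutex_init(&w.mutex, NULL);

    std::vector<pthread_t> tid(numthreads);
    for (int i = 0; i < numthreads; ++i)
        pthread_create(&tid[i], NULL, matmult_yield, &w);
    for (int i = 0; i < numthreads; ++i)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&w.mutex);
    return C;
}
```

Calling multiply_rows(B, A, 4) on two small matrices should give the same result as a single-threaded multiply. Threads that find no rows left just return right away, so having more threads than rows is harmless.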