I may need to use multithreaded programming to reduce computation time, and I am currently learning how to use the POSIX threads (pthreads) library.
I have access to a cluster with 8 CPUs, each with a single core. How many threads should I create to get roughly the best performance on this cluster? Should it equal the number of CPUs? And if I use a threading library such as POSIX threads, will the library decide which thread runs on which CPU, or do I have to specify that in my code?
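
To make the question concrete, here is a minimal sketch of what I have in mind: one thread per online CPU, with no CPU placement specified anywhere. The names `task_t`, `worker`, `count_cpus`, and `run_one_thread_per_cpu` are my own placeholders, and `_SC_NPROCESSORS_ONLN` is, as far as I know, a widely available extension (Linux, macOS, BSD) rather than part of the core POSIX standard, so treat this as an assumption-laden sketch rather than the recommended approach.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-thread task record; the real computation would go here. */
typedef struct {
    int id;    /* index of this worker */
    int done;  /* set to 1 once the worker has finished */
} task_t;

/* Placeholder worker: only marks its task as done. */
static void *worker(void *arg)
{
    task_t *t = arg;
    t->done = 1;
    return NULL;
}

/* Number of CPUs currently online, falling back to 1 if the query fails.
 * Assumption: _SC_NPROCESSORS_ONLN is supported on the target system. */
long count_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* Creates one thread per online CPU, waits for all of them, and returns
 * how many finished (or -1 on allocation failure). No affinity is set,
 * so which thread runs on which CPU is not decided by this code. */
long run_one_thread_per_cpu(void)
{
    long n = count_cpus();
    pthread_t *threads = malloc((size_t)n * sizeof *threads);
    task_t *tasks = calloc((size_t)n, sizeof *tasks);
    if (threads == NULL || tasks == NULL) {
        free(threads);
        free(tasks);
        return -1;
    }

    long started = 0;
    for (long i = 0; i < n; i++) {
        tasks[i].id = (int)i;
        if (pthread_create(&threads[i], NULL, worker, &tasks[i]) != 0)
            break;  /* stop creating threads on the first failure */
        started++;
    }

    long finished = 0;
    for (long i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        finished += tasks[i].done;
    }

    free(threads);
    free(tasks);
    return finished;
}
```

Calling `run_one_thread_per_cpu()` from `main` and compiling with `-pthread` would, on my 8-CPU machine, presumably start 8 threads. What I do not know is whether that count is close to optimal, or whether I would also need to pin each thread to a CPU myself.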

Thanks!