I'm looking for a simple way to distribute a divided workload across a fixed (maximum) number of threads.
Here is how I do it when the workload is simply divided by the number of threads:
Code:
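std::vector<std::thread> threads;
// Launch one worker per thread; each runs MyClass::myFunction on this object.
for (unsigned i = 0; i < numThreads; ++i) {
    threads.emplace_back(&MyClass::myFunction, this, myData);
}
// Wait for every worker to finish.
for (auto& t : threads) { t.join(); }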
If the workload were so large that I wanted to divide it into more pieces than I have hardware threads, it's probably not a good idea to do the following, right?
Code:
// One thread per piece: oversubscribes the CPU when numPieces > numThreads.
for (unsigned i = 0; i < numPieces; ++i) {
    threads.push_back(...);
}
Is there a minimalist way of dispatching the pieces over time without ever running more than numThreads threads at once? I've tried to think of ways to join finished threads using mutexes, atomics, and queues, but the code always becomes bloated rather quickly. Maybe someone here has experience with threads and can point to an elegant solution.
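To make the goal concrete, here is a rough sketch of one pattern that might fit: instead of joining finished threads and starting new ones, start exactly numThreads workers once and let each of them claim the next piece index from a shared atomic counter until no pieces are left. The names runPieces, work and next are made up for illustration, and the sketch assumes the pieces are independent and that work does not throw (an exception escaping a thread would call std::terminate).

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: calls work(i) once for every i in [0, numPieces),
// using at most numThreads worker threads at any time.
template <typename Work>
void runPieces(std::size_t numPieces, unsigned numThreads, Work work) {
    std::atomic<std::size_t> next{0};       // index of the next unclaimed piece
    numThreads = std::max(1u, numThreads);  // always run at least one worker

    std::vector<std::thread> threads;
    threads.reserve(numThreads);
    for (unsigned t = 0; t < numThreads; ++t) {
        threads.emplace_back([&] {
            // fetch_add hands out each index exactly once across all workers.
            for (std::size_t i = next.fetch_add(1); i < numPieces;
                 i = next.fetch_add(1)) {
                work(i);
            }
        });
    }
    // All captured references stay valid because we join before returning.
    for (auto& t : threads) { t.join(); }
}
```

It could be called as runPieces(numPieces, numThreads, [&](std::size_t i) { /* process piece i */ }); where the lambda body is whatever handles one piece. Because workers pull pieces on demand, a slow piece does not hold up the others, which is the main reason to prefer this over pre-assigning a fixed range to each thread.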
Thanks!