>> ... you may be able to use normal synchronization (an additional condition variable and mutex)
A condition variable is not needed. A mutex protecting a boolean would already count as "normal synchronization".
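
To make that concrete, here is a minimal sketch of the mutex-plus-boolean approach. It assumes C++11 or later, since the thread has not said which language or threading API is in use. The names SharedFlag, poll_until_set and demo_flag are made up for illustration, and the 1 ms sleep stands in for whatever real work a pool thread does between checks.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical flag type: a plain bool that is only touched under a mutex.
class SharedFlag {
public:
    void set() {
        std::lock_guard<std::mutex> lock(mutex_);
        value_ = true;
    }
    bool is_set() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }
private:
    mutable std::mutex mutex_;  // mutable so is_set() can stay const
    bool value_ = false;
};

// Polling loop: check the flag between units of work.
// Returns how many times the loop body ran before the flag was seen.
int poll_until_set(const SharedFlag& flag) {
    int iterations = 0;
    while (!flag.is_set()) {
        ++iterations;
        // Placeholder for one unit of real work (assumption).
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return iterations;
}

// Small demo: one worker polls, the calling thread sets the flag.
bool demo_flag() {
    SharedFlag flag;
    int iterations = -1;
    std::thread worker([&] { iterations = poll_until_set(flag); });
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    flag.set();
    worker.join();  // join() makes the worker's write to iterations visible
    return flag.is_set() && iterations >= 0;
}
```

This is the polling style: the worker notices the change only the next time it checks the flag. If a thread needs to sleep until the change happens, that is where a condition variable would come back in.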

I'm also interested in the typical processing load of this device: how many threads are in the pool, and on average how many of them run concurrently (or how many are idle)?

Any particular throughput requirements?

How many real hardware threads/cores do you have?

Out of curiosity, what is the architecture?

Having a better understanding of how it works today will help with choosing a polling method vs. a signaling method.

gg