    If I use an OpenMP parallel for construct like
    #pragma omp parallel for num_threads(4)
    for(int i=0; i<10; ++i){}
    when are the 4 threads created? Are they created at the pragma, and therefore possibly many times over the lifetime of the process, or is the runtime clever enough to set up a ready-to-use thread pool before the process starts?

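    That depends on the OpenMP implementation; the specification does not say when the threads have to be created. As far as I know, every mainstream implementation keeps a thread pool around and reuses it across parallel regions, since creating and destroying threads is expensive.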
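
    One way to check this for a particular runtime is to record which threads execute two consecutive parallel regions and compare them. The following is only a sketch: the helper names `ids_in_region` and `threads_reused` are made up for illustration, and it assumes that `std::this_thread::get_id()` returns a stable, distinct id for each OpenMP worker thread, which holds on the usual pthread-based runtimes but is not guaranteed by the OpenMP specification.

```cpp
#include <set>
#include <thread>

// Hypothetical helper: collect the ids of the threads that execute
// one parallel region requesting 4 threads.
std::set<std::thread::id> ids_in_region() {
    std::set<std::thread::id> ids;
    #pragma omp parallel num_threads(4)
    {
        // Serialize the insert; std::set is not thread-safe.
        #pragma omp critical
        ids.insert(std::this_thread::get_id());
    }
    return ids;
}

// Hypothetical helper: true if two back-to-back regions ran on the
// same set of threads, which suggests the runtime reused its pool.
bool threads_reused() {
    const std::set<std::thread::id> first = ids_in_region();
    const std::set<std::thread::id> second = ids_in_region();
    return first == second;
}
```

    Calling `threads_reused()` from `main` in a program compiled with OpenMP enabled (for example `-fopenmp` with GCC or Clang) and getting `true` would indicate that the same threads served both regions. A `false` result is not conclusive on its own, since the runtime may also hand out a different number of threads for each region.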
