Thread: Parallel programming: how to tell threads are being scheduled across the cores?

  1. #1
    Kung Fu Kitty Angus's Avatar
    Join Date
    Oct 2008
    Location
    Montreal, Canada
    Posts
    115

    Parallel programming: how to tell threads are being scheduled across the cores?

    I'm using threads for the first time to maximize usage of a multi-core processor. I'd like to know if all the cores are being used to their maximum potential, but I don't know how.
    I was told that for a while MySQL had a flaw wherein it was not properly using the free Linux kernel to schedule its threads. However, I've never seen any calls that direct scheduling. I find this confusing, so another question I have is, is the onus on the programmer to cause threads to be scheduled on different cores?

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Under normal circumstances, threads are scheduled on all threads.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Kung Fu Kitty Angus's Avatar
    Quote Originally Posted by matsp
    Under normal circumstances, threads are scheduled on all threads.
    Threads are scheduled on all cores, you mean? So that's all there is to it? Does anyone know about this problem where the MySQL server didn't operate under normal circumstances? What about OpenMP? What's that for, if not normal circumstances?

  4. #4
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Angus
    Threads are scheduled on all cores, you mean? So that's all there is to it? Does anyone know about this problem where the MySQL server didn't operate under normal circumstances? What about OpenMP? What's that for, if not normal circumstances?
    The only reason I can think of for MySQL doing such a thing, if the anecdote is true, is that it was deliberately messing around with scheduling. You have to TRY to break it. By default, threads are intelligently scheduled across cores.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  5. #5
    Kernel hacker
    Yes, brewbuck explains much better than me - if you just create threads, all cores will be used (as long as the OS itself is working correctly, of course).

    By "messing" with the OS's scheduling, you are HIGHLY likely to undo the work done to make the OS's scheduling optimal. There are some special cases where this is not so, but those are about as common as finding compiler bugs or OS bugs. If you do not KNOW FOR CERTAIN, then you most likely should not try to "outsmart" the OS.

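    There are tools that can tell you what the CPU load is on each CPU. "ps" can show which processor each thread last ran on (for example "ps -eLo pid,tid,psr,pcpu,comm"), and other tools, such as "top" or "mpstat", report the load per CPU.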

    --
    Mats

  7. #7
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    I have a pair of Dell PowerEdge 6850 servers at work, and they each have four dual-core Pentium 4 chips. Each core supports hyperthreading, so I have a total of 16 logical cores per machine. I find that MySQL's InnoDB tables run much faster if I limit the engine to 3 cores. This was actually suggested by the techs at MySQL, when I had a support contract with them. It seems counter-intuitive, but I suspect that it has to do with file I/O and other hardware issues.

  8. #8
    Registered User
    Join Date
    Mar 2008
    Posts
    82
    If you have the "top" command, run it and press "1".
    This shows the load on each of your cores individually.
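
    For a programmatic view of the same per-core numbers, here is a minimal C sketch. It assumes Linux, where /proc/stat has one "cpuN" line per logical core holding cumulative tick counters, and it reads only the first eight fields of each line. The names parse_cpu_line and print_core_counters are made up for illustration; they are not part of "top" or of any library.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Parse one per-core line of /proc/stat, e.g. "cpu1 10 0 5 80 5 0 0 0 0 0".
 * On success, returns 1 and stores the core index, the busy ticks and the
 * total ticks. Returns 0 for any other line, including the aggregate "cpu"
 * line that sums all cores. */
int parse_cpu_line(const char *line, int *core,
                   unsigned long long *busy, unsigned long long *total)
{
    unsigned long long user, nice, sys, idle, iowait, irq, softirq, steal;

    assert(line && core && busy && total);

    /* Per-core lines start with "cpu" immediately followed by a digit. */
    if (strncmp(line, "cpu", 3) != 0 || !isdigit((unsigned char)line[3]))
        return 0;

    if (sscanf(line + 3, "%d %llu %llu %llu %llu %llu %llu %llu %llu",
               core, &user, &nice, &sys, &idle, &iowait,
               &irq, &softirq, &steal) != 9)
        return 0;

    *total = user + nice + sys + idle + iowait + irq + softirq + steal;
    *busy  = *total - idle - iowait;   /* everything except idle and I/O wait */
    return 1;
}

/* Print the cumulative counters of every core.
 * Returns the number of cores found, or -1 if /proc/stat cannot be opened. */
int print_core_counters(void)
{
    char line[512];
    int core, found = 0;
    unsigned long long busy, total;
    FILE *f = fopen("/proc/stat", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (parse_cpu_line(line, &core, &busy, &total)) {
            printf("core %d: %llu busy ticks out of %llu\n", core, busy, total);
            found++;
        }
    }
    fclose(f);
    return found;
}
```

    These counters are totals since boot, so a load figure needs two readings taken some time apart; the difference between them is what a per-core display would show.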

  9. #9
    Kung Fu Kitty Angus's Avatar
    Interesting. I guess it takes some experience to know what those numbers mean. After all, it's not enough to know that all the cores are being used; I need to know that they are all being used at the same time.

  10. #10
    Kernel hacker
    Quote Originally Posted by Angus
    Interesting. I guess it takes some experience to know what those numbers mean. After all, it's not enough to know that all the cores are being used; I need to know that they are all being used at the same time.
    If you see 100% load on all processors (or NEAR 100%), then that would indicate that all processors are being used.

    Do you have any reason to believe that NOT all cores are being used?

    --
    Mats

  11. #11
    Officially An Architect brewbuck's Avatar
    Quote Originally Posted by Elkvis
    I have a pair of Dell PowerEdge 6850 servers at work, and they each have four dual-core Pentium 4 chips. Each core supports hyperthreading, so I have a total of 16 logical cores per machine. I find that MySQL's InnoDB tables run much faster if I limit the engine to 3 cores. This was actually suggested by the techs at MySQL, when I had a support contract with them. It seems counter-intuitive, but I suspect that it has to do with file I/O and other hardware issues.
    Even modern operating systems aren't perfect at scheduling threads across cores. Threads might be jumping from core to core unnecessarily, which destroys CPU cache performance. Limiting the set of schedulable cores makes this jumping less likely, leading to better cache performance.

    You could consider that a "bug" in the OS, but a general-purpose scheduler will never do the "right thing" in all possible situations, even though they try... really hard. Take a look at a scheduler sometime.

  12. #12
    Kung Fu Kitty Angus's Avatar
    Quote Originally Posted by matsp
    If you see 100% load on all processors (or NEAR 100%), then that would indicate that all processors are being used.
    You're quite sure about that? I have some critical questions about the way such measurements are taken. Is it a reflection of the true mean CPU usage over the sample time, or is that subject to the granularity of the tool? What if there's a spike in CPU usage when one sample is taken, then nothing for a while, then another spike when the next sample is taken?
    Also, can someone answer the same question about the "perfmon" tool in WinXP? I'm also developing an app there which is real-time. It's using a DirectX sound buffer, which suffers from an underrun sometimes, and those times seem to coincide with other processes doing work, but I don't see much more than 20% cpu usage on perfmon.

    Quote Originally Posted by matsp
    Do you have any reason to believe that NOT all cores are being used?
    Only insofar as I don't believe everything I read. I was given the impression that the open Linux kernel doesn't schedule threads very well, but folk around here seem confident otherwise, so I guess I'll accept that for now.

  13. #13
    Kernel hacker
    CPU performance is metered in the kernel in some way. The USUAL way to do this is by measuring time from "schedule in" to "schedule out" for each process. The accounting displayed by "top" or some such is the sum of the CPU measurement for all processes EXCEPT the "idle"/"null" process. It is sampled over a period of time (e.g. 1 second), so if it says 90%, 0.9 seconds of CPU time was used by all processes except IDLE for that 1 second period. The remaining 0.1 second was in idle.

    Obviously, the measurement isn't necessarily extremely precise. For example, interrupt execution may be counted against the currently running process, even if the interrupt work was initiated by a different process. Say you have a web server that queues up 1MB of HTML to send out and then goes to sleep waiting for all that data to be sent, and then a math process starts. Because we are using a rather inefficient network card, 3% of the time is spent in the interrupt handler. So whilst our math process shows 99.5% CPU usage, it actually used only 96.5%, because 3% was used by the network card, performing work for the web server that is currently sleeping.

    Little spurts of CPU (within the sample period, e.g. 1 second) will show up averaged over the time-period, so if a process runs constantly for 200ms in that 1 second interval, it will show up as 20% CPU usage [assuming nothing else runs in that same time].
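
    The averaging described above can be sketched in C as follows. This is an illustration under assumptions, not how "top" is actually implemented: struct cpu_sample and cpu_usage_percent are hypothetical names, and the counters are assumed to be cumulative ticks of the kind Linux exposes in /proc/stat.

```c
#include <assert.h>

/* One reading of the cumulative CPU time counters of a core, in ticks. */
struct cpu_sample {
    unsigned long long busy;   /* ticks spent running anything but idle */
    unsigned long long total;  /* busy ticks plus idle ticks */
};

/* Average utilisation, in percent, between two readings of the same core.
 * Only the differences matter, so the result is a mean over the whole
 * interval rather than a snapshot taken at the sampling instant. */
double cpu_usage_percent(struct cpu_sample prev, struct cpu_sample cur)
{
    unsigned long long elapsed;

    /* Cumulative counters never go backwards. */
    assert(cur.total >= prev.total && cur.busy >= prev.busy);

    elapsed = cur.total - prev.total;
    if (elapsed == 0)
        return 0.0;            /* no time has passed between the readings */
    return 100.0 * (double)(cur.busy - prev.busy) / (double)elapsed;
}
```

    With a 100 Hz tick, a one-second interval is 100 ticks. A process that runs for 200ms adds 20 busy ticks, so the readings {1000, 5000} and {1020, 5100} give 20%, whether those 200ms came in one burst or were spread across the second.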

    As to scheduling in Linux, it will absolutely certainly schedule things on all available processors UNLESS the process creating the thread goes out of its way to PREVENT that (by setting the thread affinity). And whilst setting the thread affinity MAY help at times, it is most often NOT helpful. The MySQL example given is probably one of the few cases where this is beneficial, and the reason is that MySQL does a lot of locking and disk IO, whilst sharing some global data, so the different threads NEED to share the same data, and are often scheduled for short periods of time.
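
    For reference, this is roughly what setting the affinity looks like on Linux with glibc. It is a sketch, not what MySQL does: limit_allowed_cpus and allowed_cpu_count are hypothetical helper names, while sched_getaffinity, sched_setaffinity and the CPU_* macros are the glibc interfaces, which need _GNU_SOURCE defined before any header is included.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Number of CPUs the calling thread may currently run on, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof set, &set) != 0)   /* 0 = calling thread */
        return -1;
    return CPU_COUNT(&set);
}

/* Shrink the calling thread's affinity mask to at most max_cpus of the CPUs
 * it is already allowed to use (the lowest-numbered ones are kept).
 * Returns the number of CPUs kept, or -1 on error. */
int limit_allowed_cpus(int max_cpus)
{
    cpu_set_t current, reduced;
    int kept = 0;

    assert(max_cpus > 0);

    if (sched_getaffinity(0, sizeof current, &current) != 0)
        return -1;

    CPU_ZERO(&reduced);
    for (int cpu = 0; cpu < CPU_SETSIZE && kept < max_cpus; cpu++) {
        if (CPU_ISSET(cpu, &current)) {
            CPU_SET(cpu, &reduced);
            kept++;
        }
    }

    if (sched_setaffinity(0, sizeof reduced, &reduced) != 0)
        return -1;
    return kept;
}
```

    Calling limit_allowed_cpus(3) early in main() would confine the calling thread, and any threads it creates afterwards (they inherit the mask), to at most three cores. If you never make a call like this, the scheduler is free to use every core.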

    --
    Mats

  14. #14
    pwns nooblars
    Join Date
    Oct 2005
    Location
    Portland, Or
    Posts
    1,094
    If you are considering doing your own scheduling and you get it to work REALLY well in your test environment, keep in mind that your application isn't going to be the only one running. Even if your app is rather resource intensive, it may not be the most resource intensive one. So your scheduling will not only slow down your app (by trying to assign threads to cores that are already full, in some cases) but will also slow down other apps that try to take advantage of the OS's built-in scheduling.
