Thread: questions on multiple thread programming

  1. #1
    Registered User
    Join Date
    Jan 2009
    Posts
    159

    questions on multiple thread programming

    I may have to resort to multithreaded programming to reduce computation time, and I am now learning how to use the POSIX threads (pthreads) library.
    I have access to a cluster with 8 CPUs, each of which has one core. How many threads should I create to get approximately the best performance on my cluster? Is it the same as the number of CPUs? If I use a multithreading library like POSIX threads, will the library take care of which thread runs on which CPU, or do I have to specify this in my code?

    Thanks!

  2. #2
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    To get the best performance, create one thread per core. The library (or more specifically, the OS scheduler) will take care of which thread runs on which core. You don't need to worry about this.
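
    A rough sketch of what "one thread per core" could look like with pthreads. The names slice_t, worker, online_cpus and parallel_sum are made up for illustration, and the work (summing a range of integers) is only a stand-in for a real computation. It assumes sysconf(_SC_NPROCESSORS_ONLN) is available, which is true on Linux and most Unix systems but is not required by POSIX itself. Note that this count is of logical processors, so it includes Hyper-Threading siblings.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical work item: each thread sums its own slice of 0..total-1. */
typedef struct {
    long begin, end;   /* half-open range [begin, end) */
    long long sum;     /* result written by the worker */
    int threaded;      /* 1 if a thread was started for this slice */
} slice_t;

static void *worker(void *arg)
{
    slice_t *s = arg;
    s->sum = 0;
    for (long i = s->begin; i < s->end; i++)
        s->sum += i;
    return NULL;
}

/* Number of processors currently online as seen by the OS (at least 1). */
static long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* Sum 0..total-1 with nthreads workers. No CPU is chosen here:
 * the OS scheduler decides where each thread runs. */
static long long parallel_sum(long total, long nthreads)
{
    assert(total >= 0 && nthreads > 0);
    pthread_t *tid = malloc((size_t)nthreads * sizeof *tid);
    slice_t *sl = malloc((size_t)nthreads * sizeof *sl);
    assert(tid && sl);

    long chunk = total / nthreads;
    for (long t = 0; t < nthreads; t++) {
        sl[t].begin = t * chunk;
        sl[t].end = (t == nthreads - 1) ? total : (t + 1) * chunk;
        sl[t].threaded =
            (pthread_create(&tid[t], NULL, worker, &sl[t]) == 0);
        if (!sl[t].threaded)
            worker(&sl[t]);   /* fall back to running this slice inline */
    }

    long long sum = 0;
    for (long t = 0; t < nthreads; t++) {
        if (sl[t].threaded)
            pthread_join(tid[t], NULL);
        sum += sl[t].sum;
    }
    free(tid);
    free(sl);
    return sum;
}
```

    You would call it as parallel_sum(1000000, online_cpus()) and build with something like cc -pthread. Whether one thread per processor is really the fastest setting depends on the workload, so it is worth timing a few values.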

  3. #3
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Quote Originally Posted by bithub
    To get the best performance, create one thread per core. The library (or more specifically, the OS scheduler) will take care of which thread runs on which core. You don't need to worry about this.
    I would say 1 worker thread per CPU,

    plus you can have several I/O threads, for example, which will spend most of their time in a waiting state while an I/O operation completes.
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  4. #4
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    >> I would say 1 worker thread per CPU
    Can you explain why you would use 1 thread per CPU instead of 1 thread per core? To software applications, there is no difference.

    >> plus you can have several I/O threads, for example, which will spend most of their time in a waiting state while an I/O operation completes
    Why would you create several threads to wait for IO? For the best performance, async IO is the best way to go.
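
    To illustrate the idea of one thread handling several I/O sources, here is a small sketch using poll(). Strictly speaking this is I/O multiplexing rather than POSIX AIO, but it shows the same point: a single thread can sleep until any of several descriptors has data. The function name wait_readable is invented for this example.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/* Wait until one of the nfds descriptors is readable (or at end of file).
 * Returns the index of the first ready descriptor,
 * or -1 on timeout or error. */
static int wait_readable(const int *fds, int nfds, int timeout_ms)
{
    assert(fds && nfds > 0);
    struct pollfd *p = calloc((size_t)nfds, sizeof *p);
    if (!p)
        return -1;

    for (int i = 0; i < nfds; i++) {
        p[i].fd = fds[i];
        p[i].events = POLLIN;   /* we only care about "data to read" */
    }

    int ready = -1;
    /* The calling thread sleeps here; no busy waiting, no extra threads. */
    if (poll(p, (nfds_t)nfds, timeout_ms) > 0) {
        for (int i = 0; i < nfds; i++) {
            if (p[i].revents & (POLLIN | POLLHUP)) {
                ready = i;
                break;
            }
        }
    }
    free(p);
    return ready;
}
```

    A typical loop would call wait_readable() on the set of sockets or pipes, read() from the descriptor it returns, and go back to waiting.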

  5. #5
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Quote Originally Posted by bithub
    Can you explain why you would use 1 thread per CPU instead of 1 thread per core? To software applications, there is no difference.

    Why would you create several threads to wait for IO? For the best performance, async IO is the best way to go.
    By CPU I mean core.

    Why several I/O threads? Because you can have a logger thread writing logs to a file,
    DB connection threads requesting data from a remote DB server,
    a socket thread processing some network connection, etc.

    It is too complicated to put all these tasks on one I/O thread.
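
    As an illustration of the logger-thread idea, here is a minimal sketch: worker threads push messages into a small queue, and one dedicated thread sleeps on a condition variable until there is something to write. All names (logger_t, logger_start, logger_log, logger_stop) and the queue sizes are hypothetical, not taken from any library.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define LOG_SLOTS   16
#define LOG_MSG_MAX 128

typedef struct {
    char msg[LOG_SLOTS][LOG_MSG_MAX];   /* ring buffer of pending lines */
    int head, count;
    int closing;                        /* set by logger_stop() */
    FILE *out;
    pthread_mutex_t mu;
    pthread_cond_t not_empty, not_full;
    pthread_t tid;
} logger_t;

static void *logger_main(void *arg)
{
    logger_t *lg = arg;
    char line[LOG_MSG_MAX];
    for (;;) {
        pthread_mutex_lock(&lg->mu);
        while (lg->count == 0 && !lg->closing)
            pthread_cond_wait(&lg->not_empty, &lg->mu);  /* mostly asleep */
        if (lg->count == 0) {            /* closing and queue drained */
            pthread_mutex_unlock(&lg->mu);
            return NULL;
        }
        strcpy(line, lg->msg[lg->head]);
        lg->head = (lg->head + 1) % LOG_SLOTS;
        lg->count--;
        pthread_cond_signal(&lg->not_full);
        pthread_mutex_unlock(&lg->mu);
        fprintf(lg->out, "%s\n", line);  /* slow I/O done outside the lock */
    }
}

static int logger_start(logger_t *lg, FILE *out)
{
    assert(lg && out);
    lg->head = lg->count = lg->closing = 0;
    lg->out = out;
    pthread_mutex_init(&lg->mu, NULL);
    pthread_cond_init(&lg->not_empty, NULL);
    pthread_cond_init(&lg->not_full, NULL);
    return pthread_create(&lg->tid, NULL, logger_main, lg);
}

/* Called from worker threads; blocks only while the queue is full.
 * Messages longer than LOG_MSG_MAX - 1 characters are truncated. */
static void logger_log(logger_t *lg, const char *text)
{
    pthread_mutex_lock(&lg->mu);
    while (lg->count == LOG_SLOTS)
        pthread_cond_wait(&lg->not_full, &lg->mu);
    int tail = (lg->head + lg->count) % LOG_SLOTS;
    snprintf(lg->msg[tail], LOG_MSG_MAX, "%s", text);
    lg->count++;
    pthread_cond_signal(&lg->not_empty);
    pthread_mutex_unlock(&lg->mu);
}

/* Ask the logger to finish the queued messages, then wait for it. */
static void logger_stop(logger_t *lg)
{
    pthread_mutex_lock(&lg->mu);
    lg->closing = 1;
    pthread_cond_signal(&lg->not_empty);
    pthread_mutex_unlock(&lg->mu);
    pthread_join(lg->tid, NULL);
    fflush(lg->out);
    pthread_cond_destroy(&lg->not_full);
    pthread_cond_destroy(&lg->not_empty);
    pthread_mutex_destroy(&lg->mu);
}
```

    Because the logger thread is blocked nearly all the time, it costs almost no CPU and does not need to be counted against the one-worker-per-core budget.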

  6. #6
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> I can access a cluster with 8 CPUs, each of which has one core.
    Multi-threading is for a single machine. For distributed-parallel programming you'll need something like MPI, unless you're running a distributed OS that provides a single system image (SSI), like openMosix.

    gg

  7. #7
    Registered User
    Join Date
    Jan 2009
    Posts
    159
    Thanks!
    Yes, I remember people calling my server a "cluster". But how can I check whether it is a distributed system running a distributed OS that provides SSI? And which should I use, multi-threading or distributed-parallel programming?

    Here is the info of my server from uname:
    Linux 2.6.9-78.0.8.ELsmp #1 SMP Wed Nov 19 19:42:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
    And here is the output of cat /proc/cpuinfo:
    processor : 0
    vendor_id : GenuineIntel
    cpu family : 15
    model : 4
    model name : Intel(R) Xeon(TM) MP CPU 3.66GHz
    stepping : 1
    cpu MHz : 3657.792
    cache size : 1024 KB
    physical id : 0
    siblings : 2
    core id : 0
    cpu cores : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cid cx16 xtpr
    bogomips : 7321.81
    clflush size : 64
    cache_alignment : 128
    address sizes : 40 bits physical, 48 bits virtual
    power management:

    processor : 1
    vendor_id : GenuineIntel
    cpu family : 15
    model : 4
    model name : Intel(R) Xeon(TM) MP CPU 3.66GHz
    stepping : 1
    cpu MHz : 3657.792
    cache size : 1024 KB
    physical id : 4
    siblings : 2
    core id : 4
    cpu cores : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cid cx16 xtpr
    bogomips : 7315.15
    clflush size : 64
    cache_alignment : 128
    address sizes : 40 bits physical, 48 bits virtual
    power management:

    processor : 2
    vendor_id : GenuineIntel
    cpu family : 15
    model : 4
    model name : Intel(R) Xeon(TM) MP CPU 3.66GHz
    stepping : 1
    cpu MHz : 3657.792
    cache size : 1024 KB
    physical id : 3
    siblings : 2
    core id : 3
    cpu cores : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cid cx16 xtpr
    bogomips : 7315.02
    clflush size : 64
    cache_alignment : 128
    address sizes : 40 bits physical, 48 bits virtual
    power management:

    processor : 3
    vendor_id : GenuineIntel
    cpu family : 15
    model : 4
    model name : Intel(R) Xeon(TM) MP CPU 3.66GHz
    stepping : 1
    cpu MHz : 3657.792
    cache size : 1024 KB
    physical id : 7
    siblings : 2
    core id : 7
    cpu cores : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cid cx16 xtpr
    bogomips : 7315.10
    clflush size : 64
    cache_alignment : 128
    address sizes : 40 bits physical, 48 bits virtual
    power management:

    processor : 4
    vendor_id : GenuineIntel
    cpu family : 15
    model : 4
    model name : Intel(R) Xeon(TM) MP CPU 3.66GHz
    stepping : 1
    cpu MHz : 3657.792
    cache size : 1024 KB
    physical id : 0
    siblings : 2
    core id : 0
    cpu cores : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cid cx16 xtpr
    bogomips : 7315.04
    clflush size : 64
    cache_alignment : 128
    address sizes : 40 bits physical, 48 bits virtual
    power management:

    processor : 5
    vendor_id : GenuineIntel
    cpu family : 15
    model : 4
    model name : Intel(R) Xeon(TM) MP CPU 3.66GHz
    stepping : 1
    cpu MHz : 3657.792
    cache size : 1024 KB
    physical id : 4
    siblings : 2
    core id : 4
    cpu cores : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cid cx16 xtpr
    bogomips : 7314.70
    clflush size : 64
    cache_alignment : 128
    address sizes : 40 bits physical, 48 bits virtual
    power management:

    processor : 6
    vendor_id : GenuineIntel
    cpu family : 15
    model : 4
    model name : Intel(R) Xeon(TM) MP CPU 3.66GHz
    stepping : 1
    cpu MHz : 3657.792
    cache size : 1024 KB
    physical id : 3
    siblings : 2
    core id : 3
    cpu cores : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cid cx16 xtpr
    bogomips : 7314.99
    clflush size : 64
    cache_alignment : 128
    address sizes : 40 bits physical, 48 bits virtual
    power management:

    processor : 7
    vendor_id : GenuineIntel
    cpu family : 15
    model : 4
    model name : Intel(R) Xeon(TM) MP CPU 3.66GHz
    stepping : 1
    cpu MHz : 3657.792
    cache size : 1024 KB
    physical id : 7
    siblings : 2
    core id : 7
    cpu cores : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cid cx16 xtpr
    bogomips : 7315.06
    clflush size : 64
    cache_alignment : 128
    address sizes : 40 bits physical, 48 bits virtual
    power management:

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    That is one machine for all intents and purposes.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #9
    Registered User
    Join Date
    Jan 2009
    Posts
    159
    Hi Matsp,
    Thanks!
    How can you tell that it is not a distributed system? Does "cluster" only refer to a distributed system?
    So should I stick to multi-threaded programming?

  10. #10
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by lehe
    Hi Matsp,
    Thanks!
    How can you tell that it is not a distributed system? Does "cluster" only refer to a distributed system?
    So should I stick to multi-threaded programming?
    On a cluster/distributed system, /proc/cpuinfo on one node will not list all 8 processors. It will only show the processors of THAT system, which will perhaps be one or two.
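
    If you want to check this from a program rather than by eye, here is a small sketch that counts the "processor" entries and the distinct "physical id" values in /proc/cpuinfo-style text. The function name cpu_topology and the MAX_PACKAGES limit are made up for the example, and it assumes the x86 Linux field names shown in the output above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PACKAGES 256   /* arbitrary limit for this sketch */

/* Count logical processors ("processor" lines) and physical packages
 * (distinct "physical id" values) in the text read from f.
 * Returns 0 on success, -1 on a read error. */
static int cpu_topology(FILE *f, int *logical, int *packages)
{
    int seen[MAX_PACKAGES];
    char line[4096];
    int id;

    assert(f && logical && packages);
    *logical = 0;
    *packages = 0;

    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "processor", 9) == 0) {
            (*logical)++;
        } else if (sscanf(line, "physical id : %d", &id) == 1) {
            int known = 0;
            for (int i = 0; i < *packages; i++) {
                if (seen[i] == id) {
                    known = 1;
                    break;
                }
            }
            if (!known && *packages < MAX_PACKAGES)
                seen[(*packages)++] = id;
        }
    }
    return ferror(f) ? -1 : 0;
}
```

    You would pass it fopen("/proc/cpuinfo", "r"). On a machine matching the output in post #7 it should report 8 logical processors in 4 packages (physical ids 0, 3, 4 and 7), all visible to one kernel, which is what makes it a single machine.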

    --
    Mats

  11. #11
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
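    The term "cluster" implies more than one physical machine (PC). You have a single machine with 8 logical processors (the /proc/cpuinfo output shows 4 physical packages, each with one core and two Hyper-Threading siblings), so POSIX "multi-threading" (with pthreads or processes) is appropriate in this case.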

    >> I was wondering how many threads to create would get approximately best time performance on my cluster?
    The answer can depend on what you're doing, but a typical answer is one thread per core - as bithub and vart mentioned.

    >> will the library take care of which thread run on which CPU or do I have to specify this in my code?
    The typical answer is that you let your OS/library take care of this for you.

    gg

  12. #12
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by Codeplug
    >> will the library take care of which thread run on which CPU or do I have to specify this in my code?
    The typical answer is that you let your OS/library take care of this for you.
    And unless you have SPECIFIC knowledge of the nature of the problem, the OS in question, and the processors in question, it is nearly always a waste of time to try to improve on what the OS/library does.

    --
    Mats
