To multithread or not ?

This is a discussion on To multithread or not ? within the C++ Programming forums, part of the General Programming Boards category; Hallo all, I am still very new to C. I wrote a program that basically does evaluations on all binary ...

  1. #1
    Registered User
    Join Date
    May 2008
    Posts
    58

    To multithread or not ?

    Hallo all,

    I am still very new to C. I wrote a program that basically does evaluations on all binary numbers between 0 and 2^27. When aggressively compiled, the program is already quite fast (takes 40 sec on my 64bit , 2core MacBook). Would multithreading significantly improve the execution speed in a program which basically does through all binary numbers? If yes, is <pthread.h> the library to use? Here it is:
    Code:
    #include <bitset>
    #include <iostream>
    #include <math.h>
    
    int main()
    {
    static const unsigned n = 27;
    static char *c[ n ] = {"aut","bel","bul","cyp","cze","deu","dnk","esp","est","fin","fra","gbr","grc","hun","irl","ita","ltu","lux","lva","mlt","nld","pol","prt","rom","svk","svn","swe"};
    static unsigned int w[ n ] = {10,12,10,4,12,29,7,27,4,7,29,29,12,12,7,29,7,4,4,3,13,27,12,14,7,4,10};
    static float p[ n ] = {8.34,10.59,7.6,0.8,10.32,82.18,5.47,45.27,1.34,5.31,63.92,61.09,11.18,10.03,4.39,59.21,3.37,0.48,2.27,0.41,16.39,38.09,10.67,21.45,5.4,2.02,9.22};
    
    static const unsigned nc = 1 << n;
    
    double pr = 0.0f;
    
    unsigned int sumv = 0;
    unsigned int sumw = 0;
    float sump = 0.0f;
    
    double COLEMAN[ 2 ] = {0.0};
    double BANZHAF[ n ][ 2 ] = {{0.0}};
    
    bool win_ni = false;
    bool win_lb = false;
    
    unsigned int i;
    for ( i = nc; i--; )
    {
    	std::bitset < n > v = i;
    	sumv = v.count();
    	if ( sumv < 15 ) continue;
    	sumw = 0;
    	sump = 0;
    
    	pr = pow( 0.5 , sumv ) * pow( 0.5 , n - sumv );
    
    	win_ni = false;
    	win_lb = false;
    
    	unsigned int k;
    	for ( k = n; k--; ) { if ( v[ n - k - 1 ] ) { sumw += w[ k ]; sump += p[ k ]; } }
    
    	if ( sumv >= 18 && sumw >= 255 && sump >= 308.01 ) { COLEMAN[ 0 ] += pr; win_ni = true; }
    	if ( sump >= 322.92 || sumv >= 24 ) { COLEMAN[ 1 ] += pr; win_lb = true; }
    
    	if(win_ni || win_lb)
    	{
    		unsigned int j;
    		for ( j = n ; j--; )
    		{
    			v.flip( j );
    			sumv = v.count();
    			sumw = 0;
    			sump = 0;
    
    			k = 0;
    			for ( k = n; k--; ) { if ( v[ n - k - 1 ] ) { sumw += w[ k ]; sump += p[ k ]; } }
    
    			if ( win_ni && ( sumv < 18 || sumw < 255 || sump < 308.01 ) ) BANZHAF[ j ][ 0 ] += pr;
    			if ( win_lb && ( ( sumv < 15 || sump < 322.92 ) && sumv < 24 ) ) BANZHAF[ j ][ 1 ] += pr;
    
    			v.flip( j );
    		}
    	}
    }
    
    printf( "Coleman  ; Efficiency\n" );
    printf( "Nice 27  ; Lisbon 27\n" );
    printf( "%f ; %f\n" , COLEMAN[ 0 ], COLEMAN[ 1 ] );
    printf( "Banzhaf ; Voting ; Powers\n" );
    printf( "State ; Nice 27  ; Lisbon 27\n" );
    i = 0;
    for ( i = n; i--; ) { printf("  %s ; %f ; %f\n" , c[ n - i - 1 ], BANZHAF[ i ][ 0 ], BANZHAF[ i ][ 1 ] ); }
    
    return 0;
    }

  2. #2
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,189
    Well, since its a bit late to try to decipher what you are doing, perhaps you could explain what you mean by 'goes through all the binary numbers'.

    But in general, anything that can be written as a for loop can be multithreaded for a performance gain. There are exceptions, such as extremely short loops where teh thread setup time exceeds the gain. If you are only doing 2^27 calculations I would suggest that at 40 seconds execution time, that wont be an issue. You can probably get that down to ~16 seconds assumming you have hyperthreading enabled.
    Until you can build a working general purpose reprogrammable computer out of basic components from radio shack, you are not fit to call yourself a programmer in my presence. This is cwhizard, signing off.

  3. #3
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,167
    assuming all calculations can be done in parallel (ie, iterations do not depend/use the results of prior iterations), you will be able to get a speedup proportional to the number of threads. If you make two threads, that would make it run in almost half the time. "Almost" because the two cores share the same link to memory, as well as caches.

    As for threading library, I recommend either pthread or Boost::Thread. Boost::Thread seems to be just a bit more portable (I couldn't find an implementation of pthread for 64-bit windows), and is more natively C++. Pthread is a C library. Pthread is tried and true though, being in POSIX OSes since prehistoric times, and it is a bit more feature-rich compared to Boost::Thread in its current state. Also, Boost::Thread will probably be made standard C++ (like the rest of the Boost library) in a few revisions.

    abachler, please don't spread misinformation.
    anything that can be written as a for loop can be multithreaded for a performance gain
    Not true. Not if results depend on prior results.
    Code:
    int fib[100] = { 1, 1 };
    for (i = 2;  i < 100; ++i) {
         fib[i] = fib[i-1] + fib[i-2];
    }
    It's not possible to parallelize that.

    If you are only doing 2^27 calculations I would suggest that at 40 seconds execution time, that wont be an issue.
    Huh...

    You can probably get that down to ~16 seconds assumming you have hyperthreading enabled.
    That is totally irrelevant. Hyperthreading requires the program to be multi-threaded. It basically simulates two cores on one. The speedup is more like 20&#37;, not >100% (where did that 16 sec estimation come from?). And the Core 2 doesn't even have hyperthreading.

  4. #4
    Registered User
    Join Date
    May 2008
    Posts
    58
    ok, thanks for the comments. I will give multithreading a shot

    SK

  5. #5
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,189
    Quote Originally Posted by cyberfish View Post
    assuming all calculations can be done in parallel (ie, iterations do not depend/use the results of prior iterations), you will be able to get a speedup proportional to the number of threads. If you make two threads, that would make it run in almost half the time. "Almost" because the two cores share the same link to memory, as well as caches.
    Wrong. 4 logitcal processors ( on a dual core with HT) do not equal 4 times the procesing power. the additional logical processor from HT in practice increases performance about 30&#37; as much as an actual additional processor. Hence 40/2.6 = ~16

    Just adding more threads will not give you a performance increase per se, it depends on several factors, nto the least of which is how many processign units you have and how efficiently the task parallelizes. Mosts tasks decrease in single iteration performance measures when parallelized. Ahmdals law applies here.

    abachler, please don't spread misinformation.
    hence -
    Quote Originally Posted by abachler
    But in general, anything that can be written as a for loop can be multithreaded...
    I notice you left out the important qualifier statement, are you trolling?


    Not true. Not if results depend on prior results.
    Code:
    int fib[100] = { 1, 1 };
    for (i = 2;  i < 100; ++i) {
         fib[i] = fib[i-1] + fib[i-2];
    }
    It's not possible to parallelize that.
    yes, it is. Its just not a trivial process, I never claimed it to be trivial, only to be generally possible.

    Huh...


    That is totally irrelevant. Hyperthreading requires the program to be multi-threaded. It basically simulates two cores on one. The speedup is more like 20%, not >100% (where did that 16 sec estimation come from?). And the Core 2 doesn't even have hyperthreading.
    2 physical cores * 1.3 = 2.6, 40/2.6 = ~16

    Again, you are leaving out qualifiers in order to troll.

    Code:
     
    i[3] = i[1] + i[2]                          i[4] = i[2] + i[3]
    Core A                                       Core B
    mov eax , i[2]                             nop
    mov edx , i[1]                             mov eax , i[2]
    add eax , edx                              nop
    mov i[3] , eax                             nop
    nop                                           mov edx , i[3] 
    nop                                           add eax , edx
    nop                                           mov i[4] , eax
    and you just completed in 7 clock cycles what woudl have taken 8 otherwise. Not a huge performance gain, but its can be parallelized. Granted I coudl write an even further optimized single thread version that would blow away the parallel version, but that is for another time.
    Last edited by abachler; 06-15-2008 at 10:54 AM.
    Until you can build a working general purpose reprogrammable computer out of basic components from radio shack, you are not fit to call yourself a programmer in my presence. This is cwhizard, signing off.

  6. #6
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,167
    Wrong. 4 logitcal processors ( on a dual core with HT) do not equal 4 times the procesing power. the additional logical processor from HT in practice increases performance about 30&#37; as much as an actual additional processor. Hence 40/2.6 = ~16
    My bad. I thought you meant you can speed up a 40sec single-threaded program to 16sec just by enabling Hyperthreading, which is what your first post seems to suggest.

    Just adding more threads will not give you a performance increase per se, it depends on several factors, nto the least of which is how many processign units you have and how efficiently the task parallelizes. Mosts tasks decrease in single iteration performance measures when parallelized. Ahmdals law applies here.
    I was assuming the OP is interested only in spawning two threads (one for each core).

    I notice you left out the important qualifier statement, are you trolling?
    My apologies, but I thought loops that depend on results of previous iterations are very common that that ought to be pointed out.

    Again, you are leaving out qualifiers in order to troll.
    Mind pointing out what qualifiers?

    I am not sure what the last part of your post is supposed to point out.

    And I repeat: The Core 2, which the OP uses, does not support Hyperthreading.

  7. #7
    Super Moderator VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,590
    Multi-threading, as has been pointed out, does not always equal better performance. It also introduces several other issues that may or may not be obvious at this point in your design. If you design for multi-thread from the outset then it's not a big deal. However some designs are not well suited to multi-threading. It's not quite as simple as firing off a thread and letting it run. It IS this easy to make a new thread, it's NOT as easy to ensure you don't get any race conditions between threads.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Again on multithread...
    By BrownB in forum C++ Programming
    Replies: 6
    Last Post: 07-13-2006, 01:58 AM
  2. Multithread security problem
    By naruto in forum Windows Programming
    Replies: 2
    Last Post: 12-09-2004, 08:45 PM
  3. multithread programming in C
    By coo_pal in forum C Programming
    Replies: 3
    Last Post: 05-12-2003, 06:56 AM
  4. Multithread Mainia
    By Inquirer in forum C++ Programming
    Replies: 4
    Last Post: 10-20-2002, 11:20 PM
  5. Major Multithread Problem :: MFC
    By kuphryn in forum Windows Programming
    Replies: 1
    Last Post: 05-07-2002, 09:58 PM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21