Thread: MSD vs. LSD radix sort

  1. #1
    Registered User Micko's Avatar
    Join Date
    Nov 2003
    Posts
    715

    MSD vs. LSD radix sort

    Hello, After reading Prelude's sorting tutorial (www.eternallyconfuzzled.com) I wrote MSD radix sort function for sorting array of integeres.
    Code:
    void MSD_radix_sort_r ( int a[], int aux[], int first, int last, int d )
    {
           if ( d >= 0 )
    	{  
    		int i, count[UCHAR_MAX + 2] = {0}; 
    		
        		for ( i = first; i <= last; i++ ) 
        		{
          			count[digit ( a[i], d ) + 1]++;
    		} 
        		for ( i = 1; i <= UCHAR_MAX; i++ )
    		{	 
          			count[i] += count[i - 1];
    		} 
        		for ( i = first; i <= last; i++ )
    		{ 
          			aux[count[digit ( a[i], d )]++] = a[i];
    		} 
        		for ( i = first; i <= last; i++ )
    		{ 
    			a[i] = aux[i - first];
    		} 
    		
        		MSD_radix_sort_r ( a, aux, first, first + count[0] - 1, d - 1 );
        	
        		for (i = 0; i < sizeof ( int ); i++) 
    		{
          			MSD_radix_sort_r ( a, aux, first + count[i], first + count[i + 1] - 1, d - 1 ); 
    		}	
    		
    	}
    }
    void MSD_radix_sort ( int a[], int first, int last )
    {
    	int d = sizeof ( int ) - 1;
    	int* aux = malloc ( ( last - first + 1 ) * sizeof ( int ) );
    	
    	MSD_radix_sort_r ( a, aux, first, last, d );	
    	
    	free ( aux );
    }
    where digit is macro defined as:
    #define digit(kljuc,i) ( ( kljuc ) >> ( ( i ) * CHAR_BIT ) & UCHAR_MAX )

    As to be expected because of the MSD (most significant digit) algorithm only after first few steps array is almost sorted.
    I tested both LSD and MSd algorithm (LSD is implemented in the tutorial) with random numbers between 0 and 999999. I notice that
    there is a significat difference in execution time. However both execution time for 500000 elements were rather small:
    LSD 0.078s
    MSD 0.048s
    so I don't know if this is reliable.

    Here's my complete program:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <limits.h>
    
    #define N 500000
    #define digit(kljuc,i) ( ( kljuc ) >> ( ( i ) * CHAR_BIT ) & UCHAR_MAX )
    typedef void ( *pFunkcija ) ( int [], int, int ); 
    
    void MSD_radix_sort_r ( int a[], int aux[], int first, int last, int d )
    {
    	if ( d >= 0 )
    	{  
    		int i, count[UCHAR_MAX + 2] = {0}; 
    		
        		for ( i = first; i <= last; i++ ) 
        		{
          			count[digit ( a[i], d ) + 1]++;
    		} 
        		for ( i = 1; i <= UCHAR_MAX; i++ )
    		{ 
          			count[i] += count[i - 1];
    		} 
        		for ( i = first; i <= last; i++ )
    		{ 
          			aux[count[digit ( a[i], d )]++] = a[i];
    		} 
        		for ( i = first; i <= last; i++ )
    		{ 
    			a[i] = aux[i - first];
    		} 
    		
        		MSD_radix_sort_r ( a, aux, first, first + count[0] - 1, d - 1 );
        	
        		for (i = 0; i < sizeof ( int ); i++) 
    		{
          			MSD_radix_sort_r ( a, aux, first + count[i], first + count[i + 1] - 1, d - 1 ); 
    		}	
    		
    	}
    }
    void MSD_radix_sort ( int a[], int first, int last )
    {
    	int d = sizeof ( int ) - 1;
    	int* aux = malloc ( ( last - first + 1 ) * sizeof ( int ) );
    	
    	MSD_radix_sort_r ( a, aux, first, last, d );	
    	
    	free ( aux );
    }
    void LSD_radix_sort ( int a[], int first, int last )
    {
    	int i, j, d = sizeof ( int ) - 1 ;	
    	int* aux = malloc ( ( last - first + 1 ) * sizeof ( int ) );
    
    	for ( j = 0; j <= d  ; j++ )
    	{	
    		int count[UCHAR_MAX + 2] = { 0 };
    		
    		for ( i = first; i <= last; i++ )
    		{
    			count[digit ( a[i], j ) + 1]++;
    		}
    		for ( i = 1; i < UCHAR_MAX + 1; i++ )
    		{
        			count[i] += count[i - 1];
    		}
    		for ( i = first; i <= last; i++ )
    		{
        			aux[count[digit ( a[i], j )]++] = a[i];
    		}
    		for ( i = first; i <= last; i++ )
    		{
    			a[i] = aux[i - first];
    		}	
    	}  
    	free ( aux );
    }
    
    double measure_time ( pFunkcija sortiraj, int a[], int first, int last )
    {
    	clock_t start = clock ( );
    	
    	sortiraj ( a, first, last );
    
    	return ( double )  ( clock ( ) - start ) / CLOCKS_PER_SEC ;
    }
    double uniformna_distribucija ( int seed )
    {
    	return seed * ( 1.0 / ( RAND_MAX + 1.0 ) );
    }
    void generate_random ( int a[], int len , int num )
    {
    	int i;
    	
    	srand ( ( unsigned int ) ( time ( NULL ) ) );
    	for ( i = 0; i < len; i++ )
    	{
    		a [i] = ( int ) ( uniformna_distribucija ( rand ( ) ) * num );
    	}
    } 
    
    
    int main ( void ) 
    {
    	int a[N];
    	generate_random ( a, N, 999999 );
    	printf ( "Time of execution: %gs\n", measure_time ( LSD_radix_sort, a, 0, N - 1 ) );  
    	system ( "pause" );
    }
    I doubt these results are reliable, but I don't have other way to test it, because with more than 500000 elements I get seg fault.
    I use Dev-Cpp.
    I know that real strength of MSD sort is when applied on strings, but here is simple implementation applied to int array.
    What do you think about this implementation?
    Can you comment this?

    Thank you very much
    Last edited by Micko; 11-03-2005 at 03:50 PM.
    Gotta love the "please fix this for me, but I'm not going to tell you which functions we're allowed to use" posts.
    It's like teaching people to walk by first breaking their legs - muppet teachers! - Salem

  2. #2
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >because with more than 500000 elements I get seg fault
    You're probably hitting the internal limit of an array size. If you want more, you can dynamically allocate the memory. That's what I do when I need a HUGE block of memory for testing. It burns up your virtual memory though, so expect the computer to run reeeeeeeeaaaaaallllllllly slow. I actually upgraded my computer recently because I was doing a lot of hash function testing...

    >What do you thik about this implementation?
    The big advantage of MSD is that it can handle variable length items. If you're just sorting integers, LSD is perfectly acceptable, though MSD will typically run faster because it comes to a conclusion about the placement of an item sooner.
    My best code is written with the delete key.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. My own itoa()
    By maxorator in forum C++ Programming
    Replies: 18
    Last Post: 10-15-2006, 11:49 AM
  2. Using quicksort and radix sort for an anagram program
    By RazielX in forum C Programming
    Replies: 2
    Last Post: 05-03-2004, 09:33 AM
  3. radix sort and radix exchange sort.
    By whatman in forum C Programming
    Replies: 1
    Last Post: 07-31-2003, 12:24 PM
  4. Radix Sort, Strings, and Linked Lists
    By dark paladin in forum C++ Programming
    Replies: 2
    Last Post: 04-24-2003, 03:24 PM
  5. Radix Sort question
    By ripper079 in forum C++ Programming
    Replies: 5
    Last Post: 01-06-2003, 06:58 AM