Quicksort review

This is a discussion on Quicksort review within the C++ Programming forums, part of the General Programming Boards category; I currently have a custom written Quicksort algorithm. But I thought I'd ask for someone to look over it, because ...

  1. #1
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,182

    Quicksort review

    I currently have a custom written Quicksort algorithm. But I thought I'd ask for someone to look over it, because there is at least one thing I am unsure of.
    The code passes all the tests I have for it, but I cannot say if it's right or not.
    The question is: which of these checks are required and which ones are not? I have commented out a great deal, and like I mentioned, it passes all my tests, but that doesn't mean it will always pass all tests, so...

    Another thing is that I tried to parallellize the algorithm, but it turned out to be slower. Not quite sure why. Perhaps someone has some suggestions or ideas...

    Code:
    template<template<typename> class PivotType, typename RandomIt>
    void _Quicksort(RandomIt begin, RandomIt end, int Threads)
    {
    	if (end - begin <= 1)
    		return;
    
    	RandomIt pivot = PivotType<RandomIt>::GetPivot(begin, end); // Gets the pivot
    	RandomIt left = begin, right = end - 2;
    	std::iter_swap(pivot, end - 1); // Swaps the two values pointed by the two iterators
    	pivot = end - 1;
    
    	//while (/*left <= right && *//*right >= begin && *//*left < end*/1)
    	for (;;)
    	{
    		while (*left < *pivot /*&& *//*left <= right && *//*left < end*/) ++left;
    		while (*right > *pivot /*&& right >= left && */&& right > begin) --right;
    		if (left > right) break;
    		std::iter_swap(left, right);
    		if (right == begin || left == end - 1) break;
    		/*if (left < end) */++left;
    		/*if (right > begin) */--right;
    	}
    
    	std::iter_swap(left, pivot);
    
    	//boost::thread* pThreads[2] = { NULL, NULL };
    	//if (Threads-- > 0)
    		//pThreads[0] = new boost::thread(boost::bind(&_Quicksort<PivotType, RandomIt>, begin, left, 0));
    	//else
    		Quicksort<PivotType>(begin, left);
    	//if (Threads-- > 0)
    		//pThreads[1] = new boost::thread(boost::bind(&_Quicksort<PivotType, RandomIt>, left + 1, end, 0));
    	//else
    		Quicksort<PivotType>(left + 1, end);
    	//if (pThreads[0]) pThreads[0]->join();
    	//if (pThreads[1]) pThreads[1]->join();
    
    	//Quicksort<PivotType>(begin, left);
    	//Quicksort<PivotType>(left + 1, end);
    }
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.
    For information on how to enable C++11 on your compiler, look here.
    よく聞くがいい!私は天才だからね! ^_^

  2. #2
    The larch
    Join Date
    May 2006
    Posts
    3,573
    One thing you can do is to recurse only on one half of the range, which would halve the number of threads you are going to need. You can also sort small ranges with insertion sort, so as not to create a huge number of threads for very short ranges.
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  3. #3
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,182
    The plan is to spawn two threads only for the first two main recursions (by setting Threads = 2), but this is also slower, for some reason.
    Also tried adding insertion sort (or a call to std::sort) for relatively low number of elements to sort, but with no noticeable speed gain.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.
    For information on how to enable C++11 on your compiler, look here.
    よく聞くがいい!私は天才だからね! ^_^

  4. #4
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,269
    It seems to work correctly when adapted for use in my sorting demo program.
    I'm not surprised that you don't need all of the commented out checks. My Quicksort has even fewer and I've throughtly tested it. I'm only able to get away with that though because I make a copy of the pivot rather than merely swapping it around.

    Performing a final insertion sort pass instead of quicksorting right down to the smallest portions reduces the comparison count, but increases the number of assignments done. It's usually a few percent faster, but it depends on how long the data type takes to compare vs to assign.

    What you'd be doing right now is creating two new threads and then leaving the original thread just sitting there waiting, or performing two recursive calls. As Anon noted, I would only create at most one thread after each partition, then reuse the same thread for one of the portions, and create a thread or recurse for the other portion. Then after iterating you have the join. Well actually by iterating on the larger portion, you end up with multiple threads to join, but some of them are from the later partioning steps that occur during the iterative processing.

    I'd also have a cutoff below which I just perform a recursive call directly for whenever the thread creation overhead makes it not worth it. You don't have to limit yourself to only creating threads on the first split.
    Something like this (untested):
    Code:
    template<template<typename> class PivotType, typename RandomIt>
    void _Quicksort(RandomIt begin, RandomIt end)
    {
    	std::vector<boost::thread*> threadVec;
    	while (end - begin < 1)
    	{
    		RandomIt pivot = PivotType<RandomIt>::GetPivot(begin, end); // Gets the pivot
    		RandomIt left = begin, right = end - 2;
    		std::iter_swap(pivot, end - 1); // Swaps the two values pointed by the two iterators
    		pivot = end - 1;
    
    		for (;;)
    		{
    			while (*left < *pivot) ++left;
    			while (*pivot < *right && begin < right) --right;
    			if (right < left) break;
    			std::iter_swap(left, right);
    			if (right == begin || left == end - 1) break;
    			++left;
    			--right;
    		}
    
    		std::iter_swap(left, pivot);
    
    		// Find the smaller portion to create a new thread for, or recurse on.
    		// The larger portion will be done via iteration
    		if (end - (left + 1) < left - begin)
    		{
    			if (end - (left + 1) > 50) // arbitrary cutoff before deciding to use an extra thread
    				threadVec.push_back(new boost::thread(boost::bind(&_Quicksort<PivotType, RandomIt>, left + 1, end)));
    			else
    				_Quicksort<PivotType, RandomIt>(left + 1, end);
    			end = left;	// iterate on the smaller portion
    		}
    		else
    		{
    			if (begin, left > 50) // arbitrary cutoff before deciding to use an extra thread
    				threadVec.push_back(new boost::thread(boost::bind(&_Quicksort<PivotType, RandomIt>, begin, left)));
    			else
    				_Quicksort<PivotType, RandomIt>(begin, left);
    			begin = left + 1;	// iterate on the smaller portion
    		}
    	}
    	for (std::vector<boost::thread*>::const_reverse_iterator it = threadVec.begin(); it != threadVec.end(); ++it)
    		it->join();
    }
    One could additionally keep a count of how many additional threads have been created in total, and use recursion whenever that count is reached, but that would require locking (perhaps InterlockedIncrement) since any of the threads might wish to create another thread.

    You'll also notice my habbit of converting greater-than item comparisons into less-than item comparisons such that a less-than operator is all that is required for the sort.

    I didn't increase the constant of the outmost while loop to then perform a final insertion sort pass in this instance because the final insertion sort is then done in with just one thread which is bound to be slower. One could perform mini insertion sorts as you encounter small partitions though.

    How many cores do you have?
    Last edited by iMalc; 05-01-2010 at 04:24 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  5. #5
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,182
    Quote Originally Posted by iMalc View Post
    It seems to work correctly when adapted for use in my sorting demo program.
    I'm not surprised that you don't need all of the commented out checks. My Quicksort has even fewer and I've throughtly tested it. I'm only able to get away with that though because I make a copy of the pivot rather than merely swapping it around.
    Yeah, likewise, this has also been thoroughly tested.
    What difference would making a copy of the pivot do?

    Performing a final insertion sort pass instead of quicksorting right down to the smallest portions reduces the comparison count, but increases the number of assignments done. It's usually a few percent faster, but it depends on how long the data type takes to compare vs to assign.
    The tests were done on simple ints.

    You'll also notice my habbit of converting greater-than item comparisons into less-than item comparisons such that a less-than operator is all that is required for the sort.
    That is a good idea, actually. I didn't think of that.

    How many cores do you have?
    2.

    This is an interesting concept. But I still think it tries to spawn way too many threads. Something needs to be done about that.
    In some scenarios, Quicksort could call itself a ridiculous amount of times.
    Ah, I don't know. This needs to be tested, I suppose.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.
    For information on how to enable C++11 on your compiler, look here.
    よく聞くがいい!私は天才だからね! ^_^

  6. #6
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    3,826
    >_<

    $#%&!

    I'm sorry I didn't notice this yesterday. I may have been able to run it through my suite.

    *shrug*

    Anyway! Elysia, I strongly suggest you use an instantiated template, a type, with a nested accessor instead of template template parameter. (This as the nested `rebind' of the standard allocators.) The approach gives you a lot for very little effort.

    Soma

  7. #7
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,182
    Quote Originally Posted by phantomotap View Post
    Anyway! Elysia, I strongly suggest you use an instantiated template, a type, with a nested accessor instead of template template parameter. (This as the nested `rebind' of the standard allocators.) The approach gives you a lot for very little effort.
    Would you show an example of such an approach?
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.
    For information on how to enable C++11 on your compiler, look here.
    よく聞くがいい!私は天才だからね! ^_^

  8. #8
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    3,826
    O_o

    I already did. You literally need to look at the allocator parameter of the STL container classes, how that parameter is used, and how the technique manifests itself.

    Soma

  9. #9
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,182
    I'll take another look. But at a first glance, I have no idea how it's accomplished. Vector takes a type, yet there is no default type for allocator parameter, and the allocator parameter is not required.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.
    For information on how to enable C++11 on your compiler, look here.
    よく聞くがいい!私は天才だからね! ^_^

  10. #10
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    3,826
    *shrug*

    I'll be around for about another hour. If you don't have it by then, I'll post an explanation.

    Soma

  11. #11
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,182
    Hehehe. Digging through the source of M$'s STL is a difficult thing, but let me see if I can find anything.

    So let's see... by looking at vector, this is what I gather so far:
    - It takes a type for the allocator (let's call is _Alloc).
    - It has a member of type _Alloc<_Ty> (where _Ty is the vector type). This despite being able to pass in a type for the allocator, ie std::allocator<some_weird_type>.
    - It calls the allocator object's member function allocate to allocate some memory.
    What I fail to see is how this is better (if this is right).
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.
    For information on how to enable C++11 on your compiler, look here.
    よく聞くがいい!私は天才だからね! ^_^

  12. #12
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    3,826
    *shrug*

    There exists an almost one-to-one translation for almost any use case for template template parameters.

    I can think of far to many limits of template template parameters to even list. Using a nested accessor gives you a lot of extra utility for almost no extra work.

    Soma

    Code:
    #include <iostream>
    
    namespace example1
    {
    	template
    	<
    		typename T
    	>
    	struct policy
    	{
    		static void execute
    		(
    			const T & t_f
    		)
    		{
    			std::cout << t_f << '\n';
    		}
    	};
    
    	template
    	<
    		typename type_T,
    		template <typename> class policy_C = policy
    	>
    	struct use_policy
    	{
    		static void execute
    		(
    			const type_T & t_f
    		)
    		{
    			typedef policy_C<type_T> policy_type;
    			policy_type::execute(t_f);
    		}
    	};
    }
    
    namespace example2
    {
    	template
    	<
    		typename T
    	>
    	struct policy
    	{
    		template
    		<
    			typename TR
    		>
    		struct rebind
    		{
    			typedef policy<TR> other;
    		};
    		static void execute
    		(
    			const T & t_f
    		)
    		{
    			std::cout << t_f << '\n';
    		}
    	};
    
    	struct void_type{};
    
    	template
    	<
    		typename type_T,
    		typename policy_T = policy<void_type>
    	>
    	struct use_policy
    	{
    		static void execute
    		(
    			const type_T & t_f
    		)
    		{
    			typedef typename policy_T::template rebind<type_T>::other policy_type;
    			policy_type::execute(t_f);
    		}
    	};
    }
    
    namespace example3
    {
    	template
    	<
    		typename T
    	>
    	struct policy
    	{
    		template
    		<
    			typename TR
    		>
    		struct rebind
    		{
    			typedef policy<TR> other;
    		};
    		static void execute
    		(
    			const T & t_f
    		)
    		{
    			std::cout << t_f << '\n';
    		}
    	};
    
    	template <> struct policy<void>
    	{
    		template
    		<
    			typename TR
    		>
    		struct rebind
    		{
    			typedef policy<TR> other;
    		};
    	};
    
    	template
    	<
    		typename type_T,
    		typename policy_T = policy<void>
    	>
    	struct use_policy
    	{
    		static void execute
    		(
    			const type_T & t_f
    		)
    		{
    			typedef typename policy_T::template rebind<type_T>::other policy_type;
    			policy_type::execute(t_f);
    		}
    	};
    }
    
    namespace example4
    {
    	template
    	<
    		typename T0,
    		typename T1 = void,
    		typename T2 = void,
    		typename T3 = void,
    		typename T4 = void,
    		typename T5 = void,
    		typename T6 = void,
    		typename T7 = void
    	>
    	struct policy
    	{
    		template
    		<
    			typename TR
    		>
    		struct rebind
    		{
    			typedef policy<TR> other;
    		};
    		static void execute
    		(
    			const T0 & t_f
    		)
    		{
    			std::cout << t_f << '\n';
    		}
    	};
    
    	template <> struct policy<void>
    	{
    		template
    		<
    			typename TR
    		>
    		struct rebind
    		{
    			typedef policy<TR> other;
    		};
    	};
    
    	template
    	<
    		typename type_T,
    		typename policy_T = policy<void>
    	>
    	struct use_policy
    	{
    		static void execute
    		(
    			const type_T & t_f
    		)
    		{
    			typedef typename policy_T::template rebind<type_T>::other policy_type;
    			policy_type::execute(t_f);
    		}
    	};
    }
    
    int main()
    {
    	example1::use_policy<int>::execute(5);
    	example2::use_policy<int>::execute(5);
    	example3::use_policy<int>::execute(5);
    	example4::use_policy<int>::execute(5);
    	return(0);
    }

  13. #13
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,182
    Whoa, you showed about 4 examples of doing almost the same thing.
    So what is it the nested accessor accomplishes in this example which nested template parameters does not, except a more tricky syntax?
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.
    For information on how to enable C++11 on your compiler, look here.
    よく聞くがいい!私は天才だからね! ^_^

  14. #14
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,269
    Quote Originally Posted by Elysia View Post
    This is an interesting concept. But I still think it tries to spawn way too many threads. Something needs to be done about that.
    In some scenarios, Quicksort could call itself a ridiculous amount of times.
    Ah, I don't know. This needs to be tested, I suppose.
    The number of recursive calls, or thread spawns) is reduced by two things.
    1. It only recurses, or spawns a thread on the smaller portion
    2. There is a cutoff of 50 before another thread is spawned

    Number 1 is also the typical solution to preventing specifically crafted worst-case input ordering from blowing the stack. The case that uses the most stack becomes the same as the case where the sorting partitions equally, and is hence the fastest.
    For number 2, I didn't actually run this code (I haven't installed boost) so I don't know what a good cut-off for the minimum number before creating an additional thread is. This is why I would also do something like:
    Code:
    localThreadCount = InterlockedIncrement(threadCount);
    if (end - (left + 1) > 50 && localThreadCount < 8) // Assuming we want a limit of 8 threads.
        threadVec.push_back(new boost::thread(boost::bind(&_Quicksort<PivotType, RandomIt>, left + 1, end)));
    else
    {
        InterlockedDecrement(threadCount);
        _Quicksort<PivotType, RandomIt>(left + 1, end);
    }
    end = left;	// iterate on the smaller portion
    And similarly, use InterlockedDecrement in the loop performing the joins.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  15. #15
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,182
    I did some research, and no matter how I did, multi-threading was always slower, sometimes by as much as 50%.
    I made the tests take about 10s, and the multi-threaded approach clearly used 100% cpu and took 15s vs the 10s without multi-threading.
    I simply copied the first partition step, spawned two threads, and let it call itself recursively.
    So... I don't know. Might be a hardware issue.

    However, the idea to use iteration for one part of the next partition step was a nice idea. It saved about a second, or 8% or some such. Nice speed boost.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.
    For information on how to enable C++11 on your compiler, look here.
    よく聞くがいい!私は天才だからね! ^_^

Page 1 of 2 12 LastLast
Popular pages Recent additions subscribe to a feed

Similar Threads

  1. problems with quicksort!
    By coni in forum C Programming
    Replies: 3
    Last Post: 10-03-2009, 02:29 AM
  2. Scalability problems with Quicksort
    By hbejkosa in forum C++ Programming
    Replies: 3
    Last Post: 12-26-2008, 10:26 PM
  3. Tutorial review
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 11
    Last Post: 03-22-2004, 08:40 PM
  4. quicksort problem
    By spinner in forum C Programming
    Replies: 9
    Last Post: 05-04-2003, 02:02 PM
  5. Quicksort
    By Nutshell in forum C Programming
    Replies: 1
    Last Post: 01-15-2002, 08:42 AM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21