Thread: Confused about "Patience Sorting"

  1. #1
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74

    Confused about "Patience Sorting"

    Patience Sorting is described as an O(n log n) sorting algorithm on Wikipedia.
    Patience sorting - Wikipedia, the free encyclopedia
    But my patience sort runs in O(n√n), and I don't know why. Would someone have a look at my implementation?

  2. #2
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74
    This is the original implementation, which runs in O(n√n) time.
    Code:
    #include <vector>
    using namespace std;

    // Naive patience sort: both phases scan the piles linearly.  A random
    // permutation of n elements forms roughly sqrt(n) piles on average,
    // which is what makes this O(n*sqrt(n)).
    template<typename DataType> class PatienceSorter
    {
    public:
        static void sort(vector<DataType>& elements)
        {
            int insertPos, currentPile, pileCount = 0;
            vector<vector<DataType> > pileList;
    
            pileList.resize(elements.size());
            // Phase 1: deal each element onto the leftmost pile whose top
            // card is >= it, scanning the piles from left to right.
            for(insertPos = 0; insertPos < elements.size(); insertPos++)
            {
                currentPile = 0;
                while(!(pileList.at(currentPile).empty()) && pileList.at(currentPile).back() < elements.at(insertPos))
                    currentPile++;
                pileList.at(currentPile).push_back(elements.at(insertPos));
    
                pileCount += (currentPile >= pileCount); // a new pile was opened
            }
    
            // Phase 2: repeatedly remove the smallest top card among all
            // piles (another linear scan) and write it back to the output.
            for(insertPos = 0; insertPos < elements.size(); insertPos++)
            {
                currentPile = -1;
                for(int i = 0; i < pileCount; i++)
                    if(!pileList.at(i).empty())
                        if(currentPile == -1 || pileList.at(currentPile).back() > pileList.at(i).back())
                            currentPile = i;
                elements.at(insertPos) = pileList.at(currentPile).back();
                pileList.at(currentPile).pop_back();
            }
            return;
        }
    };
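
    A minimal driver to exercise it might look like this (the element count and seed are arbitrary choices for illustration, not the ones from my benchmark):
    Code:
    #include <cstdio>
    #include <cstdlib>
    #include <vector>
    using namespace std;

    // The PatienceSorter defined above is assumed to be in scope here.

    int main()
    {
        vector<int> numbers(100000);
        srand(12345); // fixed seed so runs are repeatable
        for(size_t i = 0; i < numbers.size(); i++)
            numbers[i] = rand();

        PatienceSorter<int>::sort(numbers);

        // Verify the output is non-decreasing.
        for(size_t i = 1; i < numbers.size(); i++)
            if(numbers[i - 1] > numbers[i])
            {
                printf("NOT sorted at %lu\n", (unsigned long)i);
                return 1;
            }
        printf("sorted\n");
        return 0;
    }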

  3. #3
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74
    Then I used a binary search, as described on the wiki page. The first part of the algorithm now runs in O(n log n), but the second part is still O(n√n).
    Code:
    #include <vector>
    using namespace std;

    template<typename DataType> class PatienceSorter
    {
    public:
        static void sort(vector<DataType>& elements)
        {
            int insertPos, currentPile, pileCount, emptyPileCount;
            vector<vector<DataType> > pileList;

            if(elements.empty()) // guard: front() below is undefined on empty input
                return;

            pileList.resize(elements.size());
            pileList.front().push_back(elements.front());
            pileCount = 1;
    
            // Phase 1: binary-search for the leftmost pile whose top card
            // is >= the element; O(log p) per element.
            for(insertPos = 1; insertPos < elements.size(); insertPos++)
            {
                int middle, left = 0, right = pileCount - 1;
                while(left <= right)
                {
                    middle = (left + right) / 2;
                    elements.at(insertPos) > pileList.at(middle).back() ? left = middle + 1 : right = middle - 1;
                }
                pileCount += (left >= pileCount); // a new pile was opened
                pileList.at(left).push_back(elements.at(insertPos));
            }
    
            emptyPileCount = 0;
            // Phase 2: the pile tops stay in ascending order, so the first
            // non-empty pile always holds the smallest remaining card.
            for(insertPos = 0; insertPos < elements.size(); insertPos++)
            {
                currentPile = emptyPileCount;
                elements.at(insertPos) = pileList.at(currentPile).back();
                pileList.at(currentPile).pop_back();
    
                if(pileList.at(currentPile).size() == 0)
                {
                    emptyPileCount++;
                    continue;
                }
    
                // The fresh top card may break the ascending order of the
                // pile tops: binary-search for the pile it should merge into.
                if(currentPile + 1 < pileCount && pileList.at(currentPile).back() > pileList.at(currentPile + 1).back())
                {
                    int middle, left = currentPile + 1, right = pileCount - 1;
                    while(left <= right)
                    {
                        middle = (left + right) / 2;
                        pileList.at(currentPile).back() > pileList.at(middle).back() ? left = middle + 1 : right = middle - 1;
                    }
                    if(left >= pileCount)
                    {
                        left = pileCount - 1;
                    }
                    mergePiles(pileList.at(currentPile), pileList.at(left));
    
                    emptyPileCount++;
                }
            }
            return;
        }
    private:
        // Merge two piles (each sorted descending, smallest card on top) into
        // destinationPile.  Every element of both piles gets copied, and these
        // copies are what push the second phase past O(n log n).
        static void mergePiles(vector<DataType>& sourcePile, vector<DataType>& destinationPile)
        {
            vector<DataType> newPile;
            int sourceScanPos = 0, destinationScanPos = 0;
    
            while(sourceScanPos < sourcePile.size() && destinationScanPos < destinationPile.size())
            {
                if(sourcePile.at(sourceScanPos) > destinationPile.at(destinationScanPos))
                    newPile.push_back(sourcePile.at(sourceScanPos++));
                else
                    newPile.push_back(destinationPile.at(destinationScanPos++));
            }
            newPile.insert(newPile.end(), sourcePile.begin() + sourceScanPos, sourcePile.end());
            newPile.insert(newPile.end(), destinationPile.begin() + destinationScanPos, destinationPile.end());
    
            destinationPile = newPile;
            return;
        }
    };

  4. #4
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74
    The same thing happens with "Strand Sort". My strand sort seems to be O(n√n), but Wikipedia says it is an O(n log n) algorithm.
    Strand sort - Wikipedia, the free encyclopedia

  5. #5
    Master Apprentice phantomotap
    Join Date
    Jan 2008
    Posts
    5,108
    How are you determining the algorithmic complexity of your implementation?

    You do realize that it is the largest overall growth potential that is measured? An example might be a linear, O(n), function to preprocess input before the main processing phase. The preprocessing phase might be expensive, but it quickly becomes a negligible portion of the overall performance characteristics if the main processing phase is O(n^2). If you count every aspect of the implementation, or fail to consider the phase with the largest growth potential, you get skewed results.

    (I don't have the time or the inclination just now to read your source.)

    Soma

  6. #6
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74
    I determined the time complexity mainly from measured performance. These are my test results:
    Code:
    Sorting 100000 numbers ranged from 0 to 2147483647
    ================ Test Result ================
    Size Filename          Time		Space
    2509 A-Quick.cpp       0.176s		728k
    2556 A-Median.cpp      0.192s		728k
    2379 A-Weakheap.cpp    0.204s		739k
    2450 A-Merge.cpp       0.208s		809k
    1665 A-Comb.cpp        0.212s		728k
    1930 A-Shell.cpp       0.212s		728k
    2259 A-Heap.cpp        0.216s		728k
    3232 A-Tree2.cpp       0.228s		729k
    3324 A-Bookmark2.cpp   0.240s		736k
    2560 A-Tree.cpp        0.244s		2258k
    5344 A-Library.cpp     0.264s		748k
    2964 A-Bookmark.cpp    0.292s		2415k
    3997 A-Tournament.cpp  0.328s		2789k
    4214 A-Skiplist.cpp    0.396s		3390k
    3936 B-Patience2.cpp   0.568s		736k
    2464 B-Pool.cpp        0.612s		1310k
    2255 B-Patience.cpp    0.800s		768k
    2555 B-Beap.cpp        0.864s		728k
    2619 B-Strand.cpp      1.480s		729k
    Code:
    Sorting 250000 numbers ranged from 0 to 2147483647
    ================ Test Result ================
    Size Filename          Time		Space
    2509 A-Quick.cpp       0.444s		728k
    2556 A-Median.cpp      0.464s		728k
    2379 A-Weakheap.cpp    0.528s		738k
    1665 A-Comb.cpp        0.552s		728k
    2259 A-Heap.cpp        0.560s		730k
    2450 A-Merge.cpp       0.564s		819k
    1930 A-Shell.cpp       0.584s		728k
    2560 A-Tree.cpp        0.700s		2229k
    3232 A-Tree2.cpp       0.768s		729k
    3324 A-Bookmark2.cpp   0.772s		734k
    5344 A-Library.cpp     0.840s		834k
    2964 A-Bookmark.cpp    0.888s		2291k
    3997 A-Tournament.cpp  0.928s		2751k
    4214 A-Skiplist.cpp    1.148s		3336k
    4065 B-Patience2.cpp   1.928s		30390k
    2464 B-Pool.cpp        2.488s		1769k
    2255 B-Patience.cpp    3.464s		864k
    2619 B-Strand.cpp      5.576s		729k
    2555 B-Beap.cpp        7.072s		728k
    I also counted how many times the inner loops execute, and I'm sure my implementations of patience sort and strand sort are O(n√n).

  7. #7
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74
    Can no one help me?

  8. #8
    Master Apprentice phantomotap
    Join Date
    Jan 2008
    Posts
    5,108
    I've looked over your source.

    I cannot, at a glance, determine what you are seeing. A cursory look at those times does not point clearly to either O(n * log(n)) or O(n * sqrt(n)). It looks to me as if you are paying a O(c) cost, a O(n * log(n)) cost, a potential O(n + (n - 1) + (n - 2) + [...] + 0) cost, and a O(log n) cost. This "totals" to O(n * log(n)) regardless of the final result.

    I've never worked specifically with "Patience Sort". If I have time I may build a "purpose built" version for you to compare your work with. I imagine that the problem is as simple as your use of a general purpose expandable array structure when you need, at least, a general purpose expandable array and simple stack. The `std::vector<std::vector<???>>' setup you are using would be costly, depending on the implementation, if you ever need to "move" a stack--as would be needed for O(n * log(n)) performance. A "purpose built" version would perform trivially better, but the O(n * log(n)) growth is still more relevant.

    My advice, in the intervening time, would be to use a `std::vector<???>' (where ??? is the element type to be sorted) of size `$$$' (where $$$ is the number of elements to be sorted), a `std::vector<std::vector<??? *> *>' of size `$$$', and a `std::vector<std::vector<??? *>>' of size `$$$'. The cost would still be O(n). (It is just a larger factor than normal.) This would allow you to allocate memory only once, trivially reorder stacks, and trivially "grow" the stacks. The problem with this sort of implementation is indirection. The cost "per n" is higher, but this cost is constant and never scales. With your solution the cost is lower, variable, and potentially scales.
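
    A rough sketch of the shape I mean (the names here are mine, and this is only the skeleton of the idea, not a finished implementation):
    Code:
    #include <algorithm> // swap
    #include <cstddef>   // size_t
    #include <vector>
    using namespace std;

    // The piles live in `storage' and never move; `piles' is an ordering
    // layer of pointers over them, so reordering the piles means swapping
    // pointers instead of copying elements.
    template<typename DataType> struct PileSet
    {
        vector<vector<DataType *> > storage; // the actual stacks
        vector<vector<DataType *> *> piles;  // the ordering layer

        explicit PileSet(size_t elementCount)
            : storage(elementCount), piles(elementCount)
        {
            for(size_t i = 0; i < elementCount; i++)
                piles[i] = &storage[i];
        }

        // Moving a pile to another slot is O(1) regardless of its size.
        void swapPiles(size_t a, size_t b) { swap(piles[a], piles[b]); }
    };
    With that indirection layer in place, "moving" a stack during the second phase never touches the elements themselves.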

    The reason people may not be helping you may be a simple thing indeed. It doesn't look like you have a bad implementation of the core algorithm; it seems you've only chosen "bad" primitives to build it with. You could probably get closer to O(n * log(n)) simply by replacing those primitives with "purpose built" ones.

    Soma

  9. #9
    Algorithm Dissector iMalc
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    I may not have posted help with this one yet, but I've been watching this thread with interest!

    I tried implementing patience sorting about a month ago. I didn't do the binary search bit to find the best pile though, so the runtime of mine would be even worse than yours.
    I was under the impression that a van Emde Boas tree was necessary to reach the theoretical O(n*logn) upper bound, but I could be wrong about that.

    Anyway, besides timing the runs, have you considered simply measuring the number of comparisons and assignments? To do this, provide a templated type whose overloaded operators increment global counters (see the sketch below). Whether or not the counts come out near n*log(n) should point to where the problem lies.
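
    Something along these lines (just a sketch; the wrapper and counter names are arbitrary):
    Code:
    #include <cstddef>

    // Globals bumped by the wrapper's overloaded operators.
    static std::size_t g_comparisons = 0;
    static std::size_t g_assignments = 0;

    // Sort a vector<Counted<int> > instead of a vector<int> to get counts.
    template<typename T> struct Counted
    {
        T value;

        Counted(const T& v = T()) : value(v) {}
        Counted(const Counted& rhs) : value(rhs.value) { ++g_assignments; }
        Counted& operator=(const Counted& rhs)
        {
            ++g_assignments;
            value = rhs.value;
            return *this;
        }
        bool operator<(const Counted& rhs) const { ++g_comparisons; return value < rhs.value; }
        bool operator>(const Counted& rhs) const { ++g_comparisons; return value > rhs.value; }
    };
    After a run, print both counters and compare them against n*log2(n) for the same n.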
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  10. #10
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74
    Building the piles is O(n log n) now. The problem is how to take the numbers off the piles in O(n log n) time.

  11. #11
    Master Apprentice phantomotap
    Join Date
    Jan 2008
    Posts
    5,108
    I was under the impression that a van Emde Boas tree was necessary to reach the theoretical O(n*logn) upper bound, but I could be wrong about that.
    The "stack" placement bit is obviously O(e * log(e)) if you have a O(log(p)) search available. (Where `e' is the number of elements and `p' is the number of piles and `p' is never larger than, but may be equal to, `e'.) For each `e' you have a `log(p)'.

    With the noted modification, as explained in the "Wikipedia", the "recovery" bit is, not as obviously, O(e * log(e)) if you have a O(log(p)) search available. First you "recover" the `n' lowest value elements of the first "pile". (This is obviously O(n) since the "piles" are already "sorted".) After recovery of the top `n' elements from the first pile, you remove, conceptually or actually, the first "pile" from the set of "piles". (This is either O(c) or uninteresting because you don't actually need to do it.) You then search, using your O(log(p)) algorithm, for where to put the "pile" you've just removed. (You are effectively using your O(log(p)) search as an O(log(p)) insert algorithm for an already sorted set.) After moving the removed "pile" to its proper place, the "piles" are, again, sorted from lowest to highest value. You have `p' "piles" that must remain sorted and a O(p) algorithm for keeping them sorted. For each `p' you have a `log(p)'.

    [Edit]
    For each `p' you have a `log(p)'.
    This was an obvious mistake.
    For each `e' you have a `log(p)'.
    It should be this. (And `p' is never larger than `e'.)
    [/Edit]

    I'm not going to post the "lump sum" "big O" for the fiddly bits because they are irrelevant in the face of the largest growth potential which is O(n * log(n)).

    My understanding is that the van Emde Boas tree is only useful for implementations with keys that have a "prefixed" state. (A key state that can be divided into smaller keys.) With such an implementation, using the van Emde Boas tree, pulling the data off the tree in order, with an overall O(n * log(n)) time, becomes a trivial part of the second stage. You can still "hit" the overall O(n * log(n)) time without the tree, but the second stage is no longer trivial.

    Whether or not the counts come out near n*log(n) should point to where the problem lies.
    *shrug*

    The "top" values of the stacks, assuming the first part of the algorithm is successful, are guaranteed to be in order. A modified binary search, with an extra comparison, would give a false reading with such a simple metric but would perform better on data sets of generally random values containing many sorted subsets. (I'm not positive. It is just a strong "feeling".)

    The problem is how to take the numbers off the piles in O(n log n) time.
    The question is, how are you not getting O(n * log(n)) time?

    I've looked over your source again and I see nothing wrong with the implementation. You are almost certainly not getting the true O(n * log(n)) because of the primitives you are using. Even the traditional, and most common, iterative implementation of "Binary sort" is not truly O(n * log(n)) because of exactly two operations overhead per invocation. It is small overhead, but in any simple metric, accumulated over millions of times, the algorithm will appear skewed--not really O(n * log(n)).

    I have no interest in dissecting your source. You are obviously capable of doing it yourself. Insert some additional checks, such as what iMalc suggested, and post a complete testbed. If you do that and still have no idea where to begin I'll take another look. But I'm not promising anything because those times are reasonably O(n * log(n)) even if not truly O(n * log(n)).

    Soma
    Last edited by phantomotap; 03-22-2010 at 02:09 AM.

  12. #12
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74
    You mean testing the code like this?
    Code:
    #include <algorithm> // swap
    #include <cstdio>    // fprintf
    #include <vector>
    using namespace std;

    template<typename DataType> class PatienceSorter
    {
    public:
        static void sort(vector<DataType>& elements)
        {
            int insertPos, pileCount;
            vector<vector<DataType> > pileList;

            int testPart1 = 0, testPart2 = 0; // inner-loop counters

            if(elements.empty()) // guard: front() below is undefined on empty input
                return;

            pileList.resize(elements.size());
            pileList.front().push_back(elements.front());
            pileCount = 1;
    
            for(insertPos = 1; insertPos < elements.size(); insertPos++)
            {
                int middle, left = 0, right = pileCount - 1;
                while(left <= right)
                {
                    testPart1++; // test the inner loop
                    middle = (left + right) / 2;
                    elements.at(insertPos) > pileList.at(middle).back() ? left = middle + 1 : right = middle - 1;
                }
                pileCount += (left >= pileCount);
                pileList.at(left).push_back(elements.at(insertPos));
            }
    
            int firstPile = 0; // index of the leftmost non-empty pile
    
            for(insertPos = 0; insertPos < elements.size(); insertPos++)
            {
                elements.at(insertPos) = pileList.at(firstPile).back();
                pileList.at(firstPile).pop_back();
    
                if(pileList.at(firstPile).empty())
                {
                    firstPile++;
                    continue;
                }
    
                // Bubble the refreshed pile rightward until the tops are in
                // ascending order again; this linear walk is the culprit.
                int currentPile = firstPile;
                while(currentPile + 1 < pileCount && pileList.at(currentPile).back() > pileList.at(currentPile + 1).back())
                {
                    swap(pileList.at(currentPile), pileList.at(currentPile + 1));
                    currentPile++;
                    testPart2++; // test the inner loop
                }
            }
            fprintf(stderr, "Part1: %d Part2: %d\n", testPart1, testPart2);
            return;
        }
    };
    This is the result when sorting 10000 numbers:
    Part1: 69311 Part2: 540780

    It's obvious that the first part is O(n log n) and the second part is O(n√n). For n = 10000, n·log2(n) ≈ 130000 while n·√n = 1000000; 69311 is in line with the former and 540780 with the latter.

  13. #13
    Master Apprentice phantomotap
    Join Date
    Jan 2008
    Posts
    5,108
    It's obvious that the first part is O(n log n) and the second part is O(n√n).
    I've seen that you are of more than willing diligence, but this is no longer valuable. You aren't listening. I'll tell you the problem again, but this time I'll be bloody specific. I've actually hinted several times.

    The problem is your use of `std::vector<std::vector<???>>'. This is not an opinion. This is fact. You can not insert a value at an arbitrary location into an array in O(log(n)) time. For the second phase of the algorithm to be O(n * log(n)) time the insertion must be O(log(n)) or better. You must use primitives that support a O(log(n)) insertion. This is not a suggestion. This is a statement of fact.

    The primitives you use in the implementation of an algorithm determine its performance just as much as the natural performance of the algorithm. If you implement an O(n^2) algorithm using a primitive with poor performance you may wind up with an O(2^n) implementation. I've seen it.

    For this reason, that the performance of the primitives counts, you can't just grab any random tree, priority queue, or stack. You can use a heap. You can use a decent priority queue. You can use a list if you understand how to implement binary search with nothing but forward iterators. (Even then you'd pay the cost of advancing iterators by a known value.) You can even use a van Emde Boas tree if you like. It doesn't matter so long as it has an O(log(n)) or better insert and an O(log(n)) or better search. And one doesn't imply the other.
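
    As an illustration, here is a sketch of the "recovery" phase done over a standard binary heap of pile indices, keyed by each pile's top card. (The names are mine, and a heap is only one of the acceptable primitives.)
    Code:
    #include <queue>
    #include <vector>
    using namespace std;

    // Orders pile indices by their top cards so that priority_queue,
    // a max-heap by default, behaves as a min-heap over the piles.
    template<typename DataType> struct TopGreater
    {
        const vector<vector<DataType> >* piles;
        explicit TopGreater(const vector<vector<DataType> >* p) : piles(p) {}
        bool operator()(int a, int b) const
        { return (*piles)[a].back() > (*piles)[b].back(); }
    };

    // Pop the globally smallest top card, then re-key the pile it came
    // from.  Every heap operation is O(log p), so recovering all n
    // elements costs O(n * log(p)), which is O(n * log(n)) at worst.
    template<typename DataType>
    void recoverFromPiles(vector<vector<DataType> >& pileList, vector<DataType>& out)
    {
        TopGreater<DataType> comparator(&pileList);
        priority_queue<int, vector<int>, TopGreater<DataType> > heap(comparator);

        for(int i = 0; i < (int)pileList.size(); i++)
            if(!pileList[i].empty())
                heap.push(i);

        out.clear();
        while(!heap.empty())
        {
            int p = heap.top();
            heap.pop();
            out.push_back(pileList[p].back()); // smallest remaining element
            pileList[p].pop_back();
            if(!pileList[p].empty())
                heap.push(p); // its top changed; reinsert to restore order
        }
    }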

    You are lucky I was interested in the algorithm; I wouldn't have bothered repeating myself otherwise. If you want the best performance out of this algorithm, you are going to have to use the right tools.

    Soma

  14. #14
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74
    Quote Originally Posted by phantomotap
    I've seen that you are of more than willing diligence, but this is no longer valuable. You aren't listening. I'll tell you the problem again, but this time I'll be bloody specific. I've actually hinted several times.
    I'm so sorry I speak poor English and sometimes I can't understand what you mean.

    But I really wonder whether a vector<vector<DataType> > really slows down the algorithm that much. Of course I can use a tree or a heap, but then why wouldn't I just use heap sort or tree sort directly? I just think patience sort should have its own way of doing this.

    Anyway, I will try a list<> instead of vector<vector<> >. Thanks for your patient reply.

  15. #15
    Registered User
    Join Date
    Mar 2010
    Location
    China
    Posts
    74
    Oh, I can hardly come up with an O(n log n) method to merge those stupid piles. In fact, a heap or something similar can do it effectively, but what I want is the "pure" patience sort.

    Or maybe patience sort really is an O(n√n) sorting algorithm, as the following page suggests.
    http://www.cs.princeton.edu/courses/...s/patience.pdf
