Thread: Sorting methods

  1. #1
    Bored Programmer
    Join Date
    Jul 2009
    Location
    Tomball, TX
    Posts
    428

    Sorting methods

    Hey,

    Since I never learned programming in school I figured I'd visit some of the common problem sets that most programs put you through. One of these problems sets was learning all the different sorting methods.

    On the wikipedia quick sort page it says that insertion sort is faster on smaller arrays and alot of quick sorts use insertion sorts when they get down into the small arrays in the recursion. I was thinking about adding insertion sort into my quick sort algorithm to see if I can increase its speed.

    So far I have clocked it at sorting 10 million randomly generated integers in 10-11 seconds. I wanted to see if I could increase that time, but every simulated test I have done shows that the quicksort is outperforming the insertion sort or at least equivalent even on small arrays of 10 or less. Since both finish the sorts in under a second its hard to notice a difference, and when I put a integar to show how many loops are done the quicksort is still a smaller number.

    Any suggestions?

    Code:
    void ReadAndSort::QuickSort(int left, int right)
    {
      int pivot = Numbers[(left+right)/2];
      int x = left;
      int y = right;
      while(x <= y)
      {
        while(Numbers[x] < pivot)
          x++;
        while(Numbers[y] > pivot)
          y--;
        if(x <= y)
        {
          Swap(Numbers[x], Numbers[y]);
          x++;
          y--;
        }   
        timesran++; 
      }
      if(left < y)
        QuickSort(left, y);
      if(x < right)
        QuickSort(x, right); 
    }
    
    void ReadAndSort::InsertionSort()
    {
      double Timeslooped = 0;
      int j = 0;
      for(int x = 1; x < Numbers.size()-1; x++)
      {
        j = x;
        while( j > 0 && Numbers[j-1] > Numbers[j] )
        {
          int Temp = Numbers[j];
          Numbers[j] = Numbers[j-1];
          Numbers[j-1] = Temp;
          j--;
          Timeslooped++;
        }
      }
      cout<<"Insertion sort took "<<Timeslooped<<" iterations."<<endl;
    }
    Just to be clear the ReadAndSort class reads all the numbers in a file then sorts them then saves back to the file the newly sorted result. Numbers is a vector that it uses to hold all the numbers while sorting.

    Also if you notice any glaring weaknesses in the code please let me know. Thanks!
    Last edited by Lesshardtofind; 12-20-2012 at 02:28 PM.
    Virtual reality hello world http://www.rodneybrothers.com/vr/vrh...rld/index.html in html and javascript.
    Viewable with dodocase, google cardboard, OR, and other compatible VR gear.

  2. #2
    Registered User
    Join Date
    Apr 2006
    Posts
    2,149
    If your small arrays get sorted too quickly, time how long it takes to sort a whole bunch of them.
    It is too clear and so it is hard to see.
    A dunce once searched for fire with a lighted lantern.
    Had he known what fire was,
    He could have cooked his rice much sooner.

  3. #3
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    You're implementation of QuickSort doesn't currently call InsertionSort for small subarrays, so how are you testing this?
    See my version of QuickSort here and try varying the CUTOFF macro. Make it small enough and InsertionSort is no longer used; see how much slower that is.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  4. #4
    Bored Programmer
    Join Date
    Jul 2009
    Location
    Tomball, TX
    Posts
    428
    I am sorry I think you misunderstood me.

    I was testing QuickSort and Insertion sort side by side to see if the Insertion sort was actually faster than the Quicksort on small arrays just to see if it was worth adjusting the code.

    Very usefull link though! Looks like you spent alot of time on this.
    Last edited by Lesshardtofind; 12-20-2012 at 11:29 PM.
    Virtual reality hello world http://www.rodneybrothers.com/vr/vrh...rld/index.html in html and javascript.
    Viewable with dodocase, google cardboard, OR, and other compatible VR gear.

  5. #5
    Bored Programmer
    Join Date
    Jul 2009
    Location
    Tomball, TX
    Posts
    428
    Hmm I added it to the QuickSort and it ended up being 2-3 seconds slower on 10,000,000 numbers maybe there is something wrong with my insertion sort.
    Virtual reality hello world http://www.rodneybrothers.com/vr/vrh...rld/index.html in html and javascript.
    Viewable with dodocase, google cardboard, OR, and other compatible VR gear.

  6. #6
    Registered User rogster001's Avatar
    Join Date
    Aug 2006
    Location
    Liverpool UK
    Posts
    1,472
    I am curious to ask iMalc why he used for( ; ; ) to implement one of the loops in the algorithm, is there a difference between this and while(1) ? is it somehow faster? Given iMalcs expertise with sorting algorithms i think there must be a good reason, or is it just personal preference?
    Thought for the day:
    "Are you sure your sanity chip is fully screwed in sir?" (Kryten)
    FLTK: "The most fun you can have with your clothes on."

    Stroustrup:
    "If I had thought of it and had some marketing sense every computer and just about any gadget would have had a little 'C++ Inside' sticker on it'"

  7. #7
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by rogster001
    I am curious to ask iMalc why he used for( ; ; ) to implement one of the loops in the algorithm, is there a difference between this and while(1) ? is it somehow faster? Given iMalcs expertise with sorting algorithms i think there must be a good reason, or is it just personal preference?
    They are equivalent in meaning, and are likely to result in exactly the same code. That said, a book on optimisation in C -- old when I read it years ago -- mentioned that some compilers -- old when the book was written -- may generate code that performs an unnecessary test given the while(1) version, whereas the for ( ;; ) version would always be recognised as a deliberate infinite loop, and hence should be preferred.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  8. #8
    Registered User rogster001's Avatar
    Join Date
    Aug 2006
    Location
    Liverpool UK
    Posts
    1,472
    ok thanks, i suppose if there is the chance that no evaluation as such is perfomed on the 'for' version then that would be preferred .

    Thiniking about it it may be logical to expect for( ;; ) not to evoke an evaluation - there is no true/false implied or automagically assigned is there - whereas while(1) straight away says 'i am true' - so one might expect this to be compiled as a test - gee whiz, just give me that computer science professorship already :->
    Last edited by rogster001; 12-21-2012 at 10:14 AM.
    Thought for the day:
    "Are you sure your sanity chip is fully screwed in sir?" (Kryten)
    FLTK: "The most fun you can have with your clothes on."

    Stroustrup:
    "If I had thought of it and had some marketing sense every computer and just about any gadget would have had a little 'C++ Inside' sticker on it'"

  9. #9
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    I've seen a compiler complain about "conditional expresion is constant" before (I probably had the warning level up high) which is one reason I started using for(;, the other reason as already noted is that in a debug build say, the compiler might not optimise out the test, and the speed of debug builds if of at least some importance.

    You might find that the code for Numbers.size()-1 is not turning out as fast as storing that in a variable might be. Then of course there's that cout statement in there which is sure to affect things. Best remove that when timing them.
    I'd also remove timeslooped and timesran for timing them since you wouldn't have those in a final implementation.
    At least that's some ways it might be slowing down the insertion sort to the speed of a quicksort on small arrays.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  10. #10
    Bored Programmer
    Join Date
    Jul 2009
    Location
    Tomball, TX
    Posts
    428
    You might find that the code for Numbers.size()-1 is not turning out as fast as storing that in a variable might be.
    Are you meaning that adjusting an int by hand to reflect the size would be a quicker int for the computer to compare to than grabbing a vector.size()?
    To use the inserstion inside the bubble sort I had to add a (left, right) parameter to it as well so this isn't what slowed them down while working together, but nonetheless a good point.

    Then of course there's that cout statement in there which is sure to affect things. Best remove that when timing them.
    I'd also remove timeslooped and timesran for timing them since you wouldn't have those in a final implementation.
    At least that's some ways it might be slowing down the insertion sort to the speed of a quicksort on small arrays.
    Yea I commented most of the useless stuff out other than the timesran variable as I used it to see how many total iterations each sort style performed.
    I am also using an array to time like this:
    Code:
    #include <time.h>
    ....
    ...
    int clock[2];
    clock [0] = time(NULL);
    ParticularSort(left, right);
    clock [1] = time(NULL);
    int TimeToRun = clock[1]-clock[0];
    .....
    cout<<"It took "<<TimeToRun<<" seconds to perform blah sort on "<<Numbers.size()<<" numbers."<<endl;
    .....
    Since it is a .h I'm assuming its an old file and maybe one of you C++ gurus have a suggestion for tracking the time that would be more accurate? Milliseconds would help me break down performance on the sorts that take under a second to perform.

    Thanks a ton for the information already and if you feel like checking it out I made a few sorts in JavaScript at
    www.lesshardtofind.com/bubble.html

    I'm still trying to figure out how to make my quicksort visual.. as you can't display with javascript while in a loop which is going to make recursion tricky! I got an idea though.

  11. #11
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by Lesshardtofind View Post
    Are you meaning that adjusting an int by hand to reflect the size would be a quicker int for the computer to compare to than grabbing a vector.size()?
    What iMalc is suggesting is that saving the value to an int and evaluating that int multiple times will not be slower than calling Numbers.size() multiple times. Apart from potential function call overhead, there are very few implementation choices for std::vector<>::size() that will be faster than retrieving the value.

    Although - if using that approach - bear in mind that vector's size() method returns a size_t - which can hold values that an int cannot.

    Quote Originally Posted by Lesshardtofind View Post
    Since it is a .h I'm assuming its an old file and maybe one of you C++ gurus have a suggestion for tracking the time that would be more accurate? Milliseconds would help me break down performance on the sorts that take under a second to perform.
    Firstly, you need to be clear in your requirement. What you are seeking is a high resolution timer. Accuracy and resolution are different things.

    Either way, you would be better off increasing the number of test cases than looking for a more accurate or higher resolution timer. High-resolution timers also have significant jitter, so the accuracy of the timer tends to go down as resolution increases, unless your host system has hard realtime characteristics (and if you were using such a system, you wouldn't be asking the question in the first place).

    But, if you want to characterise timings for an algorithm, it is usually easier to increase the number of runs (or test cases) than it is to find and control a timer with high resolution and high accuracy. All you need to do to get back to individual timings is divide by the number of runs. One other benefit is that variation of timer accuracy or precision will tend to be smoothed out. The only cost is the time needed to generate the additional test cases or runs, the time waiting for them to complete, and the processor cycles.
    Right 98% of the time, and don't care about the other 3%.

    If I seem grumpy or unhelpful in reply to you, or tell you you need to demonstrate more effort before you can expect help, it is likely you deserve it. Suck it up, Buttercup, and read this, this, and this before posting again.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 21
    Last Post: 07-15-2012, 05:20 PM
  2. Can you inherit Methods into other methods?
    By bobbelPoP in forum C++ Programming
    Replies: 5
    Last Post: 08-02-2008, 08:32 AM
  3. get set methods
    By hasek in forum Windows Programming
    Replies: 2
    Last Post: 03-13-2004, 12:59 AM
  4. Sorting words with a fast, effincient sorting method
    By Unregistered in forum C++ Programming
    Replies: 19
    Last Post: 07-12-2002, 04:21 PM
  5. Methods for Sorting Structures by Element...
    By Sebastiani in forum C Programming
    Replies: 9
    Last Post: 09-14-2001, 12:59 PM