Thread: Sorting (asymptotic) complexity problem

  1. #1
    Beginning game programmer Petike's Avatar
    Join Date
    Jan 2008
    Posts
    64

    Question Sorting (asymptotic) complexity problem

    Hi all,
    I have some problems with "asymptotic notations". I have already learned the thery about that but there is still something unclear for me "in practice".

    Let me show you some simple example:
    Let's say we have the "Selection" sorting algorithm. The fact is it has the time complexity of "O(n^2) (squared)". But what does it exactly mean?

    Does it mean that at worst case this algorithm makes n^2 comparisons or n^2 selections or n^2 assignments or n^2 "something else"?
    In other words, how can I simply and logically find out the complexity of some algorithm?

    Thanks.
    Petike

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Just n^2 operations - it could be anything, or any scalar number of anything.

    3 compares, a move and an addition would count as one "operation".

    quicksort typically has more operations per iteration than say bubble sort. But because it's O(nLogn) rather than O(n^2), it soon starts winning over simpler (but more expensive) approaches.
    Last edited by Salem; 01-19-2009 at 01:08 AM. Reason: fix late night typos
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    The term O(n^2) means that if you double the number of elements the algorithm works on, it will be four times more work.

    This is because these algoorithms usually has a double for-loop (or some such) in the middle of the algorithm, where both loops (essentially) range from 0 to n-1.

    But the O(n^2) doesn't necessarily make a good comparison for the actual time it takes for one algorithm to solve a problem compared to another algorithm - they can vary dramatically even tho' they both have O(n^2) - one may be very dumb in the number of copies/compares or other operations it takes to do something. Of course, an algorithm that is O(log(n) * n) will (almost certainly) be dramatically quicker even if it is clumsier than some O(n^2) algorithm, if n is larger than a few dozen.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  4. #4
    Beginning game programmer Petike's Avatar
    Join Date
    Jan 2008
    Posts
    64
    Thanks to both.
    It is now much clearer to me.
    Petike

  5. #5
    Registered User
    Join Date
    Jun 2008
    Posts
    62
    one other thing to note about big Oh notation is that it doesn't really describe the exact amount of time increase new elements will have, instead it describes more the shape of the curve (if that makes any sense).

    So for example, you won't see a O (n^3 + 4n^2 + 6n) algorithm, instead, it will be just written as O (n^3). That is one reason (along with others) that two n^3 algorithms with the same initial execution speeds will deviate from each other.

  6. #6
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    The reason for that is that once you start talking less significant factors in the running time such as an x-squared part of something that is x-cubed, or the constant factor of something that is O(n), you start running into things that are varying solely due to implementation differences, i.e. how well optimised the code is.
    You might have two algorithms where you conclude that ones takes n*n + 3n + 9 operations, and another that is n*n + 4n + 5 operations, and decide that this is why the first one is faster. However you then switch compilers and the second one comes out faster for the exact same source code. Due to things like that, all you can really say is that since they are both O(n*n), they take approximately the same running time.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I agree with what iMalc says. Another way to put it: consider O(n^2) compared with O(n^2 + n) when n is 1000 (1 million and 1 million one thousand) and then change to 10000 - It's now ten million for the first case, and ten million ten thousand for the second case. In the first case, it's 0.1%, in the second case the + n contributes less 0.01% - now, if you measure that precisely, you don't want O(n), you want "CPU clock-cycles" as the measure. And change your data ever so slightly (replace all Jones with Smith on the name list), and you will almost certainly change the overall execution time by more than 0.01%!

    Big-O notation is for "the things that change dramatically", not the details. It's a "big picture" tool.

    Consider a real-time OS where the task-switch is O(n), where n is number of tasks. Now change it the task-switch code so that it is O(1) - it doesn't take much to convince you that it is better - even if the ACTUAL task-switch code itself is 10% slower [assuming you have more than 1 task, that is].

    Another real case is "open file" in FAT vs. NTFS - if you have a directory that is sorted by name, it take O(log(n)) to find the right file. FAT has files stored in the order they are created, so the time to find the right one is O(n) - because you may have to search through every one of them.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  8. #8
    Beginning game programmer Petike's Avatar
    Join Date
    Jan 2008
    Posts
    64

    Question One another question...

    Hi again,
    I would have another question.
    It should not be too hard to find out the best and the worst case of some algorithm.
    But how can I find out the average case of some algorithm?
    In other words, does there exist any general rule to finding out the average case of algoritms?

    Thanks.
    Petike

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Just give it several sets of data that isn't the worst and isn't the best - or you can statistically calculate it from the general behaviour of the code - for example if your best case is an already sorted array, and the worst case is one that is sorted in opposite order, then something with a random set of data will be "average", right? So if "every other element needs swapping", you get "half as many operations each outer loop".

    But often, worst case is the only one that really matters, because that is what you have to design to cope with, unless you know for some reason that you will never ever have the worst case.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Problem with arrays structures sorting.
    By pitifulworm in forum C Programming
    Replies: 42
    Last Post: 02-09-2009, 12:31 PM
  2. Sorting problem?
    By audinue in forum C Programming
    Replies: 5
    Last Post: 01-06-2009, 02:16 PM
  3. Array Sorting problem
    By ___________ in forum C++ Programming
    Replies: 4
    Last Post: 07-22-2008, 12:17 AM
  4. What is problem about the sorting?
    By Mathsniper in forum C Programming
    Replies: 2
    Last Post: 04-17-2005, 07:00 AM
  5. Sorting text file problem...
    By John-m in forum C Programming
    Replies: 3
    Last Post: 10-01-2002, 04:51 PM