Thread: more of an algorithm question than a C question

  1. #1
    Registered User Dave Jay's Avatar
    Join Date
    Mar 2002
    Posts
    33

    more of an algorithm question than a C question

    How would you, if it's indeed possible, find the exact median of an unordered list, on the fly?

    For example, how would you find the median of the following list (which is 5) without sorting the list and keeping it in memory:


    5, 7, 4, 11, 1, 3, 8, 2, 12
    Last edited by Dave Jay; 03-26-2002 at 06:35 PM.

  2. #2
    Registered User
    Join Date
    Nov 2001
    Posts
    162
    Is it possible to do that without sorting the numbers? I can't even figure it out in my head without sorting the numbers, much less a computer.

  3. #3
    Unregistered
    Guest
    If I remember what a median is correctly, it is the middle number out of a group of sorted numbers. If this is correct then try something like this:

    assuming the list is in an array

    int size = // number elements in array
    int half = size / 2;
    int less, equal;

    for(int x = 0; x < size; x++)
    {
    less = equal = 0;
    for(int y = 0; y < size; y++)
    {
    if(x == y) continue;
    if(array[x] < array[y]) less++;
    else if(array[x] == array[y]) equal++;
    }
    if((less + equal/2) == half) break;
    }

    I have not tested this, but I believe it will work. The median should have half of the values less than it. You need to keep track of the values equal to the test value for the case where multiple occurance of the median are present.

  4. #4
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    Get the number of nodes in the list, search for the next highest number in the list until you reach the very middle number, which is length / 2 + 1 if the number of items is odd, that's the median. If the list is an even number of nodes then it's a bit harder, aquire the two middle numbers with the same technique and then add them together and divide the sum by 2, that's the median.

    >5, 7, 4, 11, 1, 3, 8, 2, 12
    There are nine numbers, the median is 5 because the number at 9 / 2 is 4.5, add 1 and it's 5.5. Used as an integer, 5.5 is truncated to 5 which is the center of the list.

    >Is it possible to do that without sorting the numbers?
    Anything is possible, it just takes a little more work without sorting the numbers.

    -Prelude
    My best code is written with the delete key.

  5. #5
    Registered User Dave Jay's Avatar
    Join Date
    Mar 2002
    Posts
    33

    Appreciate the responses but

    For "unregistered":

    You were "assuming the list is in an array", which I had said cannot be in memory. The numbers are spit out on the fly, you read them once individually and then they are gone.


    For "Prelude" :

    "search for the next highest number in the list until you reach the very middle number"

    Again, there's nothing to search. Perhaps my wording was bad.



    I am working with very large loop generated numbers. Once the loop iterates, the value is lost. It's not practical or efficent to store all of them (multi millions) . I was, however, able to devise something by recognizing range values. Again, I appreciate the time you both have taken in considering a solution.

    Dave

  6. #6
    Registered User
    Join Date
    Nov 2001
    Posts
    241
    you could sort on the fly, and then just use the length of the list to find the middle term.

  7. #7
    Registered User
    Join Date
    Oct 2001
    Posts
    197
    Can you exactly define "median"?
    I donīt know what it should mean exactly.

    klausi
    When I close my eyes nobody can see me...

  8. #8
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >Again, there's nothing to search. Perhaps my wording was bad
    Yes, you said "unordered list" which caused me to believe that this was a linked list. Without keeping any of the values in memory there's no way I can think of, unless you want to store the values in a file. However, if your median doesn't have to be one of the numbers input then you can simply keep a running value of the largest number. When you reach the end of input, find the median of the largest number and you're done. This is probably the simplest way to get the result you want on the fly.

    >It's not practical or efficent to store all of them (multi millions)
    If the median must be one of the numbers that you actually input then it will be considerably harder and probably not possible without placing the values in memory or it would be dreadfully inefficient by placing the values in a file.

    -Prelude
    My best code is written with the delete key.

  9. #9
    Registered User
    Join Date
    Mar 2002
    Posts
    28
    after reading all these replies....wouldn't it be simpler to just sort the numbers...

    it would make life simple...but personally i live by the KISS philosophy...so i'm slightly biased.
    curiousity killed the cat, but it
    makes for one hell of a programmer.

    Alien.

  10. #10
    Registered User
    Join Date
    Jan 2002
    Posts
    49
    Are you getting multi millions of different values or a (rather) small set of different values ?

  11. #11
    Registered User Dave Jay's Avatar
    Join Date
    Mar 2002
    Posts
    33
    klausi : "Can you exactly define "median"? "

    The median is simply the middle number of a sorted list.
    When there is an even number of numbers, the median is the mean of the two middle numbers.


    GertFaller:

    I think I know what you're getting at. If the values are known to be relatively similar, you can tally all the occurrences of each value in a corresponding array.

    For example, if you were exposed to a 100 million entries, but you knew that their respective values were all randomly in the range 1 - 100, you can create an array to hold each respective entry's occurrence. The first index will hold the number of 1's encountered, the second index the number of 2's, the third index the number of 3's...the last index holding the number of 100's. A statement such as list[value]++ can be placed in the loop to mark an occurrence.
    At the end of the iterations, you could add the values of the array, starting from the first index until you have the value = half your list length:

    for(i = first_index, sum = 0; sum < list_length / 2; i++)
    sum += list[i];

    The index i will be your median in an odd length list, even length will need slight modification.
    Last edited by Dave Jay; 03-27-2002 at 07:21 PM.

  12. #12
    Registered User
    Join Date
    Oct 2001
    Posts
    197
    Thanks, Dave Jay!
    When I close my eyes nobody can see me...

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Quick Algorithm Question
    By cjwenigma in forum C++ Programming
    Replies: 3
    Last Post: 11-01-2007, 10:39 AM
  2. Design layer question
    By mdoland in forum C# Programming
    Replies: 0
    Last Post: 10-19-2007, 04:22 AM
  3. STL algorithm question
    By Reggie in forum C++ Programming
    Replies: 1
    Last Post: 04-22-2003, 09:04 AM
  4. opengl DC question
    By SAMSAM in forum Game Programming
    Replies: 6
    Last Post: 02-26-2003, 09:22 PM
  5. More a math question than an algorithm
    By Gustaff in forum C Programming
    Replies: 1
    Last Post: 01-28-2003, 01:10 PM