Thread: calculating the mean

  1. #1
    Registered User
    Join Date
    Mar 2007
    Posts
    33

    calculating the mean

    I need an algorithm to calculate the mean of a list of values. The values are CPU times, and some of them are very different from the rest. I want to calculate the mean without those values. Does anyone know an algorithm for this?

    Thank you.

  2. #2
    Registered User
    Join Date
    Jun 2006
    Posts
    130
    What do you mean by very different?

  3. #3
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    Something like:
    Code:
    #include <stdio.h>

    #define CPUMAX 60
    #define CPUMIN 40

    int main(void)
    {
        int mylist[] = {55, 27, 4, 11, 47, 59};
        size_t i;
        unsigned long total = 0;
        size_t counted = 0;

        for (i = 0; i < sizeof(mylist) / sizeof(mylist[0]); i++)
        {
            /* only count values that fall inside the acceptable range */
            if (mylist[i] >= CPUMIN && mylist[i] <= CPUMAX)
            {
                total += (unsigned long)mylist[i];
                counted++;
            }
        }

        /* avoid dividing by zero when every value was rejected */
        if (counted > 0)
            printf("The mean is %.2f\n", (double)total / (double)counted);
        else
            printf("No values within bounds\n");
        return 0;
    }
    Or you could use the variance (statistics). I don't know, I'm not that great with statistics.
    Last edited by zacs7; 08-27-2007 at 01:48 AM.

  4. #4
    Registered User
    Join Date
    Mar 2007
    Posts
    33
    I mean, for example, that most of the values are around 0.0001 and a few are 0.0008 or bigger. I think I'm going to calculate the variance for each value and try to keep the variance very close to 0. Thanks for the answers!

  5. #5
    Cogito Ergo Sum
    Join Date
    Mar 2007
    Location
    Sydney, Australia
    Posts
    463
    He means outliers. It's been a while since I did statistical maths, but you don't really need it; you just need to set some if-statement exceptions, I guess?

    Code:
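    /* x is a cutoff you choose; values at or above it are skipped */
    if (value < x) {
        sum += value;   /* accumulate for the mean */
        count++;
    }
    /* after the loop: mean = sum / count, provided count > 0 */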

  6. #6
    Registered User
    Join Date
    Mar 2007
    Posts
    33
    Yes, that's it! But the main problem is to set the limit when you don't know it a priori. I'm looking for a method based on minimal variance. Thanks for your attention!

  7. #7
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    The way to minimise your variance is to only count one value. If you do that, the variance is zero by definition.

    Seriously, you need to specify some criterion by which you can identify an outlier. Minimising variance of the set of values you retain is not a suitable criterion.

  8. #8
    Cogito Ergo Sum
    Join Date
    Mar 2007
    Location
    Sydney, Australia
    Posts
    463
    Well, following on from what grumpy said, define the bounds beyond which values become outliers based on the greatest and least of the values that are not outliers.

    Suppose we have a list of values:

    2, 50, 75, 98, 101, 467

    We can see that 2 and 467 are outliers.

    So we can estimate the range in which we expect outliers to fall.

    Suppose we say that the upper bound is double the greatest number that is not an outlier, which gives 202, and the lower bound is half of the least number that is not an outlier, which gives 25. Anything above 202 or below 25 is then an outlier.

    This is of course dependent on what the values really mean; here it's just arbitrary. I don't know what CPU times look like exactly, but I'm guessing they are small, so you must take a close look at the values that are generated, run a series of tests to see the different values you get, and based on that estimate what the outliers could be.

    There is probably a much better way to do this if you look at the equations for standard deviation and some other equations from statistical maths. They aren't very hard to understand; I don't know about implementing them, though.
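
    The standard-deviation idea mentioned above could be sketched roughly as follows. This is only one possible criterion, not something the thread settles on: the function name mean_within_k_stddev and the cutoff k are assumptions made for illustration. It computes the mean and population variance of all values, then averages only the values whose squared distance from the mean is at most k*k times the variance (that is, within k standard deviations).

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): mean of the values that lie
   within k standard deviations of the overall mean.
   Returns 0.0 for an empty list. */
double mean_within_k_stddev(const double *v, size_t n, double k)
{
    if (n == 0)
        return 0.0;

    /* first pass: mean of all values */
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    double mean = sum / (double)n;

    /* second pass: population variance of all values */
    double sq = 0.0;
    for (size_t i = 0; i < n; i++)
        sq += (v[i] - mean) * (v[i] - mean);
    double var = sq / (double)n;

    /* third pass: keep values with (v - mean)^2 <= k^2 * variance,
       which is the same test as |v - mean| <= k * stddev */
    double kept_sum = 0.0;
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        double d = v[i] - mean;
        if (d * d <= k * k * var) {
            kept_sum += v[i];
            kept++;
        }
    }

    /* fall back to the plain mean if the cutoff rejected everything */
    return kept > 0 ? kept_sum / (double)kept : mean;
}
```

    With the example list 2, 50, 75, 98, 101, 467 and k = 2, only 467 is rejected and the result is 65.2. The value 2 is kept, because the large outlier inflates the standard deviation, so the choice of k still has to be tuned against real measurements.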

  9. #9
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    Or just do what JFonseka said (which my example outlines)...

  10. #10
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by bartleby84
    I need an algorithm to calculate the mean of a list of values. The values are CPU times, and some of them are very different from the rest. I want to calculate the mean without those values. Does anyone know an algorithm for this?

    Thank you.
    Can I ask why you want to compute the mean of a bunch of CPU times? It's a common fallacy to think that you should run a task N times and then average the times -- what you should really do is simply take the smallest.
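
    A minimal sketch of the "take the smallest" approach, assuming the task can be wrapped in a function and that CPU time is measured with the standard clock() call. The names min_cpu_seconds and sample_task are hypothetical and only stand in for whatever is actually being timed.

```c
#include <time.h>

/* Hypothetical stand-in for the work being timed. */
void sample_task(void)
{
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 100000UL; i++)
        x += i;
}

/* Hypothetical helper: run task() `runs` times and return the smallest
   CPU time observed, in seconds. Returns -1.0 if runs <= 0. */
double min_cpu_seconds(void (*task)(void), int runs)
{
    double best = -1.0;

    for (int i = 0; i < runs; i++) {
        clock_t start = clock();
        task();
        clock_t end = clock();

        double elapsed = (double)(end - start) / CLOCKS_PER_SEC;

        /* keep the first measurement, then only smaller ones */
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}
```

    Calling min_cpu_seconds(sample_task, 10) would time ten runs and report the fastest one. Since only the minimum is kept, slow runs never enter the result and no outlier cutoff is needed.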
