Thread: Problems with Summing lots of doubles

  1. #1
    Registered User
    Join Date
    Jun 2008
    Location
    Northern Va
    Posts
    18

    Problems with Summing lots of doubles

    Hi,
    I'm working with a very large number of doubles, each with a value between 0 and around 10. I'm not sure of the best implementation to get around floating point precision so that I can take a good average over all of the values. I was thinking of sorting the values and then breaking them up, but I don't know of a good way to find roughly what the float is before my computer spits out inf.

    This is what I've tried so far:
    Code:
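    /* dComp is the comparison function for doubles (defined elsewhere) */
    qsort(moment, sSize, sizeof(double), dComp);

    /* compute average */
    unsigned long long int sum = 0;
    for (i = 0; i < sSize; i++)
    {
        /* round each value to the nearest hundredth, kept as a scaled integer */
        sum = sum + rint(100 * moment[i]);
    }

    double fSum = (1.0 * sum) / 100.0;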
    sSize will hopefully be around 10,001; it is one more than the number of points generated. When I exceed 64 bits, with accuracy only to the thousandths, I get 00000000 for everything. I was thinking of chunking every 100 or 1000 values into an array and then averaging those by dividing each entry by the total number of entries, but I feel like there is a more portable and flexible solution.

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    A double would give you BETTER precision here than scaling the values into a 64-bit integer.

    Are you actually seeing your average being wrong because of this, or are you just trying to prevent a problem that isn't actually there?

    Doubles are precise to about 15 significant digits, even in repeated calculations like this.

    10001 * 10 is the largest value you can expect from your sum, so the sum takes up about 6 digits and you still have another 9 or so digits left for the fractional part. That is much more precise than summing an integer made from each value times 100.
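
    A rough sketch of that comparison, not code from the thread: sum_double and sum_scaled are made-up names, and sum_scaled assumes non-negative inputs so that adding 0.5 and truncating rounds to the nearest hundredth.

```c
#include <assert.h>
#include <stddef.h>

/* Plain double accumulation: with about 10,001 values in [0, 10] the total
   stays near 100,000, so plenty of significant digits remain. */
double sum_double(const double *v, size_t n)
{
    double sum = 0.0;

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Scaled-integer accumulation in the style of the first post: each value is
   rounded to the nearest hundredth before it is added, so everything past
   the second decimal place is discarded. Assumes every v[i] >= 0. */
double sum_scaled(const double *v, size_t n)
{
    unsigned long long sum = 0;

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += (unsigned long long)(100.0 * v[i] + 0.5);
    return (double)sum / 100.0;
}
```

    Feeding both functions the same array and printing the two results side by side should show the double sum keeping the digits that the scaled version rounds away.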

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    Jun 2008
    Location
    Northern Va
    Posts
    18
    My average is actually non-existent when I run this. Using an unsigned long long int yields a bunch of 0s. When I use doubles, I get 'inf'. I need the average later to get a nice histogram, but the simplest solution seems to run into precision problems, so I tried the code below. It still spits out an inf.
    Code:
    double average = 0.0, aveBins[400];
    j = 0; z = 0;
    for (i = 0; i < sSize; i++)
    {
        assert(j != 400);
        aveBins[j] += moment[i];
        if (z > 99 && j < 400)
        {
            j++;
            z = 0;
        }
        z++;
    }

    for (j = 0; j < 400; j++)
    {
        aveBins[j] /= sSize;
        average += aveBins[j];
    }

    printf("Average Field: %lf", average);
    I thought that calculating the best bin size would be more suitable, but I'm not sure if there's a good way to find the size of an array in C. This is what I would use, but when I implemented it, my averages were smaller than expected. Example:
    Code:
    for(i=0 ; i < sizeof(aveBins)/sizeof(aveBins[0]) ; i++)
    Again, thank you in advance for your help.
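
    On the sizeof question: sizeof(aveBins)/sizeof(aveBins[0]) gives the declared capacity (400 here), not the number of bins that were actually filled, and it only works where aveBins is a real array rather than a pointer parameter. A minimal sketch, with ARRAY_LEN and mean_of as hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a real array. Only valid where the array declaration
   itself is in scope; applied to a pointer it gives a meaningless number. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Inside a function an array parameter is just a pointer, so the number of
   elements has to be passed in explicitly. */
double mean_of(const double *v, size_t n)
{
    double sum = 0.0;

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return n ? sum / (double)n : 0.0;
}
```

    A call such as mean_of(aveBins, ARRAY_LEN(aveBins)) averages over all 400 slots, so any slots that were never filled are counted too.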

  4. #4
    and the Hat of Guessing tabstop
    Join Date
    Nov 2007
    Posts
    14,336
    This code doesn't initialize the aveBins array, so who knows what's already in there. You know that each aveBin shouldn't get above 4000 or so, so if you're willing to write all those out to a file, you'll see what you get.
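
    For illustration only, here is one way the binned average could look with the array zeroed first. binned_average, NBINS and PER_BIN are made-up names, and the bin index is simplified to i / PER_BIN instead of the j/z counters in the post above.

```c
#include <assert.h>
#include <stddef.h>

#define NBINS   400   /* same capacity as aveBins in the post above */
#define PER_BIN 100   /* values accumulated into each bin (assumed) */

double binned_average(const double *v, size_t n)
{
    double bins[NBINS] = {0};   /* every element starts at 0.0 */
    double average = 0.0;

    assert(v != NULL);
    assert(n > 0 && n <= (size_t)NBINS * PER_BIN);

    /* accumulate PER_BIN consecutive values into each bin */
    for (size_t i = 0; i < n; i++)
        bins[i / PER_BIN] += v[i];

    /* each bin contributes its share of the overall mean;
       bins that were never used contribute exactly 0.0 */
    for (size_t j = 0; j < NBINS; j++)
        average += bins[j] / (double)n;

    return average;
}
```

    With the = {0} initializer, unused bins can no longer leak whatever happened to be on the stack into the result.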

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Something is wrong in what you originally stated. This works just fine:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    #define SIZE 10001
    
    double frand(double maxval)
    {
    #if 0
    	// Set above to 1 for testing the max value possible. 
    	return maxval;
    #else
    	return ((double)rand() / RAND_MAX) * maxval;
    #endif
    }
    
    int main() 
    {
    	double a[SIZE];
    	double sum = 0; 
    	int i;
    	for(i = 0; i < SIZE; i++)
    	{
    		a[i] = frand(10.0);
    	}
    	for(i = 0; i < SIZE; i++)
    	{
    		sum += a[i];
    	}
    	printf("sum = %f, avg = %f\n", sum, sum / SIZE);
    	return 0;
    }
    The sum, when using random values, is about 50000 and the average is 5.0-something, which is exactly what one would expect.

    If you get other values, I'd suggest you add a bit of code like this:
    Code:
    	double max = -100000.0, min = 1000000.0;
    	for(i = 0; i < SIZE; i++)
    	{
    		if (max < a[i]) max = a[i];
    		if (min > a[i]) min = a[i];
    	}
    Print max and min after the loop to see whether they fall in the expected range. I suspect you have "garbage" in your array.

    --
    Mats

  6. #6
    Registered User
    Join Date
    Jun 2008
    Location
    Northern Va
    Posts
    18
    Right on the money there! Sure enough, after looking through my data (quicksorted for ease of use), there were 7 or 8 'inf's. Is there an explicit way to test whether a value is inf? I'm going to use something along these lines, but something akin to isnan() would be nice.
    Code:
    if (value == value + 1)
        continue;   /* ignore the value */
    Thanks, and good catch.

  7. #7
    and the Hat of Guessing tabstop
    Join Date
    Nov 2007
    Posts
    14,336
    C99 has isfinite() and isinf() in math.h.
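
    A small sketch of how isfinite() could be used to skip the bad entries; finite_mean is a hypothetical helper, not something from the thread. Note that the value == value + 1 test is also true for very large finite doubles (1e16, for example) and does not catch NaN, which is why the macro is the safer check.

```c
#include <assert.h>
#include <math.h>     /* isfinite (C99 macro) */
#include <stddef.h>

/* Mean of the finite entries of v. If skipped is not NULL, it receives the
   number of inf/NaN entries that were ignored. Returns 0.0 when no entry
   is finite. */
double finite_mean(const double *v, size_t n, size_t *skipped)
{
    double sum = 0.0;
    size_t used = 0;

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (isfinite(v[i])) {
            sum += v[i];
            used++;
        }
    }
    if (skipped)
        *skipped = n - used;
    return used ? sum / (double)used : 0.0;
}
```

    Checking the skipped count after the call gives a quick way to see how many of the 'inf's were in the data.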

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    By what method is your application acquiring the numbers? Are you reading from a file, or using some other method?

    If you are reading numbers from a file or such, my suggestion would be to check the result of reading the numbers, and reject the input if it's out of range.
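
    If the numbers do come in as text, the check could look roughly like this. parse_in_range is a hypothetical helper, and the [lo, hi] limits are whatever range your data is supposed to have (0 to about 10 in this thread).

```c
#include <assert.h>
#include <errno.h>
#include <math.h>     /* isfinite */
#include <stdlib.h>   /* strtod */

/* Parses one number from text and stores it in *out.
   Returns 1 on success. Returns 0 if text does not start with a number,
   if strtod reports a range error, or if the value is not finite or lies
   outside [lo, hi]. */
int parse_in_range(const char *text, double lo, double hi, double *out)
{
    char *end;
    double x;

    assert(text != NULL && out != NULL);

    errno = 0;
    x = strtod(text, &end);
    if (end == text || errno == ERANGE)    /* nothing parsed, or range error */
        return 0;
    if (!isfinite(x) || x < lo || x > hi)  /* rejects "inf", "nan", out of range */
        return 0;

    *out = x;
    return 1;
}
```

    Calling it once per token, for example parse_in_range(token, 0.0, 10.0, &moment[i]), and skipping or reporting any token that returns 0 keeps the 'inf's out of the array in the first place.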

    If that's not the method you are using, then you need to figure out "why are the values #INF" - because that's clearly not a valid value.

    --
    Mats
