Thread: Help with rand()

  1. #1
    Registered User
    Join Date
    Dec 2009
    Location
    Colorado
    Posts
    41

    Help with rand()

    I am working on a program that uses rand(). I seed it with
    Code:
    srand( (unsigned) time(NULL))
    and then do some arithmetic to scale rand from [0,RAND_MAX] to [-1,1]. I have generated 100,000 random numbers with this and looked at a histogram and everything looked normal. The problem I am running into is when I sum these values. I keep getting a Gaussian not centered on 0. Does anyone have any ideas why this is happening? Is this the result of a bias in the pseudo-random number generator?

    More specifically, I am looking at the distribution of 10,000 sums of 100,000 elements ([-1,1]) each and I get a Gaussian centered at -500 every time I run the program. The sample size I am using shouldn't be an issue. Thanks for the help

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    How are you doing the scaling? I note that it is possible to introduce a bias when doing that, and perhaps it matters for what you are trying to do. You might want to post the smallest and simplest runnable program that demonstrates the problem.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Yeah you haven't posted the useful bit.
    Show us what you're doing, rather than telling us.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  4. #4
    Registered User
    Join Date
    Dec 2009
    Location
    Colorado
    Posts
    41
    woops sorry about that. Here is the code with the scaling I am doing:

    Code:
    #include <stdio.h>
    #include <time.h>
    #include <string.h>
    
    FILE * f1;
    
    int main (int argc, const char * argv[]) {
        srand( (unsigned) time(NULL));
    	
    	f1 = fopen("randtest.txt", "w");
    
    	for (int j = 0; j < 10000; j++) {
    		double sum = 0;
    		for (int i = 0; i < 100000; i++) {
    			double n = (rand() % 200)/100.00 - 1;
    			sum += n;
    			
    		}
    		fprintf(f1,"%lf\n",sum);
    	}
        return 0;
    }
    From a mathematical standpoint I don't see why this would introduce a bias (although I don't do a whole lot of work with mod). I am pretty certain this must be the problem point. Thanks in advance.
    Last edited by waterborne; 12-21-2009 at 11:58 PM.

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by waterborne
    From a mathematical standpoint I don't see why this would introduce a bias (although I don't do a whole lot of work with mod).
    If RAND_MAX+1 is not perfectly divisible by 200 (and this is probably the case), then some of the numbers in the range [0, 200) will have a higher probability of being generated, assuming a uniform distribution for rand().

    Still, why not post the smallest and simplest runnable program that demonstrates the problem?
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  6. #6
    Registered User
    Join Date
    Dec 2009
    Location
    Colorado
    Posts
    41
    Quote Originally Posted by laserlight View Post
    If RAND_MAX+1 is not perfectly divisible by 200 (and this is probably the case), then some of the numbers in the range [0, 200) will have a higher probability of being generated, assuming a uniform distribution for rand().

    Still, why not post the smallest and simplest runnable program that demonstrates the problem?
    That makes sense and I imagine that is what is happening.

    The code I posted should run. I run that code and then have a notebook in Mathematica import the data and update a histogram plot.

    Do you have a better suggestion for scaling the random number generation to [-1,1]? Thanks!

  7. #7
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    rand() % 200 gives a number from 0 to 199.
    Divide that by 100 and you get 0 to 1.99
    Subtract 1 and you get -1 to 0.99 (Not -1 to +1)
    You probably want mod 201, not 200
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  8. #8
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Quote Originally Posted by waterborne View Post
    That makes sense and I imagine that is what is happening.

    The code I posted should run. I run that code and then have a notebook in Mathematica import the data and update a histogram plot.

    Do you have a better suggestion for scaling the random number generation to [-1,1]? Thanks!
    You're getting integers in the range 0 - 199 so the mean will be 99.5, not 100. Divide by 100, subtract 1 and multiply by 100,000 and your mean *should* be -500. You probably want something like this:

    Code:
    for (int j = 0; j < 10000; j++) {
      double sum = 0;
      double frac = 201.0 / (unsigned int)(RAND_MAX + 1);
      for (int i = 0; i < 100000; i++) {
        double n =  (int)(rand() * frac) / 100.0 - 1;
        sum += n;
      }
    }
    Last edited by R.Stiltskin; 12-22-2009 at 06:09 PM. Reason: cast to unsigned int in 3rd line

  9. #9
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Why are you trying to divide by zero?
    Code:
    (unsigned int)(RAND_MAX + 1)
    MAX + 1 being unsigned should be zero.


    Quzah.
    Hope is the first step on the road to disappointment.

  10. #10
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Quote Originally Posted by quzah View Post
    Why are you trying to divide by zero?
    Code:
    (unsigned int)(RAND_MAX + 1)
    MAX + 1 being unsigned should be zero.
    I don't think so. rand() returns an int, so RAND_MAX is the largest int that the architecture supports and (unsigned int)(RAND_MAX + 1) is 1 more than that.

    But (because of twos-complement) (int)(RAND_MAX + 1) = -(unsigned int)(RAND_MAX + 1).
    Last edited by R.Stiltskin; 12-22-2009 at 08:52 PM.

  11. #11
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by R.Stiltskin View Post
    I don't think so. rand() returns an int, so RAND_MAX is the largest int that the architecture supports and (unsigned int)(RAND_MAX + 1) is 1 more than that.

    But (because of twos-complement) (int)(RAND_MAX + 1) = -(unsigned int)(RAND_MAX + 1).
    On a system with 32-bit ints, where RAND_MAX is 0xFFFFU it will be fine.
    But on a system with 32-bit ints, where RAND_MAX is 0xFFFFFFFFU, it will overflow back to zero, no matter how you look at it.
    Note that those are typical values for RAND_MAX, so quzah's point is entirely valid.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  12. #12
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Quote Originally Posted by iMalc View Post
    ...But on a system with 32-bit ints, where RAND_MAX is 0xFFFFFFFFU, it will overflow back to zero, no matter how you look at it.
    Note that those are typical values for RAND_MAX, so quzah's point is entirely valid.
    How can that be? int rand(void) by definition returns an int, not an unsigned int, between 0 and RAND_MAX, so on a system with 32 bit ints RAND_MAX can be no more than 0x7FFFFFFFh, reserving the high-order bit for negative numbers. Add 1 to that and the result overflows to 0x80000000 which is -2147483648d as an int, but cast it to an unsigned int and it is 2147483648d.

    edit: but I probably should have written "((unsigned int)RAND_MAX + 1u)" to avoid compiler warnings.
    Last edited by R.Stiltskin; 12-22-2009 at 10:52 PM.

  13. #13
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Put this in your toolbox:

    Code:
    double RandRange( double min, double max )
    {
        return (double)rand() / RAND_MAX * ( max - min ) + min;
    }
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  14. #14
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Quote Originally Posted by brewbuck View Post
    Put this in your toolbox:

    Code:
    double RandRange( double min, double max )
    {
        return (double)rand() / RAND_MAX * ( max - min ) + min;
    }
    Yes, it's much simpler if you want doubles with an arbitrary number of decimal places. That sidesteps the overflow issue nicely, but (judging by his code) the OP seemed to want numbers with exactly 2 decimal places (or equivalently, a range of ints divided by 100.0). This RandRange() doesn't adapt to that so easily (must avoid introducing a bias against the max value).
    Last edited by R.Stiltskin; 12-22-2009 at 11:42 PM.

  15. #15
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by R.Stiltskin View Post
    Yes, it's much simpler if you want doubles with an arbitrary number of decimal places. That sidesteps the overflow issue nicely, but (judging by his code) the OP seemed to want numbers with exactly 2 decimal places.
    Then I'd call it like this:

    Code:
    double x = round( RandRange( -100.0, 100.0 ) ) / 100.0;
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. rand() implementation
    By habert79 in forum C Programming
    Replies: 4
    Last Post: 02-07-2009, 01:18 PM
  2. Wm_timer
    By Ducky in forum Windows Programming
    Replies: 21
    Last Post: 09-26-2008, 05:36 AM
  3. Issue w/ Guess My Number Program
    By mkylman in forum C++ Programming
    Replies: 5
    Last Post: 08-23-2007, 01:31 AM
  4. rand() to choose?
    By wagman in forum C++ Programming
    Replies: 2
    Last Post: 03-27-2002, 01:43 AM
  5. rand () a little confusion
    By Led Zeppelin in forum C Programming
    Replies: 3
    Last Post: 03-19-2002, 10:13 PM