need random number generator

**dnj23** · 07-26-2010

hello,

I need a good RNG. I have tried rand() with srand(time(NULL)) and it does not work accurately. I tested it with simulating a craps pass line bet 1 billion times and some of the figures aren't accurate. I get a statistical house edge figure of -1.54% instead of the correct figure of -1.41%. I am pretty sure it is coded correctly. I copied and pasted it on the bottom. Any advice would be appreciated.

Code:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAX_GAMES 1000000000

int roll_dice(void);
int play_passline(unsigned[], unsigned[]);
void print_stats(const unsigned[], const unsigned[], unsigned long, unsigned long);

int main()
{
  unsigned long games = 0, total_rolls = 0;
  unsigned roll_wins_at[21] = {0}, roll_loses_at[21] = {0};

  srand(time(NULL));

  printf("This program simulates %d pass line bet(s) in a game of craps : \n\n", MAX_GAMES);

  do {
    total_rolls += play_passline(roll_wins_at, roll_loses_at);
  } while (++games < MAX_GAMES);

  print_stats(roll_wins_at, roll_loses_at, games, total_rolls);

  return 0;
}

int roll_dice(void)
{
  int die1, die2;

  die1 = 1 + rand() % 6;  
  die2 = 1 + rand() % 6; 
   
  return die1 + die2;
}

int play_passline(unsigned roll_wins_at[], unsigned roll_loses_at[])
{
  /* plays and resolves one game, returns number of rolls it took for one game */
  
  int point, sum = roll_dice(), roll_count = 1; 

  switch (sum)  {
    case 7: case 11: 
	  if (roll_count <= 20) 
		roll_wins_at[roll_count]++;
	  else 
		roll_wins_at[0]++;   /* assigned > 20 rolls for a win case @ index 0 */ 
	  break;   
	case 2: case 3: case 12:
	  if (roll_count <= 20) 
		roll_loses_at[roll_count]++;
	  else 
		roll_loses_at[0]++;  /* assigned > 20 rolls for a loss case @ index 0 */
	  break;
	default:    /* point case */
	  point = sum;
	  do {
        sum = roll_dice();
	    roll_count++;
		if (sum == point) {
		  if (roll_count <= 20) 
		    roll_wins_at[roll_count]++;
	      else 
		    roll_wins_at[0]++;   /* assigned > 20 rolls for a win case @ index 0 */ 
		}
	    else if (sum == 7) {
		  if (roll_count <= 20) 
		    roll_loses_at[roll_count]++;
	      else 
		    roll_loses_at[0]++;  /* assigned > 20 rolls for a loss case @ index 0 */
		}
      } while (sum != 7 && sum != point);	  
	  break;
  }

  return roll_count;
}

void print_stats(const unsigned roll_wins_at[], const unsigned roll_loses_at[], unsigned long games, 
                 unsigned long total_rolls)
{
  unsigned long total_wins = 0, total_loses = 0;
  int i;

  for (i = 0; i <= 20; i++) {
    total_wins += roll_wins_at[i];
    total_loses += roll_loses_at[i];
  }

  printf("Statistics for %d games of pass line craps games \n"
	     "--------------------------------------------------\n\n", MAX_GAMES);

  printf("Games : %8lu\n Wins : %8lu %8.4lf\nLoses : %8lu %8.4lf\n%% Exp : %8.2lf\n", games, total_wins, 
	     (double) total_wins / games, total_loses, (double) total_loses / games, 
		  ((double) total_wins - total_loses) / games * 100);
  
  printf("\nAverage length of game : %.2lf rolls\n\n", (double) total_rolls / games);

  printf("Games won/lost on number of rolls :\n\nRolls    Won      Lost   %% rolled\n");
  for (i = 1; i <= 20; i++) 
	printf("%3d %8u %8u %8.2lf\n", i, roll_wins_at[i], roll_loses_at[i], 
	       (double) (roll_wins_at[i] + roll_loses_at[i]) / games * 100.0);
   
  printf(">20 %8u %8u %8.2lf\n\n", roll_wins_at[0], roll_loses_at[0], 
	     (double) (roll_wins_at[0] + roll_loses_at[0]) / games * 100.0);
}

**User Name:** · 07-26-2010

Pseudo random number generators

**dnj23** · 07-26-2010

cool page, thanks.

**User Name:** · 07-26-2010

Yeah, Agner's an effing genius. Apparently, he's done a lot of research on RNGs.

**iMalc** · 07-27-2010

Originally Posted by dnj23

I need a good RNG. I have tried rand() with srand(time(NULL)) and it does not work accurately. I tested it with simulating a craps pass line bet 1 billion times and some of the figures aren't accurate. I get a statistical house edge figure of -1.54% instead of the correct figure of -1.41%. I am pretty sure it is coded correctly. I copied and pasted it on the bottom. Any advice would be appreciated.

I don't think "accurate" is the right term there. You probably mean "unbiased".

However, it is obvious to me that the problem is not just with rand() itself so I can tell you'll probably be back here shortly when it still does not work. The problem is that you are introducing bias by using the quick and dirty method of using rand() % N.

The problem is that here N is 6 which does not divide evenly into RAND_MAX. Whilst that method is good enough for most purposes, it is not statistically unbiased. A correct way to do it is to replace the rand() % N with the rejection method. I.e:

Code:

do {
    die1 = rand();
} while (die1 < (RAND_MAX / 6) * 6);
die %= 6; // safe to do this now

E.g. If RAND_MAX is 32767 then the equation on the right hand side will come out to 32766 which is a multiple of 6 and the result is now unbiased.

**dnj23** · 07-27-2010

Originally Posted by iMalc

I don't think "accurate" is the right term there. You probably mean "unbiased".

Correct, I meant unbiased.

Code:

  do {
      die1 = rand();
  } while (die1 < (RAND_MAX / 6) * 6);
die %= 6; // safe to do this now

I don't quite get the code. All I see is the while loop assigning die1 either RAND_MAX or
RAND_MAX - 1 when it terminates. On my compiler, this would be either 32767 or 32766. This is
done albiet very inefficently and flirts with indefinite postponement.

Then die1 %= 6 (I belive you meant die1) is assigned either 0 or 1.

I probably don't see what you meant.

**laserlight** · 07-27-2010

Well, it looks like you caught two mistakes: firstly, (die1 < (RAND_MAX / 6) * 6) should be (die1 >= (RAND_MAX / 6) * 6), and then there's the die which should be die1.

**iMalc** · 07-27-2010

First off, a whoops on the untested code, secondly well done on finding the bugs guys!
Does it make a bit more sense now?:

Code:

do {
    die1 = rand();
} while (die1 >= (RAND_MAX / 6) * 6);
die1 %= 6; // safe to do this now

And don't worry about the possibility of this looping several times. The average number of loops would be something less than 1.0001 times, and the chances of it looping again get rapidly smaller. With typicaly rand() it would probably not even be capable of looping even as much as five times.

**dnj23** · 07-28-2010

Originally Posted by iMalc

The problem is that here N is 6 which does not divide evenly into RAND_MAX. Whilst that method is good enough for most purposes, it is not statistically unbiased. A correct way to do it is to replace the rand() % N with the rejection method.

Now I understand what you guys are talking about.

rand() % 6 produces 1 too many 0 and 1 outcomes, with the totals being:

0 : 5462
1 : 5462
2 : 5461
3 : 5461
4 : 5461
5 : 5461

total = 32768

your method restores equality:

0 : 5461
1 : 5461
2 : 5461
3 : 5461
4 : 5461
5 : 5461

total = 32766

Thanks for that insight. I would of never have figured that out. I never gave any thought to RAND_MAX and took it
for granted that scaling rand() with % n just worked and did its magic.

Although this accounted for some of the error, I am still getting incorrect figures. I was, however, able to get
correct figures coding the dice rolling function differently. Consider the following two functions:

Code:

int roll_dice(void)
{
  int die1, die2;
  
  do {
    die1 = rand();
  } while (die1 >= (RAND_MAX / 6) * 6);
  
  die1 = die1 % 6 + 1; 

  do {
    die2 = rand();
  } while (die2 >= (RAND_MAX / 6) * 6);
  
  die2 = die2 % 6 + 1; 

  return die1 + die2;
}

Code:

int roll_dice2(void)
{
  /* This produces more accurate dice sum distribution: */

  int dice, sum[36] = { 2,3,3,4,4,4,5,5,5,5,6,6,6,6,6,7,7,7,7,7,7,
	                    8,8,8,8,8,9,9,9,9,10,10,10,11,11,12 };
  do {
    dice = rand();
  } while (dice >= (RAND_MAX / 36) * 36);

  return sum[dice % 36];
}

They both do the same thing - return the sum of two random dice, but with different methods. The first function randomizes two separate die values and returns their sum, the second function randomizes a summation selection (randomly picks 1 valid sum from all possible sum values initialized in an array). The sum values are also repeated in correct amounts to model the probability occurrences correctly.

For some reason, the second function models the probability distribution for the sum of two dice better. It consistently outperforms the first function with 1 billion calls.

**Adak** · 07-28-2010

I'm not sure just what you want, but a perfectly even distribution, and a random distribution are not the same thing, at all.

Although a "generally even" distribution would be expected from a large number of random outcomes, a "perfectly even" number of outcomes for each choice, would be quite rare.

It looks like you have a perfectly even distribution, and maybe that's what you want. If you want to confirm what I'm asserting, just pick up some real die, and roll some reasonable number of throws, tracking the results just like you have here in your post.

You'll see just what I mean. The total of the six values will rarely all be equal. Unless your die's are loaded, of course.

**iMalc** · 07-29-2010

I think that now the difference comes down to the weakness of rand() itself in two ways:

The first one generates more calls to rand() and this shortens the period of the PRNG. Instead of producing almost 2^32 results before the sequence repeats, it will be less than 2^31.

Secondly, rand() calls probably aren't completely independent. I.e. if the first rand() call gives a particular value, then of all the possible values for the next call, some of them may now be more likely than others. A better PRNG such as mersenne twister would be far less vulnerable to that kind of statistical anomaly.

By reducing the number of calls to rand(), you minimise both of these problems. It could also be an effective optimisation. I'm actually rather impressed with that solution - Well done!

**dnj23** · 07-29-2010

Originally Posted by Adak

You'll see just what I mean. The total of the six values will rarely all be equal. Unless your die's are loaded, of course.

Understood, what I mean is the probability of occurrence for each of the six values should be accurate up to a certain decimal place, given a sufficient number of rolls.

Originally Posted by iMalc

I think that now the difference comes down to the weakness of rand() itself in two ways:

The first one generates more calls to rand() and this shortens the period of the PRNG. Instead of producing almost 2^32 results before the sequence repeats, it will be less than 2^31.

Secondly, rand() calls probably aren't completely independent. I.e. if the first rand() call gives a particular value, then of all the possible values for the next call, some of them may now be more likely than others. A better PRNG such as mersenne twister would be far less vulnerable to that kind of statistical anomaly.

I will experiment around. Thanks for your input iMalc, you've been very informative.

**User Name:** · 07-29-2010

If ur on a *nix, /dev/random is about as random as it gets. It uses random environmental factors such as radio noise, to generate random numbers. It works like a file, just open it and read sizeof(type) bytes from it. random(4): kernel random number source devices - Linux man page

Thread: need random number generator

Thread Tools

Search Thread

Display

need random number generator

Similar Threads

Issue w/ Guess My Number Program

Good Random Number Generator

How do I restart a random number sequence.

how to link random number generator with cpu?

Seeding Random Number Generator