# Thread: My random numbers all seem to factors of 7!

1. ## My random numbers all seem to factors of 7!

I've been meaning to learn C for a long time, so I started with a simple montecarlo simulation of the premium bonds. I couldn't work out how to generate random numbers bigger than RAND_MAX, so split my random numbers win/lose, and then win how much. Something seemed wrong, because I wasn't seeing the same results I got in other languages, so I added a count of the random numbers generated:

This is code I wrote (on MacOS Mohave using CLANG to compile):

Code:
```#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>

int cmpfunc (const void * a, const void * b) {
return ( *(int*)a - *(int*)b );
}

int main()
{
int mybonds, period, numberoftrials, ernie, prize, win, i, j, k, m;
int winchance = 24500;
int totalprizes = 3513356;
int sum = 0;
static int randomnumbers;
memset(randomnumbers, 0, sizeof(randomnumbers));

srand(time(0));

printf("Number of bonds? : ") ;
scanf("%d", &mybonds) ;
printf("Number of months? : ");
scanf("%d", &period);
printf("Number of trials? : ") ;
scanf("%d", &numberoftrials) ;
int *results = (int*) malloc(numberoftrials * sizeof(int));

for (i = 0; i < numberoftrials; i++) {
results[i] = 0;
for (m = 0; m < period; m++) {
for (j = 0; j < mybonds; j++) {
if (rand() > RAND_MAX / winchance) {
results[i] += 0;
} else {
ernie = rand() % totalprizes;
randomnumbers[ernie] += 1;
if (ernie > 63374) {
results[i] += 25;
} else if (ernie > 35815) {
results[i] += 50;
} else if (ernie > 8256) {
results[i] += 100;
} else if (ernie > 2232) {
results[i] += 500;
} else if (ernie > 224) {
results[i] += 1000;
} else if (ernie > 103) {
results[i] += 5000;
} else if (ernie > 44) {
results[i] += 10000;
} else if (ernie > 19) {
results[i] += 25000;
} else if (ernie > 7) {
results[i] += 50000;
} else if (ernie > 1) {
results[i] += 100000;
} else {
results[i] += 1000000;
}
}
}
}
}
qsort(results, numberoftrials, sizeof(int),cmpfunc);
printf("\n");
printf("Worst result: %d\n", results);
printf("Best result: %d\n", results[numberoftrials - 1] );
printf("Median result %d\n", results[numberoftrials / 2 + 1] );
for (k = 0; k < totalprizes; k++) {
sum += randomnumbers[k];
if (randomnumbers[k] > 0) {
printf("%d : %d\n", k, randomnumbers[k]);
}
}
printf("number of winners = %d\n ", sum);

free(results) ;
return 0;
}```
And this is the output:

Number of bonds? : 50000
Number of months? : 12
Number of trials? : 10000

Worst result: 225
Best result: 100725
Median result 625
7 : 3
14 : 2
21 : 3
28 : 3
175 : 3
182 : 3
189 : 2
196 : 2
203 : 3
350 : 3
357 : 2
364 : 3
371 : 3
525 : 3
etc, etc
2440698 : 12245
2457505 : 13585
2474312 : 114779
2491119 : 802
number of winners = 657216

I haven't checked every single winning number, but so far every one I've picked is a multiple of 7.

I did try putting in a new srand(time(0)) seed before the second rand(), but this just dramatically reduced the number of random numbers generated. Where have I gone wrong? 2. The normal rand function is more random in the higher bits than in the lower bits.

Your logic is testing the lower bits only!

Tim S. 3. Thanks Tim,
It's still behaving oddly, but in a different way, so I may not have implemented your suggestion properly.
I changed line 38 to
ernie = (rand()>>9) % totalprizes;
but repeated trials are again all producing the same random numbers as each other (despite seeding srand with time), although they're no longer multiples of 7.
I also tried masking the lower order bits with ernie = (rand() & 0x1FFFE0000) % totalprizes; That produced very similar results.
David 4. Code:
`ernie = (rand()>>9) % totalprizes;`
I would try
Code:
`ernie = (rand()>>4) % totalprizes;`
Tim S. 5. But, you will likely have to live with bad random results.

NOTE: There are other random functions. Maybe the others people can point to a better one.

Tim S. 6. Code:
```                if (rand() > RAND_MAX / winchance) {
results[i] += 0;
} else {
ernie = rand() % totalprizes;```
The attempt to somehow filter the numbers is a complete waste of effort.
The rand() you use in the else has no relation to the rand() you tried to filter in the if.

As opposed to say
Code:
```                int r = rand();
if (r > RAND_MAX / winchance) {
results[i] += 0;
} else {
ernie = r % totalprizes;```
What is the value of RAND_MAX on your machine?

When I compiled your code, I got a nice even distribution anyway.
Code:
```\$ awk 'NR <= 4 || \$NF > 3' tmp.txt
Number of bonds? : Number of months? : Number of trials? :
Worst result: 200
Best result: 50850
Median result 625
844969 : 4
1017408 : 4
number of winners = 245358```
Your basic libc rand() is barely good enough for student dice or card playing homework.

Any serious use needs better generators.
Mersenne Twister - Wikipedia
which is available in
GNU Scientific Library - Wikipedia 7. Originally Posted by stahta01
The normal rand function is more random in the higher bits than in the lower bits.

Your logic is testing the lower bits only!
We're talking about rand() % 3513356 though, which means the lower 21 to 22 bits, so I must admit that I found it hard to believe that the PRNG chosen could be so bad. Therefore, I did a bit of investigation... Originally Posted by onewire
This is code I wrote (on MacOS Mohave using CLANG to compile):
I happen to have the same setup on my work laptop (although we don't actually program our projects in C), and was curious so I decided to test, and basically got similiar results. I simplified your program down to this test program:
Code:
```#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define winchance 24500
#define totalprizes 3513356
#define THRESHOLD 127773

int main(void)
{
srand(time(0));
int div7_count = 0;
int not_div7_count = 0;
for (int i = 0; i < INT_MAX; ++i) {
if (rand() <= THRESHOLD) {
int n = rand() % totalprizes;
if (n % 7 == 0) {
div7_count++;
} else {
not_div7_count++;
}
}
}
printf("%d divisible by 7; %d otherwise\n", div7_count, not_div7_count);
printf("%d\n", RAND_MAX / winchance);
return 0;
}```
The typical results that I got were reasonably in the range of:
Code:
```127772 divisible by 7; 0 otherwise
87652```
Increasing THRESHOLD resulted in a non-zero not_div7_count. So, it looks like for the particular rand() implementation chosen for the standard library implementation of this particular version of clang (Apple LLVM version 10.0.1 (clang-1001.0.46.4)) on this particular platform, whenever you have a pseudorandom number that is sufficiently small (less than 40000000 or so), the next pseudorandom number generated modulo 3513356 will more likely be divisible by 7, such that when the previous pseudorandom number is no greater than 127773, the next pseudorandom number modulo 3513356 will almost certainly be divisible by 7.

What you could do is to avoid having two rand() calls per iteration, hence avoiding this weakness in the PRNG, at least to some extent. You would also make it less likely to exhaust the period of the PRNG.

Nonetheless, I agree with stahta01: you're probably much better off just using a known PRNG for Monte Carlo rather than trying to save yourself from the problems of a poor rand() implementation. As for which PRNG to choose: I tend to be fond of the Mersenne Twister family of PRNGs because that was what I was introduced to with PHP nearly two decades ago, and its Wikipedia article can provide a starting point for more modern algorithms that you might want to try.

Additionally, I suggest that instead of a magic number like 24500 for winchance, express winchance as a probability (either as a double, or as an integer percentage out of 100). 8. Thank you everyone for all the help. I can see I'm going to have to learn a little more and work out how to "borrow" someone else's random number generator. Popular pages Recent additions ernie, int, number, result, results[i] 