Random Number

**Prelude** · 05-27-2004

Originally Posted by elad

swoopy, I'm not sure why Preludes indicates that rand()/RAND_MAX is less likely to give a string of equal values on repeated values than rand()/modulus, either; but it clearly isn't the reason I initially came up with, as you pointed out. Maybe the problem is I'm misinterpretting what Prelude wrote in the first place.

The problem is/was in the random number generator. The low order bits (which modulus uses) tend/tended to be far less random than the higher order bits (which division by RAND_MAX uses). It's basically a hack to get around the limitations of rand, a function that doesn't require the algorithm to be good.

**elad** · 05-27-2004

I'm sure there are more curtains to get to the genie on this one. Maybe it will lead to realms really unintelligible by us mortals, instead of just flirting on the fringes like this one, but I can't help from trying to turn the next curtain to see what's there.

Prelude--So, to try to restate your answer to see if I can even come close to understanding it: rand() uses one protocol to generate the high order bits of the number it returns and second protocol to generate the lower order bits of the number it generates. The former protocol tends to generate more random results than does the latter. That seems fair enough. Now the part that's confusing me. Are you saying that modulus uses a protocol to generate it's results and it makes use of a different set of bits in the divisor and modulant(?sp) than the bits used by division (by RAND_MAX)? Doesn't modulus use division on the entire number, like I do, to get the result it returns? And, for that matter, doesn't division use all the digits (bits, whatever) of the dividend and divisor to generate it's answer, too? I've had to deal with enough black boxes in my days, so if you say, "yes, and trust me on this, you don't want to go there", I will. But at the moment I'm curious to see what's behind the next curtain.

**Prelude** · 05-28-2004

Originally Posted by elad

I'm sure there are more curtains to get to the genie on this one. Maybe it will lead to realms really unintelligible by us mortals, instead of just flirting on the fringes like this one, but I can't help from trying to turn the next curtain to see what's there.

Prelude--So, to try to restate your answer to see if I can even come close to understanding it: rand() uses one protocol to generate the high order bits of the number it returns and second protocol to generate the lower order bits of the number it generates. The former protocol tends to generate more random results than does the latter. That seems fair enough. Now the part that's confusing me. Are you saying that modulus uses a protocol to generate it's results and it makes use of a different set of bits in the divisor and modulant(?sp) than the bits used by division (by RAND_MAX)? Doesn't modulus use division on the entire number, like I do, to get the result it returns? And, for that matter, doesn't division use all the digits (bits, whatever) of the dividend and divisor to generate it's answer, too? I've had to deal with enough black boxes in my days, so if you say, "yes, and trust me on this, you don't want to go there", I will. But at the moment I'm curious to see what's behind the next curtain.

There are good algorithms and bad algorithms. The bad algorithms are typically linear congruential generators because good constants are hard to find. The problem is that the low order bits of the random number returned tend to repeat more often than the high order bits.

The desired effect of modulus for rand() % N is to reduce the value returned by rand to the range of [0..N-1) by calculating the remainder of division by N. The division effectively shifts the high bits of the value into nothingness until the value is within the requested range:

Code:

#include <iostream>
#include <cstdlib>

using namespace std;

void bits(unsigned int val);

int
main()
{
  int i = RAND_MAX;

  while (i) {
    bits(i /= 2); // The definition of bits is in a later example
    cout<<endl;
  }
}

Now, when the low order bits repeat often then a value of N in rand() % N that's a power of two will use those exact bits as the reduced value. For example, if N = 2 and the lowest bit is 0, 1 and 0 for a given sequence of three, the final value will be 0, 1 and 0, respectively. If the lowest bit repeats infinitely between 0 and 1 (a situation that does happen as I'll show in a moment) then the random numbers of rand() % 2 will not be random. They'll switch between 0 and 1 giving sequences of 0,1,0,1,0,1,0,1,...

Why division works instead of modulus is a little more complicated. When you divide a number returned by rand by RAND_MAX in floating-point context, the result is a number between 0 and 1 with the pseudo-randomness in the precision, but no obvious repetition in the value (exercise: Why?

). Then by multiplying by N, you force the value into the range you desire by effectively shifting the values left into integral range. The effect is the same even for values of N that are powers of 2.

The following code is an example program that prints the repeating values of modulus using a poor linear congruential generator with the bits (lowest order to highest order) of the unchanged random number followed by the pseudo-random values of division without modulus and the bits of the original value. The code also describes another way to use the high order bits without casting to double. (exercise: How does it work?):

Code:

#include <iostream>
#include <limits>

using namespace std;

namespace {
  const int RAND_INT_MAX = numeric_limits<unsigned short>::max();
}

void bits(unsigned int val);
int  rand_int(unsigned int& seed);

int
main()
{
  unsigned int seed1 = 1;
  unsigned int seed2 = 1;
  int          base = 2;

  for (int i = 0; i < 20; i++) {
    int r1 = rand_int(seed1);
    int r2 = rand_int(seed2);
    cout<< r1 % base <<':';
    bits(r1);
    cout<<'\t'<< r2 / (RAND_INT_MAX / base + 1) <<':';
    bits(r2);
    cout<<endl;
  }
}

void
bits(
  unsigned int val
  )
{
  int bits = numeric_limits<char>::digits + 1;
  int int_bits = sizeof(unsigned short) * bits;

  for (int i = 0; i < int_bits; i++) {
    cout<< !!(val & 1);
    val >>= 1;
  }
}

int
rand_int(unsigned int& seed)
{
  seed = seed * 1103515245 + 12345;
  return seed % (RAND_INT_MAX + 1);
}

**elad** · 05-28-2004

>>The bad algorithms are typically linear congruential generators because good constants are hard to find.

I knew there was a technical explanation in there somewhere! To me this sounds as otherworldly as when I use a phrase like "Gastrointestinal dysfunction due to Dibothriocephalous is mainly due to altered peristalsis." It can sure be interesting when someone explains the inner workings of something, however. Thanks.

I'll have to play with the bit shifting stuff later. Not my favorite pastime and not something I can do during idle moments at work, but I'll give it a shot later.

**JasonD** · 05-30-2004

Another problem with using % is that some numbers, such as 6 in the case of a dice game, are not divisible into RAND_MAX+1 (the number of possible return values from rand() ). This means the distribution of numers returned will not be even. If you're hoping for 6's as often as 1's, then forget it. Is this significant to worry about? Perhaps not... but if you had money riding on it, you would care.

The best random number generator I have found is the Mersenne Twister. It has a period of 2^19937-1, and you can seed it with any number of bits. This means there are more than 2^32 possible unique sequences to be generated, as is the case with a 32-bit seed. I am using it in my current project, and it works well.

**Davros** · 05-30-2004

>I don't understant why rand() is PSEUDO-number generator,

Because it is not really random. The numbers produced are deterministic, in that the same sequences can be repeated given knowledge of the seed used to start the generator.

With sufficient knowledge, future outputs from a pseudo-random generator can be predicted, and given that they are predictable, they cannot be random.

However, for most purposes, the ability to produce numbers which appear to be random is sufficient.

**Thantos** · 05-31-2004

During lab time for my ASM class we had an interesting converstation about this. We came to the realization that nothing is random. We thought long and hard and even went so far as to look at electron and molecular movement and we still couldn't find a truely random event. As such there must not be a way to generate a truely random set of numbers

**JasonD** · 05-31-2004

Originally Posted by C+++C_forever

ok thanks

, but how are they predictable? i mean, how could this function be implemented so that it gives predictable results?

Give me the same random number generator and the same starting seed as you use, and I can tell you what numbers will appear when. Imagine if a VLT used such a method, and you found out how it generated numbers? Sometimes knowing the actual generator and seed is not important - if the next number generated uses only the last number as a seed, then the maximum period of such a generator is limited to the number of different numbers the variable holding this number/seed can store. It may even be shorter. This means you could predict future events just by looking at the numbers themselves.

**Davros** · 05-31-2004

>ok thanks , but how are they predictable?

Given a sufficiently large list of numbers from a pseudo random generator, it is possible to predict what is coming next without explicit knowledge of the algorithm or the seed. I believe this can be done by plotting the values in 'attractor space'.

>i mean, how could this function be implemented so that it gives predictable results?

Any deterministic random number generator is predictable, i.e. rand(). There are devices which you can plug into your computer in order to generate non-deterministic random numbers. I understand that these devices used either radioactive decay or background radio noise as the source of the random number generation.

**JasonD** · 05-31-2004

Originally Posted by Davros

There are devices which you can plug into your computer in order to generate non-deterministic random numbers. I understand that these devices used either radioactive decay or background radio noise as the source of the random number generation.

Take a look at this site:
http://www.lavarnd.org/

**JasonD** · 05-31-2004

Originally Posted by C+++C_forever

look, i understand that it is predictable and not random. The question is WHY???????? How do they write they function?

i mean, is it like this ?

int rand() {
return seed * 2;
}

? thanks

That would be a very simple one that doesn't work well, but, yes, that is the general idea (it should also store the new result back into seed, however). It is a formula that uses the last number generated (which is the seed if none have been generated, yet) to produce a new value.

Here is a simple one I made very quickly, in a matter of minutes, a while back to temporarily take the place of an improved generater that would be implemented later:

Code:

// Note: _rotl and _rotr are OS specific
// The decimal constants are prime numbers.
// The hexadecimal constants are meant to have a somewhat similar number of 1's and 0's in binary.
m_value = ((m_value * 5653   ^ _rotl(m_value,12)) + 710459 * _rotr(m_value,13) + 0xB82A4D7E ) ^ 
	((m_value * 148157 ^ _rotl(m_value,23)) + 15881  * _rotr(m_value, 7) + 0x82F54EC5 )
	+ 0xC85B4A18; // using 0xC85B4A19 here repeats MUCH more quickly

Notice that m_value computes a new value from itself and stores it to itself.

For a much better, and much more complicated, implementation of a random number generator, take a look at the Mersenne Twister page I linked to above.

**Davros** · 05-31-2004

>Take a look at this site:
http://www.lavarnd.org/

The LavaCan is Interesting. However, I recall seeing tiny devices which plug into the RS232 port and cost around $20 from RS Components.

**pianorain** · 06-02-2004

Originally Posted by Thantos

As such there must not be a way to generate a truely random set of numbers

One of my math professors kept a book called "The List of Random Numbers". The entire book (some 400 pages) was a chart of random numbers. I guess it didn't occur to the writer that once the numbers were ordered, they were no longer random. Must have been pretty simple to write the book. I'm pretty sure most of the people that frequent this board could write a program to write a sequel for it in under 20 minutes.

**JasonD** · 06-02-2004

The book's contents may be truly random numbers, like those obtained by throwing dice, rather than pseudo random numbers generated by a computer. Basically, one set can be reproduced exactly, the other can not. I would assume a book would use a set that can not. I think that they would still be classified as random numbers, as you can start anywhere in the book, and as long as you don't already have the list memorized, you cannot determine which number will appear next - just like the die.

And even if we live in a deterministic universe, it would be very difficult to set up the environment - the entire universe - exactly as before to achieve the same results. In fact, we would not even realize that we were repeating the experiment as we would all have the same thoughts and memories as the first time.

**anonytmouse** · 06-02-2004

If you need a 'better' random number on windows check out CryptGenRandom().

CryptGenRandom documentation

The data produced by this function is cryptographically random. It is far more random than the data generated by the typical random number generator such as the one shipped with your C compiler.

This function is often used to generate random initialization vectors and salt values.

Software random number generators work in fundamentally the same way. They start with a random number, known as the seed, and then use an algorithm to generate a pseudo-random sequence of bits based on it. The most difficult part of this process is to get a seed that is truly random. This is usually based on user input latency, or the jitter from one or more hardware components.

With Microsoft CSPs, CryptGenRandom uses the same random number generator used by other security components. This allows numerous processes to contribute to a system-wide seed. CryptoAPI stores an intermediate random seed with every user. To form the seed for the random number generator, a calling application supplies bits it might have—for instance, mouse or keyboard timing input—that are then added to both the stored seed and various system data and user data such as the process ID and thread ID, the system clock, the system time, the system counter, memory status, free disk clusters, the hashed user environment block. This result is SHA-1 hashed, and the output is used to seed an RC4 stream, which is then used as the random stream and used to update the stored seed. If an application has access to a good random source, it can fill the pbBuffer buffer with some random data before calling CryptGenRandom. The CSP then uses this data to further randomize its internal seed. It is acceptable to omit the step of initializing the pbBuffer buffer before calling CryptGenRandom.