Originally Posted by
johnggold
Firstly, you will find it wasn't me that cast the first stone - I've been defending.
The problem has come about in that you appeared to be defending the randomness of the implementation you posted. It was perfectly sensible to correct you there.
Originally Posted by
johnggold
This is all getting out of hand. I can never understand why shouting and abuse is necessary or desirable (imalc).
There was no shouting at any point. Capitals indicates shouting. I bolded a mere 3 words for emphasis where it was due, that's it. I shall switch to using underline for emphasis.
It would be helpful to me if you could point out what I said that you found to be the most abusive and I shall endevaour to censor myself moreso in future.
Originally Posted by
johnggold
I take it that you used the routine recursively
You're probably using the word "recusively" incorrectly there. A recursive function is one that calls itself "recursively". Inductively a random number here is based on the previous, which is based on the one before that etc, but this does not entail recursion. You simply mean "repeatedly", and yes all of us here know how to seed a PRNG once and only once, prior to generating several random numbers.
It turns out that you're actually not defending the randomness of that method, you're actually defending its speed and balance. I do not doubt its speed in the slightest. Balance is not a measure of randomness, in fact given that it is at odds with independence of outcomes, it is effectively a measure of non-randomness.
I have no problem with you defending how balanced it is, but you need to acknowledge that you are therefore not defending how random it is. In effect you are virtually defending how non-random it is! (That's fine, it simply means we're both right)
Yes a Mersenne twister requires more operations than an LCRNG and hence is a little slower. Not much slower, but certainly a little. I don't think I've ever personally needed to use it because I understand how to use rand() effectively, and avoid producing bias where it matters. I would only use something better when randomness is much more important than normal. Enough said.
In case you stil aren't convinced about the non-randomness of the code you posted, the following is a technical discussion which gives you some experiments you can run which demonstrate many of the flaws in its actual randomness.
Here is another LCRNG. This is meant to be compiled on a system where an int is 32-bit:
Code:
unsigned int holdrand;
void srand(unsigned int seed) {
holdrand = seed;
}
int rand(void) {
return (((holdrand = holdrand * 214013L + 2531011L) >> 16) & 0x7FFF);
}
As you can see it contains the same multiplication, addition, and bitmask operations. It appears slightly obduscated because it's all written in one line, but we wont worry about that as we're all C or C++ programmers and can tell what it does regardless. I didn't write this it's taken from the web. Search for those two magic numbers and you'll find hundreds of copies of this all over the place.
Because holdrand is 32-bits and the magic numbers are carefully chosen, this generator cycles through 2^32 internal states. Because it too only generates numbers less than 0x7FFF each time, you can tell that even without reducing the range of the result, that when it produces a specific value e.g. 1234 that whatever number follows it can be different from what followed it last time. Because the LCRNG you posted only has 2^15 internal states, if it outputs 1234 followed by 567 (for example), then next time it generates 1234, it will again be followed by 567. This is why a longer sequence is more important for randomness. The more you subsequently reduce the range the more it reduces the noticeability of this pattern of course, but if you want numbers in a large range, this effect becomes highly noticeable.
Some 64-bit systems may use a LCRNG with a 64-bit internal state, this is somewhat superrior again as it allows for generating more bits of randomness using the same number of CPU operations, just with larger constants etc.
Another thing I mentioned earlier is the method of reducing the range to what the caller wants. Your method uses this technique, though in fact you've unfortunately built it into the same method:
Code:
int random(int min, int max)
{
int range = max-min+1;
return min + rand() % range;
}
You haven't indicated that you are familar with the rejection technique, so here is an implementation of it:
Code:
int random(int min, int max)
{
int range = max-min+1;
int cutoff = (RAND_MAX / range) * range;
int val;
do {
val = rand();
} while (val >= cutoff);
return min + val % range;
}
Yes it's clearly not faster. It's not supposed to be. It simply ensures that the resulting distribution is uniform. This is for where accuracy of the distribution is more important than speed. If you want to see the difference, try generating 1,000,000 random values in the range 1 to 10000 and then produce a histogram of the results. This easily shows how vastly non-uniform the first method is.
The implementation you posted is also missing the bit shift of a slightly more modern LRCNG. It is well known that the lower bits are less random, a clear source of non-randomness in that implementation. Given it is for a 16-bit system however, this is understandable, since there aren't really enough bits to throw away the less random ones. If you take the least-significant bit of the result of your method (after stripping away the code for reducing the range, or generating numbers with a range of 0 to 32767), you will see a very predicatable pattern. The code I posted does not suffer from this anywhere near as much.
There's at least three concrete bits of evidence as to why the algorithm you posted is significantly less random than something a little bit more modern. I already knew all of this very well before you posted.
Rather than going into automatic defense mode you need to realise that of all the people out there on the net, you will eventually run into someone who does know more about a particular topic, and you should be prepared to listen and take stuff in occasionally. Of course you can expect someone else's ego to take a hit when you immediately state that someone else does not have your experience, expecially when jumping to that conclusion based on such a small initial post.
Lastly, I refer you to this page Statistical randomness - Wikipedia, the free encyclopedia which contains several types of statistical randomness tests that can very easily spot the nonrandomness in the generator you posted. Due to the method of range reduction used, your code does not even pass the basic frequency test.
In this case though, your code fails a simple visual inspection according to the knowledge I already have, and using such tests is unnecessary overkill.