Thread: random isn't really random?

  1. #46
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    All pseudo random generators have a finite cycle length.

    Let's say, this deliberately short sequence.
    1 5 9 3 2 8 4 7 6

    So you call srand(0) and you get 1 5 9 3 2 8 4 7 6
    If you did srand(4), you'd get 2 8 4 7 6 1 5 9 3

    But if you keep changing the seed too often, what you do end up doing is risking landing on some sub-sequence you've already used. This most definitely isn't random.
    Lets say 1 5 9 3 2, then you call srand() again, and get 9 3 2 8

    Sure if you're only getting 10000 entries out of a potential sequence of 2^32, then the chance is minimal (but not zero).

    Any decent run-length from a good random number generator (which rand() isn't) is going to be just as good as any other sequence. Calling srand() multiple times at BEST does nothing useful, and has a number of potential pit-falls which are not easy to predict (or detect).

    > here's what I got my re-seeding srand() everytime with the output of the MT19937 prng:
    Why are you messing with rand() when you've got MT to play with?

    The standard library rand()/srand() functions are essentially useless for serious statistical or cryptographic work.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  2. #47
    Registered User
    Join Date
    May 2008
    Posts
    81
    Quote Originally Posted by Salem View Post
    But if you keep changing the seed too often, what you do end up doing is risking landing on some sub-sequence you've already used. This most definitely isn't random.
    Lets say 1 5 9 3 2, then you call srand() again, and get 9 3 2 8
    that's a great point. notice though, that you only risk running same "sub-sequnce" if you accept multiple outputs for each seed. however, if you're re-seeding randomly for 'each' output, there is no risk of ending up with a previously received sequence, since you effectively reset srand for each output.

    so its not: srand() -> 1 5 9 3 2, "then you call srand() again" srand() -> 9 3 2........
    but: srand() -> 1, srand() -> 8, srand() -> 2, srand() -> x,........

    Quote Originally Posted by Salem View Post
    Any decent run-length from a good random number generator (which rand() isn't) is going to be just as good as any other sequence. Calling srand() multiple times at BEST does nothing useful, and has a number of potential pit-falls which are not easy to predict (or detect).

    > here's what I got my re-seeding srand() everytime with the output of the MT19937 prng:
    Why are you messing with rand() when you've got MT to play with?
    the reason why I was using this rather unusual combination of MT and rand() was in the hope that the combination would somehow 'compound' the randomness.


    in conclusion:

    Code:
    for(i=0;i<999,999;i++)
    {   srand(MT.out);
        finalout = rand();
    }
    is MAYBE, but probably not better than:
    Code:
    final_out = MT.out;
    but is definately no LESS effective than:
    Code:
    srand(MT.out);
    for(i=0;i<999,999;i++)
    {   
         finalout = rand();
    }
    and all of the above are definately better than (as stated in example from above):
    Code:
    for(i=0;i<10;i++)
    {
      srand(MT.out);
    
      for(j=0;j<99,999;j++)
            final_out = rand();
    }
    Last edited by shawnt; 06-09-2008 at 03:17 PM.

  3. #48
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    So you said it yourself - just skip rand() altogether. Use the random number from your other generator.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  4. #49
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    You say seeding srand with MT.out and then calling rand is MAYBE better than just using MT.out. But you haven't proven that it is not worse either.

    I actually don't know the internals of pseudo-random number generators myself, but logic tells me that using them the way they were intended to be used is most likely the best choice, especially when others more knowledgeable on the subject than I are saying the same thing.

    Your logic is that it might be better because it might make the output more random. I'm still curious as to what exactly you mean by that.

  5. #50
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Since you are SO interested, here is how one rand implementation is done:
    Code:
            _ptiddata ptd = _getptd();
            return( ((ptd->_holdrand = ptd->_holdrand * 214013L
                + 2531011L) >> 16) & 0x7fff );
    srand simply sets ptd->_holdrand to the value you seed it with.
    So in essence, calling srand destroys the current uniqueness of the number.
    So does this say if using rand() is a good idea?
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  6. #51
    Registered User
    Join Date
    May 2008
    Posts
    81
    Quote Originally Posted by Daved View Post
    You say seeding srand with MT.out and then calling rand is MAYBE better than just using MT.out. But you haven't proven that it is not worse either.

    I actually don't know the internals of pseudo-random number generators myself, but logic tells me that using them the way they were intended to be used is most likely the best choice, especially when others more knowledgeable on the subject than I are saying the same thing.

    Your logic is that it might be better because it might make the output more random. I'm still curious as to what exactly you mean by that.
    to be fair, i have no solid reasons for believing that a coupling of PRNGs will yield better results. its just a half-hearted assumption that prng * prng = better prng.

    i also agree that I haven't proven that multiple seeding of srand() with MT.out is not worse. it just worked well for me, thats all.

    what i was intrigued about was why there are seemingly vehement claims that multiple srand() calls would be detrimental. iMalc pointed out a possible reason and Salem gave an example for the same, but that applies for an implementation slightly different than mine (please see my last post).

    EDIT: i am aware that you had previously questioned what i meant by the phrase "more random". i had addressed that here.
    Last edited by shawnt; 06-09-2008 at 03:58 PM.

  7. #52
    Registered User
    Join Date
    May 2008
    Posts
    81
    Quote Originally Posted by Elysia View Post
    Since you are SO interested, here is how one rand implementation is done:
    Code:
            _ptiddata ptd = _getptd();
            return( ((ptd->_holdrand = ptd->_holdrand * 214013L
                + 2531011L) >> 16) & 0x7fff );
    srand simply sets ptd->_holdrand to the value you seed it with.
    So in essence, calling srand destroys the current uniqueness of the number.
    So does this say if using rand() is a good idea?

    i'm sorry, i don't see how re-seeding srand() with a 'random' seed 'each' time can possibly cause biased data. if anything, a single seed is worse than random multiple seeds since all prngs have a natural bias, so a single seed would compel srand to adhere to that bias, whereas multiple seeding with a random seed provides some room to decrease that bias, by way of resetting the prng repeatedly.


    if there are other potential harmful effects of multiple srand() calls (like compiler instability), then that would justify a single srand call per program.

  8. #53
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    I don't know that you've quite got the point. Every PRNG is designed to give you N call-it-random numbers in a row, starting from the seed value, where N is very large. You are taking those N numbers, and throwing away N-1 of them, essentially un-randomizing. There is no way that rand(), the way you're using it, can add any randomness to your list of numbers, because you have thrown away all the effects rand could have. Calling it a bunch of times gives "random" results -- calling it once only gives predetermined results. (Now if you went through the N numbers and then picked a new random seed, you could maybe have something.)

  9. #54
    verbose cat
    Join Date
    Jun 2003
    Posts
    209
    It sounds to me like when you say "bias", you mean how easily someone can figure out the next number the prng will generate. I apologize if I misunderstand, but your own definition of more random seems to support this.

    Quote Originally Posted by shawnt View Post
    i'm sorry, i don't see how re-seeding srand() with a 'random' seed 'each' time can possibly cause biased data.
    By doing this, you are effectively only applying a Caesar Cipher like effect to each output of the MT prng. Instead of simply shifting each number by the same amount each time, it transforms each MT.out with the rand() algorithm (I don't know if this is just a variation of a Caesar Cipher or if it has its own name, as I'm not a cryptography expert). This may make it more difficult to figure out, but if this algorithm is known then you are still left with only needing 624 "random numbers" before the person trying to figure it out can find where they are in the MT sequence. They just have to undo what srand(MT.out) followed by rand() does to the result of each MT "seed", and then it is a simple matter of applying what rand() will do to what they know will be the next MT.out to get what your next number will be, and your effort is broken.

    If you use the output of MT as the seed for a group of rand() numbers, you will effectively increase the number of elements one has to observe before they can predict the next numbers, but you will also be potentially destroying the "randomness" (the distribution) as stated earlier.

    Say you use MT.out as the seed for 10 rand() numbers. You are increasing the number of elements a person has to observe to 6,240. But knowing the algorithm, they need only look at every 10th number and then reverse what rand() is doing to get the MT.out number. Once they figure out where they are in MT, they can manually apply the rand() to the next 10 numbers themselves and they are predicting your sequence. If you know you will never need to generate that many numbers then this might be useful, assuming it doesn't bias the results in some other way that you are not expecting or that makes the effort pointless.

    "So then we'll use MT.out as the seed for a random number of rand() numbers" you might think. That would add one additional element of complexity, but again, if the person trying to predict the pattern knows the algorithm, it's not that much different from adding a Caesar Cipher; they just need to figure out the pattern for how many rand() calls depend on each MT.out seed, and then only look at that first number as above.

    "Ok then, I'll use a rand() as the seed for x number of MT's!" And then all you are doing is forcing the person to undo MT to get the rand() number (instead of undoing rand() to get the MT number), and you are back to square one. Regardless of how you do it, if your algorithm is known, it can be broken. It might take longer to figure out as you make it more complex, and the number of elements that must be output before the pattern is discovered might go up and up and up, but that's about the same thing that's going on with 64bit, 128bit, 256bit, 1024bit, etc. encryption. They make it more and more complex, but brute force will always win if given enough time.

    Now, if you aren't worried about the distribution and just want numbers that no one will be able to predict (even if you end up with the number 8 more often than any 3 other numbers combined for example), then these methods would probably suit you. If you dont' want that number 8 coming up that often though, you need to use the prng's as they are designed. Any tampering with a result of MT is the same as tampering with the MT algorithm itself.
    abachler: "A great programmer never stops optimizing a piece of code until it consists of nothing but preprocessor directives and comments "

  10. #55
    Registered User
    Join Date
    May 2008
    Posts
    81
    jessycat, you hit the nail on the head. that is exactly what I meant by "more random". you have also effectively given a very good explanation of the 'compound randomness' that i was sensing would be the result of combining MT19937 and srand/rand, viz:
    If you use the output of MT as the seed for a group of rand() numbers, you will effectively increase the number of elements one has to observe before they can predict the next numbers, but you will also be potentially destroying the "randomness" (the distribution) as stated earlier.
    the only part I disagree with (or fail to understand) is how would this "potentially (destroy) the 'randomness'". i understand that this would nullify the distribution over the period (by not allowing it to manifest, instead resetting it), but it wouldn't make the sequence more predictable (you said so yourself). could you please explain?

    i also fully agree with your analysis of the vulnerability of PRNGs or combinations thereof as applied to cryptology. since my application utilizes PRNGs for statistical simulation, the impenetrability of the PRNG algorithm is irrelevant for me.
    Last edited by shawnt; 06-09-2008 at 08:05 PM.

  11. #56
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Yes, yes. Shall we also debate the order in which the Fourteen Ships of Xarxoni are going to land on the southern continents and take us all away (under stasis, of course) to Vublizon 4?

  12. #57
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > its just a half-hearted assumption that prng * prng = better prng.
    But good prng * bad prng = bad prng.

    rand() SUCKS. In all the implementations I've ever seen, it's just a simple generator which will fail every Diehard test. MT on the other hand has such a huge cycle time and good distribution that it passes every statistical test for randomness (or so I believe). Read the blurb on the homesite to find out.

    It's only good enough for student homework dice rolling programs say. It should NEVER be used for anything which might be considered serious work.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  13. #58
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by shawnt View Post
    jessycat, you hit the nail on the head. that is exactly what I meant by "more random". you have also effectively given a very good explanation of the 'compound randomness' that i was sensing would be the result of combining MT19937 and srand/rand, viz:


    the only part I disagree with (or fail to understand) is how would this "potentially (destroy) the 'randomness'". i understand that this would nullify the distribution over the period (by not allowing it to manifest, instead resetting it), but it wouldn't make the sequence more predictable (you said so yourself). could you please explain?

    i also fully agree with your analysis of the vulnerability of PRNGs or combinations thereof as applied to cryptology. since my application utilizes PRNGs for statistical simulation, the impenetrability of the PRNG algorithm is irrelevant for me.
    As soon as you start using the phrase "more random" when you're talking about something based entirely on rand then you're increasing your chances of being the subject of an article on www.thedailywtf.com

    If there were any way the using extra calls to srand could make rand more random, then that functionality would simply be built into rand. by calling srand before each call to rand you're basically no longer generating a random number sequence, but are instead using a simple arithmetic calculation on the supplied value to translate that to another value, as the resulting random numbers will be based entirely on the seed given each time. That doesn't add randomness, it's can't possibly do that because it's entirely deterministic according to the value supplied to srand every time. All it does is add complexity, obfuscation, & code bloat, and also makes the resulting numbers no more random than the seeds supplied.
    You're basing the seed on Mersenne Twister so the results will be nicely random, but the ONLY randomness is coming from the Mersenne Twister, and the Mersenne Twister does not need help to be random.
    If you base the seed in such a horrible scheme solely on the current time, then you will probably get pretty awful results.
    Whatever you do, don't re-seed your Mersenne Twister every time as well, or again you're back to square one, with something useless again!

    Anyway, that's enough from me on this subject. You're new to psuedo random number generators. It's your responsibility to do the research and discover how these things are supposed to be used, and there are a vast number of sources out there with all this information.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  14. #59
    verbose cat
    Join Date
    Jun 2003
    Posts
    209
    Quote Originally Posted by shawnt View Post
    the only part I disagree with (or fail to understand) is how would this "potentially (destroy) the 'randomness'". i understand that this would nullify the distribution over the period (by not allowing it to manifest, instead resetting it), but it wouldn't make the sequence more predictable (you said so yourself). could you please explain?

    i also fully agree with your analysis of the vulnerability of PRNGs or combinations thereof as applied to cryptology. since my application utilizes PRNGs for statistical simulation, the impenetrability of the PRNG algorithm is irrelevant for me.
    If you are not worried about the "impenetrability", then you shouldn't care that MT only takes 624 numbers to figure out where in the *HUGE* sequence it is. There is no reason to try and make it "more random" unless you think someone is going to try and manipulate results based on being able to predict the next number (in which case you do care about the impenetrability).

    For a statistical simulation, it is more important that the distribution be even. An even distribution means that any given number in the range is as likely to be generated as any other given number in that range.

    A prng with even distribution run 1,000,000 times and producing numbers in the range 1 to 100 inclusive should produce each of those numbers about 10,000 times. An uneven distribution might show the number 8 appearing 47,000 times, the number 86 appearing 24,000 times, the number 1 only 400 times, etc. That would be Bad unless you carefully crafted your custom prng to do exactly that. Much like rolling a six-sided die however, you want each of those numbers to have an equal chance to be generated. This is what is meant by "destroying the randomness" since it introduces that "bias" toward certain results.

    Let MT do its thing and use its results as-is.
    abachler: "A great programmer never stops optimizing a piece of code until it consists of nothing but preprocessor directives and comments "

  15. #60
    Registered User
    Join Date
    May 2008
    Posts
    81
    jessycat, that was a great explanation. as you said, based on the fact that any PRNG only pans out a distribution more or less evenly yet still deterministically, its now clear to me that coupling MT with rand() will not 'add randomness'. I'm not sure if I'm convinced by Salem's assertion that: good prng * bad prng = bad prng, but rand() definately doesn't seem to be ADDING any value.

    As for the original question that got me posting in this thread in the first place, i.e. whether or not calling srand() for each rand() in a loop actually DECREASEs randomness (lets now define that as decreasing the likelihood of each term of a distribution having equal probability of occurance; thats what i've been after all along), I think I will take your advice and do my own research into it, as I haven't gotten very convincing answers on that here. Would you like to chime in on this?
    Last edited by shawnt; 06-10-2008 at 09:02 AM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. random to int?
    By psyadam in forum C# Programming
    Replies: 7
    Last Post: 07-22-2008, 08:09 PM
  2. Lesson #3 - Math
    By oval in forum C# Programming
    Replies: 2
    Last Post: 04-27-2006, 08:16 AM
  3. Another brain block... Random Numbers
    By DanFraser in forum C# Programming
    Replies: 2
    Last Post: 01-23-2005, 05:51 PM
  4. How do I restart a random number sequence.
    By jeffski in forum C Programming
    Replies: 6
    Last Post: 05-29-2003, 02:40 PM
  5. Best way to generate a random double?
    By The V. in forum C Programming
    Replies: 3
    Last Post: 10-16-2001, 04:11 PM