Thread: A new random number generator!

  1. #1
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708

    Post A new random number generator!

    Well, I've learned a lot about random numbers this week. Having tried almost a dozen different designs I've come to the conclusion that designing a good generator isn't at all trivial! But it was fun, anyway.

    So here's what I came up with. It's incredibly simple, actually, and yet it has some pretty interesting properties:

    - Bit width neutral. No matter how many bits in the underlying type, the algorithm is identical and requires no "magic" numbers.
    - Invariant period. Regardless of the seed, the period is exactly 4^N, where N is the number of bits in the underlying type. For example, using 32-bit integers the period would be 18,446,744,073,709,551,616, and with a 64-bit type it would be roughly 3.4e+38.
    - Good distribution. Over the length of it's period the distribution averages about 82%. Using different cycle lengths the value obtained was generally comparable to other popular algorithms.
    - Extremely fast. Each value generated only requires 1 multiply, 4 additions, 4 bit manipulations, and 2 compares. An extra addition is required every 2^N iterations.
    - Low overhead. Only 3 words of memory required.

    There are, however, some possible disadvantages:

    - Due to the simplicity of the operations, there may be some inherent mathematical weaknesses in the context of certain applications.
    - Probably not suitable for applications requiring cryptographically secure random numbers (for the above reasons).

    If nothing else, though, the algorithm might be a useful component for more complex multiphase generators.

    Here is a side-by-side cross-section of a plot of the algorithm compared with the Marsenne twister (using 32768 and 131072 samples, respectively). Can you tell which is which?

    Anyway, I would be interested to hear some opinions from the experts, and would like to know if it has any serious flaws?
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  2. #2

  3. #3
    Unregistered User Yarin's Avatar
    Join Date
    Jul 2007
    Posts
    2,158
    Wow, that code is scarily skinny, I can understand your concern of it's 'randomness'. But, I don't think a potential flaw found be found in the screenies (unless it's really bad). I guess yours is on the left?

  4. #4
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Sorry, but it's horrible. Let me write the formula you use here:

    Code:
    X(1) = seed + ~(((1*seed) + 1) ROL 1)
    X(2) = X(1) + ~(((2*X(1)) + 1) ROL 1)
    ROL means rotate left. So, let's describe the formula better:

    Code:
    X(0) = seed
    X(n) = X(n-1) + ~(((n*X(n-1)) + 1) ROL 1)
    That means that given a number, the next number is COMPLETELY independent of the seed. Given a number, the next number can be calculated by simply filling in the formula if you know how many numbers have been generated already. If not, you need two to predict the rest.

    Not only that, but it shouldn't be hard to find the seed either if you get the first number.
    Last edited by EVOEx; 07-20-2009 at 09:06 AM.

  5. #5
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Does it pass the DieHard tests?
    Diehard tests - Wikipedia, the free encyclopedia
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  6. #6
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> Sorry, but it's horrible. Let me write the formula you use here:

    How so? As in "horribly simple"? My main goal was to make an algorithm that had predictable, generic properties that worked on any number of bits. I believe I've achieved that.

    >> Given a number, the next number can be calculated by simply filling in the formula if you know how many numbers have been generated already. Not only that, but it shouldn't be hard to find the seed either if you get the first number.

    I think that falls under the second category of "disadvantages" that I listed. Security was not my focus here at all. The types of flaws I am concerned with are lapses in periodicity, bad distribution, and evidence that it is not in fact "bit width neutral".

    >> If not, you need two to predict the rest.

    Are you sure? Can you guess what follows these two numbers?

    4290416596
    806034681

    >> But, I don't think a potential flaw found be found in the screenies (unless it's really bad).

    Well, yes and no. Plotting the values is on the one hand a suprisingly effective tool for evaluating the basic properties of a generator, since our eyes are remarkably good at recognizing patterns (or the lack thereof). On the other hand, there are certainly subtle mathematical relationships that would be missed using this "method" and a more rigorous analysis would be necessary to find them.

    >> Let's see what NIST thinks about it

    Good idea. I had gotten a hold of one of their programs recently only to find that it had a very primitive interface and was difficult to work with, but the package you linked to looks like it might be a newer one, so I think I'll give it a shot.
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  7. #7
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> Does it pass the DieHard tests?

    Unfortunately, it only compiles on 'nix systems. But I have heard that it's one of the best.
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  8. #8
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Code:
    X(n-1) = 4290416596
    x(n) = 806034681
    X(n) = 4290416596 + ~(((n*4290416596) + 1) ROL 1)
    810585381 = ~(((n*4290416596) + 1) ROL 1)
    3484381914 = (((n*4290416596) + 1) ROL 1)
    1742190957 = ((n*4290416596) + 1)
    1742190956 = (n*4290416596)    
    n = 9999
    x(10000) = X(9999) + ~(((10000*X(9999)) + 1) ROL 1)
    x(10000) = 806034681 + ~(((10000*806034681) + 1) ROL 1)
    x(10000) = 3419643861
    So the next number should be 3419643861 if I haven't made any mistakes.

    I might even be able to find out the seed... I'll post if I manage to do it. [Edit: never mind, too much work :P]

    Distribution isn't too great either: For the first several numbers, if X(n) is low, X(n+1) is likely to be very high.
    Last edited by EVOEx; 07-20-2009 at 10:40 AM.

  9. #9
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> So the next number should be 3419643861 if I haven't made any mistakes.

    You didn't. =)

    >> Distribution isn't too great either: For the first several numbers, if X(n) is low, X(n+1) is likely to be very high.

    I was aware of that but didn't want to muddle with it at this point. At any rate, I'm running the NIST tests right now, so hopefully that'll help shed some light on the more important mathematical properties of the generator.
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  10. #10
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Ok, well the one thing I have established at this point is that the NIST test suite sucks! You can't automate anything and the input format is really wonky (you have to convert the input to binary ASCII strings and replace any CRLF with LF). So far it's choked on all of the files I've given it. Bleh. I guess I'll try binary mode (I wonder what endian mode it expects?).
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  11. #11
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Quote Originally Posted by Sebastiani View Post
    I wonder what endian mode it expects?
    It doesn't matter. Otherwise the random number generator could simply modify its endianess and be more (or less) random, which is of course nonsense.

  12. #12
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> It doesn't matter. Otherwise the random number generator could simply modify its endianess and be more (or less) random, which is of course nonsense.

    Good point.
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  13. #13
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Well it looks like the problem I was having with the tests was an underflow issue - I was a few hundred thousand bits shy of the recommended sample size! I thought of just running the generator as is and have the test suite "think" it was a longer sample size, but that probably wouldn't yield accurate results, would it? I'll probably need to use a big integer but then I don't know how to modify the marsenne twister algorithm (to use as a comparison) for such a large data type. Any ideas?
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  14. #14
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Quote Originally Posted by Sebastiani View Post
    Well it looks like the problem I was having with the tests was an underflow issue - I was a few hundred thousand bits shy of the recommended sample size! I thought of just running the generator as is and have the test suite "think" it was a longer sample size, but that probably wouldn't yield accurate results, would it? I'll probably need to use a big integer but then I don't know how to modify the marsenne twister algorithm (to use as a comparison) for such a large data type. Any ideas?
    Read an integer
    Write it as binary to an output file (big endian, little endian, whatever you like)
    Repeat number_of_bytes / 4 times.
    That would give you the size you want, right?

  15. #15
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    [rant]

    I'm just fuming here, but just to give you an idea of how badly implemented the NIST code is:

    Code:
    	if ( ((X = (double*) calloc(n,sizeof(double))) == NULL) ||
    		 ((wsave = (double *)calloc(2*n+15,sizeof(double))) == NULL) ||
    		 ((ifac = (double *)calloc(15,sizeof(double))) == NULL) ||
    		 ((m = (double*)calloc(n/2+1, sizeof(double))) == NULL) ) {
    			fprintf(stats[7],"\t\tUnable to allocate working arrays for the DFT.\n");
    			if( X == NULL )
    				free(X);
    			if( wsave == NULL )
    				free(wsave);
    			if( ifac == NULL )
    				free(ifac);
    			if( m == NULL )
    				free(m);
    			return;
    	}
    Can we say "memory leak" (although "hemorrhage" might be more appropriate, considering the sizes involved)?

    Besides that, the programmer seemed way too comfortable with using goto's, including one bit where he had a goto jump directly to another goto instead of directly to the target!

    Oh and did I mention the so-called user-interface? It's menu based and honestly, I've seen better systems implemented by the newb's that post their homework assignments here!

    And then there are the functions that modify global variables in ways that can have side-effects on other unsuspecting functions unless special steps are taken.

    Oh the humanity!

    [/rant]

    Ok, I'm done. =D
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Issue w/ Guess My Number Program
    By mkylman in forum C++ Programming
    Replies: 5
    Last Post: 08-23-2007, 01:31 AM
  2. Good Random Number Generator
    By MethodMan in forum C Programming
    Replies: 4
    Last Post: 11-18-2004, 06:38 AM
  3. How do I restart a random number sequence.
    By jeffski in forum C Programming
    Replies: 6
    Last Post: 05-29-2003, 02:40 PM
  4. how to link random number generator with cpu?
    By chris285 in forum C++ Programming
    Replies: 5
    Last Post: 04-28-2003, 05:26 AM
  5. Seeding Random Number Generator
    By zdude in forum C++ Programming
    Replies: 2
    Last Post: 09-05-2002, 03:10 PM