Thread: A new random number generator!

  1. #16
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Sebastiani View Post
    I'm just fuming here, but just to give you an idea of how badly implemented the NIST code is:
    You know, it's a poor workman who blames his tools
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  2. #17
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> You know, it's a poor workman who blames his tools

    Yeah well in this case they're like cheap chinese knock-offs! If the nail breaks the hammer then you're probaby in trouble, methinks. =/
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  3. #18
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Quote Originally Posted by EVOEx View Post
    Read an integer
    Write it as binary to an output file (big endian, little endian, whatever you like)
    Repeat number_of_bytes / 4 times.
    That would give you the size you want, right?
    The problem with that tactic is that you're invoking the generator more times than would actually be required had it been the proper bit-width, and assembling a larger stream of bits from a smaller one is probably going to botch the result since the smaller one has a much shorter period, etc.
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  4. #19
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Ok, so I've run a good number of tests on the generator (gave up on the NIST suite but found a decent one here that was much easier to use) and it appears to perform well (I'm a bit suprised, actually). I've also made some minor changes to the algorithm to fix the initial convergence problem that Evoex pointed out (with no cost to performance, luckily) and confirmed it's period and distribution. I've been wanting to put together some decent templated test functions but so far haven't really had much time to work on them. If anyone is interested on working on some feel free to post them here. =)

    Here are the ones I have so far:

    Code:
    template < typename Iterator >
    Iterator find_period( Iterator begin, Iterator end, size_t* period = 0 )
    {
        for( ; begin != end; ++begin )
        {
            Iterator
                found = begin;
            while( ( found = find( found + 1, end, *begin ) ) != end )
            {
                bool
                    counting = true;
                size_t
                    count = 1;
                Iterator
                    next = begin,
                    search = found;
                for( ;; )
                {
                    if( ++next == found )
                        counting = false;
                    if( ++search == end )
                        break;
                    if( *next != *search )
                        break;
                    if( counting )
                        ++count;
                }             
                if( search == end && !counting )
                {
                    if( period )
                        *period = count;
                    return begin;
                }   
            }
        }
        if( period )
            *period = 0;
        return end;
    }
    
    template < typename Container >
    inline typename Container::const_iterator find_period( Container const& data, size_t* period = 0 )
    {
        return find_period( data.begin( ), data.end( ), period );
    }
    
    template < typename Iterator >
    size_t length_of_period( Iterator begin, Iterator end )
    {
        size_t
            period;
        find_period( begin, end, &period );
        return period;
    }
    
    template < typename Container >
    inline size_t length_of_period( Container const& data )
    {
        return length_of_period( data.begin( ), data.end( ) );
    }
    
    template < typename Iterator >
    double frequency_distribution( Iterator seq, Iterator fin )
    {
        double
            result = 0.0,
            size = std::distance( seq, fin ),
            count = std::accumulate( seq, fin, 0.0 ),
            expected = count / size,
            maximum_error = ( count - expected ) + ( expected * ( size - 1.0 ) );
        if( count == 0.0 )
            return 0.0;
        while( seq != fin )
            result += fabs( *seq++ - expected );
        return 1.0 - ( result / maximum_error );
    }
    
    template < typename Container >
    inline double frequency_distribution( Container const& data )
    {
        return frequency_distribution( data.begin( ), data.end( ) );
    }
    
    template < typename Iterator >
    double monte_carlo_pi( Iterator seq, Iterator fin )
    {
        size_t
            size = 0, 
            count = 0;
        double const
            pi = 4.0  * atan( 1.0 ),
            factor = 1.0 / pow( 2.0, int( sizeof( *seq ) * numeric_limits< char >::digits ) );
        for( ;; )
        {
            if( seq == fin )
                break;
            double
                x = double( *seq++ ) * factor;
            if( seq == fin )
                break;
            double
                y = double( *seq++ ) * factor;
            if( ( ( x * x ) + ( y * y ) ) <= 1.0 )
                ++count;
            ++size;    
        }
        return fabs( ( 4.0 * ( double( count ) / double( size ) ) ) - pi ) / pi;
    }
    
    template < typename Container >
    inline double monte_carlo_pi( Container const& data )
    {
        return monte_carlo_pi( data.begin( ), data.end( ) );
    }
    
    template < typename Iterator >
    double chi_squared( Iterator seq, Iterator fin )
    {
        double
            result = 0.0, 
            size = std::distance( seq, fin ),
            count = std::accumulate( seq, fin, 0.0 );
        if( count == 0.0 )
            return 0.0;
        double
            expected = count / size;
        while( seq != fin )
        {
            double
                temp = ( *seq++ - expected );
            result += ( temp * temp ) / expected;
        }    
        return result;
    }
    
    template < typename Container >
    inline double chi_squared( Container const& data )
    {
        return chi_squared( data.begin( ), data.end( ) );
    }
    [EDIT]

    Note: All of the above functions expect unsigned integers.

    [/EDIT]

    I'm pretty sure the chi-squared algorithm is correct, but as yet haven't been able to make much sense of the results. As far as periodicity, frequency distribution, and the monte-carlo pi test, both the generator and the marsenne twister performed exceedingly well. The standard library rand function did ok except for the latter test.

    And here are the results of the Fourmilab "ent" test. I chose the absolute worst performers of numerous runs.

    --- std::rand ---

    Value Char Occurrences Fraction
    0 17120290 0.531920
    1 15065534 0.468080

    Total: 32185824 1.000000

    Entropy = 0.997058 bits per bit.

    Optimum compression would reduce the size
    of this 32185824 bit file by 0 percent.

    Chi square distribution for 32185824 samples is 131176.45, and randomly
    would exceed this value less than 0.01 percent of the times.

    Arithmetic mean value of data bits is 0.4681 (0.5 = random).
    Monte Carlo value for Pi is 3.830697142 (error 21.93 percent).
    Serial correlation coefficient is -0.003975 (totally uncorrelated = 0.0).

    --- marsenne ---

    Value Char Occurrences Fraction
    0 16078169 0.500499
    1 16046079 0.499501

    Total: 32124248 1.000000

    Entropy = 0.999999 bits per bit.

    Optimum compression would reduce the size
    of this 32124248 bit file by 0 percent.

    Chi square distribution for 32124248 samples is 32.06, and randomly
    would exceed this value less than 0.01 percent of the times.

    Arithmetic mean value of data bits is 0.4995 (0.5 = random).
    Monte Carlo value for Pi is 3.148647377 (error 0.22 percent).
    Serial correlation coefficient is -0.000144 (totally uncorrelated = 0.0).

    --- garth ---

    Value Char Occurrences Fraction
    0 16079544 0.500566
    1 16043184 0.499434

    Total: 32122728 1.000000

    Entropy = 0.999999 bits per bit.

    Optimum compression would reduce the size
    of this 32122728 bit file by 0 percent.

    Chi square distribution for 32122728 samples is 41.16, and randomly
    would exceed this value less than 0.01 percent of the times.

    Arithmetic mean value of data bits is 0.4994 (0.5 = random).
    Monte Carlo value for Pi is 3.147524816 (error 0.19 percent).
    Serial correlation coefficient is -0.000060 (totally uncorrelated = 0.0).

    --- placebo ---

    Value Char Occurrences Fraction
    0 27000000 0.843750
    1 5000000 0.156250

    Total: 32000000 1.000000

    Entropy = 0.625262 bits per bit.

    Optimum compression would reduce the size
    of this 32000000 bit file by 37 percent.

    Chi square distribution for 32000000 samples is 15125000.00, and randomly
    would exceed this value less than 0.01 percent of the times.

    Arithmetic mean value of data bits is 0.1563 (0.5 = random).
    Monte Carlo value for Pi is 4.000000000 (error 27.32 percent).
    Serial correlation coefficient is 0.525926 (totally uncorrelated = 0.0).
    The 'placebo' function was just a control case that always returns the same 32-bit number (as a sort of base-line worst-case scenario). All of the tests were performed on 1,000,000 32-bit numbers, except for std::rand, which was done on 2,000,000 16-bit numbers, and each set of data was treated as a stream of 32,000,000 bits.

    So the results look pretty good, but I'm still not totally convinced. I'm wondering what else I should be checking for?


    -
    Last edited by Sebastiani; 07-26-2009 at 07:49 AM.
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  5. #20
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Well, I would post a working copy of my proprietary RNG, but it wont let me upload rar files.

    Anyway, non-trivial PRNG's exist, but anything based on a 64 bit number isn't going to be highly random.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Issue w/ Guess My Number Program
    By mkylman in forum C++ Programming
    Replies: 5
    Last Post: 08-23-2007, 01:31 AM
  2. Good Random Number Generator
    By MethodMan in forum C Programming
    Replies: 4
    Last Post: 11-18-2004, 06:38 AM
  3. How do I restart a random number sequence.
    By jeffski in forum C Programming
    Replies: 6
    Last Post: 05-29-2003, 02:40 PM
  4. how to link random number generator with cpu?
    By chris285 in forum C++ Programming
    Replies: 5
    Last Post: 04-28-2003, 05:26 AM
  5. Seeding Random Number Generator
    By zdude in forum C++ Programming
    Replies: 2
    Last Post: 09-05-2002, 03:10 PM