C programming resources:
GNU C Function and Macro Index -- glibc reference manual
The C Book -- nice online learner guide
Current ISO draft standard
CCAN -- new CPAN like open source library repository
3 (different) GNU debugger tutorials: #1 -- #2 -- #3
cpwiki -- our wiki on sourceforge
>> You know, it's a poor workman who blames his tools
Yeah well in this case they're like cheap chinese knock-offs! If the nail breaks the hammer then you're probaby in trouble, methinks. =/
Code:#include <cmath> #include <complex> bool euler_flip(bool value) { return std::pow ( std::complex<float>(std::exp(1.0)), std::complex<float>(0, 1) * std::complex<float>(std::atan(1.0) *(1 << (value + 2))) ).real() < 0; }
The problem with that tactic is that you're invoking the generator more times than would actually be required had it been the proper bit-width, and assembling a larger stream of bits from a smaller one is probably going to botch the result since the smaller one has a much shorter period, etc.
Code:#include <cmath> #include <complex> bool euler_flip(bool value) { return std::pow ( std::complex<float>(std::exp(1.0)), std::complex<float>(0, 1) * std::complex<float>(std::atan(1.0) *(1 << (value + 2))) ).real() < 0; }
Ok, so I've run a good number of tests on the generator (gave up on the NIST suite but found a decent one here that was much easier to use) and it appears to perform well (I'm a bit suprised, actually). I've also made some minor changes to the algorithm to fix the initial convergence problem that Evoex pointed out (with no cost to performance, luckily) and confirmed it's period and distribution. I've been wanting to put together some decent templated test functions but so far haven't really had much time to work on them. If anyone is interested on working on some feel free to post them here. =)
Here are the ones I have so far:
[EDIT]Code:template < typename Iterator > Iterator find_period( Iterator begin, Iterator end, size_t* period = 0 ) { for( ; begin != end; ++begin ) { Iterator found = begin; while( ( found = find( found + 1, end, *begin ) ) != end ) { bool counting = true; size_t count = 1; Iterator next = begin, search = found; for( ;; ) { if( ++next == found ) counting = false; if( ++search == end ) break; if( *next != *search ) break; if( counting ) ++count; } if( search == end && !counting ) { if( period ) *period = count; return begin; } } } if( period ) *period = 0; return end; } template < typename Container > inline typename Container::const_iterator find_period( Container const& data, size_t* period = 0 ) { return find_period( data.begin( ), data.end( ), period ); } template < typename Iterator > size_t length_of_period( Iterator begin, Iterator end ) { size_t period; find_period( begin, end, &period ); return period; } template < typename Container > inline size_t length_of_period( Container const& data ) { return length_of_period( data.begin( ), data.end( ) ); } template < typename Iterator > double frequency_distribution( Iterator seq, Iterator fin ) { double result = 0.0, size = std::distance( seq, fin ), count = std::accumulate( seq, fin, 0.0 ), expected = count / size, maximum_error = ( count - expected ) + ( expected * ( size - 1.0 ) ); if( count == 0.0 ) return 0.0; while( seq != fin ) result += fabs( *seq++ - expected ); return 1.0 - ( result / maximum_error ); } template < typename Container > inline double frequency_distribution( Container const& data ) { return frequency_distribution( data.begin( ), data.end( ) ); } template < typename Iterator > double monte_carlo_pi( Iterator seq, Iterator fin ) { size_t size = 0, count = 0; double const pi = 4.0 * atan( 1.0 ), factor = 1.0 / pow( 2.0, int( sizeof( *seq ) * numeric_limits< char >::digits ) ); for( ;; ) { if( seq == fin ) break; double x = double( *seq++ ) * factor; if( seq == fin ) break; double y = double( *seq++ ) * factor; if( ( ( x * x ) + ( y * y ) ) <= 1.0 ) ++count; ++size; } return fabs( ( 4.0 * ( double( count ) / double( size ) ) ) - pi ) / pi; } template < typename Container > inline double monte_carlo_pi( Container const& data ) { return monte_carlo_pi( data.begin( ), data.end( ) ); } template < typename Iterator > double chi_squared( Iterator seq, Iterator fin ) { double result = 0.0, size = std::distance( seq, fin ), count = std::accumulate( seq, fin, 0.0 ); if( count == 0.0 ) return 0.0; double expected = count / size; while( seq != fin ) { double temp = ( *seq++ - expected ); result += ( temp * temp ) / expected; } return result; } template < typename Container > inline double chi_squared( Container const& data ) { return chi_squared( data.begin( ), data.end( ) ); }
Note: All of the above functions expect unsigned integers.
[/EDIT]
I'm pretty sure the chi-squared algorithm is correct, but as yet haven't been able to make much sense of the results. As far as periodicity, frequency distribution, and the monte-carlo pi test, both the generator and the marsenne twister performed exceedingly well. The standard library rand function did ok except for the latter test.
And here are the results of the Fourmilab "ent" test. I chose the absolute worst performers of numerous runs.
The 'placebo' function was just a control case that always returns the same 32-bit number (as a sort of base-line worst-case scenario). All of the tests were performed on 1,000,000 32-bit numbers, except for std::rand, which was done on 2,000,000 16-bit numbers, and each set of data was treated as a stream of 32,000,000 bits.--- std::rand ---
Value Char Occurrences Fraction
0 17120290 0.531920
1 15065534 0.468080
Total: 32185824 1.000000
Entropy = 0.997058 bits per bit.
Optimum compression would reduce the size
of this 32185824 bit file by 0 percent.
Chi square distribution for 32185824 samples is 131176.45, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.4681 (0.5 = random).
Monte Carlo value for Pi is 3.830697142 (error 21.93 percent).
Serial correlation coefficient is -0.003975 (totally uncorrelated = 0.0).
--- marsenne ---
Value Char Occurrences Fraction
0 16078169 0.500499
1 16046079 0.499501
Total: 32124248 1.000000
Entropy = 0.999999 bits per bit.
Optimum compression would reduce the size
of this 32124248 bit file by 0 percent.
Chi square distribution for 32124248 samples is 32.06, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.4995 (0.5 = random).
Monte Carlo value for Pi is 3.148647377 (error 0.22 percent).
Serial correlation coefficient is -0.000144 (totally uncorrelated = 0.0).
--- garth ---
Value Char Occurrences Fraction
0 16079544 0.500566
1 16043184 0.499434
Total: 32122728 1.000000
Entropy = 0.999999 bits per bit.
Optimum compression would reduce the size
of this 32122728 bit file by 0 percent.
Chi square distribution for 32122728 samples is 41.16, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.4994 (0.5 = random).
Monte Carlo value for Pi is 3.147524816 (error 0.19 percent).
Serial correlation coefficient is -0.000060 (totally uncorrelated = 0.0).
--- placebo ---
Value Char Occurrences Fraction
0 27000000 0.843750
1 5000000 0.156250
Total: 32000000 1.000000
Entropy = 0.625262 bits per bit.
Optimum compression would reduce the size
of this 32000000 bit file by 37 percent.
Chi square distribution for 32000000 samples is 15125000.00, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.1563 (0.5 = random).
Monte Carlo value for Pi is 4.000000000 (error 27.32 percent).
Serial correlation coefficient is 0.525926 (totally uncorrelated = 0.0).
So the results look pretty good, but I'm still not totally convinced. I'm wondering what else I should be checking for?
-
Last edited by Sebastiani; 07-26-2009 at 07:49 AM.
Code:#include <cmath> #include <complex> bool euler_flip(bool value) { return std::pow ( std::complex<float>(std::exp(1.0)), std::complex<float>(0, 1) * std::complex<float>(std::atan(1.0) *(1 << (value + 2))) ).real() < 0; }
Well, I would post a working copy of my proprietary RNG, but it wont let me upload rar files.
Anyway, non-trivial PRNG's exist, but anything based on a 64 bit number isn't going to be highly random.