This is a discussion on A new random number generator! within the General Discussions forums, part of the Community Boards category; Originally Posted by Sebastiani I'm just fuming here, but just to give you an idea of how badly implemented the ...
C programming resources:
GNU C Function and Macro Index  glibc reference manual
The C Book  nice online learner guide
Current ISO draft standard
CCAN  new CPAN like open source library repository
3 (different) GNU debugger tutorials: #1  #2  #3
cpwiki  our wiki on sourceforge
>> You know, it's a poor workman who blames his tools
Yeah well in this case they're like cheap chinese knockoffs! If the nail breaks the hammer then you're probaby in trouble, methinks. =/
24bbs.cppCode:if( numeric_limits< byte >::digits != bits_per_byte ) error( "program requires bits_per_bytebit bytes" );
The problem with that tactic is that you're invoking the generator more times than would actually be required had it been the proper bitwidth, and assembling a larger stream of bits from a smaller one is probably going to botch the result since the smaller one has a much shorter period, etc.
24bbs.cppCode:if( numeric_limits< byte >::digits != bits_per_byte ) error( "program requires bits_per_bytebit bytes" );
Ok, so I've run a good number of tests on the generator (gave up on the NIST suite but found a decent one here that was much easier to use) and it appears to perform well (I'm a bit suprised, actually). I've also made some minor changes to the algorithm to fix the initial convergence problem that Evoex pointed out (with no cost to performance, luckily) and confirmed it's period and distribution. I've been wanting to put together some decent templated test functions but so far haven't really had much time to work on them. If anyone is interested on working on some feel free to post them here. =)
Here are the ones I have so far:
[EDIT]Code:template < typename Iterator > Iterator find_period( Iterator begin, Iterator end, size_t* period = 0 ) { for( ; begin != end; ++begin ) { Iterator found = begin; while( ( found = find( found + 1, end, *begin ) ) != end ) { bool counting = true; size_t count = 1; Iterator next = begin, search = found; for( ;; ) { if( ++next == found ) counting = false; if( ++search == end ) break; if( *next != *search ) break; if( counting ) ++count; } if( search == end && !counting ) { if( period ) *period = count; return begin; } } } if( period ) *period = 0; return end; } template < typename Container > inline typename Container::const_iterator find_period( Container const& data, size_t* period = 0 ) { return find_period( data.begin( ), data.end( ), period ); } template < typename Iterator > size_t length_of_period( Iterator begin, Iterator end ) { size_t period; find_period( begin, end, &period ); return period; } template < typename Container > inline size_t length_of_period( Container const& data ) { return length_of_period( data.begin( ), data.end( ) ); } template < typename Iterator > double frequency_distribution( Iterator seq, Iterator fin ) { double result = 0.0, size = std::distance( seq, fin ), count = std::accumulate( seq, fin, 0.0 ), expected = count / size, maximum_error = ( count  expected ) + ( expected * ( size  1.0 ) ); if( count == 0.0 ) return 0.0; while( seq != fin ) result += fabs( *seq++  expected ); return 1.0  ( result / maximum_error ); } template < typename Container > inline double frequency_distribution( Container const& data ) { return frequency_distribution( data.begin( ), data.end( ) ); } template < typename Iterator > double monte_carlo_pi( Iterator seq, Iterator fin ) { size_t size = 0, count = 0; double const pi = 4.0 * atan( 1.0 ), factor = 1.0 / pow( 2.0, int( sizeof( *seq ) * numeric_limits< char >::digits ) ); for( ;; ) { if( seq == fin ) break; double x = double( *seq++ ) * factor; if( seq == fin ) break; double y = double( *seq++ ) * factor; if( ( ( x * x ) + ( y * y ) ) <= 1.0 ) ++count; ++size; } return fabs( ( 4.0 * ( double( count ) / double( size ) ) )  pi ) / pi; } template < typename Container > inline double monte_carlo_pi( Container const& data ) { return monte_carlo_pi( data.begin( ), data.end( ) ); } template < typename Iterator > double chi_squared( Iterator seq, Iterator fin ) { double result = 0.0, size = std::distance( seq, fin ), count = std::accumulate( seq, fin, 0.0 ); if( count == 0.0 ) return 0.0; double expected = count / size; while( seq != fin ) { double temp = ( *seq++  expected ); result += ( temp * temp ) / expected; } return result; } template < typename Container > inline double chi_squared( Container const& data ) { return chi_squared( data.begin( ), data.end( ) ); }
Note: All of the above functions expect unsigned integers.
[/EDIT]
I'm pretty sure the chisquared algorithm is correct, but as yet haven't been able to make much sense of the results. As far as periodicity, frequency distribution, and the montecarlo pi test, both the generator and the marsenne twister performed exceedingly well. The standard library rand function did ok except for the latter test.
And here are the results of the Fourmilab "ent" test. I chose the absolute worst performers of numerous runs.
The 'placebo' function was just a control case that always returns the same 32bit number (as a sort of baseline worstcase scenario). All of the tests were performed on 1,000,000 32bit numbers, except for std::rand, which was done on 2,000,000 16bit numbers, and each set of data was treated as a stream of 32,000,000 bits. std::rand 
Value Char Occurrences Fraction
0 17120290 0.531920
1 15065534 0.468080
Total: 32185824 1.000000
Entropy = 0.997058 bits per bit.
Optimum compression would reduce the size
of this 32185824 bit file by 0 percent.
Chi square distribution for 32185824 samples is 131176.45, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.4681 (0.5 = random).
Monte Carlo value for Pi is 3.830697142 (error 21.93 percent).
Serial correlation coefficient is 0.003975 (totally uncorrelated = 0.0).
 marsenne 
Value Char Occurrences Fraction
0 16078169 0.500499
1 16046079 0.499501
Total: 32124248 1.000000
Entropy = 0.999999 bits per bit.
Optimum compression would reduce the size
of this 32124248 bit file by 0 percent.
Chi square distribution for 32124248 samples is 32.06, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.4995 (0.5 = random).
Monte Carlo value for Pi is 3.148647377 (error 0.22 percent).
Serial correlation coefficient is 0.000144 (totally uncorrelated = 0.0).
 garth 
Value Char Occurrences Fraction
0 16079544 0.500566
1 16043184 0.499434
Total: 32122728 1.000000
Entropy = 0.999999 bits per bit.
Optimum compression would reduce the size
of this 32122728 bit file by 0 percent.
Chi square distribution for 32122728 samples is 41.16, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.4994 (0.5 = random).
Monte Carlo value for Pi is 3.147524816 (error 0.19 percent).
Serial correlation coefficient is 0.000060 (totally uncorrelated = 0.0).
 placebo 
Value Char Occurrences Fraction
0 27000000 0.843750
1 5000000 0.156250
Total: 32000000 1.000000
Entropy = 0.625262 bits per bit.
Optimum compression would reduce the size
of this 32000000 bit file by 37 percent.
Chi square distribution for 32000000 samples is 15125000.00, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.1563 (0.5 = random).
Monte Carlo value for Pi is 4.000000000 (error 27.32 percent).
Serial correlation coefficient is 0.525926 (totally uncorrelated = 0.0).
So the results look pretty good, but I'm still not totally convinced. I'm wondering what else I should be checking for?

Last edited by Sebastiani; 07262009 at 07:49 AM.
24bbs.cppCode:if( numeric_limits< byte >::digits != bits_per_byte ) error( "program requires bits_per_bytebit bytes" );
Well, I would post a working copy of my proprietary RNG, but it wont let me upload rar files.
Anyway, nontrivial PRNG's exist, but anything based on a 64 bit number isn't going to be highly random.
Until you can build a working general purpose reprogrammable computer out of basic components from radio shack, you are not fit to call yourself a programmer in my presence. This is cwhizard, signing off.