>> The MinGW port of gcc 3.4.5 refused to compile your original code, reporting that the initialiser of the static variable was not constant.
That's silly. I wonder why it thinks it should be (3.4.2 doesn't seem to care!)?
>> The MinGW port of gcc 3.4.5 refused to compile your original code, reporting that the initialiser of the static variable was not constant.
That's silly. I wonder why it thinks it should be (3.4.2 doesn't seem to care!)?
Code:#include <cmath> #include <complex> bool euler_flip(bool value) { return std::pow ( std::complex<float>(std::exp(1.0)), std::complex<float>(0, 1) * std::complex<float>(std::atan(1.0) *(1 << (value + 2))) ).real() < 0; }
Just to clarify something a few pages back; Yes unsigned chars is absolutely what I would use. I was too busy making a point about the memory usage difference that I neglected to see I had missed off the "unsigned".
Of course it doesn't really that much anyway.
Sebastiani has the solution I would use.
My homepage
Advice: Take only as directed - If symptoms persist, please see your debugger
Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"
On a hunch: did you compile as C++ or as C99? I tried compiling as C++ instead, and there was no errors, but this is the C programming forum.Originally Posted by Sebastiani
Look up a C++ Reference and learn How To Ask Questions The Smart WayOriginally Posted by Bjarne Stroustrup (2000-10-14)
Hey Sebastiani...
In your timing code, can you please add another function to try out:
Similar in mathematical result as
Here is my contest entry to compute ceil ( log2 ( n ) ) faster than using the library math functions.Code:int get_first_available_channel_ceil_log( unsigned channels ) { int index = ( int )ceil( log2( channels + 1 ) ); return index ? 33 - index : 0; }
Can you please run it and post the timing results for all the algorithms?Code:int fast_index(unsigned channels) { double c = (double)channels; return channels ? 1055 - (int)(*(long long *)&c >> 52): 0; }
Last edited by nonoob; 05-29-2009 at 04:01 PM.
>> On a hunch: did you compile as C++ or as C99? I tried compiling as C++ instead, and there was no errors, but this is the C programming forum.
You're right, I did compile it with a C++ compiler. But I'm still confused as to why this is considered an error (and too lazy at the moment to look at the standard).
>> Here is my contest entry to compute ceil ( log2 ( n ) ) faster than using the library math functions.
It's not really a contest, I was just trying to get an idea of how the algorithms compared. Unfortunately, I don't think that the code you posted is portable for a couple of reasons. For one, the long long type is not guaranteed to be the same size as a double, and besides that, I don't believe the standard even mandates any particular binary representation for doubles. Having said that, it will probably work on *most* implementations. At any rate, I wil take a look at it and see how the function performs, in general.
EDIT:
It looks like the bitwise ceil/log solution is not far behind the lookup table approach (slightly less than twice in running time), so it may be a good candidate for memory-strapped applications such as imbedded systems. It's possible that it could be optimized further to run even faster, but I'll leave it up to you math geniuses to work that out. Anyway, here's an updated (C99 compiant) version of the test:
Code:#include <stdio.h> #include <stdlib.h> #include <math.h> #include <time.h> #define array_size( array ) ( sizeof( array ) / sizeof( array[ 0 ] ) ) void initialize_table( unsigned char table[ 256 ] ) { int index, shift, min, max; for( shift = min = 1; shift < 9; shift++ ) { max = 1 << shift; for( index = min; index < max; index++ ) { table[ index ] = 9 - shift; } min = max; } } int get_first_available_channel_table_lookup( unsigned channels ) { static unsigned char table[ 256 ]; static int initialized = 0; if( !initialized ) { initialize_table( table ); initialized = 1; } return channels == 0 ? 0 : channels < 0x100 ? table[ channels & 0xff ] + 24 : channels < 0x10000 ? table[ ( channels >> 8 ) & 0xff ] + 16 : channels < 0x1000000 ? table[ ( channels >> 16 ) & 0xff ] + 8 : table[ ( channels >> 24 ) & 0xff ]; } int get_first_available_channel_ceil_log( unsigned channels ) { int index = ( int )ceil( log2( channels + 1 ) ); return index ? 33 - index : 0; } int get_first_available_channel_bit_by_bit( unsigned channels ) { unsigned index, mask; for( index = 1, mask = 0x80000000; mask; mask >>= 1, index++ ) { if( channels & mask ) return index; } return 0; } int get_first_available_channel_bitwise_ceil_log( unsigned channels ) { double c = ( double )channels; return channels ? 1055 - ( int )( *( long long * )&c >> 52 ) : 0; } typedef int ( * get_first_available_channel_function )( unsigned ); void test( unsigned* values, size_t length, const char* name, get_first_available_channel_function function, size_t repetitions ) { size_t iterations = repetitions * length; unsigned * v, * e = values + length; clock_t elapsed, stop, start = clock( ); while( repetitions-- ) { v = values; while( v != e ) { function( *v++ ); } } stop = clock( ); elapsed = ( clock_t )( ( ( double ) ( stop - start ) / ( double )CLK_TCK ) * 1000 ); printf( "%s : %d iterations performed in %d milliseconds\n", name, iterations, ( size_t )elapsed ); } int main( void ) { int index; unsigned data[ 256 ]; size_t repetitions = 30000, tests = 10; struct { char* name; get_first_available_channel_function function; } functions[ ] = { { "bitwise_ceil_log", get_first_available_channel_bitwise_ceil_log }, { "ceil_log", get_first_available_channel_ceil_log }, { "table_lookup", get_first_available_channel_table_lookup }, { "bit_by_bit", get_first_available_channel_bit_by_bit } }; srand( time( 0 ) ); while( tests-- ) { for( index = 0; index < array_size( data ); ++index ) { data[ index ] = rand( ); } for( index = 0; index < array_size( functions ); ++index ) { test( data, array_size( data ), functions[ index ].name, functions[ index ].function, repetitions ); } } return 0; }
Last edited by Sebastiani; 05-29-2009 at 05:17 PM.
Code:#include <cmath> #include <complex> bool euler_flip(bool value) { return std::pow ( std::complex<float>(std::exp(1.0)), std::complex<float>(0, 1) * std::complex<float>(std::atan(1.0) *(1 << (value + 2))) ).real() < 0; }
The one with the 1055 constant in it could be made portable by using frexp:
Code:int get_first_available_channel_frexp( unsigned channels ) { int exponent; frexp(static_cast<double>(channels), &exponent); return channels ? sizeof(channels) * CHAR_BIT + 1 - exponent : 0; }
My homepage
Advice: Take only as directed - If symptoms persist, please see your debugger
Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"
Can you add GCC's internal implementation for comparison?
I had to do this for a project a while ago, and tried a few algorithms. None could beat the GCC builtin one. I was working with 64-bit unsigned, though.Code:int get_first_available_channel_gcc_builtin( unsigned channels ) { if (channels) return __builtin_clz (channels); return 0; }
>> I had to do this for a project a while ago, and tried a few algorithms. None could beat the GCC builtin one. I was working with 64-bit unsigned, though.
You're right, this turned out to be the fastest so far. Any idea how it's implemented?
Here's an updated version of the test (with verification):
Code:#include <stdio.h> #include <stdlib.h> #include <math.h> #include <time.h> #include <limits.h> #define array_size( array ) ( sizeof( array ) / sizeof( array[ 0 ] ) ) void initialize_table( unsigned char table[ 256 ] ) { int index, shift, min, max; for( shift = min = 1; shift < 9; ++shift ) { max = 1 << shift; for( index = min; index < max; ++index ) { table[ index ] = 9 - shift; } min = max; } } int get_first_available_channel_table_lookup( unsigned channels ) { static unsigned char table[ 256 ]; static int initialized = 0; if( !initialized ) { initialize_table( table ); initialized = 1; } return channels == 0 ? 0 : channels < 0x100 ? table[ channels & 0xff ] + 24 : channels < 0x10000 ? table[ ( channels >> 8 ) & 0xff ] + 16 : channels < 0x1000000 ? table[ ( channels >> 16 ) & 0xff ] + 8 : table[ ( channels >> 24 ) & 0xff ]; } int get_first_available_channel_ceil_log( unsigned channels ) { int index = ( int )ceil( log2( channels + 1 ) ); return index ? 33 - index : 0; } int get_first_available_channel_bit_by_bit( unsigned channels ) { unsigned index, mask; for( index = 1, mask = 0x80000000; mask; mask >>= 1, ++index ) { if( channels & mask ) return index; } return 0; } int get_first_available_channel_bitwise_ceil_log( unsigned channels ) { double c = ( double )channels; return channels ? 1055 - ( int )( *( long long * )&c >> 52 ) : 0; } int get_first_available_channel_frexp( unsigned channels ) { static int constant = sizeof( channels ) * CHAR_BIT + 1; int exponent; frexp( ( double )channels, &exponent ); return channels ? constant - exponent : 0; } int get_first_available_channel_gcc_builtin( unsigned channels ) { if( channels ) return __builtin_clz( channels ) + 1; return 0; } typedef int ( * get_first_available_channel_function )( unsigned ); struct entry { char* name; get_first_available_channel_function function; int passed; }; void test( unsigned* values, size_t length, struct entry* entry, size_t repetitions ) { int result, control; size_t iterations = repetitions * length; unsigned * v, * e = values + length; for( v = values; v != e; ++v ) { result = entry->function( *v ); control = get_first_available_channel_table_lookup( *v ); if( result != control ) { printf( "Error: %s reported channel in 0x%x to be %d (should be %d)\n", entry->name, *v, result, control ); entry->passed = 0; } } if( !entry->passed ) { printf( "%s : N/A (invalid)\n" ); return; } clock_t elapsed, stop, start = clock( ); while( repetitions-- ) { v = values; while( v != e ) { entry->function( *v++ ); } } stop = clock( ); elapsed = ( clock_t )( ( ( double ) ( stop - start ) / ( double )CLK_TCK ) * 1000 ); printf( "%s : %d iterations performed in %d milliseconds\n", entry->name, iterations, ( size_t )elapsed ); } int main( void ) { int index; unsigned data[ 256 ]; size_t repetitions = 30000, tests = 10; struct entry entries[ ] = { { "frexp", get_first_available_channel_frexp, 1 }, { "gcc_builtin", get_first_available_channel_gcc_builtin, 1 }, { "bitwise_ceil_log", get_first_available_channel_bitwise_ceil_log, 1 }, { "ceil_log", get_first_available_channel_ceil_log, 1 }, { "table_lookup", get_first_available_channel_table_lookup, 1 }, { "bit_by_bit", get_first_available_channel_bit_by_bit, 1 } }; srand( time( 0 ) ); while( tests-- ) { for( index = 0; index < array_size( data ); ++index ) { data[ index ] = rand( ); } for( index = 0; index < array_size( entries ); ++index ) { test( data, array_size( data ), &entries[ index ], repetitions ); } } return 0; }
Last edited by Sebastiani; 05-30-2009 at 01:20 PM.
Code:#include <cmath> #include <complex> bool euler_flip(bool value) { return std::pow ( std::complex<float>(std::exp(1.0)), std::complex<float>(0, 1) * std::complex<float>(std::atan(1.0) *(1 << (value + 2))) ).real() < 0; }
I ran the test myself, and with -O3 (gcc -O3 -std=c99 test.c -lm),
If I take it out...
Error: bitwise_ceil_log reported channel in 0xe002f85 to be 1055 (should be 5)
Error: bitwise_ceil_log reported channel in 0x68dce8fc to be 1055 (should be 2)
Error: bitwise_ceil_log reported channel in 0x3997ff43 to be 1055 (should be 3)
Error: bitwise_ceil_log reported channel in 0x23c952b to be 1055 (should be 7)
Error: bitwise_ceil_log reported channel in 0xecf79ff to be 1055 (should be 5)
Error: bitwise_ceil_log reported channel in 0x401317cd to be 1055 (should be 2)
Error: bitwise_ceil_log reported channel in 0x141ac60 to be 1055 (should be 8)
This is GCC 4.3.3 on 32-bit x86 Linux.rexp : 7680000 iterations performed in 200 milliseconds
gcc_builtin : 7680000 iterations performed in 40 milliseconds
ceil_log : 7680000 iterations performed in 830 milliseconds
table_lookup : 7680000 iterations performed in 70 milliseconds
bit_by_bit : 7680000 iterations performed in 120 milliseconds
Hardware implementation.Any idea how it's implemented?
Code:bsrl %edx, %eax
I don't understand why bitwise_ceil_log crapped out. Perhaps there is no "long long" (__int64 or some variant) in your compiler - although you should have gotten hideous syntax errors to that effect. My guess it that the >> 53 shift gives a zero every time for some reason.
No, wait. It's more likely an endian thing where the IEEE 754 floating-point format can not be directly overlaid with the internal representation of a 64-bit integer.
It most certainly does.Perhaps there is no "long long" (__int64 or some variant) in your compiler
I have no idea, either. It only doesn't work if I turn on optimization. Works fine with O0. Meaning it's probably something to do with relying on some kind of undefined behaviour (or GCC bug).