>> The MinGW port of gcc 3.4.5 refused to compile your original code, reporting that the initialiser of the static variable was not constant.
That's silly. I wonder why it thinks it should be (3.4.2 doesn't seem to care!)?
Printable View
>> The MinGW port of gcc 3.4.5 refused to compile your original code, reporting that the initialiser of the static variable was not constant.
That's silly. I wonder why it thinks it should be (3.4.2 doesn't seem to care!)?
Just to clarify something a few pages back; Yes unsigned chars is absolutely what I would use. I was too busy making a point about the memory usage difference that I neglected to see I had missed off the "unsigned".
Of course it doesn't really that much anyway.
Sebastiani has the solution I would use.
On a hunch: did you compile as C++ or as C99? I tried compiling as C++ instead, and there was no errors, but this is the C programming forum.Quote:
Originally Posted by Sebastiani
Hey Sebastiani...
In your timing code, can you please add another function to try out:
Similar in mathematical result as
Here is my contest entry to compute ceil ( log2 ( n ) ) faster than using the library math functions.Code:int get_first_available_channel_ceil_log( unsigned channels )
{
int
index = ( int )ceil( log2( channels + 1 ) );
return index ? 33 - index : 0;
}
Can you please run it and post the timing results for all the algorithms?Code:int fast_index(unsigned channels) {
double c = (double)channels;
return channels ? 1055 - (int)(*(long long *)&c >> 52): 0; }
>> On a hunch: did you compile as C++ or as C99? I tried compiling as C++ instead, and there was no errors, but this is the C programming forum.
You're right, I did compile it with a C++ compiler. But I'm still confused as to why this is considered an error (and too lazy at the moment to look at the standard).
>> Here is my contest entry to compute ceil ( log2 ( n ) ) faster than using the library math functions.
It's not really a contest, I was just trying to get an idea of how the algorithms compared. Unfortunately, I don't think that the code you posted is portable for a couple of reasons. For one, the long long type is not guaranteed to be the same size as a double, and besides that, I don't believe the standard even mandates any particular binary representation for doubles. Having said that, it will probably work on *most* implementations. At any rate, I wil take a look at it and see how the function performs, in general.
EDIT:
It looks like the bitwise ceil/log solution is not far behind the lookup table approach (slightly less than twice in running time), so it may be a good candidate for memory-strapped applications such as imbedded systems. It's possible that it could be optimized further to run even faster, but I'll leave it up to you math geniuses to work that out. :p Anyway, here's an updated (C99 compiant) version of the test:
Code:#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#define array_size( array ) ( sizeof( array ) / sizeof( array[ 0 ] ) )
void initialize_table( unsigned char table[ 256 ] )
{
int
index,
shift,
min,
max;
for( shift = min = 1; shift < 9; shift++ )
{
max = 1 << shift;
for( index = min; index < max; index++ )
{
table[ index ] = 9 - shift;
}
min = max;
}
}
int get_first_available_channel_table_lookup( unsigned channels )
{
static unsigned char
table[ 256 ];
static int
initialized = 0;
if( !initialized )
{
initialize_table( table );
initialized = 1;
}
return
channels == 0 ? 0
: channels < 0x100 ? table[ channels & 0xff ] + 24
: channels < 0x10000 ? table[ ( channels >> 8 ) & 0xff ] + 16
: channels < 0x1000000 ? table[ ( channels >> 16 ) & 0xff ] + 8
: table[ ( channels >> 24 ) & 0xff ];
}
int get_first_available_channel_ceil_log( unsigned channels )
{
int
index = ( int )ceil( log2( channels + 1 ) );
return index ? 33 - index : 0;
}
int get_first_available_channel_bit_by_bit( unsigned channels )
{
unsigned
index,
mask;
for( index = 1, mask = 0x80000000; mask; mask >>= 1, index++ )
{
if( channels & mask )
return index;
}
return 0;
}
int get_first_available_channel_bitwise_ceil_log( unsigned channels )
{
double
c = ( double )channels;
return channels ? 1055 - ( int )( *( long long * )&c >> 52 ) : 0;
}
typedef int ( * get_first_available_channel_function )( unsigned );
void test( unsigned* values, size_t length, const char* name, get_first_available_channel_function function, size_t repetitions )
{
size_t
iterations = repetitions * length;
unsigned
* v,
* e = values + length;
clock_t
elapsed,
stop,
start = clock( );
while( repetitions-- )
{
v = values;
while( v != e )
{
function( *v++ );
}
}
stop = clock( );
elapsed = ( clock_t )( ( ( double ) ( stop - start ) / ( double )CLK_TCK ) * 1000 );
printf( "%s : %d iterations performed in %d milliseconds\n", name, iterations, ( size_t )elapsed );
}
int main( void )
{
int
index;
unsigned
data[ 256 ];
size_t
repetitions = 30000,
tests = 10;
struct
{
char*
name;
get_first_available_channel_function
function;
}
functions[ ] =
{
{ "bitwise_ceil_log", get_first_available_channel_bitwise_ceil_log },
{ "ceil_log", get_first_available_channel_ceil_log },
{ "table_lookup", get_first_available_channel_table_lookup },
{ "bit_by_bit", get_first_available_channel_bit_by_bit }
};
srand( time( 0 ) );
while( tests-- )
{
for( index = 0; index < array_size( data ); ++index )
{
data[ index ] = rand( );
}
for( index = 0; index < array_size( functions ); ++index )
{
test( data, array_size( data ), functions[ index ].name, functions[ index ].function, repetitions );
}
}
return 0;
}
The one with the 1055 constant in it could be made portable by using frexp:
Code:int get_first_available_channel_frexp( unsigned channels )
{
int exponent;
frexp(static_cast<double>(channels), &exponent);
return channels ? sizeof(channels) * CHAR_BIT + 1 - exponent : 0;
}
Can you add GCC's internal implementation for comparison?
I had to do this for a project a while ago, and tried a few algorithms. None could beat the GCC builtin one. I was working with 64-bit unsigned, though.Code:int get_first_available_channel_gcc_builtin( unsigned channels ) {
if (channels)
return __builtin_clz (channels);
return 0;
}
>> I had to do this for a project a while ago, and tried a few algorithms. None could beat the GCC builtin one. I was working with 64-bit unsigned, though.
You're right, this turned out to be the fastest so far. Any idea how it's implemented?
Here's an updated version of the test (with verification):
Code:#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include <limits.h>
#define array_size( array ) ( sizeof( array ) / sizeof( array[ 0 ] ) )
void initialize_table( unsigned char table[ 256 ] )
{
int
index,
shift,
min,
max;
for( shift = min = 1; shift < 9; ++shift )
{
max = 1 << shift;
for( index = min; index < max; ++index )
{
table[ index ] = 9 - shift;
}
min = max;
}
}
int get_first_available_channel_table_lookup( unsigned channels )
{
static unsigned char
table[ 256 ];
static int
initialized = 0;
if( !initialized )
{
initialize_table( table );
initialized = 1;
}
return
channels == 0 ? 0
: channels < 0x100 ? table[ channels & 0xff ] + 24
: channels < 0x10000 ? table[ ( channels >> 8 ) & 0xff ] + 16
: channels < 0x1000000 ? table[ ( channels >> 16 ) & 0xff ] + 8
: table[ ( channels >> 24 ) & 0xff ];
}
int get_first_available_channel_ceil_log( unsigned channels )
{
int
index = ( int )ceil( log2( channels + 1 ) );
return index ? 33 - index : 0;
}
int get_first_available_channel_bit_by_bit( unsigned channels )
{
unsigned
index,
mask;
for( index = 1, mask = 0x80000000; mask; mask >>= 1, ++index )
{
if( channels & mask )
return index;
}
return 0;
}
int get_first_available_channel_bitwise_ceil_log( unsigned channels )
{
double
c = ( double )channels;
return channels ? 1055 - ( int )( *( long long * )&c >> 52 ) : 0;
}
int get_first_available_channel_frexp( unsigned channels )
{
static int
constant = sizeof( channels ) * CHAR_BIT + 1;
int
exponent;
frexp( ( double )channels, &exponent );
return channels ? constant - exponent : 0;
}
int get_first_available_channel_gcc_builtin( unsigned channels )
{
if( channels )
return __builtin_clz( channels ) + 1;
return 0;
}
typedef int ( * get_first_available_channel_function )( unsigned );
struct entry
{
char*
name;
get_first_available_channel_function
function;
int
passed;
};
void test( unsigned* values, size_t length, struct entry* entry, size_t repetitions )
{
int
result,
control;
size_t
iterations = repetitions * length;
unsigned
* v,
* e = values + length;
for( v = values; v != e; ++v )
{
result = entry->function( *v );
control = get_first_available_channel_table_lookup( *v );
if( result != control )
{
printf( "Error: %s reported channel in 0x%x to be %d (should be %d)\n", entry->name, *v, result, control );
entry->passed = 0;
}
}
if( !entry->passed )
{
printf( "%s : N/A (invalid)\n" );
return;
}
clock_t
elapsed,
stop,
start = clock( );
while( repetitions-- )
{
v = values;
while( v != e )
{
entry->function( *v++ );
}
}
stop = clock( );
elapsed = ( clock_t )( ( ( double ) ( stop - start ) / ( double )CLK_TCK ) * 1000 );
printf( "%s : %d iterations performed in %d milliseconds\n", entry->name, iterations, ( size_t )elapsed );
}
int main( void )
{
int
index;
unsigned
data[ 256 ];
size_t
repetitions = 30000,
tests = 10;
struct entry
entries[ ] =
{
{ "frexp", get_first_available_channel_frexp, 1 },
{ "gcc_builtin", get_first_available_channel_gcc_builtin, 1 },
{ "bitwise_ceil_log", get_first_available_channel_bitwise_ceil_log, 1 },
{ "ceil_log", get_first_available_channel_ceil_log, 1 },
{ "table_lookup", get_first_available_channel_table_lookup, 1 },
{ "bit_by_bit", get_first_available_channel_bit_by_bit, 1 }
};
srand( time( 0 ) );
while( tests-- )
{
for( index = 0; index < array_size( data ); ++index )
{
data[ index ] = rand( );
}
for( index = 0; index < array_size( entries ); ++index )
{
test( data, array_size( data ), &entries[ index ], repetitions );
}
}
return 0;
}
I ran the test myself, and with -O3 (gcc -O3 -std=c99 test.c -lm),
If I take it outQuote:
...
Error: bitwise_ceil_log reported channel in 0xe002f85 to be 1055 (should be 5)
Error: bitwise_ceil_log reported channel in 0x68dce8fc to be 1055 (should be 2)
Error: bitwise_ceil_log reported channel in 0x3997ff43 to be 1055 (should be 3)
Error: bitwise_ceil_log reported channel in 0x23c952b to be 1055 (should be 7)
Error: bitwise_ceil_log reported channel in 0xecf79ff to be 1055 (should be 5)
Error: bitwise_ceil_log reported channel in 0x401317cd to be 1055 (should be 2)
Error: bitwise_ceil_log reported channel in 0x141ac60 to be 1055 (should be 8)
This is GCC 4.3.3 on 32-bit x86 Linux.Quote:
rexp : 7680000 iterations performed in 200 milliseconds
gcc_builtin : 7680000 iterations performed in 40 milliseconds
ceil_log : 7680000 iterations performed in 830 milliseconds
table_lookup : 7680000 iterations performed in 70 milliseconds
bit_by_bit : 7680000 iterations performed in 120 milliseconds
Hardware implementation.Quote:
Any idea how it's implemented?
Code:bsrl %edx, %eax
I don't understand why bitwise_ceil_log crapped out. Perhaps there is no "long long" (__int64 or some variant) in your compiler - although you should have gotten hideous syntax errors to that effect. My guess it that the >> 53 shift gives a zero every time for some reason.
No, wait. It's more likely an endian thing where the IEEE 754 floating-point format can not be directly overlaid with the internal representation of a 64-bit integer.
It most certainly does.Quote:
Perhaps there is no "long long" (__int64 or some variant) in your compiler
I have no idea, either. It only doesn't work if I turn on optimization. Works fine with O0. Meaning it's probably something to do with relying on some kind of undefined behaviour (or GCC bug).