I'm bored, so here's a solution (assuming a 4-byte int)
Code:

#include <stdio.h>

// Returns the 1-based position of the lowest set bit in a byte.
// Caller must pass a non-zero byte; a zero byte would fall through to 8.
static inline int which_bit_char( const char mydata ) {
    if ( mydata & 0x0F )                // in lower half (bits 1-4)
        if ( mydata & 0x03 )            // first quarter
            if ( mydata & 0x01 )
                return 1;               // first bit
            else
                return 2;               // second bit
        else                            // second quarter
            if ( mydata & 0x04 )
                return 3;               // third bit
            else
                return 4;               // fourth bit
    else                                // in upper half (bits 5-8)
        if ( mydata & 0x30 )            // third quarter
            if ( mydata & 0x10 )
                return 5;               // fifth bit
            else
                return 6;               // sixth bit
        else                            // fourth quarter
            if ( mydata & 0x40 )
                return 7;               // seventh bit
            else
                return 8;               // eighth bit
}

// Returns the 1-based position of the lowest set bit, or 0 if no bit is set.
static inline int which_bit_int( unsigned int mydata ) {
    if ( !mydata )
        return 0;
    if ( mydata & 0x0000FFFF )          // lowest set bit is in the lower 16 bits
        if ( mydata & 0x000000FF )
            return which_bit_char( (char)mydata );
        else
            return 8 + which_bit_char( (char)(mydata>>8) );
    else                                // otherwise it is in the upper 16 bits
        if ( mydata & 0x00FF0000 )
            return 16 + which_bit_char( (char)(mydata>>16) );
        else
            return 24 + which_bit_char( (char)(mydata>>24) );
}

int main () {
    int mynum = 0x00400;                // only bit 11 is set
    printf( "Bit #%d\n", which_bit_int( mynum ) );
    return 0;
}
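
If you'd rather not rely on the 4-byte assumption, the same halving idea can be written as a loop. This is only a sketch: lowest_set_bit is a made-up name, and it assumes unsigned int has a power-of-two number of bits with no padding bits.

```c
#include <limits.h>

// Hypothetical helper (not part of the code above): 1-based position of
// the lowest set bit in x, or 0 if x is zero.
static int lowest_set_bit( unsigned int x ) {
    if ( x == 0 )
        return 0;
    int pos = 1;
    // Start with half the width of unsigned int and keep halving it.
    for ( unsigned int width = (unsigned int)( sizeof x * CHAR_BIT ) / 2;
          width > 0; width /= 2 ) {
        unsigned int low_mask = ( 1u << width ) - 1u;
        if ( ( x & low_mask ) == 0 ) {  // nothing set in the lower part
            x >>= width;                // so continue in the upper part
            pos += (int)width;
        }
    }
    return pos;
}
```

lowest_set_bit( 0x00400 ) gives 11, the same bit which_bit_int() reports for that value.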

I'm really bored today. Run this and lemme know what you get
Code:

#include <stdio.h>
#include <time.h>
// Paste which_bit_char() and which_bit_int() from the code above in here.
int main () {
    double bin_time = 0, for_time = 0;
    for ( int k=0; k<10; k++ ) {
        {   // binary search
            clock_t start, end;
            start = clock();
            unsigned int num = 0, mask = 1;
            for ( unsigned int j=0; j<0xfffffff; j++ )
                // restart at bit 1 on every pass, otherwise mask stays 0 after the first one
                for ( mask=1, num=mask; num; mask<<=1, num=mask )
                    which_bit_int( num );   // result is unused, so an optimiser may drop the call
            end = clock();
            bin_time += (double)( end - start ) / (double)CLOCKS_PER_SEC;
            printf ( "Binary Search: %f seconds\n",
                     (double)( end - start ) / (double)CLOCKS_PER_SEC );
        }
        {   // plain loop search
            clock_t start, end;
            start = clock();
            unsigned int num = 0, mask = 1;
            for ( unsigned int j=0; j<0xfffffff; j++ )
                for ( mask=1, num=mask; num; mask<<=1, num=mask )
                    for ( unsigned int mmask=1; mmask; mmask<<=1 )
                        if ( num & mmask )
                            break;
            end = clock();
            for_time += (double)( end - start ) / (double)CLOCKS_PER_SEC;
            printf ( "Loop Search: %f seconds\n\n",
                     (double)( end - start ) / (double)CLOCKS_PER_SEC );
        }
    }
    printf( "total binary time: %f\n", bin_time );
    printf( "total loop time: %f\n", for_time );
    return 0;
}

The for loop runs faster for me (mostly; it varies). I have the optimiser set to fast.

Edit: I figure the loops must be unrolled, actually. I changed the wrapping for loop to run 1000 times, resulting in 8,321,499,105,000 iterations (maybe; very back-of-envelope calculation) for both the binary and loop searches. The binary search came out at 369.104 seconds and the loop search at 368.191. That makes the loop search faster by about 110 fs per iteration, i.e. (369.104 - 368.191) s / 8.32e12 (obviously not really, but with the law of averages 'n' all it is).

A friend ran it with the optimiser off for 10 wrapping iterations and got 16.81 and 30.17 seconds for the binary and for-loop searches respectively.

Conclusion -- Optimisers are smart.
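
Since the timing loops throw the search result away, an optimiser is free to delete the calls as dead code. One way to keep the work observable is to add the results up and store the sum in a volatile variable. A rough sketch, not the benchmark that produced the numbers above: time_search, scan_bit and sink are made-up names, and scan_bit is just a stand-in for whichever search you want to time.

```c
#include <time.h>

// Hypothetical stand-in for the search under test: plain linear scan.
// Returns the 1-based position of the lowest set bit, or 0 if x is zero.
static int scan_bit( unsigned int x ) {
    int pos = 1;
    if ( x == 0 )
        return 0;
    while ( !( x & 1u ) ) {
        x >>= 1;
        pos++;
    }
    return pos;
}

// A store to a volatile object is observable, so it cannot be removed.
static volatile unsigned long sink;

// Runs `reps` passes over every single-bit value and returns the seconds taken.
static double time_search( int (*search)( unsigned int ), unsigned long reps ) {
    unsigned long total = 0;
    clock_t start = clock();
    for ( unsigned long j = 0; j < reps; j++ )
        for ( unsigned int mask = 1; mask; mask <<= 1 )
            total += (unsigned long)search( mask );
    sink = total;   // the sum is used, so the calls are no longer dead code
    return (double)( clock() - start ) / (double)CLOCKS_PER_SEC;
}
```

Call it as time_search( scan_bit, 1000 ) and print the returned seconds. A clever compiler may still fold parts of the sum at compile time, so treat the numbers as a rough guide.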