efficient way to calc log base 2 of 64-bit unsigned int

**cyberfish** · 07-25-2007

Hi,
Is there a more efficient way to calculate the log base 2 of a 64-bit unsigned integer than looping through the bits? I couldn't find an algorithm on the internet that works on 64-bit numbers.

PS. assume that the input is a power of 2 (exactly one bit is set)

Thank you very much

**Salem** · 07-25-2007

Code:

#include<stdio.h>
#include<stdlib.h>

int findBit ( unsigned long val ) {
    int result = 0;
    if ( val >= 0x10000 ) {
        result += 16;
        val >>= 16;
    }
    if ( val >= 0x100 ) {
        result += 8;
        val >>= 8;
    }
    if ( val >= 0x10 ) {
        result += 4;
        val >>= 4;
    }
    if ( val >= 0x4 ) {
        result += 2;
        val >>= 2;
    }
    if ( val >= 0x2 ) {
        result += 1;
        val >>= 1;
    }
    return result + val;
}

int main ( ) {
    unsigned long tests[] = {
        0, 0x1, 0x100000, 0x400000, 0x80000000
    };
    int i;
    for ( i = 0 ; i < sizeof tests / sizeof *tests ; i++ ) {
        printf("&#37;08lx = %d\n", tests[i], findBit(tests[i]) );
    }

    return 0;
}

$ gcc foo.c
$ ./a.exe
00000000 = 0
00000001 = 1
00100000 = 21
00400000 = 23
80000000 = 32

64 bits is just one more test.

**cyberfish** · 07-25-2007

thank you for the code, took me a while to understand, but I think I do now.

**robatino** · 07-25-2007

The "val >=" and "result +=" lines can be implemented using bitwise & and |=, resp.

**cyberfish** · 07-25-2007

thank you for the suggestion. I changed the code accordingly:

Code:

unsigned int findBit ( Uint64 bb ) {
	unsigned int result = 0;
	
	if ( bb & ~0x1FFFFFFFFLL) {
		result |= 32;
		bb >>= 32;
	}	
	if ( bb & ~0x1FFFF ) {
		result |= 16;
		bb >>= 16;
	}
	if ( bb & ~0x1FF ) {
		result |= 8;
		bb >>= 8;
	}
	if ( bb & ~0x1F ) {
		result |= 4;
		bb >>= 4;
	}
	if ( bb & ~0x7 ) {
		result |= 2;
		bb >>= 2;
	}
	if ( bb & ~0x3 ) {
		result |= 1;
		bb >>= 1;
	}
	return (result + bb);
}

however, to my surprise, through benchmarking (a simple loop that executes this function many times), it appears that the improvement in efficiency is unnoticeable or even non-existant. Is this what is supposed to happen? or has my compiler (gcc 4.1) done some magic to the old code?

Thank you

**Mario F.** · 07-25-2007

bitwise operations don't experience the performance improvement of old days.

Always a nice thing to do, I expect. Especially when there are plans to turn new code into old code or use new code in old systems. However, as far as modern compilers (and processors?)are concerned, explicit bitwise operations in source code don't have too much of an impact any more under most common situations.

**robatino** · 07-25-2007

I'm surprised - I wasn't sure about the "val >=", but I was pretty sure that the compiler wasn't smart enough to realize that each of the "result +=" could be replaced with |= (since it requires seeing that the different increments have no bits in common, and that the initial value of result is 0) and that |= would be faster than +=.

**matsp** · 07-25-2007

Originally Posted by robatino

I'm surprised - I wasn't sure about the "val >=", but I was pretty sure that the compiler wasn't smart enough to realize that each of the "result +=" could be replaced with |= (since it requires seeing that the different increments have no bits in common, and that the initial value of result is 0) and that |= would be faster than +=.

The compiler probably doesn't replace += with |=, but += and |= are the same number of clocks either way on a modern x86-based processor - one cycle latency for the actual OR operation, and then whatever additional time to access the arguments (in this example no time at all, since on side is constants and most likely result is in a register - or it's cached so not much of a latency there either).

Where it may make a difference is if you're OR-ing to data that is bigger than a register, say a 32-bit value on a 16-bit x86 processor. But that was a while back. I'm pretty sure that OR and ADD are the same number of cycles even on a 486.

--
Mats

**Salem** · 07-25-2007

If you wanted to go "all out", and you're using an x86, there's always the "BSR—Bit Scan Reverse" instruction to play with.

> it appears that the improvement in efficiency is unnoticeable or even non-existant.
> Is this what is supposed to happen?
Why would you imagine that there would be some vast difference in performance?

Personally, I prefer mine, because it's more intuitively obvious (well I think so) as to what is going on.

**cyberfish** · 07-25-2007

thank you for all your comments

I was under the (wrong) impression that bitwise operations are faster than add.

thank you for clarifying it for me

in this case, I would prefer Salem's original version, too, for the readability.

by the way, i am using this function in the inner loop of a chess playing program, that's why I wanted to optimize it

**cyberfish** · 07-25-2007

Where it may make a difference is if you're OR-ing to data that is bigger than a register, say a 32-bit value on a 16-bit x86 processor. But that was a while back. I'm pretty sure that OR and ADD are the same number of cycles even on a 486.

so it would matter if I am OR-ing to 64-bit data on a 32-bit CPU? what kind of difference would that make?

Thank you

**matsp** · 07-26-2007

Originally Posted by cyberfish

so it would matter if I am OR-ing to 64-bit data on a 32-bit CPU? what kind of difference would that make?

Thank you

Whilst testing this out, I've found that Visual Studio VC7 doesn't do this very well [because it touches the second half of the 64-bit word anyways]- but when I tried to do it in assembler, I found that the result isn't noticably better - most likely due to bad branch prediction. If I use a constant instead of a variable, then the assembler version is much faster than the C-version with variable input, but the compiler realizes that the input is constant and optimizes the C-version out of the loop, so it takes "zero time" to run the loop!

--
Mats

**cyberfish** · 07-26-2007

I see... thank you

**brewbuck** · 07-26-2007

Originally Posted by cyberfish

Hi,
Is there a more efficient way to calculate the log base 2 of a 64-bit unsigned integer than looping through the bits? I couldn't find an algorithm on the internet that works on 64-bit numbers.

My advice is to keep the board reading code as simple and understandable as possible. Look for performance gains by enhancing your evaluation function, using a transposition table, and improving move ordering for alpha-beta search. You can get very large gains this way, whereas optimizing the board access function will lead to only a linear speedup. You're not going to make your code twice as fast by fiddling with this function.

One of the NASA engineers who designed the control code for the antenna beam on the space shuttle once told me, "Don't bother making something faster unless you can make it 4 times faster."

**cyberfish** · 07-26-2007

My advice is to keep the board reading code as simple and understandable as possible. Look for performance gains by enhancing your evaluation function, using a transposition table, and improving move ordering for alpha-beta search. You can get very large gains this way, whereas optimizing the board access function will lead to only a linear speedup. You're not going to make your code twice as fast by fiddling with this function.

Thank you for your advice, it's much appreciated. I am still on my board handling and move generation code, not yet onto the AI part, but I am already planning to implement various optimizations such as transposition tables as I have one month to work on the AI if needed (I am still a high school student).

One of the NASA engineers who designed the control code for the antenna beam on the space shuttle once told me, "Don't bother making something faster unless you can make it 4 times faster."

well, the change from my original looping algorithm to this algorithm is a change from linear search to binary search (if I understand this algorithm correctly) and the speed up is well over 4 times =)

Thread: efficient way to calc log base 2 of 64-bit unsigned int

Thread Tools

Search Thread

Display

efficient way to calc log base 2 of 64-bit unsigned int

Similar Threads

Code review

30something GSOH seeks help with the basics of a minor programme

Need help understanding info in a header file

Can't Find Conio.h?

A Simple (?) Problem