1. ## ffs algorithm

I have code, which works, to find the most significant bit set in an unsigned
int. It's an ffs clone for systems without the non-standard libc ffs().
Our other "bit" code uses bit twiddling and is fast. This one algorithm is
about 300% slower. It's also slow on the boxes that do have it as part the math library
version from libc.

Since ffs is part of libc on only some of our boxes, and we need it everywhere, I
reverse engineered it and got:

Code:
```    int
ffs(unsigned int value)
int exponent=floor(log2((double) value));
return  exponent+1;
}```

After googling I got, among other things (from D Neto) some
promising code, or so it seemed.

Code:
```ffs1(register unsigned int value)
{
int i=0;
value |= value>>1;
value |= value>>2;
value |= value>>4;
value |= value>>8;
value |= value>>16;
value ^= (value>>1);
while(value)
{
value=value>>1;
i++;
}
return i;
}```
Another try by me:
Code:
```int ffs2(register unsigned int value){
register int i=0;
if(value)
for(i=1;~value&1;value=value>>1,++i);
return i;
}```
Profile says for 104,000,000 iterations ffs1() was 35% slower the ffs with math.
ffs2() is about 5% faster than ffs() for the same test. -- not really significant.

What am I missing?

ffs() is part of the critical path in new code and needs improvement. Suggestions,
changes, other algorithms?

Thanks.

2. I'm not familiar with ffs, so I don't know how that might compare to some of the things here. One thing I do notice is that I don't get the same result with ffs1 as I do with ffs2. [EDIT]Well I guess they do if only one bit is set in the value.

Look at my first reply, and use the same technique to compare against powers of two to home in on the highest set bit.
Should only take 5 comparisons for any 32 bit number.

4. Intel has assembly instructions bsr and bsf.

5. Thanks guys.

Bithack page I remember from years ago - good link.

asm won't work -- too many platforms

Salem's approach worked the most efficiently.

Thanks. We have an old app that creates bit-encoded data and we have to ffs on 32 bit longs... way too many times. Otherwise it would not be an issue, and I'd let the compiler work it out. Run-time is now way less than 24 hours for a daily job, kind of an elementary requirement.