quick log lookup table

**R.Stiltskin** · 12-01-2008

I want to create a quick lookup table to return unsigned chars for log (base 2) of 2,4,8,16,32,64,128 and avoid calculating (unsigned char)log(x)/log(2.0) thousands of times per second.

A simple solution seems to be an array of unsigned chars which would use 129 bytes ("wasting" 122 of them, but who cares).

I can use a const array, initialized in a header file so it will be available to public static methods of class U. (No "U" object is ever instantiated; it just contains several static methods that are used by various objects.)

Is there a nicer way to do this than listing all 129 elements as in

Code:

const unsigned char mylog[129] = { 0,0,1,0,2,0,0,0,3,0,0,0,0,0,0,0,4,0 ... 0,7};

so that, for example mylog[64] = (unsigned char)6? I'm not worried about all those 0's because I'll never look up mylog of anything other than those specific 7 values.

I read somewhere that there is a gcc extension that would allow initialization by writing

Code:

const int mylog[129] = { [2]=1, [4]=2, [8]=3, ... };

and so on, but it doesn't seem to work in gcc 4.2 or 3.3. In 4.2 I get "error: expected primary-expression before '[' token". In 3.3 I get "syntax error before '=' token". (Anyway, I'd rather not use something that's not portable.)

Alternatively, can I use map or vector? Is there a way to initialize a map or a vector in a header (since I never call a constructor for the class that will be using the table)?

**matsp** · 12-01-2008

[quote]
In ISO C99 you can give the elements in any order, specifying the array indices or structure field names they apply to, and GNU C allows this as an extension in C89 mode as well. This extension is not implemented in GNU C++.
[/code]

Don't ask me why. You could of course have a constructor of the class U that initializes the data array, or you can write it in code like this:

Code:

unsigned char log2n(unsigned char n)
{
     char logv = 0;
     while(n >> 1) logv++;
     return logv;
}

Since it will loop a maximum of 7 times, it shouldn't be very bad. You could also do a if-tree, which would be 4 conditions.

Since I like this sort of challenge, I'm going to write a tiny bit of code to try a few different variants out. Will report back.

--
Mats

**phantomotap** · 12-01-2008

O_o

Why not just fill in the correct values for all 8 bits? Simple. Easy. Fast. Usable.

You could also do a if-tree, which would be 4 conditions.

If he is really only interested in those values it would be two.

Since I like this sort of challenge, I'm going to write a tiny bit of code to try a few different variants out. Will report back.

I shouldn't bother. The array version should always be fastest, followed by the simple while, followed by the IEEE float trick, followed by nibbling, followed by modified binary search, followed by modified binary search as expressed by bit twiddling. (This assuming interest in only eight bits on a processor with decent float hardware.)

Edit: I'm not pulling that out of a hat either; a question regarding the base two logarithm of octets was recently asked at Ars which sent me on the same challenge.

Oh, and you have a bug:

Code:

while(n >> 1) logv++;

->

Code:

while(n >>= 1) logv++;

Soma

**R.Stiltskin** · 12-01-2008

Thanks, I didn't intend this to be a "tricky code" challenge. Just looking for a "nice" way to write a simple direct-lookup table.

So, it looks as if it's going to be:

Code:

const unsigned char myLog[] = { 0,0,1,0,2,0,0,0,3,0,0,0,0,0,0,0,
                                4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                7 };

But since you mentioned this

You could of course have a constructor of the class U that initializes the data array

it brings up this general C++ question. Please clarify: if I never instantiate an object of type U, but only call its public static methods from other objects, would its constructor ever be called?

**matsp** · 12-01-2008

Originally Posted by R.Stiltskin

Thanks, I didn't intend this to be a "tricky code" challenge. Just looking for a "nice" way to write a simple direct-lookup table.

So, it looks as if it's going to be:

Code:

const unsigned char myLog[] = { 0,0,1,0,2,0,0,0,3,0,0,0,0,0,0,0,
                                4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                7 };

But since you mentioned this it brings up this general C++ question. Please clarify: if I never instantiate an object of type U, but only call its public static methods from other objects, would its constructor ever be called?

No, you need to instantiate the class somewhere - but if mylog is a static member, then you can do it "sneakily" by just doing:

Code:

int main()
{
   ... // somewhere early in main. 
   {
        U u;
   }  // u no longer exists, so no space is used by u. 
   ... 
}

I know you didn't mean to make it a challenge. I will most my code and results in a bit.

--
Mats

**EVOEx** · 12-01-2008

Originally Posted by matsp

Code:

unsigned char log2n(unsigned char n)
{
     char logv = 0;
     while(n >> 1) logv++;
     return logv;
}

I'm quite sure your code is faster, but just for the fun of it:

Code:

int log2n(unsigned char n)
{
        return ((n & (n-1)) == 0) * (((n-1) * 134480385) & 4581298449) % 15 % 9;
}

**matsp** · 12-01-2008

EVOEx, your function doesn't give the right value for 32 - and since I have no idea what it actually does, I can't fix it... It returns 6 instead of the expected 5.

I removed the exit from my code so it still accepts it, just for the fun factor.

Code:

#include <iostream>
#include <ctime>
#include <cmath>
#include <cstdlib>

using namespace std;

typedef unsigned char U8;

U8 log2a(U8 n)
{
    static const U8 table[129] = { 0, 
		   0, 1, 
		   0, 2, 
                   0, 0, 0, 3, 
                   0, 0, 0, 0, 0, 0, 0, 4,
                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 
                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 
                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7 };
    return table[n];
}

U8 log2b(U8 n)
{
    U8 logv = 0;
    while(n >>= 1) logv++;
    return logv;
}

U8 log2c(U8 n)
{
     if (n >= 16)
     {
         if (n >= 64)
         {
             if (n >= 128)
                return 7;
             return 6; 
         } 
	 else 
	 {
	     if (n >= 32)
		 return 5;
	     return 4;
	 }
     }
     // Less than 16. 
     else
     {
          if (n >= 4)
	  {
	      if (n >= 8)
		  return 3;
	      return 2;
	  } 
	  else
	  {
	      if (n <= 1)
		  return 0;
	      return 1;
	  }
     }       
     return 100;
}

const double constlog2 = log(2.0);

U8 log2d(U8 n)
{
    return static_cast<int>(log(static_cast<double>(n)+0.0001) / constlog2);
}


U8 log2n(U8 n)
{
    return ((n & (n-1)) == 0) * (((n-1) * 134480385) & 4581298449LL) % 15 % 9;
}

#define _FUNC(x) { x, #x, 0.0 }
#define FUNC(x)  _FUNC(log2##x)

struct funcs
{
    U8 (*fn)(U8 n);
    char *name;
    double kiterspersec;
};

funcs functab[] =
{
    FUNC(a),
    FUNC(b),
    FUNC(c),
    FUNC(d),
    FUNC(n)
};



const int functabsize = sizeof(functab) / sizeof(functab[0]);


const double mintime = 3.1;

void timedLoop(funcs &f)
{
    int n = 1000;  // Starting value. 
    double t = 0;
    clock_t c;
    do
    {
	c = clock();
	for(int iter = 0; iter < n; ++iter) 
	{
	    for(U8 i = 1; i; i <<= 1)
	    {
		f.fn(i);
	    }
	}
	t = static_cast<double>(clock() - c) / CLOCKS_PER_SEC;
	if (t < mintime)
	{
	    if (t == 0)
	    {
		n <<= 2;
	    }
	    else
	    {
		n = static_cast<int>(mintime / t * n + 2.0);
	    }
	}
    } while(t < mintime);
    f.kiterspersec = (n / (t * 1000.0)) ;
    cout << f.name << " does " << f.kiterspersec << "k iterations per second" 
	 << endl;
}
    

int main()
{
    // First, validate results. 
    int err = 0;
    for(U8 i = 1; i; i <<= 1)
    {
	U8 r = functab[0].fn(i);
	U8 p;
	for(int j = 0; j < functabsize; j++)
	{
	    p = functab[j].fn(i);
	    if (p != r)
	    {
		cout << "func " << functab[j].name << " is different for " 
		     << static_cast<int>(i) 
		     << " expected " << static_cast<int>(r) 
		     << " got " << static_cast<int>(p) << endl;
		err++;
	    }
	}
    }
    if(err)
    {
	cout << "Error found" << endl;
//	exit(1);
    }
    double best = 0.0;
    int bestindex = 0;
    for(int i = 0; i < functabsize; i++)
    {
	timedLoop(functab[i]);
	if (functab[i].kiterspersec > best)
	{
	    best = functab[i].kiterspersec;
	    bestindex = i;
	}
    }
    cout << "Best function: " << functab[bestindex].name << " with " 
	 << functab[bestindex].kiterspersec 
	 << "k iterations per second" << endl;
    
    return 0;
}

I tried rewriting log2b() as assembler, but it was only a tiny bit faster.
Log2c is actually getting optimized by the compiler to avoid a couple fo the jumps.

Results from gcc:

Code:

E:\Temp>a
func log2n is different for 32 expected 5 got 6
Error found
log2a does 30340.7k iterations per second
log2b does 9144.34k iterations per second
log2c does 23787k iterations per second
log2d does 694.322k iterations per second
log2n does 1482.22k iterations per second
Best function: log2a with 30340.7k iterations per second

Compiled with g++ -O3 -Wall -Wextra.

I also tried Visual Studio, but it was noticably slower with -Ox - not sure why - I never looked at the assembler code.

Aside from log2n being wrong for the value 32, it's about twice as fast as log(n)/log2 with constant optimization.

--
Mats

**phantomotap** · 12-01-2008

EVOEx, your function doesn't give the right value for 32 - and since I have no idea what it actually does, I can't fix it... It returns 6 instead of the expected 5.

Nearest I can tell it doesn't do much of anything. Change the relevant loop in 'main' to 'for(U8 i = 1; i < 129; ++i)' and you will see what I mean.

Edit: Also try these two on for size:

Code:

U8 log2e(U8 n)
{
  return((n > 127) + (n > 63) + (n > 31) + (n > 15) + (n > 7) + (n > 3) + (n > 1));
}

Code:

U8 log2f(U8 n)
{
  static const U8 n1[16] = {0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3};
  static const U8 n2[16] = {0, 4, 8, 8, 12, 12, 12, 12, 16, 16, 16, 16, 16, 16, 16, 16};
  static const U8 nt[17] = {0, 1, 2, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7};
  return(nt[n1[n & 0x0F] | n2[(n >> 4) & 0x0F]]);
}

Note: It isn't that they are fast. They are slower than many methods. They just aren't as bad as most people think, but the real fun is that they are considerably faster than what most people concoct as "clever" solutions.

Soma

**phantomotap** · 12-01-2008

^_^

Sorry, EVOEx, I didn't realize what you had done until after I had gone to get my shower. (Referencing the fact that you coded it specifically to return zero in the "gaps" exactly as the OP asked rather than just return the base two logarithm in all cases.)

Anyway, matsp, his algorithm is fine; it is his code that is off. He needs extra casting to force a 'long long' integer during other portions of the algorithm.

Code:

U8 log2n(U8 n1)
{
    long long n = n1;
    return ((n & (n-1)) == 0) * (((n-1) * 134480385) & 4581298449LL) &#37; 15 % 9;
}

Soma

Edit: Yea, I'm really just guessing. It just seems like this is probably correct. ^_^

**iMalc** · 12-02-2008

I suggest bookmarking this page. It's got some facinating and mind-boggling optimistions:
http://www-graphics.stanford.edu/~seander/bithacks.html

Here are some of the super fast methods for what you want:
http://www-graphics.stanford.edu/~se...tegerLogLookup
http://www-graphics.stanford.edu/~se...tml#IntegerLog
http://www-graphics.stanford.edu/~se...gerLogDeBruijn

**EVOEx** · 12-02-2008

Originally Posted by phantomotap

^_^

Sorry, EVOEx, I didn't realize what you had done until after I had gone to get my shower. (Referencing the fact that you coded it specifically to return zero in the "gaps" exactly as the OP asked rather than just return the base two logarithm in all cases.)

Anyway, matsp, his algorithm is fine; it is his code that is off. He needs extra casting to force a 'long long' integer during other portions of the algorithm.

Code:

U8 log2n(U8 n1)
{
    long long n = n1;
    return ((n & (n-1)) == 0) * (((n-1) * 134480385) & 4581298449LL) % 15 % 9;
}

Soma

Edit: Yea, I'm really just guessing. It just seems like this is probably correct. ^_^

It's horrible code, isn't it

? Like I said, just for fun. But I like this line better, then:

Code:

U8 log2n(U8 n1)
{
    return ((n & (n-1)) == 0) * (((n-1) * 134480385LL) & 4581298449LL) % 15 % 9;
}

The (((x-1) * 134480385LL) & 4581298449LL) % 15 I found on the web, it's a way of counting the number of bits set in the least-significant 9 bits. I just didn't realize it needed 64 bits. I wrote the numbers in decimal just to be a tad more confusing; translate it to octal or hex and they give you a bit of a clue on how it works.
So, if n is a power of two, that is (n & (n-1)) equals 0 and n is not 0, n-1 is the logarithm. Then there's the last case for n=0, which will be accepted as a power of 2, unfortunately, so the function will return the number of bits in the least significant 9 bits of -1, which will be 9. So, the last % 9 will hack it to be 0.

And yes, if you must know, making it as hackish as possible was my goal

. Although I still think there should be a faster solution than the % 9.

**iMalc** · 12-02-2008

Originally Posted by EVOEx

Although I still think there should be a faster solution than the % 9.

Yes. Mod means division, and division means slow. Even the loop shifting by one bit each time would probably be faster than performing a single modulus or division.
The actual fast ways are mentioned on the page I linked to.

**phantomotap** · 12-02-2008

The actual fast ways are mentioned on the page I linked to.

To be fair to the OP, he actually came up with the fastest way for what he wants on his own.

Although I still think there should be a faster solution than the % 9.

Considering that we are only interested in the first three bits "& 7" will be faster.

Soma

**CornedBee** · 12-02-2008

Don't ask me why.

Because C and C++ use completely independent parsers in GCC, and the C++ guys simply didn't care enough to implement parsing designators.

**iMalc** · 12-03-2008

Originally Posted by phantomotap

To be fair to the OP, he actually came up with the fastest way for what he wants on his own.

Perhaps, I wasn't doubting that a table lookup was likely the fastest. It depends how likely it is that such a table would push something else out of the cache; probably not very likely I imagine.

I noticed a bug in one one of the routines on the page I linked, for when you know that the number is a power of two.

Oh and R.Stiltskin, by looking at your table you do appear to know already that the number is a power of two, and this is only for a byte, not an int, in which case unrolling one of the methods on the linked site and combining the resulting statements gives:

Code:

return (((n & 0xF0) != 0) << 2) | (((n & 0xCC) != 0) << 1) | ((n & 0xAA) != 0);

You might find that's pretty close to the speed of a table version, and in any case compared to the original code using logs, either method is lightning fast.

Thread: quick log lookup table

Thread Tools

Search Thread

Display

quick log lookup table

Similar Threads

quick question about log

search file for values in lookup table

Lookup table with different spacing between entries

Writing array, to file

Lookup Table Reference???