Branching vs array indexing

**GReaper** · 06-20-2012

I was wondering... if I had many different functions that I wanted to call based on the value of a variable, and that value were an integer in the range [0, N), would the second method be much faster than the first?

Code:

void proc1();
void proc2();
...
void procN();

1st:

Code:

switch (value)
{
    case 0:
        proc1();
    break;
    case 1:
        proc2();
    break;
    ...
    case N-1:
        procN();
    break;
}

2nd:

Code:

typedef void (*procedure)();

procedure procArray[N] = { proc1, proc2, ... , procN };
...
procArray[value]();

In the case of indexing though, what "guessing" (branch prediction) rules would apply, if any?
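For anyone who wants to try both variants, here is a minimal compilable sketch of the two methods above. The three `proc` bodies, the `last_called` variable and the two `dispatch_*` wrappers are assumptions made for illustration; the original post only declares `proc1` to `procN`. One difference worth noticing is that the table version needs its own range check.

```cpp
#include <cstddef>

// Hypothetical stand-ins for proc1..procN; each records which one ran.
static int last_called = -1;
void proc1() { last_called = 1; }
void proc2() { last_called = 2; }
void proc3() { last_called = 3; }

// 1st method: switch on the value.
void dispatch_switch(int value)
{
    switch (value)
    {
        case 0: proc1(); break;
        case 1: proc2(); break;
        case 2: proc3(); break;
        default: break; // out-of-range values are ignored
    }
}

// 2nd method: index into a table of function pointers.
typedef void (*procedure)();
static const procedure procArray[] = { proc1, proc2, proc3 };
static const std::size_t procCount = sizeof procArray / sizeof procArray[0];

void dispatch_table(int value)
{
    // Unlike the switch, the table needs an explicit range check:
    // indexing outside the array is undefined behaviour.
    if (value >= 0 && static_cast<std::size_t>(value) < procCount)
        procArray[value]();
}
```

Both wrappers call the same function for values 0 to 2 and do nothing for anything else, so they can be swapped freely while comparing readability or speed.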

**Elysia** · 06-20-2012

Premature optimization.
The compiler may very well optimize a switch into a function table. Regardless, don't concern yourself about it. Let the compiler do its work, while you keep the code readable.

**whiteflags** · 06-20-2012

In my opinion indexing is more useful and I try to use it when I can. Indexing is an important operation to consider, especially if you want to speed up something nontrivial like random file access. Sometimes I just do it this way for the practice.

Originally Posted by Elysia

The compiler may very well optimize a switch into a function table. Regardless, don't concern yourself about it. Let the compiler do its work, while you keep the code readable.

That really sounds like you made it up. Surprise me.

**nvoigt** · 06-20-2012

Originally Posted by Elysia

Premature optimization.
The compiler may very well optimize a switch into a function table. Regardless, don't concern yourself about it. Let the compiler do its work, while you keep the code readable.

Second that. Even if you optimize it to save 1 ms (a LOT of time in computing) per run, you would need a lot of runs to make up for the one additional working day (8 hours = 28,800,000 ms) your colleague needs to understand your program because you optimized it for speed. Optimizing for readability while keeping it "fast enough" is probably more rewarding.

That said, I think your second approach using a table would indeed be more readable as the number of functions increases. I would not want to read a 2000-line switch statement where I can never be sure that every case is just a single function call. So it's win-win: probably faster, most likely more readable.

**GReaper** · 06-20-2012

Thanks guys. I asked this because I've been building some old PC emulators (like the Kenbak-1), and I thought I could use the opcodes as indices!

It'd be nice if I managed to get it to run at approx 300 IPS (speaking about the Kenbak-1) with less than 5% of my CPU (Core2 Duo 2.4GHz).
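As a rough sketch of what "opcodes as indices" could look like, here is a toy fetch-decode-execute loop. Everything in it is made up for illustration: the `Machine` struct, the four opcodes and the handler names are assumptions and do not reflect the real Kenbak-1 instruction set.

```cpp
#include <cstddef>
#include <cstdint>

// Toy machine state; NOT the real Kenbak-1 register set.
struct Machine
{
    std::uint8_t acc = 0;          // accumulator
    std::uint8_t pc = 0;           // program counter (wraps at 256)
    bool halted = false;
    std::uint8_t memory[256] = {}; // zero-filled, and opcode 0 is HALT
};

typedef void (*OpHandler)(Machine&);

// Hypothetical opcodes: 0 = HALT, 1 = INC, 2 = DEC, 3 = LOAD immediate.
void op_halt(Machine& m) { m.halted = true; }
void op_inc(Machine& m)  { ++m.acc; }
void op_dec(Machine& m)  { --m.acc; }
void op_load(Machine& m) { m.acc = m.memory[m.pc++]; }

static const OpHandler handlers[] = { op_halt, op_inc, op_dec, op_load };
static const std::size_t handlerCount = sizeof handlers / sizeof handlers[0];

// Fetch-decode-execute loop: the opcode itself is the table index.
void run(Machine& m)
{
    while (!m.halted)
    {
        std::uint8_t opcode = m.memory[m.pc++];
        if (opcode < handlerCount)
            handlers[opcode](m);
        else
            m.halted = true; // treat unknown opcodes as a halt
    }
}
```

With a full 256-entry table (one handler per possible byte value, including an "illegal opcode" handler) the range check could be dropped, which is one reason this layout is popular for 8-bit emulators.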

**phantomotap** · 06-20-2012

I offer my vote for the table method.

"Table Driven Development" for the win!

Soma

**Elkvis** · 06-20-2012

I use a function pointer table in one of my big programs, because there are literally hundreds of entries. A switch statement in my case would be insanely huge and unreadable. If C++ had reflection, I might be able to use that instead, but the function pointer table works great for my purposes. It's fast, it's obvious what it does, and it's easy to add elements when necessary.

**Elysia** · 06-20-2012

Originally Posted by whiteflags

That really sounds like you made it up. Surprise me.

If this suffices for you...

Code:

#include <iostream>

void foo0() { std::cout << "I am foo0!\n"; }
void foo1() { std::cout << "I am foo1!\n"; }
void foo2() { std::cout << "I am foo2!\n"; }
void foo3() { std::cout << "I am foo3!\n"; }
void foo4() { std::cout << "I am foo4!\n"; }
void foo5() { std::cout << "I am foo5!\n"; }
void foo6() { std::cout << "I am foo6!\n"; }
void foo7() { std::cout << "I am foo7!\n"; }
void foo8() { std::cout << "I am foo8!\n"; }
void foo9() { std::cout << "I am foo9!\n"; }

int main(void)
{
	int n;
	std::cout << "Enter function to call: ";
	std::cin >> n;

	switch (n)
	{
		case 0: foo0(); break;
		case 1: foo1(); break;
		case 2: foo2(); break;
		case 3: foo3(); break;
		case 4: foo4(); break;
		case 5: foo5(); break;
		case 6: foo6(); break;
		case 7: foo7(); break;
		case 8: foo8(); break;
		case 9: foo9(); break;
	}
	
	return 0;
}

Looking at optimized assembly from VC++, we see:

Code:

switch (n)
003610FF  mov         eax,dword ptr [n]  
00361102  cmp         eax,9  
00361105  ja          $LN1+5 (361152h)  
00361107  jmp         dword ptr  (361159h)[eax*4]  
	{
		case 0: foo0(); break;
0036110E  call        foo0 (36101Bh)  
00361113  jmp         $LN1+5 (361152h)

Essentially, it jumps directly to the right position in the switch, then performs the call. If I enable inlining, it inlines the functions directly into the switch.
Not a direct function table call, but almost. But that's what I meant, if it isn't clear. The compiler can optimize the switch to essentially jump to the right position.
Dunno if a compiler is smart enough to understand that all cases essentially just call a function, though.

**brewbuck** · 06-20-2012

Originally Posted by whiteflags

That really sounds like you made it up. Surprise me.

Seriously? That's literally the textbook way of implementing switch statements.

**whiteflags** · 06-20-2012

Originally Posted by brewbuck

Seriously? That's literally the textbook way of implementing switch statements.

I only pretend to know everything about how it works.

**phantomotap** · 06-20-2012

Originally Posted by brewbuck

That's literally the textbook way of implementing switch statements.

O_o

"That's literally the textbook way of implementing switch statements with cases that follow discernible mathematical logic."

I thought I should fix that before someone runs off and misuses a switch statement.

Soma

**cyberfish** · 06-22-2012

I believe compilers will turn a switch into a jump table for you only if the case values are more or less contiguous. Otherwise the memory wasted on empty table slots would be significant, and if that lowers code density to the point where you get more instruction cache misses, it would definitely not be worth it.
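To see the difference for yourself, two switches like the following can be compiled and their assembly compared. The function names and case values are arbitrary assumptions. What a given compiler actually emits (jump table, value lookup table, or a chain of compares) depends on the compiler, its version and the optimisation level, so treat this as something to experiment with rather than a statement of what will happen.

```cpp
// Case values 0..5 are contiguous: a likely candidate for a table.
int dense(int n)
{
    switch (n)
    {
        case 0: return 11;
        case 1: return 22;
        case 2: return 33;
        case 3: return 44;
        case 4: return 55;
        case 5: return 66;
        default: return -1;
    }
}

// Case values spread over a huge range: a table covering them would be
// mostly empty, so compare-and-branch sequences are more likely.
int sparse(int n)
{
    switch (n)
    {
        case 1:       return 11;
        case 100:     return 22;
        case 10000:   return 33;
        case 1000000: return 44;
        default:      return -1;
    }
}
```

With GCC, `g++ -O2 -S file.cpp` writes the generated assembly to `file.s` for inspection.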

**phantomotap** · 06-22-2012

Originally Posted by cyberfish

I believe compilers will turn a switch into a jump table for you only if the case values are more or less contiguous.

I think "GCC" will happily mix and match as is appropriate.

In other words, if several cases follow easily discernible linear mathematics the compiler may build a table for parts of the `switch' and branch otherwise as appropriate.

Soma

**wildcard_seven** · 06-23-2012

map<"some kind of key", object> .... have the objects inherit from an abstract class which has a public void function like "execute()", then you can insert various objects in there that do different jobs, and fire them all off using the same command after you've looked up what you want (command pattern)

Maybe I just spoke craziness there. Don't know. I'm trying to translate java style OO with interfaces into "c++ style". Either way, I'm quite sure I can count on a good chastising if I am wrong on any count.
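A minimal C++ sketch of that idea might look like the following. The class names (`Command`, `Increment`, `Reset`), the `CommandMap` typedef and the `run_command` helper are all assumptions chosen for illustration, and the code needs C++11 for `std::unique_ptr` and `override`.

```cpp
#include <map>
#include <memory>
#include <string>

// Abstract base class; concrete commands override execute().
class Command
{
public:
    virtual ~Command() {}
    virtual void execute() = 0;
};

// Two hypothetical commands acting on a shared counter.
class Increment : public Command
{
public:
    explicit Increment(int& counter) : counter_(counter) {}
    void execute() override { ++counter_; }
private:
    int& counter_;
};

class Reset : public Command
{
public:
    explicit Reset(int& counter) : counter_(counter) {}
    void execute() override { counter_ = 0; }
private:
    int& counter_;
};

typedef std::map<std::string, std::unique_ptr<Command>> CommandMap;

// Look up a command by key and run it; returns false for unknown keys.
bool run_command(const CommandMap& commands, const std::string& key)
{
    CommandMap::const_iterator it = commands.find(key);
    if (it == commands.end())
        return false;
    it->second->execute();
    return true;
}
```

Commands are registered with, for example, `commands["inc"] = std::unique_ptr<Command>(new Increment(counter));`. A `std::map` lookup costs a logarithmic number of key comparisons plus a virtual call, so it is heavier than indexing an array; it fits best when the keys are not small consecutive integers.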

**smokeyangel** · 06-24-2012

I had a look at some old (2005ish) articles on GCC's behaviour (http://ivoras.sharanet.org/papers/switch-complexity.pdf, Branch Patterns, Using GCC - CellPerformance) -- the heuristic is (was?) pretty basic -- switch tables with a large range of case values relative to the number of cases will revert to normal compare and branch. I prodded GCC a bit and I'm pretty sure the exact heuristic given on those pages is no longer correct (look it up if you're interested).

Originally Posted by phantomotap

I think "GCC" will happily mix and match as is appropriate.

In other words, if several cases follow easily discernible linear mathematics the compiler may build a table for parts of the `switch' and branch otherwise as appropriate.

From some testing, I don't think GCC does mix and match, but I think MSVC does. I may be wrong though; I'm not great at reading x86 disassembly.

On the original question... I completely agree with what's already been said. Use whatever is appropriate and most readable/maintainable for the task at hand. For what you said (building an emulator) I'd probably opt for the table approach.

I don't know if it'll be any faster than the optimised-switch code. You've saved a branch, but gained array index and load of the function address. I'd guess it'll work out about the same.
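Rather than guessing, the two variants can be timed. Below is a rough timing harness, offered as a sketch only: the four `add` functions, the `via_switch`/`via_table` wrappers and `time_dispatch` are hypothetical names, and the pseudo-random index is there just to keep the branch pattern from being trivially predictable. An optimiser may inline or otherwise transform either variant, so the numbers are only a rough indication for one compiler and one machine.

```cpp
#include <chrono>
#include <cstddef>

static unsigned long counter = 0;
void add1() { counter += 1; }
void add2() { counter += 2; }
void add3() { counter += 3; }
void add4() { counter += 4; }

// Variant 1: switch-based dispatch.
void via_switch(unsigned v)
{
    switch (v)
    {
        case 0: add1(); break;
        case 1: add2(); break;
        case 2: add3(); break;
        case 3: add4(); break;
    }
}

// Variant 2: table-based dispatch; the mask keeps the index in range.
typedef void (*procedure)();
static const procedure table[4] = { add1, add2, add3, add4 };
void via_table(unsigned v) { table[v & 3u](); }

// Calls `dispatch` `iterations` times with a cheap pseudo-random index
// in [0, 4) and returns the elapsed time in microseconds.
template <typename Dispatch>
long long time_dispatch(Dispatch dispatch, std::size_t iterations)
{
    unsigned state = 12345u;
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
    {
        state = state * 1664525u + 1013904223u; // linear congruential step
        dispatch((state >> 16) & 3u);
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `time_dispatch(via_switch, n)` and `time_dispatch(via_table, n)` with a large `n` gives two figures to compare; both runs use the same index sequence, so `counter` should advance by the same amount in each.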

I think "optimised" C/C++ can be useful and has its place. The programmer will know things about the application that the compiler can't know or couldn't be expected to figure out, so the programmer can write their code in a way that expresses their intent explicitly or implicitly to the compiler. I think treading on the compiler's standard optimisation turf is less likely to be useful, as you don't have total control of what is output.

Originally Posted by wildcard_seven

map<"some kind of key", object> .... have the objects inherit from an abstract class which has a public void function like "execute()", then you can insert various objects in there that do different jobs, and fire them all off using the same command after you've looked up what you want (command pattern)

Maybe I just spoke craziness there. Don't know. I'm trying to translate java style OO with interfaces into "c++ style". Either way, I'm quite sure I can count on a good chastising if I am wrong on any count.

lol, no chastising from me.... much too OO for me to comment. Didn't sound crazy until I looked up the command pattern -- looks, err, rather heavyweight!
