A debugger-disassembler is also a great tool for studying correct programs, when you want to know why working code is performing poorly. I had this code:
Code:
struct foo {
    int something;
    char the_buffer[1<<18];              /* 256 KB buffer */
    unsigned int a_index_for_the_buffer; /* lands 262148 bytes into the struct */
};
My program was working fine, but there was a 'bad thing' in it: the buffer index was declared after a big buffer, and the generated assembly for the buffer accesses suffered for it. The EBP register was used as the struct pointer, and the index sat 256k past EBP. So every instruction for indexing the buffer looked like: MOV [EBP+262148],EAX, with the literal 32-bit number 262148 embedded in the machine code of the instruction. Now, the 686 family of processors (like x86 in general) has a short encoding for [EBP+offset] when the offset fits in a signed byte (-128 to +127), which stores the offset in one byte instead of four.
If I hadn't used a disassembler to view my generated code, that is an optimization I would have missed. I moved the buffer to the end of the struct, and the code ran 20-40% faster.
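
For anyone who wants to check their own structs without opening a disassembler first, here is a rough sketch of the same comparison using offsetof. The names foo_before, foo_after, fits_disp8 and report_layouts are made up for illustration, and the numbers assume a 4-byte int, as on the usual 32-bit x86 compilers. It only shows where the index field lands in each layout; whether the compiler really addresses it through EBP is something you still have to confirm in the disassembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Layout from the post: the index sits after the 256 KB buffer. */
struct foo_before {
    int something;
    char the_buffer[1 << 18];
    unsigned int a_index_for_the_buffer;
};

/* Reordered layout (hypothetical name): small fields first, buffer last. */
struct foo_after {
    int something;
    unsigned int a_index_for_the_buffer;
    char the_buffer[1 << 18];
};

/* Nonzero if a field offset fits in a signed 8-bit displacement.
   Field offsets are never negative, so only the +127 limit matters. */
int fits_disp8(size_t offset)
{
    return offset <= 127;
}

/* Print where the index field lands in each layout. */
void report_layouts(void)
{
    size_t before = offsetof(struct foo_before, a_index_for_the_buffer);
    size_t after = offsetof(struct foo_after, a_index_for_the_buffer);

    /* Moving the buffer to the end must bring the index closer to the base. */
    assert(after < before);

    printf("before: offset %lu, one-byte displacement: %s\n",
           (unsigned long)before, fits_disp8(before) ? "yes" : "no");
    printf("after:  offset %lu, one-byte displacement: %s\n",
           (unsigned long)after, fits_disp8(after) ? "yes" : "no");
}
```

Call report_layouts() from your main(). The general habit this suggests is to keep the small, frequently used fields at the front of a struct and put the big arrays last, then look at the disassembly to see whether it made a difference.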