Originally Posted by CornedBee
Have you checked in the assembly that all functions actually get inlined?
Code:
$L1093:
; 65 : for(n = 0; n < 60; n++)
; 66 : for(i = 0; i < 10000; i++)
; 67 : a(j, n);
mov eax, DWORD PTR _j$[esp+32]
cdq
xor esi, esi
mov ebp, eax
mov DWORD PTR $T1180[esp+36], edx
$L1096:
mov eax, 1
xor edx, edx
mov ecx, esi
call __allshl
mov edi, eax
mov eax, DWORD PTR $T1180[esp+36]
mov ebx, edx
and edi, ebp
and ebx, eax
mov DWORD PTR tv473[esp+32], 10000 ; 00002710H
$L1099:
mov ecx, edi
or ecx, ebx
je SHORT $L1100
lea ecx, DWORD PTR [esi+1]
mov eax, 1
xor edx, edx
call __allshl
mov ecx, DWORD PTR $T1180[esp+36]
and eax, ebp
and edx, ecx
or eax, edx
jne SHORT $L1100
inc DWORD PTR ?c1@@3HA ; c1
$L1100:
dec DWORD PTR tv473[esp+32]
jne SHORT $L1099
inc esi
cmp esi, 60 ; 0000003cH
jl SHORT $L1096
mov eax, DWORD PTR _j$[esp+32]
inc eax
cmp eax, 1000 ; 000003e8H
mov DWORD PTR _j$[esp+32], eax
jl SHORT $L1093
; 68 : report("a");
push OFFSET FLAT:??_C@_01MCMALHOG@a?$AA@
call ?report@@YAXPAD@Z ; report
add esp, 4
This is the code for "calling" a(), as you can see from the included C comments; there is no call to a() left, so a() itself has been inlined. As discussed before, the long shift is not inlined: it is still a call to __allshl. If you know of a way to make that happen, let me know. [One problem with "inlining" the 64-bit shift is that it's actually an external assembler function, so for some reason the compiler doesn't natively know how to do a 64-bit shift.] And before anyone says so: yes, I've tried "Max Optimization" as well as "Optimize for speed", which are the two options most likely to make it work.
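For readers who don't want to trace the registers, here is a rough C reading of what the inner loop in the listing appears to do. It is reconstructed from the assembly only, not taken from the original source: the name a_sketch is hypothetical, and the types are inferred (j is sign-extended with cdq, and c1 is incremented as a 32-bit value).

```c
#include <stdint.h>

int c1 = 0;  /* counter; corresponds to the c1 symbol in the listing */

/* Hypothetical reconstruction of the inlined body of a(j, n):
   count the cases where bit n of j is set and bit n+1 is clear. */
void a_sketch(int j, int n)
{
    /* cdq sign-extends j into a 64-bit register pair */
    uint64_t wide = (uint64_t)(int64_t)j;

    /* each 64-bit shift below is what becomes a call to __allshl */
    if ((wide & ((uint64_t)1 << n)) != 0 &&
        (wide & ((uint64_t)1 << (n + 1))) == 0)
        c1++;
}
```

Note that in the listing the first shift is hoisted out of the 10000-iteration loop, but the second one (for bit n+1) is still called on every iteration where the first test passes.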
I also tried to write my own inline version of "shift64", but that was slower, because the parameters passed to the inline function get stored in a temporary memory location and then read back from that location in the assembly code. Short of writing the WHOLE function in assembler, there's nothing you can do about that. I've tried before!
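A possible alternative that stays in plain C rather than inline assembler: since only a single bit is being tested, the 64-bit value can be treated as two 32-bit halves so that only 32-bit shifts are needed. This is a minimal sketch under that assumption; the names bit_set_split and bit_set_int are hypothetical, and whether it actually beats the __allshl call on this compiler has not been measured.

```c
#include <stdint.h>

/* Hypothetical helper: test bit n (0..63) of a 64-bit value that is
   held as two 32-bit halves, using only 32-bit shifts. */
int bit_set_split(uint32_t lo, uint32_t hi, unsigned n)
{
    if (n < 32)
        return (int)((lo >> n) & 1u);
    return (int)((hi >> (n - 32)) & 1u);
}

/* Hypothetical wrapper for an int argument: the high half is just the
   sign extension, matching what cdq produces. */
int bit_set_int(int j, unsigned n)
{
    uint32_t lo = (uint32_t)j;
    uint32_t hi = (j < 0) ? 0xFFFFFFFFu : 0u;
    return bit_set_split(lo, hi, n);
}
```

The branch on n is loop-invariant in the inner loop, so an optimizer has a chance of hoisting it, but that is something to check in the generated assembly rather than assume.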
[If the functions are NOT inlined, the time is roughly 8 seconds per function, and the calling overhead is much larger than the benefit/loss of any particular method of checking the bits.]
--
Mats