try -fno-elide-constructors. (I'm using VC++ 2008, btw)
.. And what you really want is full optimization and no -fno-elide-constructors
Going to bed. bye.
Originally Posted by brewbuck:
Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.
Yeah that did it.
I don't think -O0 would eliminate dead code, though, so the problem may not be that.
Heh, so I had this clever (read: really stupid) idea to have the function "optimize itself away" after its first invocation:
Code:
    void do_nothing( void )
    {
        asm (
            /* setup */
            "push %ecx; \n"
            "xor %eax, %eax; \n"
            /* call will push the instruction pointer onto the stack */
            "call get_eip; \n"
            "get_eip: ;\n"
            /* %ecx will now contain the instruction pointer */
            "pop %ecx; \n"
            /* look for function preamble signature */
            "locate_push_ebp_signature: \n"
            "dec %ecx; \n"
            "movb (%ecx), %al; \n"
            "cmpb $0x55, %al; \n"
            "jnz locate_push_ebp_signature; \n"
            /* replace first byte of preamble with ret */
            "movb $0xc3, (%ecx); \n"
            /* cleanup */
            "pop %ecx; \n"
        );
    }
Of course, I forgot that the code section will be read-only.
I know how to fix that under Windows, but I just want to confirm that, barring that issue, it would have worked, correct?
Ok, well I've put together a sample project that should work fine. Here's how to set everything up:
- Compile suppress_nrvo.c with a plain C compiler. This should produce a much smaller object file than a C++ compiler would. Example:
gcc -c -osuppress_nrvo.o suppress_nrvo.c
- Disassemble the object file. Example:
objdump -D suppress_nrvo.o
- Note the raw byte sequences that make up the function.
- Open the object file in a hex editor. Locate the raw byte sequence and change its first byte to 0xc3 (RET). Save changes.
- Disassemble the object file again. Verify that the first instruction of the function is a RET.
- Now just include suppress_nrvo.hpp in your projects, making sure of course to link with suppress_nrvo.o
I've included a small, prepatched object file in the ZIP archive that should be compatible with most compilers, in case you don't feel like making one yourself. You can verify its functionality with a disassembler, of course.
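For anyone who would rather script the hex-editor step, here is a rough sketch of the same patch done on an in-memory copy of the object file. The helper name patch_to_ret is made up for illustration, and the signature you search for is whatever bytes your own disassembly shows; 55 89 e5 (push %ebp; mov %esp,%ebp) is only the typical 32-bit x86 prologue, not something taken from the actual suppress_nrvo.o.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the posted project): finds the first
// occurrence of `signature` in `image` and overwrites the first byte of the
// match with 0xC3, the x86 RET opcode.
// Returns the offset that was patched, or -1 if the signature was not found.
std::ptrdiff_t patch_to_ret(std::vector<std::uint8_t>& image,
                            const std::vector<std::uint8_t>& signature)
{
    if (signature.empty()) {
        return -1;  // nothing sensible to search for
    }
    auto it = std::search(image.begin(), image.end(),
                          signature.begin(), signature.end());
    if (it == image.end()) {
        return -1;  // signature not present, leave the image untouched
    }
    *it = 0xC3;     // function now returns immediately
    return it - image.begin();
}
```

You would read the object file into the vector, call the helper with the bytes noted from the disassembly, and write the vector back out. Only the first match is patched, so use a signature long enough to be unique, and still disassemble the result to confirm that the first instruction is a RET.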
If you just want a really cheap function, it looks like this:
Code:
    __declspec(naked) void cheap() { __asm ret; }
But I would advise against that. May I suggest this little combination:
Code:
    int g_suppressNRVO__ = 0; // Will never be set, but be sure to hide that fact from the compiler.
    #define SuppressNRVO(T) if(g_suppressNRVO__) return T()

    Object func()
    {
        Object obj;
        SuppressNRVO(Object);
        OtherStuff();
        return obj;
    }
This requires that all types you use with it have a default constructor. You could adapt the macro to not require this, of course. The point is that the compiler cannot apply NRVO if there are two return statements returning different objects, and it cannot optimize the if away.
All the buzzt!
CornedBee
"There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
- Flon's Law
Exceptions seem to disable RVO too, but that's nowhere near cheap. But at least it's a clean, standard C++ solution.
check
Named Return Value Optimization in Visual C++ 2005
I think that still just falls into the "will probably work" category, e.g. some future compiler implementation could still possibly apply NRVO. On the other hand, if you look closely at CornedBee's solution, it's obvious that it will *always* work, since the function can return more than one possible object. It's also 100% portable, of course, making it an ideal workaround.
Beautiful stuff CornedBee. The cost of a condition check and very easy to port. Can't be any better than that.
Code:
    #ifndef DONT_SUPRESS_NRVO
    int g_suppressNRVO__ = 0;
    #define SuppressNRVO(T) if(g_suppressNRVO__) return T()
    #else
    #define SuppressNRVO(T)
    #endif

    #include <iostream>

    class foo
    {
    public:
        foo() {}
        foo(const foo&) { std::cout << "Success!\n"; }
    };

    foo NRVO()
    {
        foo bar;
        return bar;
    }

    foo NOT_NRVO()
    {
        foo bar;
        SuppressNRVO(foo);
        return bar;
    }

    int main()
    {
        std::cout << "Fire with NRVO suppressed\n";
        foo one = NOT_NRVO();
    }
Results in (with full optimization):
    Fire with NRVO suppressed
    Success!
Knew I could count on you guys. Thanks everyone.
On a side note, kudos to gcc, which chooses to apply NRVO even without any optimizations enabled, making its behavior more consistent with its nature.
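If you want to check what your own compiler and flags do, counting copy-constructor calls is less fiddly than reading the printed output. This is only a sketch: Counted, with_nrvo, without_nrvo and copies_made_by are names invented here, and making the flag volatile is my own assumption about one way to "hide it from the compiler", not something posted in the thread.

```cpp
// Flag is never set. volatile is an assumption made for this sketch so the
// compiler has to read it at run time and cannot fold the branch away.
volatile int g_suppress_nrvo = 0;

// Hypothetical test type: counts how often its copy constructor runs.
// Declaring the copy constructor suppresses the implicit move constructor,
// so a non-elided `return c;` has to go through the copy constructor.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// Single return of a named local: the compiler is allowed to apply NRVO here.
Counted with_nrvo() {
    Counted c;
    return c;
}

// Same trick as the SuppressNRVO macro: a second return statement that
// returns a different object while `c` is still alive.
Counted without_nrvo() {
    Counted c;
    if (g_suppress_nrvo) return Counted();
    return c;
}

// Resets the counter, calls the factory once, and reports how many copies
// were made while producing the result.
int copies_made_by(Counted (*factory)()) {
    Counted::copies = 0;
    Counted result = factory();
    (void)result;  // silence unused-variable warnings
    return Counted::copies;
}
```

Calling copies_made_by(without_nrvo) should report at least one copy. What copies_made_by(with_nrvo) reports depends on the compiler and flags: zero when NRVO is applied, more with something like -fno-elide-constructors or a compiler that only elides when optimizing.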