Thread: Is virtual calls really slow or it is just bull........?

  1. #16
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    So I did a test using this code:
    Code:
    // Test.cpp : Defines the entry point for the console application.
    //
    
    #include "stdafx.h"
    #include <windows.h>
    #include <iostream>
    
    void Test();
    
    namespace Constants
    {
    	const int Kilo = 1000;
    	const int Mega = Kilo * 1000;
    }
    
    int _tmain(int argc, _TCHAR* argv[])
    {
    	HANDLE h = OpenThread( THREAD_ALL_ACCESS, FALSE, GetCurrentThreadId() );
    	if (h == NULL)
    	{
    		std::cout << "Failed to open thread!\n";
    		return 1;
    	}
    	if (! SetThreadPriority(h, THREAD_PRIORITY_TIME_CRITICAL) )
    	{
    		std::cout << "Failed to set thread priority!\n";
    		return 1;
    	}
    	CloseHandle(h);
    
    	h = GetCurrentProcess(); //OpenProcess( GetCurrentProcessId(), FALSE, PROCESS_ALL_ACCESS );
    	if (h == NULL)
    	{
    		std::cout << "Failed to open process!\n";
    		return 1;
    	}
    	if (! SetPriorityClass(h, REALTIME_PRIORITY_CLASS) )
    	{
    		std::cout << "Failed to set process priority!\n";
    		return 1;
    	}
    	CloseHandle(h);
    
    	void (*pTest)() = &Test;
    	DWORD dwStart = GetTickCount();
    	for (int i = 0; i < 1000 * Constants::Mega; i++)
    		pTest();
    	std::cout << "Took " << GetTickCount() - dwStart << " ms.\n";
    
    	dwStart = GetTickCount();
    	for (int i = 0; i < 1000 * Constants::Mega; i++)
    		Test();
    	std::cout << "Took " << GetTickCount() - dwStart << " ms.\n";
    
    	return 0;
    }
    Test is just an empty function defined in another source file to prevent the compiler from optimizing away the function call.
    I set the process and thread priority to time critical to avoid as much outside interference as possible.
    I ran the test a total of 37 times, and these are the results I got:

    Function pointer (ms)
    3292
    3307
    3307
    3276
    3307
    3307
    3308
    3292
    3307
    3323
    3307
    3292
    3323
    3323
    3339
    3291
    3323
    3291
    3323
    3323
    3323
    3291
    3323
    3323
    3323
    3307
    3292
    3307
    3308
    3292
    3323
    3307
    3307
    3322
    3338
    3307
    3308

    Direct call (ms)
    3307
    3292
    3308
    3276
    3323
    3276
    3307
    3276
    3308
    3291
    3292
    3276
    3307
    3276
    3307
    3276
    3291
    3276
    3307
    3276
    3307
    3292
    3308
    3276
    3307
    3276
    3291
    3276
    3292
    3276
    3276
    3308
    3276
    3292
    3276
    3307
    3276

    Mean (ms)
    Function pointer: 3310
    Direct call: 3291

    Standard deviation (ms)
    Function pointer: 14.526
    Direct call: 14.749

    Standard uncertainty of the mean (ms), s/√n with n = 37
    Function pointer: 2.388
    Direct call: 2.425
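
    For reference, here is a minimal sketch of how these three figures are conventionally computed: the mean, the sample standard deviation s (dividing by n - 1), and the standard uncertainty of the mean, s / sqrt(n). The names Stats and Summarize are made up for this illustration and are not part of the test program.

```cpp
#include <cmath>
#include <vector>

struct Stats {
    double mean;
    double stddev;     // sample standard deviation (n - 1 in the denominator)
    double std_error;  // standard uncertainty of the mean: stddev / sqrt(n)
};

// Hypothetical helper; assumes samples.size() >= 2.
Stats Summarize(const std::vector<double>& samples)
{
    const double n = static_cast<double>(samples.size());

    double sum = 0.0;
    for (double x : samples) sum += x;
    const double mean = sum / n;

    double squares = 0.0;
    for (double x : samples) squares += (x - mean) * (x - mean);
    const double stddev = std::sqrt(squares / (n - 1.0));

    return Stats{mean, stddev, stddev / std::sqrt(n)};
}
```

    Feeding it the function pointer timings listed above should give a mean of about 3310 ms and a standard deviation of about 14.5 ms.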

    Conclusion: the performance impact is negligible. There does seem to be a small cost to using the function pointer, although more conclusive tests would be needed to say for sure.
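
    One caveat about the timing itself: GetTickCount typically only advances in steps of about 10 to 16 ms, which is about the same size as the difference between the two means. A finer-grained clock would help. Below is a minimal, portable sketch using std::chrono::steady_clock. TimeCalls and g_calls are hypothetical names, and Test here is a stand-in with a volatile counter rather than the empty function in a separate source file.

```cpp
#include <chrono>
#include <cstdint>

// Stand-in for Test(); the volatile counter is an assumption of this sketch
// and keeps the call from being optimized away entirely.
volatile std::uint64_t g_calls = 0;
void Test() { g_calls = g_calls + 1; }

// Times `iterations` calls made through `fn` and returns elapsed nanoseconds.
// steady_clock is monotonic, so it is suitable for measuring intervals.
std::int64_t TimeCalls(void (*fn)(), std::uint64_t iterations)
{
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i)
        fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

    On Windows, QueryPerformanceCounter is the usual high-resolution alternative if you want to stay with the Win32 API.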
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  2. #17
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by abachler View Post
    Utter horse manure. A pointer to a function has no run-time overhead, that's the whole point of using them.
    Your first sentence here applies to your second.
    Right 98% of the time, and don't care about the other 3%.

    If I seem grumpy or unhelpful in reply to you, or tell you you need to demonstrate more effort before you can expect help, it is likely you deserve it. Suck it up, Buttercup, and read this, this, and this before posting again.

  3. #18
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by Elysia View Post
    So I did a test using this code:

    Test is just an empty function defined in another source file to prevent the compiler from optimizing away the function call.
    I set the process and thread priority to time critical to avoid as much outside interference as possible.
    I ran the test a total of 37 times, and these are the results I got:


    Conclusion: the performance impact is negligible. There does seem to be a small cost to using the function pointer, although more conclusive tests would be needed to say for sure.
    Quote Originally Posted by Visual Studio
    1>------ Build started: Project: OpenMP Sandbox, Configuration: No Debug Info Win32 ------
    1>Compiling...
    1>main.cpp
    1>Linking...
    1>LINK : warning LNK4224: /OPT:NOWIN98 is no longer supported; ignored
    1>main.obj : error LNK2001: unresolved external symbol "void __cdecl Test(void)" (?Test@@YAXXZ)
    1>E:\Projects\Double Vision Recorder\No Debug Info\OpenMP Sandbox.exe : fatal error LNK1120: 1 unresolved externals
    1>Build log was saved at "file://e:\Projects\Double Vision Recorder\OpenMP Sandbox\No Debug Info\BuildLog.htm"
    1>OpenMP Sandbox - 2 error(s), 1 warning(s)
    ========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
    Odd that it wouldn't even compile. Ah, I see. Well, after I actually defined Test, it failed to open the thread, so perhaps the code is a bit buggy. Change that monstrosity at the start to this:
    Code:
    	if (! SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL) ){
    		std::cout << "Failed to set thread priority!\n";
    		return 1;
    		}
    Here, I cleaned it up and, among other things, removed the part that counts the allocation of the local variable as a penalty against the pointer routine. Probably negligible, but it's bad form.
    Code:
    #include <windows.h>
    #include <iostream>
    
    void Test(){
    	
    	return;
    	}
    
    namespace Constants
    {
    	const int Kilo = 1000;
    	const int Mega = Kilo * 1000;
    }
    
    int main(int argc, char* argv[]){
    	if (! SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL) ){
    		std::cout << "Failed to set thread priority!\n";
    		return 1;
    		}
    	if (! SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS) ){
    		std::cout << "Failed to set process priority!\n";
    		return 1;
    		}
    
    	void (*pTest)() = &Test;
    	DWORD dwStartpFunc , dwStartFunc;
    	DWORD dwStoppFunc , dwStopFunc;
    	dwStartpFunc = GetTickCount();
    	for (int i = 0; i < 1000 * Constants::Mega; i++) pTest();
    	dwStoppFunc = GetTickCount();
    	std::cout << "Took " << dwStoppFunc - dwStartpFunc << " ms.\n";
    
    	dwStartFunc = GetTickCount();
    	for (int i = 0; i < 1000 * Constants::Mega; i++) Test();
    	dwStopFunc = GetTickCount();
    	std::cout << "Took " << dwStopFunc - dwStartFunc << " ms.\n";
    
    	return 0;
    	}
    And apparently I can't get VS2008 to stop optimizing the function call; either that, or it claims my computer is performing one billion increments in less than 15 ms.

    OK, the problem came down to forcing a rebuild. And the results I got were below the timer resolution.

    Here is the final code -

    Test.cpp
    Code:
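    // Lives in its own source file so the compiler cannot inline the call
    // into the timing loops in main.cpp (unless link-time code generation is on).
    void Test(unsigned long* Junk){
    	
    	*Junk+=2;	// cheap side effect so the body is not empty
    	
    	return;
    	}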
    main.cpp
    Code:
    #include <windows.h>
    #include <iostream>
    
    extern void Test(unsigned long*);
    unsigned long Junk = 0;
    unsigned long Count = 1000000000;
    
    namespace Constants
    {
    	const int Kilo = 1000;
    	const int Mega = Kilo * 1000;
    }
    
    int main(int argc, char* argv[]){
    	if (! SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL) ){
    		std::cout << "Failed to set thread priority!\n";
    		return 1;
    		}
    	if (! SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS) ){
    		std::cout << "Failed to set process priority!\n";
    		return 1;
    		}
    	
    	for(int y = 0;y<10;y++){
    	// First make sure the code for Test() is in the cache, so the pointer routine doesn't get penalized
    		for(int x = 0;x<1000;x++) Test(&Junk);
    
    		void (*pTest)(unsigned long*) = &Test;
    		DWORD dwStartpFunc , dwStartFunc;
    		DWORD dwStoppFunc , dwStopFunc;
    			
    		dwStartpFunc = GetTickCount();
    		for (unsigned long i = 0; i < Count; i++) pTest(&Junk);
    		dwStoppFunc = GetTickCount();
    		std::cout << "pFunc Took " << dwStoppFunc - dwStartpFunc << " ms.\n";
    
    		dwStartFunc = GetTickCount();
    		for (unsigned long i = 0; i < Count; i++) Test(&Junk);
    		dwStopFunc = GetTickCount();
    		std::cout << "Func Took " << dwStopFunc - dwStartFunc << " ms.\n";
    
    		std::cout << "\n";
    		}
    
    	return 0;
    	}
    Results indicate that there is zero, or very little, difference in speed between the two methods.
    Last edited by abachler; 09-26-2009 at 09:41 AM.

  4. #19
    The larch
    Join Date
    May 2006
    Posts
    3,573
    It does compile. It doesn't link. What you need to make it link is described in the section you quoted.

    Not having a good day?
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  5. #20
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by abachler View Post
    Odd that it wouldn't even compile. Ah, I see. Well, after I actually defined Test...
    Did you miss where I specifically mentioned that Test was an empty method in another source file?
    Quote Originally Posted by Elysia View Post
    Test is just an empty function defined in another source file to prevent the compiler from optimizing away the function call.

    Quote Originally Posted by abachler View Post
    ...it failed to open the thread, so perhaps the code is a bit buggy.
    Perhaps. I am not an expert on threads and process priority. I never really change them. I can only say that it worked for me, and it was really just a minimal test to run the experiment.

    Quote Originally Posted by abachler View Post
    Here, I cleaned it up and, among other things, removed the part that counts the allocation of the local variable as a penalty against the pointer routine. Probably negligible, but it's bad form.
    What? You removed or added a local variable? Bad form? If it is bad form, then why did you do it?

    And apparently I can't get VS2008 to stop optimizing the function call; either that, or it claims my computer is performing one billion increments in less than 15 ms.
    If you disable the link-time code generation, it will not optimize the function call of a function in another source file. That is what I did.
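
    If you would rather keep everything in one source file, another option is to ask the compiler not to inline the function. A rough sketch, assuming MSVC's __declspec(noinline) or the GCC/Clang noinline attribute; the NOINLINE macro, g_counter and CallMany are names invented for this example. This only blocks inlining, so it is still worth checking the generated assembly to confirm the call is there.

```cpp
// Pick the compiler-specific "do not inline" spelling.
#if defined(_MSC_VER)
#  define NOINLINE __declspec(noinline)
#else
#  define NOINLINE __attribute__((noinline))
#endif

// volatile, so the increment itself cannot be removed by the optimizer.
volatile unsigned long g_counter = 0;

NOINLINE void Test()
{
    g_counter = g_counter + 1;
}

// Calls Test() n times; with NOINLINE each iteration should remain a real call.
void CallMany(unsigned long n)
{
    for (unsigned long i = 0; i < n; ++i)
        Test();
}
```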

    Quote Originally Posted by abachler View Post
    OK, the problem came down to forcing a rebuild. And the results I got were below the timer resolution.
    Fast CPU? Did you try increasing the number of iterations?

    Quote Originally Posted by abachler View Post
    Results indicate that there is zero, or very little, difference in speed between the two methods.
    I am not surprised. Using the pointer causes negligible overhead in 99.9999% of the cases, it would seem.
    However, as I mentioned, I did see that pointer call was slightly slower.

  6. #21
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    I'm willing to take the performance hit of using virtual simply because of the flexibility, maintainability, and extensibility it offers. Those numbers won't change my mind.

  7. #22
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    This is true, and I agree. The study does show that the impact of pointers is negligible. Thus, you should not consider them as overhead - virtual functions will be just as fast as direct calls (the whole point of the topic).

  8. #23
    Registered User
    Join Date
    Apr 2006
    Posts
    2,149
    Quote Originally Posted by abachler View Post
    Highly unlikely it will cause a page fault. It will most certainly cause cache misses, but only the first time it is called; after that the data will be in the cache. What slows it down is the extra memory accesses, which is why it's very bad to use as part of an inner loop. Inner loops are inherently CPU bound if designed properly. Any extra work will slow the system down.

    Now with THAT said, it's probable that in most cases you will never notice the difference.
    Yeah, page faults are rare with the amount of memory computers have these days, but I included that for completeness.

    A cache miss is a memory access. Or rather, it's a memory access that actually slows the CPU. An L1 cache hit will make a memory access almost as fast as a register access.

    Likely, cache misses will be more frequent than once per memory location, because multiple memory locations are mapped to the same cache location, and because a modern multi-process environment switches between applications often.
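
    As a rough illustration of how several addresses end up competing for the same cache location, here is a sketch of the usual set-index calculation. The geometry (64-byte lines, 64 sets) is an assumption picked for the example, and kLineSize, kNumSets and CacheSet are made-up names; real CPUs differ.

```cpp
#include <cstdint>

// Assumed geometry, for illustration only: 64-byte cache lines and 64 sets
// (for example a 32 KiB, 8-way set-associative L1 data cache).
constexpr std::uint64_t kLineSize = 64;
constexpr std::uint64_t kNumSets  = 64;

// Index of the cache set that a given address falls into.
constexpr std::uint64_t CacheSet(std::uint64_t address)
{
    return (address / kLineSize) % kNumSets;
}
```

    With these numbers, any two addresses that are a multiple of 4096 bytes apart land in the same set. Once more such lines are in use than the cache has ways, one of them is evicted and the next access to it misses again.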
    It is too clear and so it is hard to see.
    A dunce once searched for fire with a lighted lantern.
    Had he known what fire was,
    He could have cooked his rice much sooner.

  9. #24
    Registered User
    Join Date
    Apr 2006
    Posts
    2,149
    Quote Originally Posted by Elysia View Post
    This is true, and I agree. The study does show that the impact of pointers is negligible. Thus, you should not consider them as overhead - virtual functions will be just as fast as direct calls (the whole point of the topic).
    That's not what your test showed. Your test was between direct calls and function pointers. Virtual calls require an additional layer of indirection.

    Also, the compiler may actually optimize your function pointers to direct calls, since the pointer is a compile-time constant here.
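
    To make the extra indirection concrete, here is a hand-written model of what a typical compiler does for a virtual call, next to a plain function-pointer call. The actual vtable layout is implementation-defined, so treat VTable, Object and the function names as illustrative assumptions, not as what any particular compiler emits.

```cpp
// Plain function-pointer call: the pointer is already in hand,
// so the call is a single indirect jump through it.
int Twice(int x) { return 2 * x; }

int CallThroughPointer(int (*fn)(int), int x)
{
    return fn(x);
}

// Hand-rolled imitation of a class with one virtual function.
struct Object;
struct VTable {
    int (*twice)(const Object*, int);  // one slot per virtual function
};
struct Object {
    const VTable* vptr;  // each object carries a pointer to its class's table
};

int ObjectTwice(const Object*, int x) { return 2 * x; }
const VTable kObjectVTable = { &ObjectTwice };

int CallThroughVTable(const Object* obj, int x)
{
    // Two dependent loads before the indirect call:
    // first obj->vptr, then the function pointer stored in that table.
    return obj->vptr->twice(obj, x);
}
```

    The function-pointer test measures only the last step, the indirect call. The virtual call adds the load of the table pointer in front of it.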

  10. #25
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by King Mir View Post
    That's not what your test showed. Your test was between direct calls and function pointers. Virtual calls require an additional layer of indirection.
    All in all, the same as virtual function calls.
    Looking at the assembly for virtual function calls, the compiler basically just calls a specific function pointer in the vtable that the object points to. So this is basically the same code.
    Some compilers might do differently, but this is sufficient for me.

    Also, the compiler may actually optimize your function pointers to direct calls, since the pointer is a compile-time constant here.
    Lucky for me, then, that it did not.

  11. #26
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by Elysia View Post
    What? You removed or added a local variable? Bad form? If it is bad form, then why did you do it?
    I guess a better way of saying it is that I moved it, so that it didn't count against the pointer test. I have a bit of experience writing benchmarks, and a big part of it is making sure the test is fair to both pieces of code you are testing. If something that test A does will benefit test B, then you need to either eliminate that benefit or write a pretest that gives test A the same benefit. In this case it was having the function in the cache already.
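
    A sketch of that idea in code, under my own assumptions: run both candidates once untimed as a pretest, then alternate which one goes first on each timed round so neither always inherits a warm cache from the other. Totals, TimeOnce and CompareFairly are hypothetical names, and std::chrono::steady_clock stands in for GetTickCount.

```cpp
#include <chrono>
#include <cstdint>

struct Totals {
    std::int64_t a_ns = 0;  // accumulated time for candidate A
    std::int64_t b_ns = 0;  // accumulated time for candidate B
};

// Times a single invocation of `run`, in nanoseconds.
template <typename F>
std::int64_t TimeOnce(F&& run)
{
    const auto start = std::chrono::steady_clock::now();
    run();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Runs both candidates once untimed, then `rounds` timed rounds,
// swapping the order of A and B on every round.
template <typename A, typename B>
Totals CompareFairly(A&& a, B&& b, int rounds)
{
    a();
    b();  // pretest: both get the same warm-up
    Totals t;
    for (int r = 0; r < rounds; ++r) {
        if (r % 2 == 0) { t.a_ns += TimeOnce(a); t.b_ns += TimeOnce(b); }
        else            { t.b_ns += TimeOnce(b); t.a_ns += TimeOnce(a); }
    }
    return t;
}
```

    Each candidate would be a callable that runs one full timing loop, for example a lambda that calls pTest(&Junk) Count times.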

    If you disable the link-time code generation, it will not optimize the function call of a function in another source file. That is what I did.
    No need for that, I just needed to force a rebuild, as I stated.
    I am not surprised. Using the pointer causes negligible overhead in 99.9999% of the cases, it would seem.
    However, as I mentioned, I did see that pointer call was slightly slower.
    My figures, which included running the test automatically several times, showed zero difference. Average runtime for both pieces of code was ~4 seconds, sometimes the pointer would be faster, sometimes the function would be faster. They averaged exactly the same.

    Quote Originally Posted by Bubba View Post
    I'm willing to take the performance hit of using virtual simply because of the flexibility, maintainability, and extensibility it offers. Those numbers won't change my mind.
    I don't see how virtual adds any of those features, or how not writing the function as virtual removes any of them. In fact I find that over-virtualization tends to turn your classes into spaghetti code. Maybe it's just the use of them I have seen from the specific subset of programmers whose code I have had to maintain. I personally can count on one hand the number of times I've used it in my code over the last 15 years. Most of the classes I write deal with specific pieces of hardware, or specific software models whose functionality is necessarily completely encapsulated within the base class.
    Last edited by abachler; 09-27-2009 at 07:06 AM.

  12. #27
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by abachler View Post
    I guess a better way of saying it is that I moved it, so that it didn't count against the pointer test. I have a bit of experience writing benchmarks, and a big part of it is making sure the test is fair to both pieces of code you are testing. If something that test A does will benefit test B, then you need to either eliminate that benefit or write a pretest that gives test A the same benefit. In this case it was having the function in the cache already.
    Ah, I see. Fair enough, yes.

    My figures, which included running the test automatically several times, showed zero difference. Average runtime for both pieces of code was ~4 seconds, sometimes the pointer would be faster, sometimes the function would be faster. They averaged exactly the same.
    That is what I got before running in time critical mode.

  13. #28
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by Elysia View Post
    That is what I got before running in time critical mode.
    and what I got when running in it.

  14. #29
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    *shrug*
    Then I guess your system just does not want to dedicate resources to the app.

  15. #30
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by Elysia View Post
    *shrug*
    Then I guess your system just does not want to dedicate resources to the app.
    Did you run my version of the benchmark and get different results? Also, what processor are you using? It may be a pipeline issue, where my processor may have a deeper pipeline and therefore preemptively load the pointer during a free memory cycle. Newer and older processors have shallower pipelines, which could affect the test.

    The latest version is:
    Code:
    // Test.cpp : Defines the entry point for the console application.
    //
    
    #include <windows.h>
    #include <iostream>
    
    extern void Test(unsigned long*);
    unsigned long Junk = 0;
    unsigned long Count = 1000000000;
    
    namespace Constants
    {
    	const int Kilo = 1000;
    	const int Mega = Kilo * 1000;
    }
    
    int main(int argc, char* argv[]){
    	if (! SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL) ){
    		std::cout << "Failed to set thread priority!\n";
    		return 1;
    		}
    	if (! SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS) ){
    		std::cout << "Failed to set process priority!\n";
    		return 1;
    		}
    	
    	for(int y = 0;y<10;y++){
    	// First make sure the code for Test() is in the cache, so the pointer routine doesn't get penalized
    		for(int x = 0;x<1000;x++) Test(&Junk);
    
    		void (*pTest)(unsigned long*) = &Test;
    		DWORD dwStartpFunc , dwStartFunc;
    		DWORD dwStoppFunc , dwStopFunc;
    			
    		dwStartpFunc = GetTickCount();
    		for (unsigned long i = 0; i < Count; i++) pTest(&Junk);
    		dwStoppFunc = GetTickCount();
    		std::cout << "pFunc Took " << dwStoppFunc - dwStartpFunc << " ms.\n";
    
    		dwStartFunc = GetTickCount();
    		for (unsigned long i = 0; i < Count; i++) Test(&Junk);
    		dwStopFunc = GetTickCount();
    		std::cout << "Func Took " << dwStopFunc - dwStartFunc << " ms.\n";
    
    		std::cout << "\n";
    		}
    
    	return 0;
    	}

    Last Post: 10-20-2001, 01:26 PM