Thread: Execution speed slow initially, then speeds up

  1. #1
    Registered User
    Join Date
    Mar 2011
    Posts
    596

    Execution speed slow initially, then speeds up

    I have a relatively slow loop in my program, lots of calculations.

    The loop takes about 5 seconds when the program first starts, but only takes about 3 seconds
    when executed later in the program.

    Is the code being moved into cache memory or something, after it first runs?

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > Is the code being moved into cache memory or something, after it first runs?
    Can you post a small complete program which also demonstrates the same behaviour?

    Caching is certainly one plausible explanation.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Mar 2011
    Posts
    596
    Probably a good idea to write something smaller and see if it behaves the same way.
    It doesn't seem to be a bug, but it would be good to make sure.

    I'll attach the smaller code if it acts the same.

  4. #4
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    First off I have a few questions.

    Are you using QueryPerformanceCounter on a SouthBridge chipset and do you have heavy PCI bus activity?
    Are you using QueryPerformanceCounter on an AMD dual core machine?

    If you answered yes to any of these then that could be your problem. Dual cores had a problem with QueryPerformanceCounter where the system could not compute the right values between the two cores. There is an AMD dual core optimizer that fixes this and there are other workarounds - like not using QueryPerformanceCounter() or coupling it with GetTickCount(). There is another hardware-based defect in the SouthBridge chipset that causes QueryPerformanceCounter() to jump ahead during periods of high PCI bus activity. This will causes your code to appear to speed up if it us using the time delta as a basis for integration over time.

  5. #5
    Registered User
    Join Date
    Mar 2011
    Posts
    596
    Not using QueryPerformanceCounter.

    Program is running on a Pentium 4, not sure of the details.

    It's slow enough that I can count off the seconds while watching it.
    I added GetTickCount() to the program, and calculated time agrees with observed time.

    I can't get a simple program to duplicate the effect, though. (program in question is pretty big)
    Still working on it.

    (just added)
    Something I noticed. I have two long loops. One always runs about 5 sec at first, and then
    3 sec thereafter. The other always runs 11 secs at first and then 9 secs thereafter. It's only
    the first loop to run that is slower. Either loop runs faster after that first one, whichever it is.

    So it's not scaling with the loop length. It's like a fixed 2 second improvement.
    Last edited by megafiddle; 11-15-2011 at 08:23 PM.

  6. #6
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    How long must you wait between runs for the speed to revert to the slower speed?

    It could be something to do with variable clock rate (SpeedStep). There is a certain lag between the program execution and the adjustment of the clock rate.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  7. #7
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    If any parts of your loop involve both searching a data structure and removing entries from the data structure, then that should be a pretty good reason for it speeding up. In other words, what the actual algorithm is can make all the difference. It's just random shots in the dark without seeing code.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  8. #8
    'Allo, 'Allo, Allo
    Join Date
    Apr 2008
    Posts
    639
    If this is still your fractal program (I think that's what it was) and its millions of globals, don't rule the possibility that it's just a straight up bug.

  9. #9
    Registered User
    Join Date
    Mar 2011
    Posts
    596
    Ok, I got a short version of the loop that exhibits the effect. It's related to where the loop is called from.
    The 1st call is from within WinMain. That is the slower one, by about 2 seconds. When called from within WndProc,
    it executes faster.

    But to answer the questions first:

    Quote Originally Posted by brewbuck View Post
    How long must you wait between runs for the speed to revert to the slower speed?

    It could be something to do with variable clock rate (SpeedStep). There is a certain lag between the program execution and the adjustment of the clock rate.
    Is that from one run of the loop to the next, or from initial run of program to 1st run of the loop?

    Quote Originally Posted by iMalc View Post
    If any parts of your loop involve both searching a data structure and removing entries from the data structure, then that should be a pretty good reason for it speeding up. In other words, what the actual algorithm is can make all the difference. It's just random shots in the dark without seeing code.
    I don't believe I'm doing any of that, but code is below.

    Quote Originally Posted by adeyblue View Post
    If this is still your fractal program (I think that's what it was) and its millions of globals, don't rule the possibility that it's just a straight up bug.
    It is in fact the fractal program. A few million less globals in the code below, too!

    Anyway, it may be machine dependant, but I can't try on another machine until tomorrow. And it may still very well be a bug.
    It should build with Pelles (Win32) and if you run it, the loop will run for the first time immediately after closing the
    sign-on message box. Thereafter, the 'R' key will re-execute the loop. Time is posted after each run.

    Code:
    /***********************************
            execution time test
            11/17/11
    
            keyboard controlled:
            R - generates mu set
            E - exit program
    
    ************************************/
    
    #include <windows.h>
    
    #define IMAGESIZEX 1000
    #define IMAGESIZEY 750
    
    HDC hMemDc;
    HBITMAP hBitmap;
    VOID *pbits;
    RECT imagerect;
    char mssg[25];
    int spectrum[1000];
    
    void updateclient(HWND);
    void clear(HWND);
    void mu_generator(HWND);
    int initialize(HWND);
    
    LRESULT CALLBACK WndProc(HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam);
    
    int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
    {	
    	HWND hWnd;
    	MSG Msg;
    	WNDCLASS wc;
    
    	wc.style = 0;
    	wc.lpfnWndProc = WndProc;
    	wc.cbClsExtra  = 0;
    	wc.cbWndExtra	  = 0;
    	wc.hInstance = hInstance;
    	wc.hIcon = LoadIcon(NULL, IDI_APPLICATION);
    	wc.hCursor = LoadCursor(NULL, IDC_CROSS);
    	wc.hbrBackground = (HBRUSH)GetStockObject(BLACK_BRUSH);
    	wc.lpszMenuName  = NULL;
    	wc.lpszClassName = "Main";
    
    	if(!RegisterClass(&wc))
    		return 0;
    
    	hWnd = CreateWindow("Main", "extest", WS_OVERLAPPEDWINDOW,
    		CW_USEDEFAULT, 0, CW_USEDEFAULT, 0, NULL, NULL, hInstance, NULL);
    
    	if(hWnd == NULL)
    		return 0;
    	
    	ShowWindow(hWnd, nCmdShow);
    
    	if(!initialize(hWnd))
    		return 0;
    
    	MessageBox(hWnd, "keyboard controlled:\n\nR - run loop\nE - exit",
    		"extest.exe", MB_OK);
    
    	ShowWindow(hWnd, SW_MAXIMIZE);
    
    	clear(hWnd);
    	mu_generator(hWnd);
    
    	while(GetMessage(&Msg, NULL, 0, 0) > 0)
    	{	TranslateMessage(&Msg);
    		DispatchMessage(&Msg);
    	}
    
    	return 0;
    }
    
    LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
    {		
    	switch(msg)
    	{	case WM_KEYDOWN:
    			switch(wParam)
    			{	case 0x52:		// R
    				clear(hwnd);
    				mu_generator(hwnd);
    				break;
    
    				case 0x45:		// E
    				PostMessage(hwnd, WM_CLOSE, 0, 0);
    				break;
    			}
    			return 0;
    
    		case WM_PAINT:
    			updateclient(hwnd);
    			return 0;
    
    		case WM_CLOSE:
    			ReleaseDC(hwnd, hMemDc);
    			DestroyWindow(hwnd);
    			return 0;
    			
    		case WM_DESTROY:
    			PostQuitMessage(0);
    			return 0;
    
    		default:
    			return DefWindowProc(hwnd, msg, wParam, lParam);
    	}
    }
    
    void updateclient(HWND hwnd)
    {
    	PAINTSTRUCT ps;
    
    	HDC hDcPaint = BeginPaint(hwnd, &ps);	
    	
    	SelectObject(hMemDc, hBitmap);	
    	BitBlt(hDcPaint, 50, 50, IMAGESIZEX, IMAGESIZEY,
    		hMemDc, 0, 0, SRCCOPY);
    
    	EndPaint(hwnd, &ps);
    }
    
    void clear(HWND hwnd)
    {	
        int x, y;
    	DWORD *imageptr = (DWORD*)pbits;
    
    	for(y = 0; y < IMAGESIZEY; y++)
    		for(x = 0; x < IMAGESIZEX; x++)
    			*imageptr++ = 0x808080;
    
    	InvalidateRect(hwnd, &imagerect, 0);
    	UpdateWindow(hwnd);
    }
    
    void mu_generator(HWND hwnd)
    {	
    	int i, n, x, y;
    	double magsqd;
    	double cr, ci, zr, zi, tr;
    	DWORD time1, time2;
    	int maxiter = 1000;
    	double xorigin = -1.2, yorigin = 1.35;
    	double xspan = 3.6, yspan = 2.7;
    	DWORD *imageptr = (DWORD*)pbits;
    
    	time1 = GetTickCount();
    	for(y = 0; y < IMAGESIZEY; y++)
    		for(x = 0; x < IMAGESIZEX; x++)
    		{	n = 0;
    			zr = 0;
    			zi = 0;
    			cr = xorigin+(x*(xspan/(IMAGESIZEX-1)));
    			ci = yorigin-(y*(yspan/(IMAGESIZEY-1)));
    			magsqd = 0;
    
    			while(n < maxiter && magsqd <= 4.0)
    			{  tr = zr;
    				zr = (zr*zr)-(zi*zi)-cr;
    				zi = 2*tr*zi-ci;
    				n++;
    				magsqd = (zr*zr)+(zi*zi);
    			}
    
    			if(n == maxiter)
    				i = 0;
    			else
    				i = n;
    
    			*imageptr++ = spectrum[i];
    
    		}
    
    	time2 = GetTickCount();
    	InvalidateRect(hwnd, &imagerect, 0);
    	UpdateWindow(hwnd);
    	MessageBox(hwnd, _itoa(time2-time1, mssg, 10), "loop time (milliseconds)", MB_OK);
    }
    
    int initialize(HWND hWnd )
    {
    	int n, p, p1, p2, c1, c2;
    	float dp, dc, mr, mg, mb, rval, gval, bval;
    
    	HDC hDc = GetDC(hWnd);
    	BITMAPINFO bminfo;
    	BITMAPINFOHEADER bmiheader;
    
    	imagerect.left = 50;
    	imagerect.top = 50;
    	imagerect.right = 50+IMAGESIZEX;
    	imagerect.bottom = 50+IMAGESIZEY;
    
    	int node[4] = {1, 50, 100, 999};
    
    	DWORD color[4] = {0x202020, 0xFF6060, 0xFFA0A0, 0xFFFFFF};
    
    	spectrum[0] = 0;
    	for(n = 0; n < 4; n++)
    		spectrum[node[n]] = color[n];
    
    	for(n = 0; n < 3; n++)
    	{
    		p1 = node[n], p2 = node[n+1];
    		dp = p2 - p1;
    
    		c1 = GetRValue(color[n]), c2 = GetRValue(color[n+1]);
    		dc = c2 - c1;
    		mr =  dc/dp;
    		rval = c1 + 0.05;
    
    		c1 = GetGValue(color[n]), c2 = GetGValue(color[n+1]);
    		dc = c2 - c1;
    		mg =  dc/dp;
    		gval = c1 + 0.05;
    
    		c1 = GetBValue(color[n]), c2 = GetBValue(color[n+1]);
    		dc = c2 - c1;
    		mb =  dc/dp;
    		bval = c1 + 0.05;
    
    		for(p = p1+1; p < p2; p++)
    		{	rval += mr;
    			gval += mg;
    			bval += mb;
    			spectrum[p] = RGB((BYTE)rval, (BYTE)gval, (BYTE)bval);
    		}
    	}
    
    	hMemDc = CreateCompatibleDC(hDc);
    	if(hMemDc == NULL)
    	{	MessageBox(hWnd, "cannot create Mem DC","error", MB_OK);
    		return 0;
    	}
    
    	bminfo.bmiHeader = bmiheader;
    
    	bmiheader.biSize = sizeof(BITMAPINFOHEADER);
    	bmiheader.biWidth = IMAGESIZEX;
    	bmiheader.biHeight = IMAGESIZEY;
    	bmiheader.biPlanes = 1;
    	bmiheader.biBitCount = 32;
    	bmiheader.biCompression = BI_RGB;
    	bmiheader.biSizeImage = 0;
    	bmiheader.biXPelsPerMeter = 0;
    	bmiheader.biYPelsPerMeter = 0;
    	bmiheader.biClrUsed = 0;
    	bmiheader.biClrImportant = 0;
    
    	hBitmap = CreateDIBSection(NULL, (BITMAPINFO*)&bmiheader, 0, &pbits, NULL, 0);
    	if(hBitmap == NULL)
    	{	MessageBox(hWnd, "cannot create bitmap","error", MB_OK);
    		return 0;
    	}
    
    	return 1;
    }
    Last edited by megafiddle; 11-17-2011 at 08:51 PM.

  10. #10
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    I built it with Visual C++ 2010 and I don't see any timing inconsistency when I run it. About 790 milliseconds each time, including the first time.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  11. #11
    Registered User
    Join Date
    Mar 2011
    Posts
    596
    Thanks for testing it.

    I consistantly get about 4203 mSec for the 1st and 2047 mSec for all those after.

    If I remove the 1st call to the loop from WinMain, and only call it in response to a key,
    it runs at 2047 mSec right from the very 1st call.

    I wonder if the higher speed of you're computer has anything to do with it. I would guess
    the memory is also larger, but mine ought to be more than enough.

    I did forget to delete the DIBSection, hBitmap, in that code, but adding in the delete had
    no effect on the times, and I wouldn't expect it to. Would have caused a memory leak,
    though at program end, I imagine.

  12. #12
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Seems weird that it runs almost exactly half-speed.

    Looks like one of those bugs which is simultaneously satisfying and revolting once you figure it out.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  13. #13
    Registered User
    Join Date
    Mar 2011
    Posts
    596
    It's not actually a 2 to 1 speed change. With a different loop (different math function),
    the time actually increases for subsequent loops!

    With original loop above:

    1st loop run - 4203
    thereafter - 2047

    With a different loop:

    1st loop run - 10687
    thereafter - 12735

  14. #14
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Instead of GetTickCount(), could you try using QueryPerformanceCounter() instead? I have a feeling this is just the OS messing with you. I see nothing wrong with your code, and the particular loop you are timing clearly has nothing in it that should be changing from one execution to the next.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  15. #15
    Registered User
    Join Date
    Mar 2011
    Posts
    596
    I have to do some research to use QueryPerformanceCounter(); not familiar with the
    date types and necessary headers.

    But I originally noticed the time differences just by watching the screen update. Counting
    off seconds while watching it, I get the same results (it's slow enough on my machine to
    do that).

    My computer is a microATX with onboard video, so it's not exactly high performance.
    Pentium 4 single core processor. The performance graphs show resources are not even
    close to being taxed, though.

    I am writing directly into a bitmap in the test code above. In a previous version I was
    using SetPixel(). Slower overall, but still about the same 2 second difference.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Execution speed test in c++
    By nick2price in forum C++ Programming
    Replies: 1
    Last Post: 03-12-2009, 04:23 PM
  2. upload / download speeds
    By Dino in forum Tech Board
    Replies: 20
    Last Post: 11-25-2008, 09:06 PM
  3. Speed of pointers vs. speed of arrays/structs
    By Kempelen in forum C Programming
    Replies: 32
    Last Post: 06-27-2008, 10:16 AM
  4. Unbelievable wind speeds
    By VirtualAce in forum A Brief History of Cprogramming.com
    Replies: 12
    Last Post: 09-09-2004, 03:42 PM
  5. Questions on Speed of Execution
    By wavering in forum C Programming
    Replies: 22
    Last Post: 01-20-2002, 02:04 PM