Thread: How does DirectX's Present work?

  1. #1
    Registered User
    Join Date
    Sep 2011
    Posts
    25

    How does DirectX's Present work?

    I'm still kind of new to programming games in C++ with DirectX. I have this project I'm working on in preparation for my senior project coming soon (I'm in school for Game and Simulation Programming). It's just a simple 3D application with three spheres orbiting around like a solar system. It runs fine, but I'm having problems understanding why I can't get above about 60 frames per second, even when I've removed everything from the rendering loop besides the Present call.

    It seems that the Present call itself is slowing down the application. I don't know how Present works in-depth and was just curious if anyone here could fill me in. Is it that Present hangs the application until the display is ready for a presentation? If so, how can I only call Present when it's necessary/appropriate instead of once every render iteration? If 60 FPS is totally acceptable (I'm aware of how FPS correlates with monitor refresh rate) and the highest anyone would ever need to go, how is it that mainstream games can report 200 FPS or more? What does that number actually represent?

    Any and all input is greatly appreciated. Just a curious, probably ill-equipped student.

  2. #2
    Just a pushpin. bernt
    Join Date
    May 2009
    Posts
    426
    Is it that Present hangs the application until the display is ready for a presentation?
    Pretty much. By default it waits for the monitor's vertical retrace, so your frame rate is locked to the refresh rate of the display (or to an integer fraction of it, such as 30 or 20 FPS on a 60 Hz display, if drawing your scene takes longer than 1/60 of a second).
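
    To make the "fraction of the refresh rate" point concrete, here is a rough sketch in plain C++. It assumes simple double buffering where a frame that misses a retrace waits for the next one; the function name vsyncedFps is made up for illustration and is not part of Direct3D.

```cpp
#include <cmath>

// Effective frame rate when Present blocks until the next vertical retrace.
// refreshHz:     display refresh rate, e.g. 60
// renderSeconds: time needed to build one frame
// Assumption: a frame that misses a retrace waits for the next one, so every
// frame occupies a whole number of refresh intervals.
double vsyncedFps(double refreshHz, double renderSeconds)
{
    // Number of refresh intervals the frame spans (never less than one).
    double intervals = std::ceil(renderSeconds * refreshHz);
    if (intervals < 1.0) {
        intervals = 1.0;
    }
    return refreshHz / intervals;
}
```

    On a 60 Hz display, a frame that takes 20 ms spans two refresh intervals and lands at 30 FPS; one that takes 40 ms spans three and lands at 20 FPS.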

    As for acceptable framerates, 60 FPS is perfectly acceptable and matches the refresh rate of most displays, so extra frames beyond that are never shown on such a display.
    30 FPS is generally considered smooth as well (standard films play at 24 FPS), and console games are often locked at 25 or 30 to match PAL or NTSC standards. Some people can tell the difference between 30 and 60, but it really depends on the audience and the type of display.

    If so, how can I only call Present when it's necessary/appropriate instead of once every render iteration?
    I'm not terribly familiar with Direct3D (I'm more into OpenGL, myself), but this is what I gather from the MSDN pages. Whether Present waits for vsync is controlled by the PresentationInterval member of D3DPRESENT_PARAMETERS, which you fill in when creating the device. The default behaves like D3DPRESENT_INTERVAL_ONE, but you can set D3DPRESENT_INTERVAL_IMMEDIATE to ignore vsync.

    how is it that mainstream games can report 200 FPS or more? What does that number actually represent?
    It's just how often the framebuffer is being updated; it has nothing to do with the actual refresh rate (unless the application is set to sync with the refresh rate, of course). At these higher framerates you start to get screen tearing, though, since the framebuffer can be updated while the monitor is still partway through drawing a frame.
    Consider this post signed

  3. #3
    Registered User VirtualAce
    Join Date
    Aug 2001
    Posts
    9,607
    Use D3DPRESENT_INTERVAL_IMMEDIATE to turn off vsync when you fill out your D3DPRESENT_PARAMETERS structure.

  4. #4
    Registered User
    Join Date
    Sep 2011
    Posts
    25
    Sorry for the late reply. I was trying to do a little more research on all this.

    Thanks for the tips, guys. I changed the presentation interval to D3DPRESENT_INTERVAL_IMMEDIATE as was suggested. It appears that Present() is no longer slowing things down. Instead, my LPD3DXMESH->DrawSubset(0) calls are. If I comment them out, I can get an effective 400 FPS (with no meshes drawn to the screen, of course). But if I uncomment either one or all of them, it immediately drops to around 24-30. What could be causing this?

  5. #5
    Registered User VirtualAce
    Join Date
    Aug 2001
    Posts
    9,607
    The mesh may have too many vertices. Keep in mind that ID3DXMesh is really provided to get something up and running quickly. You can get the same functionality by wrapping an IDirect3DVertexBuffer9 and an IDirect3DIndexBuffer9 in a class and providing functions that operate on those buffers. You can also use the mesh optimization functions to re-order vertices, sort them by attribute, etc., which will give a small performance gain.

    Without drawing anything to the screen you should get in the neighborhood of 400 to 800 FPS. As soon as you draw it will drop to around 250 to 300. Keep in mind that dropping from 400 to 200 is nothing compared to dropping from 60 to 30: the first costs an extra 2.5 ms per frame, the second an extra 16.7 ms. There is more to the numbers than meets the eye. The important part of the data is the frame delta, or how long it takes to render/update one frame. You may want to time the update and render functions separately because this will show you how long it takes to update as opposed to how long it takes to render. Update tends to be slower and render should be blazing fast in a single threaded renderer.
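
    A small sketch of the frame-delta idea, using only the standard library. The helper names frameTimeMs and timeMs are invented for this example; nothing here is Direct3D API.

```cpp
#include <chrono>

// Milliseconds spent on one frame at a given frame rate.
double frameTimeMs(double fps)
{
    return 1000.0 / fps;
}

// Runs fn once and returns how long the call took, in milliseconds.
template <typename Fn>
double timeMs(Fn&& fn)
{
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    fn();
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count();
}
```

    In a main loop this could be used as something like `double updateMs = timeMs([&] { Update(timeDelta); });` followed by the same for Render(), assuming those are your own functions. Comparing frameTimeMs(400) with frameTimeMs(200), and frameTimeMs(60) with frameTimeMs(30), shows why the second drop hurts so much more than the first.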

    Make sure you did not specify a multi-threaded device (D3DCREATE_MULTITHREADED) when you created the device. This can significantly degrade performance, although not as much as you have indicated.

    Without seeing your main loop anything else I say here is merely a guess.

    You can tell Direct3D to not wait for present to complete with vsync on by specifying D3DPRESENT_DONOTWAIT.

    From the SDK:
    D3DPRESENT_DONOTWAIT A presentation cannot be scheduled by a hal device. If this flag is set in a call to IDirect3DSwapChain9::Present, and the hardware is busy processing or waiting for a vertical sync interval, then Present will return D3DERR_WASSTILLDRAWING to indicate that the blit operation is incomplete.
    Use with caution as this can cause your update and render loops to get out of sync with one another. If you update 10 times and present failed 10 times then your update is now 10 frames ahead of your render. You can imagine how far off you will be after only a few minutes of game time. If, on the other hand, you wait until D3DERR_WASSTILLDRAWING is not returned then you might as well use D3DPRESENT_INTERVAL_ONE.
    Last edited by VirtualAce; 12-06-2011 at 06:33 PM.

  6. #6
    Registered User
    Join Date
    Sep 2011
    Posts
    25
    Quote: Originally posted by VirtualAce
    Without seeing your main loop anything else I say here is merely a guess.
    I have not yet applied the suggestions you gave in your last post, but attached is what I have thus far. I realize the matrix math in the render methods isn't as optimized as it should be, but that seems to have a negligible effect on performance compared to drawing-related operations. Do you see anything I am doing wrong?

    Also, I keep seeing IDirect3DSwapChain9 pop up, but cannot find any information on when it's appropriate to use it and very little in the way of tutorials. Could you offer anything?

  7. #7
    Registered User VirtualAce
    Join Date
    Aug 2001
    Posts
    9,607
    From your code:
    Code:
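    if(PeekMessage(&msg, NULL, 0U, 0U, PM_REMOVE))
    {
        // A message was waiting: handle it and skip rendering on this pass
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }
    else
    {
        // Rendering only happens when the message queue is empty
        render();
        ...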
    This is probably the main issue. You check whether a Windows message is waiting and, if so, process it, but you do not render on that pass. If 50 messages arrive in succession, you will go 50 passes through the loop without rendering.

    Consider this:
    Code:
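    // Drain every pending message first...
    while(PeekMessage(&msg, NULL, 0U, 0U, PM_REMOVE))
    {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }

    // ...then update and render once per pass through the loop
    Update(timeDelta);
    Render();
    ...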
    Update updates the scene and all objects that need to be updated. Render then will draw all the items. You are updating and rendering in one function.
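
    The difference between the two loops can be seen without any Win32 at all. The toy model below replaces the message queue with a counter; the function names and the "arrivals per pass" idea are assumptions made purely for illustration.

```cpp
// Toy model of the two loops. 'pending' counts queued window messages and
// 'arrivalsPerPass' new messages show up at the start of every pass.
// Both functions return how many frames get rendered in 'passes' passes.

// if/else version: a pass either handles ONE message or renders.
int framesWithIfElse(int passes, int arrivalsPerPass)
{
    int pending = 0;
    int frames = 0;
    for (int i = 0; i < passes; ++i) {
        pending += arrivalsPerPass;
        if (pending > 0) {
            --pending;   // a message was found: no render on this pass
        } else {
            ++frames;    // queue empty: render
        }
    }
    return frames;
}

// while version: drain the whole queue, then always render.
int framesWithWhile(int passes, int arrivalsPerPass)
{
    int pending = 0;
    int frames = 0;
    for (int i = 0; i < passes; ++i) {
        pending += arrivalsPerPass;
        while (pending > 0) {
            --pending;   // handle every queued message
        }
        ++frames;        // render once per pass, no matter what
    }
    return frames;
}
```

    With a steady trickle of one message per pass (think of continuous mouse movement), the if/else loop never renders at all in this model, while the while loop renders on every pass.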

    Another suspect area is this:
    Code:
    createCamera(1.0f, 1000.0f);  // near clip plane, far clip plane
    Why are you re-computing the same exact projection matrix every frame? You are not changing the near or far plane so the matrix is not going to change. The only matrices that will change are the view matrix produced by the camera and the world matrices of the objects.

    Also, prefer not to set D3DTS_PROJECTION and D3DTS_VIEW separately; instead you can set D3DTS_WORLD to world * view * projection and leave the other two as identity. After all, the following two setups yield the same result:

    World = world matrix of object
    View = view matrix of camera
    Projection = projection matrix of camera
    matWVP = World * View * Projection

    World = World * View * Projection
    View = Identity
    Projection = Identity
    matWVP = World * View * Projection

    Prefer to set the projection and view matrices as few times as possible when rendering the scene. Setting these causes Direct3D to do a lot of internal housekeeping that will degrade performance.
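
    The equivalence above is just the fact that multiplying by the identity changes nothing. A minimal check with a hand-rolled 4x4 type follows; Mat4, identity, multiply and translation are stand-ins written for this example, not D3DX functions.

```cpp
#include <array>

using Mat4 = std::array<std::array<float, 4>, 4>;

Mat4 identity()
{
    Mat4 m{};                        // value-initialized to all zeros
    for (int i = 0; i < 4; ++i) {
        m[i][i] = 1.0f;
    }
    return m;
}

// Plain row-by-column product a * b.
Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Translation in the row-vector layout D3DX uses (offsets in the last row).
Mat4 translation(float x, float y, float z)
{
    Mat4 m = identity();
    m[3][0] = x;
    m[3][1] = y;
    m[3][2] = z;
    return m;
}
```

    If wvp is multiply(multiply(world, view), proj), then multiply(multiply(wvp, identity()), identity()) compares equal to wvp, which is the second setup in the list above.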

    Calling this function per frame is slow:

    Code:
    void createCamera(float nearClip, float farClip)
    {
        // Here we specify the field of view, aspect ratio and near and far clipping planes
        D3DXMatrixPerspectiveFovLH(&matProj, D3DX_PI/4, float(WINSIZE_WIDTH)/WINSIZE_HEIGHT, nearClip, farClip);
        pd3dDevice->SetTransform(D3DTS_PROJECTION, &matProj);
    }
    • D3DX_PI / 4 is invariant: it does not change for the life of the application (it is also equivalent to D3DX_PI * 0.25f)
    • WINSIZE_WIDTH / WINSIZE_HEIGHT is also invariant
    • nearClip and farClip are invariant
    • matProj is invariant


    If you enable compiler optimizations in release builds, the constant arithmetic should be folded away, but the compiler cannot remove the calls to D3DXMatrixPerspectiveFovLH and SetTransform, so I would not rely on it.
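
    One way to avoid the per-frame rebuild is to cache the matrix and recompute it only when a lens parameter changes. The sketch below is portable C++: perspectiveFovLH is a hand-written stand-in intended to mirror the left-handed formula documented for D3DXMatrixPerspectiveFovLH, and ProjectionCache is a hypothetical helper, not part of D3DX.

```cpp
#include <cmath>

struct Mat4 { float m[4][4]; };

// Left-handed perspective projection in row-vector layout.
Mat4 perspectiveFovLH(float fovY, float aspect, float zn, float zf)
{
    const float yScale = 1.0f / std::tan(fovY * 0.5f);
    const float xScale = yScale / aspect;
    Mat4 p = {};                          // start from all zeros
    p.m[0][0] = xScale;
    p.m[1][1] = yScale;
    p.m[2][2] = zf / (zf - zn);
    p.m[2][3] = 1.0f;
    p.m[3][2] = -zn * zf / (zf - zn);
    return p;
}

// Rebuilds the projection only after setLens() has been called.
class ProjectionCache
{
public:
    void setLens(float fovY, float aspect, float zn, float zf)
    {
        fovY_ = fovY; aspect_ = aspect; zn_ = zn; zf_ = zf;
        dirty_ = true;
    }

    const Mat4& matrix()
    {
        if (dirty_) {
            cached_ = perspectiveFovLH(fovY_, aspect_, zn_, zf_);
            dirty_ = false;
            ++rebuilds_;                  // counted only to show the effect
        }
        return cached_;
    }

    int rebuilds() const { return rebuilds_; }

private:
    float fovY_ = 0.785398163f;           // pi / 4
    float aspect_ = 1.0f;
    float zn_ = 1.0f;
    float zf_ = 1000.0f;
    bool dirty_ = true;
    Mat4 cached_ = {};
    int rebuilds_ = 0;
};
```

    In the real application the equivalent would be to call createCamera once at startup (and again on a window resize) rather than inside the render loop.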


    A couple of other items of interest:
    • Create a proper camera class - your camera will not work in all orientations. The camera class should only re-compute the view matrix when the camera orientation has changed.
    • Consider using quaternions or an axis-angle representation for rotation instead of Euler angles built from the D3DXMatrixRotationX/Y/Z functions. If you concatenate those, the resulting rotation can suffer from gimbal lock.
    • Consider making planet, sun, and moon objects and/or derive from a common base type and place them in a render list. This will allow for efficient frustum culling when you get to it.
    • Consider using hierarchical transforms for your system so that moon can orbit planet and planet and moon can orbit sun. With D3DX's row-vector matrices the child's world transform is Local * Parent, i.e. Scale * Rotate * Translate * Parent. Parent of the moon is planet and parent of the planet is sun. One problem here is that when the sun rotates the entire system will rotate but you can solve that as well.
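
    A minimal sketch of the hierarchical-transform bullet, in portable C++. It assumes D3DX's row-vector convention, where the child's local matrix goes on the left and the parent's world matrix on the right. All names (Mat4, rotationY, orbitWorld and so on) are invented for this example.

```cpp
#include <array>
#include <cmath>

using Mat4 = std::array<std::array<float, 4>, 4>;

Mat4 identity()
{
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Translation with the offsets in the last row (row-vector layout).
Mat4 translation(float x, float y, float z)
{
    Mat4 m = identity();
    m[3][0] = x; m[3][1] = y; m[3][2] = z;
    return m;
}

// Rotation about the Y axis, same layout as D3DXMatrixRotationY.
Mat4 rotationY(float angle)
{
    Mat4 m = identity();
    const float c = std::cos(angle);
    const float s = std::sin(angle);
    m[0][0] = c;  m[0][2] = -s;
    m[2][0] = s;  m[2][2] = c;
    return m;
}

// World matrix of a body orbiting its parent: place it 'radius' out on the
// X axis, swing it around the Y axis, then apply the parent's world matrix.
Mat4 orbitWorld(float radius, float angle, const Mat4& parentWorld)
{
    const Mat4 local = multiply(translation(radius, 0.0f, 0.0f), rotationY(angle));
    return multiply(local, parentWorld);
}
```

    The position of each body is the last row of its world matrix. With the sun at the origin, a planet at radius 10 and a moon at radius 2 around the planet, turning the planet a quarter of the way round its orbit carries the moon along with it. That is the inherited rotation the bullet warns about.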
    Last edited by VirtualAce; 12-07-2011 at 11:24 PM.
