Thread: Tile-map performance

  1. #1
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607

    Tile-map performance

    Major problem with tile maps and I'm not sure how to solve it.

    My tile map uses a repeating grid of vertices. The only thing that changes is the starting texture in the first grid cell or the upper left corner. As you scroll right, the offset increases, down it increases by mapwidth, etc, etc.

    Big problem. I'm using DrawPrimitive to draw 1 tile with a texture. This means I'm setting the texture for every single tile being rendered. Big performance drain there.

    But is there any way around this?

    Positioning each quad in world space and drawing all quads that share a texture together would not be a problem, but determining which cells are visible might be. I could use a quad tree, but I'm not so sure that would be any faster than my current method.

  2. #2
    vae victus! skorman00's Avatar
    Join Date
    Nov 2003
    Posts
    594
    If you're using tiles, you already have a quad tree of sorts, and you're probably already culling stuff by treating it as one. Setting a texture context is one of the biggest bottlenecks on any card, so cutting down the number of times that gets done is worth it.

    But why are you setting it each time?
    Quote Originally Posted by VirtualAce
    The only thing that changes is the starting texture in the first grid cell or the upper left corner.
    Set the texture once at the beginning of a frame, then go through all of those DrawPrimitive calls.

  3. #3
    Registered User
    Join Date
    Jan 2006
    Location
    Sweden
    Posts
    92
    I would make some sort of arrays with pointers to the tiles which share the same texture, and maybe cull them one by one or something. But as you say, this might be a slow process. Would it maybe be a good idea to somehow render all of your textures into one big texture, place that on the smaller primitives, and then cull as you currently do? But I wonder if that would drain a lot of memory.

    I've thought about this as well for my terrain engine, but I decided that one big texture would look the best and be least work so I didn't look into this further I'm afraid.

  4. #4
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    I can't just set the upper left corner and go from there. The upper left corner determines the starting texture in the map.

    If the map looks like this:

    1 2 3 4 5 6 7
    8 9 1 2 3 4 5
    ....
    Scrolling one cell to the right yields this:

    2 3 4 5 6 7 ...
    9 1 2 3 4 5 ...
    ....
    But you see with each change in value I must change the texture.

    And no, there is no quad tree here, just a simple ortho projection in screen space with a repeating grid. Movement is achieved by changing the start offset in the map. So it auto-culls everything because it only draws what is visible.
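    The start-offset scheme described above can be sketched in plain C++ as follows. This is only an illustration: `TileWindow` and `ComputeTileWindow` are made-up names, the map is assumed to be a linear row-major array, and nothing is clamped to the map edges.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not from the original engine.
struct TileWindow {
    std::uint32_t firstCol, firstRow; // top-left visible map cell
    std::uint32_t cols, rows;         // cells needed to cover the screen
    std::int32_t  offsetX, offsetY;   // pixel shift of the grid (0 or negative)
    std::uint32_t startIndex;         // index of the top-left cell in the linear map
};

// scrollX/scrollY are world-space pixel positions of the screen's top-left corner.
// Assumption: the caller keeps the window inside the map; no clamping is done here.
TileWindow ComputeTileWindow(std::uint32_t scrollX, std::uint32_t scrollY,
                             std::uint32_t screenW, std::uint32_t screenH,
                             std::uint32_t tileSize, std::uint32_t mapWidth)
{
    assert(tileSize > 0);
    const std::uint32_t subX = scrollX % tileSize; // pixels scrolled into the first column
    const std::uint32_t subY = scrollY % tileSize; // pixels scrolled into the first row

    TileWindow w;
    w.firstCol = scrollX / tileSize;
    w.firstRow = scrollY / tileSize;
    w.offsetX  = -static_cast<std::int32_t>(subX);
    w.offsetY  = -static_cast<std::int32_t>(subY);
    // Round up so a partially visible column/row at the far edge is included.
    w.cols = (screenW + subX + tileSize - 1) / tileSize;
    w.rows = (screenH + subY + tileSize - 1) / tileSize;
    // Moving right adds 1 to the index, moving down adds mapWidth.
    w.startIndex = w.firstRow * mapWidth + w.firstCol;
    return w;
}
```

    Only `cols * rows` cells are ever touched per frame, which is the "auto-cull" effect: everything outside the window is simply never visited.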

  5. #5
    vae victus! skorman00's Avatar
    Join Date
    Nov 2003
    Posts
    594
    If you have a grid, you have a quad tree (sort of). But I see what you're trying to get at about the scrolling. Spending some extra cycles to order your drawing based on texture will probably still be faster than changing the context each tile.

    Depending on the size of your textures and the way you draw them, you might be able to put a few textures into one. That way you don't have to change the context so often, but you will have to worry about setting up the correct uv's for the tiles. The biggest drawback with that though is that you can't wrap textures, and mip/mag filters can look funky if you don't set the texture up correctly.
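    A rough sketch of the "correct uv's" part, assuming square tiles packed row by row into a square texture. The names `UVRect` and `AtlasUV` are invented for the example, and the half-texel inset is just one common precaution against neighbouring tiles bleeding in under filtering, not something from this thread.

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct UVRect { float u0, v0, u1, v1; };

// Returns the texture coordinates of tile number tileIndex inside a square
// atlasSize x atlasSize texture holding tileSize x tileSize tiles, row-major.
UVRect AtlasUV(unsigned tileIndex, unsigned tileSize, unsigned atlasSize)
{
    assert(tileSize > 0 && atlasSize >= tileSize);
    const unsigned tilesPerRow = atlasSize / tileSize;
    const unsigned col = tileIndex % tilesPerRow;
    const unsigned row = tileIndex / tilesPerRow;

    const float texel = 1.0f / static_cast<float>(atlasSize);
    const float inset = 0.5f * texel; // pull the edges in by half a texel

    UVRect r;
    r.u0 = static_cast<float>(col * tileSize) * texel + inset;
    r.v0 = static_cast<float>(row * tileSize) * texel + inset;
    r.u1 = static_cast<float>((col + 1) * tileSize) * texel - inset;
    r.v1 = static_cast<float>((row + 1) * tileSize) * texel - inset;
    return r;
}
```

    Because each tile only occupies part of the 0..1 range, wrap addressing repeats the whole atlas rather than the tile, which is the wrapping drawback mentioned above.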

  6. #6
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    Ok I think I've come up with a solution.

    Instead of thinking old school tile maps I need to break it down into large primitives. Direct3D is very good at doing large primitives with one texture.

    Solution:

    No tiles at all
    Only use tiles in the editor. When I save the tile maps, don't save them as data, but save them as 1024x1024 or 512x512 textures. In other words, save each portion of the map as one texture and stick that in the data file. At run-time I simply render a quad 1024x1024 (or whatever resolution we decide on) and use a quad-tree system to see which quads are on-screen. For those that are within screen limits, draw the whole quad. Hardware is very good at clipping large quads.

    So this breaks the DrawPrimitive and SetTexture calls down from roughly (ScreenWidth/TileSize) x (ScreenHeight/TileSize) per frame to 4 max for any one screen, since it's possible that you could be at the intersection of 4 of these quads.
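    To illustrate the "4 max" claim, here is a small sketch that lists which large quads overlap the screen. `ChunkCoord` and `VisibleChunks` are hypothetical names, and a plain grid lookup stands in for the quad-tree; the limit of four only holds when the chunk is at least as large as the screen in both directions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type: position of a pre-rendered chunk in the chunk grid.
struct ChunkCoord { std::uint32_t cx, cy; };

// Returns every chunk that overlaps the screen rectangle whose top-left
// corner sits at world pixel (scrollX, scrollY).
std::vector<ChunkCoord> VisibleChunks(std::uint32_t scrollX, std::uint32_t scrollY,
                                      std::uint32_t screenW, std::uint32_t screenH,
                                      std::uint32_t chunkSize,
                                      std::uint32_t chunksWide, std::uint32_t chunksHigh)
{
    assert(chunkSize > 0 && screenW > 0 && screenH > 0);
    assert(chunksWide > 0 && chunksHigh > 0);

    const std::uint32_t firstX = scrollX / chunkSize;
    const std::uint32_t firstY = scrollY / chunkSize;
    // Last chunk touched by the right/bottom screen edge, clamped to the world.
    const std::uint32_t lastX = std::min((scrollX + screenW - 1) / chunkSize, chunksWide - 1);
    const std::uint32_t lastY = std::min((scrollY + screenH - 1) / chunkSize, chunksHigh - 1);

    std::vector<ChunkCoord> out;
    for (std::uint32_t cy = firstY; cy <= lastY; ++cy)
        for (std::uint32_t cx = firstX; cx <= lastX; ++cx)
            out.push_back(ChunkCoord{cx, cy});
    return out;
}
```

    With a 1024x768 screen and 32x32 tiles, a per-tile renderer issues about 32 x 24 = 768 draws per frame, while this returns between one and four chunks for 1024x1024 quads.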

    Tile-specific effects can still be done by using decaling, render to texture, etc, etc. I will still know where in the world the tiles are by just doing a little math based on the world scroll values.

    I think this will be a major speed up and will take less memory than storing one instance of each tile. This will free up a lot of resources. It's an odd way to do pixel-perfect scrolling, but other methods just don't work anymore.

    Scrolling using texture transform flags
    Using one large texture won't work either since you cannot fit a 4096x4096 texture in video memory and yet you could have a world size that large. The alternative to this would be to cache in portions of the texture at a time, but again, this requires locking surfaces which is what I want to avoid. I would love to just transform texture u,v's to scroll the picture because it would be uber fast......but I cannot do this. I would have to cache in a portion of the data, use D3DXCreateTexture to make it a texture, and then render it. This would happen at 1024x1024 boundaries and I believe it would cause slow downs. I may even try to store a 4096x4096 texture in a file and then create several smaller textures from it using my tile extraction code. They would just be larger tiles. Then perhaps I may try this image cache scheme and just scroll the u,v's.

    For those of you interested in the tile extraction code:

    CTextureManager
    Code:
    bool CTextureManager::AddFromFileEx(std::string File,DWORD dwWidth,DWORD dwHeight,
                                        DWORD &dwOutWidth,DWORD &dwOutHeight)
    {
      //A zero cell size would never advance the loops below
      if (dwWidth==0 || dwHeight==0)
        return false;
    
      //Create base texture; bail out if the file cannot be loaded
      IDirect3DTexture9 *pSourceTexture=NULL;
      if (FAILED(D3DXCreateTextureFromFile(m_pDevice,File.c_str(),&pSourceTexture)))
        return false;
    
      //Get surface desc
      D3DSURFACE_DESC sourceDesc;
      pSourceTexture->GetLevelDesc(0,&sourceDesc);
    
      dwOutWidth=sourceDesc.Width;
      dwOutHeight=sourceDesc.Height;
    
      //Now lock the surface (read-only, we only copy out of it)
      D3DLOCKED_RECT sourceRect;
      if (FAILED(pSourceTexture->LockRect(0,&sourceRect,NULL,D3DLOCK_READONLY)))
      {
        pSourceTexture->Release();
        return false;
      }
    
      //Pointer to surface (buffer); assumes a 32-bit format such as D3DFMT_A8R8G8B8
      DWORD *pSourceBits=(DWORD *)sourceRect.pBits;
      DWORD dwPitchPixels=sourceRect.Pitch/4;
    
      //Walk the source in cell-sized steps. Use the image width, not the pitch,
      //as the row limit, and skip partial cells at the right and bottom edges.
      for (DWORD dwY=0;dwY+dwHeight<=sourceDesc.Height;dwY+=dwHeight)
      {
        for (DWORD dwX=0;dwX+dwWidth<=sourceDesc.Width;dwX+=dwWidth)
        {
          //Create new texture object
          CTexture *pTexture=new CTexture;
    
          //Create texture from buffer, starting at this cell's top-left pixel
          pTexture->CreateFromImageEx(m_pDevice,
                                      dwWidth,
                                      dwHeight,
                                      pSourceBits,
                                      dwY*dwPitchPixels+dwX,
                                      sourceRect.Pitch);
    
          //Add texture to vector
          m_vTextures.push_back(pTexture);
        }
      }
    
      //Unlock and release the source now that every cell has been copied
      pSourceTexture->UnlockRect(0);
      pSourceTexture->Release();
    
      return true;
    }
    CTexture.h
    Code:
    //Grabs a texture of width,height from pSourceBuffer
    void CTexture::CreateFromImageEx(IDirect3DDevice9 *pDevice,
                                     DWORD dwTargetWidth,
                                     DWORD dwTargetHeight,
                                     DWORD *pSourceBuffer,
                                     DWORD dwSourceOffset,
                                     DWORD dwSourcePitch)
    {
      //Create a blank texture with a single mip level. Only level 0 is filled
      //below, so a full chain (D3DX_DEFAULT) would leave the other levels empty.
      if (FAILED(D3DXCreateTexture(pDevice,
                                   dwTargetWidth,
                                   dwTargetHeight,
                                   1,
                                   0,
                                   D3DFMT_A8R8G8B8,
                                   D3DPOOL_MANAGED,
                                   &m_pTexture)))
        return;
    
      //Lock its surface and get a pointer to it
      D3DLOCKED_RECT rect;
      if (FAILED(m_pTexture->LockRect(0,&rect,NULL,0)))
        return;
      DWORD *ptrBits=(DWORD *)rect.pBits;
    
      //Row strides in pixels (pitch is in bytes, 4 bytes per pixel)
      DWORD dwDestStride=rect.Pitch/4;
      DWORD dwSourceStride=dwSourcePitch/4;
    
      //Copy the requested cell row by row from source to this texture's surface
      for (DWORD i=0;i<dwTargetHeight;i++)
      {
        for (DWORD j=0;j<dwTargetWidth;j++)
        {
          ptrBits[i*dwDestStride+j]=pSourceBuffer[dwSourceOffset+i*dwSourceStride+j];
        }
      }
    
      //Unlock surface
      m_pTexture->UnlockRect(0);
    }
    Last edited by VirtualAce; 04-10-2006 at 05:28 AM.

  7. #7
    Supermassive black hole cboard_member's Avatar
    Join Date
    Jul 2005
    Posts
    1,709
    Didn't they do something with boundaries in Morrowind? I'm not sure of the details, it just gets on my nerves that it hangs up a couple of seconds every 30 steps.
    Good class architecture is not like a Swiss Army Knife; it should be more like a well balanced throwing knife.

    - Mike McShaffry

  8. #8
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    Great I messed up my own google search. A search for per-pixel scrolling yields my own thread on here about per-pixel scrolling. Unfortunately it's the method I'm dumping.

    New method:

    Old school algo. Lock all surfaces of textures and store pointers to them in a class member.

    Lock texture representing screen.
    Draw to surface using older algo from DOS days.
    Unlock texture.

    Draw quad the size of the screen using texture just created.

    Requires 1 screen texture plus the tile textures. Tile textures reside in the vector or list and all stay locked. Since I'm not actually rendering the tile textures, leaving them locked should not cause any issues.

    We will see.

  9. #9
    vae victus! skorman00's Avatar
    Join Date
    Nov 2003
    Posts
    594
    Per-pixel scrolling? I still don't see why you can't scroll the uv's.

  10. #10
    Registered User
    Join Date
    Jan 2005
    Posts
    847
    Quote Originally Posted by skorman00
    Per-pixel scrolling? I still don't see why you can't scroll the uv's.
    That would require one large texture.

  11. #11
    Registered User
    Join Date
    Apr 2006
    Posts
    43
    How many tiles are you going to have and what size are they going to be? How many tiles are you going to have displayed at the same time? Unless it's very very many I don't see any reason why you would have performance problems, all geometry and textures should be uploaded to the graphics card already, you should at worst have to regenerate the tiles at the borders...Do you have any pictures of what you're doing?

  12. #12
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    1. Thousands
    2. 1024x768 (32x32)
    3. Multiple Sprites (any size)
    4. GUI and HUD
    5. Special effects sprites

    I got it working.

    No, you cannot just regenerate the tiles at the borders. Let's say you have a vertex buffer with a grid. As you scroll the grid vertices you must reset their position when:

    (iScrollX % iCellSize ==0) && (iScrollY % iCellSize==0)

    However what do you do if the screen being displayed is smaller than the actual screen? You must re-create the vertex grid and/or regenerate it. Solution is do not use a vertex grid.

    My new algo simply draws pixel by pixel. It looks up the tile ID for the current map cell:

    Code:
    DWORD dwTileID=m_pMapMgr->GetMapValue(dwCurMapLayer,dwMapOffset);
    It gets the pixels from that tile's surface:

    Code:
    DWORD *ptrSurface=m_pTexMgr->GetBuffer(dwTileID);
    So:

    Code:
    D3DLOCKED_RECT rect;
    m_pScreenTexture->LockRect(0,&rect,NULL,0);
    
    DWORD *ptrScreen=(DWORD *)rect.pBits;
    
    ...
    ...
    ptrScreen[dwDestOffset]=ptrSurface[dwSourceOffset];
    
    
    ....
    m_pScreenTexture->UnlockRect(0);
    Draws the screen.

    Then I set texture to m_pScreenTexture and render a quad the size of the screen.

    1 Lock
    Several memory accesses
    1 Unlock

    1 Triangle strip
    1 Primitive
    1 Screen texture

    All textures in the texture manager class are already locked and remain this way for the duration of the game.
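    For anyone wanting to try the per-pixel approach outside Direct3D, here is a minimal, hedged sketch of copying one tile into a screen buffer with clipping. `BlitTile` and its parameters are invented for the example; both pitches are given in pixels (bytes / 4), and the real engine's inner loop is not shown in this thread.

```cpp
#include <algorithm>
#include <cstdint>

// Copies a tileSize x tileSize block of 32-bit pixels from src into dst at
// (destX, destY), clipping against the dstW x dstH visible area.
// dstPitch and srcPitch are row strides in pixels, not bytes.
void BlitTile(std::uint32_t* dst, int dstPitch, int dstW, int dstH,
              const std::uint32_t* src, int srcPitch, int tileSize,
              int destX, int destY)
{
    // Part of the tile that actually lands on screen, in tile-local coordinates.
    const int x0 = std::max(0, -destX);
    const int y0 = std::max(0, -destY);
    const int x1 = std::min(tileSize, dstW - destX);
    const int y1 = std::min(tileSize, dstH - destY);
    if (x0 >= x1 || y0 >= y1)
        return; // tile is completely off screen

    for (int y = y0; y < y1; ++y) {
        const std::uint32_t* s = src + y * srcPitch + x0;
        std::uint32_t* d = dst + (destY + y) * dstPitch + (destX + x0);
        std::copy(s, s + (x1 - x0), d); // one clipped row at a time
    }
}
```

    Negative `destX`/`destY` values are what give per-pixel scrolling: the first row and column of tiles start partly off screen and get clipped.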

    I clear the screen texture using:

    Code:
    __asm {
      //Pointer to screen
      mov edi,ptrScreen
      
      //size in DWORDs = surfHeight*(surfPitch>>2)
      //(adding surfWidth on top of this would write past the end of the surface)
      mov eax,surfHeight
      mov ebx,surfPitch
      shr  ebx,2
      mul  ebx
      
      //Size is ecx
      mov ecx,eax
      
      //Color is 0
      mov eax,0
    
      //Store EAX at [EDI] and advance EDI by 4, repeated ECX times
      rep  stosd
    }
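    A portable alternative to the inline assembly clear, sketched in plain C++ under the assumption of a 32-bit surface. `ClearSurface` is a made-up name; it fills only `width` pixels per row and steps by the pitch, so it stays inside the locked surface.

```cpp
#include <algorithm>
#include <cstdint>

// Fills a locked 32-bit surface with one color.
// bits: pointer returned by the lock, pitchBytes: row stride in bytes.
void ClearSurface(void* bits, int pitchBytes,
                  unsigned width, unsigned height, std::uint32_t color)
{
    unsigned char* row = static_cast<unsigned char*>(bits);
    for (unsigned y = 0; y < height; ++y) {
        std::uint32_t* p = reinterpret_cast<std::uint32_t*>(row);
        std::fill(p, p + width, color); // visible pixels only, padding untouched
        row += pitchBytes;              // the pitch may be larger than width*4
    }
}
```

    Optimizing compilers usually turn `std::fill` over 32-bit values into a tight store loop, so this is a reasonable baseline to measure against before reaching for assembly.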
    Using aggregated data types in inline asm is such a pain. I wish they would change that.

    Now you must do this:

    Code:
    struct Foo
    {
      int x;
    };
    
    Foo Test;
    
    __asm {
      mov ebx,OFFSET Test
      mov ecx,[ebx].x
    }
    But you can get away with this since access restrictions are a compiler side mechanism. I DON'T recommend this.

    Code:
    class Foo
    {
      private:
        int value;
      public:
        Foo():value(0) { }
    };
    
    int main(void)
    {
      Foo Test;
      __asm {
        //Test is a local (stack) variable here, so take its address with lea
        lea ebx,Test
        mov [ebx].value,0FFh
      }
    }
    Even though value is private to Foo, in inline asm blocks there are no access restrictions on data members. However, you cannot call member functions by name. Actually, I could call a member function by using its address rather than its name. I could also make a pointer to the VTABLE and parse it as well, but there's no need.
    Last edited by VirtualAce; 04-12-2006 at 05:06 AM.

  13. #13
    Registered User
    Join Date
    Apr 2006
    Posts
    43
    Haven't coded D3d for ages, can't say I follow your code...

    So you have a large texture that shows your tiles?

    And then you update that texture pixel by pixel?

    I would have done the tiles as polygons with u,v coordinates indexed into a larger texture. What I would do is basically a chunk-based terrain engine but without any elevation data or LODing (look at Thatcher Ulrich's stuff). Or do I misunderstand the purpose of your tiles?

  14. #14
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    And how are you going to fit a 4096x4096 texture in video memory?

    You cannot just scroll u,v's and expect this to work. It will work fine until you reach the end of the texture. Video drivers do not allow for anything larger than 2048x2048 and you are lucky if they allow that. Besides, the tile map is in memory as a linear array and we are drawing according to the map. I would have to make several separate 1024x1024 textures and cache them in accordingly at run time. Since 4 textures could be intersecting at any one time, this would be non-trivial code. I could copy between surfaces with IDirect3DDevice9::UpdateSurface or StretchRect, but they have many restrictions.

    And if you want a dynamic tile map, you will either have to use decaling, which requires culling primitives and so on, or do real-time texture writes using render targets. Again, most of this is far too complicated for a simple tile engine.

    I can do lighting of each tile by performing additive blending in assembly code and then placing the resulting pixel in the screen buffer. In fact since I have per-pixel access to the texture representing the screen....I can do a lot.
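    As a rough idea of what additive blending on a single pixel looks like, here is a plain C++ sketch rather than the assembly version mentioned above. `AddBlendARGB` is a hypothetical name; it adds each 8-bit channel of two A8R8G8B8 pixels and clamps at 255. Note that it treats alpha like any other channel, which may not be what a real lighting pass wants.

```cpp
#include <cstdint>

// Saturating per-channel add of two 32-bit ARGB pixels.
std::uint32_t AddBlendARGB(std::uint32_t dst, std::uint32_t src)
{
    std::uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        std::uint32_t sum = ((dst >> shift) & 0xFFu) + ((src >> shift) & 0xFFu);
        if (sum > 0xFFu)
            sum = 0xFFu; // clamp instead of wrapping around into the next channel
        out |= sum << shift;
    }
    return out;
}
```

    The result would then be written into the screen buffer in place of the raw tile pixel.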

    The article on GameDev about doing 2D with 3D is nice, but it fails to address the problem of setting the texture for every single tile. This can be reduced since most tile maps are repetitive, but it is still a bad approach. Creating an entire vertex grid in memory is one approach, but it would have to be software and not hardware vertex buffers. Since my vertices use D3DFVF_XYZRHW, they are already transformed, so running them through a translation matrix does nothing. To move the vertices you must lock, move, and unlock. Not your best case scenario there. Rendering could be sped up by using a quad-tree, but for updating the scroll position you must update every single vertex in the map or it won't look right. I can think of ways to optimize this as well, but you see, you are working against yourself.

    So I went with the old-school per-pixel algo.
    Last edited by VirtualAce; 04-12-2006 at 05:24 AM.

  15. #15
    Registered User
    Join Date
    Apr 2006
    Posts
    43
    I'm not suggesting you should do the scrolling by updating the u,v coordinates of each polygon vertex. Transforming the vertex positions in hardware takes care of the scrolling, and setting the clip area lets you render the map inside a smaller area.

    For 10000 different kinds of tiles you could upload ten 1024x1024 textures containing the tiles to the graphics card (at 32x32 pixels, each texture holds 1024 tiles) and then render the visible tiles onto the screen. Sort the tiles' rendering order so you don't do more than 10 texture switches, and you should have very decent performance without too much work.
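    The sorting step could look roughly like this. It is only a sketch: `TileDraw` and `SortByPageAndCountSwitches` are invented names, tile IDs are assumed to be packed so that `tileId / tilesPerPage` gives the texture page, and the actual SetTexture/draw calls are left as comments.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record for one visible tile.
struct TileDraw {
    std::uint32_t tileId; // index into the global tile set
    int screenX, screenY; // where the tile goes on screen
};

// Sorts the visible tiles by texture page and returns how many texture
// switches a renderer walking the sorted list would need.
std::size_t SortByPageAndCountSwitches(std::vector<TileDraw>& draws,
                                       std::uint32_t tilesPerPage)
{
    std::stable_sort(draws.begin(), draws.end(),
        [tilesPerPage](const TileDraw& a, const TileDraw& b) {
            return a.tileId / tilesPerPage < b.tileId / tilesPerPage;
        });

    std::size_t switches = 0;
    bool havePage = false;
    std::uint32_t currentPage = 0;
    for (const TileDraw& d : draws) {
        const std::uint32_t page = d.tileId / tilesPerPage;
        if (!havePage || page != currentPage) {
            ++switches;         // a real renderer would call SetTexture here
            currentPage = page;
            havePage = true;
        }
        // ...and draw the quad for d here, using UVs inside the current page.
    }
    return switches;
}
```

    Tiles are opaque and do not overlap within one layer, so reordering them does not change the picture; the number of switches is bounded by the number of pages regardless of how many tiles are visible.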

    I haven't read the GameDev article you mention, but I think you are making this overly complex. Switching between 10 different textures is not going to kill you as long as you sort the polygons properly.
