Resource manager tree

**VirtualAce** · 08-16-2007

I've come up with a resource manager tree structure that I want advice on.

The concept is simple. Each object has a unique ID. The objects are sorted into a tree using the following:

If the object ID is greater than the current node's ID, use the current node's right child. If there is no right child, create a new node using the object ID and set it to be the right child of the current node. Add this ID to a double linked list.
If the object ID is less than the current node's ID, use the current node's left child. If there is no left child, create a new node using the object ID and set it to be the left child of the current node. Add this ID to a double linked list.

The doubly linked list is a simple list of all the ID's that are in the tree. The list is not used to find the object's in the tree but it is used to cache out the least recently used objects. So when a new object is inserted into the tree, it is also inserted into the top of the list. If the list size is larger than a pre-set cache size, the bottommost list member is removed. If the object in the list is requested from the tree and the object is in the tree...and thus in the list...the object in the list is moved to the TOP of the list.

What this amounts to is that the most used objects or resources will be in the list and thus in the tree. I'm using the tree as a very fast way to find objects and I'm using the list as a way to cache them in or out. Worst case is that the requested ID is not in the tree and thus returns NULL at which point the calling function will need to create the object and insert it into the tree (and then this puts it at the top of the list).

I'm storing a pointer to the object in the list in the node so that the object in the list can be re-ordered using the data in the node. The list never needs to be searched and should never be since it is O(n).

Code:

struct TreeNode
{
  CResource *pObj;
  ListNode *pListPointer;
  TreeNode *pParent;
  TreeNode *pLeft;
  TreeNode *pRight;
};

struct ListNode
{
  CResource *pObj;
  ListNode *Prev;
  ListNode *Next;
};

class CResource
{
  protected:
    DWORD m_dwID;
  public:
    ...
    ...
};

class CResMgr
{
  private:
    void Insert(CResource *pObj,TreeNode *pCurNode,bool bInserted);
    CResource *Find(DWORD dwID,TreeNode *pCurNode,bool bFound);

  protected:
    TreeNode *pRoot;
    ListNode *pTop;
    ListNode *pBottom;
  public:
    ...
    ...
    CResource *Get(DWORD dwID);
    DWORD Add(CResource *pObj);
    ...
    ...
};

Comments?

**QuestionC** · 08-16-2007

This seems problematic. So... let me get this straight...

Find...

Perform a lookup for the ID in the tree. Let's assume it is found in the tree.
Use pListPointer to see if the object is currently cached (how? pListPointer != NULL?)
If the object is cached in the list, then return a pointer to the object.
If the object is not cached in the list, then load the object, stick it in the list, and return a pointer to the object.

Searching the list, BTW, is not a O(n) operation since it doesn't really grow with the number of elements... it is of finite size. The list might only contain 5 elements, while n = 10,000.

**VirtualAce** · 08-16-2007

I made a mistake in the code above. The list is self contained in the tree. Each node has a TreeNode *pNext and TreeNode *pPrev. So there are two data structures contained in one structure.

The list will grow as objects are added but I'm only allowing so many object. But to find an object in the list is still a linear search and is very slow. Otherwise I could just use the list and forget the tree. The tree is for searching and the list is used for caching. Trying to cache in and out of the tree would be problematic.

This is akin to using a std::map to represent the tree and a std::list to represent the resource cache.

I'm not quite sure how large of a list I will need to keep the cache from thrashing but also keep the game from pausing while it loads new resources. The biggest problem I've had is with the resource manager. The resource manager cannot realistically re-create or load the resource in question because it knows nothing about the underlying resource. So the object needing the resources has been delegated this task. Whenever an object needs a resource it first checks the cache. If this returns NULL then the resource must be loaded by the object and then placed into the tree and list.

Ideally my system should have 2 textures per ship, 2 or 3 textures per planet (with 1 or 2 planets per sector), sound fx for the current ship and it's laser systems (laser sounds are unique relative to the type of laser), music, models, etc.

So a list of 5000 objects would probably not be out of the question at any one time. Trying to access a certain object in the list would be quite slow which is why I need the tree.

Memory used by the objects is cleaned up by the resource manager when it pushes a new resource on the top of the list thus pushing 'out' the bottom resource in the list. When this resource is destroyed the memory it was using is then given back to the system.

**QuestionC** · 08-16-2007

I think I understand what you are trying to do, except for one detail...

When you get a cache miss with a full cache, you need to invalidate an old member of the cache as you add a new member. This is trivial to do with the linked list, but difficult with the tree.
Do you plan on
A) Leaving nodes with invalid cache entries in the tree?
or
B) Removing nodes from the tree as they are invalidated in the list?

In the latter case, it seems like there should be a better option than a basic tree, what that is I am not sure.

I have a hard time following you without at least some code, so I threw together my interpretation of what you're trying to do is.

Code:

struct CacheListNode {
   DataType data;
   std::string name;
   CacheListNode * next;
   CacheListNode * prev;
};

std::map <std::string, CacheListNode> cache_map;

// Cached lookup for data based on its identifier, aka 'name'.
// This cache has two goals...
// 1) Minimize the number of times we call expensive_data_loading_function();
// 2) Keep a limited number of DataTypes in memory.  DataType is LARGE!
DataType find (std::string name) {
   // Step 1: Figure out which node the name belongs to
   ListNode * cache_node = cache_map[name];

   // Step 2: Is the node still cached for that entry?
   if (cache_node->name != name) {
      // Step 3: If cache missed, re-cache it.

      // Use the last node.
      cache_node = last_node;
      last_node = last_node->prev;

      // Repopulate it with proper data.
      cache_node->name = name;
      cache_node->data = expensive_data_loading_function(name);

      cache_map[name] = cache_node;
   }

   // Step 4: Move the node containing data to the front of the list.
   // Remove from curr pos
   cache_node->next->prev = cache_data->prev;
   cache_node->prev->next = cache_data->next;

   // Replace at front of list.
   cache_node->next = first_node;
   first_node->prev = cache_node;
   first_node = cache_node;

   // Step 5: Return the data.
   return cache_node->data;
}

This doesn't integrate the list and map as you plan to do, but that could be done without much hassle.

**VirtualAce** · 08-16-2007

Yes removing objects from the tree is why I have to use the list. When I remove the object from the list and thus the tree I guess I will have to rebuild the tree from that point down. Hmmm.

Wonder how I can prevent that? Thanks for the constructive input. Sometimes you get a good algo and overlook small details that break it into pieces.

Any ideas on how I can prevent the tree rebuild?

EDIT:

I thought about this at work today and realized I don't have to rebuild the tree. I will try to get some samples up and running ASAP to illustrate.

**QuestionC** · 08-17-2007

Erasing a single element from a tree is doable, it's just nice to find another method because it's a dirty business, and it is possibly more efficient to just leave all the old elements in the tree.

I wasn't aware of this before, but std::map already has an erase method.

**QuestionC** · 08-17-2007

To elaborate actually, there are two issues with removing an element from the tree.

Removing elements from the tree can hurt its balance.
If you match every cache miss with a node erasure in addition to a node addition, then you are adding a log(n) operation to what is already a log(n) operation. Now, consider...
Code:
```
2 log(n) = log(n*n)
```
So... if erasing is slower than searching, erasing and then adding to a map of ~5,000 elements is slower than just searching a list of ~25,000,000 elements.

There are benefits to removing an element from the tree as well. Faster on a cache hit, less memory devoted to the cache index, and possibly less memory allocations (since the new node can just re-use the 'erased' node's memory). Without an implementation, it's easy to make a case for both methods.

**VirtualAce** · 08-18-2007

Ok after talking with Shakti for a couple of hours on Teamspeak I think we finally have a solution or close to one.

Whether or not I use my tree class or std::map all depends on the speed of my functions vs. the speed of STL. I sorta have a feeling I will lose....but that's not the issue.

But I will use this structure:

Code:

struct ResNode
{
  CResource *pResource;
  ResNode *pNext;
  ResNode *pPrev;
};

class CResMgr
{
  protected:
    std::map<DWORD,ResNode *> m_mapResources;
...
...
};

CResource is a simple class that, like the code posted previously, just stores an ID in the base class and other resource classes then derive from this very simple base.

Cannot use std::list and cannot store vector indices in the nodes

The biggest problem with using a map and a list of IDs or vector of IDs is this. When a resource is found to be in the tree, I must move that ID to the top of the list. This means that the list or cache holds the most recently used resources. There are several problems here.

This operation is basically Tree to List. I find the ID in the tree and then somehow I must find the ID in the list to move it to the top. list.find() is most certainly not an option since it is linear in time. Storing indexes in the tree is also not an option because at some point the indices into the list will become invalid and/or will point to the wrong objects.

Because of these two limiting factors using std::list and using stored indices into a vector in the tree nodes is not an option.

Solution: Linked list inside of the tree

So take the following object IDs representing object and a cache representing the order in which they were created. Instead of calling the list a list I will now refer to it as the cache since that is what its function is.

How the cache works

The cache grows downward. This also requires the class to track the top of the cache. This is very simple. When there are no objects yet in the tree we know the first object to be added is going to be the top. When I add a new object to the tree, the top then becomes the new object (new node). The new node's next member is now the former top member. In this fashion the cache will grow upward with top always changing. I will also have to track bottom so I know which object to remove from memory. Instead of actually shifting the object IDs in the cache I will track the total size of the cache. Once the cache reaches a set limit, I automatically remove the bottommost cache ID from the tree. This does NOT guarantee this is an optimum removal for the tree...but it does manage the memory used by the game.

Sample tree and cache

Top of the cache is 5 and bottom is 7. I could not draw the tree here but it is not important to the discussion anyways.

Tree - 5 10 3 4 8 1 7
Cache - 7 1 8 4 3 10 5

Object 7 was created last and object 5 was first or most recent. Now let's say something in the game is requesting object 4.

All I really need to do is stick object 4 in front of 5 and then make up for the gap created by 4 being moved. This is a simple operation. 4's prev member is 8. Since 4 is moving, 8's next member is now 4's next member. 3's prev member is now 4's prev member. A simple linked list removal. That takes care of the cache gap.

Now to take care of the top of the cache. 5's next member is NULL since it's the top of the cache. If we set 5's next member to be 4 and 4's prev member to be 5...then we have moved 4 from it's spot in the cache to the top of the cache. We have changed the order of the cache and changed the top or root with 2 simple operations.

The beauty of this system is that I do not have to 'find' the object ID in the cache since I'm always storing it's prev and next members in the tree nodes. As long as next and prev are correct this system will work and I can use a tree to find the objects which is very fast.....and use the linked list approach to remove items. I don't have to start at the top of the cache and work through it in linear fashion because I already know the prev and next members of the current object.

Confusing? It was for me until I thought it out.

In short the cache represents the most recently used (top) to least recently used (bottom) object IDs in the tree. The objects are in the tree and as long as the ID assignment scheme is decent it should be a quick find operation to access them and return the resource to the caller.

So whaddya think? I think this is the same as a double ended queue. I have not had much exposure to that container so I don't know.

ID assignment algorithm

The next task is correct assignment of ID's to keep the tree somewhat balanced. In some of the examples that Shakti and I worked through, the map quickly became a linked list which is not what I want. Keeping the tree balanced is going to require some ingenuity. Hopefully I have enough left.

**VirtualAce** · 08-21-2007

Here is my progress so far on the system along with a CRandom class I found in Game Coding Complete, 2nd ed.

**QuestionC** · 08-21-2007

I realize you have already honed in upon a clean solution, so my comments here are purely academic. Still, I haven't gotten a chance to say some things in response to your last post.

Originally Posted by Bubba

Storing indexes in the tree is also not an option because at some point the indices into the list will become invalid and/or will point to the wrong objects.

Lists and Maps have the important property that inserting elements do not invalidate any iterators. Erasing an element also does not invalidate any iterators except, of course, for the iterator pointing at the erased element.

Meaning, either of these structures would work.

Code:

// Map and List are seperate
typedef std::list <CResource> CacheList;

typedef std::map <DWORD, CacheList::iterator> CacheMap;

Code:

// List is built-in to the Map
class CacheNode;

typedef std::map<DWORD, CacheNode> CacheMap;

class CacheNode {
   CResource * data;
   CacheMap::iterator next;
   CacheMap::iterator prev;
}

In fact, you could technically use a array-implemented linked list (or vector-implemented) because the cache size is presumably constant. Just whenever you invalidate an old element of the cache, you do not erase that node... you replace it with the new element you have to add to the cache.

Originally Posted by Bubba

ID assignment algorithm

The next task is correct assignment of ID's to keep the tree somewhat balanced. In some of the examples that Shakti and I worked through, the map quickly became a linked list which is not what I want. Keeping the tree balanced is going to require some ingenuity. Hopefully I have enough left.

Because you won't need to iterate through the tree, perhaps a hashed associative container is ideal.

**VirtualAce** · 08-22-2007

Major changes

Resources are not given unique ID's but are instead based on filenames. Even if I compile resources into one huge data file I can still give each one a unique name. This allows me to use a texture in the cache multiple times based on the name. If I based it on the ID and the ID was assigned dynamically by the resource manager, there was no way to say this mesh or that mesh used texture ID(x) since (x) was always changing.
Resources are now considered to be resource files or files in memory. When a resource is created the name is stored and the file is opened and loaded into a buffer in the class. When this is added to the resource manager it is placed into the tree and at the top of the cache. Classes needing to use this resource can query the resource manager using it's unique name. If it's there then this is a cache hit. If it's not the resource will need to be created. As an exampe a texture not in the cache would be created via D3DXCreateTextureFromFileInMemory() using the data in the class buffer for the resource. This is a cache miss for sure but cannot be completely avoided.
Cache misses are tracked via a counter and the resource manager will dynamically increase or decrease the size of the cache based on the current counts. If there is an abundance of cache misses within a certain time frame this probably means the cache is being thrashed. If this happens the size of the cache will be increased to stop the cache thrash. The time frame for determining a definite cache thrash situation will have to be determined through testing. This does NOT mean the resource cache will load more resources since this is not the function of the cache. The cache merely tracks the resources and informs other classes in the engine if the resource they are requesting is in the cache. But by increasing or decreasing the size of the cache it can control how many resources exist in memory at any one time.
The resource manager class is NOT part of the main application class. It is now a global object and is available to all classes via X3DGetResMgr(). This prevents me from having to pass the thing around to everyone who needs to use it.
The Direct3D device is no longer a part of the main application class. This is available to all via X3DGetDevice(). Again this prevents me from having to pass the device pointer around to every class that needs it.
The camera manager is NOT part of the resource cache system. It manages its own map of Camera's mapped to unique ID's created via the random class and returned to the caller for later access to the camera. Camera's are very small objects and using them in the paradigm of the cache system was just not practical. Cameras can be added, viewed, iterated, and removed at any time. Since the view is totally dependent on the camera, changing the camera will immediately change the view or render. The camera manager is a global object available to all via X3DGetCamMgr().
The sound fx engine will no longer use DirectMusic. It was far too complicated trying to map sounds to audiopaths and so forth. Instead I will write a DirectSound object class and DirectSound support classes to facilitate sound effects. They will be incorporated into the resource cache system. The music system will continue to use DirectMusic but the music system will not use the resource cache for several reasons I can't go into here.

Other changes are being made as well to streamline the engine. I will begin testing all of this as soon as I get it all working. Once this is completed I should be able to begin to build the actual game data.

This streamlines the engine down to about 3 main manager classes to handle all of the resources and is much more manageable. Upon lost device or lost focus (ALT-TAB) the cache will be flushed and the tree will be destroyed or deleted. As each object needs to access it's meshes, textures, etc. they will query the cache. Since the cache does not have the objects anymore due to the flush this will necessitate a reload of current meshes, textures, etc which will then re-validate all of the objects lost during the lost device event, rebuild the tree, and rebuild the cache. MSDN recommends a similar approach for responding to and recovering from a lost device event although they don't go into details since it is engine specific on how it is dealt with.

**fractality** · 08-28-2007

Why do you use the Random stuff?

What is sizeof(DWORD) and sizeof(CResource *) ?

It seems like you are wasting a lot of time searching for unique identifiers when you already have one at hand.

Can you post the final version of the code?

/f

**VirtualAce** · 08-28-2007

Why do you use the Random stuff?

It comes from a Merseinne Twister random number generator class. I was using it to generate unique IDs...but it is a 'pseduo' random number generator so it's not guaranteed to be unique.
I cannot just assign IDs in numerical order or the tree becomes a linked list and I lose all efficiency.
I switched it to use strings for now.

What is sizeof(DWORD) and sizeof(CResource *) ?

Not sure I'm doing sizeof(DWORD) but I 'was' using sizeof() to compute current cache size when objects were added. I've added a size member to CResource so all I need now is the sizeof the node plus the size of the resource the node represents and add/subtract that value to/from the current cache size.

It seems like you are wasting a lot of time searching for unique identifiers when you already have one at hand.

I do not have guaranteed unique IDs since I'm using a pseudo random generator.

Can you post the final version of the code?

I will as soon as it is completed.

**CornedBee** · 08-29-2007

The concept seems sound, but you are re-implementing a lot of tree and list functionality that you don't need to re-implement. Nor do you need to use the indirection forced on you if you use the STL containers.

The Boost.Intrusive library (voted into Boost, but not yet released - you can get it from SVN) does pretty much what you need: tie structs that you manage the memory of into tree or list structures.

Alternatively, if you don't want to manage the memory at all, Boost.Multi_Index actually is part of Boost right now. There you could use a hash index instead of a tree index, which in your case would probably be a gain, especially if you use strings as the keys.
This use case is already very similar to what you need. You can use the splice() method of the list index to shift elements around in the list without ever invalidating the tree/hash. If the index is a tree, the container will take care of keeping it balanced.

Edit: You can't use multi_index if an element can ever be in the tree but not in the list, though.

**CornedBee** · 08-29-2007

For the fun of it, I've written a rough implementation of this using multi_index.

Code:

// Core container
#include <boost/multi_index_container.hpp>
// Hash index.
#include <boost/multi_index/hash_index.hpp>
// Sequenced (list) index.
#include <boost/multi_index/sequenced_index.hpp>
// Key extraction.
#include <boost/multi_index/mem_fun.hpp>

namespace mi = boost::multi_index;

class CResMgr
{
	// Define the type that holds the cache.
	typedef mi::multi_index_container<
		CResource *,
		mi::indexed_by<
			mi::sequenced<>,   // A sequential index for LRU removal.
			mi::hashed_unique< // A hash index for quick lookup.
				mi::const_mem_fun<CResource, std::string, &CResource::GetID>
				> // Hashed on the return value of the GetID() method.
			>
		> cache_type;
	typedef cache_type::nth_index<1>::type cache_hash_type;
	// The actual cache.
	cache_type cache;

public:
	void Add(CResource *pObj);
	CResource* Get(const std::string &sID);
};

void CResMgr::Add(CResource *pObj)
{
	// That's it. This adds it to the front of the list and makes an entry in
	// the hash table, too.
	cache.push_front(pObj);
}

CResource* CResMgr::Get(const std::string &sID)
{
	// More complex, but not so much.

	// First, obtain a reference to the hash index.
	const cache_hash_type &hash = cache.get<1>();

	// Next, look up the resource.
	cache_hash_type::iterator hit = hash.find(sID);

	if(hit == hash.end()) {
		// Not found.
		CResource *p = LoadResource(sID);
		Add(p);
		return p;
	}

	// Now we need to translate the iterator to the hash index to the list, so
	// we can splice the resource to the front.
	cache_type::iterator lit = cache.project<0>(hit);

	// Move resource to front.
	cache.splice(cache.begin(), cache, lit);

	// Return the found resource.
	return *hit;
}

Thread: Resource manager tree

Thread Tools

Search Thread

Display

Resource manager tree

Similar Threads

Sorting out a resource manager

Interpreter.c

Binary Search Trees Part III

Request for comments

BST/Red and Black Tree