Thread: Looking for appropriate container

  1. #1
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654

    Looking for appropriate container

    'Lo all.

    I have been looking at some containers now.
    Basically, I have an algorithm that I can speed up greatly by using dynamic programming to store previously calculated results.
    So I need a map or hash container of sorts, with constant to logarithmic lookup time and low overhead.
    I've tried std::map (of course) and std::unordered_map. From a rough calculation, they use about 36 bytes per inserted item. The map itself looks something like std::map<unsigned long long, int>, so from the key/value pair alone each item should take roughly 12 bytes.
    Does anyone know of any other possible container / code to experiment with to reduce the overhead? Just reducing the overhead by a few bytes would increase the number of items that I can fit into the container greatly and give me a speed boost. Hopefully.

    I can roughly fit 29,826,161 items in 1 GB, which is not nearly enough.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  2. #2
    Registered User
    Join Date
    Mar 2010
    Posts
    109
    Just a thought, but have you tried rolling your own container? You might get better results with a barebones container than using something that's part of the STL.

  3. #3
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Yes, I have thought about it. But it's going to be a pain, as I'd either have to make a balanced tree or a hash container. I don't think I have enough motivation for that.
    My project already exceeds its requirements, so right now I'm just tinkering and having fun.

  4. #4
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    What I tried some time ago was a specialized vector. As we all know, removing from a vector invalidates all iterators past the point of removal. However, if you move the last element of the vector into the 'hole' created by the removal, notify a listener that this has happened and tell it the element's new index, and then reduce the size of the vector by one, you keep all the benefits of a vector and can remove from it without the usual iterator-invalidation issues.

    However, I ran into problems that you might be able to fix. The problem was related to iterating over the vector. Even though we think of iterators as pointers, they may not be that under the hood. When I removed an item from the end of the vector and put it into an empty index, iteration no longer worked, and I never found out why. If the iterator were just a pointer starting at the base address of the vector, it technically shouldn't care what is at index n, only that something is. Yet moving the tail element into the index where an object was just removed messed the iterators up.
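    This "move the last element into the hole" trick is often called swap-and-pop. A minimal sketch of the removal step (the names are mine, not VirtualAce's original code):

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <utility>
    #include <vector>

    // Swap-and-pop removal: O(1), but it moves the last element to
    // index i, so any external references to that element's old index
    // must be updated (the "notify a listener" step described above).
    template <typename T>
    std::size_t swap_and_pop(std::vector<T>& v, std::size_t i)
    {
        std::size_t moved_from = v.size() - 1; // old index of the element that fills the hole
        v[i] = std::move(v[moved_from]);
        v.pop_back();
        return moved_from; // caller notifies listeners: element at moved_from is now at i
    }
    ```

    The listener notification then just maps the returned old index to the new one.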
    Last edited by VirtualAce; 07-21-2010 at 01:27 AM.

  5. #5
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Is insertion time important? If not, use a plain old sorted vector. std::lower_bound will give you O(log N) lookups in a sorted vector, as well as finding the appropriate location to insert a new element, and insertion is O(N). And the space overhead is almost nothing.
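    A minimal sketch of that sorted-vector map, with key and value types chosen to match the thread's std::map<unsigned long long, int> (the names are illustrative):

    ```cpp
    #include <algorithm>
    #include <cstdint>
    #include <utility>
    #include <vector>

    // Sorted vector as a low-overhead map: each entry costs exactly
    // sizeof(std::pair<std::uint64_t, int>), with no per-node pointers.
    using Entry = std::pair<std::uint64_t, int>;

    struct KeyLess {
        bool operator()(const Entry& e, std::uint64_t k) const { return e.first < k; }
    };

    // O(log N) lookup; returns nullptr if the key is absent.
    inline const int* lookup(const std::vector<Entry>& v, std::uint64_t key)
    {
        auto it = std::lower_bound(v.begin(), v.end(), key, KeyLess{});
        return (it != v.end() && it->first == key) ? &it->second : nullptr;
    }

    // O(N) insertion: lower_bound finds the spot, insert shifts the tail.
    inline void insert(std::vector<Entry>& v, std::uint64_t key, int value)
    {
        auto it = std::lower_bound(v.begin(), v.end(), key, KeyLess{});
        if (it != v.end() && it->first == key)
            it->second = value;   // key already present: just update
        else
            v.insert(it, Entry{key, value});
    }
    ```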
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  6. #6
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by brewbuck View Post
    Is insertion time important? If not, use a plain old sorted vector. std::lower_bound will give you O(log N) lookups in a sorted vector, as well as finding the appropriate location to insert a new element, and insertion is O(N). And the space overhead is almost nothing.
    That may be a good starting point. You can improve on the insertion time a bit, slightly affecting the lookup time, but keeping it log(n), by dividing up the items into two portions: Sorted items of size n, and unsorted items of at most log(n) in size. It all depends on whether you're going to be okay with amortised time operations.
    Once the unsorted portion reaches log(n) in size, you sort that portion and then perform a merge of the sorted and unsorted portions.
    To perform a lookup, you first do a binary search on the sorted portion, and if that fails, you perform a linear search on the remaining unsorted portion.

    Lookup time: This consists of log(n) binary search plus log(n) linear search.
    log(n) + log(n) = 2*log(n) therefore overall we have O(log(n))

    Insertion time: This is amortised so we'll calculate the time for log(n) insertions instead and then divide that by log(n) afterwards to get the amortised result.
    Let k = log(n). We have k inserts that take constant time, followed by a sort of O(k*log(k)), plus a merge taking O(n+k) operations.
    Summing it up: k + k*log(k) + n + k
    Substitute k with log(n) again gives: 2*log(n) + log(n)*log(log(n)) + n
    Simplify: n + log(n)*(log(log(n)) + 2)
    Amortise by dividing by log(n) gives: n/log(n) + log(log(n)) + 2
    Now the last two terms fall away and we get O(n/log(n)) amortised insertion, somewhat better than O(n). (Around 20 times better with a million items.)

    I've been tempted to create such a container several times recently. This can all be held in a single vector with just one extra index to remember the cutoff between the sorted and unsorted portions. I should note that storing up to sqrt(n) items in the unsorted portion further reduces the amortised insertion time to O(sqrt(n)), but it also unfortunately pushes the lookup time out to O(sqrt(n)) as well.
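    A sketch of the scheme described above, holding both portions in a single vector with one cutoff index (all names are illustrative, and this version doesn't collapse duplicate keys):

    ```cpp
    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <cstdint>
    #include <utility>
    #include <vector>

    // One vector, two portions: [0, sorted_end) is kept sorted,
    // [sorted_end, size) is an unsorted overflow of roughly log2(n) items.
    struct HybridMap {
        using Entry = std::pair<std::uint64_t, int>;
        std::vector<Entry> items;
        std::size_t sorted_end = 0;

        void insert(std::uint64_t key, int value)
        {
            items.push_back(Entry{key, value});      // O(1) amortised append
            std::size_t unsorted = items.size() - sorted_end;
            std::size_t limit = static_cast<std::size_t>(
                std::log2(static_cast<double>(sorted_end + 2))) + 1;
            if (unsorted >= limit) {                 // overflow full: sort it, then merge
                std::sort(items.begin() + sorted_end, items.end());
                std::inplace_merge(items.begin(), items.begin() + sorted_end, items.end());
                sorted_end = items.size();
            }
        }

        const int* lookup(std::uint64_t key) const
        {
            // Binary search on the sorted portion...
            auto end = items.begin() + sorted_end;
            auto it = std::lower_bound(items.begin(), end, Entry{key, 0},
                [](const Entry& a, const Entry& b) { return a.first < b.first; });
            if (it != end && it->first == key) return &it->second;
            // ...then a linear scan of the small unsorted tail.
            for (auto p = end; p != items.end(); ++p)
                if (p->first == key) return &p->second;
            return nullptr;
        }
    };
    ```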
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  7. #7
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    Can you give us an example of the cached data?

    Soma

  8. #8
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Short description:
    It's a minimax algorithm, so I cache the different board configurations and how many points each move is worth.
    Code:
        #define SCORE_LOOKUP_MAP_T std::map<typename ::Board<NDim1, NDim2>::FriendlyMapInstance_t, int>
    
        struct SMinMaxVolatileData
        {
    //...
            SCORE_LOOKUP_MAP_T& ScoreLookupMap;
    //...
        };
    
                auto it = VolatileData.ScoreLookupMap.find(VolatileData.BoardData);
                InMap = (it != VolatileData.ScoreLookupMap.end());
    
                if (InMap)
                {
                    Score = it->second;
                }
    
                    if (Level <= SharedData.MaxCacheLevelDepth)
                    {
                        bool IsPlaceInCache = Stuff::Process::Memory::GetUsage() < SharedData.MaxMemoryUsage;
                        if (IsPlaceInCache)
                        {
                            VolatileData.Board.GetFriendlyMapInstance(VolatileData.BoardData);
                            VolatileData.ScoreLookupMap[VolatileData.BoardData] = Score;
                        }
                    }
    
    	void GetFriendlyMapInstance(FriendlyMapInstance_t& Data)
    	{
    		Data = 0;
    		for (uint x = 0; x < Width; x++)
    		{
    			for (uint y = 0; y < Height; y++)
    			{
    				auto Player = m_Board[x][y].GetPlayer();
    				if (Player)
    				{
    					// Widen before shifting; an int shifted past bit 31 is undefined behaviour.
    					FriendlyMapInstance_t PlayerNum = Player->GetPlayerNumber();
    					PlayerNum <<= ((x * Height + y) * 2);
    					Data |= PlayerNum;
    				}
    			}
    		}
    	}
    
    	typedef typename Stuff::Traits::LeastBits<Width * Height * 2>::type FriendlyMapInstance_t;
    FriendlyMapInstance_t is the smallest integer type that has at least board width * board height * 2 bits. For a 5x5 board, that's a 64-bit integer.
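    The LeastBits trait itself isn't shown; a plausible sketch of such a trait (a guess at the idea, not the actual Stuff::Traits implementation) is:

    ```cpp
    #include <cstdint>
    #include <type_traits>

    // Pick the smallest unsigned integer type with at least Bits bits.
    template <unsigned Bits>
    struct LeastBits {
        using type =
            typename std::conditional<Bits <= 8,  std::uint8_t,
            typename std::conditional<Bits <= 16, std::uint16_t,
            typename std::conditional<Bits <= 32, std::uint32_t,
                                                  std::uint64_t>::type>::type>::type;
    };
    ```

    For a 5x5 board, 5 * 5 * 2 = 50 bits, which selects a 64-bit integer, matching the description above.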

    Long description:
    Here is the current working set code (it's not the prettiest, I know):
    Code:
        typedef unsigned int uint;
        #define SCORE_LOOKUP_MAP_T std::map<typename ::Board<NDim1, NDim2>::FriendlyMapInstance_t, int>
    
    #pragma warning(push)
    #pragma warning(disable: 4512 4610 4510)
        template<uint NDim1, uint NDim2>
        struct SMinMaxSharedData
        {
            const Player& Computer;
            const Player& Human;
            const uint MaxDepth;
            const uint MaxMemoryUsage;
            const uint MaxCacheLevelDepth;
            const uint WinPoints;
            const uint AdjacentSquarePoints;
            const uint BlockOpponentSquarePoints;
        };
            
        template<uint NDim1, uint NDim2, typename BoardData_t>
        struct SMinMaxVolatileData
        {
            Board<NDim1, NDim2>& Board;
            SCORE_LOOKUP_MAP_T& ScoreLookupMap;
            Coordinates OutCoord;
            BoardData_t BoardData;
        };
    #pragma warning(pop)
    
        template<uint NDim1, uint NDim2>
        void GetBestMove(const Board<NDim1, NDim2>& board, Player& Player1, Player& Player2, int PlayerToGetMove, Coordinates& coord, uint MaxDepth, uint MaxCacheItems, uint MaxCacheLevelDepth,
            uint WinPoints, uint AdjacentSquarePoints, uint BlockOpponentSquarePoints
        )
        {
            //static_assert(NDim1 <= 4 && NDim2 <= 4, "The board is too big. It would require a computer from outer space to calculate the AI for this board size. Select 4x4 or lower.");
    
            typedef Board<NDim1, NDim2> board_t;
            Player* Players[2];
    
            switch (PlayerToGetMove)
            {
                case 1:
                    Players[0] = &Player1;
                    Players[1] = &Player2;
                    break;
                case 2:
                    Players[0] = &Player2;
                    Players[1] = &Player1;
                    break;
                default: assert(false);
            }
    
    #ifdef USE_STATIC_MAP
    #define STATIC static
    #else
    #define STATIC
    #endif
        STATIC SCORE_LOOKUP_MAP_T ScoreLookupMap1;
        STATIC SCORE_LOOKUP_MAP_T ScoreLookupMap2;
    #undef STATIC
    
            SMinMaxSharedData<NDim1, NDim2> SharedData = { *Players[0], *Players[1], MaxDepth, MaxCacheItems, MaxCacheLevelDepth, WinPoints, AdjacentSquarePoints,
                BlockOpponentSquarePoints };
    
            Board<NDim1, NDim2> Board1(board);
            Board<NDim1, NDim2> Board2(board);
            SMinMaxVolatileData<NDim1, NDim2, typename Board<NDim1, NDim2>::FriendlyMapInstance_t> VolatileData1 = { Board1, ScoreLookupMap1 };
            SMinMaxVolatileData<NDim1, NDim2, typename Board<NDim1, NDim2>::FriendlyMapInstance_t> VolatileData2 = { Board2, ScoreLookupMap2 };
    
            int Score1 = std::numeric_limits<int>::min(), Score2 = std::numeric_limits<int>::min();
            bool Break = false;
    
            unsigned int Range1End;
            Range1End = NDim1 * NDim2 - 1;
    
            boost::thread Thread1([&]() { Score1 = GetBestMove(SharedData, VolatileData1, *Players[0], 1, std::numeric_limits<int>::min(), std::numeric_limits<int>::max(), 0, Range1End, 0, 100); });
    
            boost::thread MergingThread([&]()
            {
                do
                {
                    Progress = VolatileData1.Progress + VolatileData2.Progress;
                    CacheHits = VolatileData1.CacheHits + VolatileData2.CacheHits;
                    CacheMisses = VolatileData1.CacheMisses + VolatileData2.CacheMisses;
                    CacheSize = VolatileData1.CacheSize + VolatileData2.CacheSize;
                    PruneHits = VolatileData1.PruneHits + VolatileData2.PruneHits;
                    TotalNodesExamined = VolatileData1.TotalNodesExamined + VolatileData2.TotalNodesExamined;
                    Sleep(500);
                }
                while (!Break);
            });
    
            Thread1.join();
    
            Break = true;
            MergingThread.join();
    
            if (Score1 > Score2)
                coord = VolatileData1.OutCoord;
            else
                coord = VolatileData2.OutCoord;
        }
    
        template<int NDim1, int NDim2, typename BoardData_t>
        int GetBestMove(SMinMaxSharedData<NDim1, NDim2>& SharedData, SMinMaxVolatileData<NDim1, NDim2, BoardData_t>& VolatileData, const Player& CurrentPlayer, 
            uint Level, int Alpha, int Beta, unsigned int StartPos, unsigned int EndPos
        )
        {
            bool IsComputer = CurrentPlayer == SharedData.Computer;
            bool IsHuman = CurrentPlayer == SharedData.Human;
            assert( (IsComputer && !IsHuman) || (IsHuman && !IsComputer) );
    
            if (Level == SharedData.MaxDepth || VolatileData.Board.IsDraw())
                return 0;
    
            if ( VolatileData.Board.CheckForWinner(CurrentPlayer) )
                return IsComputer ? 1000 : -1000;
    
            typedef const Board<NDim1, NDim2> Board_t;
            int BestScoreSoFar = IsComputer ? std::numeric_limits<int>::min() : std::numeric_limits<int>::max();
            const int& _Alpha = IsComputer ? BestScoreSoFar : Alpha;
            const int& _Beta = IsComputer ? Beta : BestScoreSoFar;
            Coordinates BestMoveSoFar(-1, -1);
    
            for (CCoordinateSystem xy(Board_t::Width - 1, 0, Board_t::Height - 1, 0, StartPos); xy.GetPosition() <= EndPos; ++xy)
            {
                if (!VolatileData.Board.IsSquareEmpty( xy.GetCoordinates() ))
                    continue;
    
                VolatileData.Board.SetSquare(xy.GetCoordinates(), CurrentPlayer);
    
                int Score = 0;
                bool InMap = false;
                VolatileData.Board.GetFriendlyMapInstance(VolatileData.BoardData);
    
                auto it = VolatileData.ScoreLookupMap.find(VolatileData.BoardData);
                InMap = (it != VolatileData.ScoreLookupMap.end());
    
                if (InMap)
                {
                    Score = it->second;
                }
    
                if (!InMap)
                {
                    Score = detail::FindNumSurroundingData(VolatileData.Board, CurrentPlayer, xy.GetCoordinates()) * (IsComputer ? 1 : -1);
                    const Player& OppositePlayer = IsComputer ? SharedData.Human : SharedData.Computer;
    
                    if ( VolatileData.Board.CheckForWinner(CurrentPlayer) )
                        Score += IsComputer ? 1000 : -1000;
                    else
                        Score += GetBestMove(SharedData, VolatileData, OppositePlayer, Level + 1, _Alpha, _Beta, 0, xy.GetMaxPoint());
    
                    if (Level <= SharedData.MaxCacheLevelDepth)
                    {
                        bool IsPlaceInCache = Stuff::Process::Memory::GetUsage() < SharedData.MaxMemoryUsage;
                        if (IsPlaceInCache)
                        {
                            VolatileData.Board.GetFriendlyMapInstance(VolatileData.BoardData);
                            VolatileData.ScoreLookupMap[VolatileData.BoardData] = Score;
                        }
                    }
                }
    
                VolatileData.Board.UndoLastMove();
    
                if ((IsComputer && Score > BestScoreSoFar) || (IsHuman && Score < BestScoreSoFar))
                {
                    BestScoreSoFar = Score;
                    BestMoveSoFar = xy.GetCoordinates();
                }
                    
                if ( (IsHuman && _Beta <= _Alpha) || (IsComputer && _Alpha >= _Beta) )
                {
                    goto ExitLoop;
                }
            }
    
    ExitLoop:
            assert(BestMoveSoFar != Coordinates(-1, -1));
            VolatileData.OutCoord = BestMoveSoFar;
            return BestScoreSoFar;
        }
    };
    
    	void GetFriendlyMapInstance(FriendlyMapInstance_t& Data)
    	{
    		Data = 0;
    		for (uint x = 0; x < Width; x++)
    		{
    			for (uint y = 0; y < Height; y++)
    			{
    				auto Player = m_Board[x][y].GetPlayer();
    				if (Player)
    				{
    					// Widen before shifting; an int shifted past bit 31 is undefined behaviour.
    					FriendlyMapInstance_t PlayerNum = Player->GetPlayerNumber();
    					PlayerNum <<= ((x * Height + y) * 2);
    					Data |= PlayerNum;
    				}
    			}
    		}
    	}
    
    	typedef typename Stuff::Traits::LeastBits<Width * Height * 2>::type FriendlyMapInstance_t;
    I have no idea what would help or not. But I will test. We'll see what is more effective.

  9. #9
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    So this is a transposition table? I've never seen them implemented with anything other than a direct hash. Collisions are handled by replacement of the older entry if appropriate.

    Once you get the table working, it's a short leap to doing some rudimentary move ordering, and then you can start doing negamax alpha beta instead of the "toy" minimax algorithm.
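    A direct-hash ("always replace") transposition table can be sketched in a few lines; this is a generic illustration, not code from the thread:

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Fixed-size transposition table: memory use is set once up front,
    // and a collision simply overwrites the older entry ("always replace").
    struct TranspositionTable {
        struct Entry {
            std::uint64_t key = 0;   // full board key, to detect collisions
            int score = 0;
            bool used = false;
        };
        std::vector<Entry> slots;

        explicit TranspositionTable(std::size_t size) : slots(size) {}

        void store(std::uint64_t key, int score)
        {
            Entry& e = slots[key % slots.size()];
            e.key = key;             // any older entry at this slot is overwritten
            e.score = score;
            e.used = true;
        }

        bool probe(std::uint64_t key, int& score) const
        {
            const Entry& e = slots[key % slots.size()];
            if (e.used && e.key == key) { score = e.score; return true; }
            return false;
        }
    };
    ```

    Unlike std::map, the table never grows: a bigger cache is traded for losing some older entries on collision, which is usually an acceptable trade in game-tree search.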

  10. #10
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Clearly it is fast, but I am wondering if it's possible to increase the speed even more? Squeeze out more performance as it is, and make deeper levels feasible.

  11. #11
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Elysia View Post
    Clearly it is fast, but I am wondering if it's possible to increase the speed even more? Squeeze out more performance as it is, and make deeper levels feasible.
    There is no improvement to the table you could make that would even come close to the speedup you'd see by switching to alpha-beta search. Don't be afraid of it, it's not difficult.

    I'm not sure I'd really recommend such a table for straight, fixed-depth minimax unless your game states really do have lots of possible transpositions. The table is useful in more ways than just caching evaluations: it can be used for move ordering and for causing quick cutoffs in the less naive search methods like negascout or MTD(f).
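    For reference, negamax folds the min and max branches into one by negating the score at each level. A bare-bones negamax with alpha-beta over an explicit game tree (the Node type here is a stand-in, not the poster's board code):

    ```cpp
    #include <algorithm>
    #include <vector>

    // Bare-bones negamax with alpha-beta over an explicit game tree.
    // Each leaf's value is from the perspective of the player to move at that leaf.
    struct Node {
        int value;                     // leaf evaluation
        std::vector<Node> children;    // empty => leaf
    };

    int negamax(const Node& n, int alpha, int beta)
    {
        if (n.children.empty())
            return n.value;
        int best = -1000000;
        for (const Node& child : n.children) {
            // The opponent's best score, negated, is our score for this move.
            int score = -negamax(child, -beta, -alpha);
            best = std::max(best, score);
            alpha = std::max(alpha, score);
            if (alpha >= beta)
                break;                 // cutoff: the opponent won't allow this line
        }
        return best;
    }
    ```

    The single negated recursive call replaces the IsComputer/IsHuman branching in the minimax code above, which is why negamax tends to be shorter and less error-prone.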

  12. #12
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    I already have alpha-beta pruning, as you can see. They complement each other and are necessary in order to gain enough speedups.
    Alpha-beta pruning in itself isn't enough to go 16 levels. With the complement of some dynamic programming, it is.
    Now I'm trying to see how deep I can go within a reasonable time.
    This isn't some difficult chess game. It's a simple X-in-a-row where X ∈ [3, 99].

  13. #13
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Elysia View Post
    I already have alpha-beta pruning, as you can see. They complement each other and are necessary in order to gain enough speedups.
    Alpha-beta pruning in itself isn't enough to go 16 levels. With the complement of some dynamic programming, it is.
    Now I'm trying to see how deep I can go within a reasonable time.
    This isn't some difficult chess game. It's a simple X-in-a-row where X ∈ [3, 99].
    Why are you calling it minimax then?

    The transposition table is a performance enhancement, but I wouldn't expect miracles from it. Like I said, unless the game states actually have transpositions, it isn't really helping you unless it's leading to higher-level cutoffs. Have you tried measuring how many cutoffs you are getting? And are you ordering your move lists before starting the alpha-beta process? If not, you're missing most of the benefits.

    EDIT: On looking at the code, it seems like you aren't doing any ordering. You are just generating the moves in an algorithmically simple order and searching in that order. Alpha-beta gives an O(sqrt(N)) speedup, with N being the total number of searched positions, but ONLY IF the moves are ordered.
    Last edited by brewbuck; 07-21-2010 at 12:21 PM.

  14. #14
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by brewbuck View Post
    Why are you calling it minimax then?
    Because the algorithm is called minmax, no? Alpha-beta pruning is an optimization.

    I have collected some statistics. I do get cutoffs. The number of cutoffs is roughly 4% of the total number of examined nodes. What does that actually mean, though? I really have no idea. Since I don't count the total nodes when getting a cache hit, and I don't know how many nodes the alpha-beta cuts off, I cannot count how many nodes are actually pruned.

    But this ordering... I'm afraid I don't know why it would help or how it works.
    It is just my first AI algorithm.

  15. #15
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Elysia View Post
    I have collected some statistics. I do get cutoffs. The number of cutoffs is roughly 4% of the total number of examined nodes. What does that actually mean, though? I really have no idea. Since I don't count the total nodes when getting a cache hit, and I don't know how many nodes the alpha-beta cuts off, I cannot count how many nodes are actually pruned.
    Counting the exact number of nodes pruned is really looking at it in too fine detail. Just estimate it. If you know the approximate branching factor B for a given move sequence number, you can use B^(D-1) to predict how many moves would be evaluated at depth D. So if you get a cutoff at depth D, you saved approximately B^(D-1) evaluations.

    Quote Originally Posted by Elysia View Post
    But this ordering... I'm afraid I don't know why it would help or how it works.
    It is just my first AI algorithm.
    The ordering tries to put high-value moves first, so that they will cause earlier cutoffs. As these early cutoffs stack on top of each other at various depths, the effect is to reduce the branching factor from B to approximately sqrt(B) which lets you search twice as deep in the same amount of time. The estimates have to be legitimate, not just random values. That's where the table comes in -- as you are filling in the table you are storing the last-known values of those positions and you can use those to order the moves in a following iteration.
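    The ordering step can be as simple as sorting the candidate moves by their cached scores, best first, before searching them (Move and the table layout here are placeholders, not the thread's types):

    ```cpp
    #include <algorithm>
    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    struct Move {
        std::uint64_t resulting_position; // key of the board after this move
    };

    // Sort moves so positions with the best cached score come first;
    // unseen positions (no table entry) sort last via a pessimistic default.
    inline void order_moves(std::vector<Move>& moves,
                            const std::unordered_map<std::uint64_t, int>& table)
    {
        auto cached = [&](const Move& m) {
            auto it = table.find(m.resulting_position);
            return it != table.end() ? it->second : -1000000;
        };
        std::stable_sort(moves.begin(), moves.end(),
            [&](const Move& a, const Move& b) { return cached(a) > cached(b); });
    }
    ```

    The scores don't need to be exact; values stored during a previous, shallower iteration are typically good enough to trigger the early cutoffs described above.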
