Performance issues

**jimzy** · 09-11-2007

Hello there!
Long story short -- I'm currently working on a project where I have to implement a simple board game and a few algorithms that work with it. One of the first things I had to write was a move generator (which is supposed to find all moves available to the selected player from a given game state).
Problems started when I began using it to generate quite large numbers of moves... say 100'000. It was running incredibly slowly... To start off, I removed the vectors I had been using everywhere, like this one:

Code:

vector <OldGameStateClass*>* stateHolder;

...and replaced them with a linked list class like this:

Code:

class GameState {
private:
	int value;
	signed char** currBoard;
	GameState* childrenList;
	GameState* nextElem;
};

This class holds a pointer to the next element (another move reachable from the parent level) and also a pointer to a new list (children): the moves available from this game state.

It slightly improved the whole generator, but it was still too slow.

Later, I changed ints to signed chars, as I was copying arrays very often -- it also helped a bit. Doing some tests later, I found out that the copyBoard method was the major time consumer.
And this is how it looks:

Code:

signed char** copyBoard(signed char **currBoard)
{
	signed char **board = new signed char*[13];
	for (int i = 0; i < 13; i++) {
		board[i] = new signed char[8];	
		for (int j = 0; j < 8; j++)
			board[i][j] = currBoard[i][j];
	}
	return board;
}

Generating 100'000 moves means calling this 100'000 times, and each call makes 14 allocations (one for the row pointers plus 13 rows) on top of copying the values (the copying doesn't seem to be much of a problem; it's rather the memory allocation that is slow).

Are there faster ways of doing this? Or perhaps I should store the data differently (static arrays in the class, maybe)?
Also, is there a way to make vector less slow? The push_back operation seems to take a lot of time, but using vector is quite convenient. I also tried deque, but it seemed slow too.

If anybody has had similar problems or knows the best way of storing & copying many small game boards - I'll be thankful for any help :-)

**brewbuck** · 09-11-2007

Originally Posted by jimzy

If anybody has had similar problems or knows the best way of storing & copying many small game boards - I'll be thankful for any help :-)

Why copy the board at all? In all the minimax games I've ever written, I did move generation by temporarily modifying the board, then changing it back. If you remember what move you just made, it's pretty easy to undo it.

I never had zillions of copies of boards flying around...

EDIT: Also, vector.push_back should not be slow. Why do you believe that it is?

**jimzy** · 09-11-2007

Originally Posted by brewbuck

Why copy the board at all? In all the minimax games I've ever written, I did move generation by temporarily modifying the board, then changing it back. If you remember what move you just made, it's pretty easy to undo it.

I've been thinking about that too... but I've been too busy debugging the generator xP. Well, it seems to be the best solution I've got so far.

Originally Posted by brewbuck

EDIT: Also, vector.push_back should not be slow. Why do you believe that it is?

In the very early debugging stage (when I didn't yet know it was about memory allocation), I was googling for push_back performance, and somebody posted that vector reallocates its whole memory from time to time (once the default memory it has gets used up, it frees the whole block it used and allocates a new, bigger chunk -- that's more or less what the person was saying).
Anyway, I switched to a linked list, which seemed to give a small gain in time, not too big though :x

One more thing: will the whole saving of moves, updating, undoing and so on be faster in the end than the allocations (I guess I can test it on my own anyway)?

Thanks brewbuck :-)

**Daved** · 09-11-2007

Read this from Stroustrup. In particular, look at the last paragraph of that section.

http://www.research.att.com/~bs/bs_f...low-containers

Of course, if std::list proves to be faster for your application, by all means use it.

**brewbuck** · 09-11-2007

Originally Posted by jimzy

In the very early debugging stage (when I didn't yet know it was about memory allocation), I was googling for push_back performance, and somebody posted that vector reallocates its whole memory from time to time (once the default memory it has gets used up, it frees the whole block it used and allocates a new, bigger chunk -- that's more or less what the person was saying).

That's pretty much correct, but it doesn't have as big an impact as you might think. When the vector grows, it DOUBLES in size (or grows by some other constant factor); it does not expand linearly. This amortizes the cost of reallocation over many insertions, so push_back runs in amortized O(1) time: an individual call that triggers a reallocation is O(n), but the average over a long run of calls is constant.

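Another important thing to remember with vectors is to pre-allocate if possible. If you know, for instance, that the vector will hold anywhere from 1 to 1000 elements, then when you create the vector in the first place, reserve enough space for 1000 elements. This guarantees it will never have to reallocate as long as it stays within that size. Note that reserve() only allocates capacity; constructing the vector with a size, as in std::vector<whatever> big_vector(1000), would instead create 1000 default-constructed elements, and push_back would append after them:

Code:

std::vector<whatever> big_vector;
big_vector.reserve(1000);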
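
A minimal sketch of the effect of reserving up front, assuming int elements and a made-up helper name (count_reallocations is not from the thread). It counts how often the vector's capacity changes while elements are pushed:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: pushes n ints and returns how many times the
// vector's capacity changed (i.e. how many reallocations happened).
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);  // allocate capacity once, size stays 0
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set, the count stays at zero. Without it, the exact count depends on the library's growth factor, but it grows only logarithmically with n.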

Originally Posted by jimzy

Anyway, I switched to a linked list, which seemed to give a small gain in time, not too big though :x

A linked list supports O(1) insertions at the end with no reallocation, so it is theoretically more efficient than a vector in that regard. But there are always tradeoffs. Every insertion allocates a separate node, and while a vector holds its data contiguously in memory, the elements of a linked list can be scattered all over the place. That can hurt performance by itself, because it reduces the efficiency of the CPU's memory cache.

Originally Posted by jimzy

One more thing: will the whole saving of moves, updating, undoing and so on be faster in the end than the allocations (I guess I can test it on my own anyway)?

Unless the moves are terribly complex, it is almost guaranteed that "doing/undoing" is faster. Store the move data in a class called Move, or something, and make sure you store enough information to undo the move. For instance, if you're talking chess, you'd just store the move as it is normally annotated -- where the piece came from, and where it went to.
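
A minimal sketch of the do/undo idea, under loud assumptions: the Board and Move layouts, the function names and the chess-style "replace what stands on the target cell" capture are all made up for illustration and are not Bivouac's actual rules. The point is only that the move record keeps what undo needs:

```cpp
#include <cassert>

const int ROWS = 13, COLS = 8;   // board size mentioned in the thread
const signed char EMPTY = 0;     // hypothetical encoding of an empty cell

struct Board {
    signed char cell[ROWS][COLS];
};

// Hypothetical move record: source, destination, and whatever was
// overwritten on the destination so the move can be reverted.
struct Move {
    int fromRow, fromCol;
    int toRow, toCol;
    signed char captured;  // filled in by makeMove
};

// Applies the move in place and remembers the overwritten cell.
void makeMove(Board& b, Move& m) {
    m.captured = b.cell[m.toRow][m.toCol];
    b.cell[m.toRow][m.toCol] = b.cell[m.fromRow][m.fromCol];
    b.cell[m.fromRow][m.fromCol] = EMPTY;
}

// Restores the board to the state before makeMove(b, m).
void undoMove(Board& b, const Move& m) {
    b.cell[m.fromRow][m.fromCol] = b.cell[m.toRow][m.toCol];
    b.cell[m.toRow][m.toCol] = m.captured;
}
```

A search would call makeMove, recurse, then call undoMove, so a single board is reused and nothing is allocated per move.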

Originally Posted by jimzy

Thanks brewbuck :-)

I love coding zero-sum games. What game is this? A 13x8 board?

**CornedBee** · 09-11-2007

Don't allocate the board as an array of pointers to arrays of values. Allocate one block of memory and do the offset transformation yourself, or use Boost.MultiArray (boost::multi_array) to do it for you.
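
A minimal sketch of the single-block layout, assuming a hypothetical FlatBoard class (the name and interface are illustrative, not from the thread). The 2D index is turned into an offset by hand as row * cols + col:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical board stored as one contiguous block, row by row.
class FlatBoard {
public:
    FlatBoard(int rows, int cols)
        : rows_(rows), cols_(cols),
          cells_(static_cast<std::size_t>(rows) * cols,
                 static_cast<signed char>(0)) {}

    // Offset transformation: (row, col) -> row * cols + col.
    signed char& at(int row, int col) { return cells_[index(row, col)]; }
    signed char at(int row, int col) const { return cells_[index(row, col)]; }

    int rows() const { return rows_; }
    int cols() const { return cols_; }

private:
    std::size_t index(int row, int col) const {
        return static_cast<std::size_t>(row) * cols_ + col;
    }

    int rows_, cols_;
    std::vector<signed char> cells_;  // one allocation for the whole board
};
```

Copying such a board costs one allocation and one block copy instead of 14 allocations, and the compiler-generated copy constructor already does the right thing.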

**jimzy** · 09-11-2007

@Daved: thanks for the link :-)

I assume I should use reserve() then -- I can estimate the size of vector that will be needed (if I decide to switch back to vectors that is).

Originally Posted by brewbuck

Unless the moves are terribly complex, it is almost guaranteed that "doing/undoing" is faster. Store the move data in a class called Move, or something, and make sure you store enough information to undo the move. For instance, if you're talking chess, you'd just store the move as it is normally annotated -- where the piece came from, and where it went to.

They aren't. Mostly moving to adjacent cells / sliding; the most complex are captures ("beatings"), especially multiple ones. Oh, and the board is in fact 11x6; I'm using 13x8 to create a border... a way of avoiding an out-of-range index check every time.
And the game is Bivouac. Some resources in case you're interested:
http://www.zillionsofgames.com/cgi-b...do=show;id=113
http://www.di.fc.ul.pt/~jpn/gv/bivouac.htm
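
A minimal sketch of the border trick described above, with assumptions spelled out: the BORDER and EMPTY values, the Board struct and the orthogonal-neighbour check are hypothetical and only illustrate why the extra ring of cells removes the range checks.

```cpp
#include <cassert>

const int ROWS = 13, COLS = 8;    // 11x6 playable area plus a one-cell border
const signed char EMPTY = 0;      // hypothetical encoding
const signed char BORDER = 127;   // hypothetical sentinel for border cells

struct Board {
    signed char cell[ROWS][COLS];
};

// Marks the outer ring as BORDER and everything inside as EMPTY.
void initBoard(Board& b) {
    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c) {
            bool edge = (r == 0 || r == ROWS - 1 || c == 0 || c == COLS - 1);
            b.cell[r][c] = edge ? BORDER : EMPTY;
        }
    }
}

// Counts empty orthogonal neighbours of a playable cell (r, c).
// No index checks are needed: stepping off the playable area lands
// on a BORDER cell, which simply is not EMPTY.
int countFreeNeighbours(const Board& b, int r, int c) {
    static const int dr[4] = {-1, 1, 0, 0};
    static const int dc[4] = {0, 0, -1, 1};
    int n = 0;
    for (int k = 0; k < 4; ++k) {
        if (b.cell[r + dr[k]][c + dc[k]] == EMPTY) {
            ++n;
        }
    }
    return n;
}
```

This only holds for cells inside the playable area (rows 1 to 11, columns 1 to 6) and for steps of one cell; longer jumps would need a wider border.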

@CornedBee: what's the gain if I do so? Do you mean only if I keep copying the board, or should I use such an array anyway?

Thanks for replies guys!

**Sang-drax** · 09-11-2007

You should definitely do/undo moves on one board instead of allocating new boards, if that's what you're doing.

**matsp** · 09-12-2007

And if you MUST copy the board for some reason, make sure your board is one contiguous 2D array in memory. That way you can copy it with a single memcpy, which is probably 3-5 times faster than your method.
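
A minimal sketch of that kind of copy, assuming the board is a true 2D array of 13x8 signed chars (the function signature here is a hypothetical replacement for the copyBoard from the first post, not the original):

```cpp
#include <cassert>
#include <cstring>

const int ROWS = 13, COLS = 8;

// Copies a contiguous 13x8 board in one call. Taking the arrays by
// reference keeps their full type, so sizeof(src) is the size of the
// whole board (ROWS * COLS bytes), not the size of a pointer.
void copyBoard(signed char (&dst)[ROWS][COLS],
               const signed char (&src)[ROWS][COLS]) {
    std::memcpy(dst, src, sizeof(src));
}
```

No memory is allocated here at all; the destination can live inside another object or on the stack.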

--
Mats

**jimzy** · 09-12-2007

@matsp: thanks for the tip.

As suggested, I gave up on using whole boards and created a Move class, which looks like this:

Code:

class Move {
private:
	int oldX;
	int oldY;
	int newX;
	int newY;
	Move* parentMove;
	Move* beatenPawns;
	vector <Move*>* childrenMoves;
};

And did some first performance tests. Like previously, I generated 100'000 new game states.
Using std::list -- 7000 ms.
Using std::vector (with reserve() call) -- 5000 ms.
And using my custom linked list class described in the first post, it took 1500 ms.
Both vector and list push_back are slower, it seems, so I guess I'll stick to my own list.

Maybe I should also remove the vector from the Move class (considering push_back will be used there as well) and replace it with a list..?

**anon** · 09-12-2007

If there is such a wild difference between the std containers and your custom linked list, are you sure that a) you are timing them with an optimized build (std containers can benefit very much from compiler optimizations), b) your own linked list implementation is working properly (e.g. no memory leaks), and c) you were not seriously misusing the std containers (e.g. telling the vector to resize each time you added something)? In addition, it seems that the std containers held different classes in your earlier code, and it's not clear why they would hold pointers instead of the objects themselves (the elements go into heap memory one way or another).
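
A rough sketch of how such a comparison could be set up, assuming a C++11 compiler for std::chrono. The State struct and the fillAndTimeMs helper are made up for illustration; the containers hold the objects by value, as suggested above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical element type stored by value (no pointers, no new).
struct State {
    int value;
    signed char board[13][8];
};

// Appends n States to the container and returns the elapsed time in ms.
template <typename Container>
long long fillAndTimeMs(Container& c, std::size_t n) {
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        State s = State();             // zero-initialized
        s.value = static_cast<int>(i);
        c.push_back(s);
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}
```

The numbers only mean something in an optimized build (for example -O2 with GCC or Clang), and for the vector case reserve() should be called before the timed loop.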

The greatest optimization, of course, would be not to generate 100 000 game states...

**jimzy** · 09-12-2007

Originally Posted by anon

The greatest optimization, of course, would be not to generate 100 000 game states...

Well, I did that to see which container is best... timings for smaller numbers might be hard to measure.

Anyway, while designing new classes for moves and so on, I ran into quite a silly problem... and for some reason I can't get past it. Say we've got 2 classes:

Code:

class A {
  B* cB;
};

class B {
  A* cA;
};

How do I fix the headers to make this work?

**matsp** · 09-12-2007

Originally Posted by jimzy

Anyway, while designing new classes for moves and so on, I ran into quite a silly problem... and for some reason I can't get past it. Say we've got 2 classes:

Code:

class A {
  B* cB;
};

class B {
  A* cA;
};

How do I fix the headers to make this work?

Use a forward declaration:

Code:

class B;  // Forward declaration: tells the compiler that a class B exists; its definition comes later.

class A {
   B* cB;
};

class B {
  A* cA;
};

--
Mats

**jimzy** · 09-12-2007

Ahh... thanks a lot matsp, I knew it was going to be silly

**pheres** · 09-12-2007

By the way: would that forward-declaration approach also work if the classes are used as template arguments, for example if he used:

Code:

class A {
  auto_ptr<B> cB;
};

class B {
  auto_ptr<A> cA;
};
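
A hedged sketch for this question, swapping in C++11's std::unique_ptr for auto_ptr (auto_ptr was deprecated in C++11 and removed in C++17, and as far as I know the standard does not promise that instantiating it with an incomplete type works). std::unique_ptr is allowed to hold an incomplete type as long as the type is complete wherever the pointee gets deleted, so the constructors and destructors are declared in the classes and defined only after both classes are complete:

```cpp
#include <cassert>
#include <memory>

class B;  // forward declaration: B is incomplete at this point

class A {
public:
    A();
    ~A();                   // defined below, where B is complete
    std::unique_ptr<B> cB;  // fine: only the destructor needs the full B
};

class B {
public:
    B();
    ~B();
    std::unique_ptr<A> cA;
};

// Both classes are complete here, so the deleters can be instantiated.
A::A() = default;
A::~A() = default;
B::B() = default;
B::~B() = default;
```

In a real project the class definitions would sit in headers and these out-of-line definitions in a .cpp file that includes both headers.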
