capacity and reserve

**George2** · 03-04-2008

Thanks robatino,

This is what you mentioned before. Do you mean in your good practices you recommend to reserve size N if the max possible size is N ^ 2?

--------------------
For example, if the upper bound is O(N^2), and the "average" is O(N).
--------------------

Originally Posted by robatino

Well, suppose you have a vector v which grows to some size which you don't know exactly in advance, but you only know that the size has an theoretical upper bound of N*N, say, where N is some parameter in the program. Also suppose that in an average case, the vector only grows to something on the order of size N, and it's only in pathological cases that it approaches size N*N. In that case, reserve()'ing space for N*N elements could harm performance, since most of it would probably be unused, and it would be unavailable to the rest of the program, or to other programs. And the machine might not even have that much memory, in which case the reserve() would fail and probably end the program, unless it caught the exception. You would be better off just letting the vector grow on its own to whatever size is needed. BTW, on Stroustrup's website he says that after doing some profiling, he stopped using reserve() unless needed to avoid invalidating iterators. Personally I tend to use it as a "comment", to indicate that this is the maximum amount of memory this vector can require, though for the reasons I gave above I wouldn't use it at all in certain situations.

regards,
George

**matsp** · 03-04-2008

reserve is only really useful if you have one of two situations [1]:
1. You know for certain what the size should be.
2. You can avoid invalidating iterators by reserving a sufficiently large size "for now" - say for example you can read 5 items at a time, and you use an iterator to find and potentially insert new items in the list, you may want to reserve(size()+5) to make sure you don't need to start over searching for the next item. [Of course, there may be a better solution overall if you need to do this - but that's a different matter].

[1] reserve only has a meaning for collections that are implemented as an array. Collections that consist of linked lists, binary trees and such will not benefit from reserving memory, since an insertion of a new element will not need copying all the data, so it's meaningless to say "I need this much space", because the addition of an individual item is pretty much O(1) in these types of collections.

--
Mats

**George2** · 03-04-2008

Hi Mats,

I do not know what do you mean "you don't need to start over searching for the next item" in your sample below, you are talking about insert all the times, right? Why mention search operation suddenly?

Originally Posted by matsp

reserve is only really useful if you have one of two situations [1]:
1. You know for certain what the size should be.
2. You can avoid invalidating iterators by reserving a sufficiently large size "for now" - say for example you can read 5 items at a time, and you use an iterator to find and potentially insert new items in the list, you may want to reserve(size()+5) to make sure you don't need to start over searching for the next item. [Of course, there may be a better solution overall if you need to do this - but that's a different matter].

[1] reserve only has a meaning for collections that are implemented as an array. Collections that consist of linked lists, binary trees and such will not benefit from reserving memory, since an insertion of a new element will not need copying all the data, so it's meaningless to say "I need this much space", because the addition of an individual item is pretty much O(1) in these types of collections.

--
Mats

regards,
George

**matsp** · 03-04-2008

Let's make a real (albeit stupid) example:

We study stock prices, and keep a list of all-time highs. We capture a short list of real-time values for shares from a server somewhere [say once a minute or so], sorted in alphabetical order by the ticker-name. We search through the list of current shares with an iterator, and if the current item matches one of our short list (by potentially iterating this list).

If it's a "new share", we need to add it to the list of shares we keep. If we do this the traditional "automatic growth" way, we would need to start from the top again with the iterator being reset.

Say this would be more appropriate if you keep track of "todays highest price", where shares that haven't been traded today would not be in the list at all.

This is, as I said, not a particularly realistic example, but I think it show the theory. It would probably be better to have a sorted list of ticker-names in the big list. There are surely other better ways to keep this sort of data [hash-map may be a good one]

--
Mats

**robatino** · 03-04-2008

Originally Posted by George2

Thanks robatino,

This is what you mentioned before. Do you mean in your good practices you recommend to reserve size N if the max possible size is N ^ 2?

No. In my example, since all you know is that the size is usually somewhere on the order of N (meaning it could easily be twice N, or half N, for example), then it doesn't make any sense to reserve() a specific number. Just let it grow by itself. As matsp said, it makes sense if you know the exact size needed, or if you need to avoid invalidating iterators. I also tend to do it if the maximum size needed is fairly small, even if I don't know the exact size, but that's a judgment call.

**Daved** · 03-04-2008

>> My understanding. You know copy and move object instance is not an efficient thing.

Ok, George2, this is referring to the conversation you and I are having. If this is the reason that using reserve on a vector can provide potential performance benefits, then why would this reason not apply to a deque?

Feel free to give a longer, more detailed answer. I know english is not your first language, but trying to explain a concept you are learning to someone else will help you understand it and help us verify that you actually do.

**George2** · 03-04-2008

Thanks Mats,

1.

Originally Posted by matsp

If it's a "new share", we need to add it to the list of shares we keep. If we do this the traditional "automatic growth" way, we would

need to start from the top again with the iterator being reset.

Do you mean insert elements into deque will make existing iterator invalid?

2.

Originally Posted by matsp

Say this would be more appropriate if you keep track of "todays highest price", where shares that haven't

been traded today would not be in the list at all.

--
Mats

I think in the deque you store object of type stock share, e.g. stock of APPLE or stock of Google. What is the relationship between

the stock I mentioned before and the so-called highest price? I am confused what do you store in your deque, stock (APPLE or

GOOGLE) or (highest) price (USD XXX.XX) of a day?

Hi Daved,

Here is my understanding.

1. Deque is stored in a way of a set of chunks and inside each chunk we store elements;

2. [insert at front/end] If we insert elements at pop/back, it is fine and we only need to grow front/back chunk if they are full, not expensive operation, like the same to push_back in vector;

3. [insert in the middle] What really expensive is movement of elements, so called reallocation. I am not sure whether the term is correct or not. If we insert in the middle, for vector, we have to move all existing elements after the inserted elements, but for deque, since we have two ends, we can choose to move elements to back or front in order to make one place available for the new inserted elements.

Any corrections or comments are welcome.

Originally Posted by Daved

>> My understanding. You know copy and move object instance is not an efficient thing.

Ok, George2, this is referring to the conversation you and I are having. If this is the reason that using reserve on a vector can provide potential performance benefits, then why would this reason not apply to a deque?

Feel free to give a longer, more detailed answer. I know english is not your first language, but trying to explain a concept you are learning to someone else will help you understand it and help us verify that you actually do.

regards,
George

**Daved** · 03-04-2008

George, what you said to me is correct, but it doesn't answer my question. Your response talked about insertion in the middle.

My question is, why does reserve provide potential performance benefit to vector but not to deque? This is unrelated to inserting in the middle.

**George2** · 03-04-2008

Hi Daved,

To answer this question we need to understand and define what operation is expensive, it is object instance movement and reallocation, especially duplicate and redundant allocation.

1. For vector, if we reserve, we can have (capacity - size) empty slots available in advance so that for (capacity - size) times of insertion, there is no need to reallocate, in this way reserve improve performance and saves time to reallcoate;

2. For deque, reallocation is less expensive -- just think about a special case of deque, List. Insertion and movement only needs to update some pointers. So, reserve does not benefit a lot.

Please feel free to comment and correct. :-)

Originally Posted by Daved

George, what you said to me is correct, but it doesn't answer my question. Your response talked about insertion in the middle.

My question is, why does reserve provide potential performance benefit to vector but not to deque? This is unrelated to inserting in the middle.

regards,
George

**CornedBee** · 03-05-2008

What really expensive is movement of elements, so called reallocation.

That's not reallocation. That's just making space.

Reallocation happens only in a vector. It's the action of allocating a new, larger block of memory when the current one becomes too small, and then moving all objects over to the new block.

**George2** · 03-05-2008

Thanks CornedBee,

Below is your definition of reallocation?

Originally Posted by CornedBee

It's the action of allocating a new, larger block of memory when the current one becomes too small, and then moving all objects over to the new block.

regards,
George

**CornedBee** · 03-05-2008

Yes. Add "Finally, free the old block." to the definition, though.

**George2** · 03-05-2008

Thanks CornedBee,

I have got your points.

Originally Posted by CornedBee

Yes. Add "Finally, free the old block." to the definition, though.

regards,
George

**Daved** · 03-05-2008

Well, you didn't exactly explain the point in the way I hoped you would, but perhaps it is just my own personal emphasis that I am expecting to read.

My answer to that question is that vector gains potential performance improvements from reserve because calling reserve prevents re-allocation when the vector grows and re-allocation can be expensive because it copies all the elements of the vector. (You seem to understand that part.)

A reserve on a deque would gain such an improvement because it would have no affect on re-allocation. If a deque is implemented as a serious of fixed size blocks and you add a new element to the end, then at worst a new block is created. However, no copying is done.

It is the copying that is expensive and it is the copying that the reserve for vector hopes to avoid. Since exceeding the capacity will never cause any copying in a deque, reserve doesn't have the same benefit.