# map or vector, which is less performant and by how much?

• 09-20-2008
elninio
map or vector, which is less performant and by how much?
I have vectors of vectors of vectors, ~1 million elements,

one vector may look like this:

3 4 5 6 7 ...
32 95 14 46 32 ...

Due to the fact that vectors cannot start with an index of 3 (can someone confirm this? explain etails too), and some complications in my programs, I can't get it to do what I want it to do.

I am faced with the option of lowering the index by 3 ( 1 2 3 ... n-3), or using maps instead of vectors.

Should I use maps for the sake of making my code easier to read for my collaborators? What are the performance disadvantages of using a map(s) instead of vector(s)
• 09-20-2008
matsp
The basic rule is "Unless you KNOW another solution is better, use vector". So start by implementing your code using vector. Most of the solutions are pretty much straight replacable, so switching from vector to some other solution is usually quite painless.

--
Mats
• 09-20-2008
elninio
If the price to pay is 10% more memory usage, thats not a problem. The code becomes FUBAR to understand when changing indecees. This would require supporting documentation into the file. Our goal is to produce functioning code ASAP. We don't have time to write doc (var names make the program [and a long program at that] readable). Modifying these indecees will also take more CPU time to process, so having a 10% increase in mem usage will as the single con will be worth it.
• 09-20-2008
matsp
If memory usage is a concern, then vectors are better than maps, since the map must store the key (index) and the value, whilst vectors store the value only.

Finding element at the key in a map will be slower than vector, because the map uses a (balanced?) binary tree, whilst the vector is a straight indexing operation.

So as long as your array isn't VERY sparse (and the way you describe it, it is not a sparse array, it just starts at 3 instead of zero).

Another option would of course be to implement (now or in the future) your own "based-vector", where all vector operations automatically subtracts 3 [or whatever number you prefer] from the index. So your actual code doesn't have to know that it should subtract 3, but when it comes to ACTUALLY access the content, you do subtract three. But by the sound of things, this doesn't seem particularly urgent.

--
Mats
• 09-20-2008
iMalc
So by your description, do you mean that you are using a vector of ints where you have 32 at index 3, 95 at index 4, 14 at index 5, 46 at index 6, 32 at index 7 etc?

If so, then switching to a map wouldn't mean using an extra 10&#37; more memory usage, it would likely be more like an extra 600% more usage, unless you're compiling for 64-bit in which case it is even worse.
Assuming 4 byte ints and pointers: Instead of a single number in a position in the vector you'd have a map which stores a key and value. That's 8 bytes. Then there's the fact that a map tends to be implemented as a red-black tree which then requires about 12 bytes overhead per node. That's 20 bytes so far. Now add to that the fact that the CRT aligns allocations to an 8-byte boundary. So thats 24 bytes per node, compared to 4 bytes per node in a vector. (Assumptions holding). Sure, you can bring it back down to 20 bytes with a custom allocator perhaps, but that's basically beating a dead horse.

If your only problem is that you wont be using indexes 0, 1, and 2 of a vector then I would certainly suggest sticking with a vector anyway. You can subtract 3 from all your indexing operations if you feel like it.
But if you have 'holes' at lots of positions in the vector, then maybe a map could work for you. Of course you'd also be sacrificing O(1) lookups for O(log n) lookups.
Another option is to use the assoc_vector from the Loki library.
• 09-20-2008
CornedBee
Or you could just overallocate three elements per vector. Leave the first three empty. The extra overhead is minimal.

Or you could write a very simple wrapper around vector which just subtracts 3 from every index.

Or you could use Boost.MultiArray to create a 1-dimensional array that does just that.
Using a MultiArray would be more efficient than that triple-nested vector anyway, if the inner vectors are all the same size.
• 09-20-2008
elninio
Unfortunately my inner vectors do not all have the same size. I just love how brilliant the replies I get from you(s) are. If i were to research this on my own it would probably take like 8 months and it would be a sparse vector of knowledge!
• 09-21-2008
bling
if the values you're using can be the keys as well, u can use std::set instead.

anyways, this is what u want in terms of performance:

map:
insert: O(log n)
remove: O(log n)
retrieval: O(log n) for any element

vector:
insert: O(1) or O(n) when it needs to expand
remove: O(n) if you remove something in the middle
retrieval: O(1) if you know the index, O(n) if you don't, or O(log n) if you sort it first

like others have said, unless you have a good reason, use a vector. maps are good because they have guaranteed O(log n) retrieval for any element, and still have good insert/delete times. a vector's downfall is when you need to add/remove in the middle and causes expensive vector copy operations.
• 09-22-2008
iMalc
Quote:

Originally Posted by bling
if the values you're using can be the keys as well, u can use std::set instead.

anyways, this is what u want in terms of performance:

map:
insert: O(log n)
remove: O(log n)
retrieval: O(log n) for any element

vector:
insert: O(1) or O(n) when it needs to expand
remove: O(n) if you remove something in the middle
retrieval: O(1) if you know the index, O(n) if you don't, or O(log n) if you sort it first

like others have said, unless you have a good reason, use a vector. maps are good because they have guaranteed O(log n) retrieval for any element, and still have good insert/delete times. a vector's downfall is when you need to add/remove in the middle and causes expensive vector copy operations.

A std::set typically is implemented using the same red-black tree as the map. Consider also that you still need to associate each int with it's index from 3 and up and you're back to the same situation as a map with the same memory usage. AKA no good.

Also, a vector is O(1) amortised insertion time. Something must not be O(this) or O(that). It is technically O(1) amortised running time, period.
Actually if the standard was flexible enough to allow amortised O(log n) time operations on a map instead, then implementors could save a few bytes of overhead and use a splay-tree instead or a red-black tree etc. But I digress.
• 09-22-2008
C_ntua
Quote:

Originally Posted by CornedBee
Or you could write a very simple wrapper around vector which just subtracts 3 from every index.

Kind of new to C++, so what do you exactly mean with the term "wrapper"? My thought was create a class that contains the vector which overloads the operator [] to get everything substracted by 3.
• 09-22-2008
Elysia
Quote:

Originally Posted by C_ntua
My thought was create a class that contains the vector which overloads the operator [] to get everything substracted by 3.

Pretty much just that, yes. It doesn't really add any functionality, it merely "wraps" the original functionality through a new interface.
But hey, I think you're doing fine. You may be a "newbie," but you have the brains and desire to learn ;)
• 09-22-2008
C_ntua
Quote:

Originally Posted by Elysia
Pretty much just that, yes. It doesn't really add any functionality, it merely "wraps" the original functionality through a new interface.
But hey, I think you're doing fine. You may be a "newbie," but you have the brains and desire to learn ;)

thanx :)
• 09-22-2008
CornedBee
Quote:

Originally Posted by C_ntua
Kind of new to C++, so what do you exactly mean with the term "wrapper"? My thought was create a class that contains the vector which overloads the operator [] to get everything substracted by 3.

• 09-23-2008
Darryl
Concerning the wrapper idea. A better thought might be inheritance. Generally it's not a good idea to derive a class from the std containers because they don't have virtual destructors.

However, as long as you don't use a base pointer to the derived class, it won't be a problem. You only need to override operator[] and at() with your -3 index offset. The benefit of course is that you won't have to implement all the other functionality of vectors like size(), push_back, etc.
• 09-23-2008
laserlight
Quote:

Originally Posted by Darryl
However, as long as you don't use a base pointer to the derived class, it won't be a problem.

True, though another way to do a less work than using composition would be to use private inheritance and then have public using declarations for those member functions that you wish to expose as part of the interface. It would be more work than public inheritance, of course, but completely avoids possible undefined behaviour.

Quote:

Originally Posted by Darryl
However, as long as you don't use a base pointer to the derived class, it won't be a problem. You only need to override operator[] and at() with your -3 index offset.

It would be overloading and name hiding, not overriding, actually. However, that means that yet another possibility is to extend the standard container by providing accessors that are free functions, and then use these accessors instead of operator[]. The bad part is that this requires a maintainer to remember that these accessors must be used. Also, things like front(), begin() and rbegin() might also need to be considered.