Thread: map or vector, which is less performant and by how much?

  1. #1
    Registered User
    Join Date
    Jun 2008
    Posts
    106

    map or vector, which is less performant and by how much?

    I have vectors of vectors of vectors, ~1 million elements,

    one vector may look like this:

    3 4 5 6 7 ...
    32 95 14 46 32 ...

    Due to the fact that vectors cannot start with an index of 3 (can someone confirm this? explain etails too), and some complications in my programs, I can't get it to do what I want it to do.

    I am faced with the option of lowering the index by 3 ( 1 2 3 ... n-3), or using maps instead of vectors.

    Should I use maps for the sake of making my code easier to read for my collaborators? What are the performance disadvantages of using a map(s) instead of vector(s)

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    The basic rule is "Unless you KNOW another solution is better, use vector". So start by implementing your code using vector. Most of the solutions are pretty much straight replacable, so switching from vector to some other solution is usually quite painless.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    Jun 2008
    Posts
    106
    If the price to pay is 10% more memory usage, thats not a problem. The code becomes FUBAR to understand when changing indecees. This would require supporting documentation into the file. Our goal is to produce functioning code ASAP. We don't have time to write doc (var names make the program [and a long program at that] readable). Modifying these indecees will also take more CPU time to process, so having a 10% increase in mem usage will as the single con will be worth it.

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    If memory usage is a concern, then vectors are better than maps, since the map must store the key (index) and the value, whilst vectors store the value only.

    Finding element at the key in a map will be slower than vector, because the map uses a (balanced?) binary tree, whilst the vector is a straight indexing operation.

    So as long as your array isn't VERY sparse (and the way you describe it, it is not a sparse array, it just starts at 3 instead of zero).

    Another option would of course be to implement (now or in the future) your own "based-vector", where all vector operations automatically subtracts 3 [or whatever number you prefer] from the index. So your actual code doesn't have to know that it should subtract 3, but when it comes to ACTUALLY access the content, you do subtract three. But by the sound of things, this doesn't seem particularly urgent.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  5. #5
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    So by your description, do you mean that you are using a vector of ints where you have 32 at index 3, 95 at index 4, 14 at index 5, 46 at index 6, 32 at index 7 etc?

    If so, then switching to a map wouldn't mean using an extra 10% more memory usage, it would likely be more like an extra 600% more usage, unless you're compiling for 64-bit in which case it is even worse.
    Assuming 4 byte ints and pointers: Instead of a single number in a position in the vector you'd have a map which stores a key and value. That's 8 bytes. Then there's the fact that a map tends to be implemented as a red-black tree which then requires about 12 bytes overhead per node. That's 20 bytes so far. Now add to that the fact that the CRT aligns allocations to an 8-byte boundary. So thats 24 bytes per node, compared to 4 bytes per node in a vector. (Assumptions holding). Sure, you can bring it back down to 20 bytes with a custom allocator perhaps, but that's basically beating a dead horse.

    If your only problem is that you wont be using indexes 0, 1, and 2 of a vector then I would certainly suggest sticking with a vector anyway. You can subtract 3 from all your indexing operations if you feel like it.
    But if you have 'holes' at lots of positions in the vector, then maybe a map could work for you. Of course you'd also be sacrificing O(1) lookups for O(log n) lookups.
    Another option is to use the assoc_vector from the Loki library.
    Last edited by iMalc; 09-20-2008 at 06:16 PM.

  6. #6
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Or you could just overallocate three elements per vector. Leave the first three empty. The extra overhead is minimal.

    Or you could write a very simple wrapper around vector which just subtracts 3 from every index.

    Or you could use Boost.MultiArray to create a 1-dimensional array that does just that.
    Using a MultiArray would be more efficient than that triple-nested vector anyway, if the inner vectors are all the same size.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  7. #7
    Registered User
    Join Date
    Jun 2008
    Posts
    106
    Unfortunately my inner vectors do not all have the same size. I just love how brilliant the replies I get from you(s) are. If i were to research this on my own it would probably take like 8 months and it would be a sparse vector of knowledge!

  8. #8
    Registered User
    Join Date
    Aug 2008
    Posts
    188
    if the values you're using can be the keys as well, u can use std::set instead.

    anyways, this is what u want in terms of performance:

    map:
    insert: O(log n)
    remove: O(log n)
    retrieval: O(log n) for any element

    vector:
    insert: O(1) or O(n) when it needs to expand
    remove: O(n) if you remove something in the middle
    retrieval: O(1) if you know the index, O(n) if you don't, or O(log n) if you sort it first

    like others have said, unless you have a good reason, use a vector. maps are good because they have guaranteed O(log n) retrieval for any element, and still have good insert/delete times. a vector's downfall is when you need to add/remove in the middle and causes expensive vector copy operations.

  9. #9
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by bling View Post
    if the values you're using can be the keys as well, u can use std::set instead.

    anyways, this is what u want in terms of performance:

    map:
    insert: O(log n)
    remove: O(log n)
    retrieval: O(log n) for any element

    vector:
    insert: O(1) or O(n) when it needs to expand
    remove: O(n) if you remove something in the middle
    retrieval: O(1) if you know the index, O(n) if you don't, or O(log n) if you sort it first

    like others have said, unless you have a good reason, use a vector. maps are good because they have guaranteed O(log n) retrieval for any element, and still have good insert/delete times. a vector's downfall is when you need to add/remove in the middle and causes expensive vector copy operations.
    A std::set typically is implemented using the same red-black tree as the map. Consider also that you still need to associate each int with it's index from 3 and up and you're back to the same situation as a map with the same memory usage. AKA no good.

    Also, a vector is O(1) amortised insertion time. Something must not be O(this) or O(that). It is technically O(1) amortised running time, period.
    Actually if the standard was flexible enough to allow amortised O(log n) time operations on a map instead, then implementors could save a few bytes of overhead and use a splay-tree instead or a red-black tree etc. But I digress.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  10. #10
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    Quote Originally Posted by CornedBee View Post
    Or you could write a very simple wrapper around vector which just subtracts 3 from every index.
    Kind of new to C++, so what do you exactly mean with the term "wrapper"? My thought was create a class that contains the vector which overloads the operator [] to get everything substracted by 3.

  11. #11
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by C_ntua View Post
    My thought was create a class that contains the vector which overloads the operator [] to get everything substracted by 3.
    Pretty much just that, yes. It doesn't really add any functionality, it merely "wraps" the original functionality through a new interface.
    But hey, I think you're doing fine. You may be a "newbie," but you have the brains and desire to learn
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  12. #12
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    Quote Originally Posted by Elysia View Post
    Pretty much just that, yes. It doesn't really add any functionality, it merely "wraps" the original functionality through a new interface.
    But hey, I think you're doing fine. You may be a "newbie," but you have the brains and desire to learn
    thanx

  13. #13
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Quote Originally Posted by C_ntua View Post
    Kind of new to C++, so what do you exactly mean with the term "wrapper"? My thought was create a class that contains the vector which overloads the operator [] to get everything substracted by 3.
    Your intuition speaks for you, young padawan.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  14. #14
    Tropical Coder Darryl's Avatar
    Join Date
    Mar 2005
    Location
    Cayman Islands
    Posts
    503
    Concerning the wrapper idea. A better thought might be inheritance. Generally it's not a good idea to derive a class from the std containers because they don't have virtual destructors.

    However, as long as you don't use a base pointer to the derived class, it won't be a problem. You only need to override operator[] and at() with your -3 index offset. The benefit of course is that you won't have to implement all the other functionality of vectors like size(), push_back, etc.

  15. #15
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Darryl
    However, as long as you don't use a base pointer to the derived class, it won't be a problem.
    True, though another way to do a less work than using composition would be to use private inheritance and then have public using declarations for those member functions that you wish to expose as part of the interface. It would be more work than public inheritance, of course, but completely avoids possible undefined behaviour.

    Quote Originally Posted by Darryl
    However, as long as you don't use a base pointer to the derived class, it won't be a problem. You only need to override operator[] and at() with your -3 index offset.
    It would be overloading and name hiding, not overriding, actually. However, that means that yet another possibility is to extend the standard container by providing accessors that are free functions, and then use these accessors instead of operator[]. The bad part is that this requires a maintainer to remember that these accessors must be used. Also, things like front(), begin() and rbegin() might also need to be considered.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

Popular pages Recent additions subscribe to a feed