Thread: Extensible vector in both ends?

  1. #1
    Algorithm engineer
    Join Date
    Jun 2006
    Posts
    286

    Extensible vector in both ends?

    With vector you can push_back and pop_back elements, and you can resize it to add or remove multiple elements in the end of the vector. But you can't puch_front or pop_front elements. Neither can you extend the vector in the start if you'd like to (or at least it will take much longer time). It's a pity, cause now I need to do that. Is there some other container with which you can do that? Do I have to create an own container class?

    Thanks
    Come on, you can do it! b( ~_')

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I think what you're looking for is a "deque"(Double Ended Que[ue]).

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    With vectors, you have the erase and insert member functions, though they are more expensive than push_back and pop_back. Also consider either a list or a deque; each has efficiency advantages and disadvantages for various operations. If you are mainly concerned with efficiently adding/deleting elements at either end rather than in the middle, and you want efficient subscripting, a deque is probably best.

    http://www.cplusplus.com/reference/s...tor/erase.html
    http://www.cplusplus.com/reference/s...or/insert.html
    http://www.cplusplus.com/reference/stl/deque/

  4. #4
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    But don't forget the fundamental rules of the containers:
    Rule #1: Use a vector.
    Rule #2: If you think you need something different, use a vector.
    Rule #3: If profiling proves that a vector is bad, then consider using a list or deque.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  5. #5
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    I don't understand the strong preference for a vector. It doesn't require profiling to know that a vector won't allow efficiently adding or deleting elements at the beginning. Even if it's not noticeable for a small vector, it won't scale up, not to mention that the choice of container documents what the needs of the program are.

  6. #6
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    It's pretty simple: vector has the best cache coherency. Unless copying of the individual elements is expensive (std::string is one case - but not once we get r-value references in C++09), it takes a lot of elements to overcome the considerably greater overhead that, say, a list has. When you're dealing with primitives, it's most noticeable. On a 64-bit system, for example, a list<int> needs 5 times as much space to store a single element as a vector. Iteration through a vector is faster by orders of magnitude than through a list (despite both claiming to be O(N)), because a list's nodes may well be all over the place and cache coherency is catastrophic, and also because prefetching is completely impossible.

    Unless your insertions at the beginning really dominate your iteration, vector is still probably faster.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  7. #7
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    What about a deque vs. a vector? I know a deque isn't guaranteed to have contiguous storage like a vector is, but isn't it implemented using something like a relatively small number of contiguous chunks of memory?

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by robatino View Post
    What about a deque vs. a vector? I know a deque isn't guaranteed to have contiguous storage like a vector is, but isn't it implemented using something like a relatively small number of contiguous chunks of memory?
    I don't think that is at all guaranteed.

    http://www.sgi.com/tech/stl/Deque.html

    It does, however, from the description above, appear that it will at least to some extent, store data in blocks, rather than as for example a linked list (which is another possible implementation).

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #9
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    I wanted to make this an edit to my previous post, but you posted in the meantime. I'll respond to your question at the end.

    Take, for example, a game like space invaders, where you need to store bullets. Let's look at the characteristics of access, shall we?
    1) Every frame, we iterate through all bullets and move them.
    2) If a bullet hits something or goes out of the screen, it is removed.
    3) New bullets are added to the end.

    OK, so we've got linear access, removal from the beginning and middle rather than the end (because older bullets wander to the front), and insertion at the end. Sounds like a job for a list, does it not?

    Well, let's see. You need to iterate through the bullets 30 times per second to keep up the frame rate, whereas you may have bullets being deleted less often - depending on the number of bullets flying around, perhaps 10 per second, perhaps only 2. In other words, a slow iteration will completely make up for any effects fast removal might have. And don't forget that insertion occurs exactly as often as removal, so faster insertion pays off, too.

    Vector: Iteration is advancing of pointers. Computers are good at that. Elements are fetched from RAM into the cache in advance, because the predictor expects them to be accessed and knows where they are: at the next higher address range. RAM latency is perhaps the most important optimization factor, once you've got complexity out of the way. It is so important that it can easily trump, say, the difference between O(N) and O(N*logN) for any but the largest Ns.
    Insertion is copying of a bit of memory. If the vector runs out of space, it means allocating a new block of memory and copying a bit more. Because a vector should always double required space on reallocation, it's at most logN reallocations. It makes for amortized constant time. Because in our scenario bullets come and go, there is something of an upper limit to the number of bullets, so after an initial growing phase, it actually is really constant time. And because the programmer can make a test run and see how large the vector grows when there's a lot going on in the game, and because he can then place a reserve() call at the beginning, it is constant time from the word go.
    Erasure is linear time, element-wise copying of memory until it has all shuffled into position. On the other hand, it is still cache-coherent, similar to iteration.

    List: Iteration is linear time, but it is bouncing from pointer to pointer. Cache coherency is low. Even if some nodes are accidently allocated next to each other, given that there are 2 pointers + heap management overhead there, you can fit very few nodes in a cache line, so you will very often have to wait for RAM to catch up. Because the location of the next node is impossible to know until you've fetched the current one, the predictor is useless.
    Insertion: Allocation of a new node, linking it to the old one, copying the data. Constant time, but the allocation makes this hopelessly slower than vector. A pool allocator can be used to somewhat reduce the overhead. It is also the only way to pre-allocate memory.
    Erasure: quite quick, since you discover it during iteration and it doesn't need shifting of elements. Still, unless you use a pool allocator, you've got heap management overhead on deleting. Oh, and did I mention the heap fragmentation you get if you don't use a pool allocator?

    So, which is faster? My bet is on the vector, but I'm not entirely sure. That's why there's rule #3. What I do know is that the "rules" (linear vs random-access traversal, where are the insertions and erasures?) are insufficient to decide, and that a vector should always be the first choice. (Unless you really, really need the iterator invalidation guarantees of the list.)

    If the vector turns out to be slower, here are three optimizations I can spontaneously think of:
    1) Make sure the vector's element type is a POD. Then make sure your standard library optimizes the memory moves of erase to a memmove() call.
    2) Optimize erasures; implement them yourself. As you iterate through the vector, move the elements back. Because you move them as far as you have deleted elements, you save yourself going through the vector multiple times.
    3) Don't really erase elements. Instead, mark them as erased. Re-use the storage by searching for an erased bullet first. Because of the way aging of bullets work, keep the index of the last insertion and start your search from there; you'll get amortized constant time insertions despite the linear search for a free space, while making erasures constant-time, too.



    Regarding deque: I never looked into the details of deque implementation. I don't know how it would perform in various scenarios. See rules 1, 2 and 3.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  10. #10
    Registered User
    Join Date
    Nov 2006
    Posts
    519
    Thanks, that was an very enlightening post.

    But one question:
    from the link posted by robatino

    deque (usually pronounced like "deck") is an irregular acronym of double-ended queue. Double-ended queues are a kind of sequence containers. As such, their elements are ordered following a strict linear sequence.
    so, shouldn't the deque show the same cache-behavior like the vector?

  11. #11
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    No, that statement is information-theoretical. It merely means that each element has an associated numeric index in the range [0, n).
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  12. #12
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    A deque is typically implemented as a vector of vectors.

    I fully support CornedBee's reasoning as to why a vector should be preferred. I hardly ever use a std::list myself, but often use a std::vector.
    I might consider a list are when the objects to go in it are very large though.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  13. #13
    Algorithm engineer
    Join Date
    Jun 2006
    Posts
    286
    Maybe I can use vector, I realised I'll probably don't have to extend it in the beginning, only decrease the length (big ten exponent for bigints creates an awfully lot of zeroes in the beginnig). Hence I can use an offset and add it to the iterator so I can know where to start to read in the vector. Thank you.

    Just curios, a container class extensible in both directions doesn't automatically mean that it has to be slower than vector in time complexity, or even time factor, or does it? I see the extensibility in the beginning as a generalization of something that has already been applied to on end in vector.
    Come on, you can do it! b( ~_')

  14. #14
    The larch
    Join Date
    May 2006
    Posts
    3,573
    If I'm not mistaken deque is required not to move memory around when something is added to the end or beginning, a vector occasionally does is if you push_back. Hence it needs to work with chunks of the contained array that stay in the original place and therefore can't all be contiguous.
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  15. #15
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    I'm afraid you are mistaken. Deque has quite horrible invalidation semantics. All insertions, no matter where, invalidate all iterators. Erasures in the middle invalidate all iterators. Only erasures at the beginning and end (pop_front and pop_back) keep iterators valid.

    Just curios, a container class extensible in both directions doesn't automatically mean that it has to be slower than vector in time complexity, or even time factor, or does it?
    It would be possible to implement a vector-like class that keeps empty space at both sides of the actual sequence. But reserving memory for such a container would have a tricky interface because you also need to decide where to reserve that memory. The semantics would be difficult to define.
    Hmm ... I wonder, though, if a conforming deque could be implemented this way ...


    *goes to check*

    Hmm, nope. Worst-case constant time for end insertions required, amortized constant time not allowed.
    Last edited by CornedBee; 10-04-2007 at 06:39 PM.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Extensible language... What?
    By jordanguyoflove in forum C Programming
    Replies: 2
    Last Post: 10-15-2008, 10:02 AM
  2. How to cap the gluCylinder() ends
    By ting in forum Game Programming
    Replies: 1
    Last Post: 06-02-2008, 05:03 PM
  3. program ends immediately
    By willc0de4food in forum Windows Programming
    Replies: 11
    Last Post: 09-04-2005, 06:52 PM
  4. Socket abruptly closes after function ends
    By Sfpiano in forum Networking/Device Communication
    Replies: 1
    Last Post: 08-08-2005, 04:49 PM
  5. Some loose ends
    By pdstatha in forum C Programming
    Replies: 7
    Last Post: 03-31-2002, 04:03 PM