Thread: efficiency issues with downsizing vector or array

  1. #1
    Registered User
    Join Date
    Sep 2006
    Posts
    835

    efficiency issues with downsizing vector or array

    I am presently using an array to store a list of elements such that the exact length is not known in advance, only an upper bound. The elements are progressively added, and when done, it would be nice to tell the OS that the remaining space is not needed. In C, this could be done using realloc(). In C++, Stroustrup says that it's usually easier and just as efficient to use a standard container. I could use a vector and use resize when done to reduce the size to the exact amount used.
    The question is, in this situation (either with realloc(), or with a C++ vector using resize), when the array or vector is downsized, does the OS typically try to copy the data elsewhere, or just leave it where it is? I guess it's a matter of avoiding memory fragmentation vs. the time involved in moving the data. If the data remains in place, it's a no-brainer to release the extra space.
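    For concreteness, the C-style version I have in mind is roughly the sketch below (the names and the element type are made up, and whether the shrinking realloc() moves the block is exactly what I'm asking about):

    Code:
    #include <cstddef>
    #include <cstdlib>

    // n_max is the known upper bound; fill_elements() stands in for the real
    // computation and reports how many slots it actually used.
    std::size_t fill_elements(double *buf, std::size_t n_max);

    double *build_buffer(std::size_t n_max, std::size_t &n_used)
    {
        double *buf = static_cast<double *>(std::malloc(n_max * sizeof *buf));
        if (buf == 0)
            return 0;                       // allocation failed
        n_used = fill_elements(buf, n_max);
        // Hand the unused tail back; realloc() may shrink in place or move the data.
        double *shrunk = static_cast<double *>(std::realloc(buf, n_used * sizeof *buf));
        return shrunk ? shrunk : buf;
    }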

  2. #2
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    I misread the question at first, let me try again.

    If you don't have a good guess at the size of the vector, then don't call reserve at the start. Let it figure it out. If, after you have added all your items, the capacity() is much larger than the size() (which can happen if you reserve too much space at the start or if you let the vector size itself), then reducing the memory to fit the amount of data will be expensive, because I believe the data has to be copied.

    In the overall scheme of things, it is probably more efficient just to let the standard container handle things, and in the rare case that there really is too much memory wasted you can just create a new vector with the appropriate amount of space reserved and copy the data over.
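    Something along these lines (v stands in for the over-allocated vector, and double is just an example element type):

    Code:
    #include <vector>

    // v ended up with far more capacity() than size(); build a right-sized copy.
    std::vector<double> make_right_sized(const std::vector<double> &v)
    {
        std::vector<double> tight;
        tight.reserve(v.size());           // reserve exactly the space that's used
        tight.assign(v.begin(), v.end());  // copy the data over
        return tight;                      // use this one instead of the old vector
    }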

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    I'm inclined now to create a temporary array of the maximum possible size, calculate the elements in it, and once the exact length is known, create a new array of the exact length, copy over, and delete the temporary. The thing is, there will be tens or hundreds of thousands of these arrays existing simultaneously in memory, and it would be expensive both in terms of memory usage and memory fragmentation not to get rid of the excess. From what you've told me, copying is unavoidable, so I may as well just use plain arrays.
    I also read that size isn't the same as capacity, and it's the latter that needs to be reduced. In

    http://www.informit.com/guides/conte...eqNum=268&rl=1

    it says that using swap() is the most efficient way to reduce the capacity, but that involves copying anyway.
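    For reference, the swap() trick from that article boils down to a one-liner (v again being the over-full vector; the temporary usually, though it's not guaranteed, allocates just enough):

    Code:
    #include <vector>

    void shrink_capacity(std::vector<double> &v)
    {
        // The temporary is copy-constructed from v with a tight capacity; swap()
        // then hands that tight buffer to v, and the temporary carries the old,
        // oversized buffer away and frees it at the end of the statement.
        std::vector<double>(v).swap(v);
    }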

  4. #4
    VA National Guard The Brain's Avatar
    Join Date
    May 2004
    Location
    Manassas, VA USA
    Posts
    903
    would a <list> data structure handle these memory issues more efficiently..?
    (yes, i know it's not an array)
    • "Problem Solving C++, The Object of Programming" -Walter Savitch
    • "Data Structures and Other Objects using C++" -Walter Savitch
    • "Assembly Language for Intel-Based Computers" -Kip Irvine
    • "Programming Windows, 5th edition" -Charles Petzold
    • "Visual C++ MFC Programming by Example" -John E. Swanke
    • "Network Programming Windows" -Jones/Ohlund
    • "Sams Teach Yourself Game Programming in 24 Hours" -Michael Morrison
    • "Mathmatics for 3D Game Programming & Computer Graphics" -Eric Lengyel

  5. #5
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Sort of, but not really. Yes, a list will only allocate as many nodes as needed (probably - nothing really prevents an implementation from allocating several nodes at once so it doesn't have to allocate on every add, but that would be an unusual implementation). But it has a considerable per-node size overhead: the two list pointers are 16 bytes on a 64-bit machine, so if you're storing 4-byte ints in that list, 16 out of every 20 bytes - 80% - are wasted.

    I agree with Daved. Create the vectors empty and just add the elements with push_back; the vector will resize as necessary. If the number of elements varies widely from one array to the next, this is often the most efficient approach.
    If not - if, after profiling, you've determined that this is exactly where you're losing time - you can still change the way the vectors are filled.
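    In other words, roughly this (more_elements() and compute_next() are stand-ins for whatever actually produces the data):

    Code:
    #include <vector>

    bool more_elements();     // placeholders for the real element-producing logic
    double compute_next();

    std::vector<double> build_values()
    {
        std::vector<double> values;            // no reserve, no upper-bound guess
        while (more_elements())
            values.push_back(compute_next());  // amortized constant time per push_back
        // Only if profiling shows the leftover capacity actually hurts:
        // std::vector<double>(values).swap(values);
        return values;
    }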
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  6. #6
    Registered User
    Join Date
    Jan 2003
    Posts
    311
    "Premature optimization is the root of all evil"
    -- Donald Knuth

    One of vector's tricks is that excess capacity is uninitialized. What this means is that the OS typically never has to allocate physical page frames for pages you never touch; we don't really care how many pages we have asked for in the virtual address space. It does give the OS a bit of a headache in that it sees your program as potentially larger than it really is, and it gives the heap more entries to track than necessary. But that's mostly just bookkeeping.

    If you can throw away a lot of the entries when you are done, then it's worth your time to copy the rest over to a new vector with an exact capacity.

    as always, profile before you optimize
