Fun ideas!

Quote Originally Posted by MutantJohn
Lame. I was hoping. What if I allocated specifically from the end of the pool? Like, I was writing from the last registers in memory, assuming ram is used in a from the bottom up sort of fashion. Or is it just scattered about loosely and intermittently?
You can't do anything like that with malloc. The whole point of malloc is to give you "some memory" on request without you having to know where the memory is, whether it's available for use or used by another data structure, how to mark it as unused when it's freed....

But suppose you could. Suppose you allocated your vector at the beginning of the heap and other data at the end, so the vector could grow without running into other data items. What happens when you want to have another vector - where does it go? You'd have to keep track of where all your vectors were in order to know where to allocate new ones.

Like I said, you can't do this with malloc. The documentation for malloc on Linux has links to the low level OS calls you could use if you were very determined: malloc(3): allocate/free dynamic memory - Linux man page
You could use sbrk() to set/get the heap limit and merrily write what you want where you want. You couldn't use malloc in the same code though, as it wouldn't know where your data was and would happily overwrite it.

C++ doesn't avoid doing reallocs - from here: vector - C++ Reference

Quote Originally Posted by vector doc
Libraries can implement different strategies for growth to balance between memory usage and reallocations, but in any case, reallocations should only happen at logarithmically growing intervals of size so that the insertion of individual elements at the end of the vector can be provided with amortized constant time complexity (see push_back).
As vectors get larger, copies get more expensive. But reallocs become less frequent, so the cost is mitigated. Small vectors are cheap to copy, so we can afford to call realloc at smaller intervals. We can do the same in C...

Quote Originally Posted by rcgldr View Post
It would seem that void **data would require the code to allocate space for the array as well as the data. A vector implementation is closer to this (void * data versus void **data):

Code:
struct vector {
    size_t alloc_size;
    size_t last_elem;
    void *data;  // pointer to first element
};
Using this sort of thing, with realloc and a lot of nasty pointer arithmetic, you can make yourself a sort of vector-like thing in C.

I'd use void* rather than void** as above, store the element size, and I'd probably store 'maximum elements' rather than alloc size:
Code:
struct vector {
    size_t alloc_size;
    size_t elements;
    size_t max_elems;
    void *data;  // pointer to first element
};
C doesn't have templates or overloaded functions, so everything will have to be in terms of void* to make this usable generically. Here are a handful of example functions:

Code:
Vect * InitVect(size_t elsize)
{
   Vect *v = malloc(sizeof(Vect));
   v->elements = 0;
   v->max_elems = INITIAL_SZ/elsize;
   v->elsize = elsize;
   v->data = malloc(v->max_elems * v->elsize);   
   return v;
}

void push_back(Vect *v, void* el)
{
   if (v->elements > v->max_elems-1)
   {
      void *temp;
      // exponential growth --> realloc at logarithmic intervals
      v->max_elems *=2;
      temp = realloc(v->data, v->max_elems * v->elsize);
      if (!temp)
      {
          printf("\nFatal error: realloc failed");
          Destroy(v);
          exit(1);
      }
      v->data = temp;
   }
   memcpy(v->data+(v->elements * v->elsize), el, v->elsize);
   v->elements++;   
}

void *get(Vect *v, size_t i)
{
    return v->data + (i * v->elsize);    
}

void set(Vect *v, size_t i, void* el)
{
   memcpy(v->data + (v->elsize * i), el, v->elsize);   
}
Since everything has to be of type void*, you'd have to deal with pointers-to-data when using it. I couldn't think of a way to get around that.
Code:
int main(void)
{
   Vect *v = InitVect(sizeof(int));
   int in;
   int *out;

   push_back(v, 10); // NO.
   in = 10;
   push_back(v, &in); // yes   
}
These snippets are totally unsafe -- if you try to set() something way outside of the vector, the code will happily write to memory outside of the vector, much like array accesses aren't bounds-checked. That could be easily fixed. Also, because everything is a void* you have absolutely no type checking. You could pass a double or a char and the code would still treat it like it was int-sized.

Other vector-like functions can be implemented in much the same way and will have (as far as I can see) the same complexity (efficiency) as C++ vectors. Iterators would just be void* pointers to particular elements. Things like reserve() and capacity() are trivial.

Out of interest, I decided to have a look at how expensive the copy really is. These experiments were with gcc 4.7.2 with -O2 on a 64-bit Ubuntu box. I added some debugging lines to the code to print out when realloc needed to do a copy. This will provoke a copy every realloc:
Code:
   Vect *v = InitVect(sizeof(int));
   Vect *v2 = InitVect(sizeof(int));   
   int i;   
   for (i = 0; i < NUM; i++)
   {   
       push_back(v, &i);              
       push_back(v2, &i);                           
   }
Separating this into two separate loops gets rid of the copies for smaller reallocs, but for large reallocs (4MB and higher) a copy always takes place, and the memory is allocated from somewhere else.
I implemented a reserve, and did a reserve(NUM). That got rid of all the copies except one, but the program didn't run any faster (according to the output of time).
I ran the test with valgrind --tool=callgrind, and had a look in kcachegrind. For large values of NUM (20000000), everything else is simply dwarfed by the time spent memcpying the elements into the vector. Realloc time is negligible.

For a small value of NUM (2000), realloc accounted for 1.8% of execution time when no copying took place, and 2.6% with the pathological interleaved push_back calls. To put that in context: the for loop in the test was responsible for 2.7%. Again, memcpy took the most time.

So given all that, I wouldn't worry about the time taken if realloc needs to do a copy.

Although this has been an interesting topic to investigate, I must say that if you're in love with std::vector, just use std::vector!