renew


07-25-2007
siavoshkc

What is realloc() equivalent of new?
07-25-2007
laserlight

There is none (not as directly, anyway). You could read Why doesn't C++ have an equivalent to realloc()?
07-25-2007
cyberfish

to the best of my knowledge, realloc() changes the size of the memory block pointed to by a pointer. Can you give an example case where "renew" can be used?
07-25-2007
siavoshkc

Quote:

Can you give an example case where "renew" can be used?

Exactly where realloc can be used.

If new is the replacement for malloc(), it should have a replacement for realloc(). Example:

Code:

int *intArr = new int [200];
//We need a bigger array;
intArr = renew (intArr) [400];

Good link laserlight!
07-25-2007
cyberfish

Quote:

Code:

int *intArr = new int [200];
//We need a bigger array;
intArr = renew (intArr) [400];

in that case I would use a std::vector
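A minimal sketch of what that could look like, assuming the goal is simply "200 ints now, 400 later". The helper name make_and_grow and the fill value 7 are made up for illustration:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: starts with 200 ints and grows the array to new_size.
std::vector<int> make_and_grow(std::size_t new_size) {
    std::vector<int> intArr(200, 7);  // 200 elements, each set to 7
    intArr.resize(new_size);          // existing elements are kept; added ones are zero
    return intArr;
}
```

Calling make_and_grow(400) gives 400 elements where the first 200 keep their values. Whether resize() moves the data to a new block is up to the vector; the caller never touches new or delete.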
07-25-2007
matsp

Since "new" doesn't quite do the same thing as "malloc"[1], what do you expect "renew" to do?

If you consider something like this:

Code:

class blah
{
 public:
  blah() { a = new int[20]; .... }
 private:
  int *a;
};
...
class blah *p = new blah[100];
....
p = renew blah[200];
....

Should we call the constructor for blah 100 times or 200 times in the second instance? What do we do with the content of the first 100 instances of *a - allocate new ones, copy the existing ones or something else?

If you have a need to "resize" your datablocks, why not allocate them in "chunks" so that you get some spare size left over, and then fill that out until you get to the "size" and then do your own "renew" by allocating a new array (or whatever it is you're dealing with) and then copying the old content into the new one.

As for how to pick a "slightly larger size", perhaps picking the next (1 << n) that is bigger than the requested size.

Using a fixed addition works well if you often only add a little bit, but if you don't know what size you're going to grow the block of memory to, then you may find that a "double it each time" is a better way [2].

[1] It's certainly possible to implement "malloc" by calling "new", but not the other way around.

[2] A colleague of mine worked on a project where someone used the method of "adding a fixed size more" (I think 4K) each time the buffer was too small. It started out at 4K and grew to 16MB in his case. Growing that at 4K at a time is PAINFUL (lots of memory copying). By using the "double the size when it's not large enough", he could reduce the runtime of that particular app by a factor of 10 or so.
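To put rough numbers on the two strategies, here is a small sketch. It assumes the worst case where every growth copies the whole current buffer (a real realloc may sometimes avoid that), and all function names are made up for illustration:

```cpp
#include <cstddef>
#include <cstdint>

// Smallest power of two that is >= n. Assumes n is small enough that the
// result fits in std::size_t.
std::size_t next_pow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Bytes copied while growing from `start` to at least `target` bytes,
// adding `step` bytes each time. Each growth copies the current contents.
std::uint64_t bytes_copied_fixed(std::uint64_t start, std::uint64_t target,
                                 std::uint64_t step) {
    std::uint64_t copied = 0;
    for (std::uint64_t cap = start; cap < target; cap += step) copied += cap;
    return copied;
}

// Same measurement, but the capacity doubles on each growth.
std::uint64_t bytes_copied_doubling(std::uint64_t start, std::uint64_t target) {
    std::uint64_t copied = 0;
    for (std::uint64_t cap = start; cap < target; cap *= 2) copied += cap;
    return copied;
}
```

For a buffer going from 4K to 16MB, bytes_copied_fixed(4096, 16777216, 4096) comes to roughly 34 billion bytes, while bytes_copied_doubling(4096, 16777216) stays under 17 million. Note that next_pow2 returns n itself when n is already a power of two.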

--
Mats
07-25-2007
matsp

Quote:

Originally Posted by siavoshkc

Exactly where realloc can be used.

If new is the replacement for malloc(), it should have a replacement for realloc(). Example:

Code:

int *intArr = new int [200];
//We need a bigger array;
intArr = renew (intArr) [400];

Good link laserlight!

If that is exactly what you're doing in your code (or at least something that is just a plain array of some data with no "class" or "complex content"), then you could actually just use malloc/realloc. Unless of course where you're using this code FORBIDS use of classic C-functionality. Just remember to NOT use "delete" when you de-allocate the memory, use free instead.
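A minimal sketch of the malloc/realloc route for a plain int array. The wrapper name grow_int_array is made up; the only real calls are std::realloc and, at the call site, std::malloc and std::free:

```cpp
#include <cstddef>
#include <cstdlib>

// Grows (or shrinks) a malloc'ed int array to new_count elements.
// Returns the possibly moved block, or nullptr on failure, in which case
// the original block is untouched and still owned by the caller.
int* grow_int_array(int* arr, std::size_t new_count) {
    void* p = std::realloc(arr, new_count * sizeof(int));
    return static_cast<int*>(p);
}
```

The array has to come from std::malloc (for example static_cast<int*>(std::malloc(200 * sizeof(int)))) and has to be released with std::free, never with delete. Assign the result to a temporary first, so the old pointer is not lost if realloc fails.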

As I explained in the previous post, "renew" isn't quite so easy to implement for the more complex scenarios. For trivial "stuff", you could implement something like this:

Code:

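#include <cstddef>  // size_t
#include <cstring>  // memcpy

// Only for plain data (no constructors/destructors), and only for GROWING.
void *renew(void *ptr, size_t origsize, size_t newsize)
{
    // no error checking here - it should be done.
    void *p = new char[newsize];
    memcpy(p, ptr, origsize);
    // the old block is NOT released here - the caller still owns it.
    return p;
}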

But this is NOT the same as using new at all. Note also that we're assuming that it's GROWING - allowing the size to be reduced needs more code.

--
Mats
07-25-2007
King Mir

Quote:

Originally Posted by matsp

Since "new" doesn't quite do the same thing as "malloc"[1], what do you expect "renew" to do?

If you consider something like this:

Code:

class blah
{
 public:
  blah() { a = new int[20]; .... }
 private:
  int *a;
};
...
class blah *p = new blah[100];
....
p = renew blah[200];
....

Should we call the constructor for blah 100 times or 200 times in the second instance? What do we do with the content of the first 100 instances of *a - allocate new ones, copy the existing ones or something else?

This is exactly why renew would be useful. Renew would only reconstruct the objects that it needs to. This would be like vector, except that resizing the vector beyond capacity does not necessarily require the vector to copy elements if the operating system can spare the extra space after the internal dynamic array. It leaves the decision of whether or not to copy the existing elements over up to the operating system, which knows more about the available memory than the program.

Quote:

If you have a need to "resize" your datablocks, why not allocate them in "chunks" so that you get some spare size left over, and then fill that out until you get to the "size" and then do your own "renew" by allocating a new array (or whatever it is you're dealing with) and then copying the old content into the new one.

Because sometimes you need sequential memory. Vectors are sometimes preferred over deques. Vector and vector-like objects would benefit from renew.

Quote:

As for how to pick a "slightly larger size", perhaps picking the next (1 << n) that is bigger than the requested size.

Using a fixed addition works well if you often only add a little bit, but if you don't know what size you're going to grow the block of memory to, then you may find that a "double it each time" is a better way [2].

That's the problem. Increasing the capacity of the vector by a little bit should not require copying every element over most of the time.
07-25-2007
matsp

Quote:

Originally Posted by King Mir

This is exactly why renew would be useful. Renew would only reconstruct the objects that it needs to. This would be like vector, except that resizing the vector beyond capacity does not necessarily require the vector to copy elements if the operating system can spare the extra space after the internal dynamic array. It leaves the decision of whether or not to copy the existing elements over up to the operating system, which knows more about the available memory than the program.

Most operating systems don't do realloc natively - it's done in the C-library, and it's done by looking at the internal data in the heap, to see if there's enough space to expand it [e.g. we asked for a block of 4K, but the heap contained one that is bigger, so the "rest" of that block is free - realloc can just change the size of the block. If it's unable to expand "in situ", it does a fresh malloc and copies the data anyways].

Quote:

Because sometimes you need sequential memory. Vectors are sometimes preferred over deques. Vector and vector-like objects would benefit from renew.

That's the problem. Increasing the capacity of the vector by a little bit should not require copying every element over most of the time.

It's just a case of "where do you solve the problem" as well as "how do you know what needs to be done". I'm not clued up enough about how vector is implemented, but I'm pretty sure it doesn't specifically say that you couldn't allocate "spare area" whenever it needs to grow, and thus only actually grow it when it runs out of spare area - selecting increasingly larger spare area when it's growing larger. Not saying this is how vector is implemented at the moment, but that it COULD BE done. And it certainly can be done without using "renew".

--
Mats
07-25-2007
laserlight

Quote:

This is exactly why renew would be useful. Renew would only reconstruct the objects that it needs to. This would be like vector, except that resizing the vector beyond capacity does not necessarily require the vector to copy elements if the operating system can spare the extra space after the internal dynamic array. It leaves the decision of whether or not to copy the existing elements over up to the operating system, which knows more about the available memory than the program.

What I am curious about is whether it is possible to write an allocator for vector that tries to grow the vector without copying existing elements to a newly allocated block unless absolutely necessary, keeping in mind that the vector's elements are expected to be contiguous. It does not sound possible to me, but then I am a beginner at allocators and sophisticated memory management in general.

Quote:

I'm pretty sure it doesn't specifically say that you couldn't allocate "spare area" whenever it needs to grow, and thus only actually grow it when it runs out of spare area - selecting increasingly larger spare area when it's growing larger.

That is precisely what vector is expected to do, hence the member functions capacity(), reserve(), resize() and size(). I believe some existing implementations do double the capacity when it is maxed out.
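A small sketch of how capacity() and reserve() interact, assuming nothing beyond the standard std::vector interface. The helper name count_reallocations is made up:

```cpp
#include <cstddef>
#include <vector>

// Counts how often the capacity changes while n ints are appended to a
// vector that first reserved room for reserve_first elements.
std::size_t count_reallocations(std::size_t n, std::size_t reserve_first) {
    std::vector<int> v;
    v.reserve(reserve_first);
    std::size_t reallocations = 0;
    std::size_t cap = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != cap) {  // the vector moved to a bigger block
            ++reallocations;
            cap = v.capacity();
        }
    }
    return reallocations;
}
```

With count_reallocations(1000, 1000) there are no reallocations at all, since reserve() already guarantees room for 1000 elements. Without the reserve the count depends on the implementation's growth policy, but it should be far below 1000.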
07-25-2007
King Mir

Quote:

Originally Posted by matsp

Most operating systems don't do realloc natively - it's done in the C-library, and it's done by looking at the internal data in the heap, to see if there's enough space to expand it [e.g. we asked for a block of 4K, but the heap contained one that is bigger, so the "rest" of that block is free - realloc can just change the size of the block. If it's unable to expand "in situ", it does a fresh malloc and copies the data anyways].

Right, that's what realloc does. A renew function/operation would do the same thing but with constructors.

The OS may allocate 4K to the program. But the C++ code cannot directly access that 4K; it relies on calls to malloc, calloc, new, etc. Thus if you have a size 100 dynamic array, which uses objects with nontrivial constructors, and you want to increase the size by 1, there is currently no way to do it without copying the entire array over and deleting the old array. This is inefficient.

Quote:

It's just a case of "where do you solve the problem" as well as "how do you know what needs to be done". I'm not clued up enough about how vector is implemented, but I'm pretty sure it doesn't specifically say that you couldn't allocate "spare area" whenever it needs to grow, and thus only actually grow it when it runs out of spare area - selecting increasingly larger spare area when it's growing larger. Not saying this is how vector is implemented at the moment, but that it COULD BE done. And it certainly can be done without using "renew".

Vector itself can be implemented to work like realloc because there is no requirement that vector is written in C++ (as long as it behaves like it's written in C++). More to the point is how to write a more specialized version of vector that uses realloc-like functionality in C++ for our own purposes.
07-25-2007
CornedBee

Quote:

Originally Posted by laserlight

What I am curious about is whether it is possible to write an allocator for vector that tries to grow the vector without copying existing elements to a newly allocated block unless absolutely necessary, keeping in mind that the vector's elements are expected to be contiguous. It does not sound possible to me, but then I am a beginner at allocators and sophisticated memory management in general.

It's not possible. There were proposals to make it possible, but I don't know about their status.

Quote:

That is precisely what vector is expected to do, hence the member functions capacity(), reserve(), resize() and size(). I believe some existing implementations do double the capacity when it is maxed out.

In fact, doubling the capacity is required behaviour for push_back if the vector is full, and no implementation I know does anything different for other insertion functions.
07-25-2007
siavoshkc

I won't repeat what King Mir said.

Quote:

In that case I would use a std::vector

Have you ever thought about how vector grows?

Quote:

Using a fixed addition works well if you often only add a little bit, but if you don't know what size you're going to grow the block of memory to, then you may find that a "double it each time" is a better way [2].

It is not good. Example:

You need to get some numbers from input. Initially you make space for 100 items. But there are 122 items! If you double the allocation, you will have room for 200 items. 200 - 122 = 78
78 units of memory are wasted.

Quote:

...then do your own "renew" by allocating a new array (or whatever it is you're dealing with) and then copying the old content into the new one.

Of course I can do my own renew if I could write the realloc function by myself!

renew should first look at the memory allocated by the last new. If it has more room at its tail then that space will be allocated. In this case the pointer does not change. Else it should find a contiguous free space of the size we want and copy the contents of our array to this new location, delete the old array, return the address of this new location.
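The steps above can be sketched on top of a toy allocator. Everything here (the Arena class, its 1024-byte buffer, the member names) is made up for illustration; it only hands out raw bytes, ignores alignment, and never gives memory back:

```cpp
#include <cstddef>
#include <cstring>

// Toy bump allocator over a fixed buffer, for illustration only.
class Arena {
public:
    Arena() : buf_(), used_(0), last_(nullptr), last_size_(0) {}

    // Hands out the next n bytes, or nullptr if the buffer is exhausted.
    char* allocate(std::size_t n) {
        if (n > sizeof(buf_) - used_) return nullptr;
        last_ = buf_ + used_;
        last_size_ = n;
        used_ += n;
        return last_;
    }

    // realloc-like "renew": extend in place when p is the most recent block
    // and there is room at its tail; otherwise take a new block and copy.
    char* renew(char* p, std::size_t old_size, std::size_t new_size) {
        if (p == last_ && old_size == last_size_ && new_size >= old_size &&
            new_size - old_size <= sizeof(buf_) - used_) {
            used_ += new_size - old_size;
            last_size_ = new_size;
            return p;  // the pointer does not change
        }
        char* q = allocate(new_size);
        if (q != nullptr) {
            std::memcpy(q, p, old_size < new_size ? old_size : new_size);
        }
        return q;  // the old block is simply abandoned in this toy arena
    }

private:
    char buf_[1024];
    std::size_t used_;
    char* last_;
    std::size_t last_size_;
};
```

Growing the most recent block returns the same pointer. As soon as another allocation sits behind it, renew has to fall back to allocate-and-copy, which is the same thing realloc ends up doing. Note that this only works for raw bytes; it says nothing about what to do with constructors.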
07-25-2007
matsp

There's no guarantee that the "behind the scenes" allocation isn't larger than what you need anyways. It's obviously not necessary to DOUBLE the allocation either - adding 50% or some other proportion would work too. If you grow by 25% at "time to grow", you'd end up with 125 items on the growth, and only waste 3 items of memory if you have 122 items actually in use - but then you get 4 times as many copies if you grow many times.
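A small sketch of that trade-off, with the growth factor given as a fraction num/den (2/1 for doubling, 5/4 for 25%). The struct and function names are made up for illustration:

```cpp
#include <cstddef>

struct GrowthResult {
    std::size_t capacity;   // final capacity
    std::size_t regrowths;  // how many times the buffer had to grow
};

// Grows from `start` until the capacity is at least `needed`, multiplying
// the capacity by num/den (integer arithmetic, rounded down) on each step.
GrowthResult grow_until(std::size_t start, std::size_t needed,
                        std::size_t num, std::size_t den) {
    GrowthResult r = {start, 0};
    while (r.capacity < needed) {
        std::size_t next = r.capacity * num / den;
        if (next <= r.capacity) next = r.capacity + 1;  // always make progress
        r.capacity = next;
        ++r.regrowths;
    }
    return r;
}
```

For the 122-item example, grow_until(100, 122, 2, 1) ends at a capacity of 200 and grow_until(100, 122, 5, 4) at 125, one regrowth each. For a large target such as 100000 items, doubling needs 10 regrowths and the 25% policy needs noticeably more, which is where the extra copying comes from.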

It is the usual tradeoff between "memory and processing power" - you either waste CPU-cycles, but save memory or use less memory-space, but use more CPU-cycles. If you want to know exactly how many items you need, you could read your file twice, once to figure out how many items you have, and the second time to actually fill it in - that's probably not a good idea, but it could be done.

realloc also copies the data sometimes, and it's only because of "overallocation" in the first place that realloc is successful without copying (or that you ONLY have one set of malloc/realloc, so your entire heap is able to grow whenever you realloc - this is unlikely to be the case in any decent-size C++ project, since C++ is very much oriented towards using dynamic memory).

Also, if you want to, you can implement your own new operator - either for ALL objects or for certain object types. With that, you could perhaps also add functionality to support "renew" for simple cases. The complex cases still need more knowledge of the data than you have available at that point.
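A minimal sketch of a per-type operator new, assuming a C++11 compiler. The type Tracked and its counter are made up; the counter is only there to make the calls visible:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type with its own allocation functions.
struct Tracked {
    static int live_allocations;
    int value;

    // Used by `new Tracked` instead of the global operator new.
    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        ++live_allocations;
        return p;
    }

    // Used by `delete ptr` for Tracked objects.
    static void operator delete(void* p) noexcept {
        if (p != nullptr) {
            --live_allocations;
            std::free(p);
        }
    }
};

int Tracked::live_allocations = 0;
```

Since the class now controls where its memory comes from, it could over-allocate or keep bookkeeping that a home-made "renew" for that type might use. The array forms (operator new[] and operator delete[]) are separate functions and are not replaced by this sketch.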

--
Mats
07-25-2007
iMalc

Quote:

Originally Posted by CornedBee

In fact, doubling the capacity is required behaviour for push_back if the vector is full, and no implementation I know does anything different for other insertion functions.

I assume you mean multiplying the size by some constant > 1, as resizing to 1.5x the size is also acceptable afaik.
07-25-2007
CodeMonkey

Isn't std::vector usually implemented to avoid copying whenever possible? Judging by what I am reading -- no. Yet the overhead involved in the checks necessary seems like it would be trivial.
07-25-2007
Daved

Yes and no. I doubt it is so low level that it is able to do the equivalent of a realloc when it needs more space. However, it attempts to minimize reallocations and is generally pretty efficient at it.

In addition, the new standard will allow move semantics which would have the potential to make reallocations even less expensive.
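A sketch of what that looks like with a compiler that supports the newer standard (C++11 or later is assumed here). The type Counted and the helper force_reallocation are made up for illustration:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how it gets relocated.
struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }  // noexcept lets vector prefer moving
};

int Counted::copies = 0;
int Counted::moves = 0;

// Fills a vector with n elements, resets the counters, then forces exactly
// one reallocation by asking for more capacity than is available.
void force_reallocation(std::size_t n) {
    std::vector<Counted> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) v.emplace_back();
    Counted::copies = 0;
    Counted::moves = 0;
    v.reserve(v.capacity() + 1);  // existing elements go to the new block
}
```

After force_reallocation(10) the counters show 10 moves and no copies. The elements still end up in a new block, but for types that own resources (strings, other vectors) a move is usually much cheaper than a deep copy.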
07-25-2007
CornedBee

Quote:

Originally Posted by iMalc

I assume you mean multiplying the size by some constant > 1, as resizing to 1.5x the size is also acceptable afaik.

Peculiarly enough, I've just scoured both the C++98 and the C++09 drafts and cannot find any mention of the allocation complexity of repeated inserts or push_backs for vectors. In other words, as far as I can see it's legal for a vector to just reallocate for a single additional element every time push_back is called. The only place where a reallocation complexity is specified is for the range constructor (two iterators), which says that, if the iterators are input iterators, the operation will take at most order of log(N) reallocations, where N is distance(first, last). In other words, this constructor requires increments of a constant factor > 1.

I must be wrong about this. I'm absolutely sure this is required for other operations, too, but I just can't find the place.
07-25-2007
Daved

>> I'm absolutely sure this is required for other operations, too
I've never heard that. I've always been under the impression that a naive "always increase capacity by one" implementation was perfectly standard.
07-25-2007
anon

Quote:

Originally Posted by Daved

>> I'm absolutely sure this is required for other operations, too
I've never heard that. I've always been under the impression that a naive "always increase capacity by one" implementation was perfectly standard.

The standard says that push_back would be an equivalent of insert(end(), x). It also says:

Quote:

In addition, it supports (amortized) constant time insert and erase operations at the end; insert and erase in the middle take linear time.

Now, if you were to copy things around for each push_back, that wouldn't be constant time insert at the end. It would be linear time.
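To spell out the arithmetic behind that, here is a rough count of element copies for n calls to push_back. It assumes every reallocation copies all current elements and that the capacity starts at 1; the symbols n, m and f are introduced only for this sketch:

```latex
% Growing the capacity by one on every push_back:
% the k-th push_back copies the k-1 elements already stored.
\sum_{k=1}^{n} (k-1) = \frac{n(n-1)}{2}
\quad\Rightarrow\quad \text{about } \frac{n}{2} \text{ copies per push\_back (linear)}

% Doubling the capacity: reallocations happen at sizes 1, 2, 4, \dots, 2^m
% with 2^m < n, and each one copies the current elements.
\sum_{i=0}^{m} 2^{i} = 2^{m+1} - 1 < 2n
\quad\Rightarrow\quad \text{fewer than } 2 \text{ copies per push\_back (amortized constant)}

% Any fixed factor f > 1 (ignoring rounding), with f^m < n:
\sum_{i=0}^{m} f^{i} = \frac{f^{m+1} - 1}{f - 1} < \frac{f}{f-1}\, n
```

So any constant factor greater than 1 gives an amortized constant cost, and only the constant changes with the factor.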
07-26-2007
iMalc

Quote:

Originally Posted by anon

The standard says that push_back would be an equivalent of insert(end(), x). It also says:

Now, if you were to copy things around for each push_back, that wouldn't be constant time insert at the end. It would be linear time.

That's it exactly. It wouldn't state the method used to achieve amortised constant time push_back, as they tend to like to leave that to the implementation details. But then there's probably only one way to achieve it anyway. So I guess growing by size multiplication instead of addition is implied.

Well, back to the original topic, one can easily write their own container that uses VirtualAlloc and Placement new etc, but why bother! Profile first!