Assignment Operator, Memory and Scope

**SevenThunders** · 03-27-2008

Originally Posted by cpjust

But to do it properly, you'd need to run the destructors for all the objects in the huge block of memory you're deleting, so other than extra function call overheads, I don't see how it could make that much of a difference. Besides, then whenever the GC starts its deletion cycle, you'd have to wait a long time until all the memory is freed instead of just a lot of little waits by cleaning up memory as you go.

For memory issues, not necessarily. If you had built-in GC, you might never need any destructors! Of course, there are other things that need finalizing besides memory. Java has finalizers if you really feel you must associate some task with the freeing of memory. D has scope(exit), which is cleaner in some ways.
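For readers who haven't seen D's scope(exit), here is a rough C++ approximation of the idea, tying a non-memory cleanup step to scope exit. This is only a sketch: `ScopeExit` and `run_with_cleanup` are hypothetical names made up for illustration, not part of any library mentioned in this thread.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: runs a callable when the enclosing scope exits,
// roughly what D's scope(exit) does.
class ScopeExit {
public:
    explicit ScopeExit(std::function<void()> f) : f_(std::move(f)) {
        assert(f_ && "cleanup callable must not be empty");
    }
    ~ScopeExit() { if (f_) f_(); }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

private:
    std::function<void()> f_;
};

// Records the order of events so the cleanup timing is visible.
std::vector<std::string> run_with_cleanup() {
    std::vector<std::string> log;
    {
        log.push_back("acquire");
        ScopeExit guard([&log] { log.push_back("release"); });
        log.push_back("work");
    }  // guard's destructor runs here, also if "work" had thrown
    log.push_back("after scope");
    return log;
}
```

The cleanup runs at a known point whether or not memory is garbage collected, which is the part a finalizer cannot promise.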

I will quote from D's page on garbage collection with regards to performance. I actually think they overstate their case a bit:

C and C++ programmers accustomed to explicitly managing memory allocation and deallocation will likely be skeptical of the benefits and efficacy of garbage collection. Experience both with new projects written with garbage collection in mind, and converting existing projects to garbage collection shows that:

* Garbage collected programs are faster. This is counterintuitive, but the reasons are:
o Reference counting is a common solution to solve explicit memory allocation problems. The code to implement the increment and decrement operations whenever assignments are made is one source of slowdown. Hiding it behind smart pointer classes doesn't help the speed. (Reference counting methods are not a general solution anyway, as circular references never get deleted.)
o Destructors are used to deallocate resources acquired by an object. For most classes, this resource is allocated memory. With garbage collection, most destructors then become empty and can be discarded entirely.
o All those destructors freeing memory can become significant when objects are allocated on the stack. For each one, some mechanism must be established so that if an exception happens, the destructors all get called in each frame to release any memory they hold. If the destructors become irrelevant, then there's no need to set up special stack frames to handle exceptions, and the code runs faster.
o All the code necessary to manage memory can add up to quite a bit. The larger a program is, the less in the cache it is, the more paging it does, and the slower it runs.
o Garbage collection kicks in only when memory gets tight. When memory is not tight, the program runs at full speed and does not spend any time freeing memory.
o Modern garbage collectors are far more advanced now than the older, slower ones. Generational, copying collectors eliminate much of the inefficiency of early mark and sweep algorithms.
o Modern garbage collectors do heap compaction. Heap compaction tends to reduce the number of pages actively referenced by a program, which means that memory accesses are more likely to be cache hits and less swapping.
o Garbage collected programs do not suffer from gradual deterioration due to an accumulation of memory leaks.
* Garbage collectors reclaim unused memory, therefore they do not suffer from "memory leaks" which can cause long running applications to gradually consume more and more memory until they bring down the system. GC programs have longer term stability.
* Garbage collected programs have fewer hard-to-find pointer bugs. This is because there are no dangling references to freed memory. There is no code to explicitly manage memory, hence no bugs in such code.
* Garbage collected programs are faster to develop and debug, because there's no need for developing, debugging, testing, or maintaining the explicit deallocation code.
* Garbage collected programs can be significantly smaller, because there is no code to manage deallocation, and there is no need for exception handlers to deallocate memory.

There's also the issue about total memory usage. If your program is a big memory hog, it will continue to grow until the system starts running low on RAM and starts paging out some of your data to disk. If everybody was using GC and using a lot of memory, this situation would occur even faster. Sure you could say install more RAM, but most of it is just junk waiting to be garbage collected, so why force people to install more RAM when they shouldn't have to?

GC is not as memory efficient; I have no argument there. It is improving, however, as the algorithms improve. It used to be that you needed twice as much memory on average to run efficiently, but that overhead is now much reduced. You can look at the literature to follow what's going on, I suppose.

**Elysia** · 03-28-2008

Originally Posted by SevenThunders

I'm talking about two types of errors that are usually only found at runtime: dangling pointers (pointing to already freed blocks) and memory leaks. These two types of errors go away with GC, which is one of the major attractions of the technology.

Both of these go away with smart pointers, as well.
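One caveat, already raised in the D quote above, is circular references under reference counting. The sketch below assumes `std::shared_ptr` and `std::weak_ptr` (the standardized form of the Boost/TR1 smart pointers); `Node` and both function names are made up for illustration.

```cpp
#include <cassert>
#include <memory>

struct Node {
    static int alive;            // live-instance counter, for illustration only
    std::shared_ptr<Node> next;  // owning link
    std::weak_ptr<Node> back;    // non-owning link
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Two nodes that own each other: the reference counts never reach zero.
int leaked_after_strong_cycle() {
    const int before = Node::alive;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;                  // cycle of owning pointers
        assert(a.use_count() == 2);   // one from 'a', one from b->next
    }
    return Node::alive - before;      // both nodes are still alive
}

// Same shape, but the back link is weak, so ownership is acyclic.
int leaked_after_weak_back_link() {
    const int before = Node::alive;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->back = a;                  // weak link does not keep 'a' alive
    }
    return Node::alive - before;      // nothing left behind
}
```

So the leak does go away with smart pointers, provided every cycle is broken by hand with a weak link; a tracing GC does not need that step.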

For apps with large blocks of persistent data and/or that rarely use new and/or delete, I agree with you. For apps that have a lot of small blocks of data to allocate and delete, I'm not so sure. Let's agree to disagree on this.

Can't say I have any proof, but it still sounds bogus to me.

My interest in threads is primarily for multiprocessor systems and partitioning computational cycles. Usually you can partition the data along with the computations; otherwise you use locking mechanisms like you describe. I like FIFOs for asynchronous communication between threads, but then I often think about how hardware would do things instead of software. By the way, even at the hardware level it is theoretically impossible to create race-free asynchronous data transfer. You can only reduce the probability to close to 0.

It's still impossible to use a multi-threaded GC without any sort of locks or atomic operations because it can move data around and delete them without warning.

I know that there are multithreaded versions of various GCs. Boehm has one, and how he manages it is an interesting question. I suppose there would be a few ways to do it. You could either clone the GC into each thread, or let the GC run in its own thread and queue services. It's probably an interesting research topic all on its own.

And aside from that, the way you describe it, "when memory runs tight," the garbage collector kicks in. And if it's single threaded, you're going to get quite a slowdown. Myself, I tend to prefer when apps spread the work out more thinly, so that execution may be a little slower, but there is no big "gap" where it just stops and does a lot of maintenance. That slightly slower code will get faster with faster processors and you will notice nothing in the end. But with big gaps, when the app freezes... well, it's obviously not going away anytime soon.

I'll have to think about this statement a bit. Obviously Java does window management and they have a GC, and of course there are windowing APIs for many other garbage collected languages. I think if you are stuck with the API of an existing C++ library that doesn't use garbage collection, you are probably correct. You are relying on destructors to do more than free memory; you are probably tearing down other resources. You mess with that logic at your peril. I'm curious what sort of garbage collector you tried to write, and why?

No, I do mean the Windows API. Tell me if you find something that can deallocate a lot of allocated memory in one sweep. It has to be able to deallocate all the memory for, say, 1000 allocations or more, perhaps spanning several pages.
VirtualFree does not apply, since it can only release a region through the base pointer originally returned by VirtualAlloc, or else decommit the pages and leave them reserved, which is out of the question.
I tried to write one as a test to combine GC's advantages (group deallocation) and C++'s smart pointers to run destructors at appropriate times.

Also it makes sense at first to use object oriented code for windows management. After all each of those windows and dialogues have their own state, and that's how the libraries are structured. I have found personally some problems with the paradigm however as the complexity increases. Control flow can become nightmarish. Just what method am I executing now? If I want to add behavior X which darn object and wretched method do I use or overload to implement it?

You make it sound worse than it is. This is a matter of good design and has nothing to do with GCs in general.

**CornedBee** · 03-28-2008

No, I do mean the Windows API. Tell me if you find something that can deallocate a lot of allocated memory in one sweep. It has to be able to deallocate all the memory for, say, 1000 allocations or more, perhaps spanning several pages.

You can't use the Windows stock allocator in this case, of course. The garbage collector needs its own allocator, which requests memory in large blocks from the system (via VirtualAlloc) and hands it out to the application in small chunks.
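To make that concrete, here is a minimal sketch of an allocator that takes one large block and hands it out in small chunks, then drops everything in one sweep. It assumes `std::malloc` as a portable stand-in for the system call (on Windows the block would come from VirtualAlloc instead); `Arena` and `demo_arena` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical arena: one big block from the system, handed out in small chunks.
// It never runs destructors, so it only suits trivially destructible data.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : base_(static_cast<char*>(std::malloc(capacity))),
          capacity_(base_ ? capacity : 0),
          used_(0) {}
    ~Arena() { std::free(base_); }  // the whole block goes back in one call
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Bump allocation; returns nullptr when the block is exhausted.
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t start = (used_ + align - 1) / align * align;
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        used_ = start + size;
        return base_ + start;
    }

    void reset() { used_ = 0; }  // recycle every chunk without a system call
    std::size_t used() const { return used_; }

private:
    char* base_;
    std::size_t capacity_;
    std::size_t used_;
};

// Allocates 1000 small objects, then releases them all in one step.
std::size_t demo_arena() {
    Arena arena(64 * 1024);
    for (int i = 0; i < 1000; ++i) {
        int* p = static_cast<int*>(arena.allocate(sizeof(int)));
        assert(p != nullptr);
        *p = i;
    }
    const std::size_t used_before_reset = arena.used();
    arena.reset();
    return used_before_reset;
}
```

A real collector would also have to decide which chunks are still live before reusing the block; this sketch only shows the allocation side.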

**SevenThunders** · 03-28-2008

Originally Posted by Elysia

It's still impossible to use a multi-threaded GC without any sort of locks or atomic operations because it can move data around and delete them without warning.

A conservative garbage collector won't touch memory locations once allocated (e.g. the Boehm collector). Of course once freed, who cares what happens to the data. To avoid locks with a compacting collector you'd have to let the thread that owns the data do the moving, and probably leave shared data alone. Just a thought. Hopefully I won't be writing my own GC, I'll let the experts do that. Perhaps using locking is not so bad in this case anyway if you use an incremental collector. Only certain memory blocks would be locked out at any given time. I just don't like it because I see locking as an immediate performance hit.

And aside from that, the way you describe it, "when memory runs tight," the garbage collector kicks in. And if it's single threaded, you're going to get quite a slowdown. Myself, I tend to prefer when apps spread the work out more thinly, so that execution may be a little slower, but there is no big "gap" where it just stops and does a lot of maintenance. That slightly slower code will get faster with faster processors and you will notice nothing in the end. But with big gaps, when the app freezes... well, it's obviously not going away anytime soon.

These days it's hardly noticeable. Depending on your app, a collection takes milliseconds. And the more advanced collectors that collect one region at a time can guarantee a maximum return time from a collection. It's not much of an issue any more. Though I think Java probably underperforms here, since their GC is not very good. Perhaps things have changed.

No, I do mean the Windows API. Tell me if you find something that can deallocate a lot of allocated memory in one sweep. It has to be able to deallocate all the memory for, say, 1000 allocations or more, perhaps spanning several pages.
VirtualFree does not apply, since it can only release a region through the base pointer originally returned by VirtualAlloc, or else decommit the pages and leave them reserved, which is out of the question.
I tried to write one as a test to combine GC's advantages (group deallocation) and C++'s smart pointers to run destructors at appropriate times.

The last time I used the Windows API directly I was calling it from assembly language, so I'm no expert in this area. However, ultimately your GC will be used in place of malloc or new and will allocate its own large buffer via an appropriate API call, say GlobalAlloc() if you want your memory to be relocatable. As such, your allocator may never actually call GlobalFree().

Memory is freed into your allocator and kept in your own private free pointer list. So you can do any darn thing you want actually within the bounds of your operating system and how you design your GC or other type of memory manager. (e.g. a fixed block size manager). The performance of your memory management is not dependent on GlobalAlloc() and GlobalFree() since they will be called infrequently.
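Something like the following is what I have in mind for a fixed block size manager with a private free list. Treat it as a sketch under assumptions: `FixedPool` is a made-up name, and a `std::vector` stands in for the big buffer that would really come from the OS once.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-block-size manager. Freed blocks go onto a private
// free list and are reused; nothing is handed back to the system until
// the pool itself is destroyed.
class FixedPool {
    struct FreeNode { FreeNode* next; };

public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : units_per_block_(units_for(block_size)),
          storage_(units_per_block_ * block_count),
          free_list_(nullptr),
          free_count_(0) {
        // Thread every block onto the free list, lowest address first out.
        for (std::size_t i = block_count; i-- > 0;) {
            release(&storage_[i * units_per_block_]);
        }
    }

    // Pops a block off the free list; nullptr when the pool is exhausted.
    void* allocate() {
        if (!free_list_) return nullptr;
        FreeNode* node = free_list_;
        free_list_ = node->next;
        --free_count_;
        return node;
    }

    // Pushes a block back; the block's own bytes store the list link.
    void release(void* block) {
        assert(block != nullptr);
        free_list_ = new (block) FreeNode{free_list_};
        ++free_count_;
    }

    std::size_t free_blocks() const { return free_count_; }

private:
    // Round the block size up to whole max_align_t units (at least one),
    // so every block is aligned and large enough to hold a FreeNode.
    static std::size_t units_for(std::size_t bytes) {
        const std::size_t unit = sizeof(std::max_align_t);
        return bytes == 0 ? 1 : (bytes + unit - 1) / unit;
    }

    std::size_t units_per_block_;
    std::vector<std::max_align_t> storage_;
    FreeNode* free_list_;
    std::size_t free_count_;
};
```

Both allocate and release are a couple of pointer operations, which is why the cost of the underlying system call stops mattering.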

You make it sound worse than it is. This is a matter of good design and has nothing to do with GCs in general.

It is according to the dogma of object oriented programming. Don't get me wrong, I like the paradigm myself, but it's just one programming paradigm. There are many others, like functional programming as an example, which has something much more powerful than mere objects and classes, namely parameterized type classes, though this too can be emulated via C++ templates.
http://www.cs.uwyo.edu/~skothari/cppvshaskell.pdf

Another interesting way to do things is to associate data with functions (coroutines, or function delegates) instead of functions with data (methods). There are also things like data flow machines, logic programming, constraint driven programming etc. etc.

To be honest, some of these things are interesting intellectual exercises, but I'm ultimately only concerned with two things, performance and productivity. There is also the issue of correctness, which is a whole other nasty technical issue. The most productive language I've programmed in is Matlab. I'm waiting for the language with similar productivity but the performance of C.

C++ is about 10 times slower to write than Matlab, even with operator overloading and its huge array of built-in libraries. In fact, as we speak I'm translating about 200 lines of Matlab code into about a thousand lines of C++.

**Elysia** · 03-30-2008

Originally Posted by CornedBee

You can't use the Windows stock allocator in this case, of course. The garbage collector needs its own allocator, which requests memory in large blocks from the system (via VirtualAlloc) and hands it out to the application in small chunks.

I already thought of that, and it has its downsides.
Allocating in small chunks has the advantage that I can release a lot of memory (since there's a good chance that a whole page of memory can be released, rather than having to wait on a huge block). However, it comes with the drawback that it kind of defeats the purpose of the GC: it becomes slow, and deallocation is done in small chunks as well.
However, using large chunks has its own drawback. Since VirtualFree can only free the base pointer allocated via VirtualAlloc, the entire big block of memory needs to be freed at once. But there's no telling if the application uses, say, 100 integers instead of one big block of data. The drawback is that as long as even one integer remains in that big block of memory, I can't free it. So the memory becomes extremely fragmented this way.
Of course, this can probably be fixed by defragmenting the memory and moving things around, but at what cost? All pointers need to be updated and maintenance work needs to be done.
This, again, introduces extra complexity I did not want to introduce (read: I wanted it to be a simple garbage collector).

Originally Posted by SevenThunders

The last time I used the Windows API directly I was calling it from assembly language, so I'm no expert in this area. However, ultimately your GC will be used in place of malloc or new and will allocate its own large buffer via an appropriate API call, say GlobalAlloc() if you want your memory to be relocatable. As such, your allocator may never actually call GlobalFree().

Memory is freed into your allocator and kept in your own private free pointer list. So you can do any darn thing you want actually within the bounds of your operating system and how you design your GC or other type of memory manager. (e.g. a fixed block size manager). The performance of your memory management is not dependent on GlobalAlloc() and GlobalFree() since they will be called infrequently.

Yeah, I know how it works... in theory and some experience. But GlobalAlloc is hardly a suitable API for a garbage collector. The point was that there was no suitable API for my own GC.

It is according to the dogma of object oriented programming. Don't get me wrong, I like the paradigm myself, but it's just one programming paradigm. There are many others, like functional programming as an example, which has something much more powerful than mere objects and classes, namely parameterized type classes, though this too can be emulated via C++ templates.
http://www.cs.uwyo.edu/~skothari/cppvshaskell.pdf

Well, my point was that it has nothing to do with garbage collectors in general... Which way of programming you use doesn't affect whether you use a GC or not...

I'm not really interested in arguing the pros and cons about garbage collector vs smart pointers, but I'm not going to believe a garbage collector is faster than smart pointers unless I have some hard proof. And even then, I'm thinking the lines will blur together, as there are some areas where a GC might be better than smart pointers and the other way around in some areas.

**CornedBee** · 03-30-2008

Your worries are not GC-specific. The operating system allocates memory to processes in pages. That's a simple fact, whether you use automatic or manual memory management. Whether there's a single int on that page or it's all covered in a block of binary data doesn't matter - the page is bound to your process.

**Elysia** · 03-31-2008

Yes, but then again, if I allocated big chunks of memory using VirtualAlloc, I can't just free individual pages on that chunk. I can only decommit them and retain them as reserved (which doesn't lower the memory usage of the program).
And allocating just a page now and then defeats the purpose of a GC. Or at least an efficient one.

**CornedBee** · 03-31-2008

(which doesn't lower the memory usage of the program).

Yes, it does. The moment you decommit a page, the OS is free to allocate it to a different process.

**Elysia** · 03-31-2008

OK, wait, this is a little confusing.
There are two ways of freeing committed memory:

MEM_DECOMMIT
0x4000
Decommits the specified region of committed pages. After the operation, the pages are in the reserved state.

The function does not fail if you attempt to decommit an uncommitted page. This means that you can decommit a range of pages without first determining the current commitment state.

Do not use this value with MEM_RELEASE.

MEM_RELEASE
0x8000
Releases the specified region of pages. After this operation, the pages are in the free state.

If you specify this value, dwSize must be 0 (zero), and lpAddress must point to the base address returned by the VirtualAlloc function when the region is reserved. The function fails if either of these conditions is not met.

If any pages in the region are committed currently, the function first decommits, and then releases them.

The function does not fail if you attempt to release pages that are in different states, some reserved and some committed. This means that you can release a range of pages without first determining the current commitment state.

Do not use this value with MEM_DECOMMIT.

The MEM_DECOMMIT flag leaves the memory in a reserved state (thus it can't be reallocated to another process). This can be done on any memory region.
MEM_RELEASE completely releases the memory and makes it available to other processes. However, it can only release the memory indicated by the base pointer allocated via VirtualAlloc. You can't specify size or region.

**matsp** · 03-31-2008

From your MSDN quote:

If any pages in the region are committed currently, the function first decommits, and then releases them.

So, yes, I think it will be available to another process.

--
Mats

**Elysia** · 03-31-2008

I quoted wrongly. MSDN's doc is confusing. Decommit is not associated with releasing the memory - it only leaves the pages in a reserved state.

**CornedBee** · 03-31-2008

You're confusing virtual address space pages and physical memory pages.
http://msdn2.microsoft.com/en-us/lib...16(VS.85).aspx

**matsp** · 03-31-2008

Reserved state means that the virtual space is reserved, but that the physical memory backing that virtual space is freely available to the OS to use. So for the overall system it's free to use that physical memory, whilst the app itself would require the memory to be "unreserved" as well as decommitted for the virtual space to be available to another application. But if you can just re-use ("recommit") the page, then that's not a problem in itself. It's only causing problems if you have a pathological case where the contiguous memory is not sufficient to satisfy the size, whilst you have sufficient free memory. But this has nothing to do with GC, but with heap-fragmentation.

--
Mats
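The reserved/committed distinction described above can be sketched as a small state model. This is a toy only: it does not call VirtualAlloc or VirtualFree, it just mirrors the rules from the MSDN text quoted earlier, and `Region`, `PageState`, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: nothing here calls the Windows API.
enum class PageState { Free, Reserved, Committed };

class Region {
public:
    // Models VirtualAlloc with MEM_RESERVE: addresses claimed, nothing backed yet.
    explicit Region(std::size_t pages) : pages_(pages, PageState::Reserved) {}

    // Models VirtualAlloc with MEM_COMMIT on a sub-range of the reservation.
    bool commit(std::size_t first, std::size_t n) {
        return set(first, n, PageState::Committed);
    }

    // Models VirtualFree with MEM_DECOMMIT: any sub-range, pages drop to Reserved.
    bool decommit(std::size_t first, std::size_t n) {
        return set(first, n, PageState::Reserved);
    }

    // Models VirtualFree with MEM_RELEASE: whole region only, no size argument.
    void release() { pages_.assign(pages_.size(), PageState::Free); }

    // Pages that still need physical memory or pagefile backing.
    std::size_t committed() const { return count(PageState::Committed); }

    // Pages of address space still held inside this process.
    std::size_t address_space_held() const {
        return pages_.size() - count(PageState::Free);
    }

private:
    bool set(std::size_t first, std::size_t n, PageState s) {
        if (first > pages_.size() || n > pages_.size() - first) return false;
        for (std::size_t i = first; i < first + n; ++i)
            if (pages_[i] == PageState::Free) return false;  // already released
        for (std::size_t i = first; i < first + n; ++i) pages_[i] = s;
        return true;
    }

    std::size_t count(PageState s) const {
        std::size_t n = 0;
        for (PageState p : pages_)
            if (p == s) ++n;
        return n;
    }

    std::vector<PageState> pages_;
};

// Commit 16 pages, decommit the middle 8: backing drops, addresses stay held.
std::size_t committed_after_partial_decommit() {
    Region heap(16);
    bool ok = heap.commit(0, 16);
    assert(ok);
    ok = heap.decommit(4, 8);
    assert(ok);
    (void)ok;
    assert(heap.address_space_held() == 16);
    return heap.committed();
}
```

In this model the decommitted pages stop counting toward `committed()` while the address range stays held, which is the case where the page can simply be recommitted later.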

**Elysia** · 03-31-2008

Well, what can I say? It's good to be corrected.
Perhaps there is hope, after all...
But VirtualAlloc is still a pain to use. I don't think anything can get around that...