Thread: Assignment Operator, Memory and Scope

  1. #1
    Registered User
    Join Date
    Apr 2007
    Posts
    141

    Assignment Operator, Memory and Scope

    Although I am using a garbage collector for this project, I've run into a couple of interesting situations that make me wonder exactly what C++ does when it leaves scope with regard to freeing memory or calling a class destructor. See the comments in the code.

    Code:
    MyClass &func1()
    {
    
    /* MyClass has some internal variables allocated with new.  This in theory allocates
    a large buffer internally. */
    MyClass *fp = new MyClass(1000000) ;
    /* a convenient variable for referencing fp. Does fpref get deleted when
    func1 returns and the destructor called invalidating the internal memory of *fp?  */
    MyClass &fpref = *fp ;
    /* the () operator is overloaded */
    fpref(7) = 8 ;
    
    return(*fp) ;
    }
    
    void func2(MyClass2 &x)
    {
    /* memory for v1 will be released and its destructor called, right?
    But the assignment operator is a copying operator, copying all internal variables
    after using new to create new data.  Thus this creates a memory leak right?  The original
    class created in func1 via the new operator now has no reference */
    MyClass v1 = func1()  ;
    
    /* MyClass2 has as one of its members a MyClass object */
    x.myclass = v1 ;  // so we create yet another copy of v1
    
    }
    It seems like C++ code has a bit of a conundrum with regard to passing class data around from function to function. If a class has internal data allocated via new in its constructor, then the assignment operator should probably be overloaded to allocate the internal data again and copy it from the right-hand-side operand. Otherwise you run into bugs caused by surprising multiple references to the same data.

    However, this appears to create some problems. First, it at times generates unnecessary copies of memory. If func1() returns a large object, it appears that it gets copied twice for no particularly good reason. Moreover, doesn't it create a big memory leak on its first assignment?
    MyClass v1 = func1() ;
    v1 gets copied from the output of func1() due to the assignment operator's deep copy, correct? Thus the original reference is lost.

    I've also noticed that with the compiler I'm using (MS Visual Studio 2008), IntelliSense tagging doesn't work for pointers to classes. For example, if I've overloaded the () operator, I'm not sure if something like
    (*fp)(7) is legal code. However, I'm concerned that if I create a reference to *fp, e.g. fpref, it will get deleted and its destructor called at the return of the function, thereby invalidating *fp's data.

    Is it a true statement, therefore, that in order for me to avoid excessive copying or to prevent automatic freeing of my data, all references to a class must be carried in a pointer in a case like this? If so, a lot of syntactic sugar (operator overloads) gets messed up.

  2. #2
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by SevenThunders View Post
    It seems like C++ code has a bit of a conundrum with regard to passing class data around from function to function. If a class has internal data allocated via new in its constructor, then the assignment operator should probably be overloaded to allocate the internal data again and copy it from the right-hand-side operand. Otherwise you run into bugs caused by surprising multiple references to the same data.
    Yes, this is one of the usual considerations in a class which allocates a bare resource.

    However, this appears to create some problems. First, it at times generates unnecessary copies of memory. If func1() returns a large object, it appears that it gets copied twice for no particularly good reason. Moreover, doesn't it create a big memory leak on its first assignment?
    Yes, and yes. All common issues.

    v1 gets copied from the output of func1() due to the assignment operator's deep copy, correct? Thus the original reference is lost.
    My preferred strategy is to use containers to hold resources which do the right thing when they are "deep copied." In other words, design your class so that the default deep copy does the right thing.

    I've also noticed that with the compiler I'm using (MS Visual Studio 2008), IntelliSense tagging doesn't work for pointers to classes. For example, if I've overloaded the () operator, I'm not sure if something like
    (*fp)(7) is legal code.
    It is legal code.

    However, I'm concerned that if I create a reference to *fp, e.g. fpref, it will get deleted and its destructor called at the return of the function, thereby invalidating *fp's data.
    Creating a reference could never cause such an effect.

    Is it a true statement, therefore, that in order for me to avoid excessive copying or to prevent automatic freeing of my data, all references to a class must be carried in a pointer in a case like this? If so, a lot of syntactic sugar (operator overloads) gets messed up.
    That's not the only way. Invent a container which does what you need, and use it.

  3. #3
    Registered User
    Join Date
    Apr 2007
    Posts
    141
    Quote Originally Posted by brewbuck View Post

    That's not the only way. Invent a container which does what you need, and use it.
    Can you give me a barebones, simple example of what you mean by this? The word container connotes a few ideas in my mind, not all of which are consistent. I am encouraged that variables defined as references with the & do not call their destructors (if I understand you correctly).

  4. #4
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    Your example is very complicated. In C++, there are much simpler ways to accomplish similar goals.

    >> If a class has internal data allocated via new in its constructor, then the assignment operator
    >> should probably be overloaded to include allocating internal data again and copying it from the
    >> data available from the right hand side operand.
    This is called the rule of 3. If your class requires a destructor, copy assignment operator, or copy constructor, it usually requires all three. However, good practice often means that you should not require any of the three, and the compiler generated versions should suffice.
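
    As a rough illustration of the rule of three described above, here is a minimal sketch. The Buffer class and the check function are hypothetical names made up for this example; they are not the MyClass from the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class that owns a raw array and therefore needs all three:
// destructor, copy constructor and copy assignment operator.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new double[n]()) {}

    // Copy constructor: allocate a fresh array and copy the elements.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new double[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy assignment: make the copy first, then swap it in. The old
    // array is released when tmp goes out of scope.
    Buffer& operator=(const Buffer& rhs) {
        if (this != &rhs) {
            Buffer tmp(rhs);
            std::swap(size_, tmp.size_);
            std::swap(data_, tmp.data_);
        }
        return *this;
    }

    ~Buffer() { delete[] data_; }

    double& operator()(std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    double* data_;
};

// A copy must have its own storage, so writes do not leak across objects.
void check_copy_is_independent() {
    Buffer a(3);
    a(0) = 1.5;
    Buffer b = a;   // copy constructor
    b(0) = 2.5;
    assert(a(0) == 1.5);
    assert(b(0) == 2.5);
}
```

    If the array were held in a std::vector<double> instead, none of the three would need to be written by hand.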

    I'm a little confused about your other points, but I think it stems from your overcomplicated example. In C++, you encapsulate resources in classes that manage them. For example, you don't allocate a dynamically sized array with new; you use vector. The vector knows how to copy itself correctly.

    If you want to avoid copying where it isn't necessary, you use some other smart container that copies itself without copying the data. For example, you can use a shared_ptr that keeps a reference count and only destroys the allocated memory when the last reference is destroyed.
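
    A minimal sketch of that idea, assuming a compiler that provides std::shared_ptr (boost::shared_ptr behaves much the same for this purpose). SharedArray and make_array are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical handle type: copying it copies the pointer, not the data.
struct SharedArray {
    std::shared_ptr<std::vector<double> > data;

    explicit SharedArray(std::size_t n)
        : data(std::make_shared<std::vector<double> >(n)) {}

    double& operator()(std::size_t i) { return (*data)[i]; }
};

// Returning by value is cheap here: only the handle is copied or moved.
SharedArray make_array() {
    SharedArray a(1000);
    a(7) = 8.0;
    return a;
}

void check_copies_share_storage() {
    SharedArray v1 = make_array();
    SharedArray v2 = v1;          // second handle to the same buffer
    v2(7) = 9.0;
    assert(v1(7) == 9.0);         // the write is visible through v1
    assert(v1.data.use_count() == 2);
}   // buffer is freed here, when the last handle is destroyed
```

    Note the trade-off: copies alias each other, so this fits only when shared data is what you actually want.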

    >> I'm not sure if something like (*fp)(7) is legal code.
    It is legal.

    >> However I'm concerned that if I create a reference to *fp, e.g. fpref, that it will get deleted
    >> and its destructor called at the return of the function, thereby invalidating *fp's data.
    No, it won't. A reference is just a reference, it will not cause the destructor to be called for the referenced object.
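
    To make that concrete, here is a small sketch with a hypothetical Tracked class that counts destructor calls. It also shows that (*fp)(7) compiles and calls the overloaded operator().

```cpp
#include <cassert>

// Hypothetical class that counts how many times its destructor has run.
struct Tracked {
    static int destroyed;
    int value;
    Tracked() : value(0) {}
    ~Tracked() { ++destroyed; }
    int& operator()(int) { return value; }   // index ignored in this sketch
};
int Tracked::destroyed = 0;

Tracked& make_tracked() {
    Tracked* fp = new Tracked;
    Tracked& fpref = *fp;   // just another name for *fp
    fpref(7) = 8;
    (*fp)(7) += 1;          // legal: dereference, then call operator()
    return *fp;             // fpref going out of scope destroys nothing
}

void check_reference_does_not_destroy() {
    const int before = Tracked::destroyed;
    Tracked& r = make_tracked();
    assert(Tracked::destroyed == before);      // object is still alive
    assert(r(7) == 9);
    delete &r;                                 // explicit cleanup
    assert(Tracked::destroyed == before + 1);
}
```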

  5. #5
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by SevenThunders View Post
    Can you give me a barebones, simple example of what you mean by this? The word container connotes a few ideas in my mind, not all of which are consistent. I am encouraged that variables defined as references with the & do not call their destructors (if I understand you correctly).
    Think of std::vector<T>. It holds objects for you. When you assign it or copy construct it, it duplicates all of its contents. When you delete it (or it goes out of scope), it automatically destructs all of its contents and frees its buffer. Other containers might do other sorts of things during assignment, copy construction, or destruction. The idea is to push the work as far down as possible, preferably into generalized containers, so that you can stop writing assignment operators, copy constructors, and destructors all over the place.
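
    A minimal sketch of pushing the work down into std::vector. Matrix is a hypothetical class for illustration; it declares no destructor, copy constructor or assignment operator of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class: the vector member handles allocation, copying
// and cleanup, so the compiler-generated operations are enough.
struct Matrix {
    std::vector<double> cells;
    explicit Matrix(std::size_t n) : cells(n, 0.0) {}
};

void check_vector_member_deep_copies() {
    Matrix a(4);
    Matrix b = a;                        // vector duplicates its contents
    b.cells[0] = 1.0;
    assert(a.cells[0] == 0.0);           // a is unaffected
    assert(&a.cells[0] != &b.cells[0]);  // separate buffers
}   // both buffers are freed automatically here
```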

    Other examples of containers are smart pointers (which contain and manage raw pointers), std::i/ofstream, which manages a file resource, std::string, which manages character strings, all the usual STL containers, etc.

  6. #6
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    >> Can you give me a barebones, simple example of what you mean by this?
    I don't think you need to invent a container. I think boost's shared_ptr or shared_array might work if you want copies to share the internal data.

  7. #7
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Daved View Post
    >> Can you give me a barebones, simple example of what you mean by this?
    I don't think you need to invent a container. I think boost's shared_ptr or shared_array might work if you want copies to share the internal data.
    Obviously you should use the available containers when they do what you need. But if you find yourself repeating the same pattern over and over, think about creating a container which implements that behavior.

  8. #8
    Registered User
    Join Date
    Apr 2007
    Posts
    141
    OK you bring up some interesting points here. I could have used vector as the baseline array inside my class. One reason why I didn't is that the garbage collector I'm using would not be aware of vector. The other reason is that performance is critical and I need to be able to pass raw pointers to arrays to low level C code.

    If what you all are telling me is correct, the compiler-generated copy constructor and assignment operator will call each member's own copy constructor and assignment operator if my class contains members that are themselves classes or containers (e.g. std::vector<T>).
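
    That understanding can be checked with a small sketch. Member and Outer are hypothetical types; Outer declares no copy operations, so the compiler generates them member by member.

```cpp
#include <cassert>

// Hypothetical member type that counts calls to its copy operations.
struct Member {
    static int copy_ctor_calls;
    static int copy_assign_calls;
    Member() {}
    Member(const Member&) { ++copy_ctor_calls; }
    Member& operator=(const Member&) { ++copy_assign_calls; return *this; }
};
int Member::copy_ctor_calls = 0;
int Member::copy_assign_calls = 0;

// No user-written copy constructor or assignment operator.
struct Outer {
    Member m;
};

void check_memberwise_copy() {
    const int ctor_before = Member::copy_ctor_calls;
    const int assign_before = Member::copy_assign_calls;

    Outer a;
    Outer b = a;   // generated copy constructor calls Member's
    assert(Member::copy_ctor_calls == ctor_before + 1);

    Outer c;
    c = a;         // generated assignment calls Member's operator=
    assert(Member::copy_assign_calls == assign_before + 1);
}
```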

    It's still odd to me that if a function returns a reference to a class created with new, you can so easily create a memory leak. If you write code to overload the assignment operator (=), does the left-hand side call the default constructor to initialize itself first? I.e., in a statement of the form

    Code:
    MyClass v1 = func1() ;
    where func1() returns a reference to MyClass.

    If I define the overload function
    Code:
    class MyClass {
    public:
    double *arr ;
    .....
    MyClass& MyClass::operator=(const MyClass &rhs) {
    
       /* is  *this created by a call to the default constructor
         MyClass() ?   */
        
    
        // Only do assignment if RHS is a different object from this.
        if (this != &rhs) {
           if (this->arr == NULL) {
            // uninitialized case, just do a shallow copy
          
           }  else {
          ... // Deallocate, allocate new space, copy values...
          }
        }
    
        return *this;
      }
    My thought would be to detect the case wherein the variables in MyClass are uninitialized by forcing the default or no argument constructor to null out the pointers. This would then allow me to test for this case and to do a shallow copy in this case. If I think about it, almost all my uses of = are for initializing new class variables.

    Wait a minute. I think this is unnecessary as well. If I understand what you've told me, then a declaration of the form
    Code:
    MyClass &v1 = func1() ;
    will have the effect of not calling the constructor, copy assignment or even the destructor for v1 ! If I use a garbage collector, then the memory leak created by not destroying v1 will be removed anyway.
    Last edited by SevenThunders; 03-26-2008 at 05:58 PM.

  9. #9
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by SevenThunders View Post
    OK you bring up some interesting points here. I could have used vector as the baseline array inside my class. One reason why I didn't is that the garbage collector I'm using would not be aware of vector. The other reason is that performance is critical and I need to be able to pass raw pointers to arrays to low level C code.
    You can do that with vectors. See: Chapter 78. Use vector (and string::c_str) to exchange data with non-C++ APIs

  10. #10
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    In this example:
    Code:
    MyClass v1 = func1();
    The copy constructor is called.

    However, you mentioned that you are returning a reference. If you are attempting to make v1 be a reference, then you should make it a reference:
    Code:
    MyClass& v1 = func1() ;
    That avoids all copying. You can then call delete &v1 to free the memory (unless you allow your garbage collector to clean it up).

    If you left the code as this:
    Code:
    MyClass v1 = func1();
    then the copy constructor is called rather than the copy assignment operator that you showed. You have to implement both if you implement either. That's the rule of 3 I mentioned earlier (with the third being the destructor).

    Also, that code causes a memory leak (without garbage collection) that the version with the reference does not because you are losing all references to the allocated memory.
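
    A sketch of the difference, using a hypothetical Probe class that counts which copy operation runs. make_probe plays the role of func1() from the thread. Writing Probe v1 = make_probe(); directly would also run the copy constructor, but the address of the heap object would then be lost.

```cpp
#include <cassert>

// Hypothetical class that records which copy operation was used.
struct Probe {
    static int copy_constructed;
    static int copy_assigned;
    Probe() {}
    Probe(const Probe&) { ++copy_constructed; }
    Probe& operator=(const Probe&) { ++copy_assigned; return *this; }
};
int Probe::copy_constructed = 0;
int Probe::copy_assigned = 0;

// Like func1(): returns a reference to an object allocated with new.
Probe& make_probe() { return *new Probe; }

void check_initialization_vs_assignment() {
    const int ctor_before = Probe::copy_constructed;
    const int assign_before = Probe::copy_assigned;

    Probe& ref = make_probe();   // binding a reference copies nothing
    assert(Probe::copy_constructed == ctor_before);

    Probe v1 = ref;              // initialization: copy constructor
    assert(Probe::copy_constructed == ctor_before + 1);
    assert(Probe::copy_assigned == assign_before);

    v1 = ref;                    // existing object: copy assignment
    assert(Probe::copy_assigned == assign_before + 1);

    delete &ref;                 // possible because ref kept the address
}
```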

  11. #11
    Registered User
    Join Date
    Apr 2007
    Posts
    141
    OK thanks, that's what I'm just figuring out. On a side note, if I have a destructor that deletes an array

    Code:
    ~MyClass() {
    delete [] arr ;
    }
    Can I assume that this won't throw an exception if arr is NULL? That's the defined behavior in ANSI C if I call free(arr). However, I don't know if this carries over to C++.

  12. #12
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    >> Can I assume that this won't throw an exception if arr=NULL?
    Yes, that is a safe assumption.
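
    For illustration, a minimal sketch with a hypothetical Holder class whose default constructor nulls the pointer. Applying delete [] to a null pointer does nothing, so the destructor needs no check.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class: arr may be null when the destructor runs.
struct Holder {
    double* arr;

    Holder() : arr(NULL) {}
    explicit Holder(std::size_t n) : arr(new double[n]) {}
    ~Holder() { delete[] arr; }   // no effect when arr is null

private:
    // Copying is disabled so two Holders can never delete the same array.
    Holder(const Holder&);
    Holder& operator=(const Holder&);
};

void check_destroying_empty_holder() {
    {
        Holder empty;             // arr is null
        assert(empty.arr == NULL);
    }                             // destructor runs delete[] on null: fine
    {
        Holder full(10);
        full.arr[0] = 1.0;
        assert(full.arr[0] == 1.0);
    }                             // destructor frees the array
}
```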

    By the way, it is doubtful that proper use of a vector will be slower than the dynamic array. Plus, you get the added advantage of not having to worry about the destructor, copy constructor and copy assignment operator. I'd encourage you to consider using it while you are working on that class. As mentioned, it can even be passed as a C-style array to legacy code (use &vec[0], where vec is a non-empty vector).
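
    A minimal sketch of the &vec[0] technique. sum_array is a hypothetical stand-in for a legacy C routine that takes a pointer and a length; it is not part of any real library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for low-level C code that expects a raw array.
double sum_array(const double* p, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];
    }
    return total;
}

// Pass the vector's contiguous storage to the C-style routine.
// &v[0] is only valid for a non-empty vector, hence the guard.
double sum_vector(const std::vector<double>& v) {
    return v.empty() ? 0.0 : sum_array(&v[0], v.size());
}

void check_vector_as_c_array() {
    std::vector<double> v(3, 2.0);
    assert(sum_vector(v) == 6.0);
    assert(sum_vector(std::vector<double>()) == 0.0);
}
```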

  13. #13
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Read this for more benefits of using vectors: Chapter 76. Use vector by default. Otherwise, choose an appropriate container
    vector alone is:
    • Guaranteed to have the lowest space overhead of any container (zero bytes per object).
    • Guaranteed to have the fastest access speed to contained elements of any container.
    • Guaranteed to have inherent locality of reference, meaning that objects near each other in the container are guaranteed to be near each other in memory, which is not guaranteed by any other standard container.
    • Guaranteed to be layout-compatible with C, unlike any other standard container. (See Items 77 and 78)
    • Guaranteed to have the most flexible iterators (random access iterators) of any container.
    • Almost certain to have the fastest iterators (pointers, or classes with comparable performance that often compile away to the same speed as pointers when not in debug mode), faster than those of all other containers.

  14. #14
    Registered User
    Join Date
    Apr 2007
    Posts
    141
    Quote Originally Posted by cpjust View Post
    Nice plug for the standard then. Actually, for my local matrix computations I'm using a library (cvm) that is std library friendly, and I think it uses an appropriate such container as its base array type, probably std::vector.

    For interfacing to a C library I wrote last year and as a multi-D array type for interfacing to other nasty legacy C code, I spun my own class. This class is managed by a garbage collector and is the type I typically return when I pass around a lot of data. It requires that my multi dimensional array deploy a flat single dimensional array instead of the usual C or C++ technique of an array of arrays.

    It doesn't make any sense to try to add GC to any of the standard library code, and I'm already assuming the use of a GC, so it doesn't make sense at this point to change my array type for the multi-D array to std::vector<double>. I'm leaving it as double * allocated by new (GC) (using the Hans Boehm garbage collector) for now.

    We can have altogether another argument about the merits of using GC in C++. I think I've had some fun with that on another thread.

  15. #15
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    I'm not very familiar with garbage collection in C++. Why do you need the delete [] arr if you're using garbage collection?
