Thread: Copy constructors in Expression templates

  1. #1
    Registered User
    Join Date
    Aug 2010
    Location
    India
    Posts
    4


    Hello all,

    I tried to use expression templates to perform arithmetic operations on matrices of complex numbers. Since I am just learning expression templates, I thought I would write my own templates for both arrays and complex numbers. The files main.cpp, Complex.h and Matrix.h are attached.

    Code:
    	cout << "\n" << "expression ((m1 * m2) + (m1 * m2))" << endl;
    	((m1 * m2) + (m1 * m2));
    	
    	cout << "\n" << "expression ((m1 * m2) + (m1 * m2)): display with .show() function" << endl;
    	((m1 * m2) + (m1 * m2)).show();
    	
    	cout << "\n" << "expression ((m1 * m2) + (m1 * m2)): access one element with operator()" << endl;
    	cout << ((m1 * m2) + (m1 * m2))(0,0) << endl;
    The above three lines of code are the ones I have a query about. m1 and m2 are matrices of complex numbers whose real and imaginary parts are floats.

    I need to interpret the results of these lines of code. The first line gives me this:

    expression ((m1 * m2) + (m1 * m2))
    Constructor Mat_dotProduct(ExprTLeft& x, ExprTRight& y) called
    Constructor Mat_dotProduct(ExprTLeft& x, ExprTRight& y) called
    Constructor Mat_add(ExprTLeft& x, ExprTRight& y) called
    Destructor ~Mat_add() called
    Destructor ~Mat_dotProduct() called
    Destructor ~Mat_dotProduct() called
    This statement performs no computation and prints no values. So my question is: does it only create an expression tree?
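
    A minimal sketch of that situation, using hypothetical Vec and AddExpr types rather than the classes in the attached Matrix.h: operator+ only builds a small node holding references to its operands, and an element is computed only when operator[] is called on the node. The counter g_element_evals is also an assumption added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration; not the classes from the attached Matrix.h.
static int g_element_evals = 0;  // how many result elements have actually been computed

struct Vec {
    std::vector<double> data;
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// One node of the expression tree: it stores references to its operands
// and computes nothing when it is constructed.
template <typename L, typename R>
struct AddExpr {
    const L& lhs;
    const R& rhs;

    AddExpr(const L& l, const R& r) : lhs(l), rhs(r) {
        assert(l.size() == r.size());  // operands must have the same length
    }

    // The arithmetic happens here, one element at a time, only on request.
    double operator[](std::size_t i) const {
        ++g_element_evals;
        return lhs[i] + rhs[i];
    }

    std::size_t size() const { return lhs.size(); }
};

// Building the expression just constructs the node.
inline AddExpr<Vec, Vec> operator+(const Vec& a, const Vec& b) {
    return AddExpr<Vec, Vec>(a, b);
}
```

    With two Vec objects a and b, writing `auto e = a + b;` leaves g_element_evals at 0, and reading `e[1]` raises it to 1. The node keeps references, so a and b must outlive e.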

    This brings us to the next question regarding the second line of code that gives this output:

    expression ((m1 * m2) + (m1 * m2)): display with .show() function
    Constructor Mat_dotProduct(ExprTLeft& x, ExprTRight& y) called
    Constructor Mat_dotProduct(ExprTLeft& x, ExprTRight& y) called
    Constructor Mat_add(ExprTLeft& x, ExprTRight& y) called
    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    (20.0224, 27.1327)
    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    (4.56637, 29.1327)

    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    (27.164, 37.3056)
    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    (11.708, 39.3056)

    Expression has 2 rows and 2 columns

    Destructor ~Mat_add() called
    Destructor ~Mat_dotProduct() called
    Destructor ~Mat_dotProduct() called
    Here the values of the final array are sent to the screen. However, every time an element of the expression tree is accessed (or parsed, or executed; please tell me the correct word), a copy constructor is called.

    The third line of code that accesses only one element of the expression gives this:
    expression ((m1 * m2) + (m1 * m2)): access one element with operator()
    Constructor Mat_dotProduct(ExprTLeft& x, ExprTRight& y) called
    Constructor Mat_dotProduct(ExprTLeft& x, ExprTRight& y) called
    Constructor Mat_add(ExprTLeft& x, ExprTRight& y) called
    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex() constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    Complex(const Complex &cmplx) copy constructor is called
    (20.0224, 27.1327)

    Destructor ~Mat_add() called
    Destructor ~Mat_dotProduct() called
    Destructor ~Mat_dotProduct() called
    Again, all these copy constructors for accessing one element?

    If the goal of expression templates is to reduce the number of temporary objects being created, it is a little surprising to see so many copy constructors when trying to use this concept. Is it that expression templates are used with arrays here instead of scalars and so the benefit is lost?

    Can anyone point me to something I might be missing?

    Thank you in advance

  2. #2
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    The thing about expression templates is that you can't measure how well they work by observing them in anything other than fully optimised builds with no debugging aids added whatsoever. Your only measures of effectiveness are profiling and looking at the resulting assembly. By their nature they involve more copy-construction, but in such a way that it is optimised out better and results in a net win. That said...

    Remove the Complex copy constructor. It does not matter how many times it is called provided the compiler can optimise it down to nothing. It won't be able to do that as-is, but it probably can with the auto-generated one.
    Remove the Complex assignment operator. The compiler generates an identical one for you and does a better job of it.
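
    To illustrate the difference, here is a small sketch with two hypothetical stand-ins for Complex (the real class is in the attached Complex.h, so the member names here are assumptions). The type traits show that the version without user-written copy operations is trivially copyable, which is what gives the optimiser the most freedom, while the logging version is not.

```cpp
#include <iostream>
#include <type_traits>

// Hypothetical stand-in: no copy constructor or assignment operator declared,
// so the compiler generates trivial ones.
struct ComplexImplicit {
    float re = 0.0f;
    float im = 0.0f;
};

// Hypothetical stand-in: user-provided copy constructor that logs.
// The I/O is an observable side effect, so a call can only vanish
// where the language permits copy elision.
struct ComplexLogging {
    float re = 0.0f;
    float im = 0.0f;

    ComplexLogging() = default;
    ComplexLogging(const ComplexLogging& other) : re(other.re), im(other.im) {
        std::cout << "copy constructor is called\n";
    }
};

// Checked at compile time; the program does not build if any of these is false.
static_assert(std::is_trivially_copy_constructible<ComplexImplicit>::value,
              "implicit copy constructor is trivial");
static_assert(std::is_trivially_copy_assignable<ComplexImplicit>::value,
              "implicit copy assignment is trivial");
static_assert(!std::is_trivially_copy_constructible<ComplexLogging>::value,
              "a user-provided copy constructor is never trivial");
```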

    Don't use "this->" everywhere. It defeats the purpose of it being implicit.

    Remove the empty destructors from Mat_add, Mat_dotProduct and BinaryOp. The compiler generates these better by itself, and it then makes no difference how many times they would theoretically be invoked, because they result in no extra machine instructions.

    Declaring show() as returning "const void" is redundant, as is the "return" at the end of create_matrix.

    Remove the timing information from places such as inside the Matrix copy constructor and assignment operator. The only place for timing this stuff is outside of the class.
    Nothing in the class should be using cout.
    Memory allocation failures are not supposed to be handled locally in this manner. If it can't allocate memory then it is up to the caller to decide how to handle that.

    Once you've done all that, you need a version of the code that doesn't use expression templates, and then compare the fully optimised generated assembly with that.

  3. #3
    Registered User
    Join Date
    Aug 2010
    Location
    India
    Posts
    4
    Thanks for the suggestion. We tried to modify the files and they are attached - Matrix_Templates.h with expression templates and Matrix_Operators.h without expression templates.

    We tried to compare the speed for a simple for loop:
    Code:
    for (int i = 0; i < 10000; i++) {
        m3 = ((m1 * m2) + (m1 * m2));
    }
    Where m1, m2 and m3 are square matrices. But we got some results we can't understand:

    1. When m1, m2, m3 are 3x3 matrices, the expression template method is twice as fast as the method without expression templates.

    2. When m1, m2, m3 are 30x30 matrices, the speeds of both methods are the same.

    We would have thought that expression templates would give better results for larger matrices, but it seems that for larger matrices expression templates have no effect. So why is it that expression templates work only for smaller matrices?

    Thanks in advance.

  4. #4
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    You're assuming expression templates allow a compiler to perform miracles.

    In the limit, assuming you're using the same basic algorithm, code that uses expression templates and code that doesn't will tend to converge on similar speed. With general square matrices, the limiting case is governed by the fact that matrix multiplication takes O(n*n*n) numerical multiplications, and matrix addition takes O(n*n) additions. Those need to be performed, and the best optimisation that can be performed is minimising overhead on top of those operations.
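
    A rough operation count makes that convergence concrete. The sketch below is a simplified model, not a measurement: it assumes the schoolbook product, counts element-level operations for (m1 * m2) + (m1 * m2) when each operator returns a full matrix, and ignores allocation cost and cache effects. OpCount, count_plain_operators and temporary_share are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type; not part of the attached headers.
struct OpCount {
    std::size_t multiplications;     // element multiplications
    std::size_t additions;           // element additions
    std::size_t temporary_elements;  // elements written into intermediate matrices
};

// Cost of (m1 * m2) + (m1 * m2) for n x n matrices with plain operators
// that each return a full matrix.
inline OpCount count_plain_operators(std::size_t n) {
    assert(n > 0);
    OpCount c{};
    c.multiplications = 2 * n * n * n;          // two products, n^3 each
    c.additions = 2 * n * n * (n - 1) + n * n;  // inner sums of both products, plus the matrix add
    c.temporary_elements = 3 * n * n;           // two product temporaries and one sum temporary
    return c;
}

// Temporary traffic relative to unavoidable arithmetic; shrinks roughly as 1/n.
inline double temporary_share(std::size_t n) {
    const OpCount c = count_plain_operators(n);
    const double arithmetic = static_cast<double>(c.multiplications + c.additions);
    return static_cast<double>(c.temporary_elements) / arithmetic;
}
```

    Under this model the temporaries amount to 27 element writes against 99 arithmetic operations for n = 3, but only 2700 against 107100 for n = 30. Removing the temporaries is therefore a much smaller share of the total work for the larger matrices.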

    There will be some opportunities for optimisation based on avoiding cache misses, but those opportunities are highly machine dependent.

    The fact you are logging calls to constructors and destructors slows your code down: the execution time is limited by the I/O not by CPU cycles. The only way the compiler can reduce that, with your code, is by eliminating actual invocations of constructors and destructors (eg by eliminating temporary objects).
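
    One way to keep the diagnostic without the I/O is to count calls instead of printing them. This is only a sketch with a hypothetical CountedComplex type and helper functions, not the Complex class from the attached files.

```cpp
#include <cassert>

// Hypothetical instrumented type: the counter replaces the cout calls.
struct CountedComplex {
    static long copies;  // incremented by the copy constructor instead of printing
    float re;
    float im;

    CountedComplex(float r = 0.0f, float i = 0.0f) : re(r), im(i) {}
    CountedComplex(const CountedComplex& other) : re(other.re), im(other.im) { ++copies; }
    CountedComplex& operator=(const CountedComplex&) = default;
};
long CountedComplex::copies = 0;

// Takes its operands by value on purpose, so every call copies both of them.
inline CountedComplex add_by_value(CountedComplex a, CountedComplex b) {
    return CountedComplex(a.re + b.re, a.im + b.im);
}

// Takes const references, so the operands are not copied.
inline CountedComplex add_by_ref(const CountedComplex& a, const CountedComplex& b) {
    return CountedComplex(a.re + b.re, a.im + b.im);
}

// Returns how many more copies one by-value call makes than one by-reference call.
inline long extra_copies_by_value() {
    CountedComplex x(1.0f, 2.0f);
    CountedComplex y(3.0f, 4.0f);
    const long start = CountedComplex::copies;
    add_by_ref(x, y);
    const long after_ref = CountedComplex::copies;
    add_by_value(x, y);
    const long after_val = CountedComplex::copies;
    assert(after_ref >= start && after_val >= after_ref);
    return (after_val - after_ref) - (after_ref - start);
}
```

    Reading CountedComplex::copies once after the timed loop gives the same information as the log lines while keeping I/O out of the measurement.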
