Thread: Compile time of C versus C++ code

  1. #1
    Registered User
    Join Date
    Feb 2008
    Posts
    28

    Compile time of C versus C++ code

    Hello all,
    Just joined this forum and its nice to be here.

    I am an electrical engineer and do circuit simulations by writing code. So far I have written my code in C. I am thinking of converting the code to C++. I began by writing a C++ code to simulate a simple resistive inductive circuit. I defined a class called matrix and using operator overloading defined all the matrix operations. I used Runge Kutta 4 th order method to solve the ODE performing a transient simulation.

    I find that the C++ program runs 3 times slower than the C program. And this is with basic code. Anyone else wish to share a similar experience with me?

    Thanks in advance.

  2. #2
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Quote Originally Posted by circuitbreaker View Post
    I find that the C++ program runs 3 times slower than the C program. And this is with basic code.
    Without looking at code I doubt anyone can help you. However, I decided to quote the bit above hoping you realize that would make C++ a very useless language and that the problem almost certainly has to do with how you code the program, and not the language you used.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  3. #3
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by circuitbreaker View Post
    I find that the C++ program runs 3 times slower than the C program. And this is with basic code. Anyone else wish to share a similar experience with me?
    Possibilities:

    1. Your C++ compiler sucks
    2. You have somehow set a very bad combination of optimizations
    3. You are using STL without optimization

    Have you tried compiling the C code with the C++ compiler for comparison (i.e. not your rewrite, the original code)?

    EDIT: I see you are using overloaded operators for you matrices and vectors? Make sure these functions are declared inline, and make sure you turn on optimization. Some of the features of C++ are low performers, but only if you don't use optimization. The compiler can do amazing things if you let it.

  4. #4
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Why did you start a thread about compile-times and then talk about run-times? These are two completely different things. Which is the problem?

    I could certainly understand C++ code taking longer to compile, particularly if you made heavy use of templates and template meta-programing. However being a newbie you probably don't have a clue what these are and so surely wouldn't possibly have used them.

    If it's not at worst the same speed at run time then I'd say that being new to the language you have written very poor code and/or the compile options chosen amount to not building a very optimised program. Make sure you're not comparing debug builds.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  5. #5
    Registered User
    Join Date
    Feb 2008
    Posts
    28
    Thanks for the repsonses.
    I'll do the following:
    1. Make sure all class functions are inline.
    2. I am using the gcc compiler in linux. So I'll look at all the options particularly the optimization is turned on.
    3. My subject was misleading. Though compile times are longer, it is the tun time that is far greater. I'll post again after a couple of days.

    Thanks.

  6. #6
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by circuitbreaker View Post
    1. Make sure all class functions are inline.
    No, you shouldn't. Compilers will inline if it thinks it's worth it.
    Post your code and use a profiler to find the bottlenecks and I'm sure more than a few are willing to help.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by circuitbreaker View Post
    Thanks for the repsonses.
    I'll do the following:
    1. Make sure all class functions are inline.
    Probably not a good idea. gcc has a pretty good handle on when to inline and when not in itself.
    2. I am using the gcc compiler in linux. So I'll look at all the options particularly the optimization is turned on.
    Use -O2, -O3 or -Os - try all three and see which works best for you.
    There are some other options that may affect your codes execution time too, but just enumerating "sometimes useful" switches is meaningless, as you may be trying switches for days that aren't meaningful to your code. So seeing the code would help.

    If it's still slow, try using the -pg switch on the compile, and do "man gprof" to see how you can view the results of the profiling run.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  8. #8
    Registered User
    Join Date
    Feb 2008
    Posts
    28
    Thanks a lot for all the responses. I am using Slackware 12 with gcc version 4.1.2. I tried out most of the suggestions.
    1. I compiled the C program with g++ and it compiles faster and runs much faster than the C++ code. It is the run time that is three times faster in C than C++. The compile time is only marginally lesser in C.
    2. It tried out some of the optimization flags -O2 , -O3 and -Os. Run time remains as slow.

    So now I'll post the code.
    I appreciate all your help and will probably need a whole lot more of it


    Code:
    #include <iostream>
    #include <cmath>
    #include <fstream>
    using namespace std;
    
    #define SIZE 50
    
    class matrix {
    	int rows, columns;
    	float mat_var[SIZE][SIZE];
    public:
    	matrix() {}
    
    // Constructor
    	matrix(int x, int y) {
    		rows=x;
    		columns=y;
    	}
    
    
    // Setting the dimensions of the matrix
    	void set_mat_dimension(int x, int y) {
    		rows=x;
    		columns=y;
    		return;
    	}
    
    
    // Setting elements of the matrix.
    	void set_mat_elements(int x, int y, float element_val) {
    		mat_var[x][y]=element_val;
    		return;
    	}
    
    // Displaying the elements of the matrix.
    	void mat_show() {
    		int x, y;
    
    		for (x=0;x<rows;x++) {
    			for (y=0;y<columns;y++) {
    				cout << mat_var[x][y] << " ";
    			}
    			cout << "\n";
    		}
    
    		cout << "Matrix has " << rows << " rows and " << columns << " columns \n\n";
    
    		return;
    	}
    
    
    // Obtaining an element of the matrix.
    	float get_mat_elements(int x, int y) {
    		return mat_var[x][y];
    	}
    
    
    
    	matrix operator+(matrix mat2);
    	matrix operator-(matrix mat2);
    	friend matrix operator*(matrix mat1, matrix mat2);
    	friend matrix operator*(matrix mat1, float cnst);
    	friend matrix operator*(float cnst, matrix mat1);
    	matrix operator=(matrix mat2);
    	friend matrix inverse(matrix mat1);
    	friend matrix zeros(int x, int y);
    	friend matrix identity(int x);
    	friend void mat_show(matrix mat1);
    };
    
    
    // Matrix addition
    matrix matrix::operator+(matrix mat2) {
    	int x, y;
    	matrix temp;
    
    // Checking for compatibility.
    	if ((rows!=mat2.rows)||(columns!=mat2.columns)) {
    		cout << "Matrices are not compatible.\n";
    		exit(1);
    	}
    
    // Performing addition.
    	for (x=0;x<rows;x++) {
    		for (y=0;y<columns;y++) {
    			temp.mat_var[x][y]=mat_var[x][y]+mat2.mat_var[x][y];
    		}
    	}
    
    	temp.rows=rows;
    	temp.columns=columns;
    
    	return temp;
    }
    
    
    // Matrix Subtraction
    matrix matrix::operator-(matrix mat2) {
    	int x, y;
    	matrix temp;
    
    // Checking for compatibility.
    	if ((rows!=mat2.rows)||(columns!=mat2.columns)) {
    		cout << "Matrices are not compatible.\n";
    		exit(1);
    	}
    
    // Performing subtraction.
    	for (x=0;x<rows;x++) {
    		for (y=0;y<columns;y++) {
    			temp.mat_var[x][y]=mat_var[x][y]-mat2.mat_var[x][y];
    		}
    	}
    
    	temp.rows=rows;
    	temp.columns=columns;
    
    	return temp;
    }
    
    
    
    // Matrix multiplication.
    matrix operator*(matrix mat1, matrix mat2) {
    	int x, y, z;
    	float sum;
    	matrix temp;
    
    // Checking for compatibility - columns of left matrix equal to rows of right matrix.
    	if (mat1.columns!=mat2.rows) {
    		cout << "Matrices are not compatible.\n";
    		exit(1);
    	}
    
    // Performing multiplication.
    	for (x=0;x<mat1.rows;x++) {
    		for (z=0;z<mat2.columns;z++) {
    			sum=0.0;
    			for (y=0;y<mat1.columns;y++) {
    				sum=sum+mat1.mat_var[x][y]*mat2.mat_var[y][z];
    			}
    			temp.mat_var[x][z]=sum;
    		}
    	}
    
    // Defining dimensions of the product.
    	temp.rows=mat1.rows;
    	temp.columns=mat2.columns;
    
    	return temp;
    }
    
    
    
    // Right multipication my a constant.
    matrix operator*(matrix mat1, float cnst) {
    	int x, y;
    	matrix temp;
    
    	for (x=0;x<mat1.rows;x++) {
    		for (y=0;y<mat1.columns;y++) {
    			temp.mat_var[x][y]=mat1.mat_var[x][y]*cnst;
    		}
    	}
    
    
    // Defining dimensions of the product.
    	temp.rows=mat1.rows;
    	temp.columns=mat1.columns;
    
    	return temp;
    }
    
    
    
    // Left multipication my a constant.
    matrix operator*(float cnst, matrix mat1) {
    	int x, y;
    	matrix temp;
    
    	for (x=0;x<mat1.rows;x++) {
    		for (y=0;y<mat1.columns;y++) {
    			temp.mat_var[x][y]=mat1.mat_var[x][y]*cnst;
    		}
    	}
    
    // Defining dimensions of the product.
    	temp.rows=mat1.rows;
    	temp.columns=mat1.columns;
    
    	return temp;
    }
    
    
    
    // Assigning one matrix to another.
    matrix matrix::operator=(matrix mat2) {
    	int x, y;
    
    	for (x=0;x<mat2.rows;x++) {
    		for (y=0;y<mat2.columns;y++) {
    			mat_var[x][y]=mat2.mat_var[x][y];
    		}
    	}
    
    // Defining dimensions of the new matrix.
    	rows=mat2.rows;
    	columns=mat2.columns;
    
    	return *this;
    }
    
    
    // Creating a zero matrix of any dimention x*y.
    matrix zeros(int x, int y) {
    	matrix temp;
    	int i, j;
    
    	temp.rows=x;
    	temp.columns=y;
    	
    	for (i=0;i<x;i++) {
    		for (j=0;j<y;j++) {
    			temp.mat_var[i][j]=0.0;
    		}
    	}
    
    	return temp;
    }
    
    
    
    // Creating an identity matrix of dimension x*x.
    matrix identity(int x) {
    	matrix temp;
    
    	int i, j;
    
    	temp.rows=x;
    	temp.columns=x;
    
    	for (i=0;i<x;i++) {
    		for (j=0;j<x;j++) {
    			if (i==j) temp.mat_var[i][j]=1.0;
    			else temp.mat_var[i][j]=0.0;
    		}
    	}
    
    	return temp;
    }
    
    
    // Displaying the contents of the matrix.
    void mat_show(matrix mat1) {
    	int x, y;
    
    	for (x=0;x<mat1.rows;x++) {
    		for (y=0;y<mat1.columns;y++) {
    			cout << mat1.mat_var[x][y] << " ";
    		}
    		cout << "\n";
    	}
    
    	return;
    }
    
    
    
    // Calculating the inverse of a matrix.
    matrix inverse(matrix mat1) {
    	int x,y,z,flag;
    	float temp,swap_temp;
    	matrix temp1, temp2;
    
    // Checking if the matrix is square.
    	if (mat1.rows!=mat1.columns) {
    		cout << "Matrix is not square.\n";
    		exit(1);
    	}
    
    
    // Creating a temporary matrix equal to original matrix.
    	temp2=mat1;
    // Creating an identity matrix.
    	temp1=identity(temp2.rows);
    
    	for (x=0;x<temp2.rows;x++) {
    
    // Checking if diagonal element of row x is zero.
    		flag=1;
    		if (!temp2.mat_var[x][x]) {
    			y=x+1;
    // Checking if row y > row x has the (y,x) element as non-zero.
    			while ((flag)&&(y<=temp2.rows)) {
    
    // If row y exceeds dimensions, then matrix in not invertible.
    				if (y>=temp2.rows) {
    					cout << "Matrix is not invertible.\n";
    					exit(1);
    				}
    
    // Otherwise exchange row y and row x of temp1 and temp2.
    				if (temp2.mat_var[y][x]) {
    					for (z=0;z<temp2.columns;z++) {
    						swap_temp=temp2.mat_var[x][z];
    						temp2.mat_var[x][z]=temp2.mat_var[y][z];
    						temp2.mat_var[y][z]=swap_temp;
    						swap_temp=temp1.mat_var[x][z];
    						temp1.mat_var[x][z]=temp1.mat_var[y][z];
    						temp1.mat_var[y][z]=swap_temp;
    					}
    					flag=0;	
    				}
    				y++;	
    			}
    		}
    
    // Making diagonal element of row x unity.
    		temp=temp2.mat_var[x][x];
    		for (y=0;y<temp2.columns;y++) {
    			temp2.mat_var[x][y]=temp2.mat_var[x][y]/temp;
    			temp1.mat_var[x][y]=temp1.mat_var[x][y]/temp;
    		}
    
    // Making all other elements of column x zero by row operations.
    		for (y=0;y<temp2.columns;y++) {
    			if (y!=x) {
    				temp=temp2.mat_var[y][x];
    				for (z=0;z<temp2.columns;z++) {
    					temp2.mat_var[y][z]=temp2.mat_var[y][z]-temp*temp2.mat_var[x][z];
    					temp1.mat_var[y][z]=temp1.mat_var[y][z]-temp*temp1.mat_var[x][z];
    				}
    			}
    		}
    	}
    
    	return temp1;
    }
    
    
    
    matrix rkni(matrix u, matrix linv, matrix r, matrix i, float dt)
    {
    	matrix k1, k2, k3, k4, k5, k6, k;
    
    
    // Runge Kutta 5th order method.
    /*	k1=dt*(-1.0*(linv*(r*i))+(linv*u));
    	k2=dt*(-1.0*linv*r*(i+(1.0/4.0)*k1)+linv*u);
    	k3=dt*(-1.0*linv*r*(i+(3.0/32.0)*k1+(9.0/32.0)*k2)+linv*u);
    	k4=dt*(-1.0*linv*r*(i+(1932.0/2197.0)*k1+(-7200.0/2197.0)*k2+(7296.0/2197.0)*k3)+linv*u);
    	k5=dt*(-1.0*linv*r*(i+(439.0/216.0)*k1-8.0*k2+(3680.0/513.0)*k3-(845.0/4104.0)*k4)+linv*u);
    	k6=dt*(-1.0*linv*r*(i-(8.0/27.0)*k1+2.0*k2-(3544.0/2565.0)*k3+(1859.0/4104.0)*k4-(11.0/40.0)*k5)+linv*u);
    
    	k=((16.0/135.0)*k1+(6656.0/12825.0)*k3+(28561.0/56430.0)*k4-(9.0/50.0)*k5+(2.0/55.0)*k6); */
    
    // Runge Kutta 4th order method.
    	k1=dt*(linv*u-linv*r*i);
    	k2=dt*(linv*u-linv*r*(i+(1.0/2.0)*k1));
    	k3=dt*(linv*u-linv*r*(i+(1.0/2.0)*k2));
    	k4=dt*(linv*u-linv*r*(i+k3));
    
    	k=(k1+2.0*k2+2.0*k3+k4)*(1.0/6.0); 
    
    	return(i+k);
    }
    
    
    int main()
    {
    
    	float ra, rb, rc;
    	ra=100.0;
    	rb=100.0;
    	rc=100.0;
    
    	float la, lb, lc;
    	la=0.1;
    	lb=0.1;
    	lc=0.1;
    
    	float ua, ub, uc;
    	float umax;
    	umax=230.0*sqrt(2.0);
    	float omega;
    	float pi;
    	pi=22.0/7.0;
    	omega=100*pi;
    	float d2r;
    	d2r=pi/180.0;
    
    	matrix r(3,3), l(3,3), i(3,1), u(3,1);
    	i=zeros(3,1);
    	u=zeros(3,1);
    	l=zeros(3,3);
    	r=zeros(3,3);
    
    	l.set_mat_elements(0,0,la);
    	l.set_mat_elements(1,1,lb);
    	l.set_mat_elements(2,2,lc);
    
    	r.set_mat_elements(0,0,ra);
    	r.set_mat_elements(1,1,rb);
    	r.set_mat_elements(2,2,rc);
    	
    	float t, dt;
    	dt=10.0e-6;
    	t=0.0;
    
    	matrix linv;
    	linv=inverse(l);
    
    	ofstream sysdat("system2.dat");
    
    	for (t=0;t<0.5;t=t+dt) {
    		ua=umax*sin(omega*t);
    		ub=umax*sin(omega*t-120.0*d2r);
    		uc=umax*sin(omega*t-240.0*d2r);
    
    		u.set_mat_elements(0,0,ua);
    		u.set_mat_elements(1,0,ub);
    		u.set_mat_elements(2,0,uc);
    
    		i=rkni(u,linv,r,i,dt);
    
    		sysdat << t << " " << u.get_mat_elements(0,0) << " " << u.get_mat_elements(1,0) << " " << u.get_mat_elements(2,0) << " "
    			<< i.get_mat_elements(0,0) << " " << i.get_mat_elements(1,0) << " " << i.get_mat_elements(2,0) << "\n";
    	}
    
    	sysdat.close();
    	return 0;
    }
    Last edited by CornedBee; 02-07-2008 at 10:36 AM. Reason: Un-break layout.

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    For starters:

    1. You pass objects of class types by value instead of by (const) reference.

    2. Many of your functions return a matrix object when they could operate on the current matrix.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  10. #10
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    I tried to optimize the code a little. Using const references... the result is about 6 s vs 12.5 s.
    Most of the time now seems to be spent in memcpy. I'm thinking it's the rkni function that's the culprit.
    Maybe that's enough for you to try to optimize a little?
    Last edited by Elysia; 02-06-2008 at 12:09 PM.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  11. #11
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    You're potentially wasting a tremendous amount of memory, not to mention running time, by allocating every matrix with size SIZE*SIZE and then only using rows*columns elements of that. You should create each matrix with exactly the memory it needs, based on rows and columns, and if you need a matrix of a different size, just destroy the old one and create a new one.

    Edit: The running time penalty is probably in fewer cache hits - picture the way a 50*50 array is laid out in memory and which elements you're using if rows == columns == 3, for example.
    Last edited by robatino; 02-06-2008 at 12:18 PM.

  12. #12
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Made some changes in a similar vein to Laserlights comments, and the time is about 2.2x faster than the original code. Main changes are:

    Code:
    	matrix operator+(const matrix &mat2) const ;
    	matrix operator-(const matrix &mat2) const ;
    	friend matrix operator*(const matrix &mat1, const matrix &mat2);
    	friend matrix operator*(const matrix &mat1, float cnst);
    	friend matrix operator*(float cnst, const matrix &mat1);
    	matrix operator=(const matrix &mat2);
    	friend matrix inverse(const matrix &mat1);
    	friend matrix zeros(int x, int y);
    	friend matrix identity(int x);
    	friend void mat_show(const matrix &mat1);
    And of course related changes in the implementation of the functions.

    There are futher improvements there, but this was the lowest hanging "fruit".

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  13. #13
    Registered User
    Join Date
    Feb 2008
    Posts
    28
    Thanks for the lightning fast responses. The reason why I return objects of the the class matrix is so as to make the main code similar to MATLAB code in case someone else in my lab wants to use it. The use of const matrix is something I'll use in my future code.

    Though this is a C++ forum, I would like to post my C code here for comparison. The main difference here is that almost all function are of the return type void. Also, arrays are passed to functions by reference rather than by value like the copy constructors I have used in my C++ code. This C code runs much faster in my compiler. So it seems like the copy constructors in rkni are the culprits. Will clean up and post again.



    Code:
    #include <stdio.h>
    #include <math.h>
    
    # define SIZE 100
    
    
    /* Matrix computations */
    
    void matinv(float x[][SIZE],int count)
    {
    	int i,j,k;
    	float temp;
    	float y[SIZE][SIZE],z[SIZE][SIZE];
    	
    	for (i=0;i<count;i=i+1)
    	{
    		for (j=0;j<count;j=j+1)
    		{
    			if (i==j)
    			{
    				y[i][j]=1.0;
    			}
    			else
    			{
    				y[i][j]=0.0;
    			}
    		}
    	}
    	
    	for (i=0;i<count;i=i+1)
    	{
    		temp=x[i][i];
    		for (j=0;j<count;j=j+1)
    		{
    			x[i][j]=x[i][j]/temp;
    			y[i][j]=y[i][j]/temp;
    		}
    
    		for (j=0;j<count;j=j+1)
    		{
    			if (j!=i)
    			{
    				temp=x[j][i];
    				for (k=0;k<count;k=k+1)
    				{
    					x[j][k]=x[j][k]-temp*x[i][k];
    					y[j][k]=y[j][k]-temp*y[i][k];
    				}
    
    			}
    		}
    	}
    
    	for (i=0;i<count;i=i+1)
    	{
    		for (j=0;j<count;j=j+1)
    		{
    			x[i][j]=y[i][j];
    		}
    	}
    	
    	return;
    }
    
    void matmult(float x[][SIZE],float y[][SIZE],int count1,int count2,int count3)
    {
    	int i,j,k;
    	float temp;
    	float z[SIZE][SIZE];
    
    	for (i=0;i<count1;i=i+1)
    	{
    		for (j=0;j<count3;j=j+1)
    		{
    			temp=0.0;
    
    			for (k=0;k<count2;k=k+1)
    			{
    				temp=temp+x[i][k]*y[k][j];
    			}
    
    			z[i][j]=temp;
    		}
    	}
    
    	for (i=0;i<count1;i=i+1)
    	{
    		for (j=0;j<count3;j=j+1)
    		{
    			x[i][j]=z[i][j];
    		}
    	}
    		
    	return;
    }
    
    
    void postmatmult(float x[][SIZE],float y[][SIZE],int count1,int count2,int count3)
    {
    	int i,j,k;
    	float temp;
    	float z[SIZE][SIZE];
    
    	for (i=0;i<count1;i=i+1)
    	{
    		for (j=0;j<count3;j=j+1)
    		{
    			temp=0.0;
    
    			for (k=0;k<count2;k=k+1)
    			{
    				temp=temp+x[i][k]*y[k][j];
    			}
    
    			z[i][j]=temp;
    		}
    	}
    
    	for (i=0;i<count1;i=i+1)
    	{
    		for (j=0;j<count2;j=j+1)
    		{
    			y[i][j]=z[i][j];
    		}
    	}
    		
    	return;
    }
    
    
    void cstmult(float x[][SIZE],float a,int count1,int count2)
    {
    	int i,j;
    	float y[SIZE][SIZE];
    
    	for (i=0;i<count1;i=i+1)
    	{
    		for (j=0;j<count2;j=j+1)
    		{
    			y[i][j]=a*x[i][j];
    		}
    	}
    
    	for (i=0;i<count1;i=i+1)
    	{
    		for (j=0;j<count2;j=j+1)
    		{
    			x[i][j]=y[i][j];
    		}
    	}
    		
    	return;
    }
    
    void matadd(float x[][SIZE],float y[][SIZE],int count1,int count2)
    {
    	int i,j;
    	float z[SIZE][SIZE];
    
    	for (i=0;i<count1;i=i+1)
    	{
    		for (j=0;j<count2;j=j+1)
    		{
    			z[i][j]=x[i][j]+y[i][j];
    		}
    	}
    
    	for (i=0;i<count1;i=i+1)
    	{
    		for (j=0;j<count2;j=j+1)
    		{
    			x[i][j]=z[i][j];
    		}
    	}
    		
    	return;
    }
    
    void mateq(float x[][SIZE],float y[][SIZE],int count1,int count2)
    {
    	int i,j;
    
    	for (i=0;i<count1;i=i+1)
    	{
    		for (j=0;j<count2;j=j+1)
    		{
    			x[i][j]=y[i][j];
    		}
    	}
    		
    	return;
    }
    
    
    
    /* Runge-Kutta numerical integration */
    
    void rkni(float linv1[][SIZE],float r1[][SIZE],float i1[][SIZE],float u1[][SIZE],float dt,int count)
    {
    	float temp[SIZE][SIZE],temp1[SIZE][SIZE],temp2[SIZE][SIZE];
    	float k1[SIZE][SIZE],k2[SIZE][SIZE],k3[SIZE][SIZE],k4[SIZE][SIZE],k5[SIZE][SIZE],k6[SIZE][SIZE],k[SIZE][SIZE];
    	
    // Runge Kutta 5th order method.
    /*	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	matmult(temp,i1,count,count,1);
    	mateq(temp1,linv1,count,count);
    	matmult(temp1,u1,count,count,1);
    	matadd(temp,temp1,count,1);
    	mateq(k1,temp,count,1);
    	cstmult(k1,dt,count,1);
    
    	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	mateq(temp1,k1,count,1);
    	cstmult(temp1,1/4.0,count,1);
    	matadd(temp1,i1,count,1);
    	matmult(temp,temp1,count,count,1);
    	mateq(temp2,linv1,count,count);
    	matmult(temp2,u1,count,count,1);
    	matadd(temp,temp2,count,1);
    	mateq(k2,temp,count,1);
    	cstmult(k2,dt,count,1);
    
    	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	mateq(temp1,k1,count,1);
    	cstmult(temp1,3.0/32.0,count,1);
    	mateq(temp2,k2,count,1);
    	cstmult(temp2,9.0/32.0,count,1);
    	matadd(temp1,temp2,count,1);
    	matadd(temp1,i1,count,1);
    	matmult(temp,temp1,count,count,1);
    	mateq(temp2,linv1,count,count);
    	matmult(temp2,u1,count,count,1);
    	matadd(temp,temp2,count,1);
    	mateq(k3,temp,count,1);
    	cstmult(k3,dt,count,1);
    
    	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	mateq(temp1,k1,count,1);
    	cstmult(temp1,1932.0/2197.0,count,1);
    	mateq(temp2,k2,count,1);
    	cstmult(temp2,-7200.0/2197.0,count,1);
    	matadd(temp1,temp2,count,1);
    	mateq(temp2,k3,count,1);
    	cstmult(temp2,7296.0/2197.0,count,1);
    	matadd(temp1,temp2,count,1);
    	matadd(temp1,i1,count,1);
    	matmult(temp,temp1,count,count,1);
    	mateq(temp2,linv1,count,count);
    	matmult(temp2,u1,count,count,1);
    	matadd(temp,temp2,count,1);
    	mateq(k4,temp,count,1);
    	cstmult(k4,dt,count,1);
    
    	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	mateq(temp1,k1,count,1);
    	cstmult(temp1,439.0/216.0,count,1);
    	mateq(temp2,k2,count,1);
    	cstmult(temp2,-8.0,count,1);
    	matadd(temp1,temp2,count,1);
    	mateq(temp2,k3,count,1);
    	cstmult(temp2,3650.0/513.0,count,1);
    	matadd(temp1,temp2,count,1);
    	mateq(temp2,k4,count,1);
    	cstmult(temp2,-845.0/4104.0,count,1);
    	matadd(temp1,temp2,count,1);
    	matadd(temp1,i1,count,1);
    	matmult(temp,temp1,count,count,1);
    	mateq(temp2,linv1,count,count);
    	matmult(temp2,u1,count,count,1);
    	matadd(temp,temp2,count,1);
    	mateq(k5,temp,count,1);
    	cstmult(k5,dt,count,1);
    
    	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	mateq(temp1,k1,count,1);
    	cstmult(temp1,-8.0/27.0,count,1);
    	mateq(temp2,k2,count,1);
    	cstmult(temp2,2.0,count,1);
    	matadd(temp1,temp2,count,1);
    	mateq(temp2,k3,count,1);
    	cstmult(temp2,-3544.0/2565.0,count,1);
    	matadd(temp1,temp2,count,1);
    	mateq(temp2,k4,count,1);
    	cstmult(temp2,1859.0/4104.0,count,1);
    	matadd(temp1,temp2,count,1);
    	mateq(temp2,k5,count,1);
    	cstmult(temp2,-11.0/40.0,count,1);
    	matadd(temp1,temp2,count,1);
    	matadd(temp1,i1,count,1);
    	matmult(temp,temp1,count,count,1);
    	mateq(temp2,linv1,count,count);
    	matmult(temp2,u1,count,count,1);
    	matadd(temp,temp2,count,1);
    	mateq(k6,temp,count,1);
    	cstmult(k6,dt,count,1);
    
    	cstmult(k1,16.0/135.0,count,1);
    	cstmult(k3,6656.0/12825.0,count,1);
    	cstmult(k4,28561.0/56430.0,count,1);
    	cstmult(k5,-9.0/50.0,count,1);
    	cstmult(k6,2.0/55.0,count,1);
    
    	matadd(k1,k3,count,1);
    	matadd(k1,k4,count,1);
    	matadd(k1,k5,count,1);
    	matadd(k1,k6,count,1);
    	mateq(k,k1,count,1);
    	matadd(i1,k,count,1); */
    
    
    // Runge Kutta 4th order method.
    	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	matmult(temp,i1,count,count,1);
    	mateq(temp1,linv1,count,count);
    	matmult(temp1,u1,count,count,1);
    	matadd(temp,temp1,count,1);
    	mateq(k1,temp,count,1);
    	cstmult(k1,dt,count,1);
    
    	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	mateq(temp1,k1,count,1);
    	cstmult(temp1,0.5,count,1);
    	matadd(temp1,i1,count,1);
    	matmult(temp,temp1,count,count,1);
    	mateq(temp2,linv1,count,count);
    	matmult(temp2,u1,count,count,1);
    	matadd(temp,temp2,count,1);
    	mateq(k2,temp,count,1);
    	cstmult(k2,dt,count,1);
    
    	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	mateq(temp1,k2,count,1);
    	cstmult(temp1,0.5,count,1);
    	matadd(temp1,i1,count,1);
    	matmult(temp,temp1,count,count,1);
    	mateq(temp2,linv1,count,count);
    	matmult(temp2,u1,count,count,1);
    	matadd(temp,temp2,count,1);
    	mateq(k3,temp,count,1);
    	cstmult(k3,dt,count,1);
    
    	mateq(temp,linv1,count,count);
    	matmult(temp,r1,count,count,count);
    	mateq(temp1,k3,count,1);
    	cstmult(temp1,1.0,count,1);
    	matadd(temp1,i1,count,1);
    	matmult(temp,temp1,count,count,1);
    	mateq(temp2,linv1,count,count);
    	matmult(temp2,u1,count,count,1);
    	matadd(temp,temp2,count,1);
    	mateq(k4,temp,count,1);
    	cstmult(k4,dt,count,1);
    
    	cstmult(k2,2.0,count,1);
    	cstmult(k3,2.0,count,1);
    
    	matadd(k1,k2,count,1);
    	matadd(k1,k3,count,1);
    	matadd(k1,k4,count,1);
    	cstmult(k1,1/6.0,count,1);
    	mateq(k,k1,count,1);
    	matadd(i1,k,count,1);
    
    	return;
    }
    
    
    
    int main()
    {
    	float r[SIZE][SIZE];
    	float l[SIZE][SIZE],linv[SIZE][SIZE];
    	float i[SIZE][SIZE];
    	float u[SIZE][SIZE], umax;
    	float t,dt;
    	float temp[SIZE][SIZE],temp2[SIZE][SIZE],temp1[SIZE];
    	float pi,omega,d2r;
    	int count1,count2,count,filcount;
        
        	FILE *dtfl1;
        
        	dtfl1=fopen("system1.dat","w");
    
        	pi=3.14159265;
        	omega=100*pi;
        	d2r=pi/180;
    
    	umax=230.0*sqrt(2.0);
    
    	dt=10.0e-6;
    
    	for (count1=0;count1<3;count1=count1+1)
    	{
    		for (count2=0;count2<3;count2=count2+1)
    		{
    			r[count1][count2]=0.0;
    			l[count1][count2]=0.0;
    			u[count1][count2]=0.0;
    			i[count1][count2]=0.0;
    		}
    	}
    
    	r[0][0]=-100.0;
    	r[1][1]=-100.0;
    	r[2][2]=-100.0;
    
    	l[0][0]=0.1;
    	l[1][1]=0.1;
    	l[2][2]=0.1;
    
    	mateq(linv,l,3,3);
    
    	matinv(linv,3);
    
        	for (t=0.0;t<=0.5;t=t+dt)
        	{
    		u[0][0]=umax*sin(omega*t);
    		u[1][0]=umax*sin(omega*t-120*d2r);
    		u[2][0]=umax*sin(omega*t-240*d2r);
    
    		count=3;
    
    		rkni(linv,r,i,u,dt,3);
    
    		fprintf(dtfl1,"%f %f %f %f %f %f %f\n", t, u[0][0], u[1][0], u[2][0], i[0][0], i[1][0], i[2][0]);
        	}
    
        	fclose(dtfl1);
    
        	return 0;
    }

  14. #14
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Yeah, I just added const matrix& to all functions that took matrix and the improvement was about 2x.
    The rest of the time is spent copying memory. Unfortunately, I don't know how to get a call stack from the profiler so I can't see where the copying is being made.

    UPDATE:
    OK, so after a little digging, the following line is the culprit:
    Code:
    k2=dt * (linv * u-linv * r * ((matrix)i + (1.0/2.0) * k1));
    Specifically, this:
    Code:
    (matrix)i + (1.0/2.0) * k1)
    Because i is a const reference, I have to create a temporary object and copy the data into it in order to complete the instruction.

    So, now the question becomes how to optimize this?
    Last edited by Elysia; 02-06-2008 at 12:34 PM.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  15. #15
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I changed the constructor to allocate memory instead of having a fixed size 2D array, and it runs in 2.4 seconds instead of 13.5 on my machine. I compared the results, and there were some minor differences in the last decimal point in the calculations, but otherwise it was the same values - don't ask me why there is a difference at all...

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Compile Errors in my Code. Can anyone help?
    By DGLaurynP in forum C Programming
    Replies: 1
    Last Post: 10-06-2008, 09:36 AM
  2. Can you create, edit and compile C code with a C++ IDE?
    By nerdpirate in forum C Programming
    Replies: 4
    Last Post: 05-31-2008, 01:54 AM
  3. Replies: 0
    Last Post: 11-29-2002, 10:24 PM
  4. Interface Question
    By smog890 in forum C Programming
    Replies: 11
    Last Post: 06-03-2002, 05:06 PM
  5. Simple Compile Time Problem - HELP!
    By kamikazeecows in forum Windows Programming
    Replies: 2
    Last Post: 12-02-2001, 01:30 PM