# Taking forever to copy 2d array

Show 80 post(s) from this thread on one page
Page 1 of 2 12 Last
• 11-11-2012
andydufresne
Taking forever to copy 2d array
So I have a function which takes in a 2d array as a parameter and sets its value by copying another (locally declared) 2d array like this (both 2d arrays of unsigned chars):

Code:

```    for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             charstate [temprow][tempcol]=tempstate[temprow[tempcol];          }     }```
but this is massively slowing my program.

Doing this takes a fraction of the time:

Code:

``` unsigned char test=12;     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             charstate [temprow][tempcol] =test;         }     }```
Why is this? Any solutions to speeding up my application?
Thanks!
• 11-11-2012
std10093
Because copying an array is expensive.You have to traverse through the whole array.The array is 2D so you must go through all elements of first row,then trough all elements of 2nd row and so on...This will lead for you to make N times M time to go through all the elements of a N x M array, which in terms of time complexity is O(N*M) and it is the optimum.This means that you can not reduce this.I mean think about it.If you reduce this, you won't access at least an element, thus you will not copy the whole array.

If you want to make your code more efficient you have to think for another approach.

Notice that if this operation - copying arrays - is not the heaviest in your program ( this will happen if you have an operation with time complexity more or equal to (N*M) ) , it has no sense in terms of time complexity to change this operation, because the performance of your code is equal to the one of the heaviest operation your programs does.

Which is your problem that requires such a heavy operation?

Of course i assume that we are talking about one threaded application here.

PS - It is more usual to access the arrays line by line, in other words switching the two for loops :)
• 11-11-2012
andydufresne
Thanks for the reply but I'm confused why this code is much much quicker :

Code:

``` unsigned char test[4][4];       //recopy temp into charstate     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             charstate [temprow][tempcol]= test[temprow][tempcol];         }     }```
I'm thinking it's something to do with the array being passed globally as copy
• 11-11-2012
std10093
The code you provide is faster that the one that you assign test to every cell of the array? Are you sure? How to you measure the time?

An idea :
You said that you create an array inside the function and then copy by value to the charstate array.Why not doing these operation in one double for loop instead of two? ;)

Example (in C code but it will be ok to understand )

Going through the loop twice and a temp array
Code:

```#include <stdio.h> void foo(int a[][5],int n) {     int i,j;     int temp[n][n];     for( i = 0 ; i < n ; i++ )         for(j = 0 ; j < n ; j ++)             temp[i][j] = 15;     for( i = 0 ; i < n ; i++ )         for(j = 0 ; j < n ; j ++)               a[i][j] = temp[i][j]; } int main(int argc, char *argv[]){     int array[5][5];     int i,j;     foo(array,5);     for( i = 0 ; i < 5 ; i++ )     {         for(j = 0 ; j < 5 ; j ++)             printf(" %d " ,array[i][j]);         printf("\n");     }     return 0; }```
Going with one double for loop and a temp array
Code:

``` #include <stdio.h> void foo(int a[][5],int n) {     int i,j;     int temp[n][n];     for( i = 0 ; i < n ; i++ )         for(j = 0 ; j < n ; j ++)         {             temp[i][j] = 15;             a[i][j] = temp[i][j];         } } int main(int argc, char *argv[]){     int array[5][5];     int i,j;     foo(array,5);     for( i = 0 ; i < 5 ; i++ )     {         for(j = 0 ; j < 5 ; j ++)             printf(" %d " ,array[i][j]);         printf("\n");     }     return 0; }```
But we do not need the array in this simple example.So improve space complexity too.
One double for loop ,no temp array.
Code:

```#include <stdio.h> void foo(int a[][5],int n) {     int i,j;     for( i = 0 ; i < n ; i++ )         for(j = 0 ; j < n ; j ++)             a[i][j] = 15; } int main(int argc, char *argv[]){     int array[5][5];     int i,j;     foo(array,5);     for( i = 0 ; i < 5 ; i++ )     {         for(j = 0 ; j < 5 ; j ++)             printf(" %d " ,array[i][j]);         printf("\n");     }     return 0; }```
• 11-11-2012
andydufresne
Urgh. The second piece of code is much faster.

This is my actual code:

Code:

``` void applymixcolumns (unsigned char charstate[][4],unsigned char multmatrix[][4]) {     unsigned char tempstate [4][4];     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             tempstate [temprow][tempcol] = charstate [temprow][tempcol];         }     } //do complex stuff to tempstate       //recopy temp into charstate     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             charstate [temprow][tempcol]= tempstate[temprow[tempcol];         }     } }```
• 11-11-2012
std10093
I do not understand why you need tempstate.Can't you play with charstate only?
This will make you forget about the two double for loops you have back there and the extra array.

Also you might do the copy with memcpy . When n is a big number ,it may has difference.For the n you have, it is the same.
• 11-12-2012
iMalc
You are almost certainly mistaken in thinking that the performance issue is where you seem to think it is. A 16-byte copy is nothing.
Until you show the complete unedited function, this is a waste of time.
• 11-12-2012
andydufresne
Quote:

Originally Posted by std10093
I do not understand why you need tempstate.Can't you play with charstate only?
This will make you forget about the two double for loops you have back there and the extra array.

Also you might do the copy with memcpy . When n is a big number ,it may has difference.For the n you have, it is the same.

The reason I need tempstate is that the function continually changes the 2d array and the operations performed depend on the original version of the 2d array.

Tried to memcpy to no effect.
• 11-12-2012
andydufresne
I still don't think you need it but here it is anyway:

Code:

``` void applymixcolumns (unsigned char charstate[][4],unsigned char multmatrix[][4]) {     unsigned char tempstate [4][4];     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             tempstate [temprow][tempcol] = charstate [temprow][tempcol];         }     }     for (int mxcol=0;mxcol<4;mxcol++)     {         for (int mxrow=0;mxrow<4;mxrow++)         {             //charstate [mxrow][mxcol] = (charstate[0][mxcol]*multmatrix[mxrow][0])  ^ (charstate[1][mxcol] * multmatrix[mxrow][1]);           // cout << hex << int (charstate[0][mxcol]) << " * " << dec <<int( multmatrix[mxrow][0]) << " -- " << hex << int (charstate[1][mxcol]) << " * " << dec << int(multmatrix[mxrow][1] )<< " -- "<< hex << int (charstate[2][mxcol]) << " * " << dec <<int( multmatrix[mxrow][2]) << " -- "<< hex << int (charstate[3][mxcol]) << " * " << dec << int(multmatrix[mxrow][3]) << " -- "<<endl;             //MX1//             int mxthenxor [4];             for (int xorresult=0;xorresult<4;xorresult++)             {                 bool has0or1 =false;                 if (charstate[xorresult][mxcol] == 1 || multmatrix[mxrow][xorresult]==1)                 {                     if(multmatrix[mxrow][xorresult]==1)                     {                     mxthenxor [xorresult] = charstate [xorresult] [mxcol];                     has0or1 =true;                     }                     if (charstate[xorresult][mxcol] == 1)                     {                         mxthenxor [xorresult] = multmatrix[xorresult] [mxcol];                     }                     if (multmatrix[mxrow][xorresult]==1 && charstate[xorresult][mxcol] == 1)                         mxthenxor [xorresult] =1;                 }                 if (charstate[xorresult][mxcol] == 0 || multmatrix[mxrow][xorresult]==0)                 {                     mxthenxor [xorresult] =0;                     has0or1 =true;                 }                 if (!has0or1)                 {                     int added = mixcolumnslookup1(charstate[xorresult][mxcol])+mixcolumnslookup1(multmatrix[mxrow][xorresult]);                     if (added > 0xFF)                         added = added- 0xFF;                   unsigned  char final = added;                     mxthenxor[xorresult] = mixcolumnslookup2(final);                 }             }             tempstate[mxrow][mxcol] = mxthenxor[0]^mxthenxor[1]^mxthenxor[2]^mxthenxor[3];         }     } unsigned char test[4][4]={{55,2,3,4},{1,22,3,4},{1,2,3,200},{1,2,3,4}};       //recopy temp into charstate     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             charstate [temprow][tempcol]= tempstate[temprow][tempcol];         }     } }```
• 11-12-2012
nvoigt
How did you find out that the copying takes forever? Try to take the time for the copy at the start, the time between the copies and the copy at the end. If it turns out that the time copying stuff actually takes long, we can find ways to optimize it. Most likely it's the stuff that happens in between that takes long. It would also be interesting to know how long exactly.
• 11-12-2012
andydufresne
Quote:

Originally Posted by nvoigt
How did you find out that the copying takes forever? Try to take the time for the copy at the start, the time between the copies and the copy at the end. If it turns out that the time copying stuff actually takes long, we can find ways to optimize it. Most likely it's the stuff that happens in between that takes long. It would also be interesting to know how long exactly.

The difference between when I copy from test to charstate compared to tempstate to charstate is about 3* longer. The thing is that the stuff in between is happening anyway. The only think I'm changing is that the copy at the end is from a fake ('test') 2d array as compared to tempstate. Seems weird to me .
• 11-12-2012
nvoigt
How did you measure it and what were the numbers? long and three times longer is not very informative, considering the copy operation should take so little time it shouldn't even be noticable for the user.
• 11-12-2012
andydufresne
Quote:

Originally Posted by nvoigt
How did you measure it and what were the numbers? long and three times longer is not very informative, considering the copy operation should take so little time it shouldn't even be noticable for the user.

Ok, here is the original code looped from main 500000 times. It took 400 ms.

Code:

```void applymixcolumns (unsigned char charstate[][4],unsigned char multmatrix[][4]) {     unsigned char tempstate [4][4];     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             tempstate [temprow][tempcol] = charstate [temprow][tempcol];         }     }     for (int mxcol=0;mxcol<4;mxcol++)     {         for (int mxrow=0;mxrow<4;mxrow++)         {             //charstate [mxrow][mxcol] = (charstate[0][mxcol]*multmatrix[mxrow][0])  ^ (charstate[1][mxcol] * multmatrix[mxrow][1]);           // cout << hex << int (charstate[0][mxcol]) << " * " << dec <<int( multmatrix[mxrow][0]) << " -- " << hex << int (charstate[1][mxcol]) << " * " << dec << int(multmatrix[mxrow][1] )<< " -- "<< hex << int (charstate[2][mxcol]) << " * " << dec <<int( multmatrix[mxrow][2]) << " -- "<< hex << int (charstate[3][mxcol]) << " * " << dec << int(multmatrix[mxrow][3]) << " -- "<<endl;             //MX1//             int mxthenxor [4];             for (int xorresult=0;xorresult<4;xorresult++)             {                 bool has0or1 =false;                 if (charstate[xorresult][mxcol] == 1 || multmatrix[mxrow][xorresult]==1)                 {                     if(multmatrix[mxrow][xorresult]==1)                     {                     mxthenxor [xorresult] = charstate [xorresult] [mxcol];                     has0or1 =true;                     }                     if (charstate[xorresult][mxcol] == 1)                     {                         mxthenxor [xorresult] = multmatrix[xorresult] [mxcol];                     }                     if (multmatrix[mxrow][xorresult]==1 && charstate[xorresult][mxcol] == 1)                         mxthenxor [xorresult] =1;                 }                 if (charstate[xorresult][mxcol] == 0 || multmatrix[mxrow][xorresult]==0)                 {                     mxthenxor [xorresult] =0;                     has0or1 =true;                 }                 if (!has0or1)                 {                     int added = mixcolumnslookup1(charstate[xorresult][mxcol])+mixcolumnslookup1(multmatrix[mxrow][xorresult]);                     if (added > 0xFF)                         added = added- 0xFF;                   unsigned  char final = added;                     mxthenxor[xorresult] = mixcolumnslookup2(final);                 }             }             tempstate[mxrow][mxcol] = mxthenxor[0]^mxthenxor[1]^mxthenxor[2]^mxthenxor[3];         }     }       //recopy temp into charstate     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             charstate [temprow][tempcol]= tempstate[temprow][tempcol];         }     } }```
Here is the fake/test version looped 500000 times from main. It took 25 ms:

Code:

```void applymixcolumns (unsigned char charstate[][4],unsigned char multmatrix[][4]) {     unsigned char tempstate [4][4];     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             tempstate [temprow][tempcol] = charstate [temprow][tempcol];         }     }     for (int mxcol=0;mxcol<4;mxcol++)     {         for (int mxrow=0;mxrow<4;mxrow++)         {             //charstate [mxrow][mxcol] = (charstate[0][mxcol]*multmatrix[mxrow][0])  ^ (charstate[1][mxcol] * multmatrix[mxrow][1]);           // cout << hex << int (charstate[0][mxcol]) << " * " << dec <<int( multmatrix[mxrow][0]) << " -- " << hex << int (charstate[1][mxcol]) << " * " << dec << int(multmatrix[mxrow][1] )<< " -- "<< hex << int (charstate[2][mxcol]) << " * " << dec <<int( multmatrix[mxrow][2]) << " -- "<< hex << int (charstate[3][mxcol]) << " * " << dec << int(multmatrix[mxrow][3]) << " -- "<<endl;             //MX1//             int mxthenxor [4];             for (int xorresult=0;xorresult<4;xorresult++)             {                 bool has0or1 =false;                 if (charstate[xorresult][mxcol] == 1 || multmatrix[mxrow][xorresult]==1)                 {                     if(multmatrix[mxrow][xorresult]==1)                     {                     mxthenxor [xorresult] = charstate [xorresult] [mxcol];                     has0or1 =true;                     }                     if (charstate[xorresult][mxcol] == 1)                     {                         mxthenxor [xorresult] = multmatrix[xorresult] [mxcol];                     }                     if (multmatrix[mxrow][xorresult]==1 && charstate[xorresult][mxcol] == 1)                         mxthenxor [xorresult] =1;                 }                 if (charstate[xorresult][mxcol] == 0 || multmatrix[mxrow][xorresult]==0)                 {                     mxthenxor [xorresult] =0;                     has0or1 =true;                 }                 if (!has0or1)                 {                     int added = mixcolumnslookup1(charstate[xorresult][mxcol])+mixcolumnslookup1(multmatrix[mxrow][xorresult]);                     if (added > 0xFF)                         added = added- 0xFF;                   unsigned  char final = added;                     mxthenxor[xorresult] = mixcolumnslookup2(final);                 }             }             tempstate[mxrow][mxcol] = mxthenxor[0]^mxthenxor[1]^mxthenxor[2]^mxthenxor[3];         }     } unsigned char test[4][4]={{55,2,3,4},{1,22,3,4},{1,2,3,200},{1,2,3,4}};       //recopy temp into charstate     for (int tempcol=0;tempcol<4;tempcol++)     {         for (int temprow=0;temprow<4;temprow++)         {             charstate [temprow][tempcol]= test[temprow][tempcol];         }     } }```
Do you think it might be that in the original the numbers are constantly changing whilst in the fake they are always the same?

BTW I really appreciate the help!
• 11-12-2012
std10093
Just a quote : Between the array copying you have three for loops that do work.The copying of arrays has two double for loops.

Which one is heavier operation?The one with three for loops, so if you read the posts i made above you will understand what is happening.

Also a friendly tip :

Code:

`The code you provide is faster that the one that you assign test to every cell of the array? Are you sure? How to you measure the time?`
If you answered this question ,when it was given to you , you would make it more easy for others to help you and moreover for you to be helped!
Then nvoigt had to make you the same questions again..
• 11-12-2012
andydufresne
Quote:

Originally Posted by std10093
Just a quote : Between the array copying you have three for loops that do work.The copying of arrays has two double for loops.

Which one is heavier operation?The one with three for loops, so if you read the posts i made above you will understand what is happening.

Also a friendly tip :

`The code you provide is faster that the one that you assign test to every cell of the array? Are you sure? How to you measure the time?`