Thread: Taking forever to copy 2d array

  1. #1
    Registered User
    Join Date
    Apr 2012
    Posts
    19

    Taking forever to copy 2d array

    So I have a function which takes in a 2d array as a parameter and sets its value by copying another (locally declared) 2d array like this (both 2d arrays of unsigned chars):

    Code:
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                charstate [temprow][tempcol]=tempstate[temprow[tempcol];   
            }
        }
    but this is massively slowing my program.

    Doing this takes a fraction of the time:

    Code:
    
    unsigned char test=12;
    
    
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                charstate [temprow][tempcol] =test;
            }
        }
    Why is this? Any solutions to speeding up my application?
    Thanks!

  2. #2
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    Because copying an array is expensive.You have to traverse through the whole array.The array is 2D so you must go through all elements of first row,then trough all elements of 2nd row and so on...This will lead for you to make N times M time to go through all the elements of a N x M array, which in terms of time complexity is O(N*M) and it is the optimum.This means that you can not reduce this.I mean think about it.If you reduce this, you won't access at least an element, thus you will not copy the whole array.

    If you want to make your code more efficient you have to think for another approach.

    Notice that if this operation - copying arrays - is not the heaviest in your program ( this will happen if you have an operation with time complexity more or equal to (N*M) ) , it has no sense in terms of time complexity to change this operation, because the performance of your code is equal to the one of the heaviest operation your programs does.

    Which is your problem that requires such a heavy operation?

    Of course i assume that we are talking about one threaded application here.

    PS - It is more usual to access the arrays line by line, in other words switching the two for loops

  3. #3
    Registered User
    Join Date
    Apr 2012
    Posts
    19
    Thanks for the reply but I'm confused why this code is much much quicker :


    Code:
    unsigned char test[4][4];
    
    
          //recopy temp into charstate
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                charstate [temprow][tempcol]= test[temprow][tempcol];
    
    
            }
        }
    I'm thinking it's something to do with the array being passed globally as copy
    Last edited by andydufresne; 11-11-2012 at 06:59 PM.

  4. #4
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    The code you provide is faster that the one that you assign test to every cell of the array? Are you sure? How to you measure the time?

    An idea :
    You said that you create an array inside the function and then copy by value to the charstate array.Why not doing these operation in one double for loop instead of two?

    Example (in C code but it will be ok to understand )

    Going through the loop twice and a temp array
    Code:
    #include <stdio.h>
    
    void foo(int a[][5],int n)
    {
        int i,j;
        int temp[n][n];
    
    
        for( i = 0 ; i < n ; i++ )
            for(j = 0 ; j < n ; j ++)
                 temp[i][j] = 15;
    
    
        for( i = 0 ; i < n ; i++ )
            for(j = 0 ; j < n ; j ++)
                  a[i][j] = temp[i][j];
    
    }
    
    int main(int argc, char *argv[]){
        int array[5][5];
        int i,j;
    
        foo(array,5);
    
        for( i = 0 ; i < 5 ; i++ )
        {
            for(j = 0 ; j < 5 ; j ++)
                printf(" %d " ,array[i][j]);
            printf("\n");
        }
    
        return 0;
    }
    Going with one double for loop and a temp array
    Code:
    #include <stdio.h>
    
    void foo(int a[][5],int n)
    {
        int i,j;
        int temp[n][n];
    
    
        for( i = 0 ; i < n ; i++ )
            for(j = 0 ; j < n ; j ++)
            {
                 temp[i][j] = 15;
                 a[i][j] = temp[i][j];
            }
    
    }
    
    int main(int argc, char *argv[]){
        int array[5][5];
        int i,j;
    
        foo(array,5);
    
        for( i = 0 ; i < 5 ; i++ )
        {
            for(j = 0 ; j < 5 ; j ++)
                printf(" %d " ,array[i][j]);
            printf("\n");
        }
    
        return 0;
    }
    But we do not need the array in this simple example.So improve space complexity too.
    One double for loop ,no temp array.
    Code:
    #include <stdio.h>
    
    void foo(int a[][5],int n)
    {
        int i,j;
    
        for( i = 0 ; i < n ; i++ )
            for(j = 0 ; j < n ; j ++)
                 a[i][j] = 15;
    
    }
    
    int main(int argc, char *argv[]){
        int array[5][5];
        int i,j;
    
        foo(array,5);
    
        for( i = 0 ; i < 5 ; i++ )
        {
            for(j = 0 ; j < 5 ; j ++)
                printf(" %d " ,array[i][j]);
            printf("\n");
        }
    
        return 0;
    }

  5. #5
    Registered User
    Join Date
    Apr 2012
    Posts
    19
    Urgh. The second piece of code is much faster.

    This is my actual code:

    Code:
    
    void applymixcolumns (unsigned char charstate[][4],unsigned char multmatrix[][4])
    {
    
    
    
    
    
    
        unsigned char tempstate [4][4];
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                tempstate [temprow][tempcol] = charstate [temprow][tempcol];
            }
        }
    
    
    
    
    //do complex stuff to tempstate
    
    
    
    
    
    
    
    
    
          //recopy temp into charstate
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                charstate [temprow][tempcol]= tempstate[temprow[tempcol];
    
    
            }
        }
    
    
    
    
    
    
    }

  6. #6
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    I do not understand why you need tempstate.Can't you play with charstate only?
    This will make you forget about the two double for loops you have back there and the extra array.

    Also you might do the copy with memcpy . When n is a big number ,it may has difference.For the n you have, it is the same.
    Last edited by std10093; 11-11-2012 at 07:17 PM.

  7. #7
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    You are almost certainly mistaken in thinking that the performance issue is where you seem to think it is. A 16-byte copy is nothing.
    Until you show the complete unedited function, this is a waste of time.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  8. #8
    Registered User
    Join Date
    Apr 2012
    Posts
    19
    Quote Originally Posted by std10093 View Post
    I do not understand why you need tempstate.Can't you play with charstate only?
    This will make you forget about the two double for loops you have back there and the extra array.

    Also you might do the copy with memcpy . When n is a big number ,it may has difference.For the n you have, it is the same.
    The reason I need tempstate is that the function continually changes the 2d array and the operations performed depend on the original version of the 2d array.

    Tried to memcpy to no effect.

  9. #9
    Registered User
    Join Date
    Apr 2012
    Posts
    19
    I still don't think you need it but here it is anyway:

    Code:
    
    
    void applymixcolumns (unsigned char charstate[][4],unsigned char multmatrix[][4])
    {
    
    
    
    
    
    
        unsigned char tempstate [4][4];
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                tempstate [temprow][tempcol] = charstate [temprow][tempcol];
            }
        }
    
    
    
    
        for (int mxcol=0;mxcol<4;mxcol++)
        {
            for (int mxrow=0;mxrow<4;mxrow++)
            {
    
    
                 //charstate [mxrow][mxcol] = (charstate[0][mxcol]*multmatrix[mxrow][0])  ^ (charstate[1][mxcol] * multmatrix[mxrow][1]);
    
    
               // cout << hex << int (charstate[0][mxcol]) << " * " << dec <<int( multmatrix[mxrow][0]) << " -- " << hex << int (charstate[1][mxcol]) << " * " << dec << int(multmatrix[mxrow][1] )<< " -- "<< hex << int (charstate[2][mxcol]) << " * " << dec <<int( multmatrix[mxrow][2]) << " -- "<< hex << int (charstate[3][mxcol]) << " * " << dec << int(multmatrix[mxrow][3]) << " -- "<<endl;
    
    
                //MX1//
                int mxthenxor [4];
    
    
                for (int xorresult=0;xorresult<4;xorresult++)
                {
                    bool has0or1 =false;
    
    
    
    
                    if (charstate[xorresult][mxcol] == 1 || multmatrix[mxrow][xorresult]==1)
                    {
    
    
                        if(multmatrix[mxrow][xorresult]==1)
                        {
                        mxthenxor [xorresult] = charstate [xorresult] [mxcol];
                        has0or1 =true;
                        }
    
    
                        if (charstate[xorresult][mxcol] == 1)
                        {
                            mxthenxor [xorresult] = multmatrix[xorresult] [mxcol];
    
    
                        }
    
    
                        if (multmatrix[mxrow][xorresult]==1 && charstate[xorresult][mxcol] == 1)
                            mxthenxor [xorresult] =1;
    
    
    
    
                    }
    
    
    
    
                    if (charstate[xorresult][mxcol] == 0 || multmatrix[mxrow][xorresult]==0)
                    {
                        mxthenxor [xorresult] =0;
                        has0or1 =true;
    
    
    
    
                    }
    
    
                    if (!has0or1)
                    {
    
    
    
    
                        int added = mixcolumnslookup1(charstate[xorresult][mxcol])+mixcolumnslookup1(multmatrix[mxrow][xorresult]);
    
    
                        if (added > 0xFF)
                            added = added- 0xFF;
    
    
                      unsigned  char final = added;
                        mxthenxor[xorresult] = mixcolumnslookup2(final);
    
    
    
    
                    }
    
    
    
    
    
    
                }
    
    
    
    
                tempstate[mxrow][mxcol] = mxthenxor[0]^mxthenxor[1]^mxthenxor[2]^mxthenxor[3];
    
    
    
    
            }
    
    
        }
    
    
    unsigned char test[4][4]={{55,2,3,4},{1,22,3,4},{1,2,3,200},{1,2,3,4}};
    
    
    
    
    
    
          //recopy temp into charstate
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                charstate [temprow][tempcol]= tempstate[temprow][tempcol];
    
    
            }
        }
    
    
    
    
    
    
    }

  10. #10
    the hat of redundancy hat nvoigt's Avatar
    Join Date
    Aug 2001
    Location
    Hannover, Germany
    Posts
    3,130
    How did you find out that the copying takes forever? Try to take the time for the copy at the start, the time between the copies and the copy at the end. If it turns out that the time copying stuff actually takes long, we can find ways to optimize it. Most likely it's the stuff that happens in between that takes long. It would also be interesting to know how long exactly.
    hth
    -nv

    She was so Blonde, she spent 20 minutes looking at the orange juice can because it said "Concentrate."

    When in doubt, read the FAQ.
    Then ask a smart question.

  11. #11
    Registered User
    Join Date
    Apr 2012
    Posts
    19
    Quote Originally Posted by nvoigt View Post
    How did you find out that the copying takes forever? Try to take the time for the copy at the start, the time between the copies and the copy at the end. If it turns out that the time copying stuff actually takes long, we can find ways to optimize it. Most likely it's the stuff that happens in between that takes long. It would also be interesting to know how long exactly.
    The difference between when I copy from test to charstate compared to tempstate to charstate is about 3* longer. The thing is that the stuff in between is happening anyway. The only think I'm changing is that the copy at the end is from a fake ('test') 2d array as compared to tempstate. Seems weird to me .

  12. #12
    the hat of redundancy hat nvoigt's Avatar
    Join Date
    Aug 2001
    Location
    Hannover, Germany
    Posts
    3,130
    How did you measure it and what were the numbers? long and three times longer is not very informative, considering the copy operation should take so little time it shouldn't even be noticable for the user.
    hth
    -nv

    She was so Blonde, she spent 20 minutes looking at the orange juice can because it said "Concentrate."

    When in doubt, read the FAQ.
    Then ask a smart question.

  13. #13
    Registered User
    Join Date
    Apr 2012
    Posts
    19
    Quote Originally Posted by nvoigt View Post
    How did you measure it and what were the numbers? long and three times longer is not very informative, considering the copy operation should take so little time it shouldn't even be noticable for the user.
    Ok, here is the original code looped from main 500000 times. It took 400 ms.

    Code:
    void applymixcolumns (unsigned char charstate[][4],unsigned char multmatrix[][4])
    {
    
    
    
    
    
    
        unsigned char tempstate [4][4];
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                tempstate [temprow][tempcol] = charstate [temprow][tempcol];
            }
        }
    
    
    
    
        for (int mxcol=0;mxcol<4;mxcol++)
        {
            for (int mxrow=0;mxrow<4;mxrow++)
            {
    
    
                 //charstate [mxrow][mxcol] = (charstate[0][mxcol]*multmatrix[mxrow][0])  ^ (charstate[1][mxcol] * multmatrix[mxrow][1]);
    
    
               // cout << hex << int (charstate[0][mxcol]) << " * " << dec <<int( multmatrix[mxrow][0]) << " -- " << hex << int (charstate[1][mxcol]) << " * " << dec << int(multmatrix[mxrow][1] )<< " -- "<< hex << int (charstate[2][mxcol]) << " * " << dec <<int( multmatrix[mxrow][2]) << " -- "<< hex << int (charstate[3][mxcol]) << " * " << dec << int(multmatrix[mxrow][3]) << " -- "<<endl;
    
    
                //MX1//
                int mxthenxor [4];
    
    
                for (int xorresult=0;xorresult<4;xorresult++)
                {
                    bool has0or1 =false;
    
    
    
    
                    if (charstate[xorresult][mxcol] == 1 || multmatrix[mxrow][xorresult]==1)
                    {
    
    
                        if(multmatrix[mxrow][xorresult]==1)
                        {
                        mxthenxor [xorresult] = charstate [xorresult] [mxcol];
                        has0or1 =true;
                        }
    
    
                        if (charstate[xorresult][mxcol] == 1)
                        {
                            mxthenxor [xorresult] = multmatrix[xorresult] [mxcol];
    
    
                        }
    
    
                        if (multmatrix[mxrow][xorresult]==1 && charstate[xorresult][mxcol] == 1)
                            mxthenxor [xorresult] =1;
    
    
    
    
                    }
    
    
    
    
                    if (charstate[xorresult][mxcol] == 0 || multmatrix[mxrow][xorresult]==0)
                    {
                        mxthenxor [xorresult] =0;
                        has0or1 =true;
    
    
    
    
                    }
    
    
                    if (!has0or1)
                    {
    
    
    
    
                        int added = mixcolumnslookup1(charstate[xorresult][mxcol])+mixcolumnslookup1(multmatrix[mxrow][xorresult]);
    
    
                        if (added > 0xFF)
                            added = added- 0xFF;
    
    
                      unsigned  char final = added;
                        mxthenxor[xorresult] = mixcolumnslookup2(final);
    
    
    
    
                    }
    
    
    
    
    
    
                }
    
    
    
    
                tempstate[mxrow][mxcol] = mxthenxor[0]^mxthenxor[1]^mxthenxor[2]^mxthenxor[3];
    
    
    
    
            }
    
    
        }
    
    
    
    
    
    
          //recopy temp into charstate
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                charstate [temprow][tempcol]= tempstate[temprow][tempcol];
    
    
            }
        }
    
    
    
    
    
    
    }
    Here is the fake/test version looped 500000 times from main. It took 25 ms:

    Code:
    void applymixcolumns (unsigned char charstate[][4],unsigned char multmatrix[][4])
    {
    
    
    
    
    
    
        unsigned char tempstate [4][4];
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                tempstate [temprow][tempcol] = charstate [temprow][tempcol];
            }
        }
    
    
    
    
        for (int mxcol=0;mxcol<4;mxcol++)
        {
            for (int mxrow=0;mxrow<4;mxrow++)
            {
    
    
                 //charstate [mxrow][mxcol] = (charstate[0][mxcol]*multmatrix[mxrow][0])  ^ (charstate[1][mxcol] * multmatrix[mxrow][1]);
    
    
               // cout << hex << int (charstate[0][mxcol]) << " * " << dec <<int( multmatrix[mxrow][0]) << " -- " << hex << int (charstate[1][mxcol]) << " * " << dec << int(multmatrix[mxrow][1] )<< " -- "<< hex << int (charstate[2][mxcol]) << " * " << dec <<int( multmatrix[mxrow][2]) << " -- "<< hex << int (charstate[3][mxcol]) << " * " << dec << int(multmatrix[mxrow][3]) << " -- "<<endl;
    
    
                //MX1//
                int mxthenxor [4];
    
    
                for (int xorresult=0;xorresult<4;xorresult++)
                {
                    bool has0or1 =false;
    
    
    
    
                    if (charstate[xorresult][mxcol] == 1 || multmatrix[mxrow][xorresult]==1)
                    {
    
    
                        if(multmatrix[mxrow][xorresult]==1)
                        {
                        mxthenxor [xorresult] = charstate [xorresult] [mxcol];
                        has0or1 =true;
                        }
    
    
                        if (charstate[xorresult][mxcol] == 1)
                        {
                            mxthenxor [xorresult] = multmatrix[xorresult] [mxcol];
    
    
                        }
    
    
                        if (multmatrix[mxrow][xorresult]==1 && charstate[xorresult][mxcol] == 1)
                            mxthenxor [xorresult] =1;
    
    
    
    
                    }
    
    
    
    
                    if (charstate[xorresult][mxcol] == 0 || multmatrix[mxrow][xorresult]==0)
                    {
                        mxthenxor [xorresult] =0;
                        has0or1 =true;
    
    
    
    
                    }
    
    
                    if (!has0or1)
                    {
    
    
    
    
                        int added = mixcolumnslookup1(charstate[xorresult][mxcol])+mixcolumnslookup1(multmatrix[mxrow][xorresult]);
    
    
                        if (added > 0xFF)
                            added = added- 0xFF;
    
    
                      unsigned  char final = added;
                        mxthenxor[xorresult] = mixcolumnslookup2(final);
    
    
    
    
                    }
    
    
    
    
    
    
                }
    
    
    
    
                tempstate[mxrow][mxcol] = mxthenxor[0]^mxthenxor[1]^mxthenxor[2]^mxthenxor[3];
    
    
    
    
            }
    
    
        }
    
    
    unsigned char test[4][4]={{55,2,3,4},{1,22,3,4},{1,2,3,200},{1,2,3,4}};
    
    
    
    
    
    
          //recopy temp into charstate
    
    
        for (int tempcol=0;tempcol<4;tempcol++)
        {
            for (int temprow=0;temprow<4;temprow++)
            {
                charstate [temprow][tempcol]= test[temprow][tempcol];
    
    
            }
        }
    
    
    
    
    
    
    }
    Do you think it might be that in the original the numbers are constantly changing whilst in the fake they are always the same?

    BTW I really appreciate the help!

  14. #14
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    Just a quote : Between the array copying you have three for loops that do work.The copying of arrays has two double for loops.

    Which one is heavier operation?The one with three for loops, so if you read the posts i made above you will understand what is happening.

    Also a friendly tip :

    In post number 4 there was this sentence
    Code:
    The code you provide is faster that the one that you assign test to every cell of the array? Are you sure? How to you measure the time?
    If you answered this question ,when it was given to you , you would make it more easy for others to help you and moreover for you to be helped!
    Then nvoigt had to make you the same questions again..

  15. #15
    Registered User
    Join Date
    Apr 2012
    Posts
    19
    Quote Originally Posted by std10093 View Post
    Just a quote : Between the array copying you have three for loops that do work.The copying of arrays has two double for loops.

    Which one is heavier operation?The one with three for loops, so if you read the posts i made above you will understand what is happening.

    Also a friendly tip :

    In post number 4 there was this sentence
    Code:
    The code you provide is faster that the one that you assign test to every cell of the array? Are you sure? How to you measure the time?
    If you answered this question ,when it was given to you , you would make it more easy for others to help you and moreover for you to be helped!
    Then nvoigt had to make you the same questions again..
    Sorry, I find it a little difficult to understand you. From the my above post is it clear why one piece of code is much faster? Is it, as I may think, that for the original piece of code the 2d array being copied (tempstate) keeps changing (from one function call to another) whilst the test version the 2d array remains the same from one function call to another?

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 26
    Last Post: 06-04-2012, 07:58 PM
  2. Input to array loops forever [C]
    By Quixiotic in forum C Programming
    Replies: 5
    Last Post: 11-08-2011, 12:59 PM
  3. Taking a variable and input each value of it into an array
    By jammysammy in forum C Programming
    Replies: 4
    Last Post: 05-17-2011, 04:46 AM
  4. Replies: 1
    Last Post: 04-08-2011, 11:22 AM
  5. Int Array - Stop taking values at [zero]
    By bunko in forum C Programming
    Replies: 3
    Last Post: 12-04-2008, 12:54 AM