Thread: Matrix product

  1. #1
    Registered User
    Join Date
    Aug 2008
    Location
    Romania
    Posts
    3

    Matrix product

    Hello!

    I searched the forum on this topic. My problem is as follows: I wrote a C++ algorithm which involved segment trees and matrix product. I had a timelimit of 1.1 seconds.

    In the "matrix product" function I reversed the order of the 'for'-s as follows:

    Code:
    for (k = 0; k<n; ++k)
    	for (i=0; i<n; ++i)
    		for (j=0; j<n; ++j)
    			C[i][j] += A[i][k] * B[k][j];
    It was implemented in a matrix class and A, B and C are in fact instances (didn't write all the code here)... The normal (and logical) order of the for-s was (i,j,k), but putting them this way (k,i,j), my code ran faster (worst case, from 1 sec -> 0.8 sec)...

    I would like to ask you why is this happening? Any permutation of (i,j,k) I would use, one array would be accessed "normally", one on one row at a time and one on one column at a time... so the cache is not working in some parts, but does in others.

    I hope my English is not bad enough to make this post impossible to be followed!

    Thank you!

  2. #2
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    You are using a loop like this and wondering and then asking questions as to why the order causes performance differences? My main question to you is why you even care. My second question is that this type of loop (with 3 for's) is not going to get good performance no matter what you do with the order. Unroll some of the loops and you will get better performance.

    There has to be a better way.

  3. #3
    Registered User
    Join Date
    Aug 2008
    Location
    Romania
    Posts
    3
    Yes, my concern is why the order of the loops makes any difference. I don't know anything about "unrolling loops" (or maybe I do but I don't know it's called like that, could you please explain me a bit?), but the complexity needed for this operation was O(N^3).

    There's no doubt it would be very nice to find a better way, but as far as I know, doing a matrix product requires 3 for loops.

    Edit:
    is not going to get good performance no matter what you do with the order.
    It is not, but the difference was enough for me to finish that code and not search for further optimization. But, I would like to find a solution - if there is any.
    Last edited by Cotizo; 08-03-2008 at 09:34 AM.

  4. #4
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Unroll some of the loops and you will get better performance.
    It should be tested with profiler. Some optimizers have problems with optimizing unrolled loops
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  5. #5
    Registered User
    Join Date
    Aug 2008
    Location
    Romania
    Posts
    3
    Ok, found out more about loop unrolling on:

    http://en.wikipedia.org/wiki/Loop_unrolling

    Is it enough? Is it the thing you said I could do (in the deep-most for to compute several steps ahead)?

    If it is, I will give it a try!

    Edit: found out even more: http://en.wikipedia.org/wiki/Loop_nest_optimization
    Last edited by Cotizo; 08-03-2008 at 10:24 AM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. C - access violation
    By uber in forum C Programming
    Replies: 2
    Last Post: 07-08-2009, 01:30 PM
  2. Matrix Help
    By HelpmeMark in forum C++ Programming
    Replies: 27
    Last Post: 03-06-2008, 05:57 PM
  3. Matrix and vector operations on computers
    By DavidP in forum A Brief History of Cprogramming.com
    Replies: 11
    Last Post: 05-11-2004, 06:36 AM
  4. Matrix Reloaded Questions (SPOILERS_
    By Xei in forum A Brief History of Cprogramming.com
    Replies: 73
    Last Post: 10-19-2003, 02:21 PM

Tags for this Thread