Thread: allocation for global variables in extern

  1. #1
    Registered User
    Join Date
    Dec 2007
    Posts
    9

    i have a multithreaded program that does the following:
    Code:
    //global variables
    double ***T;	//T is the temporary array of matrices
    double ***S;	//S is the global version of A for the thread function
    int *aa;            //aa is the global version of a for the thread function
    int matrnum;    //the number of the matrix currently being calculated
    
    //function called by each thread
    void *calc_row(void * t_num)
    {
    .....
    .....
    .....
    }
    
    double **programY(double ***A, int *a, int mat_num)
    {
        #ifdef SESC
    	sesc_init();
        #endif
    	double **result;
    	int i, j;
    	T = (double ***) malloc((mat_num)*sizeof(double **));
    	S = (double ***) malloc((mat_num)*sizeof(double **));
         T[0] = A[0];
         	S=A;
         	aa=a; 
         for(i=1; i<mat_num; i++)
    	 {
    	 	T[i] = (double **)malloc(a[0]*sizeof(double *));
    		for(j=0; j<a[0]; j++)
    		{
    			T[i][j] = (double *)malloc(a[i+1] * sizeof(double));
    		}
             }
    .......
    .......
    .......
    	for(i=1; i<mat_num-1;i++)
    	{
    		for(j=0; j<a[0]; j++)
    		{
    			free(T[i][j]);
    		}
    		free(T[i]);
    	}
    	free(T);
    	return result;
    }
    i'm getting all sorts of weird errors. i'm not sure what more i need. do i need to have a malloc and a free for aa? do i need to do additional allocation or deallocation for S? please help.

    this file (programy.c) is actually accessed by a different program (matmult.c) using:
    extern double **programY(double ***A, int *a, int mat_num);
    i don't know if that makes a difference.

    here's a variable list:
    T is an array of matrices used for storing results and is accessed in calc_row()
    A is an array of matrices
    S is a global version of A accessed in calc_row()
    a is an array of integers
    aa is a global version of a accessed in calc_row()
    matrnum is not the same as mat_num, but is incremented in programY() after the threads finish
    t_num is the number of the row currently being worked on
    mat_num is the number of matrices in A
    Last edited by Bobert; 12-13-2007 at 11:00 PM.

  2. #2
    Hurry Slowly vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    1. do not use globals in a multithreaded program - you are just asking for trouble (you would need various locks, but you are not experienced enough for that yet)
    2. S = (double ***) malloc((mat_num)*sizeof(double **));
    S=A;
    It is a memory leak

    3. do not cast malloc in C - read FAQ
    4. do you allocate result?
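
    Points 2 to 4 can be illustrated with a minimal sketch. It assumes S only needs to refer to the caller's A for reading, so no allocation is needed for S at all. The helper names share_matrices and allocate_result are hypothetical and are not part of the posted program.

```c
#include <assert.h>
#include <stdlib.h>

static double ***S;   /* read-only alias of the caller's array of matrices */

/* Leaky version, for comparison:
 *     S = malloc(mat_num * sizeof *S);   // block allocated ...
 *     S = A;                             // ... and its only pointer overwritten
 * After the second line nothing can free the first block any more. */

/* Hypothetical helper: make S refer to the caller's data without allocating. */
void share_matrices(double ***A)
{
    S = A;   /* copies one pointer; the matrices themselves are not copied */
}

/* Hypothetical helper: allocate a zeroed rows x cols matrix without casting
 * the return value of malloc/calloc. Returns NULL on failure. */
double **allocate_result(int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    double **m = malloc((size_t)rows * sizeof *m);
    if (m == NULL)
        return NULL;
    for (int i = 0; i < rows; i++) {
        m[i] = calloc((size_t)cols, sizeof *m[i]);
        if (m[i] == NULL) {
            while (i-- > 0)       /* undo the rows allocated so far */
                free(m[i]);
            free(m);
            return NULL;
        }
    }
    return m;
}
```

    Because S only aliases A here, it must not be passed to free(); whoever allocated A remains responsible for it.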
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  3. #3
    Registered User
    Join Date
    Dec 2007
    Posts
    9
    1. i have no choice, i almost definitely have to use globals
    2. how do i fix it? would it work if i do this:
    Code:
         for(i=0;i<mat_num;i++)
         {
          S[i] = (double **) malloc(a[i] * sizeof(double *));
    		for(j=0; j<a[i]; j++)
    		{
    			S[i][j] = (double *) malloc(a[i+1] * sizeof(double));
    		}
         }
    before S=A ? or:
    for(i=0;i<mat_num;i++){
    S[i]=A[i];
    }
    3. the original casting for T was done by someone else and i can't change it. besides, i know it works.

    4. how?

    5. please can someone answer the original questions
    Last edited by Bobert; 12-13-2007 at 11:20 PM.

  4. #4
    Hurry Slowly vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    > would it work if i do this
    I do not know where and how A is stored.
    I do not know where and how S is used.
    So I definitely do not know if it will work. You can screw it up in so many ways...

  5. #5
    Registered User
    Join Date
    Dec 2007
    Posts
    9
    A is allocated and filled in the original program (matmul.c) and passed as a parameter to this external program (programy.c)

    the individual elements of S are accessed in calc_row(). i didn't post it before because it's confusing but here it is:
    Code:
    void *calc_row(void * t_num)
    {
     int threadnum = (int)t_num;        //the thread number is also the row number
     int z,m;
      for(z=0;z<aa[matrnum];z++){       //aa[matrnum] is the number of columns 
       for(m=0;m<aa[matrnum-1];m++){
        T[matrnum][threadnum][z] = T[matrnum-1][threadnum][z]*S[matrnum][z][m];
       } 
      }
      #ifdef SESC
       sesc_exit(0);
      #endif
      exit(0);
    }
    what i'm trying to do with S is have a global array of matrices equal to A so that i can use it in calc_row(). i don't need to modify it, just set it equal to A and then read its values.
    Last edited by Bobert; 12-13-2007 at 11:28 PM.

  6. #6
    Hurry Slowly vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Why not pass T and A as parameters to calc_row?
    As well as aa?

    And where is the return statement?
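
    A minimal sketch of that suggestion follows. It assumes matrix k has a[k] rows and a[k+1] columns and that T[i] = T[i-1] * A[i], which is inferred from the allocation code in post #1 rather than stated by the poster. The struct name row_args and its fields are hypothetical. The function is called directly in the sketch, but a thread library would invoke the same void * signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-call argument bundle; nothing here is global. */
struct row_args {
    double ***T;      /* intermediate products; T[0] aliases A[0]     */
    double ***A;      /* input matrices, read only                     */
    const int *a;     /* dimensions: matrix k is a[k] x a[k+1]         */
    int mat;          /* index of the product being computed (>= 1)    */
    int row;          /* row of T[mat] this call is responsible for    */
};

/* Computes one row of T[mat] = T[mat-1] * A[mat]. */
void *calc_row(void *p)
{
    const struct row_args *g = p;     /* no cast needed from void * in C */
    assert(g->mat >= 1);
    int inner = g->a[g->mat];         /* columns of T[mat-1], rows of A[mat] */
    int cols  = g->a[g->mat + 1];     /* columns of A[mat] and of T[mat]     */

    for (int c = 0; c < cols; c++) {
        double sum = 0.0;
        for (int m = 0; m < inner; m++)
            sum += g->T[g->mat - 1][g->row][m] * g->A[g->mat][m][c];
        g->T[g->mat][g->row][c] = sum;
    }
    return NULL;   /* return instead of exit(), so only this call ends */
}
```

    The struct holds pointers only, so filling one in per row copies a few addresses and two integers, not the matrices.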

  7. #7
    Registered User
    Join Date
    Dec 2007
    Posts
    9
    i'm supposed to optimize this code. doing what you suggest would force me to create a struct with all of those values, and create (as well as modify) a copy of that struct for each thread. each of those structs alone would take a minimum of 3 casts as well as a separate cast for each internal variable i would have to modify ( both in programy() and calc_row() ). that doesn't seem very efficient or simple. if i have to i'll do it, but i don't know why i'd have to.

    the return statement is in the part of the code i didn't post (again to avoid confusion):
    Code:
    #ifdef SESC
       for(i=1;i<mat_num;i++)
       {
        matrnum=i-1;
        for(j=0;j<a[j];j++)
        { 
           sesc_spawn((void *) *calc_row, (void *)j, 0);      //one thread per row of first matrix (for T[1]*A[2], this means T[1]) 
        }
        sesc_wait();
       }
       sesc_exit(0);
       result = T[mat_num-1];
      #else
       result = NULL;
      #endif
    Last edited by Bobert; 12-13-2007 at 11:47 PM.

  8. #8
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    > i'm supposed to optimize this code.
    You'd be better off trying to make it work first, before you try to make it quick.

    > one thread per row of first matrix
    How many REAL processors does your machine have? 1? 2?
    Instead of getting on with the job, you're just going to be thrashing around inside the thread library working out which thread to run next.


    > for(i=1; i<mat_num-1;i++)
    Why the -1 here, when the allocation loop went one step further?

    Why (in your original code) is a[1] never accessed?
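
    One way to keep the allocation and deallocation loops in step is to pair them as helpers. The sketch below assumes the layout from post #1 (T[i] has a[0] rows and a[i+1] columns, T[0] merely aliases A[0]) and assumes that the last matrix is deliberately kept as the return value, which would explain the -1. The names xcalloc, alloc_temporaries, free_temporaries and free_matrix are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* calloc that aborts on failure, to keep the sketch short. */
static void *xcalloc(size_t n, size_t size)
{
    assert(n > 0);
    void *p = calloc(n, size);
    if (p == NULL)
        abort();
    return p;
}

/* Allocates T and the matrices T[1] .. T[mat_num-1]. */
double ***alloc_temporaries(const int *a, int mat_num)
{
    assert(mat_num >= 1);
    double ***T = xcalloc((size_t)mat_num, sizeof *T);
    T[0] = NULL;   /* caller sets T[0] = A[0]; T never owns that matrix */
    for (int i = 1; i < mat_num; i++) {
        T[i] = xcalloc((size_t)a[0], sizeof *T[i]);
        for (int j = 0; j < a[0]; j++)
            T[i][j] = xcalloc((size_t)a[i + 1], sizeof *T[i][j]);
    }
    return T;
}

/* Frees T[1] .. T[mat_num-2] and T itself. The last matrix is handed back
 * as the result, and T[0] is skipped because it belongs to the caller. */
double **free_temporaries(double ***T, const int *a, int mat_num)
{
    double **result = (mat_num > 1) ? T[mat_num - 1] : NULL;
    for (int i = 1; i < mat_num - 1; i++) {
        for (int j = 0; j < a[0]; j++)
            free(T[i][j]);
        free(T[i]);
    }
    free(T);
    return result;
}

/* Releases a matrix with the given number of rows, e.g. the final result. */
void free_matrix(double **m, int rows)
{
    for (int j = 0; j < rows; j++)
        free(m[j]);
    free(m);
}
```

    With this split, the caller of programY() would be the one to release the returned matrix once it is done with it.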

    > i'm getting all sorts of weird errors.
    And you expect us to magic up a definitive answer from such a vague bug report?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  9. #9
    Registered User
    Join Date
    Dec 2007
    Posts
    9
    as i said, before i started multithreading and optimizing, it worked.

    i'm running this on a simulator which allows me to change various parts of the system architecture including the number of processors, so thrashing won't be an issue.

    because the threading library is specific to the simulator, i can't really give you a bug report. what i do get is a "segmentation fault <core dumped>", but i figure that's because of some of the indexing and array size issues, which you correctly pointed out i have. i'm still working on those.

    while i appreciate whatever help you can give with those issues and with anything else you might notice, i posted here because i thought there might be additional problems related to memory allocation. i wasn't expecting you to magic up anything, i just wanted to know what i still had to allocate/deallocate.

  10. #10
    Hurry Slowly vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    To work with globals, you need a locking mechanism, and that will kill the idea of multithreading for speed optimization.
    As I said, to get a speed boost from multithreading, you need to give each thread its own data to process so that no locking is needed.

    So start by modifying the code so that no globals are used.
    Once you have succeeded in that, start working on the multithreading.

  11. #11
    Registered User
    Join Date
    Dec 2007
    Posts
    9
    okay, you're definitely right, at least about matrnum and aa (the rest of the individual variables are part of an array or a matrix and are not accessed by more than one thread, so there might be some way around that), so i will need a struct, which is annoying.
    but i'll also need to pass T (and probably A) by reference, which i'm not sure how to do. any ideas of how to do this or get around it?
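
    On passing T "by reference": in C a double *** is already a pointer, so copying it into a function argument or a struct field copies only the address, and the callee works on the same matrices. A minimal sketch, where the function name set_cell is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* T is passed by value, but it is a pointer, so the write lands in the
 * caller's data. An extra level of indirection (double ****) would only be
 * needed if the callee had to change which array the caller's T points at. */
void set_cell(double ***T, int mat, int row, int col, double value)
{
    assert(T != NULL);
    T[mat][row][col] = value;
}
```

    The same holds for A and a: storing them in a struct field of type double *** or int * shares the data without copying it.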
    i'll post the new code with the structs as soon as i sort through the various casts. might take a while.
    Last edited by Bobert; 12-14-2007 at 01:58 AM.

  12. #12
    Registered User
    Join Date
    Dec 2007
    Posts
    9
    okay:
    Code:
    #include <stdlib.h>
    #ifdef SESC
    #include "sescapi.h"
    #endif
    
    struct args{
    double ****TT;	//TT is a pointer to T 
    double ****S;	//S is a pointer to A
    int *aa;        //aa is a copy of a
    int matr_num;    //the number of the matrix currently being calculated
    int t_num;       //the number of the thread curently running
    };
    
    //main function for calculating the row values
    void *calc_row(void * t_args)
    {
     struct args *aargs = (struct args *) t_args;   
     int threadnum = aargs->t_num;       //the thread number is also the row number
     int matrnum = aargs->matr_num; 
     int col,m;
      for(col=0;col<(aargs->aa)[matrnum+1];col++){       //aa[matrnum+1] columns in TT 
       (*(aargs->TT))[matrnum][threadnum][z]=0;
       for(m=0;m<(aargs->aa)[matrnum];m++){              //aa[matrnum+1] rows in S
        (*(aargs->TT))[matrnum][threadnum][col] += (*(aargs->TT))[matrnum-1][threadnum][col]*(*(aargs->S))[matrnum][col][m];
       } 
      }
      #ifdef SESC
       sesc_exit(0);
      #endif
      exit(0);
    }
    
    double **programY(double ***A, int *a, int mat_num)
    {
        #ifdef SESC
    	sesc_init();
        #endif
    	double **result;
    	double ***T;
    	int i, j;
    	/* memory allocation for intermediate results */ 
    	T = (double ***) malloc((mat_num)*sizeof(double **));
    	for(i=1; i<mat_num;i++)
    	{
    		T[i] = (double **)malloc(a[0]*sizeof(double *));
    		for(j=0; j<a[0]; j++)
    		{
    			T[i][j] = (double *)malloc(a[i+1] * sizeof(double));
    		}
    	}
    	T[0] = A[0];
    	/* matrix multiplication */
    #ifdef SESC
       for(i=1;i<mat_num;i++)
       {
        for(j=0;j<a[0];j++) //a[0] rows in T[i]
        {  
         struct args targs;
    	 targs.TT=&T;
    	 targs.S=&A;
    	 targs.aa=a;
    	 targs.matr_num=i;
           targs.t_num=j;                               
           sesc_spawn((void *) *calc_row, (void *)&targs, 0);      //one thread per row of first matrix (for T[1]*A[2], this means T[1]) 
        }
        sesc_wait();
       }
       sesc_exit(0);
       result = T[mat_num-1];
      #else
       result = NULL;
      #endif
    	/* memory deallocation - this loop does not deallocate T[0](= A[0]) and T[mat_num](=result).*/
    	for(i=1; i<mat_num-1;i++)
    	{
    		for(j=0; j<a[0]; j++)
    		{
    			free(T[i][j]); 
    		}
    		free(T[i]);
    	}
    	free(T);
    	return result;
    }
    to pass T by reference, i made the struct's TT variable into a pointer to a three dimensional array, which i initialize in programY() to the reference to T, and i dereference it in calc_row. i did the same basic thing for A. my head hurts. and not so surprisingly this didn't work.
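
    One likely source of trouble in the code above is that every spawned thread receives the address of the same targs variable, which is overwritten on the next loop iteration. The sketch below gives each thread its own argument struct that stays valid until the threads are joined. It uses POSIX threads as a stand-in, since the SESC API is not described in this thread, and it assumes the product is T[i] = T[i-1] * A[i]. The names row_job, row_worker and multiply_rows are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct row_job {        /* hypothetical name; one instance per thread  */
    double **prev;      /* T[i-1], read only: rows x inner             */
    double **cur;       /* A[i], read only: inner x cols               */
    double **out;       /* T[i]; each thread writes only its own row   */
    int inner, cols, row;
};

static void *row_worker(void *p)
{
    const struct row_job *job = p;
    for (int c = 0; c < job->cols; c++) {
        double sum = 0.0;
        for (int m = 0; m < job->inner; m++)
            sum += job->prev[job->row][m] * job->cur[m][c];
        job->out[job->row][c] = sum;
    }
    return NULL;        /* ends this thread only */
}

/* Computes out = prev * cur with one thread per row.
 * Returns 0 on success, -1 if memory or a thread could not be obtained. */
int multiply_rows(double **prev, double **cur, double **out,
                  int rows, int inner, int cols)
{
    assert(rows > 0);
    pthread_t *tid = malloc((size_t)rows * sizeof *tid);
    struct row_job *jobs = malloc((size_t)rows * sizeof *jobs);
    int started = 0, status = 0;

    if (tid != NULL && jobs != NULL) {
        for (; started < rows; started++) {
            /* jobs[started] is not touched again until after the join */
            jobs[started] = (struct row_job){ prev, cur, out, inner, cols, started };
            if (pthread_create(&tid[started], NULL, row_worker, &jobs[started]) != 0) {
                status = -1;
                break;
            }
        }
        for (int r = 0; r < started; r++)
            pthread_join(tid[r], NULL);
    } else {
        status = -1;
    }
    free(jobs);
    free(tid);
    return status;
}
```

    No locking is needed in this arrangement because the inputs are only read and each thread writes a different row of the output.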
    Last edited by Bobert; 12-14-2007 at 03:52 AM.

  13. #13
    Registered User
    Join Date
    Dec 2007
    Posts
    9
    i found the indexing problem and i think it's working now. though i suspect it has memory leaks and the like. please tell me if you guys notice anything. and thank you.

    update:
    i was wrong, i have yet to find the issue. please tell me if you happen to see it.
    Last edited by Bobert; 12-14-2007 at 03:53 AM.

  14. #14
    Registered User
    Join Date
    Dec 2007
    Posts
    9
    i ended up using a different means of optimization. thank you all for your help.
