# Thread: allocation for global variables in extern

1. ## allocation for global variables in extern

i have a multithreaded program that does the following:
Code:
```
//global variables
double ***T;	//T is the temporary array of matrices
double ***S;	//S is the global version of A for the thread function
int *aa;            //aa is the global version of a for the thread function
int matrnum;    //the number of the matrix currently being calculated

void *calc_row(void * t_num)
{
.....
.....
.....
}

double **programY(double ***A, int *a, int mat_num)
{
#ifdef SESC
sesc_init();
#endif
double **result;
int i, j;
T = (double ***) malloc((mat_num)*sizeof(double **));
S = (double ***) malloc((mat_num)*sizeof(double **));
T = A;
S=A;
aa=a;
for(i=1; i<mat_num; i++)
{
T[i] = (double **)malloc(a*sizeof(double *));
for(j=0; j<a; j++)
{
T[i][j] = (double *)malloc(a[i+1] * sizeof(double));
}
}
.......
.......
.......
for(i=1; i<mat_num-1;i++)
{
for(j=0; j<a; j++)
{
free(T[i][j]);
}
free(T[i]);
}
free(T);
return result;
}
```
i'm getting all sorts of weird errors. i'm not sure what more i need. do i need to have a malloc and a free for aa? do i need to do additional allocation or deallocation for S? please help.

this file (programy.c) is actually accessed by a different program (matmult.c) using:
extern double **programY(double ***A, int *a, int mat_num);
i don't know if that makes a difference.

here's a variable list:
T is an array of matrices used for storing results and is accessed in calc_row()
A is an array of matrices
S is a global version of A accessed in calc_row()
a is an array of integers
aa is a global version of a accessed in calc_row()
matrnum is not the same as mat_num, but is incremented in programY() after the threads finish
t_num is the number of the row currently being worked on
mat_num is the number of matrices in A

2.
1. do not use globals in a multithreaded program - you are just asking for trouble (you need various locks, but you are not experienced enough for that yet)
2. S = (double ***) malloc((mat_num)*sizeof(double **));
S=A;
It is a memory leak

3. do not cast malloc in C - read FAQ
4. do you allocate result?

3.
1. i have no choice, i almost definitely have to use globals
2. how do i fix it? would it work if i do this:
Code:
```
for(i=0;i<mat_num;i++)
{
S[i] = (double **) malloc(a[i] * sizeof(double *));
for(j=0; j<a[i]; j++)
{
S[i][j] = (double *) malloc(a[i+1] * sizeof(double));
}
}
```
before S=A ? or:
for(i=0;i<mat_num;i++){
S[i]=A[i];
}
3. the original casting for T was done by someone else and i can't change it. besides, i know it works.

4. how?

4.
> would it work if i do this
I do not know where and how the A is stored
I do not know where and how S is used
So I definitely do not know if IT will work. You can screw it up in soooo many ways...

5.
A is allocated and filled in the original program (matmul.c) and passed as a parameter to this external program (programy.c)
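
Since A is already allocated and owned by the caller, a global that only reads it can simply point at the same storage. No malloc or free of its own is involved, and the same goes for aa. A minimal sketch, assuming hypothetical helpers `set_shared()` and `read_cell()` that are not in the posted code:

```c
#include <stddef.h>

/* Read-only global views, as in the thread; they alias the caller's data. */
static double ***S = NULL;
static int *aa = NULL;

/* Hypothetical helper standing in for the first lines of programY(). */
static void set_shared(double ***A, int *a)
{
    /* No malloc here: writing S = malloc(...) and then S = A would only
       lose the freshly allocated block, which is the leak noted in post 2. */
    S = A;
    aa = a;
}

/* Hypothetical reader, standing in for what calc_row() does with S. */
static double read_cell(int m, int r, int c)
{
    return S[m][r][c];
}
```

Because S and aa own nothing, they must not be passed to free(); the caller that allocated A and a remains responsible for releasing them.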

the individual elements of S are accessed in calc_row(). i didn't post it before because it's confusing but here it is:
Code:
```
void *calc_row(void * t_num)
{
int threadnum = (int)t_num;        //the thread number is also the row number
int z,m;
for(z=0;z<aa[matrnum];z++){       //aa[matrnum] is the number of columns
for(m=0;m<aa[matrnum-1];m++){
}
}
#ifdef SESC
sesc_exit(0);
#endif
exit(0);
}
```
what i'm trying to do with S is have a global array of matrices equal to A so that i can use it in calc_row(). i don't need to modify it, just set it equal to A and then read its values.

6.
Why not pass T and A as parameters to calc_row?
As well as aa?

And where is the return statement?

7.
i'm supposed to optimize this code. doing what you suggest would force me to create a struct with all of those values, and create (as well as modify) a copy of that struct for each thread. each of those structs alone would take a minimum of 3 casts as well as a separate cast for each internal variable i would have to modify (both in programY() and calc_row()). that doesn't seem very efficient or simple. if i have to i'll do it, but i don't know why i'd have to.

the return statement is in the part of the code i didn't post (again to avoid confusion):
Code:
```
#ifdef SESC
for(i=1;i<mat_num;i++)
{
matrnum=i-1;
for(j=0;j<a[j];j++)
{
sesc_spawn((void *) *calc_row, (void *)j, 0);      //one thread per row of first matrix (for T*A, this means T)
}
sesc_wait();
}
sesc_exit(0);
result = T[mat_num-1];
#else
result = NULL;
#endif
```

8.
> i'm supposed to optimize this code.
You'd be better off trying to make it work first, before you try to make it quick.

> one thread per row of first matrix
How many REAL processors does your machine have? 1? 2?
Instead of getting on with the job, you're just going to be thrashing around inside the thread library working out which thread to run next.
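
One way to avoid a thread per row is to cap the number of workers at the number of processors and hand each worker a contiguous block of rows. A minimal sketch of the index arithmetic only, assuming a hypothetical `row_range` struct and `rows_for_worker()` helper (neither appears in the posted code):

```c
/* Half-open range [begin, end) of rows handled by one worker. */
struct row_range {
    int begin;
    int end;
};

/* Split `rows` rows as evenly as possible among `workers` workers.
   Assumes workers > 0 and 0 <= id < workers. */
static struct row_range rows_for_worker(int rows, int workers, int id)
{
    int base = rows / workers;    /* rows every worker gets */
    int extra = rows % workers;   /* the first `extra` workers take one more */
    struct row_range r;

    r.begin = id * base + (id < extra ? id : extra);
    r.end = r.begin + base + (id < extra ? 1 : 0);
    return r;
}
```

With 10 rows and 4 workers this gives the blocks [0,3), [3,6), [6,8) and [8,10); a worker whose range is empty simply has nothing to do.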

> for(i=1; i<mat_num-1;i++)
Why the -1 here, when the allocation loop went one step further?
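
A sketch of allocation and deallocation loops that share the same bounds, with the result matrix detached before the cleanup instead of being skipped by a shorter loop. All helper names are hypothetical, and it assumes T[i] has a[0] rows and a[i+1] columns and that mat_num is at least 1; the thread never states the row count, so treat that as a guess:

```c
#include <stdlib.h>

/* Allocate one rows x cols matrix with zero-initialised bytes; NULL on failure. */
static double **alloc_matrix(int rows, int cols)
{
    double **m = malloc(rows * sizeof *m);
    if (m == NULL)
        return NULL;
    for (int r = 0; r < rows; r++) {
        m[r] = calloc(cols, sizeof **m);
        if (m[r] == NULL) {          /* undo the rows allocated so far */
            while (r-- > 0)
                free(m[r]);
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Free a matrix made by alloc_matrix(); a NULL matrix is ignored. */
static void free_matrix(double **m, int rows)
{
    if (m == NULL)
        return;
    for (int r = 0; r < rows; r++)
        free(m[r]);
    free(m);
}

/* Free T[1..mat_num-1] and T itself: the same bounds as the allocation loop. */
static void free_temps(double ***T, const int *a, int mat_num)
{
    if (T == NULL)
        return;
    for (int i = 1; i < mat_num; i++)
        free_matrix(T[i], a[0]);
    free(T);
}

/* Allocate T[1..mat_num-1]; T[0] stays NULL because nothing is owned there. */
static double ***alloc_temps(const int *a, int mat_num)
{
    double ***T = malloc(mat_num * sizeof *T);
    if (T == NULL)
        return NULL;
    for (int i = 0; i < mat_num; i++)
        T[i] = NULL;
    for (int i = 1; i < mat_num; i++) {
        T[i] = alloc_matrix(a[0], a[i + 1]);
        if (T[i] == NULL) {
            free_temps(T, a, mat_num);
            return NULL;
        }
    }
    return T;
}

/* Take the last matrix out of T so free_temps() leaves it alone. */
static double **detach_result(double ***T, int mat_num)
{
    double **result = T[mat_num - 1];
    T[mat_num - 1] = NULL;
    return result;
}
```

The caller would then free the returned matrix with `free_matrix(result, a[0])` once it is done with it.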

Why (in your original code) is a never accessed?

> i'm getting all sorts of weird errors.
And you expect us to magic up a definitive answer from such a vague bug report?

9.
as i said, before i started multithreading and optimizing, it worked.

i'm running this on a simulator which allows me to change various parts of the system architecture including the number of processors, so thrashing won't be an issue.

because the threading library is specific to the simulator, i can't really give you a bug report. what i do get is a "segmentation fault <core dumped>", but i figure that's because of some of the indexing and array size issues, which you correctly pointed out i have. i'm still working on those.

while i appreciate whatever help you can give with those issues and with anything else you might notice, i posted here because i thought there might be additional problems related to memory allocation. i wasn't expecting you to magic up anything, i just wanted to know what i still had to allocate/deallocate.

10.
to work with globals you need a locking mechanism, and that will kill the idea of multithreading for speed optimization.
As I said - to get a speed boost from multithreading - you need to provide each thread with its own data to be processed to avoid locking.
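
A minimal sketch of giving each worker its own data, assuming one argument struct per row kept in an array so that no two workers share a mutable value. The names `row_args`, `calc_row_worker` and `multiply_rows` are made up for illustration, and the worker is called directly where a spawn call such as `sesc_spawn` would go, so the sketch does not depend on any thread library:

```c
#include <stdlib.h>

/* Everything one worker needs. The inputs are shared but only read;
   each worker writes only its own row of `out`. */
struct row_args {
    double **left;    /* rows x inner, read-only */
    double **right;   /* inner x cols, read-only */
    double **out;     /* rows x cols */
    int inner;
    int cols;
    int row;          /* the row this worker computes */
};

/* Thread-style entry point: computes one row of out = left * right. */
static void *calc_row_worker(void *p)
{
    const struct row_args *args = p;   /* the only conversion needed */

    for (int c = 0; c < args->cols; c++) {
        double sum = 0.0;
        for (int k = 0; k < args->inner; k++)
            sum += args->left[args->row][k] * args->right[k][c];
        args->out[args->row][c] = sum;
    }
    return NULL;
}

/* One struct per row, alive until every row is done. Returns 0 on success. */
static int multiply_rows(double **left, double **right, double **out,
                         int rows, int inner, int cols)
{
    struct row_args *args = malloc(rows * sizeof *args);
    if (args == NULL)
        return -1;

    for (int r = 0; r < rows; r++) {
        args[r] = (struct row_args){ left, right, out, inner, cols, r };
        /* A spawn call would take the place of this direct call. */
        calc_row_worker(&args[r]);
    }

    /* With real threads, wait for all of them before freeing args. */
    free(args);
    return 0;
}
```

The array matters: a single struct reused inside the loop could be overwritten for the next row before an earlier thread has read it.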

So start by modifying the code so that no globals are used.

11.
okay, you're definitely right, at least about matrnum and aa (the rest of the individual variables are part of an array or a matrix and are not accessed by more than one thread, so there might be some way around that), so i will need a struct, which is annoying.
but i'll also need to pass T (and probably A) by reference, which i'm not sure how to do. any ideas of how to do this or get around it?
i'll post the new code with the structs as soon as i sort through the various casts. might take a while.

12.
okay:
Code:
```
#include <stdlib.h>
#ifdef SESC
#include "sescapi.h"
#endif

struct args{
double ****TT;	//TT is a pointer to T
double ****S;	//S is a pointer to A
int *aa;        //aa is a copy of a
int matr_num;    //the number of the matrix currently being calculated
int t_num;       //the number of the thread curently running
};

//main function for calculating the row values
void *calc_row(void * t_args)
{
struct args *aargs = (struct args *) t_args;
int threadnum = aargs->t_num;       //the thread number is also the row number
int matrnum = aargs->matr_num;
int col,m;
for(col=0;col<(aargs->aa)[matrnum+1];col++){       //aa[matrnum+1] columns in TT
for(m=0;m<(aargs->aa)[matrnum];m++){              //aa[matrnum] rows in S
}
}
#ifdef SESC
sesc_exit(0);
#endif
exit(0);
}

double **programY(double ***A, int *a, int mat_num)
{
#ifdef SESC
sesc_init();
#endif
double **result;
double ***T;
int i, j;
/* memory allocation for intermediate results */
T = (double ***) malloc((mat_num)*sizeof(double **));
for(i=1; i<mat_num;i++)
{
T[i] = (double **)malloc(a*sizeof(double *));
for(j=0; j<a; j++)
{
T[i][j] = (double *)malloc(a[i+1] * sizeof(double));
}
}
T = A;
/* matrix multiplication */
#ifdef SESC
for(i=1;i<mat_num;i++)
{
for(j=0;j<a;j++) //a rows in T[i]
{
struct args targs;
targs.TT=&T;
targs.S=&A;
targs.aa=a;
targs.matr_num=i;
targs.t_num=j;
sesc_spawn((void *) *calc_row, (void *)&targs, 0);      //one thread per row of first matrix (for T*A, this means T)
}
sesc_wait();
}
sesc_exit(0);
result = T[mat_num-1];
#else
result = NULL;
#endif
/* memory deallocation - this loop does not deallocate T(= A) and T[mat_num](=result).*/
for(i=1; i<mat_num-1;i++)
{
for(j=0; j<a; j++)
{
free(T[i][j]);
}
free(T[i]);
}
free(T);
return result;
}
```
to pass T by reference, i made the struct's TT variable into a pointer to a three dimensional array, which i initialize in programY() to the address of T, and i dereference it in calc_row. i did the same basic thing for A. my head hurts. and not so surprisingly this didn't work.

13.
i found the indexing problem and i think it's working now, though i suspect it has memory leaks and the like. please tell me if you guys notice anything. and thank you.

update:
i was wrong, i still have yet to find the issue. please tell me if you happen to see it

14.
i ended up using a different means of optimization. thank you all for your help.