Thread: Malloc and MPI

  1. #1
    Registered User
    Join Date
    Jun 2007
    Posts
    26

    Malloc and MPI

    I have a couple of questions. First, when using malloc to dynamically create an array, the new array is still contiguous in memory, correct?

    Second, I'm trying to send a C structure in MPI in which one of the members is an array. If I specify the array size at compile time, everything works wonderfully. The problem comes when I dynamically create the array: the information never gets passed. Here's an example of some simplified code I'm working on.

    Code:
    #include<stdio.h>
    #include<stdlib.h>
    #include "mpi.h"
    
    //Structure I'm Trying to Send
    struct ListNode
    {
    	double* a;
    	double b;
    	int n;
    };
    typedef struct ListNode Node;
    
    //Function to build derived MPI datatype
    void Build_derived_type(Node* myNode, MPI_Datatype* mesg_mpi_t_ptr);
    
    int main(int argc, char* argv[])
    {
    	int nodeRank=0, numNodes,i;
    	MPI_Datatype MPI_NODE; //New Datatype to send structure
    	MPI_Status status;
    	Node myNode;
    	
    	MPI_Init(&argc,&argv); //Initializes MPI
    	MPI_Comm_size(MPI_COMM_WORLD,&numNodes); //find out how many nodes there are total
    	MPI_Comm_rank(MPI_COMM_WORLD,&nodeRank); //Finds out rank of this node
    
    	if(nodeRank == 0)
    	{
    		myNode.a = malloc(sizeof(double)*5); //Dynamically create array
    		//Fill array
    		myNode.a[0] = 1.1;
    		myNode.a[1] = 2.2;
    		myNode.a[2] = 3.3;
    		myNode.a[3] = 4.4;
    		myNode.a[4] = 5.5;
    		
    		myNode.b = 4.5;
    		myNode.n = 6;
    	
    	}
    	
    	
    	Build_derived_type(&myNode, &MPI_NODE); //Builds derived type
    	
    	//Send structure from node 0 to node 5
    	if(nodeRank == 0)
    	{
    		MPI_Send(&myNode,1,MPI_NODE,5,1,MPI_COMM_WORLD);
    	}
    	//Node 5 receives structure and displays data
    	if(nodeRank == 5)
    	{
    		MPI_Recv(&myNode,1,MPI_NODE,0,1,MPI_COMM_WORLD,&status);
    		printf("Noderank = %d\n",nodeRank);
    		for(i=0;i<5;i++)
    			printf("a[%d] = %f\n",i,myNode.a[i]);
    		printf("b = %f\nn = %d\n",myNode.b,myNode.n);
    	}
    
    	
    	MPI_Finalize();
    	
    	return 0;
    }
    
    void Build_derived_type(Node* myNode, MPI_Datatype* MPI_NODE_ptr)
    {
    	int block_lengths[3];
    	MPI_Aint displacements[3];
    	
    	MPI_Datatype typelists[3];
    	MPI_Aint start_address;
    	MPI_Aint address;
    	//fill block lengths
    	block_lengths[0] = 5;
    	block_lengths[1] = block_lengths[2] = 1;
    	//Fill Typelists
    	typelists[0] = MPI_DOUBLE;
    	typelists[1] = MPI_DOUBLE;
    	typelists[2] = MPI_INT;
    	
    	//Calculate Displacements
    	displacements[0] = 0;
    	MPI_Get_address(&((*myNode).a),&start_address);
    	MPI_Get_address(&((*myNode).b),&address);
    	displacements[1] = address - start_address;
    	MPI_Get_address(&((*myNode).n),&address);
    	displacements[2] = address - start_address;
    	//Create and Commit new mpi type
    	MPI_Type_create_struct(3,block_lengths,displacements,typelists,MPI_NODE_ptr);
    	MPI_Type_commit(MPI_NODE_ptr);
    }
    The program runs with no errors, but node 5 never gets the data from the array; it gets all the other members of the structure just fine.

    Any help is appreciated.

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Yes, the data allocated by malloc is contiguous from a software standpoint, i.e. in the virtual address space (but the backing pages that you get may be ANY physical memory, in (usually) 4K blocks).

    But the problem here isn't that. The problem is that you are passing a data structure with a pointer inside it: the pointer value is passed, but not the data it points to. So while a holds the same value as on node 0, the memory it points to on node 5 is something completely random (that location may not even be valid on node 5 at all).

    There are several solutions; the most trivial one is to use a "flexible length struct", such as this:
    Code:
    struct ListNode
    {
    	double b;
    	int n;
    	double a[1];
    };
    
    .... 
    
    struct ListNode *myNode;
    
    myNode = malloc(sizeof(*myNode) + sizeof(double) * (5-1));    
    // -1 is because the struct already has space for a[0], so we need only allocate the following ones
    
    // Now just use "myNode->" instead of "myNode."
    ... 
    // Send the pointer, not the address of the pointer, so remove &. 
    MPI_Send(myNode, ...);
    ...
    --
    Mats

  3. #3
    Registered User
    Join Date
    Jun 2007
    Posts
    26
    Thanks for the quick reply. I knew it had to be something with the memory; I just wasn't able to pin it down. It makes a lot more sense now. That said, I tried implementing what you suggested, but the program aborts when it hits the recv. It aborts even when I don't dynamically allocate the memory. I've looked over my code, but I'm not sure where I'm going wrong. If you could look it over again it would be a great help; I'm sure it's just something simple that I missed.

    thanks

    Code:
    #include<stdio.h>
    #include<stdlib.h>
    #include "mpi.h"
    
    //Structure I'm Trying to Send
    struct ListNode
    {
    	double a[1];
    	double b;
    	int n;
    };
    typedef struct ListNode Node;
    
    //Function to build derived MPI datatype
    void Build_derived_type(Node myNode, MPI_Datatype* mesg_mpi_t_ptr);
    
    int main(int argc, char* argv[])
    {
    	int nodeRank, numNodes,i;
    	MPI_Datatype MPI_NODE; //New Datatype to send structure
    	MPI_Status status;
    	Node *myNode;
    	
    	MPI_Init(&argc,&argv); //Initializes MPI
    	MPI_Comm_size(MPI_COMM_WORLD,&numNodes); //find out how many nodes there are total
    	MPI_Comm_rank(MPI_COMM_WORLD,&nodeRank); //Finds out rank of this node
            myNode = malloc(sizeof(myNode) + sizeof(double)*(5-1)); //Dynamically create array
    	if(nodeRank == 0)
    	{
    		
    		//Fill array
    		myNode->a[0] = 1.1;
    		myNode->a[1] = 2.2;
    		myNode->a[2] = 3.3;
    		myNode->a[3] = 4.4;
    		myNode->a[4] = 5.5;
    		
    		myNode->b = 4.5;
    		myNode->n = 6;
    	}
    	
    	
    	Build_derived_type(*myNode, &MPI_NODE); //Builds derived type
    
    	//Send structure from node 0 to node 5
    	if(nodeRank == 0)
    	{
    		MPI_Send(myNode,1,MPI_NODE,5,1,MPI_COMM_WORLD);
    	}
    	//Node 5 receives structure and displays data
    	if(nodeRank == 5)
    	{
    		MPI_Recv(myNode,1,MPI_NODE,0,1,MPI_COMM_WORLD,&status);		
    		printf("Noderank = &#37;d\n",nodeRank);
    		for(i=0;i<5;i++)
    			printf("a[%d] = %f\n",i,myNode->a[i]);
    		printf("b = %f\nn = %d\n",myNode->b,myNode->n);
    	}
    	
    	MPI_Finalize();
    	
    	return 0;
    }
    
    void Build_derived_type(Node myNode, MPI_Datatype* MPI_NODE_ptr)
    {
    	int block_lengths[3];
    	MPI_Aint displacements[3];
    	
    	MPI_Datatype typelists[3];
    	MPI_Aint start_address;
    	MPI_Aint address;
    	//fill block lengths
    	block_lengths[0] = 5;
    	block_lengths[1] = block_lengths[2] = 1;
    	//Fill Typelists
    	typelists[0] = MPI_DOUBLE;
    	typelists[1] = MPI_DOUBLE;
    	typelists[2] = MPI_INT;
    	
    	//Calculate Displacements
    	displacements[0] = 0;
    	MPI_Get_address(&(myNode.a),&start_address);
    	MPI_Get_address(&(myNode.b),&address);
    	displacements[1] = address - start_address;
    	MPI_Get_address(&(myNode.n),&address);
    	displacements[2] = address - start_address;
    	//Create and Commit new mpi type
    	MPI_Type_create_struct(3,block_lengths,displacements,typelists,MPI_NODE_ptr);
    	MPI_Type_commit(MPI_NODE_ptr);
    }
    Also, should the malloc statement contain "sizeof(myNode)" or "sizeof(*myNode)"?

    Thanks again

    EDIT: moved the malloc statement out of the if statement. The code will now run, but with interesting output. Here is what I get.
    Code:
    Noderank = 5
    a[0] = 1.1000000
    a[1] = 4.5000000
    a[2] = 3.2999999
    a[3] = 4.4000000
    a[4] = 5.5000000
    b = 4.5000000
    n = 6
    Last edited by moddinati; 08-02-2007 at 09:28 AM.

  4. #4
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    There can be only one flexible array member in a struct, and it needs to be the last member of the struct.
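
    For instance, a minimal sketch assuming a C99 compiler, where a true flexible array member replaces the a[1] trick (no dummy element, so no (5-1) adjustment when allocating):
    Code:
    struct ListNode
    {
    	double b;
    	int n;
    	double a[]; //Flexible array member: must be last, contributes no size
    };
    typedef struct ListNode Node;
    
    Node *myNode = malloc(sizeof(*myNode) + 5 * sizeof(double));
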
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Registered User
    Join Date
    Jun 2007
    Posts
    26
    In my actual program, I will end up having three dynamically created arrays as members of my struct. Is there another way to work around this issue?

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Yes, send each array separately (along with a "start-struct" perhaps).
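
    For the first option, here's a rough sketch (hypothetical: it reuses the ListNode, myNode, status and nodeRank declarations from the first post, uses distinct tags 1-3, and sends the length first so the receiver knows how much to allocate):
    Code:
    if(nodeRank == 0)
    {
    	int len = 5; //Number of elements in myNode.a
    	MPI_Send(&len, 1, MPI_INT, 5, 1, MPI_COMM_WORLD);
    	MPI_Send(&myNode.b, 1, MPI_DOUBLE, 5, 2, MPI_COMM_WORLD);
    	//No & on the array: a already points at the data
    	MPI_Send(myNode.a, len, MPI_DOUBLE, 5, 3, MPI_COMM_WORLD);
    }
    if(nodeRank == 5)
    {
    	int len;
    	MPI_Recv(&len, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, &status);
    	MPI_Recv(&myNode.b, 1, MPI_DOUBLE, 0, 2, MPI_COMM_WORLD, &status);
    	myNode.a = malloc(len * sizeof(double)); //Allocate before receiving
    	MPI_Recv(myNode.a, len, MPI_DOUBLE, 0, 3, MPI_COMM_WORLD, &status);
    }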

    Or, of course, you could have fixed allocation for the data structure passed, so you send (say) a set of 256 variables.

    If all the data is the same size, you could do something like this:

    Code:
    struct node_data {
    ...  // some data that is "always there".
      struct variable_size_data {
          double a;
          double b;
          double c;
      } datablock[1];
    };
    --
    Mats

  7. #7
    Registered User
    Join Date
    Jun 2007
    Posts
    26
    So I've been doing some reading, and it seems I have two choices:
    1) Send each array separately, as already mentioned, although the memory would still be non-contiguous, which might cause issues.

    2) I noticed MPI has Pack/Unpack routines that look like they might work. From my reading, they incur additional overhead every single time they are called, whereas in option 1 building the derived datatype only incurs the overhead once, at creation.

    So here is my question: does anyone know which would be more efficient in the end? (I will be sending three dynamically created arrays and an integer.)

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Assuming your MPI traffic goes over gigabit or slower Ethernet, it's probably not important which method you use, as the slowest part will be getting the data to the other node.

    I'm not entirely sure that MPI pack does what you want anyway.

    --
    Mats

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Further to this: I have looked a bit more at pack/unpack, and I think you CAN do what you want with it, if you would like to.

    There is an advantage to doing a single send, compared to multiple send operations, as you use less bandwidth on the network (for small packets of data; with large packets it makes little difference).

    Are your three data arrays significantly different in size, or could you make something that works along the lines of the a, b and c struct in my earlier post? A small waste isn't going to make much of a difference, but if it starts being a big bunch of "empty data" then it's probably better to use pack/unpack.
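
    Something along these lines could work; this is a rough, hypothetical sketch (the 0-to-5 ranks mirror your earlier code, and the buffer size is just a guess; MPI_Pack_size can compute it exactly). The length is packed first, so the receiver can malloc before unpacking:
    Code:
    #include<stdio.h>
    #include<stdlib.h>
    #include "mpi.h"
    
    int main(int argc, char* argv[])
    {
    	int nodeRank, pos = 0, len = 5, i;
    	char buf[4096]; //Assumed big enough for this sketch
    	double *a;
    	MPI_Status status;
    
    	MPI_Init(&argc,&argv);
    	MPI_Comm_rank(MPI_COMM_WORLD,&nodeRank);
    
    	if(nodeRank == 0)
    	{
    		a = malloc(len*sizeof(double));
    		for(i=0;i<len;i++)
    			a[i] = 1.1*(i+1);
    		//Pack the length first, then the payload, into one buffer
    		MPI_Pack(&len, 1, MPI_INT, buf, sizeof(buf), &pos, MPI_COMM_WORLD);
    		MPI_Pack(a, len, MPI_DOUBLE, buf, sizeof(buf), &pos, MPI_COMM_WORLD);
    		//One send for everything; pos is the packed size in bytes
    		MPI_Send(buf, pos, MPI_PACKED, 5, 1, MPI_COMM_WORLD);
    		free(a);
    	}
    	if(nodeRank == 5)
    	{
    		MPI_Recv(buf, sizeof(buf), MPI_PACKED, 0, 1, MPI_COMM_WORLD, &status);
    		//Unpack the length, allocate, then unpack the array itself
    		MPI_Unpack(buf, sizeof(buf), &pos, &len, 1, MPI_INT, MPI_COMM_WORLD);
    		a = malloc(len*sizeof(double));
    		MPI_Unpack(buf, sizeof(buf), &pos, a, len, MPI_DOUBLE, MPI_COMM_WORLD);
    		for(i=0;i<len;i++)
    			printf("a[%d] = %f\n", i, a[i]);
    		free(a);
    	}
    
    	MPI_Finalize();
    	return 0;
    }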

    --
    Mats

  10. #10
    Registered User
    Join Date
    Jun 2007
    Posts
    26
    Thanks for taking the time to follow up on it.

    Two of the arrays are always exactly the same size. Their magnitude can vary a lot, but it will be anywhere in the range from 1 to 500ish, maybe a little more.

    The other array also varies, but most of the time it will probably be anywhere from 2 to 10 or so.

    As far as communication speed goes, this code will be run on regular clusters, but it will also be run on SMPs, so it will potentially have faster communication.

    What I have been trying to do is create two send commands for each array: one for index 0 of the array, since it is stored in the structure and not contiguous with the rest of the array, and another to send the rest of the array (starting at index 1 and using the ability to specify a larger count). I would then manually rebuild the structure on the receiving side. Between my test cluster acting up today and my limited experience with MPI, I have not made much progress. I think I might look more into pack/unpack, since optimization is desirable.
    Last edited by moddinati; 08-02-2007 at 05:10 PM.

  11. #11
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by moddinati View Post
    Thanks for taking the time to follow up on it.

    Two of the arrays are always exactly the same size. Their magnitude can vary a lot, but it will be anywhere in the range from 1 to 500ish, maybe a little more.
    So make this a "variable size at the end of struct".

    The other array also varies, but most of the time it will probably be anywhere from 2 to 10 or so.
    If you can determine the max limit, then I'd make this a fixed-size array of 10 or 12 or whatever you think is big enough for every use-case; losing some 8-10 floats' worth of space isn't much in the whole scheme of things.

    So your total structure would look something like this:
    Code:
    struct something {
       ...
       int n_xy;
       int n_z;
       ...
       double z[16];
       struct _xy {
          double x;
          double y;
       } xy[1];
    };

    As far as communication speed goes, this code will be run on regular clusters, but it will also be run on SMPs, so it will potentially have faster communication.

    What I have been trying to do is create two send commands for each array: one for index 0 of the array, since it is stored in the structure and not contiguous with the rest of the array, and another to send the rest of the array (starting at index 1 and using the ability to specify a larger count). I would then manually rebuild the structure on the receiving side. Between my test cluster acting up today and my limited experience with MPI, I have not made much progress. I think I might look more into pack/unpack, since optimization is desirable.
    I would have thought my suggestion above is about as optimal as you can get given the conditions you describe: a few dozen bytes of "waste" because the "z" array is too large, but not enough to make much of a difference. The "xy" array is exactly the size it needs to be, so it's not wasting any space.

    I added the fields "n_xy" and "n_z" to indicate how much data there is in each of the arrays.
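
    For what it's worth, allocating that struct would look something like this (a sketch; n_xy and n_z are assumed known at allocation time):
    Code:
    //(n_xy - 1) because xy[1] already provides space for one element
    struct something *p = malloc(sizeof(*p) + (n_xy - 1) * sizeof(struct _xy));
    p->n_xy = n_xy;
    p->n_z = n_z; //Must be at most 16, the fixed size of z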

    --
    Mats

  12. #12
    Registered User
    Join Date
    Jun 2007
    Posts
    26
    So make this a "variable size at the end of struct".
    I have two different arrays of that size, so I don't think that would work.

    If you can determine the max limit, then I'd make this a fixed-size array of 10 or 12 or whatever you think is big enough for every use-case; losing some 8-10 floats' worth of space isn't much in the whole scheme of things.
    I'm not sure if I'm going to be allowed to do that. The code itself is a clustering algorithm in which the magnitude of the input data can change dramatically; those were just rough estimates. Technically, those arrays could be pretty much any size.


    I'm playing around with something right now, but I'm running into an issue. What is the difference between the following two pieces of code? The program is fine with the first but not with the second.
    Code:
    struct Node myNode;
    myNode.array = malloc(sizeof(double)*5); //array is a double* inside the structure
    and

    Code:
    struct Node* myNode;
    myNode->array = malloc(sizeof(double)*5); //array is a double* inside the structure
    Thanks. I'm not really a CS major, so my understanding of the deeper levels of the code is limited.

  13. #13
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    My struct suggestion already covers two variable-size arrays (the x and y members of the xy block). As long as the third array has relatively few elements and it is possible to predict a reasonable maximum for it, you should be able to use the "two element" variable-size array as I've described above.

    But obviously, you can also solve the problem by doing multiple send operations; the only problem is that the sending machine goes idle until the other machine has received each packet.

    --
    Mats

  14. #14
    Sasquatch
    Join Date
    Dec 2006
    Location
    Caves of Narshe
    Posts
    16
    Quote Originally Posted by moddinati View Post
    ..
    I'm playing around with something right now, but I'm running into an issue. What is the difference between the following two pieces of code? The program is fine with the first but not with the second.
    Code:
    struct Node myNode;
    myNode.array = malloc(sizeof(double)*5); //array is a double* inside the structure
    and

    Code:
    struct Node* myNode;
    myNode->array = malloc(sizeof(double)*5); //array is a double* inside the structure
    You need to allocate space for the structure also.

    Code:
    struct Node *n = malloc(sizeof(struct Node));
    n->array = malloc(5 * sizeof(double));
    Last edited by mog; 08-03-2007 at 02:22 PM. Reason: typo

  15. #15
    Registered User
    Join Date
    Jun 2007
    Posts
    26
    Quote Originally Posted by matsp View Post
    My struct suggestion contains two elements that are variable size. As long as the third element has relatively few elements and it is possible to predict the maximum reasonable number, you should be able to use a "two element" variable size array as I've described above.

    --
    Mats
    I think I misunderstood you initially; thanks for clarifying.

    Quote Originally Posted by mog View Post
    You need to allocate space for the structure also.

    Code:
    struct Node *n = malloc(sizeof(struct Node));
    n->array = malloc(5 * sizeof(double));
    Thanks for the reply. I realized the problem was that I was using the Node pointer without first pointing it at anything. Oops.

    Do you know if this is the best forum to post MPI questions, or is there one specifically for MPI somewhere? I appreciate the help; it's hard to find resources for MPI.
