Thread: Messages Between Different Architectures

  1. #1
    PhysicistTurnedProgrammer Cell
    Join Date
    Jan 2009
    Location
    New Jersey
    Posts
    72

    Messages Between Different Architectures

    Hey all,

    I am trying to send messages between processors in a few different configurations.

    I have one processor with architecture A and a few processors with architecture B. All processors are connected via a network.

    I am using MPI to send messages between processors. In my programs I want to measure how long each message takes. For this I am using MPI_Wtime(). I read the time before a message is sent and again after it is received, and from that I find the elapsed time.

    Now, when I time messages sent from one architecture B processor to another, I get good time data - that is, the time values are of a reasonable order of magnitude. I can also log times from processor B to processor A.

    However, when I try to time messages sent from architecture A to architecture B, I get complete gibberish. I have time values in the range of BILLIONS of seconds for a 1 KB message. There is no way that is correct.

    To recap, I get reasonable timing data for messages sent from:

    Architecture B to Architecture B
    Architecture B to Architecture A

    I get very unreasonable timing data when I time messages sent from:

    Architecture A to Architecture B.

    A very rough idea of what I am doing is:

    Code:
        
        double start_time, end_time;
        FILE *timing;

        timing = fopen("timing", "w");

        /* Each rank reads its own local clock. */
        start_time = MPI_Wtime();

        /* Rank 0 sends 10 chars to rank 1 with tag 99. */
        if (myrank == 0)
          MPI_Send(message, 10, MPI_CHAR, 1, 99, MPI_COMM_WORLD);

        /* Every other rank waits for a message from rank 0. */
        if (myrank != 0)
          MPI_Recv(message, 10, MPI_CHAR, 0, 99, MPI_COMM_WORLD, &status);

        end_time = MPI_Wtime();

        fprintf(timing, "Time: %f\n", (end_time - start_time));
    Again, that is just a very rough outline of the code.

    Any ideas?
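
    In the outline above each rank subtracts its own two clock readings, so on the receiving rank the result also includes however long it sat waiting before rank 0 called MPI_Send. MPI_Wtime is also local to the calling process unless the MPI_WTIME_IS_GLOBAL attribute reports synchronized clocks. A common workaround is to time a round trip (send plus reply) on one rank only and halve it. The helper below is a hypothetical sketch of just that arithmetic; the name one_way_estimate and its parameters are not from the thread, and it assumes the two directions take about the same time.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the thread): estimate the one-way time of a
 * message from round trips timed on a single rank.
 *
 * t_start, t_end: two readings of the SAME process's clock, for example
 *                 MPI_Wtime() on rank 0 before the first send and after the
 *                 last reply has been received.
 * round_trips:    number of send/reply pairs between the two readings.
 *
 * Returns a negative value if the inputs cannot describe a measurement.
 */
double one_way_estimate(double t_start, double t_end, size_t round_trips)
{
    if (round_trips == 0 || t_end < t_start)
        return -1.0;
    /* Each round trip carries the message twice: there and back. */
    return (t_end - t_start) / (2.0 * (double)round_trips);
}
```

    On rank 0 this would be fed with MPI_Wtime() taken before a loop of MPI_Send/MPI_Recv pairs and again after it, while rank 1 simply echoes each message back. Averaging over several round trips also smooths out the limited clock resolution reported by MPI_Wtick().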

  2. #2
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    My guess is that this is an issue with the architectures using different endianness. What is the format of the message you are sending across? Are you embedding integer values in this message?
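
    For context, byte order only matters for multi-byte values such as integers or doubles; single-byte char data arrives the same on either kind of machine. The sketch below shows how a 32-bit value changes when its bytes are read in the opposite order. The function names host_is_little_endian and swap_u32 are made up for illustration and are not part of MPI.

```c
#include <stdint.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    /* On little-endian hosts the low-order byte is stored first. */
    return *(const unsigned char *)&probe == 1;
}

/* Reverses the byte order of a 32-bit value. */
uint32_t swap_u32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}
```

    A value such as 0x01020304 sent as raw bytes and reinterpreted on a machine with the opposite byte order would read as 0x04030201, which is the kind of wildly wrong number this guess is about.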

  3. #3
    PhysicistTurnedProgrammer Cell
    Join Date
    Jan 2009
    Location
    New Jersey
    Posts
    72
    I've actually been testing the timing by just sending an array filled with the character 's'.

  4. #4
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    My mistake, I misread your post. I thought you were sending the time in your message.

    In that case, you might be overwriting some stack memory, which would corrupt your timing calculation. I suggest this:

    Code:
        double start_time, end_time;
        FILE	*timing;
    
        timing = fopen("timing", "w");
    
        start_time = MPI_Wtime();  
        printf("Initial start time: %f\n", start_time);
        if (myrank == 0)
          MPI_Send(message, 10, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
        printf("start time after Send(): %f\n", start_time);
        if (myrank != 0)
          MPI_Recv(message, 10, MPI_CHAR, 0, 99, MPI_COMM_WORLD, &status);
        printf("start time after Recv(): %f\n", start_time);
        end_time = MPI_Wtime();
        printf("start time: %f, end time: %f\n", start_time, end_time);
        fprintf(timing, "Time: %f\n", (end_time - start_time));
    See if any of those print statements show that start_time is being modified when it shouldn't be.

  5. #5
    PhysicistTurnedProgrammer Cell
    Join Date
    Jan 2009
    Location
    New Jersey
    Posts
    72
    I tried what you said and this is the output:

    Code:
    What size message do you want to test?
    A) 1 KB
    B) 10 KB
    
    A
    
    Start time after Send(): 1241648586.974699
    End time after Recv(): 0.000000
    Done.
    End time after Recv(): 1241648586.975898
    Done.
    Now, there are two "End time..." and "Done." statements because that code is running on both processes. I am starting to wonder if this method is going to give me reasonable results.
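
    One reading of this output: 1241648586.97 looks like a count of seconds since the Unix epoch (1 January 1970), which lands in May 2009, around the date of this post. That is an inference from the numbers, not something MPI promises; the standard only says MPI_Wtime returns seconds since some arbitrary time in the past. If it holds, then any rank whose start_time is still 0.0 when it computes end_time - start_time would report roughly 1.24 billion seconds. The sketch below checks both points; utc_year_of and elapsed are hypothetical names, and the code assumes time_t counts seconds from the Unix epoch.

```c
#include <time.h>

/*
 * Hypothetical helper: interpret a clock reading as seconds since the Unix
 * epoch (an assumption about this platform) and return the UTC calendar
 * year, or -1 if the conversion fails.
 */
int utc_year_of(double seconds_since_epoch)
{
    time_t t = (time_t)seconds_since_epoch;
    struct tm *utc = gmtime(&t);
    return utc ? utc->tm_year + 1900 : -1;
}

/*
 * Elapsed time is only meaningful when both readings were actually taken,
 * and taken on the same process.
 */
double elapsed(double start, double end)
{
    return end - start;
}
```

    With the values printed above, utc_year_of(1241648586.974699) gives 2009, and elapsed(0.0, 1241648586.975898) is over a billion seconds, while the two non-zero readings differ by only about a millisecond.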

    This is a shortened version of the code:


    Code:
        if (myrank == 0){
            printf("What size message do you want to test?\n");
            printf("A) 1 KB\n");
            printf("B) 10 KB\n");

            choice = getchar();

            start_time = 0.0;
            start_time = MPI_Wtime();

            switch(choice){

                case 'A':
                {
                    char message[1000];
                    for(int i = 0; i <= 1000; i++){
                        message[i] = 'a';
                    }

                    if (myrank == 0){
                        MPI_Send(message, 1000, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
                        printf("Start time after Send(): %f\n", start_time);
                    }
                    break;
                }

                case 'B':
                {
                    char message[10000];
                    for(int i = 0; i <= 10000; i++){
                        message[i] = 'a';
                    }

                    if (myrank == 0){
                        MPI_Send(message, 10000, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
                        printf("Start time after Send(): %f\n", start_time);
                    }
                    break;
                }

                default:   break;
            }
        }

        if (myrank != 0){
            MPI_Recv(message, 100000, MPI_CHAR, 0, 99, MPI_COMM_WORLD, &status);
            printf("Start time after Send(): %f\n", start_time);
        }

        end_time = MPI_Wtime();
        printf("End time after Recv(): %f\n", end_time);
        fprintf(timing, "Total communication time was: %f\n", (end_time - start_time));
        fprintf(timing, "Message: %s\n", message);
        fprintf(timing, "Precision is: %f\n", precision);
        printf("Done.\n");

        MPI_Finalize();

        fclose(timing);

        return 0;

    }
    As you can see, because the code at the bottom runs on both processes, we get two "End time after..." and "Done." statements.
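
    One thing worth checking in the shortened code: the fill loops run with i <= 1000 on char message[1000] (and i <= 10000 on char message[10000]), so the last iteration writes one element past the end of the array. That is undefined behavior and could clobber a neighboring stack variable, which fits the stack-overwrite suspicion raised in post #4. The buffer is also printed with "%s" without ever being null-terminated. A minimal sketch of a bounded fill follows; fill_message is a hypothetical helper, and it deliberately spends the last byte on the terminator, so the payload is n - 1 characters.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: fill the first n - 1 bytes of buf with c and
 * terminate the buffer with '\0' so it can be printed with "%s".
 * Does nothing if buf is NULL or n is 0.
 */
void fill_message(char *buf, size_t n, char c)
{
    if (buf == NULL || n == 0)
        return;
    memset(buf, c, n - 1);   /* writes indices 0 .. n-2 */
    buf[n - 1] = '\0';       /* n-1 is the last valid index */
}
```

    For the 1 KB case this would be called as fill_message(message, sizeof message, 'a') with char message[1000], and the same sizeof expression can then be reused as the count passed to MPI_Send.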

    The issue is the astronomical values for the time periods. Do you have any ideas?
    Last edited by Cell; 05-07-2009 at 03:39 AM.

  6. #6
    PhysicistTurnedProgrammer Cell
    Join Date
    Jan 2009
    Location
    New Jersey
    Posts
    72
    Edited.
