Writing MATLAB programs using MPI (Open MPI, MPICH, Sun MPI) and executing them on a machine running Sun MPI
This manual will guide you in the development of parallel programs written in MATLAB that will be run on the machine using Sun MPI, although this interface is also valid for Open MPI and MPICH. For the use of MPI with MATLAB under another MPI implementation such as LAM/MPI, take a look at "MPI Toolbox for MATLAB (MPITB)", which offers valuable information on this topic.
1) The MATLAB code
1.1) The main program must be a function
The program we want to execute must be declared as a MATLAB function, for example:
p1.m:
function [c]=p1(a,b)
c=a+b;
return
For more information check a MATLAB tutorial.
1.2) The MPI function
All the standard MPI functions are called through an interface file, MPI.c, which is compiled into a MEX file; this makes it possible to call the MPI functions included in MPI.c as regular MATLAB functions.
The header of the MPI function is:
MPI(MPI_Op, …);
MPI_Op is a character string that identifies the MPI function to be called; it determines the rest of the parameters.
The naming rules follow the MPI standard, in which every function name carries the prefix MPI_ followed by the concrete function name. In our case, we only specify the concrete function, omitting the MPI_ prefix; therefore, to call the MPI_Init and MPI_Finalize functions:
MPI('Init');
MPI('Finalize');
For more documentation about the compiler, see: http://www.mathworks.com/access/helpdesk/help/pdf_doc/compiler/Compiler4.pdf
http://www.mathworks.com/access/helpdesk/help/pdf_doc/compiler/rn.pdf
1.3) MPI functions implemented in MPI.c
int MPI_Init(int *argc, char ***argv)
The function is called as:
[rank, size]=MPI('Init');
where rank is the rank assigned to the process that calls this function and size is the number of processes in MPI_COMM_WORLD.
Internally, this function calls the MPI functions:
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
to get the values for rank and size.
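Putting Init and Finalize together, a minimal program using this interface could look as follows (a sketch only; the fprintf and the intermediate work are illustrative):

```matlab
function skeleton()
% Initialize MPI through the MEX interface; rank identifies this
% process and size_w is the number of processes in MPI_COMM_WORLD
[rank, size_w] = MPI('Init');
fprintf('I am process %d of %d\n', rank, size_w);
% ... parallel work goes here ...
MPI('Finalize');
```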
int MPI_Finalize()
The function is called as:
MPI('Finalize');
int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)
The function is called as:
MPI('Isend',a,sa,'MPI_DOUBLE',dest,tag,commu,req)
Where:
a is a vector with the elements to be sent
sa is the number of elements in a
dest is the rank of the processor that will receive the message
tag is an integer specifying the tag of the operation
commu is an integer defining the communicator; if the operation is in
MPI_COMM_WORLD then commu must be 'MPI_COMM_WORLD'
req is an integer identifying the request
int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request)
Which is called by:
b=MPI('Irecv',sb,'MPI_DOUBLE', source, tag,commu,req1);
where
b will contain the received message
sb is the size of the message that will be received
source is the rank of the process that sends the message
tag is the label of the operation
req1 is an integer identifying the request
int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status)
Which is called by:
condition=MPI('Test', req);
where
req is the integer indicating the request number
condition will be 0 if the operation has not yet completed, and 1 if it has completed
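Together, Irecv and Test support the usual non-blocking pattern of overlapping communication with computation; a sketch (the source rank, tag, and message size are illustrative):

```matlab
req1 = 0;                         % request slot used by the wrapper
% post a non-blocking receive of 2 doubles from rank 0 with tag 201
b = MPI('Irecv', 2, 'MPI_DOUBLE', 0, 201, 'MPI_COMM_WORLD', req1);
condition = 0;
while condition == 0
    % ... do useful work while waiting for the message ...
    condition = MPI('Test', req1);   % becomes 1 once the message has arrived
end
```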
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
Which is called by:
MPI('Send',b, sb, 'MPI_DOUBLE', dest, tag, comm);
where
b is the vector with the elements to be sent
sb is the number of elements of b
dest is the rank of the destination of the message
tag is the tag of the operation
comm can be an integer specifying the communicator or the string 'MPI_COMM_WORLD'
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
Which is called by:
b=MPI('Recv',sb,'MPI_DOUBLE', source, tag,comm);
where
b will contain the message received
sb is the integer specifying the number of elements of b
source is the integer specifying the rank of the process sending the message
tag is the tag of the operation
comm is the integer specifying the communicator or the string 'MPI_COMM_WORLD'
int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
Which is called by:
b=MPI('Bcast',[],sb,'MPI_DOUBLE', source, rank,comm);
if the process is receiving and by:
MPI('Bcast',b,sb,'MPI_DOUBLE',rank,rank,comm);
if the process is sending, where
b will contain the received message if we are receiving; if we are sending, b must contain the message to be sent
sb is the integer with the number of elements to be sent or received
source contains the rank of the process that is sending
rank is the integer that identifies the process calling the function
comm is the integer specifying the communicator or 'MPI_COMM_WORLD'
int MPI_Comm_group(MPI_Comm comm, MPI_Group *group)
Which is called by:
MPI('Comm_group',comm, group);
where
comm is the integer identifying the communicator from where the group will be obtained
group is the integer where the group will be stored
For comm='MPI_COMM_WORLD', group will always be 0
int MPI_Comm_create(MPI_Comm comm, MPI_Group group, MPI_Comm *newcomm)
Which is called by:
MPI('Comm_create',group,comm);
where
group is the integer defining the group already created
comm is the communicator that will be created using the processes from the group
int MPI_Group_incl(MPI_Group group, int n, int *ranks, MPI_Group *newgroup)
Which is called by:
MPI('Group_incl', group, n_procs, gr_array, new_group); % create the group
where
group is the group from which the processes are taken
n_procs is the number of processes that will be in the new group
gr_array is an array of integers specifying the ranks of the processes that will be included in the new group
new_group is the integer identifying the group that will be created
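These three functions (Comm_group, Group_incl and Comm_create) are meant to be called in sequence to derive a new communicator; a sketch that builds a communicator containing ranks 0 and 1 (the integer identifiers 0 and 1 for the groups and communicator are illustrative):

```matlab
MPI('Comm_group', 'MPI_COMM_WORLD', 0);   % group 0: all processes in MPI_COMM_WORLD
MPI('Group_incl', 0, 2, [0,1], 1);        % group 1: ranks 0 and 1 taken from group 0
MPI('Comm_create', 1, 1);                 % communicator 1 built from group 1
```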
2) Deploying the MATLAB application
MATLAB provides the Compiler Toolbox (TB), which allows us to create a stand-alone application that can be run independently of MATLAB. To be able to run programs on the machine, it is necessary to have that TB available.
After changing to the directory where the application is (using the command "cd"), the steps to obtain the required files are:
1) If MPI.c was modified, it will be necessary to execute the command:
>> mex MPI.c
in a machine running Solaris. This command will create the MPI.mexsol file. The file "mpi.h" must be present in the same directory where this is done.
2) Now, we invoke the Compiler with the command:
>> mcc -m -R -nojvm application.m file1.m ../dir1/file2.m -a MPI.mexsol
Here we are telling the compiler to create the files required to run the program without MATLAB. application.m is the main function; the order of the other files is not important. They must contain all the functions called by the program, excluding those that MATLAB provides in its TBs.
The compiler will try to build the application and, depending on the configuration, this might give an error:
ld: fatal: library -lmwmclmcrrt: not found
ld: fatal: File processing errors. No output written to application
mbuild: link of 'application' failed.
This does not affect us at all, since the files we are interested in have already been created. The error could be avoided by generating only the wrapper files, but then it is not possible to set the options -R -nojvm. The generated files are:
application.ctf
application_main.c
application_mcc_component_data.c
These three files are then ready to be copied to the machine, where we can start working with them.
Figures 1 and 2 below show the two procedures to obtain the files needed for the compilation. In the Figures, f1.m is the main file that uses the function f2.m.
Figure 1
Figure 2
3) Compiling in the machine
Since the machine using Sun MPI (the machine) is not able to run MATLAB, the previous step must be done on another machine. To copy the files in a fast and clean way from the machine where the files were created to the machine, it is recommended to use the command rsync. This command is not available on the machine, but it can be copied from europium into our home account and used from there. An example of its syntax is:
# rsync -e ssh application* machine_name:/work/username/destination_dir/ --stats --progress --rsync-path=/home/username/bin/rsync
If this command is invoked with these parameters, all the files in the current directory whose names start with the string application will be copied into the directory /work/username/destination_dir/ using the ssh protocol, showing the statistics and the progress. In this example the rsync binary must have been copied to /home/username/bin/; otherwise the machine will not be able to run the command.
Once the files have been copied to the directory, we compile them as a regular C program:
#mpcc -c -I /home/parmat/v72/extern/include -D_sun -DX11 -O -DNDEBUG application_mcc_component_data.c
#mpcc -c -I /home/parmat/v72/extern/include -D_sun -DX11 -O -DNDEBUG application_main.c
#mpcc -O -o app_exec application_main.o application_mcc_component_data.o -Wm,--rpath-link,/home/parmat/v72/extern/include -L/home/parmat/v72/runtime/sol2 -lmwmclmcrrt -lm -lmpi
Now the program is almost ready to be run. The .ctf file contains, in compressed form, the files required to execute the application; they are extracted on the first execution. If the program is executed by several processes at the same time, there will be access conflicts and the file will not be extracted properly. Therefore it is necessary to run the program first using a single processor, which will extract the components, and then let it finish (or kill it with Ctrl-C). This is because MATLAB Compiler 4 uses the MATLAB Component Runtime (MCR), a stand-alone set of shared libraries that enables the execution of M-files; the MCR makes use of thread locking so that only one thread is allowed to access it at a time. Once the .ctf file has been extracted, there will be no problem with several processes accessing the directory it generates. The execution of a deployed program requires setting these two variables:
export LD_LIBRARY_PATH=/opt/local/lib:/usr/openwin/lib:/usr/dt/lib:/opt/sge/lib/solaris64:/home/parmat/v72/runtime/sol2:/home/parmat/v72/bin/sol2:/home/parmat/v72/sys/os/sol2:/home/parmat/v72/sys/java/jre/sol2/jre1.5.0/lib/sparc/native_threads:/home/parmat/v72/sys/java/jre/sol2/jre1.5.0/lib/sparc/client:/home/parmat/v72/sys/java/jre/sol2/jre1.5.0/lib/sparc:
export XAPPLRESDIR=/home/parmat/v72/X11/app-defaults
Now we can execute the program using one processor:
mprun -np 1 ./app_exec
and if everything goes well, we will see something like this:
Extracting CTF archive. This may take a few seconds, depending on the
size of your application. Please wait...
...CTF archive extraction complete.
The script used to submit the program must also include these variable definitions. A warning message will then appear indicating that MATLAB cannot open a display; at that point we can kill the process, since the extraction of the MCR components was successful.
4) Example: Simple Program
We will now show a very simple program that uses a couple of functions of the MPI.c file, going through the procedure step by step and showing the on-screen results.
The program is written in the file p1.m:
function p1()
disp('Initializing MPI \n')
[rank,size_w]=MPI('Init');
fprintf('\n I am %d MPI was initialized\n',rank);
source=0;
destination=2;
tag=201;
a=[669,669];
n_elements=size(a,2);
if(rank==source)
    fprintf('I am %d and I will send to %d the vector %s \n',rank,destination,num2str(a));
    MPI('Send',a, numel(a), 'MPI_DOUBLE', destination, tag, 'MPI_COMM_WORLD');
else
    if(rank==destination)
        fprintf('I am %d, I am going to receive \n',rank);
        b=MPI('Recv',n_elements,'MPI_DOUBLE', source, tag,'MPI_COMM_WORLD');
        fprintf('I am %d and I received the vector %s from %d \n',rank,num2str(b),source);
    else %if-rank
        fprintf('\n I am %d and I do nothing\n', rank);
    end %if-rank
end
MPI('Finalize');
fprintf('\n rank %d : End \n', rank);
To deploy the application we type:
>> mcc -m -R -nojvm p1.m MPI.mexsol
Once the application has been deployed, we can copy the files to the directory "ex1" in our home account:
rsync -e ssh p1* machine_name:/home/aguillen/ex1 --stats --progress --rsync-path=/home/aguillen/bin/rsync
aguillen@machine_name's password:
48470 100% 31.25MB/s 0:00:00
918 100% 0.00kB/s 0:00:00
3563 100% 0.00kB/s 0:00:00
10522 100% 0.00kB/s 0:00:00
Number of files: 4
Number of files transferred: 4
Total file size: 63473 bytes
Total transferred file size: 63473 bytes
Literal data: 63473 bytes
Matched data: 0 bytes
File list size: 81
Total bytes written: 63730
Total bytes read: 84
wrote 63730 bytes read 84 bytes 18232.57 bytes/sec
total size is 63473 speedup is 0.99
It copies 4 files: the 3 that we need plus p1.m, which is not needed but is worth keeping there so we can check the code of the deployed application when we log into the machine.
Now we have to compile it:
mpcc -c -I/home/parmat/v72/extern/include -D_sun -DX11 -O -DNDEBUG p1_mcc_component_data.c
mpcc -c -I/home/parmat/v72/extern/include -D_sun -DX11 -O -DNDEBUG p1_main.c
mpcc -O -o p1_exec p1_main.o p1_mcc_component_data.o -Wm,--rpath-link,/home/parmat/v72/extern/include -L/home/parmat/v72/runtime/sol2 -lmwmclmcrrt -lm -lmpi
Then we have to run it on one processor and kill it with Ctrl-C once we see the warning message (remember that the variables LD_LIBRARY_PATH and XAPPLRESDIR must be defined as above):
aguillen@frontend$ mprun -np 1 ./p1_exec
Extracting CTF archive. This may take a few seconds, depending on the
size of your application. Please wait...
...CTF archive extraction complete.
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
Job cre.6454 on frontend: received signal INT.
aguillen@frontend$
Finally we can send it to the queue using the following script (scpt_ex1):
#!/usr/bin/bash
#$ -A e03-vis
#$ -pe hpc 4
#$ -cwd
export LD_LIBRARY_PATH=/opt/local/lib:/usr/openwin/lib:/usr/dt/lib:/opt/sge/lib/solaris64:/home/parmat/v72/runtime/sol2:/home/parmat/v72/bin/sol2:/home/parmat/v72/sys/os/sol2:/home/parmat/v72/sys/java/jre/sol2/jre1.5.0/lib/sparc/native_threads:/home/parmat/v72/sys/java/jre/sol2/jre1.5.0/lib/sparc/client:/home/parmat/v72/sys/java/jre/sol2/jre1.5.0/lib/sparc:
export XAPPLRESDIR=/home/parmat/v72/X11/app-defaults
mprun -np 4 ./p1_exec
Obtaining the following result:
--------------------------------------------------
START BATCH JOB on host backend at 04/04/06 02:28:37
Username: aguillen
Job id: 9790
Jobfile: scpt_ex1
Jobname: scpt_ex1
Run queue: 8pe_6hours
Job slots: 4
Environment: hpc
Project id: e03-vis
--------------------------------------------------
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
Initializing MPI \n
I am 0 MPI was initialized
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
Initializing MPI \n
I am 2 MPI was initialized
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
Initializing MPI \n
I am 3 MPI was initialized
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
Initializing MPI \n
I am 1 MPI was initialized
I am 2, I am going to receive
I am 3 and I do nothing
I am 1 and I do nothing
I am 0 and I will send to 2 the vector 669 669
I am 2 and I received the vector 669 669 from 0
rank 1 : End
rank 3 : End
rank 2 : End
rank 0 : End
------------------------------------------------
END BATCH JOB on host backend at 04/04/06 02:28:44
------------------------------------------------
The warning messages refer to MATLAB being unable to open a display; they do not represent any problem and can be ignored. All the files required for this example are in the directory /home/parmat/examples/ex1.
5) Example: Using Requests
In order to use non-blocking communications, the MPI standard provides a data structure called MPI_Request, which tracks the status of a non-blocking communication. The following program shows a non-blocking send/receive between two processes:
function p2()
[rank,size_w]=MPI('Init');
fprintf('\n rank %d : MPI was initialized\n',rank);
source=0;
dest=2;
b=[];
if(rank==source)
    a=[123,123];
    tag=201;
    % set the integer value for the request
    req1=0;
    fprintf('rank %d : I am going to send %s to process %d',rank,num2str(a), dest);
    MPI('Isend',a,2,'MPI_DOUBLE',dest,tag,'MPI_COMM_WORLD',req1)
    fprintf('\n rank %d : Isend performed, I am done \n', rank);
else %else of the rank if
    if(rank==dest)
        tag=201;
        % set the integer value for the request
        req1=0;
        fprintf(' rank %d : I start working \n',rank);
        % do some work
        z=1;
        for j=1:200,
            z(1)=z(1)+(rand.*0.5)-0.5;
        end
        fprintf('rank %d : I am going to receive \n',rank);
        b=MPI('Irecv',2,'MPI_DOUBLE', source, tag,'MPI_COMM_WORLD',req1);
        fprintf('rank %d : I keep on working until I receive \n',rank);
        % the process continues working until it gets the message
        condition=0;
        while(0==condition)
            for i=1:200,
                for j=1:5
                    z(1)=z(1)+(rand.*0.5)-0.5;
                end
            end
            fprintf('\n rank %d : working...', rank);
            % check the request to see if the message arrived
            condition=MPI('Test', req1);
        end %end-while
        fprintf('\n rank %d : message received, it is: %s ',rank, num2str(b));
    end %end-if-dest
end %end-if-source
MPI('Finalize');
fprintf('\n rank %d : End \n', rank);
This program sets up an array that is sent from the source process (rank 0) to the destination (rank 2). The destination process calls the Irecv function and keeps working until it gets the message, using the function MPI_Test to check whether the message has arrived. The result of the execution is:
--------------------------------------------------
START BATCH JOB on host backend at 26/04/06 17:17:53
Username: aguillen
Job id: 11053
Jobfile: scpt_ex2
Jobname: scpt_ex2
Run queue: 8pe_30mins
Job slots: 4
Environment: hpc
Project id: e03-vis
--------------------------------------------------
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
rank 0 : MPI was initialized
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
rank 1 : MPI was initialized
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
rank 3 : MPI was initialized
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
rank 2 : MPI was initialized
rank 2 : I start working
rank 2 : I am going to receive
rank 2 : I keep on working until I receive
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...
rank 2 : working...rank 0 : I am going to send 123 123 to process 2
rank 2 : working...
rank 2 : working...
rank 0 : Isend performed, I am done
rank 2 : working...
rank 2 : message received, it is: 123 123
rank 1 : End
rank 2 : End
rank 3 : End
rank 0 : End
------------------------------------------------
END BATCH JOB on host backend at 26/04/06 17:18:00
------------------------------------------------
6) Example: Using Communicators
The following program obtains the group of MPI_COMM_WORLD, creates two new groups (ranks 0,1,2 and ranks 0,3) with their respective communicators, and performs a broadcast inside each communicator:
function p3()
[rank,size_w]=MPI('Init');
fprintf('\n rank %d : MPI was initialized\n',rank);
% get the original group of the communicator MPI_COMM_WORLD
MPI('Comm_group','MPI_COMM_WORLD', 0);
% create the new groups of processors (g_1 and g_2) and their
% respective communicators (1 and 2)
g_1=[0,1,2];
MPI('Group_incl', 0, 3, g_1, 1); % create group 1
MPI('Comm_create',1,1); % create communicator 1 with the elements from group 1
g_2=[0,3];
MPI('Group_incl', 0, 2, g_2, 2); % create group 2
MPI('Comm_create',2,2); % create communicator 2 with the elements from group 2
% set the source process that will send in the broadcast
source=0;
% if I am the source
if(rank==source)
    tag=201;
    % set the value for the first message and send it
    a=[669,669];
    MPI('Bcast',a,2,'MPI_DOUBLE',rank,rank,1);
    % do the same for the second broadcast
    aa=[123,123];
    MPI('Bcast',aa,2,'MPI_DOUBLE',rank,rank,2);
    fprintf('\n rank %d : I sent %s to group 1 (0,1,2) and %s to group 2 (0,3)\n',rank,num2str(a),num2str(aa));
else %if-rank-source
    % depending on my group, I take part in the broadcast using one communicator
    if(sum(rank==g_1)==1)
        b=[];
        b=MPI('Bcast',[], 2,'MPI_DOUBLE', 0, rank,1);
        fprintf(' rank %d : I received %s in comm 1 \n',rank,num2str(b));
    end
    if(sum(rank==g_2)==1)
        b=[];
        b=MPI('Bcast',[], 2,'MPI_DOUBLE', 0, rank,2);
        fprintf(' rank %d : I received %s in comm 2 \n',rank,num2str(b));
    end
end %else-if-rank-source
MPI('Finalize');
fprintf('\n rank %d : Done \n',rank);
The result is:
--------------------------------------------------
START BATCH JOB on host backend at 26/04/06 18:20:23
Username: aguillen
Job id: 11057
Jobfile: scpt_ex3
Jobname: scpt_ex3
Run queue: 24pe_6hours
Job slots: 4
Environment: hpc
Project id: e03-vis
--------------------------------------------------
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
rank 0 : MPI was initialized
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
rank 1 : MPI was initialized
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
rank 2 : MPI was initialized
Warning: Unable to open display , MATLAB is starting without a display.
You will not be able to display graphics on the screen.
rank 3 : MPI was initialized
rank 2 : I received 669 669 in comm 1
rank 3 : I received 123 123 in comm 2
rank 1 : I received 669 669 in comm 1
rank 0 : I sent 669 669 to group 1 (0,1,2) and 123 123 to group 2 (0,3)
rank 1 : Done
rank 2 : Done
rank 3 : Done
rank 0 : Done