GPC MPI Versions

{| style="border-spacing: 8px; width:100%"
| valign="top" style="cellpadding:1em; padding:1em; border:2px solid; background-color:#f6f674; border-radius:5px"|
'''WARNING: SciNet is in the process of replacing this wiki with a new documentation site. For current information, please go to [https://docs.scinet.utoronto.ca https://docs.scinet.utoronto.ca]'''
|}
 

You can only use one MPI version at a time, because most versions use the same names for <tt>mpirun</tt> and the compiler wrappers, so be careful which modules you have loaded in your <tt>~/.bashrc</tt>. For this reason, we generally recommend not loading any modules in your <tt>.bashrc</tt>.
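Instead, the needed modules can be loaded inside the job script itself. A minimal sketch, assuming a PBS-style submission script as used on the GPC (the resource request, module choice and executable name are only placeholders):

<pre>
#!/bin/bash
#PBS -l nodes=1:ppn=8
#PBS -l walltime=1:00:00
# load the MPI stack here rather than in ~/.bashrc
module load intel openmpi
cd $PBS_O_WORKDIR
mpirun -np 8 ./a.out
</pre>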

===OpenMPI===

To use OpenMPI compiled with the Intel compilers, load the modules

<pre>
module load intel openmpi
</pre>

or, for the gcc version, use

<pre>
module load gcc openmpi/1.4.2-gcc-v4.4.0-ofed
</pre>

The MPI library wrappers for compiling are <tt>mpicc</tt>/<tt>mpicxx</tt>/<tt>mpif90</tt>/<tt>mpif77</tt>.
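For instance, a C source file could be compiled with the wrapper of whichever MPI module is currently loaded (the file names below are only placeholders):

<pre>
# compiles against the headers and libraries of the loaded MPI module
mpicc -O2 -o mpi_hello mpi_hello.c
</pre>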

OpenMPI has been built to support various communication methods and automatically uses the best method depending on how and where it is run. To explicitly specify the method, you can use the following <tt>--mca</tt> flags for InfiniBand

<pre>
mpirun --mca pml ob1 --mca btl self,sm,openib -np 16 ./a.out
</pre>

and the following for IPoIB (not recommended)

<pre>
mpirun --mca pml ob1 --mca btl self,sm,tcp --mca btl_tcp_if_exclude lo,eth0 -np 16 ./a.out
</pre>

If your program runs into what appear to be memory issues with larger jobs, try using the XRC communications

<pre>
mpirun --mca pml ob1 --mca btl_openib_receive_queues X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32 --mca btl_openib_max_send_size 12288 -np 16 ./a.out
</pre>

If you are still having issues, you can try OpenMPI > 1.6, which uses the Mellanox Messaging library (MXM) by default when running more than 128 MPI tasks

<pre>
module load openmpi/intel/1.6.4
</pre>

MXM can also be requested explicitly when using fewer than 128 MPI tasks

<pre>
mpirun --mca mtl mxm --mca mtl_mxm_np 2 -np 16 ./a.out
</pre>

To NOT use MXM, regardless of the number of MPI tasks, use

<pre>
mpirun --mca pml ob1 -np 16 ./a.out
</pre>


For mixed OpenMP/MPI applications, set <tt>OMP_NUM_THREADS</tt> to the number of threads per process and add '<tt>--bynode</tt>' to the mpirun command, e.g.,

<pre>
export OMP_NUM_THREADS=4
mpirun -np 6 --bynode ./a.out
</pre>

would start 6 MPI processes on different nodes, each with 4 OpenMP threads. If your script requests 3 nodes, each node gets 2 MPI processes.
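As a sketch, the job script for this example could request the 3 nodes as follows (assuming 8 cores per GPC node; the walltime is only illustrative):

<pre>
#PBS -l nodes=3:ppn=8
#PBS -l walltime=1:00:00
export OMP_NUM_THREADS=4
mpirun -np 6 --bynode ./a.out
</pre>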

For more information on available flags, see the OpenMPI [http://www.open-mpi.org/faq/ FAQ].

===IntelMPI===

[http://software.intel.com/en-us/intel-mpi-library/ IntelMPI] is also an [http://www.mcs.anl.gov/research/projects/mpich2/ MPICH2] derivative, customized by Intel. To use IntelMPI (4.x) compiled with the Intel compilers, load the modules

<pre>
module load intel intelmpi
</pre>

The MPI library wrappers for compiling are <tt>mpicc</tt>/<tt>mpicxx</tt>/<tt>mpif90</tt>/<tt>mpif77</tt>.
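For instance, a Fortran source file could be compiled against the loaded intelmpi module with (file names are only placeholders):

<pre>
# compiles and links against the loaded IntelMPI library
mpif90 -O2 -o mpi_hello mpi_hello.f90
</pre>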

IntelMPI requires a <tt>.mpd.conf</tt> file containing the variable "MPD_SECRETWORD=..." in your <tt>$HOME</tt> directory. To create this file, use

<pre>
echo "MPD_SECRETWORD=ABC123" > ~/.mpd.conf
chmod 600 ~/.mpd.conf
</pre>

IntelMPI, like OpenMPI, has been built to support various communication methods and automatically uses the best method depending on how and where it is run.

<pre>
mpirun -np 16 ./a.out
</pre>


To explicitly specify the method, you can use the following flags for InfiniBand (DAPL) and shared memory (best performance on IB)

<pre>
mpirun -np 16 -env I_MPI_FABRICS=shm:dapl ./a.out
</pre>

or the following for OpenFabrics (ibverbs) InfiniBand

<pre>
mpirun -np 16 -genv I_MPI_FABRICS=shm:ofa ./a.out
</pre>

or the following for IPoIB

<pre>
mpirun -np 16 -env I_MPI_TCP_NETMASK=ib -env I_MPI_FABRICS shm:tcp ./a.out
</pre>

If you run into communication errors that appear to be related to memory, try some of the following flags

<pre>
module load intelmpi/4.0.3.008
mpirun -np 16 -genv I_MPI_FABRICS=shm:ofa -genv I_MPI_OFA_USE_XRC=1 -genv I_MPI_OFA_DYNAMIC_QPS=1 -genv I_MPI_OFA_NUM_RDMA_CONNECTIONS=-1 ./a.out
</pre>

For hybrid OpenMP/MPI runs, set <tt>OMP_NUM_THREADS</tt> to the desired number of OpenMP threads per MPI process and specify the number of MPI processes per node on the mpirun command line with <tt>-ppn <num></tt>. E.g.

<pre>
export OMP_NUM_THREADS=4
mpirun -ppn 2 -np 6 ./a.out
</pre>

would start a total of 6 MPI processes, each with 4 threads, with each node running 2 MPI processes. Your script should request 3 nodes in this case.

''Note: to compile for hybrid OpenMP/MPI with IntelMPI, you need to add the flag <tt>-mt_mpi</tt> to your compilation command (i.e., mpicc/mpif90/mpicxx).''
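A minimal sketch of such a compile line, with placeholder file names (<tt>-openmp</tt> is the Intel compilers' OpenMP flag):

<pre>
# link the thread-safe IntelMPI library and enable OpenMP
mpicc -mt_mpi -openmp -O2 -o hybrid hybrid.c
</pre>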

For more information on these and other flags, see Intel's [http://software.intel.com/en-us/articles/intel-mpi-library-documentation/ Documentation] page, especially the "Getting Started (Linux)" Guide.