Gromacs

From oldwiki.scinet.utoronto.ca
Latest revision as of 15:01, 19 December 2013

WARNING: The last edit of this page is over two years old. The information on this page may be out-of-date.

Download and general information: http://www.gromacs.org

Search the mailing list archives: http://www.gromacs.org/Support/Mailing_Lists/Search

Gromacs 4.6.3 (Single Precision)

Gromacs has been updated to the latest version available as of August 7th, 2013. It uses the newer intel and openmpi libraries, so the modules it depends on are not the same as for Gromacs 4.5.1. If you load the older libraries, you will get an error stating that libirng.so cannot be found; this file is part of intel/13.1.1. To load the required libraries and the latest version of gromacs every time you log in, add the following to your .bashrc file:

<source lang="bash">

  module load intel/13.1.1 openmpi/intel/1.6.4 fftw extras gromacs/4.6.3

</source>

-- mnaqvi 7 August 2013

Note: loading modules within .bashrc files is no longer recommended by SciNet staff, because it can lead to unpredictable behaviour, since the .bashrc file is read by every bash shell that is started. It is instead recommended that <tt>module load</tt> commands be typed at the command line and/or placed within job submission scripts.

-- ejspence (SciNet staff) 19 December 2013
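Following the recommendation above, the modules can be loaded inside the submission script itself rather than in .bashrc. Below is a minimal sketch of such a script for gromacs 4.6.3 on the GPC; the resource request, job name, input file (test.tpr) and output prefix are placeholders, and it is assumed that the module puts an MPI-enabled mdrun on the PATH (the actual binary name may differ).

<source lang="bash">
#!/bin/bash
# Hypothetical gromacs 4.6.3 job on 2 IB nodes of the GPC; adjust resources and file names.
#PBS -l nodes=2:ib:ppn=8,walltime=04:00:00
#PBS -N gmx463_test

cd $PBS_O_WORKDIR

# Load the modules here, as recommended above, rather than in .bashrc.
module load intel/13.1.1 openmpi/intel/1.6.4 fftw extras gromacs/4.6.3

# 2 nodes x 8 cores per node = 16 MPI processes.
mpirun -np 16 -hostfile $PBS_NODEFILE mdrun -v -s test.tpr -deffnm test_out
</source>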

=Gromacs 4.5.1 (Single Precision)=

Gromacs 4.5.1 was built with the Intel compilers and OpenMPI, using cmake. In the "build" directory:

  module load gcc intel openmpi extras cmake
  cmake -D GMX_MPI=ON -D CMAKE_INSTALL_PREFIX=/scinet/gpc/Applications/gromacs/4.5.1 -D FFTW3F_INCLUDE_DIR=$SCINET_FFTW_INC\
      -D FFTW3F_LIBRARIES=$SCINET_FFTW_LIB/libfftw3f.a ../gromacs-4.5.1/
  make >& make.out
  make install
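For users who want their own installation, the same cmake invocation can be reused with a different install prefix (and GMX_MPI can be set to OFF for a serial build). The sketch below is a hypothetical user-local variant; only the install prefix differs from the commands above, and the paths are placeholders.

<source lang="bash">
# Hypothetical user-local build of gromacs 4.5.1, run from a separate "build" directory.
module load gcc intel openmpi extras cmake
cmake -D GMX_MPI=ON -D CMAKE_INSTALL_PREFIX=$HOME/opt/gromacs-4.5.1 \
      -D FFTW3F_INCLUDE_DIR=$SCINET_FFTW_INC \
      -D FFTW3F_LIBRARIES=$SCINET_FFTW_LIB/libfftw3f.a ../gromacs-4.5.1/
make >& make.out
make install
</source>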


Created the gromacs/4.5.1 module. Prerequisite modules: intel, openmpi, extras.

<source lang="tcl">

 #%Module -*- tcl -*-
 # gromacs
 proc ModulesHelp { } {
   puts stderr "\tThis module adds gromacs 4.5.1 (single precision) environment variables"
 }
 module-whatis "adds gromacs 4.5.1 (single precision) environment variables"
 # gromacs was compiled with Intel compilers and OpenMPI
 prereq intel 
 prereq openmpi
 prereq extras
 setenv SCINET_GROMACS_HOME /scinet/gpc/Applications/gromacs/4.5.1
 setenv SCINET_GROMACS_BIN /scinet/gpc/Applications/gromacs/4.5.1/bin
 setenv SCINET_MDRUN /scinet/gpc/Applications/gromacs/4.5.1/bin/mdrun

</source>
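Once this module file is in place, a user loads the prerequisite modules and then the gromacs module, after which the environment variables it sets can be used directly. A minimal check might look like the following (the <tt>module show</tt> line simply displays what the module sets).

<source lang="bash">
# Load the prerequisites first (the module file declares them with prereq), then gromacs.
module load intel openmpi extras
module load gromacs/4.5.1

module show gromacs/4.5.1     # display the paths and variables set by the module
echo $SCINET_GROMACS_HOME     # /scinet/gpc/Applications/gromacs/4.5.1
echo $SCINET_MDRUN            # /scinet/gpc/Applications/gromacs/4.5.1/bin/mdrun
</source>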

Here is a sample script for running Gromacs on 4 nodes, on the IB partition of the GPC:

<source lang="bash">

#!/bin/bash
#PBS -l nodes=4:ib:ppn=8,walltime=08:50:00
#PBS -N test

cd $PBS_O_WORKDIR
mpirun -np 32 -hostfile $PBS_NODEFILE $SCINET_MDRUN -v -s test.tpr -deffnm after_test
</source>


-- dgruner 1 October 2010
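To run the sample job, save the script above to a file and submit it with qsub; the 32 MPI processes requested with -np correspond to the 4 nodes × 8 cores per node in the PBS resource line. The file name below is a placeholder.

<source lang="bash">
qsub run_gromacs.sh    # submit the script above (hypothetical file name)
qstat -u $USER         # check the status of your jobs
</source>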

=Peculiarities of running single node GROMACS jobs on SCINET=

This is '''VERY IMPORTANT!''' Please read the [https://support.scinet.utoronto.ca/wiki/index.php/User_Tips#Running_single_node_MPI_jobs relevant user tips section] for information that is essential for your single node (up to 8 core) MPI GROMACS jobs.

-- cneale 14 September 2009

=Compiling GROMACS on SciNet=

Please refer to the [[Compiling_Gromacs|GROMACS compilation page]].

=Submitting GROMACS jobs on SciNet=

Please refer to the [[Running_Gromacs|GROMACS submission page]].

-- cneale 18 August 2009

=GROMACS benchmarks on SciNet=

This is a rudimentary list of scaling information.

I have a 50K-atom system running on GPC right now. On 56 cores connected with IB I am getting 55 ns/day. I set up 50 such simulations, each with 2 proteins in a bilayer, and I'm getting a total of 5.5 μs per day. I am using gromacs 4.0.5 and a 5 fs timestep, made possible by fixing the bond lengths and all angles involving hydrogen.

I can get about 12 ns/day on 8 cores of the non-IB part of GPC -- also excellent.

As for larger systems, my speedup over saw.sharcnet.ca for a 1e6-atom system is only 1.2x, running on 128 cores in single precision. Although saw.sharcnet.ca is composed of Xeons, they run at 2.83 GHz (https://www.sharcnet.ca/my/systems/show/41), a faster clock speed than SciNet's 2.5 GHz parts based on Intel's next-generation x86 CPU architecture. While GROMACS generally does not scale well up to or beyond 128 cores (even for large systems), our benchmarking of this system on saw.sharcnet.ca indicated that it was running at about 65% efficiency. Benchmarking was also done on SciNet for this system, but was not recorded, as we were mostly tinkering with the -npme option to mdrun in an attempt to optimize it. My recollection, though, is that the scaling was similar on SciNet.

-- cneale 19 August 2009

=Strong scaling for GROMACS on GPC=

Requested, and on our list to complete, but not yet available in a complete chart form.

-- cneale 19 August 2009

=Scientific studies being carried out using GROMACS on GPC=

Requested, but not yet available

-- cneale 19 August 2009

=Hyperthreading with Gromacs=

On an 8-core box, I get an 8% to 18% performance increase when using -np 16 and optimizing -npme, as compared to -np 8 and optimizing -npme (using gromacs 4.0.7). I now regularly overload the number of processes in this way.

Selected examples:

System A with 250,000 atoms:

 mdrun -np 8  -npme -1    1.15 ns/day
 mdrun -np 8  -npme  2    1.02 ns/day
 mdrun -np 16 -npme  2    0.99 ns/day
 mdrun -np 16 -npme  4    1.36 ns/day <-- 118 % performance vs 1.15 ns/day
 mdrun -np 15 -npme  3    1.32 ns/day

System B with 35,000 atoms (4 fs timestep):

 mdrun -np 8  -npme -1    22.66 ns/day
 mdrun -np 8  -npme  2    23.06 ns/day
 mdrun -np 16 -npme -1    22.69 ns/day
 mdrun -np 16 -npme  4    24.90 ns/day <-- 108 % performance vs 23.06 ns/day
 mdrun -np 56 -npme 16    14.15 ns/day

Cutoffs and timesteps differ between these runs, but both use PME and explicit water.
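A single-node job script that overloads the cores in this way might look like the following sketch. The host.list trick simply gives mpirun 16 slots on the local node; the resource line, file names and the 16/4 choice for -np/-npme are placeholders, and $SCINET_MDRUN from the gromacs/4.5.1 module is used purely for illustration (the numbers above were measured with gromacs 4.0.7).

<source lang="bash">
#!/bin/bash
# Hypothetical single-node GPC job running 16 MPI processes on 8 physical cores.
#PBS -l nodes=1:ppn=8,walltime=08:00:00
#PBS -N ht_test

cd $PBS_O_WORKDIR
module load intel openmpi extras gromacs/4.5.1

# Give mpirun 16 local slots so that it will start more processes than cores.
for i in $(seq 1 16); do echo localhost; done > host.list

mpirun -np 16 -machinefile host.list $SCINET_MDRUN -npme 4 -deffnm myrun
</source>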

According to gromacs developer Berk Hess (http://lists.gromacs.org/pipermail/gmx-users/2010-August/053033.html):

"In Gromacs 4.5 there is no difference [between -np and -nt based hyperthreading], since it does not use real thread parallelization. Gromacs 4.5 has a built-in threaded MPI library, but openmpi also has an efficient MPI implementation for shared memory machines. But even with proper thread parallelization I expect the same 15 to 20% performance improvement."