<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>https://oldwiki.scinet.utoronto.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Brelier</id>
	<title>oldwiki.scinet.utoronto.ca - User contributions [en-gb]</title>
	<link rel="self" type="application/atom+xml" href="https://oldwiki.scinet.utoronto.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Brelier"/>
	<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php/Special:Contributions/Brelier"/>
	<updated>2026-05-07T06:44:54Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.12</generator>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7253</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7253"/>
		<updated>2014-09-16T20:07:58Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing. It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet has given a mini-course of 8 lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against the fast Intel math libraries.  To use this version,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
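&lt;br /&gt;
A quick sanity check (the output shown is what one would expect, not guaranteed on every node) confirms that the module-supplied interpreter is the one on your PATH:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
which python      # should point into the python module's directory, not /usr/bin&lt;br /&gt;
python --version  # Python 2.7.2&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;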
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages likely to be important to most of our users (e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib]) are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version for Python 2.7.x is linked against the very fast MKL numerical libraries. &lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A Pythonic interface to MPI.  Available with OpenMPI; an openmpi module must be loaded for this to work.  (There is an issue with OpenMPI 1.4.x + InfiniBand; it does, however, appear to work fine with IntelMPI.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 &lt;br /&gt;
| 0.13.1&lt;br /&gt;
| 1.0.0&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 &lt;br /&gt;
| 2.4.0&lt;br /&gt;
| 3.0.0&lt;br /&gt;
| 3.1.1&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8&lt;br /&gt;
| 1.0.4&lt;br /&gt;
| 1.1.1&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.1.3&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| 2.3.0&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.6.2&lt;br /&gt;
| 2.7.1&lt;br /&gt;
| -&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15.1&lt;br /&gt;
| 0.18&lt;br /&gt;
| 0.19.1&lt;br /&gt;
| 0.20.1&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 1.1&lt;br /&gt;
| 5.1&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
| [http://pandas.pydata.org/ pandas]&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| -&lt;br /&gt;
| High-performance, easy-to-use data structures and data analysis tools.&lt;br /&gt;
|- &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that people could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, ensure they're up to date, and ensure that users' package choices don't conflict. &lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
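&lt;br /&gt;
To verify that the change took effect (here assuming python 2.7; the echoed value will differ from user to user):&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
echo ${PYTHONPATH}&lt;br /&gt;
# should end with ${HOME}/lib/python2.7/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;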
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing: &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt; then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
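&lt;br /&gt;
Put together, a typical from-source installation looks like the following (&amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; and the version number are placeholders, not a real download):&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
tar -xzvf packagename-1.0.tgz&lt;br /&gt;
cd packagename-1.0&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
# then check that it imports from the new location&lt;br /&gt;
python -c &amp;quot;import packagename; print packagename.__file__&amp;quot;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;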
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add that to your python path as well in your .bashrc, in the same place as you had updated PYTHONPATH before: eg,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-yy2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!  Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
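&lt;br /&gt;
A minimal personal modulefile, placed for example in &amp;lt;tt&amp;gt;${HOME}/privatemodules/mypackages&amp;lt;/tt&amp;gt;, might look like this (the file name and paths are illustrative; adjust them to your own layout):&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
#%Module1.0&lt;br /&gt;
## personal python packages&lt;br /&gt;
prepend-path PYTHONPATH $env(HOME)/lib/python2.7/site-packages&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
It can then be made visible with &amp;lt;tt&amp;gt;module use ${HOME}/privatemodules&amp;lt;/tt&amp;gt; and loaded with &amp;lt;tt&amp;gt;module load mypackages&amp;lt;/tt&amp;gt;.&lt;br /&gt;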
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7252</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7252"/>
		<updated>2014-09-16T20:05:08Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing. It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet has given a mini-course of 8 lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against the fast Intel math libraries.  To use this version,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages likely to be important to most of our users (e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib]) are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version for Python 2.7.x is linked against the very fast MKL numerical libraries. &lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A Pythonic interface to MPI.  Available with OpenMPI; an openmpi module must be loaded for this to work.  (There is an issue with OpenMPI 1.4.x + InfiniBand; it does, however, appear to work fine with IntelMPI.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 &lt;br /&gt;
| 0.13.1&lt;br /&gt;
| 1.0.0&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 &lt;br /&gt;
| 2.4.0&lt;br /&gt;
| 3.0.0&lt;br /&gt;
| 3.1.1&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8&lt;br /&gt;
| 1.0.4&lt;br /&gt;
| 1.1.1&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.1.3&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| 2.3.0&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.6.2&lt;br /&gt;
| 2.7.1&lt;br /&gt;
| -&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15.1&lt;br /&gt;
| 0.18&lt;br /&gt;
| 0.19.1&lt;br /&gt;
| 0.20.1&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 1.1&lt;br /&gt;
| 5.1&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that people could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, ensure they're up to date, and ensure that users' package choices don't conflict. &lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing: &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt; then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add that to your python path as well in your .bashrc, in the same place as you had updated PYTHONPATH before: eg,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-yy2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!  Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7251</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7251"/>
		<updated>2014-09-16T19:59:15Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing. It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet has given a mini-course of 8 lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against the fast Intel math libraries.  To use this version,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages likely to be important to most of our users (e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib]) are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version for Python 2.7.x is linked against the very fast MKL numerical libraries. &lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A Pythonic interface to MPI.  Available with OpenMPI; an openmpi module must be loaded for this to work.  (There is an issue with OpenMPI 1.4.x + InfiniBand; it does, however, appear to work fine with IntelMPI.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 &lt;br /&gt;
| 0.13.1&lt;br /&gt;
| 1.0.0&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 &lt;br /&gt;
| 2.4.0&lt;br /&gt;
| 3.0.0&lt;br /&gt;
| 3.1.1&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8&lt;br /&gt;
| 1.0.4&lt;br /&gt;
| 1.1.1&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.1.3&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| 2.3.0&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15 (Python 2.7.x)&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2 (Python 2.7.2)&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that people could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, ensure they're up to date, and ensure that users' package choices don't conflict. &lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt;, then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add it to your PYTHONPATH as well in your .bashrc, in the same place where you updated PYTHONPATH before, e.g.:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-yy2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!  Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
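&lt;br /&gt;
As a sketch (the file name, location, and package path here are placeholders, not an existing module), a minimal personal modulefile that puts a locally installed package onto your paths could look like:&lt;br /&gt;
&amp;lt;source lang=tcl&amp;gt;&lt;br /&gt;
#%Module1.0&lt;br /&gt;
## Hypothetical private module, e.g. ~/privatemodules/mypackage&lt;br /&gt;
prepend-path PYTHONPATH $env(HOME)/lib/python2.7/site-packages&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
You would then make it visible with &amp;lt;tt&amp;gt;module use ${HOME}/privatemodules&amp;lt;/tt&amp;gt; and load it with &amp;lt;tt&amp;gt;module load mypackage&amp;lt;/tt&amp;gt;.&lt;br /&gt;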
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7250</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7250"/>
		<updated>2014-09-16T19:57:38Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.  It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet gave a mini-course of 8 lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against the fast Intel math libraries.  To use this version:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users &amp;amp;mdash; e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib] &amp;amp;mdash; are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version in Python 2.7.x is linked against the very fast MKL numerical libraries.&lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A pythonic interface to MPI.  Built with OpenMPI; an openmpi module must be loaded for this to work. (There is an issue with OpenMPI 1.4.x + InfiniBand; however, it does appear to work fine with IntelMPI.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 &lt;br /&gt;
| 0.13.1&lt;br /&gt;
| 1.0.0&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 &lt;br /&gt;
| 2.4.0&lt;br /&gt;
| 3.0.0&lt;br /&gt;
| 3.1.1&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8&lt;br /&gt;
| 1.0.4&lt;br /&gt;
| 1.1.1&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15 (Python 2.7.x)&lt;br /&gt;
| Cython is a compiler that translates Python-like code to C and allows the result to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2 (Python 2.7.2)&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that users could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend installing these additional packages locally in your home directory.  This is almost certainly the easiest way to deal with the wide range of packages, ensure they're up to date, and ensure that users' package choices don't conflict.&lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
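&lt;br /&gt;
If you are unsure which version the loaded python module provides, you can ask the interpreter itself; a minimal sketch (assuming &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt; is the module-provided interpreter on your PATH):&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
# Print the major.minor version, e.g. 2.7 (sys.version[:3] is fine for these 2.x/3.x interpreters)&lt;br /&gt;
pyver=$(python -c 'import sys; print(sys.version[:3])')&lt;br /&gt;
echo ${HOME}/lib/python${pyver}/site-packages&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;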
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt;, then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add it to your PYTHONPATH as well in your .bashrc, in the same place where you updated PYTHONPATH before, e.g.:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-yy2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!  Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
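&lt;br /&gt;
As a sketch (the file name, location, and package path here are placeholders, not an existing module), a minimal personal modulefile that puts a locally installed package onto your paths could look like:&lt;br /&gt;
&amp;lt;source lang=tcl&amp;gt;&lt;br /&gt;
#%Module1.0&lt;br /&gt;
## Hypothetical private module, e.g. ~/privatemodules/mypackage&lt;br /&gt;
prepend-path PYTHONPATH $env(HOME)/lib/python2.7/site-packages&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
You would then make it visible with &amp;lt;tt&amp;gt;module use ${HOME}/privatemodules&amp;lt;/tt&amp;gt; and load it with &amp;lt;tt&amp;gt;module load mypackage&amp;lt;/tt&amp;gt;.&lt;br /&gt;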
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7249</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7249"/>
		<updated>2014-09-16T19:54:36Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.  It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet gave a mini-course of 8 lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against the fast Intel math libraries.  To use this version:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users &amp;amp;mdash; e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib] &amp;amp;mdash; are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version in Python 2.7.x is linked against the very fast MKL numerical libraries.&lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A pythonic interface to MPI.  Built with OpenMPI; an openmpi module must be loaded for this to work. (There is an issue with OpenMPI 1.4.x + InfiniBand; however, it does appear to work fine with IntelMPI.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 &lt;br /&gt;
| 0.13.1&lt;br /&gt;
| 1.0.0&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 (Python 2.7.2)&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8 (Python 2.7.2)&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15 (Python 2.7.x)&lt;br /&gt;
| Cython is a compiler that translates Python-like code to C and allows the result to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2 (Python 2.7.2)&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that users could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend installing these additional packages locally in your home directory.  This is almost certainly the easiest way to deal with the wide range of packages, ensure they're up to date, and ensure that users' package choices don't conflict.&lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
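&lt;br /&gt;
If you are unsure which version the loaded python module provides, you can ask the interpreter itself; a minimal sketch (assuming &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt; is the module-provided interpreter on your PATH):&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
# Print the major.minor version, e.g. 2.7 (sys.version[:3] is fine for these 2.x/3.x interpreters)&lt;br /&gt;
pyver=$(python -c 'import sys; print(sys.version[:3])')&lt;br /&gt;
echo ${HOME}/lib/python${pyver}/site-packages&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;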
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt;, then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add it to your PYTHONPATH as well in your .bashrc, in the same place where you updated PYTHONPATH before, e.g.:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-yy2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!  Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
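&lt;br /&gt;
As a sketch (the file name, location, and package path here are placeholders, not an existing module), a minimal personal modulefile that puts a locally installed package onto your paths could look like:&lt;br /&gt;
&amp;lt;source lang=tcl&amp;gt;&lt;br /&gt;
#%Module1.0&lt;br /&gt;
## Hypothetical private module, e.g. ~/privatemodules/mypackage&lt;br /&gt;
prepend-path PYTHONPATH $env(HOME)/lib/python2.7/site-packages&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
You would then make it visible with &amp;lt;tt&amp;gt;module use ${HOME}/privatemodules&amp;lt;/tt&amp;gt; and load it with &amp;lt;tt&amp;gt;module load mypackage&amp;lt;/tt&amp;gt;.&lt;br /&gt;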
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7248</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7248"/>
		<updated>2014-09-16T19:52:28Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.  It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet gave a mini-course of 8 lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against the fast Intel math libraries.  To use this version:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users &amp;amp;mdash; e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib] &amp;amp;mdash; are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version in Python 2.7.x is linked against the very fast MKL numerical libraries.&lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A pythonic interface to MPI.  Built with OpenMPI; an openmpi module must be loaded for this to work. (There is an issue with OpenMPI 1.4.x + InfiniBand; however, it does appear to work fine with IntelMPI.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 &lt;br /&gt;
| 0.13.1&lt;br /&gt;
| 1.0.0&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0 (Python 2.7.2)&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 (Python 2.7.2)&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8 (Python 2.7.2)&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15 (Python 2.7.x)&lt;br /&gt;
| Cython is a compiler that translates Python-like code to C and allows the result to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2 (Python 2.7.2)&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that people could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, to keep them up to date, and to ensure that users' package choices don't conflict.&lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
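For example, assuming python 2.7 (adjust the version to match the module you loaded):&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
mkdir -p ${HOME}/lib/python2.7/site-packages&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;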
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead describe downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt;, then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add it to your python path as well in your .bashrc, in the same place where you updated PYTHONPATH before; e.g.,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-py2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!  Re-source your .bashrc and test your new python modules.&lt;br /&gt;
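For example, to check that a (hypothetical) package named &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; can now be found:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
source ~/.bashrc&lt;br /&gt;
python -c &amp;quot;import packagename&amp;quot;   # no ImportError means the install worked&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;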
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7247</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7247"/>
		<updated>2014-09-16T19:51:02Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.  It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet has given a mini-course of 8 lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against fast intel math libraries.   To use this version,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users &amp;amp;mdash; e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib] &amp;amp;mdash; are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version for Python 2.7.x is linked against the very fast MKL numerical libraries.&lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A pythonic interface to mpi.   Available with openmpi; must load an openmpi module for this to work. (There is an issue with openmpi 1.4.x + infiniband, however it does appear to work fine with IntelMPI)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 (Python 2.7.2)&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0 (Python 2.7.2)&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 (Python 2.7.2)&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8 (Python 2.7.2)&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15 (Python 2.7.x)&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2 (Python 2.7.2)&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that people could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, to keep them up to date, and to ensure that users' package choices don't conflict.&lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead describe downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt;, then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add it to your python path as well in your .bashrc, in the same place where you updated PYTHONPATH before; e.g.,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-py2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!  Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7246</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7246"/>
		<updated>2014-09-16T19:49:26Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.  It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet has given a mini-course of 8 lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against fast intel math libraries.   To use this version,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users &amp;amp;mdash; e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib] &amp;amp;mdash; are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version for Python 2.7.x is linked against the very fast MKL numerical libraries.&lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A pythonic interface to mpi.   Available with openmpi; must load an openmpi module for this to work. (There is an issue with openmpi 1.4.x + infiniband, however it does appear to work fine with IntelMPI)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| &lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2 (Python 2.7.2)&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 (Python 2.7.2)&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0 (Python 2.7.2)&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 (Python 2.7.2)&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8 (Python 2.7.2)&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15 (Python 2.7.x)&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2 (Python 2.7.2)&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that people could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, to keep them up to date, and to ensure that users' package choices don't conflict.&lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead describe downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt;, then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add it to your python path as well in your .bashrc, in the same place where you updated PYTHONPATH before; e.g.,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-py2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!  Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7245</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7245"/>
		<updated>2014-09-16T19:48:25Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.  It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet has given a mini-course of 8 lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against fast intel math libraries.   To use this version,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users &amp;amp;mdash; e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib] &amp;amp;mdash; are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version for Python 2.7.x is linked against the very fast MKL numerical libraries.&lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A pythonic interface to mpi.   Available with openmpi; must load an openmpi module for this to work. (There is an issue with openmpi 1.4.x + infiniband, however it does appear to work fine with IntelMPI)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2 (Python 2.7.2)&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 (Python 2.7.2)&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0 (Python 2.7.2)&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 (Python 2.7.2)&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8 (Python 2.7.2)&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15 (Python 2.7.x)&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2 (Python 2.7.2)&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that people could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, to keep them up to date, and to ensure that users' package choices don't conflict. &lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install it with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing: &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt;, then:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add it to your PYTHONPATH as well in your .bashrc, in the same place where you updated PYTHONPATH before, e.g.:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-py2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!   Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7244</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7244"/>
		<updated>2014-09-16T19:38:22Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.   It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.&lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet gave a mini-course of eight lectures on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against fast intel math libraries.   To use this version,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users (e.g. [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib]) are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version in Python 2.7.x is linked against the very fast MKL numerical libraries. &lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2 (Python 2.7.2)&lt;br /&gt;
| A pythonic interface to mpi.   Available with openmpi; must load an openmpi module for this to work. (There is an issue with openmpi 1.4.x + infiniband, however it does appear to work fine with IntelMPI)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0 (Python 2.7.2)&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2 (Python 2.7.2)&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 (Python 2.7.2)&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0 (Python 2.7.2)&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 (Python 2.7.2)&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;hdf5/187-v18-serial-gcc&amp;lt;/tt&amp;gt; on CentOS6.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8 (Python 2.7.2)&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. (&amp;lt;tt&amp;gt;netcdf/4.1.3_hdf5_serial-gcc&amp;lt;/tt&amp;gt; on CentOS 6)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1 (Python 2.7.2)&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15 (Python 2.7.x)&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2 (Python 2.7.2)&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional packages for Python that people could potentially want (see e.g. http://pypi.python.org/pypi) that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, to keep them up to date, and to ensure that users' package choices don't conflict. &lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
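The one-time setup in the steps above can be sketched as follows; &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; is assumed here, so substitute your own &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt; version:&lt;br /&gt;

```shell
# Sketch of the setup steps above; python2.7 is assumed (adjust 2.X as needed).
PYVER=2.7
mkdir -p "${HOME}/lib/python${PYVER}/site-packages"
export PYTHONPATH="${PYTHONPATH}:${HOME}/lib/python${PYVER}/site-packages"
# Quick check that the directory exists and is on the search path:
echo "${PYTHONPATH}" | grep -q "lib/python${PYVER}/site-packages" && echo OK
```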
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install it with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing: &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt;, then:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your path.  For each .egg directory, add it to your PYTHONPATH as well in your .bashrc, in the same place where you updated PYTHONPATH before, e.g.:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-py2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!   Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=7243</id>
		<title>Software and Libraries</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=7243"/>
		<updated>2014-09-16T18:31:17Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* GPC Software */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Software Module System =&lt;br /&gt;
All the software listed on this page is accessed using a modules system.  This means that much of the software is not accessible by default but has to be loaded using the module command. The reasons are that&lt;br /&gt;
* it allows us to easily keep multiple versions of software for different users on the system;&lt;br /&gt;
* it allows users to easily switch between versions.&lt;br /&gt;
The module system works similarly on the GPC and the TCS, although different modules are installed on these two systems.&lt;br /&gt;
&lt;br /&gt;
Note that, generally, if you compile a program with a module loaded, you will have to run it with that same module loaded, to make dynamically linked libraries accessible.&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
!{{Hl2}}|Function&lt;br /&gt;
!{{Hl2}}|Command&lt;br /&gt;
!{{Hl2}}|Comments&lt;br /&gt;
|-&lt;br /&gt;
|List available software packages:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If a module is not listed here, it is not supported.&lt;br /&gt;
*The flag &amp;quot;(default)&amp;quot; is never part of the name.&lt;br /&gt;
|-&lt;br /&gt;
|Use particular software:&lt;br /&gt;
|&amp;lt;pre&amp;gt; $ module load [module-name] &amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If possible, specify only the short name (the part before the &amp;quot;/&amp;quot;). &lt;br /&gt;
*When ambiguous, this loads the default one. &lt;br /&gt;
|-&lt;br /&gt;
|List available versions of a specific software package:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail [short-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|List currently loaded modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module list&amp;lt;/pre&amp;gt;&lt;br /&gt;
|For reproducibility, it is a good idea to put this in your job scripts, so you know exactly which modules (and versions) were used.&lt;br /&gt;
|-&lt;br /&gt;
|Get description of a particular module:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module help [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove a module from your shell:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module unload [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove all modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module purge&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Replace one loaded module with another:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module switch [old-module-name] [new-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SCINET_[short-module-name]_BASE&lt;br /&gt;
SCINET_[short-module-name]_LIB&lt;br /&gt;
SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[short-module-name]_INC}&amp;lt;/tt&amp;gt; to your compile commands and &amp;lt;tt&amp;gt;-L${SCINET_[short-module-name]_LIB}&amp;lt;/tt&amp;gt; to your link commands, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.&lt;br /&gt;
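As a hypothetical illustration of this naming convention (the short module name &amp;quot;gsl&amp;quot; and its install path are made up for this example; the actual values are set by the module system when you load the module):&lt;br /&gt;

```shell
# Hypothetical values for a module with short name "gsl"; in practice these
# variables are set for you by "module load gsl".
SCINET_GSL_BASE=/scinet/gpc/Libraries/gsl-1.15   # made-up path, illustrative only
SCINET_GSL_INC=${SCINET_GSL_BASE}/include
SCINET_GSL_LIB=${SCINET_GSL_BASE}/lib
# A compile/link line following the convention described above would then read:
echo gcc -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} myprog.c -lgsl -o myprog
```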
&lt;br /&gt;
Errors in loaded modules can arise for a few reasons, for instance:&lt;br /&gt;
* A module by that name may not exist.&lt;br /&gt;
* Some modules require other modules to have been loaded; if this requirement is not met when you try to load that module, an error message will be printed explaining which module is needed.&lt;br /&gt;
* Some modules cannot be loaded together: an error message will be printed explaining which modules conflict.&lt;br /&gt;
&lt;br /&gt;
It is no longer recommended to load modules in the file [[Important_.bashrc_guidelines|.bashrc]] in your home directory; rather, load them explicitly on the command line and in your job scripts.&lt;br /&gt;
&lt;br /&gt;
== Default and non-default modules ==&lt;br /&gt;
&lt;br /&gt;
When you load a module with its 'short' name, you will get the ''default'' version, which is the most recent (usually), recommended version of that library or piece of software.  In general, using the short module name is the way to go. However, you may have code that depends on the intricacies of a non-default version.  For that reason, the most common older versions are also available as modules.  You can find all available modules using the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
== Naming convention ==&lt;br /&gt;
&lt;br /&gt;
For modules that access applications, the full name of a module is as follows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  [short-module-name]/[version-number]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To have all modules conform to this convention, a number of modules' names changed on Nov 3, 2010:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''old name'''&lt;br /&gt;
| '''new name'''&lt;br /&gt;
| '''remarks'''&lt;br /&gt;
|-&lt;br /&gt;
|autoconf/autoconf-2.64 &amp;amp;nbsp; &amp;amp;nbsp;&amp;amp;nbsp;&lt;br /&gt;
|autoconf/2.64&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.0           &lt;br /&gt;
|cuda/3.0&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.1          &lt;br /&gt;
|cuda/3.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/ddd-3.3.12   &lt;br /&gt;
|ddd/3.3.12&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/gdb-7.1       &lt;br /&gt;
|gdb/7.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|editors/nano/2.2.4      &lt;br /&gt;
|nano/2.2.4&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|emacs/emacs-23.1        &lt;br /&gt;
|emacs/23.1.1&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|gcc/gcc-4.4.0           &lt;br /&gt;
|gcc/4.4.0&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|graphics/ncview         &lt;br /&gt;
|ncview/1.93&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|graphics/graphics       &lt;br /&gt;
|grace/5.1.22&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|                        &lt;br /&gt;
|gnuplot/4.2.6&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|svn/svn165              &lt;br /&gt;
|svn/1.6.5&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|visualization/paraview  &lt;br /&gt;
|paraview/3.8&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|amber10/amber10         &lt;br /&gt;
|amber/10.0.30 &lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|gamess/gamess           &lt;br /&gt;
|gamess/May2209 &amp;amp;nbsp;&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==modulefind - Finding modules by name==&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command will only show you modules whose names start with the argument that you give it, and will also return modules that you cannot load due to conflicts with already loaded modules.&lt;br /&gt;
&lt;br /&gt;
A little SciNet utility called &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt; (one word) offers a more flexible search. It will list all installed modules whose names contain the argument, and will determine whether those modules have been loaded, could be loaded, cannot be loaded because of conflicts with already loaded modules, or have unresolved dependencies (i.e. require other modules to be loaded first).  This is especially useful in cases like the &amp;quot;boost&amp;quot; libraries, whose module names are cxxlibraries/boost/1.47.0-gcc and cxxlibraries/boost/1.47.0-intel, for the gcc and intel compilers, respectively.  &amp;lt;tt&amp;gt;modulefind boost&amp;lt;/tt&amp;gt; will find those, whereas &amp;lt;tt&amp;gt;module avail boost&amp;lt;/tt&amp;gt; will not.&lt;br /&gt;
&lt;br /&gt;
Note that just 'modulefind' will list all top-level modules.&lt;br /&gt;
&lt;br /&gt;
== Making your own modules ==&lt;br /&gt;
&lt;br /&gt;
Making your own modules (e.g. for local installations, or to access optional perl modules) is possible; how to do so is described on the [[Installing your own modules]] page.&lt;br /&gt;
&lt;br /&gt;
== Deprecated modules ==&lt;br /&gt;
&lt;br /&gt;
Some older software modules for which newer versions exist get deprecated, which means they are no longer maintained.  Since deprecated modules should only be needed in rare, exceptional cases, they are not listed by the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.  However, if you have a piece of legacy code that really depends on a deprecated version of a library (and we urge you to check that it does not work with newer versions!), then you can load a deprecated version with &amp;lt;pre&amp;gt;module load use.deprecated [deprecated-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Currently (Oct 5, 2010), the following modules are deprecated on the GPC: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/gcc-4.3.2          hdf5/184-v16-serial     intel/intel-v11.1.046               openmpi/1.3.3-intel-v11.0-ofed&lt;br /&gt;
hdf5/183-v16-openmpi   hdf5/184-v18-intelmpi   intelmpi/impi-3.2.1.009             openmpi/1.3.2-intel-v11.0-ofed.orig&lt;br /&gt;
hdf5/183-v18-openmpi   hdf5/184-v18-openmpi    intelmpi/impi-3.2.2.006             pgplot/5.2.2-gcc.old            &lt;br /&gt;
hdf5/184-v16-intelmpi  hdf5/184-v18-serial     intelmpi/impi-4.0.0.013             pgplot/5.2.2-intel.old&lt;br /&gt;
hdf5/184-v16-openmpi   intel/intel-v11.0.081   intelmpi/impi-4.0.0.025               &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the TCS, currently (Oct 5, 2010) the only deprecated module is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ncl/5.1.1old&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Before using any of these deprecated modules, make sure that there is not a regular module that satisfies your needs, likely by a ''very similar name''.&lt;br /&gt;
&lt;br /&gt;
== Commercial software ==&lt;br /&gt;
&lt;br /&gt;
Apart from the compilers on our systems and the ddt parallel debugger, we generally do not provide licensed application software, e.g., no Gaussian, IDL, Matlab, etc. &lt;br /&gt;
See the [https://support.scinet.utoronto.ca/wiki/index.php/FAQ#How_can_I_run_Matlab_.2F_IDL_.2F_Gaussian_.2F_my_favourite_commercial_software_at_SciNet.3F FAQ].&lt;br /&gt;
&lt;br /&gt;
== Other software and libraries ==&lt;br /&gt;
&lt;br /&gt;
If you want to use a piece of software or a library that is not on the list, you can in principle install it yourself in your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
Note however that building libraries and software from source often uses a lot of files. To avoid running out of disk space, building software is therefore best done on &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, from which&lt;br /&gt;
you can copy/install only the libraries, header files and binaries to your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
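The build-on-scratch pattern described above can be sketched as follows; the package name and paths are placeholders, and the configure/make lines are left as comments since they depend on the package:&lt;br /&gt;

```shell
# Sketch of building on scratch and installing into /home (names are placeholders).
BUILDDIR=${SCRATCH:-/tmp}/build/mypkg
mkdir -p "${BUILDDIR}" && cd "${BUILDDIR}"
# Typical steps for an autoconf-style package (commented out; package-specific):
#   tar -xzvf mypkg.tgz && cd mypkg
#   ./configure --prefix=${HOME}/software/mypkg && make && make install
pwd   # confirms the build tree lives on scratch, not in /home
```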
&lt;br /&gt;
If you suspect that a particular piece of software or a library would be of use to other users of SciNet as well, contact us, and we will consider adding it to the system.&lt;br /&gt;
&lt;br /&gt;
== Software lists ==&lt;br /&gt;
=== ARC/GPU Software ===&lt;br /&gt;
&lt;br /&gt;
The CPUs in the GPU nodes of the ARC cluster are of the same kind as those of the GPC, so all modules available on the GPC are available on the GPU nodes with a CentOS 6 image. This means that the different cuda variants that are available as modules can be loaded on those GPC nodes as well, although they are of little use on that system.&lt;br /&gt;
&lt;br /&gt;
=== GPC Software ===&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Software  &lt;br /&gt;
!{{Hl2}}| Versions&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-  &lt;br /&gt;
|Intel Compiler&lt;br /&gt;
|12.1.3*, 12.1.5, 13.1.1, 14.0.1&lt;br /&gt;
| includes MKL library, which includes BLAS, LAPACK, FFT, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;icpc,icc,ifort&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.6.1*, 4.7.0, 4.7.2, 4.8.1, 4.9.0&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Cuda&lt;br /&gt;
| 3.2, 4.0, 4.1*, 4.2, 5.0, 5.5&lt;br /&gt;
| NVIDIA's extension to C for GPGPU programming&lt;br /&gt;
| &amp;lt;tt&amp;gt;nvcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| PGI Compiler&lt;br /&gt;
| 12.5&lt;br /&gt;
| supports OpenACC and CUDA Fortran &lt;br /&gt;
| &amp;lt;tt&amp;gt;pgcc,pgcpp,pgfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| IntelMPI&lt;br /&gt;
| 4.0.2, 4.1.2&lt;br /&gt;
| MPICH2 based MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intelmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| OpenMPI&lt;br /&gt;
| 1.4.4*, 1.5.4, 1.6.4&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 2.12.2&lt;br /&gt;
| Berkeley Unified Parallel C Implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;upcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Editors'''''&lt;br /&gt;
|- &lt;br /&gt;
| Nano&lt;br /&gt;
| 2.2.4&lt;br /&gt;
| Nano's another editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Emacs&lt;br /&gt;
| 23.1.1&lt;br /&gt;
| New version of popular text editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| XEmacs&lt;br /&gt;
| 21.4.22&lt;br /&gt;
| XEmacs editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Development tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| Autoconf&lt;br /&gt;
| 2.68&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Automake&lt;br /&gt;
| 1.11.2&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;aclocal, automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CMake&lt;br /&gt;
| 2.8.6&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scons&lt;br /&gt;
| 2.0&lt;br /&gt;
| Software construction tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Git&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Revision control system&lt;br /&gt;
| &amp;lt;tt&amp;gt;git,gitk&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;git&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Intel tools&lt;br /&gt;
| 2011&lt;br /&gt;
| Intel Code Analysis Tools&lt;br /&gt;
| Vtune Amplifier XE, Inspector XE&lt;br /&gt;
| &amp;lt;tt&amp;gt;inteltools&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Mercurial&lt;br /&gt;
| 1.8.2&lt;br /&gt;
| Version control system&amp;lt;br&amp;gt;(part of the python module!)&lt;br /&gt;
| &amp;lt;tt&amp;gt;hg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug and performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1, 4.2.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool, plus the MAP MPI profiler&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt, map&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| DDD&lt;br /&gt;
| 3.3.12&lt;br /&gt;
| Data Display Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GDB&lt;br /&gt;
| 7.3.1&lt;br /&gt;
| GNU debugger (the Intel idbc debugger is available by default)&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| MPE2&lt;br /&gt;
| 2.4.5&lt;br /&gt;
| Multi-Processing Environment with Intel + OpenMPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpecc, mpefc, jumpshot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#OpenSpeedShop_.28profiling.2C_MPI_tracing:_GPC.29 | OpenSpeedShop]]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| sampling and MPI tracing&lt;br /&gt;
| &amp;lt;tt&amp;gt;openss, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openspeedshop&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#Scalasca_.28profiling.2C_tracing:_TCS.2C_GPC.29 | Scalasca]]&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications (Compiled with OpenMPI)&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipm-hpc.sourceforge.net IPM]&lt;br /&gt;
| 0.983&lt;br /&gt;
| Integrated Performance Monitoring&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm, ipm_parse, ploticus,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Performance_And_Debugging_Tools:_GPC#Valgrind | Valgrind]]&lt;br /&gt;
| 3.6.1&lt;br /&gt;
| Memory checking utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind,cachegrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Padb&lt;br /&gt;
| 3.2 &lt;br /&gt;
| examine and debug parallel programs&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|&amp;lt;span id=&amp;quot;anchor_viz&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;'''''Visualization tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| Grace&lt;br /&gt;
| 5.1.22&lt;br /&gt;
| Plotting utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;xmgrace&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;grace&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Gnuplot&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Plotting utility&amp;lt;br&amp;gt;Requires 'extras' module if used on compute nodes.&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[ Using_Paraview | ParaView ]]&lt;br /&gt;
| 3.12.0&lt;br /&gt;
| Scientific visualization, server only&lt;br /&gt;
| &amp;lt;tt&amp;gt;pvserver,pvbatch,pvpython&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;paraview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| VMD&lt;br /&gt;
| 1.9&lt;br /&gt;
| Visualization and analysis utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCL/NCARG&lt;br /&gt;
| 6.0.0&lt;br /&gt;
| NCARG graphics and ncl utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| ROOT&lt;br /&gt;
| 5.30.00&lt;br /&gt;
| ROOT Analysis Framework from CERN&lt;br /&gt;
| &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ROOT&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
| ImageMagick&lt;br /&gt;
| 6.6.7&lt;br /&gt;
| Image manipulation tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;convert,animate,composite,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ImageMagick&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PGPLOT&lt;br /&gt;
| 5.2.2&lt;br /&gt;
| Graphics subroutine library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libcpgplot,libpgplot,libtkpgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ncview&lt;br /&gt;
| 2.1.1&lt;br /&gt;
| Visualization for NetCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted, etc.&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CDO&lt;br /&gt;
| 1.5.1&lt;br /&gt;
| Climate Data Operators&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| UDUNITS&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc,hdiff,...,libdf,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Hdf5 | HDF5]]&lt;br /&gt;
| 1.8.7-v18*&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.arg0.net/encfs EncFS ]&lt;br /&gt;
| 1.74&lt;br /&gt;
| EncFS provides an encrypted filesystem in user-space, (works ONLY on gpc01..04)&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|- &lt;br /&gt;
| [[amber|AMBER 10]]&lt;br /&gt;
| Amber 10 + Amber Tools 1.3&lt;br /&gt;
| Amber Molecular Dynamics Package&lt;br /&gt;
| &amp;lt;tt&amp;gt;sander, sander.MPI&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;amber&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gamess|GAMESS (US)]]&lt;br /&gt;
| August 18, 2011 R1&lt;br /&gt;
| General Atomic and Molecular Electronic Structure System&lt;br /&gt;
| &amp;lt;tt&amp;gt;rungms&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gamess&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gromacs|GROMACS]]&lt;br /&gt;
| 4.5.5, 4.5.7, 4.6.2&lt;br /&gt;
| GROMACS molecular dynamics, single precision, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;grompp, mdrun&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gromacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[namd|NAMD]]&lt;br /&gt;
| 2.8&lt;br /&gt;
| NAMD - Scalable Molecular Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;namdmpiexec, namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[nwchem|NWChem]]&lt;br /&gt;
| 6.0&lt;br /&gt;
| NWChem Quantum Chemistry&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 4.3.2, 5.0.3&lt;br /&gt;
| Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;pw.x, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://blast.ncbi.nlm.nih.gov BLAST]&lt;br /&gt;
| 2.2.23+&lt;br /&gt;
| Basic Local Alignment Search Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;blastn,blastp,blastx,psiblast,tblastn...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;blast&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://denovoassembler.sourceforge.net RAY]&lt;br /&gt;
| 2.1.0 (small k-mer)&lt;br /&gt;
| Parallel de novo genome assemblies&lt;br /&gt;
| &amp;lt;tt&amp;gt;Ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[cpmd|CPMD]]&lt;br /&gt;
| 3.13.2&lt;br /&gt;
| Car-Parrinello molecular dynamics, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd.x&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[R Statistical Package|R]] &lt;br /&gt;
| 2.13.1&lt;br /&gt;
| statistical computing&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Octave&lt;br /&gt;
| 3.4.3&lt;br /&gt;
| Matlab-like environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.openfoam.org OpenFOAM ]&lt;br /&gt;
| 2.3.0&lt;br /&gt;
| Open Source CFD Package&lt;br /&gt;
| &amp;lt;tt&amp;gt;*foam&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.mcs.anl.gov/petsc/petsc-as/  PETSc ]&lt;br /&gt;
| 3.1*&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation (PETSc)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc, etc.. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Armadillo C++ linear algebra library | Armadillo]]&lt;br /&gt;
| 3.910.0&lt;br /&gt;
| C++ armadillo libraries (implement Matlab-like syntax)&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.tacc.utexas.edu/tacc-projects/gotoblas2/ GotoBLAS]&lt;br /&gt;
| 1.13&lt;br /&gt;
| Optimized BLAS implementation &lt;br /&gt;
| &amp;lt;tt&amp;gt;libgoto2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gotoblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13*, 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.3&lt;br /&gt;
| fast Fourier transform library&lt;br /&gt;
''Be careful when combining FFTW3 and MKL: you need to link FFTW3 first, with'' &amp;lt;tt&amp;gt;-L${SCINET_FFTW_LIB} -lfftw3&amp;lt;/tt&amp;gt;'', then link MKL.''&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| LAPACK&lt;br /&gt;
| &lt;br /&gt;
| Provided by the Intel MKL library&lt;br /&gt;
| See http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://freshmeat.net/projects/rlog  RLog ]&lt;br /&gt;
| 1.4&lt;br /&gt;
| RLog provides a flexible message logging facility for C++ programs and libraries.&lt;br /&gt;
| &amp;lt;tt&amp;gt;librlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/rlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[GNU Parallel]]&lt;br /&gt;
| 2012-10-22&lt;br /&gt;
| execute commands in parallel&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.3.4&lt;br /&gt;
| Python programming language. Modules included: numpy-1.8.1, scipy-0.14.0, matplotlib-1.3.1, ipython-1.2.1, cython-0.20.1, h5py-2.3.0, tables-3.1.1, netCDF4-1.1.0, astropy-0.3.2, scikit_learn-0.15.0b1&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ruby&lt;br /&gt;
| 1.9.1&lt;br /&gt;
| Ruby programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 1.6.0&lt;br /&gt;
| IBM's Java JRE and SDK&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| Xlibraries&lt;br /&gt;
|&lt;br /&gt;
| A collection of X graphics libraries and tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;xterm&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xpdf&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;Xlibraries&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Extras&lt;br /&gt;
|&lt;br /&gt;
| A collection of standard linux and home-grown tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;bc&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;screen&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xxdiff&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;ish&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
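''A minimal sketch of the FFTW + MKL link-order caveat noted in the table above (assumes the &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;fftw&amp;lt;/tt&amp;gt; modules; file names are illustrative):''&lt;br /&gt;
  module load intel fftw&lt;br /&gt;
  # link fftw3 before the MKL libraries&lt;br /&gt;
  icpc mycode.o -L${SCINET_FFTW_LIB} -lfftw3 -mkl -o mycode&lt;br /&gt;
&lt;br /&gt;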
&lt;br /&gt;
=== TCS Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM compilers&lt;br /&gt;
|10.1(c/c++)&amp;lt;br&amp;gt;12.1(fortran)&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlf,xlc_r,xlC_r,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''standard available''&lt;br /&gt;
|-&lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpcc,mpCC,mpxlf,mpcc_r,mpCC_r,mpxlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''standard available''&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 1.2&lt;br /&gt;
| Unified Parallel C&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlupc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|13.1, 14.1&lt;br /&gt;
| newer version &lt;br /&gt;
| &amp;lt;tt&amp;gt;xlf,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| xlf/13.1&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|11.1, 12.1&lt;br /&gt;
| new versions&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| vacpp&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| MPE2&lt;br /&gt;
| 1.0.6&lt;br /&gt;
| Performance Visualization for Parallel Programs   &lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scalasca&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.5&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc, hdiff, ..., libdf, libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF + ncview&lt;br /&gt;
| 4.0.1*&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf, ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.1.1*&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 3.9.6*&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted, &amp;lt;/tt&amp;gt; etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Fast Fourier transform library&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi,libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK+SCALAPACK&lt;br /&gt;
| 3.4.2+2.0.2&lt;br /&gt;
| Linear algebra package. Note that ESSL, which comes with the IBM compilers, also contains a large part of LAPACK.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack,libscalapack,libblacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PetSc&lt;br /&gt;
| 3.2&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation. With external packages mumps, chaco, hypre, parmetis, prometheus, plapack, superlu, sprng.&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of libraries to your user environment&amp;lt;br&amp;gt; compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi, libfftw3, libhdf5, liblapack, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gmake&lt;br /&gt;
| 3.82&lt;br /&gt;
| GNU's make. Replaces AIX make or gmake 3.80.&lt;br /&gt;
| &amp;lt;tt&amp;gt;make&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCL&lt;br /&gt;
| 5.1.1&lt;br /&gt;
| NCAR Command Language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl, libncl, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
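''For example, building against a library from the &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt; module on the TCS might look like this (library and file names are illustrative):''&lt;br /&gt;
  module load extras&lt;br /&gt;
  xlc_r -I${SCINET_EXTRAS_INC} mycode.c -L${SCINET_EXTRAS_LIB} -lfftw3 -o mycode&lt;br /&gt;
&lt;br /&gt;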
&lt;br /&gt;
=== P7 Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1, 13.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlf,xlf_r,xlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1, 11.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.23.2&lt;br /&gt;
| &lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|5.2.2&lt;br /&gt;
|IBM's Parallel Environment&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpcc,mpCC,mpfort,mpiexec&amp;lt;/tt&amp;gt;&lt;br /&gt;
|pe&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.6.1 , 4.8.1&lt;br /&gt;
| GNU Compiler Collection&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 7.0&lt;br /&gt;
| IBM Java 1.7 implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;jdk&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.7&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap2, ncatted, &amp;lt;/tt&amp;gt; etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.5&lt;br /&gt;
| Python programming language. Modules included: numpy-1.8.0, scipy-0.13.2, matplotlib-1.3.1, pyfits-3.2, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| command-driven interactive function and data plotting program&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| udunits&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of applications and libraries to your user environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;bindlaunch, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=Manuals=&lt;br /&gt;
{{:Manuals}}&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7172</id>
		<title>Oldwiki.scinet.utoronto.ca:System Alerts</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7172"/>
		<updated>2014-08-28T14:30:43Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== System Status==&lt;br /&gt;
&amp;lt;!-- The 'status circles' can be one of the following files: &lt;br /&gt;
     down.png   for down&lt;br /&gt;
     up25.png   for 25% up&lt;br /&gt;
     up50.png   for 50% up&lt;br /&gt;
     up75.png   for 75% up&lt;br /&gt;
     up.png     for 100% up&lt;br /&gt;
 --&amp;gt;&lt;br /&gt;
{| &lt;br /&gt;
|[[File:up.png|up|link=GPC Quickstart]][[GPC Quickstart|GPC]]&lt;br /&gt;
|[[File:up.png|up|link=TCS Quickstart]][[TCS Quickstart|TCS]]&lt;br /&gt;
|[[File:up.png|up|link=Sandy]][[Sandy]]&lt;br /&gt;
|[[File:up.png|up|link=GPU Devel Nodes]][[GPU Devel Nodes|ARC]]&lt;br /&gt;
|[[File:up.png|up]]File System&lt;br /&gt;
|-&lt;br /&gt;
|[[File:up.png|up|link=Gravity]][[Gravity]]&lt;br /&gt;
|[[File:up.png|up|link=P7 Linux Cluster]][[P7 Linux Cluster|P7]]&lt;br /&gt;
|[[File:up.png|up|link=BGQ]][[BGQ]]&lt;br /&gt;
|[[File:up.png|up|link=HPSS]][[HPSS]]&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
On August 21, 2014, at 00:30, there will be a 15-minute period during which maintenance on our gateway switch must be conducted.  As a result, you may not be able to contact the datacentre.  No jobs will be affected by this.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note: As a precaution, emails from the Moab/Torque scheduler have been disabled since Jan 24th, 2014, because of a potential security vulnerability.&lt;br /&gt;
&lt;br /&gt;
Last updated on Wed Aug 20 11:19:21 EDT 2014&lt;br /&gt;
&lt;br /&gt;
([[Previous_messages:|Previous messages]])&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7170</id>
		<title>Oldwiki.scinet.utoronto.ca:System Alerts</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7170"/>
		<updated>2014-08-27T19:46:06Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== System Status==&lt;br /&gt;
&amp;lt;!-- The 'status circles' can be one of the following files: &lt;br /&gt;
     down.png   for down&lt;br /&gt;
     up25.png   for 25% up&lt;br /&gt;
     up50.png   for 50% up&lt;br /&gt;
     up75.png   for 75% up&lt;br /&gt;
     up.png     for 100% up&lt;br /&gt;
 --&amp;gt;&lt;br /&gt;
{| &lt;br /&gt;
|[[File:up.png|up|link=GPC Quickstart]][[GPC Quickstart|GPC]]&lt;br /&gt;
|[[File:up.png|up|link=TCS Quickstart]][[TCS Quickstart|TCS]]&lt;br /&gt;
|[[File:up.png|up|link=Sandy]][[Sandy]]&lt;br /&gt;
|[[File:up.png|up|link=GPU Devel Nodes]][[GPU Devel Nodes|ARC]]&lt;br /&gt;
|[[File:up.png|up]]File System&lt;br /&gt;
|-&lt;br /&gt;
|[[File:up.png|up|link=Gravity]][[Gravity]]&lt;br /&gt;
|[[File:up.png|up|link=P7 Linux Cluster]][[P7 Linux Cluster|P7]]&lt;br /&gt;
|[[File:up75.png|up|link=BGQ]][[BGQ]]&lt;br /&gt;
|[[File:up.png|up|link=HPSS]][[HPSS]]&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Wed Aug 27 15:44:00 EDT 2014: Issue with a node board on the production system: full-system jobs cannot run until the node is fixed.&lt;br /&gt;
&lt;br /&gt;
On August 21, 2014, at 00:30, there will be a 15-minute period during which maintenance on our gateway switch must be conducted.  As a result, you may not be able to contact the datacentre.  No jobs will be affected by this.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note: As a precaution, emails from the Moab/Torque scheduler have been disabled since Jan 24th, 2014, because of a potential security vulnerability.&lt;br /&gt;
&lt;br /&gt;
Last updated on Wed Aug 20 11:19:21 EDT 2014&lt;br /&gt;
&lt;br /&gt;
([[Previous_messages:|Previous messages]])&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7169</id>
		<title>Oldwiki.scinet.utoronto.ca:System Alerts</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7169"/>
		<updated>2014-08-27T19:45:46Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== System Status==&lt;br /&gt;
&amp;lt;!-- The 'status circles' can be one of the following files: &lt;br /&gt;
     down.png   for down&lt;br /&gt;
     up25.png   for 25% up&lt;br /&gt;
     up50.png   for 50% up&lt;br /&gt;
     up75.png   for 75% up&lt;br /&gt;
     up.png     for 100% up&lt;br /&gt;
 --&amp;gt;&lt;br /&gt;
{| &lt;br /&gt;
|[[File:up.png|up|link=GPC Quickstart]][[GPC Quickstart|GPC]]&lt;br /&gt;
|[[File:up.png|up|link=TCS Quickstart]][[TCS Quickstart|TCS]]&lt;br /&gt;
|[[File:up.png|up|link=Sandy]][[Sandy]]&lt;br /&gt;
|[[File:up.png|up|link=GPU Devel Nodes]][[GPU Devel Nodes|ARC]]&lt;br /&gt;
|[[File:up.png|up]]File System&lt;br /&gt;
|-&lt;br /&gt;
|[[File:up.png|up|link=Gravity]][[Gravity]]&lt;br /&gt;
|[[File:up.png|up|link=P7 Linux Cluster]][[P7 Linux Cluster|P7]]&lt;br /&gt;
|[[File:up75.png|up|link=BGQ]][[BGQ]]&lt;br /&gt;
|[[File:up.png|up|link=HPSS]][[HPSS]]&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Wed Aug 27 15:44:00 EDT 2014: Issue with a node board on the production system: full-system jobs cannot run until the board is fixed.&lt;br /&gt;
&lt;br /&gt;
On August 21, 2014, starting at 00:30, there will be a 15-minute period during which maintenance on our gateway switch must be conducted.  As a result you may not be able to contact the datacentre.  No jobs will be affected by this.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note: As a precaution, emails from the Moab/Torque scheduler have been disabled since Jan 24th, 2014, because of a potential security vulnerability.&lt;br /&gt;
&lt;br /&gt;
Last updated on Wed Aug 20 11:19:21 EDT 2014&lt;br /&gt;
&lt;br /&gt;
([[Previous_messages:|Previous messages]])&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7163</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7163"/>
		<updated>2014-08-19T18:45:30Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Steps ( Job dependency) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/about/our-research-partners/soscip/].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation IBM Blue Gene supercomputer built around a system-on-a-chip compute node with a 16-core, 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8,192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK, and are all connected together by a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes running a full Red Hat Linux OS, which manage the compute nodes and mount the filesystem.  SciNet has two BGQ systems: a half-rack, 8,192-core development system, and a two-rack, 32,768-core production system.&lt;br /&gt;
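As a quick sanity check, the per-rack figures above follow from simple arithmetic; the sketch below is illustrative only, with the numbers taken from this page.&lt;br /&gt;

```shell
# Illustrative sketch: arithmetic behind the per-rack figures quoted above.
nodes_per_board=32
boards_per_midplane=16
midplanes_per_rack=2
cores_per_node=16
ram_per_node_gb=16

nodes_per_rack=$(( nodes_per_board * boards_per_midplane * midplanes_per_rack ))  # 1024 nodes
cores_per_rack=$(( nodes_per_rack * cores_per_node ))                             # 16384 cores
ram_per_rack_tb=$(( nodes_per_rack * ram_per_node_gb / 1024 ))                    # 16 TB

echo "$nodes_per_rack nodes, $cores_per_rack cores, $ram_per_rack_tb TB RAM per rack"
```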
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
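As a cross-check, the dimensions of each torus shape in the table multiply out to the corresponding number of compute nodes. The following bash sketch illustrates the relation; the check_shape helper is hypothetical, not an installed tool.&lt;br /&gt;

```shell
# Hypothetical helper: verify that a torus shape AxBxCxDxE multiplies
# out to the given number of compute nodes.
check_shape () {
  local nodes=$1 shape=$2 product=1 d
  for d in ${shape//x/ }; do
    product=$(( product * d ))
  done
  if [ "$product" -eq "$nodes" ]; then
    echo "ok: $shape = $nodes nodes"
  else
    echo "mismatch: $shape = $product, expected $nodes"
  fi
}

check_shape 32   2x2x2x2x2   # 1 node board
check_shape 512  4x4x4x4x2   # 1 midplane
check_shape 2048 4x4x8x8x2   # 2 racks
```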
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ; the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to set up ssh keys for logging in, please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines, including most of the compilers, you will have to use the ''modules'' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt; to your compile and link commands, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
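For example, a compile line might pick up these variables as follows. This is a sketch: the FFTW module name and paths are hypothetical, and only the SCINET_*_INC/_LIB naming convention is taken from above; on the real system, module load would set these variables for you.&lt;br /&gt;

```shell
# Sketch using the SCINET_[short-module-name]_* convention described above.
# The 'FFTW' module name and the paths are hypothetical, for illustration.
export SCINET_FFTW_INC=/opt/fftw/include
export SCINET_FFTW_LIB=/opt/fftw/lib

CFLAGS="-I${SCINET_FFTW_INC}"
LDFLAGS="-L${SCINET_FFTW_LIB} -lfftw3"

# The resulting compile/link line (printed, not executed here):
echo "mpixlc code.c $CFLAGS $LDFLAGS"
```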
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries, but on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block: a predefined group of nodes that has already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure MPI jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on each node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; higher verbosity levels can be helpful in debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
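The relation between bg_size, ranks-per-node, --np, and the memory available to each rank can be sketched with a little illustrative arithmetic (values taken from the examples above):&lt;br /&gt;

```shell
# Illustrative arithmetic relating bg_size, ranks-per-node, --np,
# and the memory available to each rank (16 GB = 16384 MB per node).
bg_size=64            # nodes requested in the loadleveler script
ranks_per_node=16
node_ram_mb=16384

np=$(( ranks_per_node * bg_size ))                  # total MPI processes
mem_per_rank_mb=$(( node_ram_mb / ranks_per_node )) # per-rank memory

echo "--np $np --ranks-per-node=$ranks_per_node (~$mem_per_rank_mb MB per rank)"
```

With 64 ranks per node the same arithmetic gives the 256 MB per rank mentioned above.&lt;br /&gt;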
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8 AM to midnight every day, including weekends.  This block is accessed using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted on the development system with a wall time of less than 8 hours.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the resources currently available on the BGQ. For example, the command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The parameter &amp;quot;bg_size&amp;quot; is in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024 or 2048.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP threads per MPI process (for hybrid codes) = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
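These constraints can be checked before submitting; below is a sketch (the check_job function is hypothetical, not an installed tool):&lt;br /&gt;

```shell
# Hypothetical pre-submission sanity check for the constraints listed above.
check_job () {
  local np=$1 rpn=$2 omp=$3 bg_size=$4
  [ "$np" -le $(( rpn * bg_size )) ] || { echo "np > ranks-per-node * bg_size"; return 1; }
  [ "$rpn" -le "$np" ]               || { echo "ranks-per-node > np"; return 1; }
  [ $(( rpn * omp )) -le 64 ]        || { echo "ranks-per-node * OMP_NUM_THREADS > 64"; return 1; }
  echo "ok"
}

check_job 1024 16 4 64   # 1024 <= 16*64, 16 <= 1024, 16*4 = 64 <= 64
```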
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Steps ( Job dependency) ===&lt;br /&gt;
LoadLeveler has many advanced features to control job submission and execution. One of these is called steps: a series of jobs submitted from a single script, with dependencies defined between them, so that each job (a step) waits for the previous one to finish before it starts. The following example uses the same LoadLeveler script as previously shown, but the #@ step_name and #@ dependency directives are used to rerun the same case three times in a row, waiting for each job to finish before starting the next.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ step_name = step1&lt;br /&gt;
# @ queue&lt;br /&gt;
# Launch the first step :&lt;br /&gt;
if [ $LOADL_STEP_NAME = &amp;quot;step1&amp;quot; ]; then&lt;br /&gt;
    runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ step_name = step2&lt;br /&gt;
# @ dependency = step1 == 0&lt;br /&gt;
# @ queue&lt;br /&gt;
# Launch the second step if the first one has returned 0 (done successfully) :&lt;br /&gt;
if [ $LOADL_STEP_NAME = &amp;quot;step2&amp;quot; ]; then&lt;br /&gt;
    runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ step_name = step3&lt;br /&gt;
# @ dependency = step2 == 0&lt;br /&gt;
# @ queue&lt;br /&gt;
# Launch the third step if the second one has returned 0 (done successfully) :&lt;br /&gt;
if [ $LOADL_STEP_NAME = &amp;quot;step3&amp;quot; ]; then&lt;br /&gt;
    runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
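The step-dispatch pattern in the script above can be exercised outside loadleveler by setting LOADL_STEP_NAME by hand (loadleveler sets it for each step); the run_step function below is a hypothetical stand-in for the runjob branches:&lt;br /&gt;

```shell
# Sketch of the LOADL_STEP_NAME dispatch used in the script above;
# 'run_step' is a hypothetical stand-in for the runjob invocations.
run_step () {
  case "$LOADL_STEP_NAME" in
    step1) echo "would runjob for step1" ;;
    step2) echo "would runjob for step2" ;;
    step3) echo "would runjob for step3" ;;
    *)     echo "unknown step"; return 1 ;;
  esac
}

LOADL_STEP_NAME=step2 run_step
```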
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands are not run on a BGQ compute node but on the front-end node. Only programs started with runjob run on the BGQ compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.   A script&lt;br /&gt;
has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes; when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' it runs in a dedicated reservation, as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of attaching a gdb tool to each process by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;, and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter sets the number of MPI processes to run.  Most configure scripts expect only one MPI process, so &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A &amp;lt;tt&amp;gt;debugjob -i&amp;lt;/tt&amp;gt; session implicitly calls runjob with 1 MPI task when you run an executable:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
debugjob -i&lt;br /&gt;
**********************************************************&lt;br /&gt;
 Interactive BGQ runjob shell using bgq-fen1-ib0.10295.0 and           &lt;br /&gt;
 LL14040718574824 for 30 minutes with 64 NODES (1024 cores). &lt;br /&gt;
 IMPLICIT MODE: running an executable implicitly calls runjob&lt;br /&gt;
                with 1 mpi task&lt;br /&gt;
 Exit shell when finished.                                &lt;br /&gt;
**********************************************************&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script that calls &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE and an array of 16 starting $CORNER values. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size).&lt;br /&gt;
&lt;br /&gt;
# In this case: 16 sub-blocks of 4 compute nodes each (64 nodes total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
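The counts in this example fit together as follows (illustrative arithmetic only):&lt;br /&gt;

```shell
# Illustrative arithmetic for the sub-block example above.
bg_size=64         # nodes in the booted block
sub_size=4         # nodes per sub-block (the argument passed to 'subblocks')
ranks_per_node=16

n_subblocks=$(( bg_size / sub_size ))             # concurrent runjob calls
np_per_subblock=$(( sub_size * ranks_per_node ))  # --np for each call

echo "$n_subblocks sub-blocks, --np $np_per_subblock each"
```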
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also, if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500 TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for job input and output data; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB or 1 million files, whichever is reached first&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
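For example (the directory name &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; is just an illustration), job input and output would normally be staged under $SCRATCH rather than $HOME:&lt;br /&gt;

```shell
# Sketch: stage a run directory on /scratch (large, but not backed up),
# keeping only source code and small files in the backed-up /home.
# The fallback value is a placeholder so the sketch also works off the cluster.
SCRATCH="${SCRATCH:-/tmp/scratch-demo}"
mkdir -p "$SCRATCH/myrun"
cd "$SCRATCH/myrun"
```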
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
The GPFS file system of the BGQ is shared between the bgq development and production systems but, except via HPSS, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor are the file systems of those other systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members of the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user in your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included: numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.3, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Python on BlueGene ==&lt;br /&gt;
Python 2.7.3 has been installed on BlueGene. To use &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Numpy&amp;lt;/span&amp;gt; and &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Scipy&amp;lt;/span&amp;gt;, the module &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;essl/5.1&amp;lt;/span&amp;gt; has to be loaded.&lt;br /&gt;
The full python path has to be provided (otherwise the default version is used).&lt;br /&gt;
&lt;br /&gt;
To use Python on BlueGene:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load python/2.7.3&lt;br /&gt;
## Only if you need numpy/scipy:&lt;br /&gt;
module load xlf/14.1 essl/5.1&lt;br /&gt;
runjob --np 1 --ranks-per-node=1 --envs HOME=$HOME LD_LIBRARY_PATH=$LD_LIBRARY_PATH PYTHONPATH=/scinet/bgq/tools/Python/python2.7.3-20131205/lib/python2.7/site-packages/ : /scinet/bgq/tools/Python/python2.7.3-20131205/bin/python2.7 /PATHOFYOURSCRIPT.py &lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core (numbering starts at zero).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--!&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7162</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7162"/>
		<updated>2014-08-19T17:54:00Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Steps ( Job dependency) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/about/our-research-partners/soscip/].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6GHz PowerPC-based CPU (PowerPC A2) with 16GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8,192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system, and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux, which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines (including most of the compilers), you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; to your compile command and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt; to your link command, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
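As an illustration of this convention (a sketch only: the &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt; module is assumed, and the paths below are placeholders standing in for the values that &amp;lt;tt&amp;gt;module load gsl&amp;lt;/tt&amp;gt; would set), a compile-and-link command could be assembled as follows:&lt;br /&gt;

```shell
# On the cluster, these variables would be set by `module load gsl`;
# the values here are placeholders, not the real installation paths.
SCINET_GSL_INC="/path/to/gsl/include"
SCINET_GSL_LIB="/path/to/gsl/lib"
CFLAGS="-O3 -qarch=qp -qtune=qp -I${SCINET_GSL_INC}"
LDFLAGS="-L${SCINET_GSL_LIB} -lgsl -lgslcblas"
# Print the resulting compile/link command for inspection:
echo "mpixlc ${CFLAGS} mycode.c ${LDFLAGS} -o mycode"
```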
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane) this makes 64 nodes (1024 cores) the smallest block size. Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but then they share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which for those familiar with MPI is analogous to mpirun/mpiexec.  Jobs run on a block: a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure MPI jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on each node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7, which can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
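As a sketch of this bookkeeping (the executable name is hypothetical, and the command is echoed rather than executed, since &amp;lt;tt&amp;gt;runjob&amp;lt;/tt&amp;gt; only exists on the BGQ), a hybrid run on a 64-node block works out as:&lt;br /&gt;

```shell
bg_size=64          # nodes requested via bg_size in the loadleveler script
ranks_per_node=4    # hybrid run: 4 MPI ranks per node
omp_threads=16      # 4 ranks x 16 threads = all 64 hardware threads per node
np=$((bg_size * ranks_per_node))   # total MPI processes: 256
echo "runjob --np $np --ranks-per-node=$ranks_per_node" \
     "--envs OMP_NUM_THREADS=$omp_threads --cwd=\$PWD : \$PWD/my_hybrid_code"
```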
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM to 12AM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the resources currently available on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024 or 2048.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP threads per MPI process (for hybrid codes) = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
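These constraints can be verified with a small shell snippet before submission. This is only a sketch based on the rules listed above, not a SciNet-provided tool, and the variable names are made up:&lt;br /&gt;

```shell
#!/bin/sh
# Sanity-check a proposed BGQ job layout against the constraints above.
bg_size=64          # nodes: 64, 128, 256, 512, 1024 or 2048
ranks_per_node=16   # MPI processes per node: 1, 2, 4, 8, 16, 32 or 64
np=1024             # total number of MPI processes
omp_num_threads=4   # OpenMP threads per MPI process

# np must not exceed ranks-per-node * bg_size
if [ "$np" -gt $(( ranks_per_node * bg_size )) ]; then
    echo "error: np exceeds ranks-per-node * bg_size" >&2
    exit 1
fi
# at most 64 hardware threads per node
if [ $(( ranks_per_node * omp_num_threads )) -gt 64 ]; then
    echo "error: more than 64 hardware threads per node requested" >&2
    exit 1
fi
echo "layout OK"
```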
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Steps ( Job dependency) ===&lt;br /&gt;
LoadLeveler has many advanced features for controlling job submission and execution. One of these features is called steps. It allows a series of jobs to be submitted using one script, with dependencies defined between the jobs, so that the jobs run sequentially: each job, called a step, starts only after the previous one has finished. The following example uses the same LoadLeveler script as previously shown, but adds the #@ step_name and #@ dependency directives to rerun the same case three times in a row, each step waiting until the previous one has finished before starting.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ step_name = step1&lt;br /&gt;
# @ queue&lt;br /&gt;
# Launch the first step :&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ step_name = step2&lt;br /&gt;
# @ dependency = step1 == 0&lt;br /&gt;
# @ queue&lt;br /&gt;
# Launch the second step if the first one has returned 0 (done successfully) :&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ step_name = step3&lt;br /&gt;
# @ dependency = step2 == 0&lt;br /&gt;
# @ queue&lt;br /&gt;
# Launch the third step if the second one has returned 0 (done successfully) :&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.&lt;br /&gt;
Users normally only have access to the BGQ through loadleveler, which is appropriate for batch jobs;&lt;br /&gt;
however, an interactive session is typically beneficial when debugging and developing.  A&lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on&lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job (instead of requiring a gdb tool to be attached to each process by hand, as explained in the BGQ Application Development guide linked below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.  Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter sets the number of MPI processes to run.  Most configure scripts expect only one MPI process; thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
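A typical configure session inside a debugjob might then look as follows; the package directory and configure options are purely illustrative assumptions:&lt;br /&gt;

```shell
# Inside a debugjob session: enable implicit launching of compiled
# test programs on the compute nodes, one MPI process each.
export BG_PGM_LAUNCHER=yes
export RUNJOB_NP=1
# Then run the package's configure step as usual, e.g.:
#   cd $HOME/mylibrary-1.0           (illustrative package directory)
#   ./configure CC=mpicc FC=mpif90
echo "BG_PGM_LAUNCHER=$BG_PGM_LAUNCHER RUNJOB_NP=$RUNJOB_NP"
```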
&lt;br /&gt;
&lt;br /&gt;
A debugjob session started in implicit mode (with the -i flag) runs an executable by implicitly calling runjob with 1 MPI task:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
debugjob -i&lt;br /&gt;
**********************************************************&lt;br /&gt;
 Interactive BGQ runjob shell using bgq-fen1-ib0.10295.0 and           &lt;br /&gt;
 LL14040718574824 for 30 minutes with 64 NODES (1024 cores). &lt;br /&gt;
 IMPLICIT MODE: running an executable implicitly calls runjob&lt;br /&gt;
                with 1 mpi task&lt;br /&gt;
 Exit shell when finished.                                &lt;br /&gt;
**********************************************************&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; this is referred to as sub-block jobs. However, this needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner depends on the specific block details provided by loadleveler and on the shape and size of the job being run.&lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16 or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE argument and an array of 16 starting corners ${CORNER[n]}.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e., similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case: 16 sub-blocks of 4 cnodes each (64 in total, i.e., bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 nodes each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if the jobs you need to run are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB or 1 million files, whichever is reached first&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the bgq development and production systems,&lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), except through HPSS, nor is the other file system mounted on the BGQ.&lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption cipher. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying between the BGQ file system and the regular SciNet GPFS file system.&lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information on the home and scratch file systems in a number of ways: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usage of all members of the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: query as another user in your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform&lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of LAPACK may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.3, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Python on BlueGene ==&lt;br /&gt;
Python 2.7.3 has been installed on BlueGene. To use &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;NumPy&amp;lt;/span&amp;gt; and &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;SciPy&amp;lt;/span&amp;gt;, the module &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;essl/5.1&amp;lt;/span&amp;gt; has to be loaded.&lt;br /&gt;
The full Python path has to be provided (otherwise the default version is used).&lt;br /&gt;
&lt;br /&gt;
To use Python on BlueGene:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load python/2.7.3&lt;br /&gt;
##Only if you need numpy/scipy :&lt;br /&gt;
module load xlf/14.1 essl/5.1&lt;br /&gt;
runjob --np 1 --ranks-per-node=1 --envs HOME=$HOME LD_LIBRARY_PATH=$LD_LIBRARY_PATH PYTHONPATH=/scinet/bgq/tools/Python/python2.7.3-20131205/lib/python2.7/site-packages/ : /scinet/bgq/tools/Python/python2.7.3-20131205/bin/python2.7 /PATHOFYOURSCRIPT.py &lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core (all indices are zero-based).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7161</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7161"/>
		<updated>2014-08-19T17:41:22Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Steps ( Job dependency) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/about/our-research-partners/soscip/].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node with a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8,192 cores), with 2 midplanes per rack, i.e. 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D-torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the filesystem.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbours in the ±A, ±B, ±C, ±D, and ±E directions.  As such, there are only a few optimal block sizes that use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines, including most of the compilers, you will have to use the 'modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.  &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc file to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
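&lt;br /&gt;
As an illustration, assuming a module whose short name is HDF5 (substitute the short name of your own module and the library you need), compiling and linking against its library might look like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bgxlc -O3 -qarch=qp -qtune=qp code.c -I${SCINET_HDF5_INC} -L${SCINET_HDF5_LIB} -lhdf5 -o code&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;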
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for Fortran, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and Fortran compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package whose configure script tries to run small test programs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimal job size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure MPI jobs, it is advisable to always give the number of ranks per node, because the default value of 1 may leave 15 cores on each node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # ranges from 1 to 7; this can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of MPI processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also change the total number of MPI processes by a factor of two.&lt;br /&gt;
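&lt;br /&gt;
As an illustration (the executable name &amp;lt;tt&amp;gt;mycode.exe&amp;lt;/tt&amp;gt; is just a placeholder): with bg_size=64 and 4 ranks per node, the total number of MPI processes is 64x4=256, and each rank can use up to 64/4=16 hardware threads as OpenMP threads:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 256 --ranks-per-node=4 --envs OMP_NUM_THREADS=16 --cwd=$SCRATCH/ : $HOME/mycode.exe&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;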
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM to 12AM every day, including weekends.  This block is accessed using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the resources currently available on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler, with a few Blue Gene-specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is specified in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024, or 2048.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP thread per MPI process (for hybrid codes) = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Steps ( Job dependency) ===&lt;br /&gt;
LoadLeveler has many advanced features for controlling job submission and execution. One of these features is called steps: it allows a series of jobs to be submitted using one script, with dependencies defined between them, so that the jobs run sequentially, each job (called a step) waiting for the previous one to finish before it starts. The following example uses the same LoadLeveler script as previously shown; however, the #@ step_name and #@ dependency directives are used to rerun the same case three times in a row, waiting until each job is finished before starting the next.&lt;br /&gt;
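&lt;br /&gt;
A sketch of such a script (illustrative only; adapt bg_size, wall_clock_limit, and the runjob line to your case, and add per-step error/output files as in the earlier example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsteps&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ step_name          = step1&lt;br /&gt;
# @ queue&lt;br /&gt;
# @ step_name          = step2&lt;br /&gt;
# @ dependency         = (step1 == 0)&lt;br /&gt;
# @ queue&lt;br /&gt;
# @ step_name          = step3&lt;br /&gt;
# @ dependency         = (step2 == 0)&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Each step runs the same runjob command&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$SCRATCH/ : $HOME/mycode.exe&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here the condition &amp;lt;tt&amp;gt;(step1 == 0)&amp;lt;/tt&amp;gt; means that step2 only starts once step1 has completed with exit code 0.&lt;br /&gt;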
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users normally only have access to the BGQ through loadleveler, which is appropriate for batch jobs; &lt;br /&gt;
an interactive session, however, is typically beneficial when debugging and developing.   A &lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of requiring you to attach a gdb tool to each process by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;, and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of MPI processes to run.  Most configure scripts expect only one MPI process; thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A &amp;lt;tt&amp;gt;debugjob -i&amp;lt;/tt&amp;gt; session implicitly calls runjob with 1 MPI task whenever an executable is run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
debugjob -i&lt;br /&gt;
**********************************************************&lt;br /&gt;
 Interactive BGQ runjob shell using bgq-fen1-ib0.10295.0 and           &lt;br /&gt;
 LL14040718574824 for 30 minutes with 64 NODES (1024 cores). &lt;br /&gt;
 IMPLICIT MODE: running an executable implicitly calls runjob&lt;br /&gt;
                with 1 mpi task&lt;br /&gt;
 Exit shell when finished.                                &lt;br /&gt;
**********************************************************&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
BGQ allows multiple applications to share the same block (referred to as sub-block jobs); however, this needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE and an array of 16 starting corners in ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e., similar to bg_size).&lt;br /&gt;
&lt;br /&gt;
# In this case: 16 sub-blocks of 4 cnodes each (64 total, i.e., bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500 TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| first of (20 TB ; 1 million files)&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
The GPFS file system of the BGQ is shared between the bgq development and production systems but, with the exception of HPSS, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor is their file system mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information on the home and scratch file systems in a number of ways: for instance, how much disk space is being used by yourself and your group (with the -a option), or how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;); you may also generate plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?| [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform&lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of LAPACK is also available in ESSL.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included: numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.3, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Python on BlueGene ==&lt;br /&gt;
Python 2.7.3 has been installed on BlueGene. To use &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Numpy&amp;lt;/span&amp;gt; and &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Scipy&amp;lt;/span&amp;gt;, the module &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;essl/5.1&amp;lt;/span&amp;gt; has to be loaded.&lt;br /&gt;
The full python path has to be provided (otherwise the default version is used).&lt;br /&gt;
&lt;br /&gt;
To use Python on BlueGene:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load python/2.7.3&lt;br /&gt;
# Only if you need numpy/scipy:&lt;br /&gt;
module load xlf/14.1 essl/5.1&lt;br /&gt;
runjob --np 1 --ranks-per-node=1 --envs HOME=$HOME LD_LIBRARY_PATH=$LD_LIBRARY_PATH PYTHONPATH=/scinet/bgq/tools/Python/python2.7.3-20131205/lib/python2.7/site-packages/ : /scinet/bgq/tools/Python/python2.7.3-20131205/bin/python2.7 /PATHOFYOURSCRIPT.py &lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--!&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7160</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7160"/>
		<updated>2014-08-19T17:40:42Z</updated>

		<summary type="html">&lt;p&gt;Brelier: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/about/our-research-partners/soscip/].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) with 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8,192 cores), with 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes, running a full Red Hat Linux OS, that manage the compute nodes and mount the filesystem.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines, including most of the compilers, you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of its libraries, and the location of its header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
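&lt;br /&gt;
For example (a sketch only; GSL is used for illustration and the file names are placeholders), compiling and linking a code against the GSL module listed above could look like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gsl&lt;br /&gt;
mpixlc mycode.c -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} -lgsl -lgslcblas -o mycode.exe&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;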
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers respectively.&lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
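&lt;br /&gt;
For example (a sketch; the source and executable names are placeholders), an MPI C code could be compiled and linked as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mpixlc -O3 -qarch=qp -qtune=qp mycode.c -o mycode.exe&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;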
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimal job-size configurations, and each block additionally requires a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), this makes 64 nodes (1024 cores) the smallest block size. Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.&lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable to always specify the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; this can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). Using fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
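&lt;br /&gt;
As a concrete sketch (the executable and input file are the placeholders from the example above), a 64-node job using 32 ranks per node would be launched as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# bg_size = 64 nodes, 32 ranks per node, so --np = 64 * 32 = 2048&lt;br /&gt;
runjob --np 2048 --ranks-per-node=32 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;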
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-12AM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30 minute maximum wall_clock_limit. The purpose of this reservation is to ensure short testing jobs are run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the resources currently available on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; prints, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The keyword &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024 and 2048.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP threads per MPI process (for hybrid codes) = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
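&lt;br /&gt;
For example, a hypothetical hybrid job that satisfies all three constraints:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_size         = 64&lt;br /&gt;
ranks-per-node  = 8&lt;br /&gt;
OMP_NUM_THREADS = 8     # 8 * 8 = 64, within the 64 hardware threads per node&lt;br /&gt;
np              = 512   # at most ranks-per-node * bg_size = 8 * 64&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;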
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Steps (Job dependency) ===&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.&lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs,&lt;br /&gt;
but an interactive session is typically more convenient when debugging and developing.  A&lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30 minute session on 64 nodes and, when run on&lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job (instead of attaching a gdb tool to each by hand, as explained in the BGQ Application Development guide linked below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.  Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of mpi processes to run.  Most configure scripts expect only one mpi process, thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
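Putting this together, a configure-style build inside a debugjob session might look like the following sketch (the package directory &amp;lt;tt&amp;gt;mypkg&amp;lt;/tt&amp;gt; is hypothetical; the environment variables are as described above):&lt;br /&gt;

```shell
# Inside a debugjob session on the front-end node.
export BG_PGM_LAUNCHER=yes   # let cross-compiled test binaries launch on the BGQ
export RUNJOB_NP=1           # configure tests usually expect a single MPI process

cd $SCRATCH/mypkg            # hypothetical source tree
./configure CC=mpicc FC=mpif90
make
```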
&lt;br /&gt;
&lt;br /&gt;
A debugjob session started with the &amp;lt;tt&amp;gt;-i&amp;lt;/tt&amp;gt; flag runs in implicit mode, in which running an executable implicitly calls runjob with 1 mpi task:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
debugjob -i&lt;br /&gt;
**********************************************************&lt;br /&gt;
 Interactive BGQ runjob shell using bgq-fen1-ib0.10295.0 and           &lt;br /&gt;
 LL14040718574824 for 30 minutes with 64 NODES (1024 cores). &lt;br /&gt;
 IMPLICIT MODE: running an executable implicitly calls runjob&lt;br /&gt;
                with 1 mpi task&lt;br /&gt;
 Exit shell when finished.                                &lt;br /&gt;
**********************************************************&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
BGQ allows multiple applications to share the same block; this is referred to as sub-block jobs. It needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the sub-block jobs you are trying to run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16 or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE variable and an array of 16 starting corners in $CORNER. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e., similar to bg_size).&lt;br /&gt;
&lt;br /&gt;
# In this case, 16 sub-blocks of 4 cnodes each (64 nodes total, i.e., bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all have to share the same I/O nodes, so for I/O intensive jobs this will be an inefficient setup.  Also consider that if your jobs are small enough to require sub-blocks, it may be more efficient to run them on other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| first of (20 TB ; 1 million files)&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
The GPFS file system of the BGQ is shared between the bgq development and production systems, but, except through HPSS, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor are the other file systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, reports on the home and scratch file systems in a number of ways: how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;, with -de), or plots of your usage over time (with -plot). Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usage of all members of the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: report as another user in your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usage&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.3, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Python on BlueGene ==&lt;br /&gt;
Python 2.7.3 has been installed on BlueGene. To use &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Numpy&amp;lt;/span&amp;gt; and &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Scipy&amp;lt;/span&amp;gt;, the module &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;essl/5.1&amp;lt;/span&amp;gt; has to be loaded.&lt;br /&gt;
The full python path has to be provided (otherwise the default version is used).&lt;br /&gt;
&lt;br /&gt;
To use python on BlueGene :&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load python/2.7.3&lt;br /&gt;
##Only if you need numpy/scipy :&lt;br /&gt;
module load xlf/14.1 essl/5.1&lt;br /&gt;
runjob --np 1 --ranks-per-node=1 --envs HOME=$HOME LD_LIBRARY_PATH=$LD_LIBRARY_PATH PYTHONPATH=/scinet/bgq/tools/Python/python2.7.3-20131205/lib/python2.7/site-packages/ : /scinet/bgq/tools/Python/python2.7.3-20131205/bin/python2.7 /PATHOFYOURSCRIPT.py &lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, 3rd block, 1st node, and second core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--!&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=GPC_Quickstart&amp;diff=7113</id>
		<title>GPC Quickstart</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=GPC_Quickstart&amp;diff=7113"/>
		<updated>2014-07-11T19:16:35Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /*  MPI */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:University_of_Tor_79284gm-a.jpg|center|300px|thumb]]&lt;br /&gt;
|name=General Purpose Cluster (GPC)&lt;br /&gt;
|installed=June 2009&lt;br /&gt;
|operatingsystem= Linux&lt;br /&gt;
|loginnode= gpc01..gpc04 (from &amp;lt;tt&amp;gt;login.scinet&amp;lt;/tt&amp;gt;)&lt;br /&gt;
|nnodes=3864 (30,912 cores)&lt;br /&gt;
|rampernode=16 Gb &lt;br /&gt;
|corespernode=8 (16 threads)&lt;br /&gt;
|interconnect=840 nodes 1:1 DDR, 3024 nodes 5:1 QDR&lt;br /&gt;
|vendorcompilers=icc (C) ifort (fortran) icpc (C++)&lt;br /&gt;
|queuetype=[[Moab | Moab/Torque]]&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
===Specifications===&lt;br /&gt;
The General Purpose Cluster is an extremely large cluster (ranked [http://www.top500.org/list/2009/06/100 16th] in the world at its inception, and fastest in Canada) and is where most computations are to be done at SciNet.  It is an IBM iDataPlex cluster based on Intel's Nehalem architecture (one of the [http://www.hpcwire.com/features/HPC-Vendors-Jump-On-Nehalem-42360237.html first in the world] to make use of the new chips). The GPC consists of 3,864 nodes (IBM iDataPlex DX360M2) with a total of 30,912 cores (Intel Xeon E5540) at 2.53GHz, with 16GB RAM per node (2GB per core). Approximately one quarter of the cluster is interconnected with non-blocking DDR InfiniBand, while the rest of the nodes are connected with 5:1 blocked QDR InfiniBand.  The compute nodes are accessed through a queuing system that allows jobs with a minimum wall time of 15 minutes and a maximum of 48 hours.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Login===&lt;br /&gt;
&lt;br /&gt;
First login via [[Ssh | ssh]] with your SciNet account at &amp;lt;tt&amp;gt;login.scinet.utoronto.ca&amp;lt;/tt&amp;gt;, and from there you can proceed to the Development nodes (&amp;lt;tt&amp;gt;gpc01&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;gpc02&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;gpc03&amp;lt;/tt&amp;gt;, or &amp;lt;tt&amp;gt;gpc04&amp;lt;/tt&amp;gt;) to compile/test your code.&lt;br /&gt;
&lt;br /&gt;
===Compile/Devel Nodes===&lt;br /&gt;
&lt;br /&gt;
From a scinet login node you can ssh to &amp;lt;tt&amp;gt;gpc01&amp;lt;/tt&amp;gt;..&amp;lt;tt&amp;gt;gpc04&amp;lt;/tt&amp;gt;.  You may also just use &amp;lt;tt&amp;gt;/scinet/gpc/bin/gpcdev&amp;lt;/tt&amp;gt; to be taken directly to the dev node with the lowest cpu load (reassessed every 5 minutes).&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
% /scinet/gpc/bin/gpcdev -h&lt;br /&gt;
  Usage: gpcdev [usual ssh options]&lt;br /&gt;
Example: gpcdev -X&lt;br /&gt;
redirected to dev node with lowest cpu load&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These nodes have the same hardware configuration as most of the compute nodes except with more memory -- 8 Nehalem processing cores with 36GB RAM and QDR Infiniband.  You can compile and test your codes on these nodes. To interactively test on more than 8 processors, you can submit an [[GPC_Quickstart#Submitting_an_Interactive_.28Debug.29_Job | interactive job request]].&lt;br /&gt;
&lt;br /&gt;
Your [[Storage_Quickstart | home directory]] is in &amp;lt;tt&amp;gt;$HOME&amp;lt;/tt&amp;gt; (currently &amp;lt;tt&amp;gt;/home/g/group/USER&amp;lt;/tt&amp;gt; but this can change, so it's best to use the environment variable $HOME or &amp;lt;tt&amp;gt;~&amp;lt;/tt&amp;gt; in scripts). You have 10GB there that is backed up. '''Your home directory cannot be written to by the compute nodes!''' Thus, to run jobs, you'll use the &amp;lt;tt&amp;gt;$SCRATCH&amp;lt;/tt&amp;gt; directory (currently &amp;lt;tt&amp;gt;/scratch/g/group/USER&amp;lt;/tt&amp;gt; but again, use the environment variable). Here, there is a large amount of disk space, but it is not backed up. Thus it makes sense to keep your codes in /home, compile there, and then run them in the /scratch directory.&lt;br /&gt;
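A minimal sketch of this compile-in-/home, run-in-/scratch workflow (the source directory and executable names are hypothetical):&lt;br /&gt;

```shell
# Compile in $HOME (backed up), run in $SCRATCH (writable by compute nodes).
cd $HOME/mycode            # hypothetical source directory
module load intel
icc -O3 -xHost -o mycode mycode.c

mkdir -p $SCRATCH/myrun
cp mycode $SCRATCH/myrun/
cd $SCRATCH/myrun          # submit your job from here
```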
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including any of the compilers - you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.  &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI libraries, etc.&lt;br /&gt;
&lt;br /&gt;
Note that to use even the gcc compilers you will have to do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load gcc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
but in fact you probably should use the intel compilers installed on this system as they often produce faster executables (and occasionally, much faster.)&lt;br /&gt;
&lt;br /&gt;
A list of the installed software is available in [[Software_and_Libraries | Software &amp;amp; Libraries]] and can &lt;br /&gt;
be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the intel compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load intel&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload intel&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands should go in your submission scripts to make sure you&lt;br /&gt;
are using the correct packages.  It is possible to load them in your .bashrc files as well, but this is generally not recommended (see [[Important .bashrc guidelines]]), especially if you routinely have to flip back and forth between modules.&lt;br /&gt;
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments; in particular, a queued job that is running is unaffected by you interactively loading a module, and conversely you loading a module at the prompt and then submitting a job does not ensure that the module is loaded when the job runs.  To ensure that a module is loaded when a job runs, be sure to put your &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command in your job submission script.&lt;br /&gt;
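For example, a minimal submission script fragment that guarantees the right modules are loaded when the job runs might look like this sketch (the run directory and executable are hypothetical):&lt;br /&gt;

```shell
#!/bin/bash
# Load modules inside the submission script itself, so the job's
# environment does not depend on your interactive shell.
module purge
module load intel openmpi

cd $SCRATCH/myrun
mpirun ./mycode    # hypothetical executable
```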
&lt;br /&gt;
===Compilers===&lt;br /&gt;
&lt;br /&gt;
The intel compilers are icc/icpc/ifort for C/C++/Fortran, and are available with the default module &amp;quot;intel&amp;quot;.  The intel compilers are recommended over the GNU compilers.  Documentation about icpc is available at &lt;br /&gt;
http://software.intel.com/en-us/articles/intel-software-technical-documentation/.  The Intel compilers accept many of the options that the GNU compilers accept, but tend to produce faster programs on our system.  If, for some reason, you really need the GNU compilers, the latest version of the GNU compiler collection (currently 4.4.0) is available by loading the &amp;quot;gcc&amp;quot; module, with gcc/g++/gfortran for C/C++/Fortran.   Note that f77/g77 is not supported. &lt;br /&gt;
&lt;br /&gt;
To ensure that the intel compilers are in your &amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt; and their libraries are in your &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, use the command&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load intel&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Optimize your code for the GPC machine using at least the following compiler flags: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   -O3 -xHost&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(or &amp;lt;tt&amp;gt;-O3 -march=native&amp;lt;/tt&amp;gt; for the GNU compilers). &lt;br /&gt;
&lt;br /&gt;
*If your program uses openmp, add &amp;lt;tt&amp;gt;-openmp&amp;lt;/tt&amp;gt; (&amp;lt;tt&amp;gt;-fopenmp&amp;lt;/tt&amp;gt; for GNU compilers).&lt;br /&gt;
*If you get the warning &amp;lt;tt&amp;gt;feupdateenv is not implemented&amp;lt;/tt&amp;gt;, add &amp;lt;tt&amp;gt;-limf&amp;lt;/tt&amp;gt; to the link line.&lt;br /&gt;
*If you need to link in the MKL libraries, you are well advised to use the Intel(R) Math Kernel Library Link Line Advisor: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/ for help in devising the list of libraries to link with your code. '''Note that this gives the link line for the command prompt. When using this in Makefiles, replace $MKLPATH by ${MKLPATH}.'''&lt;br /&gt;
*More questions about compiling? See the [[FAQ#Compiling_your_Code|FAQ]].&lt;br /&gt;
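Putting the flags above together, a typical optimized OpenMP compile line might look like the following sketch (the source file name is hypothetical):&lt;br /&gt;

```shell
module load intel
# Optimized OpenMP build; add -limf only if you see feupdateenv warnings.
icc -O3 -xHost -openmp -o mycode mycode.c -limf
```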
&lt;br /&gt;
===Debuggers===&lt;br /&gt;
&lt;br /&gt;
* '''ddt''' - Allinea's graphical parallel debugger, in the &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; module. Highly recommended!&lt;br /&gt;
* '''gdb''' - The GNU Debugger, available in the &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
* '''idbc/idb''' - The intel debuggers, part of the &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt; module(s).&lt;br /&gt;
* '''ddd''' - A graphical front-end to the GNU Debugger, available in the &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt; module.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note that to debug code, you have to give the &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt; flag to the compiler. The intel compiler needs the additional option &amp;lt;tt&amp;gt;-debug parallel&amp;lt;/tt&amp;gt; to debug threaded/OpenMP code.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===[[ GPC_MPI_Versions | MPI]]===&lt;br /&gt;
&lt;br /&gt;
SciNet currently provides multiple MPI libraries for the GPC: [http://www.open-mpi.org/ OpenMPI] and [http://software.intel.com/en-us/intel-mpi-library/ IntelMPI].  We currently recommend OpenMPI as the default, as it quite reliably demonstrates good performance on the infiniband network (and did so too on the ethernet network).  For full details and options see the complete [[ GPC_MPI_Versions | '''MPI''']] section.&lt;br /&gt;
&lt;br /&gt;
The MPI libraries are compiled with both the gnu compiler suite and the intel compiler suite.   To use (for instance) the intel-compiled OpenMPI libraries, which we recommend as the default (and use for most of our examples here), use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load intel openmpi&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
in your job submission scripts and on the command-line before compiling.  Putting these in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; is no longer recommended (since Oct 10, 2013).   &lt;br /&gt;
&lt;br /&gt;
Other combinations behave similarly.&lt;br /&gt;
&lt;br /&gt;
The MPI libraries provide the wrappers mpicc/mpicxx/mpif90/mpif77 around the appropriate compilers, which ensure the appropriate include and library directories are used in the compilation and linking steps.&lt;br /&gt;
&lt;br /&gt;
We currently recommend the Intel + OpenMPI combination.  However, if you require the GNU compilers as well as MPI, you will want to find the most recent openmpi module available with `gcc' in the version name.  This will enable development and runtime with gcc/g++/gfortran and OpenMPI. &lt;br /&gt;
&amp;lt;!-- You can make this your default by putting the module load line in your ~/.bashrc file. --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For mixed OpenMP/MPI code using Intel MPI, add the compilation flag -mt_mpi for full thread-safety (no such flag is necessary for OpenMPI).&lt;br /&gt;
&lt;br /&gt;
===Submitting A Batch Job===&lt;br /&gt;
&lt;br /&gt;
The SciNet machines are shared systems, and jobs that are to run on them are submitted to a queue; the&lt;br /&gt;
[[Moab | scheduler]] then orders the jobs in order to make the best use of the machine, and has them launched &lt;br /&gt;
when resources become available.   The intervention of the scheduler can mean that the jobs aren't&lt;br /&gt;
quite run in a  first-in first-out order.&lt;br /&gt;
&lt;br /&gt;
The scheduler will have the job run on one or more of the compute nodes of the GPC. It is important to realize that '''on compute nodes, your home directory is read-only'''. You have to run your jobs from the $SCRATCH directory instead.  See [https://support.scinet.utoronto.ca/wiki/index.php/Data_Management  Data Management ] for more details on the file systems at SciNet.&lt;br /&gt;
&lt;br /&gt;
The maximum [[wallclock time]] for a job in the queue is 48 hours; computations that will take longer than&lt;br /&gt;
this must be broken into 48-hour chunks and run as several jobs. A minimum job length of 15 minutes is also enforced; shorter jobs&lt;br /&gt;
should be batched together.  The usual way to do this is with [[checkpoints]],&lt;br /&gt;
writing out the complete state of the computation every so often in such a way that a job can be restarted from&lt;br /&gt;
this state information and continue on from where it left off.  Generating [[checkpoints]] is a good idea anyway,&lt;br /&gt;
as in the unlikely event of a hardware failure during your run, it allows you to restart without having lost much work.&lt;br /&gt;
&lt;br /&gt;
There are limits to how many jobs you can submit.  If your group has a default account, you may queue jobs totalling up to 32 nodes at a time, for up to 48 hours per job, on the GPC cluster. This is a limit on the total node-hours, e.g., you could instead request 64 nodes for 24 hours.  Jobs of users with an LRAC or NRAC allocation will run at a higher priority than others while their resources last. Because of the group-based allocation, it is conceivable that your jobs won't run if your colleagues have already exhausted your group's limits.&lt;br /&gt;
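&lt;br /&gt;
As a sketch of the node-hour arithmetic above (assuming the 32-node, 48-hour default limit):&lt;br /&gt;

```shell
# Default-account budget: 32 nodes * 48 hours = 1536 node-hours total.
BUDGET=$((32 * 48))
echo "budget: $BUDGET node-hours"            # prints: budget: 1536 node-hours
# A 64-node request therefore fits only up to 24 hours of walltime:
echo "max walltime for 64 nodes: $((BUDGET / 64)) hours"
```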
&lt;br /&gt;
Note that scheduling big jobs greatly affects the scheduler and other users, so you have to talk to us first to run massively parallel jobs (&amp;gt; 2048 cores). We will help make sure that your jobs start and run efficiently.&lt;br /&gt;
&lt;br /&gt;
If your job should run in fewer than  48 hours, specify that in your script -- your job &lt;br /&gt;
will start sooner.   (It's easier for the [[Moab | scheduler]] to fit in a short job than a long job).  On the downside, the&lt;br /&gt;
job will be killed automatically by the queue manager software at the end of the specified [[wallclock time]], so if you&lt;br /&gt;
guess wrong you might lose some work.  So the standard procedure is to estimate how long your job will take and&lt;br /&gt;
add 10% or so. &lt;br /&gt;
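&lt;br /&gt;
The estimate-plus-10% rule can be sketched as a small calculation (the 500-minute estimate is just an illustrative number):&lt;br /&gt;

```shell
# Pad an estimated runtime (in minutes) by ~10% and format it as a
# walltime string for the #PBS -l line. The estimate is illustrative.
EST_MIN=500
PAD_MIN=$(( EST_MIN + EST_MIN / 10 ))                 # 550 minutes
printf "walltime=%d:%02d:00\n" $(( PAD_MIN / 60 )) $(( PAD_MIN % 60 ))
# prints: walltime=9:10:00
```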
&lt;br /&gt;
You interact with the queuing system through the queue/resource manager, [[Moab | Moab]] and [[Moab | Torque]].  To see all the jobs in the queue use&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ showq&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit your own job, you must write a script which describes the job and how it is to be run (a sample script [[GPC_Quickstart#Submission_Script | follows]]) and submit it to the queue, using the command&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ qsub [SCRIPT-FILE-NAME]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where you will replace &amp;lt;tt&amp;gt;[SCRIPT-FILE-NAME]&amp;lt;/tt&amp;gt; with the file containing the submission script.   This will return a job ID, for example 31415, which is used to identify the job.  Information about a queued job can be found using&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ checkjob [JOB-ID]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and jobs can be canceled with the command&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ canceljob [JOB-ID]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, these commands have many options, which can be read about on their man pages.&lt;br /&gt;
&lt;br /&gt;
Much more information on the queueing system is available on our [[Moab | queue]] page.&lt;br /&gt;
&lt;br /&gt;
====Batch Submission Script: MPI====&lt;br /&gt;
&lt;br /&gt;
A sample submission script is shown below for an mpi job with the &amp;lt;tt&amp;gt; #PBS &amp;lt;/tt&amp;gt; directives at the top and the rest being &lt;br /&gt;
what will be executed on the compute node.  &lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for SciNet GPC &lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=2:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N test&lt;br /&gt;
&lt;br /&gt;
# load modules (must match modules used for compilation)&lt;br /&gt;
module load intel openmpi&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; -np = nodes*ppn&lt;br /&gt;
mpirun -np 16 ./a.out&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The lines that begin &amp;lt;tt&amp;gt;#PBS&amp;lt;/tt&amp;gt; are commands that are parsed and interpreted by qsub at submission time, and control administrative things about your job.   In this example, the script above requests two nodes, using 8 processors per node, for a [[wallclock time]] of one hour.  (The resources required by the job are listed on the &amp;lt;tt&amp;gt;#PBS -l&amp;lt;/tt&amp;gt; line.)   Other options can be given in other &amp;lt;tt&amp;gt;#PBS&amp;lt;/tt&amp;gt; lines, such as &amp;lt;tt&amp;gt;#PBS -N&amp;lt;/tt&amp;gt;, which sets the name of the job.   &lt;br /&gt;
&lt;br /&gt;
The rest of the script is run as a bash script at run time.   A bash shell on the first node of the two nodes that are requested executes these commands as a normal bash script, just as if you had run this as a shell script from the terminal.   The only difference is that PBS sets certain environment variables that you can use in the script.  &amp;lt;tt&amp;gt;$PBS_O_WORKDIR&amp;lt;/tt&amp;gt; is set to be the directory that the command was 'submitted' from - eg,  &amp;lt;tt&amp;gt;$SCRATCH/SOMEDIRECTORY&amp;lt;/tt&amp;gt;.   The script then uses the &amp;lt;tt&amp;gt;mpirun&amp;lt;/tt&amp;gt; command to launch the job. &lt;br /&gt;
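&lt;br /&gt;
Rather than hard-coding &amp;lt;tt&amp;gt;-np 16&amp;lt;/tt&amp;gt;, the process count can be derived inside the script; under Torque, &amp;lt;tt&amp;gt;$PBS_NODEFILE&amp;lt;/tt&amp;gt; lists one line per requested core. A minimal sketch (a 2-node, 8-ppn nodefile is faked here so the snippet runs standalone; the hostnames are placeholders):&lt;br /&gt;

```shell
# Sketch: derive -np from the Torque nodefile instead of hard-coding it.
# $PBS_NODEFILE has one line per requested core; we fake a 2x8 file here.
PBS_NODEFILE=$(mktemp)
for node in node001 node002; do
  for core in 1 2 3 4 5 6 7 8; do echo "$node"; done
done > "$PBS_NODEFILE"
NP=$(( $(wc -l < "$PBS_NODEFILE") ))   # arithmetic strips any padding from wc
echo "would run: mpirun -np $NP ./a.out"   # prints: would run: mpirun -np 16 ./a.out
rm -f "$PBS_NODEFILE"
```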
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
* Note: The different versions of MPI require different commands to launch the run, and thus different scripts. The above script is  specific for the openmpi module.  For the intelmpi module on ethernet, the last line of the script should read&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -np 16 -env I_MPI_FABRICS shm:tcp ./a.out&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
For full MPI details see [[ GPC_MPI_Versions | MPI]]&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Submitting Collections of Serial Jobs====&lt;br /&gt;
&lt;br /&gt;
You cannot run purely serial jobs on the GPC (or any of SciNet's systems), as this would mean only one core out of 8 is used.  If you have serial jobs, you have to bunch them together.&lt;br /&gt;
SciNet-approved methods for running collections of serial jobs can be found on the [[User_Serial|serial run wiki page]].&lt;br /&gt;
&lt;br /&gt;
====Batch Submission Script: OpenMP====&lt;br /&gt;
&lt;br /&gt;
For running OpenMP jobs, the procedure is similar to that for MPI jobs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for SciNet GPC (OpenMP)&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N test&lt;br /&gt;
 &lt;br /&gt;
# load modules (must match modules used for compilation)&lt;br /&gt;
module load intel&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
export OMP_NUM_THREADS=8&lt;br /&gt;
./a.out&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that [[Introduction_To_Performance#Throughput | in some circumstances]] it can be more efficient to run (say) two jobs each running on four threads than one job running on eight threads.   In that case you can use the same `ampersand-and-wait' technique outlined for serial jobs (see [[User_Serial|serial run wiki page]]) for less-than-eight-core OpenMP jobs.&lt;br /&gt;
&lt;br /&gt;
====Hybrid MPI/OpenMP jobs====&lt;br /&gt;
&lt;br /&gt;
'''Using Intel MPI'''&lt;br /&gt;
&lt;br /&gt;
Here is how to run hybrid codes using IntelMPI:&lt;br /&gt;
&lt;br /&gt;
http://software.intel.com/en-us/articles/hybrid-applications-intelmpi-openmp/&lt;br /&gt;
&lt;br /&gt;
Make sure you compile with the -mt_mpi option to the compilers to use the thread safe libraries. &lt;br /&gt;
Set the environment variable I_MPI_PIN_DOMAIN:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export I_MPI_PIN_DOMAIN=omp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This sets the process pinning domain size equal to OMP_NUM_THREADS (which you should set to the desired number of threads per MPI process). Each MPI process can then create OMP_NUM_THREADS child threads running within the corresponding domain. If OMP_NUM_THREADS is not set, each node is treated as a separate domain (which allows as many threads per MPI process as there are cores).&lt;br /&gt;
&lt;br /&gt;
In addition, when invoking mpirun, you should add the argument &amp;quot;-ppn X&amp;quot;, where X is the number of MPI processes per node.&lt;br /&gt;
For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -ppn 2 -np 8 [executable]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
would start 2 MPI processes of &amp;lt;tt&amp;gt;[executable]&amp;lt;/tt&amp;gt; per node for a total of 8 processes, so mpirun will run the MPI processes on 4 nodes&lt;br /&gt;
(OMP_NUM_THREADS is then probably best set at 4).&lt;br /&gt;
Your job script should still ask for these 4 nodes with the line&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
     #PBS -l nodes=4:ppn=8,walltime=....&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
(&amp;lt;tt&amp;gt;ppn=8&amp;lt;/tt&amp;gt; is not a mistake here; the ppn parameter has a different meaning for PBS and for mpirun)&lt;br /&gt;
&lt;br /&gt;
''The ppn parameter to &amp;lt;tt&amp;gt;mpirun&amp;lt;/tt&amp;gt; is very important! Without it, all eight MPI processes would get bunched onto the first node in this example, leaving 3 nodes unused.''&lt;br /&gt;
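&lt;br /&gt;
The bookkeeping for this example can be summarized as a small sanity check (the numbers match the 8-process, 2-per-node layout above):&lt;br /&gt;

```shell
# Hybrid-job bookkeeping: 8 MPI processes, 2 per node (-ppn 2),
# on 8-core GPC nodes.
NP=8; PPN=2; CORES_PER_NODE=8
NODES=$(( NP / PPN ))                  # nodes to request in #PBS -l: 4
THREADS=$(( CORES_PER_NODE / PPN ))    # threads per MPI process: 4
echo "request $NODES nodes; export OMP_NUM_THREADS=$THREADS"
# prints: request 4 nodes; export OMP_NUM_THREADS=4
```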
&lt;br /&gt;
NOTE: In order to pin OpenMP threads inside the domain, use the corresponding OpenMP feature by setting the KMP_AFFINITY environment variable; see [http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/fortran/lin/compiler_f/optaps/common/optaps_openmp_thread_affinity.htm#KMP_AFFINITY_Environment_Variable Intel's Compiler User and Reference Guide].&lt;br /&gt;
&lt;br /&gt;
The IntelMPI manual is referenced on the front page of our wiki:&lt;br /&gt;
&lt;br /&gt;
http://software.intel.com/sites/products/documentation/hpc/mpi/linux/reference_manual.pdf&lt;br /&gt;
&lt;br /&gt;
For the above example of a total of 8 processes on 4 nodes, you could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for SciNet GPC (hybrid job)&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=4:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N test&lt;br /&gt;
&lt;br /&gt;
# load modules (must match modules used for compilation)&lt;br /&gt;
module load intel intelmpi&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# SET THE NUMBER OF THREADS PER PROCESS:&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
&lt;br /&gt;
# PIN THE MPI DOMAINS ACCORDING TO OMP&lt;br /&gt;
export I_MPI_PIN_DOMAIN=omp&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; -np = nodes*ppn&lt;br /&gt;
mpirun -ppn 2 -np 8 ./a.out&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Using Open MPI'''&lt;br /&gt;
&lt;br /&gt;
For mixed MPI/OpenMP jobs using OpenMPI, which is the default for many users, the procedure is similar, but details differ.&lt;br /&gt;
&lt;br /&gt;
* Request the number of nodes in the PBS script.&lt;br /&gt;
* Set OMP_NUM_THREADS to the number of threads per MPI process.&lt;br /&gt;
* In addition to the -np parameter for mpirun, add the argument &amp;lt;tt&amp;gt;--bynode&amp;lt;/tt&amp;gt;, so that the mpi processes are not bunched up.&lt;br /&gt;
&lt;br /&gt;
So for example, to start a total of 8 processes on 4 nodes, you could use the following script&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for SciNet GPC (hybrid job)&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=4:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N test&lt;br /&gt;
&lt;br /&gt;
# load modules (must match modules used for compilation)&lt;br /&gt;
module load intel openmpi&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# SET THE NUMBER OF THREADS PER PROCESS:&lt;br /&gt;
export OMP_NUM_THREADS=4&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; -np = nodes*processes_per_node; --bynode forces a round robin over nodes.&lt;br /&gt;
mpirun -np 8 --bynode ./a.out&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* More questions about running your jobs? See the [[FAQ#Running_your_jobs|FAQs Running your jobs]] and [[FAQ#Errors_in_running_jobs|Errors in running jobs]] sections.&lt;br /&gt;
&lt;br /&gt;
===Submitting an Interactive (Debug) Job===&lt;br /&gt;
&lt;br /&gt;
Your development work flow may require a lot of small test runs.   You are allowed to do these on the development nodes, as long as they are very brief (a few minutes) and do not use all cores on the machine. For anything more you will have to use the compute nodes. &lt;br /&gt;
&lt;br /&gt;
It is sometimes convenient to run a job interactively; this can be very handy for debugging purposes.  In this case, you type a &amp;lt;tt&amp;gt;qsub&amp;lt;/tt&amp;gt; command which submits an interactive job to the queue; when the scheduler selects this job to run, then it starts a shell running on the first node of the job, which connects to your terminal.  You can then type any series of commands (for instance, the same commands listed as in the batch submission script above) to run a job interactively.&lt;br /&gt;
&lt;br /&gt;
For example, to start the same sort of job as in the batch submission script above, but interactively, one would type&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ qsub -I -l nodes=2:ppn=8,walltime=1:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is exactly the &amp;lt;tt&amp;gt;#PBS -l&amp;lt;/tt&amp;gt; line in the batch script above (which requests all 8 processors on each of 2 nodes for one hour), but prepended with a &amp;lt;tt&amp;gt;-I&amp;lt;/tt&amp;gt; for `interactive'.   When this job begins, your terminal will show you as being logged in to one of the compute nodes, and one can type in any shell command, run &amp;lt;tt&amp;gt;mpirun&amp;lt;/tt&amp;gt;, etc.   When you exit the shell, the job will end.  Interactive jobs can be used with any of the [[ Moab#GPC | GPC queues ]]; however, there is a short,&lt;br /&gt;
high-turnover queue called [[ Moab#debug | debug ]] which can be especially useful when the system is busy. &lt;br /&gt;
&lt;br /&gt;
*More questions about running test jobs? See the [[FAQ#Testing_your_Code|FAQ]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
===Ethernet vs. Infiniband===&lt;br /&gt;
&lt;br /&gt;
About 1/4 of the GPC (862 nodes or 6896 cores) is connected with a high bandwidth low-latency fabric called&lt;br /&gt;
[http://en.wikipedia.org/wiki/InfiniBand InfiniBand].  Many jobs which require tight coupling to scale well greatly benefit from this interconnect;&lt;br /&gt;
other types of jobs, which have relatively modest communications, do not require this and run fine on Gigabit ethernet.&lt;br /&gt;
&lt;br /&gt;
Jobs which require the InfiniBand for good performance can request the nodes that have the `&amp;lt;tt&amp;gt;ib&amp;lt;/tt&amp;gt;' feature in the &amp;lt;tt&amp;gt;#PBS -l&amp;lt;/tt&amp;gt; line,&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#PBS -l nodes=2:ib:ppn=8,walltime=1:00:00&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Furthermore, if your mpirun command specifically requests a fabric in its options (eg. ssm), you will have to change those options as well. See [[GPC MPI Versions]].&lt;br /&gt;
&lt;br /&gt;
Because there are a limited number of these nodes, your job will start running faster if you do not request them (e.g. if you use the scripts as shown above), as this increases the number of nodes available to run your job. In fact, the InfiniBand nodes are to be used only for jobs that are known to scale well and  will benefit from this type of interconnect. As such the minimum number of nodes requested has to be at least 2, as single node jobs will not benefit from using an&lt;br /&gt;
Infiniband node. The MPI libraries provided by SciNet automatically correctly use either the InfiniBand or ethernet interconnect depending on which nodes your job runs on.&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===QDR vs. DDR Infiniband===&lt;br /&gt;
&lt;br /&gt;
The GPC InfiniBand network has two sections: one connecting 3,024 nodes with 5:1 blocking (oversubscribed) QDR, and a second connecting 840 nodes with 1:1 non-blocking DDR InfiniBand. By default a user's job will go to whichever network section best accommodates it, typically smaller jobs to the QDR and larger jobs to the DDR. However a user can override this by simply adding the flags &amp;quot;ddr&amp;quot; or &amp;quot;qdr&amp;quot; to the job resource request.&lt;br /&gt;
&lt;br /&gt;
For example, to request two nodes anywhere on the GPC (QDR or DDR), use&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -l nodes=2:ppn=8,walltime=1:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
in your job submission script.&lt;br /&gt;
&lt;br /&gt;
For two nodes using DDR, use&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -l nodes=2:ddr:ppn=8,walltime=1:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To get two nodes using QDR, instead, you would use&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -l nodes=2:qdr:ppn=8,walltime=1:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The queueing system also tries its best to keep jobs within the same switch of the QDR section, thus avoiding the 5:1 blocking. A user can explicitly request this behaviour for jobs of fewer than 30 nodes using&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -W x=nodesetisoptional:false&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===HyperThreading===&lt;br /&gt;
&lt;br /&gt;
Each GPC compute node has 8 Nehalem cores (2 sockets, each with a four-core Intel Xeon E5540 @ 2.53GHz).  Thus, to make full use of the computing power of a GPC node, you must be running at least 8 &amp;quot;tasks&amp;quot; -- MPI processes, or OpenMP threads.&lt;br /&gt;
&lt;br /&gt;
Under most circumstances, running exactly 8 tasks is the most efficient way to use these nodes.  However, sometimes software design (eg, having one thread for communication and one for computation) can usefully `oversubscribe' the number of physical cores, and running (say) twice as many tasks as cores can be a useful strategy.   If your code is highly memory-bandwidth bound, having one task ready to run while another waits for memory access can make more effective use of the processor.&lt;br /&gt;
&lt;br /&gt;
The Nehalem processors have hardware support for such two-way overloading of processors, through &amp;quot;HyperThreading&amp;quot;; there is an extra set of registers on each core to facilitate rapid switching between two tasks, making it look to the operating system as if there are in fact 16 cores per node.   Depending on the nature of your code, making use of these virtual extra cores may speed up or slow down your computation; you should run small test cases before running production jobs in this manner.  In most cases, the speed difference will be under 10%.  Some of our users have obtained an 8% speedup by running gromacs with 16 tasks instead of 8 on a single node (&amp;lt;tt&amp;gt;mpirun -np 16 ./gromacs/mdrun -npme 4&amp;lt;/tt&amp;gt; is 108% the speed of &amp;lt;tt&amp;gt;mpirun -np 8 ./gromacs/mdrun&amp;lt;/tt&amp;gt; with -npme 2 or -1).&lt;br /&gt;
&lt;br /&gt;
====HyperThreading with OpenMP====&lt;br /&gt;
&lt;br /&gt;
To use hyperthreading with an OpenMP job, one just runs twice as many threads as one would have previously; eg, if you were running 8 threads before (&amp;lt;tt&amp;gt;export OMP_NUM_THREADS=8&amp;lt;/tt&amp;gt;) you would run with 16 (&amp;lt;tt&amp;gt;export OMP_NUM_THREADS=16&amp;lt;/tt&amp;gt;).  Everything else remains the same, including the job submission script; one still uses &amp;lt;tt&amp;gt;ppn=8&amp;lt;/tt&amp;gt; in the submission of the job, as Torque has no way of knowing (or reason for caring) that you will be running on 16 `virtual' cores rather than 8 physical cores.&lt;br /&gt;
&lt;br /&gt;
====HyperThreading with MPI====&lt;br /&gt;
&lt;br /&gt;
To use hyperthreading with an MPI job, one just runs twice as many MPI processes as one would have previously; eg, if you were running on three nodes using 8 MPI tasks per node and used &amp;lt;tt&amp;gt;mpirun ... -np 24&amp;lt;/tt&amp;gt;, you could run instead with &amp;lt;tt&amp;gt;-np 48&amp;lt;/tt&amp;gt;.  Everything else remains the same, including the job submission script; one still uses &amp;lt;tt&amp;gt;ppn=8&amp;lt;/tt&amp;gt; in the submission of the job, as Torque has no way of knowing (or reason for caring) that you will be running on 16 `virtual' cores rather than 8 physical cores.&lt;br /&gt;
&lt;br /&gt;
Note that if you are using OpenMPI (as is the default), there is another consideration; OpenMPI assumes that there is no oversubscription and each task very aggressively makes full use of a core when it is waiting for a message (eg, the waits are &amp;quot;busywaits&amp;quot;).  If you find a significant slowdown when running multiple MPI tasks per core with OpenMPI, you may want to try adding the additional option to mpirun: &amp;lt;tt&amp;gt;--mca mpi_yield_when_idle 1&amp;lt;/tt&amp;gt;.  This will increase the latency of individual messages, but free up the core to do additional work while waiting.&lt;br /&gt;
&lt;br /&gt;
With IntelMPI, the problem should be less pronounced, but you can still improve things by using &amp;lt;tt&amp;gt;mpirun -genv I_MPI_SPIN_COUNT 1 ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Examples of hyperthreading with MPI'''&lt;br /&gt;
&lt;br /&gt;
Hyperthreading using gromacs: https://support.scinet.utoronto.ca/wiki/index.php/Gromacs#Hyperthreading_with_Gromacs&lt;br /&gt;
&lt;br /&gt;
====HyperThreading with Hybrid MPI/OpenMP codes====&lt;br /&gt;
&lt;br /&gt;
With a hybrid code, one has extra flexibility in how to assign the &amp;quot;extra&amp;quot; cores -- you could run extra MPI tasks or extra OpenMP threads.  As with all hybrid codes, the combination which results in the best performance depends very strongly on the nature of your code, and you should experiment with different combinations.   In addition, with hybrid codes processor and memory affinity issues become very important; if you're unsure as to how to tune your application for best performance, please make an appointment with the SciNet technical analysts for more help.&lt;br /&gt;
&lt;br /&gt;
===Memory Configuration===&lt;br /&gt;
&lt;br /&gt;
'''16G'''&lt;br /&gt;
&lt;br /&gt;
There are 3756 nodes with 16G of memory; this is the primary configuration in the GPC, and these nodes will be used by default.&lt;br /&gt;
On these nodes, about 2 GB is taken by the operating system. So for mpi runs with 8 processes per node, this leaves about 1.75GB max per mpi process.  Do not try to use more than the available memory: the node will crash and your job will either die or hang until the requested walltime has elapsed.&lt;br /&gt;
&lt;br /&gt;
If you need more memory per process or per thread, you can try to use the limited number of larger-memory nodes listed below, run with fewer MPI processes, or use a different decomposition, such that the job fits on a node.&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&lt;br /&gt;
'''18G'''&lt;br /&gt;
&lt;br /&gt;
There are 24 nodes which have 18G of memory. These nodes have a fully populated memory configuration that maximizes memory bandwidth. Note that also on these nodes, about 2 GB is taken by the operating system. To&lt;br /&gt;
request these nodes use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ qsub -l nodes=2:m18g:ppn=8,walltime=1:00:00 &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''32G'''&lt;br /&gt;
&lt;br /&gt;
There are 84 nodes which have 32G of memory. To request these nodes use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ qsub -l nodes=2:m32g:ppn=8,walltime=1:00:00 &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, also on these nodes, about 2 GB is taken by the operating system, but this is a relatively small amount compared to the total of 32GB.&lt;br /&gt;
&lt;br /&gt;
'''64G/128G/256G'''&lt;br /&gt;
&lt;br /&gt;
There are 72 16-core Intel Sandybridge nodes which have 64G, 2 with 128G, and 2 with 256G of memory available as part of the contributed [[Sandy]] cluster. These nodes are requested through the &amp;lt;tt&amp;gt;sandy&amp;lt;/tt&amp;gt; queue.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ qsub -l nodes=1:m64g:ppn=16,walltime=1:00:00 -q sandy&lt;br /&gt;
$ qsub -l nodes=1:m128g:ppn=16,walltime=1:00:00 -q sandy&lt;br /&gt;
$ qsub -l nodes=1:m256g:ppn=16,walltime=1:00:00 -q sandy&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, also on these nodes, about 2 GB is taken by the operating system, but this is a relatively small amount compared to the total.&lt;br /&gt;
&lt;br /&gt;
'''128G'''&lt;br /&gt;
&lt;br /&gt;
There are two other stand-alone large memory (128GB) nodes which are primarily to be used for data analysis of runs.  They have 16 cores and are intel machines running linux, but they are of a different architecture than the GPC compute nodes, so codes may have to be compiled separately for these machines.  This also means that some modules that work on the other GPC nodes, such as octave, will not work on the 128G nodes.&lt;br /&gt;
&lt;br /&gt;
These nodes can be accessed using a specific &amp;lt;tt&amp;gt;largemem&amp;lt;/tt&amp;gt; queue.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ qsub -l nodes=2:ppn=16,walltime=1:00:00 -q largemem -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note:''' To estimate your time of access to these nodes, use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ showq -w class=largemem&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Ram Disk===&lt;br /&gt;
&lt;br /&gt;
On the GPC nodes, there is a `ram disk' available - up to half of the memory on the node may be used as a temporary file system.  This is particularly useful in the early stages of migrating desktop-computing codes to a High Performance Computing platform such as the GPC.    It is much faster than real disk and does not require network traffic; however, each node sees its own ramdisk and cannot see files on that of other nodes.   This makes it a very easy way to cache writes (by writing them to fast ram disk instead of slow `real' disk); one would then periodically copy the files to /scratch or /project so that they are available after the job has completed.&lt;br /&gt;
&lt;br /&gt;
To use the ramdisk, create, read, and write files in /dev/shm/... just as one would in (eg) $SCRATCH.  Only the amount of RAM needed to store the files will be taken up by the temporary file system; thus if you have 8 serial jobs each requiring 1 GB of RAM, and 1GB is taken up by various OS services, you would still have approximately 7GB available to use as ramdisk on a 16GB node.   However, if you were to write 8 GB of data to the RAM disk, this would exceed available memory and your job would likely crash.&lt;br /&gt;
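&lt;br /&gt;
A rough sizing check along the lines of the example above (a 16 GB node, ~1 GB taken by OS services, and 8 serial jobs of 1 GB each; all figures are the illustrative ones from the text):&lt;br /&gt;

```shell
# Rough ramdisk budget for the example above (all numbers in GB).
TOTAL=16; OS=1; JOBS=8; PER_JOB=1
AVAIL=$(( TOTAL - OS - JOBS * PER_JOB ))
echo "approx. $AVAIL GB left for /dev/shm"   # prints: approx. 7 GB left for /dev/shm
```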
&lt;br /&gt;
NOTE: it is very important to delete your files from ram disk at the end of your job.   If you do not do this, the next user to use that node will have less RAM available than they might expect, and this might kill their jobs.&lt;br /&gt;
&lt;br /&gt;
More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
&lt;br /&gt;
=== Managing jobs on the Queuing system ===&lt;br /&gt;
&lt;br /&gt;
Information on checking available resources, starting, viewing, managing and canceling jobs on [[Moab | Moab/Torque]]. Also check out the [[FAQ#Monitoring_jobs_in_the_queue|FAQ]].&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7108</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=7108"/>
		<updated>2014-07-07T15:01:13Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Batch Jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/about/our-research-partners/soscip/].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) with 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8192 cores), with 2 midplanes per rack, or 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8192-core development system, and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI versions, libraries etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc file to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
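As an illustration, a compile line following this convention might be composed as below; the module short name FFTW and the paths are placeholders (on the real system the environment variables are set by the corresponding module load):

```shell
# Placeholder values; on the BGQ these variables are set by the module system
export SCINET_FFTW_INC=/scinet/bgq/Libraries/fftw/include
export SCINET_FFTW_LIB=/scinet/bgq/Libraries/fftw/lib

# Compose the compile/link line following the -I/-L/-l convention
CMD="mpixlc -O3 -qarch=qp -qtune=qp mycode.c -I${SCINET_FFTW_INC} -L${SCINET_FFTW_LIB} -lfftw3 -o mycode"
echo "$CMD"
```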
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.&lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job-size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane) the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but this results in shared resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.&lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which for those familiar with MPI is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that has already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; this can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than 1 process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). Using fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
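The relation between these quantities can be checked with a little shell arithmetic; the values below are only an example (a 64-node block running a hybrid job with 8 ranks per node):

```shell
# Example: 64 nodes (bg_size), 8 MPI ranks per node,
# leaving 64/8 = 8 hardware threads per rank for OpenMP
BG_SIZE=64
RANKS_PER_NODE=8
export OMP_NUM_THREADS=$((64 / RANKS_PER_NODE))

# Total number of MPI processes, i.e. the --np argument to runjob
NP=$((BG_SIZE * RANKS_PER_NODE))
echo "np=$NP threads=$OMP_NUM_THREADS"   # prints np=512 threads=8
```

These values would then be passed to runjob as --np $NP --ranks-per-node=$RANKS_PER_NODE.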
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14 day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64 node block is reserved on the devel system for development and interactive testing from 8 AM to midnight every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30 minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the resources available on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The keyword &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024 or 2048.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP threads per MPI process (for hybrid codes) = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users normally only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.   A &lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' command currently allows a 30 minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended as it automatically attaches to all the processes of a parallel job, instead of attaching a gdb tool to each by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of mpi processes to run.  Most configure scripts expect only one mpi process, thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In an interactive debugjob session (started with &amp;lt;tt&amp;gt;-i&amp;lt;/tt&amp;gt;), running an executable implicitly calls runjob with 1 MPI task:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
debugjob -i&lt;br /&gt;
**********************************************************&lt;br /&gt;
 Interactive BGQ runjob shell using bgq-fen1-ib0.10295.0 and           &lt;br /&gt;
 LL14040718574824 for 30 minutes with 64 NODES (1024 cores). &lt;br /&gt;
 IMPLICIT MODE: running an executable implicitly calls runjob&lt;br /&gt;
                with 1 mpi task&lt;br /&gt;
 Exit shell when finished.                                &lt;br /&gt;
**********************************************************&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. However, this needs to be done from within the same loadleveler submission script using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job to be run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16 or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which will return the appropriate $SHAPE argument and an array of 16 starting corners ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case 16 sub-blocks of 4 compute nodes each (64 total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. One needs to consider that these sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if the jobs you need to run are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| first of (20 TB ; 1 million files)&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the bgq development and production systems, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc) except through HPSS, nor is the other file system mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), or how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;). You may also generate plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?| [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform&lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.3, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Python on BlueGene ==&lt;br /&gt;
Python 2.7.3 has been installed on BlueGene. To use &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Numpy&amp;lt;/span&amp;gt; and &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Scipy&amp;lt;/span&amp;gt;, the module &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;essl/5.1&amp;lt;/span&amp;gt; has to be loaded.&lt;br /&gt;
The full python path has to be provided (otherwise the default version is used).&lt;br /&gt;
&lt;br /&gt;
To use Python on BlueGene:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load python/2.7.3&lt;br /&gt;
# Only if you need numpy/scipy:&lt;br /&gt;
module load xlf/14.1 essl/5.1&lt;br /&gt;
runjob --np 1 --ranks-per-node=1 --envs HOME=$HOME LD_LIBRARY_PATH=$LD_LIBRARY_PATH PYTHONPATH=/scinet/bgq/tools/Python/python2.7.3-20131205/lib/python2.7/site-packages/ : /scinet/bgq/tools/Python/python2.7.3-20131205/bin/python2.7 /PATHOFYOURSCRIPT.py &lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
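&lt;br /&gt;
As a quick sanity check that numpy is available, you can run a short script along these lines (the script name is illustrative):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;python&amp;quot;&amp;gt;&lt;br /&gt;
# check_numpy.py -- minimal check that numpy imports and works&lt;br /&gt;
import numpy&lt;br /&gt;
a = numpy.arange(10)&lt;br /&gt;
print(a.sum())  # should print 45&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;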
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, the second midplane, node board N03, the first node, and the third core (numbering starts at zero).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--!&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7106</id>
		<title>Oldwiki.scinet.utoronto.ca:System Alerts</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7106"/>
		<updated>2014-07-04T13:03:41Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== System Status==&lt;br /&gt;
&amp;lt;!-- The 'status circles' can be one of the following files: &lt;br /&gt;
     down.png   for down&lt;br /&gt;
     up25.png   for 25% up&lt;br /&gt;
     up50.png   for 50% up&lt;br /&gt;
     up75.png   for 75% up&lt;br /&gt;
     up.png     for 100% up&lt;br /&gt;
 --&amp;gt;&lt;br /&gt;
{| &lt;br /&gt;
|[[File:up.png|up|link=GPC Quickstart]][[GPC Quickstart|GPC]]&lt;br /&gt;
|[[File:up.png|up|link=TCS Quickstart]][[TCS Quickstart|TCS]]&lt;br /&gt;
|[[File:up.png|up|link=Sandy]][[Sandy]]&lt;br /&gt;
|[[File:up.png|up|link=GPU Devel Nodes]][[GPU Devel Nodes|ARC]]&lt;br /&gt;
|[[File:up.png|up]]File System&lt;br /&gt;
|-&lt;br /&gt;
|[[File:up.png|up|link=Gravity]][[Gravity]]&lt;br /&gt;
|[[File:up.png|up|link=P7 Linux Cluster]][[P7 Linux Cluster|P7]]&lt;br /&gt;
|[[File:up.png|up|link=BGQ]][[BGQ]]&lt;br /&gt;
|[[File:up.png|up|link=HPSS]][[HPSS]]&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Mon Jun 30 15:19:39 EDT: All systems down. Some kind of power issue (again). &lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 19:57:29: Compute systems started coming online about 730PM.&lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 18:20:41:  filesystems restarted after some issues. Likely at least 8PM before compute systems available&lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 16:39:35 EDT 2014:    large voltage spike tripped our main circuit breaker.  We have power, though it's out at sites within 2k because of a lightning strike.  Cooling system being restored&lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 15:47:11 EDT 2014:    staff en route to site. Should have update on cause within an hour&lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 15:40:31 EDT 2014:    power lost about 3:20P today. All systems down. Investigating.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note: As a precaution, emails by the Moab/Torque scheduler have been disabled because of a potential security vulnerability since Jan 24th 2014.&lt;br /&gt;
&lt;br /&gt;
Last updated: Fri May 23 12:01:44 EDT 2014&lt;br /&gt;
([[Previous_messages:|Previous messages]])&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7103</id>
		<title>Oldwiki.scinet.utoronto.ca:System Alerts</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=7103"/>
		<updated>2014-07-02T17:21:53Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== System Status==&lt;br /&gt;
&amp;lt;!-- The 'status circles' can be one of the following files: &lt;br /&gt;
     down.png   for down&lt;br /&gt;
     up25.png   for 25% up&lt;br /&gt;
     up50.png   for 50% up&lt;br /&gt;
     up75.png   for 75% up&lt;br /&gt;
     up.png     for 100% up&lt;br /&gt;
 --&amp;gt;&lt;br /&gt;
{| &lt;br /&gt;
|[[File:up.png|up|link=GPC Quickstart]][[GPC Quickstart|GPC]]&lt;br /&gt;
|[[File:up.png|up|link=TCS Quickstart]][[TCS Quickstart|TCS]]&lt;br /&gt;
|[[File:up.png|up|link=Sandy]][[Sandy]]&lt;br /&gt;
|[[File:up.png|up|link=GPU Devel Nodes]][[GPU Devel Nodes|ARC]]&lt;br /&gt;
|[[File:up.png|up]]File System&lt;br /&gt;
|-&lt;br /&gt;
|[[File:up.png|up|link=Gravity]][[Gravity]]&lt;br /&gt;
|[[File:up.png|up|link=P7 Linux Cluster]][[P7 Linux Cluster|P7]]&lt;br /&gt;
|[[File:up.png|up|link=BGQ]][[BGQ]]&lt;br /&gt;
|[[File:up.png|up|link=HPSS]][[HPSS]]&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Wed Jul 02 13:20:00 EDT: Hardware issue on BlueGene/Q: cannot run full-system jobs (2048 nodes)&lt;br /&gt;
&lt;br /&gt;
Mon Jun 30 15:19:39 EDT: All systems down. Some kind of power issue (again). &lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 19:57:29: Compute systems started coming online about 730PM.&lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 18:20:41:  filesystems restarted after some issues. Likely at least 8PM before compute systems available&lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 16:39:35 EDT 2014:    large voltage spike tripped our main circuit breaker.  We have power, though it's out at sites within 2k because of a lightning strike.  Cooling system being restored&lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 15:47:11 EDT 2014:    staff en route to site. Should have update on cause within an hour&lt;br /&gt;
&lt;br /&gt;
Sun Jun 29 15:40:31 EDT 2014:    power lost about 3:20P today. All systems down. Investigating.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Note: As a precaution, emails by the Moab/Torque scheduler have been disabled because of a potential security vulnerability since Jan 24th 2014.&lt;br /&gt;
&lt;br /&gt;
Last updated: Fri May 23 12:01:44 EDT 2014&lt;br /&gt;
([[Previous_messages:|Previous messages]])&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=6998</id>
		<title>Software and Libraries</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=6998"/>
		<updated>2014-05-20T13:46:03Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* GPC Software */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Software Module System =&lt;br /&gt;
All the software listed on this page is accessed using a module system.  This means that much of the software is not &lt;br /&gt;
accessible by default but has to be loaded using the module command. The reasons are that&lt;br /&gt;
* it allows us to easily keep multiple versions of software for different users on the system;&lt;br /&gt;
* it allows users to easily switch between versions.&lt;br /&gt;
The module system works similarly on the GPC and the TCS, although different modules are installed on these two systems.&lt;br /&gt;
&lt;br /&gt;
Note that, generally, if you compile a program with a module loaded, you will have to run it with that same module loaded, to make dynamically linked libraries accessible.&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
!{{Hl2}}|Function&lt;br /&gt;
!{{Hl2}}|Command&lt;br /&gt;
!{{Hl2}}|Comments&lt;br /&gt;
|-&lt;br /&gt;
|List available software packages:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If a module is not listed here, it is not supported.&lt;br /&gt;
*The flag &amp;quot;(default)&amp;quot; is never part of the name.&lt;br /&gt;
|-&lt;br /&gt;
|Use particular software:&lt;br /&gt;
|&amp;lt;pre&amp;gt; $ module load [module-name] &amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If possible, specify only the short name (the part before the &amp;quot;/&amp;quot;). &lt;br /&gt;
*When ambiguous, this loads the default one. &lt;br /&gt;
|-&lt;br /&gt;
|List available versions of a specific software package:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail [short-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|List currently loaded modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module list&amp;lt;/pre&amp;gt;&lt;br /&gt;
|For reproducibility, it is a good idea to put this in your job scripts, so you know exactly which modules (and versions) were used.&lt;br /&gt;
|-&lt;br /&gt;
|Get description of a particular module:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module help [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove a module from your shell:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module unload [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove all modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module purge&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Replace one loaded module with another:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module switch [old-module-name] [new-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
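&lt;br /&gt;
For example, a typical interactive session might look like this (the module name and version are illustrative; check &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; for what is actually installed):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail gcc        # list available gcc versions&lt;br /&gt;
$ module load gcc/4.7.2   # load a specific version&lt;br /&gt;
$ module list             # confirm which modules are loaded&lt;br /&gt;
$ module unload gcc/4.7.2 # remove it again&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;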
Modules that load libraries define environment variables pointing to the location of library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SCINET_[short-module-name]_BASE&lt;br /&gt;
SCINET_[short-module-name]_LIB&lt;br /&gt;
SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[short-module-name]_INC}&amp;lt;/tt&amp;gt; to your compile flags and &amp;lt;tt&amp;gt;-L${SCINET_[short-module-name]_LIB}&amp;lt;/tt&amp;gt; to your link flags, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
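&lt;br /&gt;
For example, to build a program against the GSL library (assuming a &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt; module is installed; the program name and exact compile line are illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc gsl&lt;br /&gt;
gcc -I${SCINET_GSL_INC} myprog.c -L${SCINET_GSL_LIB} -lgsl -lgslcblas -o myprog&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;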
&lt;br /&gt;
Errors in loaded modules can arise for a few reasons, for instance:&lt;br /&gt;
* A module by that name may not exist.&lt;br /&gt;
* Some modules require other modules to have been loaded; if this requirement is not met when you try to load that module, an error message will be printed explaining what module is needed.&lt;br /&gt;
* Some modules cannot be loaded together: an error message will be printed explaining which modules conflict.&lt;br /&gt;
&lt;br /&gt;
It is no longer recommended to load modules in the file [[Important_.bashrc_guidelines|.bashrc]] in your home directory; instead, load them explicitly on the command line and in your job scripts.&lt;br /&gt;
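&lt;br /&gt;
For example, a job script that loads its modules explicitly could start like this (the module and program names are illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# load the same modules the program was compiled with&lt;br /&gt;
module load intel openmpi&lt;br /&gt;
module list    # record the exact versions used, for reproducibility&lt;br /&gt;
mpirun ./myprogram&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;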
&lt;br /&gt;
== Default and non-default modules ==&lt;br /&gt;
&lt;br /&gt;
When you load a module with its 'short' name, you will get the ''default'' version, which is the most recent (usually), recommended version of that library or piece of software.  In general, using the short module name is the way to go. However, you may have code that depends on the intricacies of a non-default version.  For that reason, the most common older versions are also available as modules.  You can find all available modules using the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
== Naming convention ==&lt;br /&gt;
&lt;br /&gt;
For modules that access applications, the full name of a module is as follows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  [short-module-name]/[version-number]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To have all modules conform to this convention, a number of module names changed on Nov 3, 2010:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''old name'''&lt;br /&gt;
| '''new name'''&lt;br /&gt;
| '''remarks'''&lt;br /&gt;
|-&lt;br /&gt;
|autoconf/autoconf-2.64 &amp;amp;nbsp; &amp;amp;nbsp;&amp;amp;nbsp;&lt;br /&gt;
|autoconf/2.64&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.0           &lt;br /&gt;
|cuda/3.0&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.1          &lt;br /&gt;
|cuda/3.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/ddd-3.3.12   &lt;br /&gt;
|ddd/3.3.12&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/gdb-7.1       &lt;br /&gt;
|gdb/7.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|editors/nano/2.2.4      &lt;br /&gt;
|nano/2.2.4&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|emacs/emacs-23.1        &lt;br /&gt;
|emacs/23.1.1&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|gcc/gcc-4.4.0           &lt;br /&gt;
|gcc/4.4.0&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|graphics/ncview         &lt;br /&gt;
|ncview/1.93&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|graphics/graphics       &lt;br /&gt;
|grace/5.1.22&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|                        &lt;br /&gt;
|gnuplot/4.2.6&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|svn/svn165              &lt;br /&gt;
|svn/1.6.5&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|visualization/paraview  &lt;br /&gt;
|paraview/3.8&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|amber10/amber10         &lt;br /&gt;
|amber/10.0.30 &lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|gamess/gamess           &lt;br /&gt;
|gamess/May2209 &amp;amp;nbsp;&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==modulefind - Finding modules by name==&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command will only show you modules whose names start with the argument that you give it, and it will also return modules that you cannot load due to conflicts with already loaded modules.&lt;br /&gt;
&lt;br /&gt;
A little SciNet utility called &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt; (one word) can help here. It will list all installed modules whose names contain the argument, and will determine whether those modules have &lt;br /&gt;
been loaded, could be loaded, cannot be loaded because of conflicts with&lt;br /&gt;
already loaded modules, or have unresolved dependencies &lt;br /&gt;
(i.e. other modules need to be loaded first).  This is especially useful in cases like the &amp;quot;boost&amp;quot; libraries, whose module names are cxxlibraries/boost/1.47.0-gcc and cxxlibraries/boost/1.47.0-intel, for the gcc and intel compilers, respectively.  &amp;lt;tt&amp;gt;modulefind boost&amp;lt;/tt&amp;gt; will find those, whereas &amp;lt;tt&amp;gt;module avail boost&amp;lt;/tt&amp;gt; will not.&lt;br /&gt;
&lt;br /&gt;
Note that just 'modulefind' will list all top-level modules.&lt;br /&gt;
&lt;br /&gt;
== Making your own modules ==&lt;br /&gt;
&lt;br /&gt;
Making your own modules (e.g. for local installations or to access optional perl modules) is possible, and is described on the [[Installing your own modules]] page.&lt;br /&gt;
&lt;br /&gt;
== Deprecated modules ==&lt;br /&gt;
&lt;br /&gt;
Some older software modules for which newer versions exist get deprecated, which means they are no longer maintained.  Since deprecated modules should only be needed in rare exceptional cases, they are not listed by the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.  However, if you have a piece of legacy code that really depends on a deprecated version of a library (and we urge you to check that it does not work with newer versions!), then you can load a deprecated version by &amp;lt;pre&amp;gt;module load use.deprecated [deprecated-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Currently (Oct 5,2010), the following modules are deprecated on the GPC: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/gcc-4.3.2          hdf5/184-v16-serial     intel/intel-v11.1.046               openmpi/1.3.3-intel-v11.0-ofed&lt;br /&gt;
hdf5/183-v16-openmpi   hdf5/184-v18-intelmpi   intelmpi/impi-3.2.1.009             openmpi/1.3.2-intel-v11.0-ofed.orig&lt;br /&gt;
hdf5/183-v18-openmpi   hdf5/184-v18-openmpi    intelmpi/impi-3.2.2.006             pgplot/5.2.2-gcc.old            &lt;br /&gt;
hdf5/184-v16-intelmpi  hdf5/184-v18-serial     intelmpi/impi-4.0.0.013             pgplot/5.2.2-intel.old&lt;br /&gt;
hdf5/184-v16-openmpi   intel/intel-v11.0.081   intelmpi/impi-4.0.0.025               &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the TCS, currently (Oct 5,2010) the only deprecated module is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ncl/5.1.1old&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Before using any of these deprecated modules, make sure that there is not a regular module that satisfies your needs, likely by a ''very similar name''.&lt;br /&gt;
&lt;br /&gt;
== Commercial software ==&lt;br /&gt;
&lt;br /&gt;
Apart from the compilers on our systems and the ddt parallel debugger, we generally do not provide licensed application software, e.g., no Gaussian, IDL, Matlab, etc. &lt;br /&gt;
See the [https://support.scinet.utoronto.ca/wiki/index.php/FAQ#How_can_I_run_Matlab_.2F_IDL_.2F_Gaussian_.2F_my_favourite_commercial_software_at_SciNet.3F FAQ].&lt;br /&gt;
&lt;br /&gt;
== Other software and libraries ==&lt;br /&gt;
&lt;br /&gt;
If you want to use a piece of software or a library that is not on the list, you can in principle install it yourself in your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
Note however that building libraries and software from source often uses a lot of files. To avoid running out of disk space, building software is therefore best done in &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, from which&lt;br /&gt;
you can copy/install only the libraries, header files and binaries to your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
&lt;br /&gt;
If you suspect that a particular piece of software or a library would be of use to other users of SciNet as well, contact us, and we will consider adding it to the system.&lt;br /&gt;
&lt;br /&gt;
== Software lists ==&lt;br /&gt;
=== ARC/GPU Software ===&lt;br /&gt;
&lt;br /&gt;
The CPUs in the GPU nodes of the ARC cluster are of the same kind as those of the GPC, so all modules available on the GPC are also available on the GPU nodes with a CentOS 6 image. This means that the different cuda variants that are available as modules can be loaded on the GPC nodes as well, although they are of little use on that system.&lt;br /&gt;
&lt;br /&gt;
=== GPC Software ===&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Software  &lt;br /&gt;
!{{Hl2}}| Versions&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-  &lt;br /&gt;
|Intel Compiler&lt;br /&gt;
|12.1.3*, 12.1.5, 13.1.0&lt;br /&gt;
| includes MKL library, which includes BLAS, LAPACK, FFT, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;icpc,icc,ifort&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.6.1*, 4.7.0, 4.7.2&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Cuda&lt;br /&gt;
| 3.2, 4.0, 4.1*, 4.2&lt;br /&gt;
| NVIDIA's extension to C for GPGPU programming&lt;br /&gt;
| &amp;lt;tt&amp;gt;nvcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| PGI Compiler&lt;br /&gt;
| 12.5&lt;br /&gt;
| supports OpenACC and CUDA Fortran &lt;br /&gt;
| &amp;lt;tt&amp;gt;pgcc,pgcpp,pgfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| IntelMPI&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| MPICH2 based MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intelmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| OpenMPI&lt;br /&gt;
| 1.4.4*, 1.5.4&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 2.12.2&lt;br /&gt;
| Berkeley Unified Parallel C Implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;upcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Editors'''''&lt;br /&gt;
|- &lt;br /&gt;
| Nano&lt;br /&gt;
| 2.2.4&lt;br /&gt;
| Nano's ANOther editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Emacs&lt;br /&gt;
| 23.1.1&lt;br /&gt;
| New version of popular text editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| XEmacs&lt;br /&gt;
| 21.4.22&lt;br /&gt;
| XEmacs editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Development tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| Autoconf&lt;br /&gt;
| 2.68&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Automake&lt;br /&gt;
| 1.11.2&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;aclocal, automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CMake&lt;br /&gt;
| 2.8.6&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scons&lt;br /&gt;
| 2.0&lt;br /&gt;
| Software construction tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Git&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Revision control system&lt;br /&gt;
| &amp;lt;tt&amp;gt;git,gitk&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;git&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Intel tools&lt;br /&gt;
| 2011&lt;br /&gt;
| Intel Code Analysis Tools&lt;br /&gt;
| Vtune Amplifier XE, Inspector XE&lt;br /&gt;
| &amp;lt;tt&amp;gt;inteltools&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Mercurial&lt;br /&gt;
| 1.8.2&lt;br /&gt;
| Version control system&amp;lt;br&amp;gt;(part of the python module!)&lt;br /&gt;
| &amp;lt;tt&amp;gt;hg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug and performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool, plus the MAP MPI profiler&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt, map&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| DDD&lt;br /&gt;
| 3.3.12&lt;br /&gt;
| Data Display Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GDB&lt;br /&gt;
| 7.3.1&lt;br /&gt;
| GNU debugger (the Intel idbc debugger is available by default)&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| MPE2&lt;br /&gt;
| 2.4.5&lt;br /&gt;
| Multi-Processing Environment with intel + OpenMPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpecc, mpefc, jumpshot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#OpenSpeedShop_.28profiling.2C_MPI_tracing:_GPC.29 | OpenSpeedShop]]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| sampling and MPI tracing&lt;br /&gt;
| &amp;lt;tt&amp;gt;openss, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openspeedshop&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#Scalasca_.28profiling.2C_tracing:_TCS.2C_GPC.29 | Scalasca]]&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications (Compiled with OpenMPI)&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipm-hpc.sourceforge.net IPM]&lt;br /&gt;
| 0.983&lt;br /&gt;
| Integrated Performance Monitoring&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm, ipm_parse, ploticus,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Performance_And_Debugging_Tools:_GPC#Valgrind | Valgrind]]&lt;br /&gt;
| 3.6.1&lt;br /&gt;
| Memory checking utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind,cachegrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Padb&lt;br /&gt;
| 3.2 &lt;br /&gt;
| examine and debug parallel programs&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|&amp;lt;span id=&amp;quot;anchor_viz&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;'''''Visualization tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| Grace&lt;br /&gt;
| 5.1.22&lt;br /&gt;
| Plotting utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;xmgrace&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;grace&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Gnuplot&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Plotting utility&amp;lt;br&amp;gt;Requires 'extras' module if used on compute nodes.&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[ Using_Paraview | ParaView ]]&lt;br /&gt;
| 3.12.0&lt;br /&gt;
| Scientific visualization, server only&lt;br /&gt;
| &amp;lt;tt&amp;gt;pvserver,pvbatch,pvpython&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;paraview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| VMD&lt;br /&gt;
| 1.9&lt;br /&gt;
| Visualization and analysis utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCL/NCARG&lt;br /&gt;
| 6.0.0&lt;br /&gt;
| NCARG graphics and ncl utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| ROOT&lt;br /&gt;
| 5.30.00&lt;br /&gt;
| ROOT Analysis Framework from CERN&lt;br /&gt;
| &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ROOT&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
| ImageMagick&lt;br /&gt;
| 6.6.7&lt;br /&gt;
| Image manipulation tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;convert,animate,composite,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ImageMagick&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PGPLOT&lt;br /&gt;
| 5.2.2&lt;br /&gt;
| Graphics subroutine library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libcpgplot,libpgplot,libtkpgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ncview&lt;br /&gt;
| 2.1.1&lt;br /&gt;
| Visualization for NetCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted, etc.&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CDO&lt;br /&gt;
| 1.5.1&lt;br /&gt;
| Climate Data Operators&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| UDUNITS&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc,hdiff,...,libdf,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Hdf5 | HDF5]]&lt;br /&gt;
| 1.8.7-v18*&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.arg0.net/encfs EncFS ]&lt;br /&gt;
| 1.74&lt;br /&gt;
| EncFS provides an encrypted filesystem in user space (works ONLY on gpc01..04)&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|- &lt;br /&gt;
| [[amber|AMBER 10]]&lt;br /&gt;
| Amber 10 + Amber Tools 1.3&lt;br /&gt;
| Amber Molecular Dynamics Package&lt;br /&gt;
| &amp;lt;tt&amp;gt;sander, sander.MPI&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;amber&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gamess|GAMESS (US)]]&lt;br /&gt;
| August 18, 2011 R1&lt;br /&gt;
| General Atomic and Molecular Electronic Structure System&lt;br /&gt;
| &amp;lt;tt&amp;gt;rungms&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gamess&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gromacs|GROMACS]]&lt;br /&gt;
| 4.5.5, 4.5.7, 4.6.2&lt;br /&gt;
| GROMACS molecular mechanics, single precision, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;grompp, mdrun&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gromacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[namd|NAMD]]&lt;br /&gt;
| 2.8&lt;br /&gt;
| NAMD - Scalable Molecular Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;namdmpiexec, namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[nwchem|NWChem]]&lt;br /&gt;
| 6.0&lt;br /&gt;
| NWChem Quantum Chemistry&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 4.3.2, 5.0.3&lt;br /&gt;
| Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;pw.x, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://blast.ncbi.nlm.nih.gov BLAST]&lt;br /&gt;
| 2.2.23+&lt;br /&gt;
| Basic Local Alignment Search Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;blastn,blastp,blastx,psiblast,tblastn...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;blast&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://denovoassembler.sourceforge.net RAY]&lt;br /&gt;
| 2.1.0 (small k-mer)&lt;br /&gt;
| Parallel de novo genome assemblies&lt;br /&gt;
| &amp;lt;tt&amp;gt;Ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[cpmd|CPMD]]&lt;br /&gt;
| 3.13.2&lt;br /&gt;
| Car-Parrinello molecular dynamics, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd.x&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[R Statistical Package|R]] &lt;br /&gt;
| 2.13.1&lt;br /&gt;
| statistical computing&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Octave&lt;br /&gt;
| 3.4.3&lt;br /&gt;
| Matlab-like environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.mcs.anl.gov/petsc/petsc-as/  PETSc ]&lt;br /&gt;
| 3.1*&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation (PETSc)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc, etc.&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Armadillo C++ linear algebra library | Armadillo]]&lt;br /&gt;
| 3.910.0&lt;br /&gt;
| C++ armadillo libraries (implement Matlab-like syntax)&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.tacc.utexas.edu/tacc-projects/gotoblas2/ GotoBLAS]&lt;br /&gt;
| 1.13&lt;br /&gt;
| Optimized BLAS implementation &lt;br /&gt;
| &amp;lt;tt&amp;gt;libgoto2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gotoblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13*, 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.3&lt;br /&gt;
| fast Fourier transform library&lt;br /&gt;
''Be careful when combining fftw3 and MKL: you need to link fftw3 first, with'' &amp;lt;tt&amp;gt;-L${SCINET_FFTW_LIB} -lfftw3&amp;lt;/tt&amp;gt;, ''then link MKL.''&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| LAPACK&lt;br /&gt;
| &lt;br /&gt;
| Provided by the Intel MKL library&lt;br /&gt;
| See http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://freshmeat.net/projects/rlog  RLog ]&lt;br /&gt;
| 1.4&lt;br /&gt;
| RLog provides a flexible message logging facility for C++ programs and libraries.&lt;br /&gt;
| &amp;lt;tt&amp;gt;librlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/rlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[GNU Parallel]]&lt;br /&gt;
| 2012-10-22&lt;br /&gt;
| execute commands in parallel&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.3.4&lt;br /&gt;
| Python programming language. Included modules: numpy 1.8.0, scipy 0.13.3, matplotlib 1.3.1, ipython 1.2.1, cython 0.20.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ruby&lt;br /&gt;
| 1.9.1&lt;br /&gt;
| Ruby programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 1.6.0&lt;br /&gt;
| IBM's Java JRE and SDK&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| Xlibraries&lt;br /&gt;
|&lt;br /&gt;
| A collection of X graphics libraries and tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;xterm&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xpdf&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;Xlibraries&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Extras&lt;br /&gt;
|&lt;br /&gt;
| A collection of standard linux and home-grown tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;bc&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;screen&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xxdiff&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;ish&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
&lt;br /&gt;
=== TCS Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM compilers&lt;br /&gt;
|10.1(c/c++)&amp;lt;br&amp;gt;12.1(fortran)&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlf,xlc_r,xlC_r,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''standard available''&lt;br /&gt;
|-&lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpcc,mpCC,mpxlf,mpcc_r,mpCC_r,mpxlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''standard available''&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 1.2&lt;br /&gt;
| Unified Parallel C&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlupc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|13.1, 14.1&lt;br /&gt;
| newer version &lt;br /&gt;
| &amp;lt;tt&amp;gt;xlf,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| xlf/13.1&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|11.1, 12.1&lt;br /&gt;
| new versions&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| vacpp&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| MPE2&lt;br /&gt;
| 1.0.6&lt;br /&gt;
| Performance Visualization for Parallel Programs   &lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scalasca&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.5&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc, hdiff, ..., libdf, libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF + ncview&lt;br /&gt;
| 4.0.1*&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf, ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.1.1*&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 3.9.6*&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted, &amp;lt;/tt&amp;gt; etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Fast Fourier transform library&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi,libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK+SCALAPACK&lt;br /&gt;
| 3.4.2+2.0.2&lt;br /&gt;
| Linear algebra package. Note that ESSL, which comes with the IBM compilers, contains a large part of LAPACK as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack,libscalapack,libblacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PetSc&lt;br /&gt;
| 3.2&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation. With external packages mumps, chaco, hypre, parmetis, prometheus, plapack, superlu, sprng.&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of libraries to your user environment&amp;lt;br&amp;gt; compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi, libfftw3, libhdf5, liblapack, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gmake&lt;br /&gt;
| 3.82&lt;br /&gt;
| GNU's make. Replaces AIX make or gmake 3.80.&lt;br /&gt;
| &amp;lt;tt&amp;gt;make&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCL&lt;br /&gt;
| 5.1.1&lt;br /&gt;
| NCAR Command Language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl, libncl, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
&lt;br /&gt;
=== P7 Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1, 13.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlf,xlf_r,xlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1, 11.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.23.2&lt;br /&gt;
| &lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|5.2.2&lt;br /&gt;
|IBM's Parallel Environment&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpcc,mpCC,mpfort,mpiexec&amp;lt;/tt&amp;gt;&lt;br /&gt;
|pe&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.6.1 , 4.8.1&lt;br /&gt;
| GNU Compiler Collection&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 7.0&lt;br /&gt;
| IBM Java 1.7 implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;jdk&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.7&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap2, ncatted, &amp;lt;/tt&amp;gt; etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.5&lt;br /&gt;
| Python programming language. Included modules: numpy-1.8.0, scipy-0.13.2, matplotlib-1.3.1, pyfits-3.2, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| command-driven interactive function and data plotting program&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| udunits&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of applications and libraries to your user environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;bindlaunch, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=Manuals=&lt;br /&gt;
{{:Manuals}}&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6957</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6957"/>
		<updated>2014-04-07T23:03:55Z</updated>

		<summary type="html">&lt;p&gt;Brelier: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes= 2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node with a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8,192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the filesystem.  SciNet has two BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can log in to them from the regular '''login.scinet.utoronto.ca''' login nodes, or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines (including most of the compilers), you will have to use the '''module''' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package; &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version.  This makes it easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc file to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries also define environment variables pointing to the locations of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt; to your compile and link commands, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
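&lt;br /&gt;
As a concrete sketch, assuming the GSL module follows this convention (so that SCINET_GSL_INC and SCINET_GSL_LIB are the resulting variable names; the source file name is made up):&lt;br /&gt;

```shell
# Hypothetical example of linking against a module-provided library (GSL),
# using the SCINET_*_INC / SCINET_*_LIB naming convention described above.
module load gsl mpich2
mpixlc -O3 -qarch=qp -qtune=qp mycode.c \
    -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} -lgsl -lgslcblas -o mycode
```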
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses the IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for Fortran, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries, although on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL naming conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and Fortran compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
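&lt;br /&gt;
For example, to cross-compile a hypothetical MPI Fortran code with these flags (the file name is illustrative):&lt;br /&gt;

```shell
# Cross-compile an MPI Fortran program for the BGQ compute nodes.
# diffusion.f90 is an illustrative source file, not a provided example.
module load mpich2
mpixlf90 -O3 -qarch=qp -qtune=qp diffusion.f90 -o diffusion
```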
&lt;br /&gt;
If you want to build a package whose configure script tries to run small test programs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown in the network topology overview above, there are only a few optimal job sizes, further constrained by each block requiring at least one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, so that the job gets fully dedicated resources.  Smaller jobs can be run within the same block, but then they share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which for those familiar with MPI is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set up for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID were R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;.)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a verbosity level from 1 to 7; this can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (at 64 ranks per node, each rank has only 256MB of memory).&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
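&lt;br /&gt;
As a sketch of this arithmetic (the shell variable names are illustrative, not loadleveler keywords):&lt;br /&gt;

```shell
#!/bin/bash
# Illustrative arithmetic for a hybrid run on 64 nodes:
# 32 MPI ranks per node x 2 OpenMP threads per rank = 64 hardware threads/node.
BG_SIZE=64                          # nodes, matching bg_size in the script
RANKS_PER_NODE=32
NP=$((BG_SIZE * RANKS_PER_NODE))    # total number of MPI ranks
echo "$NP"                          # prints 2048
# runjob --np $NP --ranks-per-node=$RANKS_PER_NODE --envs OMP_NUM_THREADS=2 ...
```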
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM to midnight every day, including weekends.  This block is accessed using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The parameter &amp;quot;bg_size&amp;quot; is given in nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024 and 2048.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP threads per MPI process (for hybrid codes) = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
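&lt;br /&gt;
The constraints above can be checked with a small shell function before submitting (a sketch; check_job is a hypothetical helper, not a system command):&lt;br /&gt;

```shell
#!/bin/bash
# Sketch: verify the job-size constraints listed above before submitting.
# check_job is a hypothetical helper, not part of loadleveler.
check_job() {
  local bg_size=$1 rpn=$2 np=$3 omp=$4
  # ranks-per-node must be a power of two between 1 and 64
  case $rpn in
    1|2|4|8|16|32|64) ;;
    *) echo "invalid"; return ;;
  esac
  # np <= rpn * bg_size, rpn <= np, and rpn * OMP_NUM_THREADS <= 64
  if [ "$np" -gt $((rpn * bg_size)) ] || [ "$rpn" -gt "$np" ] \
     || [ $((rpn * omp)) -gt 64 ]; then
    echo "invalid"
  else
    echo "ok"
  fi
}
check_job 64 16 1024 4    # prints "ok": 1024 <= 16*64 and 16*4 <= 64
check_job 64 16 2048 1    # prints "invalid": np exceeds ranks-per-node * bg_size
```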
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users, however, only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
whereas an interactive session is typically beneficial when debugging and developing.   A script &lt;br /&gt;
has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job (instead of attaching a gdb tool by hand, as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter variable sets the number of mpi processes to run.  Most configure scripts expect only one mpi process; thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A debugjob session started with the &amp;lt;tt&amp;gt;-i&amp;lt;/tt&amp;gt; flag implicitly calls runjob with 1 mpi task whenever you run an executable:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
debugjob -i&lt;br /&gt;
**********************************************************&lt;br /&gt;
 Interactive BGQ runjob shell using bgq-fen1-ib0.10295.0 and           &lt;br /&gt;
 LL14040718574824 for 30 minutes with 64 NODES (1024 cores). &lt;br /&gt;
 IMPLICIT MODE: running an executable implicitly calls runjob&lt;br /&gt;
                with 1 mpi task&lt;br /&gt;
 Exit shell when finished.                                &lt;br /&gt;
**********************************************************&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs.  This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE argument and an array of 16 starting corners ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Using the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case, 16 sub-blocks of 4 cnodes each (64 in total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 nodes each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| first of (20 TB ; 1 million files)&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
The GPFS file system of the BGQ is shared between the BGQ development and production systems, but, except for HPSS, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor is the file system of those systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1 you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information on the home and scratch file systems in a number of ways: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;, with -de), or plots of your usage over time (with -plot). Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.3, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Python on BlueGene ==&lt;br /&gt;
Python 2.7.3 has been installed on BlueGene. To use &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Numpy&amp;lt;/span&amp;gt; and &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Scipy&amp;lt;/span&amp;gt;, the module &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;essl/5.1&amp;lt;/span&amp;gt; has to be loaded.&lt;br /&gt;
The full python path has to be provided (otherwise the default version is used).&lt;br /&gt;
&lt;br /&gt;
To use python on BlueGene:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load python/2.7.3&lt;br /&gt;
##Only if you need numpy/scipy :&lt;br /&gt;
module load xlf/14.1 essl/5.1&lt;br /&gt;
runjob --np 1 --ranks-per-node=1 --envs HOME=$HOME LD_LIBRARY_PATH=$LD_LIBRARY_PATH PYTHONPATH=/scinet/bgq/tools/Python/python2.7.3-20131205/lib/python2.7/site-packages/ : /scinet/bgq/tools/Python/python2.7.3-20131205/bin/python2.7 /PATHOFYOURSCRIPT.py &lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core (the numbering is zero-based).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6925</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6925"/>
		<updated>2014-03-28T20:27:32Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Software modules installed on the BGQ */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node with a 16-core 1.6GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8192 cores), with 2 midplanes per rack, or 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK, and are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
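The rack-level numbers follow directly from the hardware hierarchy just described; as a quick sanity check, the arithmetic can be sketched in plain shell (no BGQ access required):

```shell
# BGQ hardware hierarchy arithmetic, using the counts from the text above
cores_per_node=16
nodes_per_board=32
boards_per_midplane=16
midplanes_per_rack=2

cores_per_board=$((nodes_per_board * cores_per_node))          # 512
cores_per_midplane=$((boards_per_midplane * cores_per_board))  # 8192
cores_per_rack=$((midplanes_per_rack * cores_per_midplane))    # 16384

# 16 GB of RAM per node gives 16 TB per rack
ram_per_rack_gb=$((nodes_per_board * boards_per_midplane * midplanes_per_rack * 16))

echo "cores/rack=$cores_per_rack ram/rack=${ram_per_rack_gb}GB"
```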
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
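Each torus shape in the table multiplies out to its compute-node count; a small sketch verifying this (the helper function is hypothetical, not a SciNet tool):

```shell
# Verify that a torus shape AxBxCxDxE multiplies out to the node count.
# $1 = expected compute-node count; remaining args = torus dimensions
check_shape() {
  expected=$1; shift
  product=1
  for d in "$@"; do product=$((product * d)); done
  [ "$product" -eq "$expected" ] || echo "mismatch for $expected"
}
check_shape 32    2 2 2 2 2   # 1 node board
check_shape 512   4 4 4 4 2   # 1 midplane
check_shape 1024  4 4 4 8 2   # 1 rack
check_shape 2048  4 4 8 8 2   # 2 racks
```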
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can log in to them from the regular '''login.scinet.utoronto.ca''' login nodes, or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI versions, libraries etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the intel compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
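As a sketch of this convention, assume a hypothetical module with short name FFTW that exports these variables (the base path below is made up for illustration; use whatever the actual module sets):

```shell
# Hypothetical values, as if a 'module load' had exported them
SCINET_FFTW_BASE=/scinet/bgq/Libraries/fftw   # made-up path, for illustration only
SCINET_FFTW_INC=$SCINET_FFTW_BASE/include
SCINET_FFTW_LIB=$SCINET_FFTW_BASE/lib

# Assemble the compile and link flags as described in the text
CFLAGS="-I$SCINET_FFTW_INC"
LDFLAGS="-L$SCINET_FFTW_LIB -lfftw3"

echo "$CFLAGS $LDFLAGS"
```

In a Makefile you would reference the same variables with `-I${SCINET_FFTW_INC}` and `-L${SCINET_FFTW_LIB}` on the compile and link lines.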
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job-size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane) this makes 64 nodes (1024 cores) the smallest block size. Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but this results in shared resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # ranges from 1 to 7; this can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). Using fewer ranks-per-node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
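The trade-off above is easy to quantify: with 16 GB per node, the memory per rank shrinks as ranks-per-node grows, while ranks-per-node times the number of OpenMP threads should equal 64 to keep all hardware threads busy. A small sketch of that arithmetic:

```shell
node_mem_mb=$((16 * 1024))   # 16 GB of RAM per node
hw_threads=64                # 16 cores x 4 hardware threads

# For each allowed ranks-per-node value, show memory per rank and the
# OMP_NUM_THREADS that fills all hardware threads in a hybrid run
for rpn in 1 2 4 8 16 32 64; do
  mem_per_rank=$((node_mem_mb / rpn))
  omp_threads=$((hw_threads / rpn))
  echo "ranks-per-node=$rpn  mem/rank=${mem_per_rank}MB  OMP_NUM_THREADS=$omp_threads"
done
```

At ranks-per-node=64 this gives 256 MB per rank, matching the memory limit mentioned above.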
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-12AM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024 and 2048.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP thread per MPI process (for hybrid codes) = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
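These constraints can be checked before submitting; a minimal sketch, using the same values as the example batch script below:

```shell
# Sanity-check runjob/loadleveler parameters against the rules above
bg_size=64          # nodes; must be 64, 128, 256, 512, 1024 or 2048
ranks_per_node=16   # must be 1, 2, 4, 8, 16, 32 or 64
np=1024             # total number of MPI processes
omp_threads=1       # OMP_NUM_THREADS per MPI process

ok=1
# np must not exceed ranks-per-node * bg_size
if [ "$np" -gt $((ranks_per_node * bg_size)) ]; then ok=0; fi
# ranks-per-node must not exceed np
if [ "$ranks_per_node" -gt "$np" ]; then ok=0; fi
# ranks-per-node * OMP_NUM_THREADS must not exceed 64 hardware threads
if [ $((ranks_per_node * omp_threads)) -gt 64 ]; then ok=0; fi
echo "parameters valid: $ok"
```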
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users, however, only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.   A &lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job (instead of attaching a gdb tool to each process by hand, as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of mpi processes to run.  Most configure scripts expect only one mpi process, thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. However, this needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which will return the appropriate $SHAPE argument and an array, $CORNER, of 16 starting corners. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case 16 sub-blocks of 4 cnodes each (64 total ie bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Qs. These sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also, if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input or output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| first of (20 TB ; 1 million files)&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the BGQ development and production systems, &lt;br /&gt;
the file system is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), with the exception of HPSS; nor is the other file system mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption cipher. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This cipher is only recommended for copying between the BGQ file system and the regular SciNet GPFS file system. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, reports information about the home and scratch file systems in a number of ways: how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;, with -de), and plots of your usage over time (with -plot). Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Included modules: numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.3, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Python on BlueGene ==&lt;br /&gt;
Python 2.7.3 has been installed on BlueGene. To use &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Numpy&amp;lt;/span&amp;gt; and &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Scipy&amp;lt;/span&amp;gt;, the module &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;essl/5.1&amp;lt;/span&amp;gt; has to be loaded.&lt;br /&gt;
The full python path has to be provided (otherwise the default version is used).&lt;br /&gt;
&lt;br /&gt;
To use python on the BlueGene:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load python/2.7.3&lt;br /&gt;
##Only if you need numpy/scipy :&lt;br /&gt;
module load xlf/14.1 essl/5.1&lt;br /&gt;
runjob --np 1 --ranks-per-node=1 --envs HOME=$HOME LD_LIBRARY_PATH=$LD_LIBRARY_PATH PYTHONPATH=/scinet/bgq/tools/Python/python2.7.3-20131205/lib/python2.7/site-packages/ : /scinet/bgq/tools/Python/python2.7.3-20131205/bin/python2.7 /PATHOFYOURSCRIPT.py &lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core (numbering starts at zero).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6924</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6924"/>
		<updated>2014-03-28T13:29:49Z</updated>

		<summary type="html">&lt;p&gt;Brelier: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6GHz PowerPC-based CPU (PowerPC A2) with 16GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8192 cores), with 2 midplanes per rack, or 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines (including most of the compilers), you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; to the compile flags and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt; to the link flags, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
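As a sketch of how these variables are typically used in practice (the gsl module from the table above is taken as an example; the paths shown are illustrative placeholders, not the actual install locations on the system):&lt;br /&gt;

```shell
# Placeholder values mimicking what a "module load gsl" might set; the
# real paths on the system will differ.
SCINET_GSL_BASE=/scinet/bgq/Libraries/gsl-1.15
SCINET_GSL_LIB=${SCINET_GSL_BASE}/lib
SCINET_GSL_INC=${SCINET_GSL_BASE}/include
# A Makefile or compile line would then reference them like this:
echo "mpixlc prog.c -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} -lgsl -lgslcblas -o prog"
```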
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
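For instance, a user who always needs the XL compilers and MPI might put the following in &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; (a minimal sketch; the module names are those from the table above, so adjust them to your own needs):&lt;br /&gt;

```shell
# Load the usual BGQ development modules at every login.
module load xlf vacpp mpich2
```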
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e., '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package whose configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown in the network topology overview above, there are only a few optimal job size configurations, further constrained by the requirement that each block have at least one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, giving the job fully dedicated resources.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure MPI jobs, it is advisable to always specify the number of ranks per node, because the default value of 1 may leave 15 cores on each node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # ranges from 1 to 7; higher values give more output, which can be helpful in debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
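A quick shell computation illustrates the relation (a sketch only; the numbers match the 64-node example above):&lt;br /&gt;

```shell
# 64 nodes (bg_size) with 16 ranks per node gives --np 1024.
BG_SIZE=64
RANKS_PER_NODE=16
NP=$(( BG_SIZE * RANKS_PER_NODE ))
echo "$NP"   # 1024
echo "runjob --np ${NP} --ranks-per-node=${RANKS_PER_NODE} --cwd=\$PWD : \$PWD/code"
```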
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8 AM to midnight every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the resources available on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; reports, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is specified in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024, or 2048.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP threads per MPI process (for hybrid codes) = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
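The script above launches a pure-MPI job. For a hybrid job, the same constraints apply; as a quick sketch (the counts below are illustrative values, not a recommendation):

```shell
# Hypothetical hybrid job on 64 nodes: 4 MPI ranks per node,
# 16 OpenMP threads per rank (4 * 16 = 64 hardware threads per node).
bg_size=64
ranks_per_node=4
omp_num_threads=16
np=$((bg_size * ranks_per_node))                       # --np value: 256
threads_per_node=$((ranks_per_node * omp_num_threads))
# The constraint (ranks-per-node * OMP_NUM_THREADS) <= 64 must hold:
[ "$threads_per_node" -le 64 ] && echo "np=$np threads_per_node=$threads_per_node"
```

The corresponding launch line would then be `runjob --np 256 --ranks-per-node=4 --envs OMP_NUM_THREADS=16 ...`, with the rest of the runjob arguments as in the script above.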
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users, however, only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.   As such, a &lt;br /&gt;
script has been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job (instead of requiring you to attach a gdb tool to each process by hand, as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;, and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of MPI processes to run.  Most configure scripts expect only one MPI process; thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block, which is referred to as sub-block jobs; however, this needs to be done from within the same loadleveler submission script using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which will return the appropriate $SHAPE argument and an array of 16 starting corners, ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Using subblocks script to set $SHAPE and array of ${CORNERS[n]}&lt;br /&gt;
# with the size of the sub-blocks in nodes (i.e., similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case, 16 sub-blocks of 4 cnodes each (64 total, i.e., bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. Consider that these sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if the jobs you need to run are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500 TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB or 1 million files, whichever comes first&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the BGQ development and production systems, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), with the exception of HPSS; nor is the other SciNet file system mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other. For example, from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption cipher. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85 MB/s).  This cipher is only recommended for copying between the BGQ file system and the regular SciNet GPFS file system. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), or how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;). You may also generate plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalpack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
== Python on BlueGene ==&lt;br /&gt;
Python 2.7.3 has been installed on BlueGene. To use &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Numpy&amp;lt;/span&amp;gt; and &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;Scipy&amp;lt;/span&amp;gt;, the module &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;essl/5.1&amp;lt;/span&amp;gt; has to be loaded.&lt;br /&gt;
The full python path has to be provided (otherwise the default version is used).&lt;br /&gt;
&lt;br /&gt;
To use python on BlueGene :&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load python/2.7.3&lt;br /&gt;
##Only if you need numpy/scipy :&lt;br /&gt;
module load xlf/14.1 essl/5.1&lt;br /&gt;
runjob --np 1 --ranks-per-node=1 --envs HOME=$HOME LD_LIBRARY_PATH=$LD_LIBRARY_PATH PYTHONPATH=/scinet/bgq/tools/Python/python2.7.3-20131205/lib/python2.7/site-packages/ : /scinet/bgq/tools/Python/python2.7.3-20131205/bin/python2.7 /PATHOFYOURSCRIPT.py &lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6907</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6907"/>
		<updated>2014-03-06T17:59:16Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Batch Jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) with 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8,192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of the BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such, there are only a few optimal block sizes that use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines, including most of the compilers, you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM XL C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc file to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries also define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt; to your compile and link commands, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
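As a concrete (hypothetical) illustration of this naming convention, a module such as '''gsl''' would define variables along these lines; the path used here is made up, since the real values are set for you by the module system:&lt;br /&gt;

```shell
# Hypothetical values; 'module load gsl' sets the actual paths for you.
export SCINET_GSL_BASE=/scinet/bgq/Libraries/gsl-1.15
export SCINET_GSL_INC=${SCINET_GSL_BASE}/include
export SCINET_GSL_LIB=${SCINET_GSL_BASE}/lib

# The resulting compile/link flags, in addition to the usual -lgsl:
echo "-I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} -lgsl"
```

These variables can be used verbatim in a Makefile, e.g. &amp;lt;tt&amp;gt;CFLAGS += -I${SCINET_GSL_INC}&amp;lt;/tt&amp;gt;.&lt;br /&gt;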
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package whose configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimal job size configurations, further constrained by the fact that each block requires a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, so that the job gets fully dedicated resources.  Smaller jobs can be run within the same block, but then they share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; higher verbosity levels can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
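In hybrid mode, keeping all 64 hardware threads of a node busy means choosing ranks-per-node and OMP_NUM_THREADS so that their product is 64. A minimal sketch that enumerates these combinations:&lt;br /&gt;

```shell
# Enumerate the (ranks-per-node, OMP_NUM_THREADS) pairs that fill
# all 64 hardware threads of a BGQ node.
combos=""
for r in 1 2 4 8 16 32 64; do
  combos="$combos $r:$((64 / r))"
done
echo "ranks:threads ->$combos"
```

Which pair is fastest depends on the application; the memory argument above pushes towards fewer ranks and more threads.&lt;br /&gt;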
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-12AM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is specified in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024 and 2048.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP thread per MPI process (for hybrid codes) = 1 , 2 , 4 , 8 , 16 , 32 , 64&lt;br /&gt;
&lt;br /&gt;
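The constraints above can be checked before submitting. Here is a minimal sketch, with values matching a 64-node, 16-ranks-per-node pure MPI job:&lt;br /&gt;

```shell
# Sanity-check a job request against the constraints listed above.
BG_SIZE=64          # nodes requested via bg_size
RANKS_PER_NODE=16   # MPI processes per node
OMP_NUM_THREADS=1   # OpenMP threads per MPI process
NP=$(( BG_SIZE * RANKS_PER_NODE ))   # total MPI processes

if (( NP <= RANKS_PER_NODE * BG_SIZE )) && (( RANKS_PER_NODE * OMP_NUM_THREADS <= 64 )); then
  echo "consistent request: np=$NP"
else
  echo "invalid request"
fi
```

For this request, np works out to 1024, matching the runjob line in the sample script.&lt;br /&gt;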
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users normally only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.   For this purpose, a &lt;br /&gt;
script has been written that allows a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' command currently allows a 30-minute session on 64 nodes; when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', it runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of requiring you to attach a gdb tool to each process by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;, and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, without the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter sets the number of MPI processes to run.  Most configure scripts expect only one MPI process; thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs.  This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner depends on the specific block details provided by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4; it sets the appropriate $SHAPE argument and an array of 16 starting corners, ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Using the subblocks script to set $SHAPE and an array of ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e., similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case, 16 sub-blocks of 4 cnodes each (64 total, i.e., bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. Keep in mind that these sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also, if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input or output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
The GPFS file system of the BGQ is shared between the bgq development and production systems, but, except through HPSS, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor are the other SciNet file systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?| [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use bg_console or the web-based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core (the numbering is zero-based).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6906</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6906"/>
		<updated>2014-03-06T17:58:01Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Batch Jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node with a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 node boards make up a midplane (8,192 cores), and there are 2 midplanes per rack, giving 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together by a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes running a full Red Hat Linux OS that manage the compute nodes and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
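As a quick sanity check (this snippet is an illustration, not a SciNet tool), the product of the five torus dimensions in each row of the table above equals the number of compute nodes in that block:&lt;br /&gt;

```shell
# Illustrative check: the product of the 5D torus dimensions in each
# table row equals the compute-node count for that block size.
check() {
    shape=$1; expected=$2; product=1
    for d in $(echo "$shape" | tr x ' '); do
        product=$((product * d))
    done
    echo "$shape = $product nodes (expected $expected)"
}
check 2x2x2x2x2 32
check 4x4x4x4x2 512
check 4x4x8x8x2 2048
```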
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines (including most of the compilers), you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI libraries, and so on.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
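For illustration, here is how the pieces fit together for a hypothetical module; the module name 'FOO' and the paths are made up, so substitute the short module name reported by your actual module:&lt;br /&gt;

```shell
# Hypothetical example of the SCINET_* naming convention above; the
# module name 'FOO' and the paths are illustrative, not real modules.
SCINET_FOO_INC=/scinet/bgq/foo/include
SCINET_FOO_LIB=/scinet/bgq/foo/lib
CFLAGS="-O3 -qarch=qp -qtune=qp -I${SCINET_FOO_INC}"
LDFLAGS="-L${SCINET_FOO_LIB} -lfoo"
# The resulting compile/link line would look like:
echo "mpixlc $CFLAGS mycode.c $LDFLAGS -o mycode"
```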
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries, but on the BGQ it is now also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); such jobs are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which for those familiar with MPI is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure MPI jobs, it is advisable to always specify the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a verbosity level from 1 to 7; higher levels can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank then only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). Using fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
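The arithmetic above can be sketched in a few lines of shell (the job size here is a made-up example):&lt;br /&gt;

```shell
# Sketch of the ranks-per-node arithmetic above, for a made-up job of
# bg_size=64 nodes running one MPI rank per hardware thread.
node_mem_mb=16384        # 16 GB of RAM per compute node
bg_size=64               # number of nodes (as set in the loadleveler script)
ranks_per_node=64        # one rank per hardware thread (16 cores x 4 threads)
np=$((bg_size * ranks_per_node))                # total MPI processes (--np)
mem_per_rank=$((node_mem_mb / ranks_per_node))  # memory available per rank
echo "np=$np, memory per rank: ${mem_per_rank}MB"
```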
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8 AM to midnight every day, including weekends.  This block is accessed using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024, or 2048.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
np : number of MPI processes&lt;br /&gt;
&lt;br /&gt;
ranks-per-node : number of MPI processes per node&lt;br /&gt;
&lt;br /&gt;
OMP_NUM_THREADS : number of OpenMP threads per MPI process (for hybrid code)&lt;br /&gt;
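These constraints can be checked with a few lines of shell before writing them into a script (the values here are illustrative, not requirements):&lt;br /&gt;

```shell
# Illustrative check of the bg_size / np / ranks-per-node constraints
# listed above; the values are examples only.
bg_size=64; ranks_per_node=16; np=1024; omp_num_threads=4
ok=yes
case $bg_size in 64|128|256|512|1024|2048) ;; *) ok=no ;; esac
[ $np -le $((ranks_per_node * bg_size)) ] || ok=no
[ $ranks_per_node -le $np ] || ok=no
[ $((ranks_per_node * omp_num_threads)) -le 64 ] || ok=no
echo "constraints satisfied: $ok"
```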
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.   As such, a &lt;br /&gt;
script has been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job (instead of you attaching a gdb tool to each process by hand, as explained in the BGQ Application Development guide linked below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;, and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of MPI processes to run.  Most configure scripts expect only one MPI process; thus &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4; it returns the appropriate $SHAPE argument and an array of 16 starting $CORNER values. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case: 16 sub-blocks of 4 cnodes each (64 nodes total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the BGQ development and production systems, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc) except through HPSS, nor is the regular SciNet file system mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems. For instance, it can report how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or generate plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?| [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
|&amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=6904</id>
		<title>Software and Libraries</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=6904"/>
		<updated>2014-03-04T19:42:48Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* GPC Software */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Software Module System =&lt;br /&gt;
All the software listed on this page is accessed using a modules system.  This means that much of the software is not &lt;br /&gt;
accessible by default but has to be loaded using the module command. The reasons are that&lt;br /&gt;
* it allows us to easily keep multiple versions of software for different users on the system;&lt;br /&gt;
* it allows users to easily switch between versions.&lt;br /&gt;
The module system works similarly on the GPC and the TCS, although different modules are installed on these two systems.&lt;br /&gt;
&lt;br /&gt;
Note that, generally, if you compile a program with a module loaded, you will have to run it with that same module loaded, to make dynamically linked libraries accessible.&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
!{{Hl2}}|Function&lt;br /&gt;
!{{Hl2}}|Command&lt;br /&gt;
!{{Hl2}}|Comments&lt;br /&gt;
|-&lt;br /&gt;
|List available software packages:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If a module is not listed here, it is not supported.&lt;br /&gt;
*The flag &amp;quot;(default)&amp;quot; is never part of the name.&lt;br /&gt;
|-&lt;br /&gt;
|Use particular software:&lt;br /&gt;
|&amp;lt;pre&amp;gt; $ module load [module-name] &amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If possible, specify only the short name (the part before the &amp;quot;/&amp;quot;). &lt;br /&gt;
*When ambiguous, this loads the default one. &lt;br /&gt;
|-&lt;br /&gt;
|List available versions of a specific software package:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail [short-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|List currently loaded modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module list&amp;lt;/pre&amp;gt;&lt;br /&gt;
|For reproducibility, it is a good idea to put this in your job scripts, so you know exactly which modules (and versions) were used.&lt;br /&gt;
|-&lt;br /&gt;
|Get description of a particular module:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module help [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove a module from your shell:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module unload [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove all modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module purge&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Replace one loaded module with another:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module switch [old-module-name] [new-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SCINET_[short-module-name]_BASE&lt;br /&gt;
SCINET_[short-module-name]_LIB&lt;br /&gt;
SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[short-module-name]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[short-module-name]_LIB}&amp;lt;/tt&amp;gt; to your compile and link commands, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
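As a concrete sketch, a compile line for a hypothetical gsl module might look like this. The SCINET_GSL_* paths below are placeholders for the values that a "module load gsl" would normally set; the command is only assembled and printed:

```shell
# Placeholder values; on the cluster these would be set by "module load gsl".
SCINET_GSL_INC=/scinet/placeholder/gsl/include
SCINET_GSL_LIB=/scinet/placeholder/gsl/lib
# Assemble the compile/link line following the -I/-L/-l convention above.
CMD="gcc myprog.c -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} -lgsl -lgslcblas -o myprog"
echo "$CMD"
```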
&lt;br /&gt;
Errors in loaded modules can arise for a few reasons, for instance:&lt;br /&gt;
* A module by that name may not exist.&lt;br /&gt;
* Some modules require other modules to have been loaded; if this requirement is not met when you try to load that module, an error message will be printed explaining what module is needed.&lt;br /&gt;
* Some modules cannot be loaded together: an error message will be printed explaining which modules conflict.&lt;br /&gt;
&lt;br /&gt;
It is no longer recommended to load modules in the file [[Important_.bashrc_guidelines|.bashrc]] in your home directory; rather, load them explicitly on the command line and in your job scripts.&lt;br /&gt;
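A minimal job-script sketch of this practice, assuming a bash batch script; the module names and executable are illustrative assumptions, not a prescription. The script below just prints the skeleton:

```shell
# Print a minimal job-script skeleton; "intel", "openmpi" and
# "./my_program" are illustrative placeholders.
cat <<'EOF'
#!/bin/bash
module load intel openmpi   # load the same modules the code was built with
module list                 # record exact module versions for reproducibility
mpirun ./my_program
EOF
```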
&lt;br /&gt;
== Default and non-default modules ==&lt;br /&gt;
&lt;br /&gt;
When you load a module with its 'short' name, you will get the ''default'' version, which is usually the most recent, recommended version of that library or piece of software.  In general, using the short module name is the way to go. However, you may have code that depends on the intricacies of a non-default version.  For that reason, the most common older versions are also available as modules.  You can find all available modules using the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
== Naming convention ==&lt;br /&gt;
&lt;br /&gt;
For modules that access applications, the full name of a module is as follows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  [short-module-name]/[version-number]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To have all modules conform to this convention, a number of modules' names changed on Nov 3, 2010:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''old name'''&lt;br /&gt;
| '''new name'''&lt;br /&gt;
| '''remarks'''&lt;br /&gt;
|-&lt;br /&gt;
|autoconf/autoconf-2.64 &amp;amp;nbsp; &amp;amp;nbsp;&amp;amp;nbsp;&lt;br /&gt;
|autoconf/2.64&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.0           &lt;br /&gt;
|cuda/3.0&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.1          &lt;br /&gt;
|cuda/3.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/ddd-3.3.12   &lt;br /&gt;
|ddd/3.3.12&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/gdb-7.1       &lt;br /&gt;
|gdb/7.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|editors/nano/2.2.4      &lt;br /&gt;
|nano/2.2.4&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|emacs/emacs-23.1        &lt;br /&gt;
|emacs/23.1.1&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|gcc/gcc-4.4.0           &lt;br /&gt;
|gcc/4.4.0&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|graphics/ncview         &lt;br /&gt;
|ncview/1.93&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|graphics/graphics       &lt;br /&gt;
|grace/5.1.22&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|                        &lt;br /&gt;
|gnuplot/4.2.6&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|svn/svn165              &lt;br /&gt;
|svn/1.6.5&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|visualization/paraview  &lt;br /&gt;
|paraview/3.8&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|amber10/amber10         &lt;br /&gt;
|amber/10.0.30 &lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|gamess/gamess           &lt;br /&gt;
|gamess/May2209 &amp;amp;nbsp;&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==modulefind - Finding modules by name==&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command will only show you modules whose names start with the argument that you give it, and will also return modules that you cannot load due to conflicts with already loaded modules.&lt;br /&gt;
&lt;br /&gt;
A little SciNet utility called &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt; (one word) can search more flexibly. It will list all installed modules whose names contain the argument, and will determine whether those modules have been loaded, could be loaded, cannot be loaded because of conflicts with already loaded modules, or have unresolved dependencies (i.e. for which other modules need to be loaded first).  This is especially useful in cases like the &amp;quot;boost&amp;quot; libraries, whose module names are cxxlibraries/boost/1.47.0-gcc and cxxlibraries/boost/1.47.0-intel, for the gcc and intel compilers, respectively.  &amp;lt;tt&amp;gt;modulefind boost&amp;lt;/tt&amp;gt; will find those, whereas &amp;lt;tt&amp;gt;module avail boost&amp;lt;/tt&amp;gt; will not.&lt;br /&gt;
&lt;br /&gt;
Note that just 'modulefind' will list all top-level modules.&lt;br /&gt;
&lt;br /&gt;
== Making your own modules ==&lt;br /&gt;
&lt;br /&gt;
Making your own modules (e.g. for local installations, or to access optional perl modules) is possible; see the [[Installing your own modules]] page.&lt;br /&gt;
&lt;br /&gt;
== Deprecated modules ==&lt;br /&gt;
&lt;br /&gt;
Some older software modules for which newer versions exist get deprecated, which means they are no longer maintained.  Since deprecated modules should only be needed in rare, exceptional cases, they are not listed by the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.  However, if you have a piece of legacy code that really depends on a deprecated version of a library (and we urge you to check that it does not work with newer versions!), then you can load a deprecated version with &amp;lt;pre&amp;gt;module load use.deprecated [deprecated-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Currently (Oct 5, 2010), the following modules are deprecated on the GPC: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/gcc-4.3.2          hdf5/184-v16-serial     intel/intel-v11.1.046               openmpi/1.3.3-intel-v11.0-ofed&lt;br /&gt;
hdf5/183-v16-openmpi   hdf5/184-v18-intelmpi   intelmpi/impi-3.2.1.009             openmpi/1.3.2-intel-v11.0-ofed.orig&lt;br /&gt;
hdf5/183-v18-openmpi   hdf5/184-v18-openmpi    intelmpi/impi-3.2.2.006             pgplot/5.2.2-gcc.old            &lt;br /&gt;
hdf5/184-v16-intelmpi  hdf5/184-v18-serial     intelmpi/impi-4.0.0.013             pgplot/5.2.2-intel.old&lt;br /&gt;
hdf5/184-v16-openmpi   intel/intel-v11.0.081   intelmpi/impi-4.0.0.025               &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the TCS, currently (Oct 5, 2010) the only deprecated module is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ncl/5.1.1old&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Before using any of these deprecated modules, make sure that there is not a regular module that satisfies your needs, likely under a ''very similar name''.&lt;br /&gt;
&lt;br /&gt;
== Commercial software ==&lt;br /&gt;
&lt;br /&gt;
Apart from the compilers on our systems and the ddt parallel debugger, we generally do not provide licensed application software, e.g., no Gaussian, IDL, Matlab, etc. &lt;br /&gt;
See the [https://support.scinet.utoronto.ca/wiki/index.php/FAQ#How_can_I_run_Matlab_.2F_IDL_.2F_Gaussian_.2F_my_favourite_commercial_software_at_SciNet.3F FAQ].&lt;br /&gt;
&lt;br /&gt;
== Other software and libraries ==&lt;br /&gt;
&lt;br /&gt;
If you want to use a piece of software or a library that is not on the list, you can in principle install it yourself in your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
Note, however, that building libraries and software from source often uses a lot of files. To avoid running out of disk space, building software is therefore best done in &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, from which&lt;br /&gt;
you can copy/install only the libraries, header files and binaries to your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
&lt;br /&gt;
If you suspect that a particular piece of software or a library would be of use to other users of SciNet as well, contact us, and we will consider adding it to the system.&lt;br /&gt;
&lt;br /&gt;
== Software lists ==&lt;br /&gt;
=== ARC/GPU Software ===&lt;br /&gt;
&lt;br /&gt;
The CPUs in the GPU nodes of the ARC cluster are of the same kind as those of the GPC, so all modules available on the GPC are available on the GPU nodes with a CentOS 6 image. This means that the different cuda variants that are available as modules can be loaded on regular GPC nodes as well, although they are of little use on that system.&lt;br /&gt;
&lt;br /&gt;
=== GPC Software ===&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Software  &lt;br /&gt;
!{{Hl2}}| Versions&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-  &lt;br /&gt;
|Intel Compiler&lt;br /&gt;
|12.1.3*, 12.1.5, 13.1.0&lt;br /&gt;
| includes the MKL library, which provides BLAS, LAPACK, FFT, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;icpc,icc,ifort&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.6.1*, 4.7.0, 4.7.2&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Cuda&lt;br /&gt;
| 3.2, 4.0, 4.1*, 4.2&lt;br /&gt;
| NVIDIA's extension to C for GPGPU programming&lt;br /&gt;
| &amp;lt;tt&amp;gt;nvcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| PGI Compiler&lt;br /&gt;
| 12.5&lt;br /&gt;
| supports OpenACC and CUDA Fortran &lt;br /&gt;
| &amp;lt;tt&amp;gt;pgcc,pgcpp,pgfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| IntelMPI&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| MPICH2 based MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intelmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| OpenMPI&lt;br /&gt;
| 1.4.4*, 1.5.4&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 2.12.2&lt;br /&gt;
| Berkeley Unified Parallel C Implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;upcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Editors'''''&lt;br /&gt;
|- &lt;br /&gt;
| Nano&lt;br /&gt;
| 2.2.4&lt;br /&gt;
| Nano's ANOther editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Emacs&lt;br /&gt;
| 23.1.1&lt;br /&gt;
| New version of popular text editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| XEmacs&lt;br /&gt;
| 21.4.22&lt;br /&gt;
| XEmacs editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Development tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| Autoconf&lt;br /&gt;
| 2.68&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Automake&lt;br /&gt;
| 1.11.2&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;aclocal, automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CMake&lt;br /&gt;
| 2.8.6&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scons&lt;br /&gt;
| 2.0&lt;br /&gt;
| Software construction tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Git&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Revision control system&lt;br /&gt;
| &amp;lt;tt&amp;gt;git,gitk&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;git&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Intel tools&lt;br /&gt;
| 2011&lt;br /&gt;
| Intel Code Analysis Tools&lt;br /&gt;
| Vtune Amplifier XE, Inspector XE&lt;br /&gt;
| &amp;lt;tt&amp;gt;inteltools&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Mercurial&lt;br /&gt;
| 1.8.2&lt;br /&gt;
| Version control system&amp;lt;br&amp;gt;(part of the python module!)&lt;br /&gt;
| &amp;lt;tt&amp;gt;hg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug and performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool, + MAP MPI Profiler&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt, map&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| DDD&lt;br /&gt;
| 3.3.12&lt;br /&gt;
| Data Display Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GDB&lt;br /&gt;
| 7.3.1&lt;br /&gt;
| GNU debugger (the intel idbc debugger is available by default)&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| MPE2&lt;br /&gt;
| 2.4.5&lt;br /&gt;
| Multi-Processing Environment with intel + OpenMPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpecc, mpefc, jumpshot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#OpenSpeedShop_.28profiling.2C_MPI_tracing:_GPC.29 | OpenSpeedShop]]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| sampling and MPI tracing&lt;br /&gt;
| &amp;lt;tt&amp;gt;openss, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openspeedshop&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#Scalasca_.28profiling.2C_tracing:_TCS.2C_GPC.29 | Scalasca]]&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications (Compiled with OpenMPI)&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipm-hpc.sourceforge.net IPM]&lt;br /&gt;
| 0.983&lt;br /&gt;
| Integrated Performance Monitoring&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm, ipm_parse, ploticus,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Performance_And_Debugging_Tools:_GPC#Valgrind | Valgrind]]&lt;br /&gt;
| 3.6.1&lt;br /&gt;
| Memory checking utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind,cachegrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Padb&lt;br /&gt;
| 3.2 &lt;br /&gt;
| examine and debug parallel programs&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|&amp;lt;span id=&amp;quot;anchor_viz&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;'''''Visualization tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| Grace&lt;br /&gt;
| 5.1.22&lt;br /&gt;
| Plotting utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;xmgrace&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;grace&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Gnuplot&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Plotting utility&amp;lt;br&amp;gt;Requires 'extras' module if used on compute nodes.&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[ Using_Paraview | ParaView ]]&lt;br /&gt;
| 3.12.0&lt;br /&gt;
| Scientific visualization, server only&lt;br /&gt;
| &amp;lt;tt&amp;gt;pvserver,pvbatch,pvpython&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;paraview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| VMD&lt;br /&gt;
| 1.9&lt;br /&gt;
| Visualization and analysis utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCL/NCARG&lt;br /&gt;
| 6.0.0&lt;br /&gt;
| NCARG graphics and ncl utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| ROOT&lt;br /&gt;
| 5.30.00&lt;br /&gt;
| ROOT Analysis Framework from CERN&lt;br /&gt;
| &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ROOT&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
| ImageMagick&lt;br /&gt;
| 6.6.7&lt;br /&gt;
| Image manipulation tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;convert,animate,composite,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ImageMagick&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PGPLOT&lt;br /&gt;
| 5.2.2&lt;br /&gt;
| Graphics subroutine library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libcpgplot,libpgplot,libtkpgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ncview&lt;br /&gt;
| 2.1.1&lt;br /&gt;
| Visualization for NetCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted, etc.&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CDO&lt;br /&gt;
| 1.5.1&lt;br /&gt;
| Climate Data Operators&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| UDUNITS&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc,hdiff,...,libdf,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Hdf5 | HDF5]]&lt;br /&gt;
| 1.8.7-v18*&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.arg0.net/encfs EncFS ]&lt;br /&gt;
| 1.74&lt;br /&gt;
| EncFS provides an encrypted filesystem in user-space (works ONLY on gpc01..04)&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|- &lt;br /&gt;
| [[amber|AMBER 10]]&lt;br /&gt;
| Amber 10 + Amber Tools 1.3&lt;br /&gt;
| Amber Molecular Dynamics Package&lt;br /&gt;
| &amp;lt;tt&amp;gt;sander, sander.MPI&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;amber&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gamess|GAMESS (US)]]&lt;br /&gt;
| August 18, 2011 R1&lt;br /&gt;
| General Atomic and Molecular Electronic Structure System&lt;br /&gt;
| &amp;lt;tt&amp;gt;rungms&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gamess&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gromacs|GROMACS]]&lt;br /&gt;
| 4.5.5, 4.5.7, 4.6.2&lt;br /&gt;
| GROMACS molecular mechanics, single precision, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;grompp, mdrun&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gromacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[namd|NAMD]]&lt;br /&gt;
| 2.8&lt;br /&gt;
| NAMD - Scalable Molecular Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;namdmpiexec, namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[nwchem|NWChem]]&lt;br /&gt;
| 6.0&lt;br /&gt;
| NWChem Quantum Chemistry&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 4.3.2, 5.0.3&lt;br /&gt;
| Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;pw.x, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://blast.ncbi.nlm.nih.gov BLAST]&lt;br /&gt;
| 2.2.23+&lt;br /&gt;
| Basic Local Alignment Search Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;blastn,blastp,blastx,psiblast,tblastn...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;blast&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://denovoassembler.sourceforge.net RAY]&lt;br /&gt;
| 2.1.0 (small k-mer)&lt;br /&gt;
| Parallel de novo genome assemblies&lt;br /&gt;
| &amp;lt;tt&amp;gt;Ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[cpmd|CPMD]]&lt;br /&gt;
| 3.13.2&lt;br /&gt;
| Car-Parrinello molecular dynamics, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd.x&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[R Statistical Package|R]] &lt;br /&gt;
| 2.13.1&lt;br /&gt;
| statistical computing&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Octave&lt;br /&gt;
| 3.4.3&lt;br /&gt;
| Matlab-like environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.mcs.anl.gov/petsc/petsc-as/  PETSc ]&lt;br /&gt;
| 3.1*&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation (PETSc)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc, etc.. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Armadillo C++ linear algebra library | Armadillo]]&lt;br /&gt;
| 3.910.0&lt;br /&gt;
| C++ armadillo libraries (implement Matlab-like syntax)&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.tacc.utexas.edu/tacc-projects/gotoblas2/ GotoBLAS]&lt;br /&gt;
| 1.13&lt;br /&gt;
| Optimized BLAS implementation &lt;br /&gt;
| &amp;lt;tt&amp;gt;libgoto2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gotoblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13*, 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.3&lt;br /&gt;
| fast Fourier transform library&lt;br /&gt;
''Be careful in combining fftw3 and MKL: you need to link fftw3 first, with'' &amp;lt;tt&amp;gt;-L${SCINET_FFTW_LIB} -lfftw3&amp;lt;/tt&amp;gt;, then link MKL&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| LAPACK&lt;br /&gt;
| &lt;br /&gt;
| Provided by the Intel MKL library&lt;br /&gt;
| See http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://freshmeat.net/projects/rlog  RLog ]&lt;br /&gt;
| 1.4&lt;br /&gt;
| RLog provides a flexible message logging facility for C++ programs and libraries.&lt;br /&gt;
| &amp;lt;tt&amp;gt;librlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/rlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[GNU Parallel]]&lt;br /&gt;
| 2012-10-22&lt;br /&gt;
| execute commands in parallel&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.3.4&lt;br /&gt;
| Python programming language. Modules included: numpy 1.8.0, scipy 0.13.3, matplotlib 1.3.1, ipython 1.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ruby&lt;br /&gt;
| 1.9.1&lt;br /&gt;
| Ruby programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 1.6.0&lt;br /&gt;
| IBM's Java JRE and SDK&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| Xlibraries&lt;br /&gt;
|&lt;br /&gt;
| A collection of X graphics libraries and tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;xterm&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xpdf&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;Xlibraries&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Extras&lt;br /&gt;
|&lt;br /&gt;
| A collection of standard linux and home-grown tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;bc&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;screen&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xxdiff&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;ish&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
&lt;br /&gt;
=== TCS Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM compilers&lt;br /&gt;
|10.1(c/c++)&amp;lt;br&amp;gt;12.1(fortran)&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlf,xlc_r,xlC_r,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''available by default''&lt;br /&gt;
|-&lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpcc,mpCC,mpxlf,mpcc_r,mpCC_r,mpxlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''available by default''&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 1.2&lt;br /&gt;
| Unified Parallel C&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlupc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|13.1, 14.1&lt;br /&gt;
| newer version &lt;br /&gt;
| &amp;lt;tt&amp;gt;xlf,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| xlf/13.1&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|11.1, 12.1&lt;br /&gt;
| new versions&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| vacpp&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| MPE2&lt;br /&gt;
| 1.0.6&lt;br /&gt;
| Performance Visualization for Parallel Programs   &lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scalasca&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.5&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc, hdiff, ..., libdf, libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF + ncview&lt;br /&gt;
| 4.0.1*&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf, ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.1.1*&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 3.9.6*&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted, &amp;lt;/tt&amp;gt; etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Fast Fourier transform library&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi,libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK+SCALAPACK&lt;br /&gt;
| 3.4.2+2.0.2&lt;br /&gt;
| Linear algebra package. Note that essl, which comes with the IBM compilers, contains a large part of LAPACK as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack,libscalapack,libblacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PetSc&lt;br /&gt;
| 3.2&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation. With external packages mumps, chaco, hypre, parmetis, prometheus, plapack, superlu, sprng.&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of libraries to your user environment&amp;lt;br&amp;gt; compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi, libfftw3, libhdf5, liblapack, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gmake&lt;br /&gt;
| 3.82&lt;br /&gt;
| GNU's make. Replaces AIX make or gmake 3.80.&lt;br /&gt;
| &amp;lt;tt&amp;gt;make&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCL&lt;br /&gt;
| 5.1.1&lt;br /&gt;
| NCAR Command Language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl, libncl, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
&lt;br /&gt;
=== P7 Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1, 13.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlf,xlf_r,xlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1, 11.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.23.2&lt;br /&gt;
| &lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|5.2.2&lt;br /&gt;
|IBM's Parallel Environment&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpcc,mpCC,mpfort,mpiexec&amp;lt;/tt&amp;gt;&lt;br /&gt;
|pe&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.6.1, 4.8.1&lt;br /&gt;
| GNU Compiler Collection&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 7.0&lt;br /&gt;
| IBM Java 1.7 implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;jdk&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.7&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap2, ncatted, &amp;lt;/tt&amp;gt; etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.5&lt;br /&gt;
| Python programming language. Modules included: numpy-1.8.0, scipy-0.13.2, matplotlib-1.3.1, pyfits-3.2, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| command-driven interactive function and data plotting program&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| udunits&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of applications and libraries to your user environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;bindlaunch, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=Manuals=&lt;br /&gt;
{{:Manuals}}&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=6903</id>
		<title>Software and Libraries</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=6903"/>
		<updated>2014-03-04T19:41:38Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* GPC Software */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Software Module System =&lt;br /&gt;
All the software listed on this page is accessed using a modules system.  This means that much of the software is not &lt;br /&gt;
accessible by default but has to be loaded using the module command. The&lt;br /&gt;
reasons are that&lt;br /&gt;
* it allows us to easily keep multiple versions of software for different users on the system;&lt;br /&gt;
* it allows users to easily switch between versions.&lt;br /&gt;
The module system works similarly on the GPC and the TCS, although different modules are installed on these two systems.&lt;br /&gt;
&lt;br /&gt;
Note that, generally, if you compile a program with a module loaded, you will have to run it with that same module loaded, to make dynamically linked libraries accessible.&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
!{{Hl2}}|Function&lt;br /&gt;
!{{Hl2}}|Command&lt;br /&gt;
!{{Hl2}}|Comments&lt;br /&gt;
|-&lt;br /&gt;
|List available software packages:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If a module is not listed here, it is not supported.&lt;br /&gt;
*The flag &amp;quot;(default)&amp;quot; is never part of the name.&lt;br /&gt;
|-&lt;br /&gt;
|Use particular software:&lt;br /&gt;
|&amp;lt;pre&amp;gt; $ module load [module-name] &amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If possible, specify only the short name (the part before the &amp;quot;/&amp;quot;). &lt;br /&gt;
*When ambiguous, this loads the default one. &lt;br /&gt;
|-&lt;br /&gt;
|List available versions of a specific software package:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail [short-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|List currently loaded modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module list&amp;lt;/pre&amp;gt;&lt;br /&gt;
|For reproducibility, it is a good idea to put this in your job scripts, so you know exactly which modules (and versions) were used.&lt;br /&gt;
|-&lt;br /&gt;
|Get description of a particular module:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module help [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove a module from your shell:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module unload [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove all modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module purge&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Replace one loaded module with another:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module switch [old-module-name] [new-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SCINET_[short-module-name]_BASE&lt;br /&gt;
SCINET_[short-module-name]_LIB&lt;br /&gt;
SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[short-module-name]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[short-module-name]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
&lt;br /&gt;
Errors in loaded modules can arise for a few reasons, for instance:&lt;br /&gt;
* A module by that name may not exist.&lt;br /&gt;
* Some modules require other modules to have been loaded; if this requirement is not met when you try to load that module, an error message will be printed explaining what module is needed.&lt;br /&gt;
* Some modules cannot be loaded together: an error message will be printed explaining which modules conflict.&lt;br /&gt;
&lt;br /&gt;
It is no longer recommended to load modules in the file [[Important_.bashrc_guidelines|.bashrc]] in your home directory; rather, load them explicitly on the command-line and in your job scripts.&lt;br /&gt;
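For instance, a job script might load its modules explicitly near the top (this is only an illustrative fragment; the module names, versions, and resource request are placeholders, not recommendations):

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=8,walltime=1:00:00
# Load exactly the modules (and versions) the program was built with.
module load intel openmpi          # names/versions illustrative
module list                        # record what was used, for reproducibility
cd $PBS_O_WORKDIR
mpirun ./myprogram
```

Recording &amp;lt;tt&amp;gt;module list&amp;lt;/tt&amp;gt; output in the job's log makes it possible to reproduce the run environment later.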
&lt;br /&gt;
== Default and non-default modules ==&lt;br /&gt;
&lt;br /&gt;
When you load a module with its 'short' name, you will get the ''default'' version, which is usually the most recent, recommended version of that library or piece of software.  In general, using the short module name is the way to go. However, you may have code that depends on the intricacies of a non-default version.  For that reason, the most common older versions are also available as modules.  You can find all available modules using the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
== Naming convention ==&lt;br /&gt;
&lt;br /&gt;
For modules that access applications, the full name of a module is as follows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  [short-module-name]/[version-number]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To have all modules conform to this convention, a number of module names changed on Nov 3, 2010:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''old name'''&lt;br /&gt;
| '''new name'''&lt;br /&gt;
| '''remarks'''&lt;br /&gt;
|-&lt;br /&gt;
|autoconf/autoconf-2.64 &amp;amp;nbsp; &amp;amp;nbsp;&amp;amp;nbsp;&lt;br /&gt;
|autoconf/2.64&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.0           &lt;br /&gt;
|cuda/3.0&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.1          &lt;br /&gt;
|cuda/3.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/ddd-3.3.12   &lt;br /&gt;
|ddd/3.3.12&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/gdb-7.1       &lt;br /&gt;
|gdb/7.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|editors/nano/2.2.4      &lt;br /&gt;
|nano/2.2.4&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|emacs/emacs-23.1        &lt;br /&gt;
|emacs/23.1.1&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|gcc/gcc-4.4.0           &lt;br /&gt;
|gcc/4.4.0&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|graphics/ncview         &lt;br /&gt;
|ncview/1.93&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|graphics/graphics       &lt;br /&gt;
|grace/5.1.22&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|                        &lt;br /&gt;
|gnuplot/4.2.6&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|svn/svn165              &lt;br /&gt;
|svn/1.6.5&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|visualization/paraview  &lt;br /&gt;
|paraview/3.8&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|amber10/amber10         &lt;br /&gt;
|amber/10.0.30 &lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|gamess/gamess           &lt;br /&gt;
|gamess/May2209 &amp;amp;nbsp;&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==modulefind - Finding modules by name==&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command will only show you modules whose names start with the argument that you give it, and it will also list modules that you cannot load due to conflicts with already loaded modules.&lt;br /&gt;
&lt;br /&gt;
A little SciNet utility called &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt; (one word) gets around these limitations. It will list all installed modules whose names contain the argument, and will determine whether those modules have been loaded, could be loaded, cannot be loaded because of conflicts with already loaded modules, or have unresolved dependencies (i.e. other modules need to be loaded first).  This is especially useful in cases like the &amp;quot;boost&amp;quot; libraries, whose module names are cxxlibraries/boost/1.47.0-gcc and cxxlibraries/boost/1.47.0-intel, for the gcc and intel compilers, respectively.  &amp;lt;tt&amp;gt;modulefind boost&amp;lt;/tt&amp;gt; will find those, whereas &amp;lt;tt&amp;gt;module avail boost&amp;lt;/tt&amp;gt; will not.&lt;br /&gt;
&lt;br /&gt;
Note that just 'modulefind' will list all top-level modules.&lt;br /&gt;
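&lt;br /&gt;
For example, a typical query might look as follows (a sketch; the exact annotations printed by &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt; may differ):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ modulefind boost&lt;br /&gt;
cxxlibraries/boost/1.47.0-gcc&lt;br /&gt;
cxxlibraries/boost/1.47.0-intel&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;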
&lt;br /&gt;
== Making your own modules ==&lt;br /&gt;
&lt;br /&gt;
Making your own modules (e.g. for local installations, or to access optional perl modules, ...) is possible; this is described on the [[Installing your own modules]] page.&lt;br /&gt;
&lt;br /&gt;
== Deprecated modules ==&lt;br /&gt;
&lt;br /&gt;
Some older software modules for which newer versions exist are deprecated, which means they are no longer maintained.  Since deprecated modules should only be needed in rare, exceptional cases, they are not listed by the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.  However, if you have a piece of legacy code that really depends on a deprecated version of a library (and we urge you to check that it does not work with newer versions!), then you can load a deprecated version with &amp;lt;pre&amp;gt;module load use.deprecated [deprecated-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
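&lt;br /&gt;
For instance, to load the deprecated gcc module from the list below (a sketch; substitute the deprecated module you actually need):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load use.deprecated&lt;br /&gt;
$ module load gcc/gcc-4.3.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;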
&lt;br /&gt;
Currently (Oct 5, 2010), the following modules are deprecated on the GPC: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/gcc-4.3.2          hdf5/184-v16-serial     intel/intel-v11.1.046               openmpi/1.3.3-intel-v11.0-ofed&lt;br /&gt;
hdf5/183-v16-openmpi   hdf5/184-v18-intelmpi   intelmpi/impi-3.2.1.009             openmpi/1.3.2-intel-v11.0-ofed.orig&lt;br /&gt;
hdf5/183-v18-openmpi   hdf5/184-v18-openmpi    intelmpi/impi-3.2.2.006             pgplot/5.2.2-gcc.old            &lt;br /&gt;
hdf5/184-v16-intelmpi  hdf5/184-v18-serial     intelmpi/impi-4.0.0.013             pgplot/5.2.2-intel.old&lt;br /&gt;
hdf5/184-v16-openmpi   intel/intel-v11.0.081   intelmpi/impi-4.0.0.025               &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the TCS, currently (Oct 5, 2010), the only deprecated module is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ncl/5.1.1old&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Before using any of these deprecated modules, make sure that there is no regular module that satisfies your needs, likely under a ''very similar name''.&lt;br /&gt;
&lt;br /&gt;
== Commercial software ==&lt;br /&gt;
&lt;br /&gt;
Apart from the compilers on our systems and the ddt parallel debugger, we generally do not provide licensed application software, e.g., no Gaussian, IDL, Matlab, etc. &lt;br /&gt;
See the [https://support.scinet.utoronto.ca/wiki/index.php/FAQ#How_can_I_run_Matlab_.2F_IDL_.2F_Gaussian_.2F_my_favourite_commercial_software_at_SciNet.3F FAQ].&lt;br /&gt;
&lt;br /&gt;
== Other software and libraries ==&lt;br /&gt;
&lt;br /&gt;
If you want to use a piece of software or a library that is not on the list, you can in principle install it yourself in your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
Note, however, that building libraries and software from source often creates a lot of files. To avoid running out of disk space, building software is therefore best done in &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, from which&lt;br /&gt;
you can copy/install only the libraries, header files and binaries to your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
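&lt;br /&gt;
A typical build of an autotools-based package might then look like this (a sketch; the package name and directory layout are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd /scratch/$USER/build/mypackage-1.0&lt;br /&gt;
$ ./configure --prefix=$HOME/software/mypackage-1.0&lt;br /&gt;
$ make&lt;br /&gt;
$ make install&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This way only the final libraries, header files and binaries end up under &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt;, while the build tree stays in &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;.&lt;br /&gt;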
&lt;br /&gt;
If you suspect that a particular piece of software or a library would be of use to other users of SciNet as well, contact us, and we will consider adding it to the system.&lt;br /&gt;
&lt;br /&gt;
== Software lists ==&lt;br /&gt;
=== ARC/GPU Software ===&lt;br /&gt;
&lt;br /&gt;
The CPUs in the GPU nodes of the ARC cluster are of the same kind as those of the GPC, so all modules available on the GPC are also available on the GPU nodes with a CentOS 6 image. Conversely, the different cuda variants that are available as modules can be loaded on the regular GPC nodes as well, although they are of little use on that system.&lt;br /&gt;
&lt;br /&gt;
=== GPC Software ===&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Software  &lt;br /&gt;
!{{Hl2}}| Versions&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-  &lt;br /&gt;
|Intel Compiler&lt;br /&gt;
|12.1.3*, 12.1.5, 13.1.0&lt;br /&gt;
| includes MKL library, which includes BLAS, LAPACK, FFT, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;icpc,icc,ifort&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.6.1*, 4.7.0, 4.7.2&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Cuda&lt;br /&gt;
| 3.2, 4.0, 4.1*, 4.2&lt;br /&gt;
| NVIDIA's extension to C for GPGPU programming&lt;br /&gt;
| &amp;lt;tt&amp;gt;nvcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| PGI Compiler&lt;br /&gt;
| 12.5&lt;br /&gt;
| supports OpenACC and CUDA Fortran &lt;br /&gt;
| &amp;lt;tt&amp;gt;pgcc,pgcpp,pgfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| IntelMPI&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| MPICH2 based MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intelmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| OpenMPI&lt;br /&gt;
| 1.4.4*, 1.5.4&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 2.12.2&lt;br /&gt;
| Berkeley Unified Parallel C Implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;upcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Editors'''''&lt;br /&gt;
|- &lt;br /&gt;
| Nano&lt;br /&gt;
| 2.2.4&lt;br /&gt;
| Nano's another editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Emacs&lt;br /&gt;
| 23.1.1&lt;br /&gt;
| New version of popular text editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| XEmacs&lt;br /&gt;
| 21.4.22&lt;br /&gt;
| XEmacs editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Development tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| Autoconf&lt;br /&gt;
| 2.68&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Automake&lt;br /&gt;
| 1.11.2&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;aclocal, automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CMake&lt;br /&gt;
| 2.8.6&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scons&lt;br /&gt;
| 2.0&lt;br /&gt;
| Software construction tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Git&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Revision control system&lt;br /&gt;
| &amp;lt;tt&amp;gt;git,gitk&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;git&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Intel tools&lt;br /&gt;
| 2011&lt;br /&gt;
| Intel Code Analysis Tools&lt;br /&gt;
| Vtune Amplifier XE, Inspector XE&lt;br /&gt;
| &amp;lt;tt&amp;gt;inteltools&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Mercurial&lt;br /&gt;
| 1.8.2&lt;br /&gt;
| Version control system&amp;lt;br&amp;gt;(part of the python module!)&lt;br /&gt;
| &amp;lt;tt&amp;gt;hg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug and performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool, + MAP MPI Profiler&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt, map&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| DDD&lt;br /&gt;
| 3.3.12&lt;br /&gt;
| Data Display Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GDB&lt;br /&gt;
| 7.3.1&lt;br /&gt;
| GNU debugger (the intel idbc debugger is available by default)&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| MPE2&lt;br /&gt;
| 2.4.5&lt;br /&gt;
| Multi-Processing Environment with intel + OpenMPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpecc, mpefc, jumpshot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#OpenSpeedShop_.28profiling.2C_MPI_tracing:_GPC.29 | OpenSpeedShop]]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| sampling and MPI tracing&lt;br /&gt;
| &amp;lt;tt&amp;gt;openss, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openspeedshop&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#Scalasca_.28profiling.2C_tracing:_TCS.2C_GPC.29 | Scalasca]]&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications (Compiled with OpenMPI)&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipm-hpc.sourceforge.net IPM]&lt;br /&gt;
| 0.983&lt;br /&gt;
| Integrated Performance Monitoring&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm, ipm_parse, ploticus,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Performance_And_Debugging_Tools:_GPC#Valgrind | Valgrind]]&lt;br /&gt;
| 3.6.1&lt;br /&gt;
| Memory checking utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind,cachegrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Padb&lt;br /&gt;
| 3.2 &lt;br /&gt;
| examine and debug parallel programs&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|&amp;lt;span id=&amp;quot;anchor_viz&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;'''''Visualization tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| Grace&lt;br /&gt;
| 5.1.22&lt;br /&gt;
| Plotting utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;xmgrace&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;grace&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Gnuplot&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Plotting utility&amp;lt;br&amp;gt;Requires 'extras' module if used on compute nodes.&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[ Using_Paraview | ParaView ]]&lt;br /&gt;
| 3.12.0&lt;br /&gt;
| Scientific visualization, server only&lt;br /&gt;
| &amp;lt;tt&amp;gt;pvserver,pvbatch,pvpython&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;paraview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| VMD&lt;br /&gt;
| 1.9&lt;br /&gt;
| Visualization and analysis utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCL/NCARG&lt;br /&gt;
| 6.0.0&lt;br /&gt;
| NCARG graphics and ncl utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| ROOT&lt;br /&gt;
| 5.30.00&lt;br /&gt;
| ROOT Analysis Framework from CERN&lt;br /&gt;
| &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ROOT&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
| ImageMagick&lt;br /&gt;
| 6.6.7&lt;br /&gt;
| Image manipulation tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;convert,animate,composite,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ImageMagick&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PGPLOT&lt;br /&gt;
| 5.2.2&lt;br /&gt;
| Graphics subroutine library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libcpgplot,libpgplot,libtkpgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ncview&lt;br /&gt;
| 2.1.1&lt;br /&gt;
| Visualization for NetCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted, etc.&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CDO&lt;br /&gt;
| 1.5.1&lt;br /&gt;
| Climate Data Operators&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| UDUNITS&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc,hdiff,...,libdf,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Hdf5 | HDF5]]&lt;br /&gt;
| 1.8.7-v18*&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.arg0.net/encfs EncFS ]&lt;br /&gt;
| 1.74&lt;br /&gt;
| EncFS provides an encrypted filesystem in user-space (works ONLY on gpc01..04)&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|- &lt;br /&gt;
| [[amber|AMBER 10]]&lt;br /&gt;
| Amber 10 + Amber Tools 1.3&lt;br /&gt;
| Amber Molecular Dynamics Package&lt;br /&gt;
| &amp;lt;tt&amp;gt;sander, sander.MPI&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;amber&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gamess|GAMESS (US)]]&lt;br /&gt;
| August 18, 2011 R1&lt;br /&gt;
| General Atomic and Molecular Electronic Structure System&lt;br /&gt;
| &amp;lt;tt&amp;gt;rungms&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gamess&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gromacs|GROMACS]]&lt;br /&gt;
| 4.5.5, 4.5.7, 4.6.2&lt;br /&gt;
| GROMACS molecular mechanics, single precision, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;grompp, mdrun&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gromacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[namd|NAMD]]&lt;br /&gt;
| 2.8&lt;br /&gt;
| NAMD - Scalable Molecular Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;namdmpiexec, namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[nwchem|NWChem]]&lt;br /&gt;
| 6.0&lt;br /&gt;
| NWChem Quantum Chemistry&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 4.3.2, 5.0.3&lt;br /&gt;
| Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;pw.x, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://blast.ncbi.nlm.nih.gov BLAST]&lt;br /&gt;
| 2.2.23+&lt;br /&gt;
| Basic Local Alignment Search Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;blastn,blastp,blastx,psiblast,tblastn...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;blast&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://denovoassembler.sourceforge.net RAY]&lt;br /&gt;
| 2.1.0 (small k-mer)&lt;br /&gt;
| Parallel de novo genome assemblies&lt;br /&gt;
| &amp;lt;tt&amp;gt;Ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[cpmd|CPMD]]&lt;br /&gt;
| 3.13.2&lt;br /&gt;
| Car-Parrinello molecular dynamics, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd.x&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[R Statistical Package|R]] &lt;br /&gt;
| 2.13.1&lt;br /&gt;
| statistical computing&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Octave&lt;br /&gt;
| 3.4.3&lt;br /&gt;
| Matlab-like environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.mcs.anl.gov/petsc/petsc-as/  PETSc ]&lt;br /&gt;
| 3.1*&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation (PETSc)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc, etc.. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Armadillo C++ linear algebra library | Armadillo]]&lt;br /&gt;
| 3.910.0&lt;br /&gt;
| C++ armadillo libraries (implement Matlab-like syntax)&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.tacc.utexas.edu/tacc-projects/gotoblas2/ GotoBLAS]&lt;br /&gt;
| 1.13&lt;br /&gt;
| Optimized BLAS implementation &lt;br /&gt;
| &amp;lt;tt&amp;gt;libgoto2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gotoblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13*, 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.3&lt;br /&gt;
| fast Fourier transform library&lt;br /&gt;
''Be careful in combining fftw3 and MKL: you need to link fftw3 first, with'' &amp;lt;tt&amp;gt;-L${SCINET_FFTW_LIB} -lfftw3&amp;lt;/tt&amp;gt;, then link MKL&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| LAPACK&lt;br /&gt;
| &lt;br /&gt;
| Provided by the Intel MKL library&lt;br /&gt;
| See http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://freshmeat.net/projects/rlog  RLog ]&lt;br /&gt;
| 1.4&lt;br /&gt;
| RLog provides a flexible message logging facility for C++ programs and libraries.&lt;br /&gt;
| &amp;lt;tt&amp;gt;librlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/rlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[GNU Parallel]]&lt;br /&gt;
| 2012-10-22&lt;br /&gt;
| execute commands in parallel&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.3.4&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ruby&lt;br /&gt;
| 1.9.1&lt;br /&gt;
| Ruby programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 1.6.0&lt;br /&gt;
| IBM's Java JRE and SDK&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| Xlibraries&lt;br /&gt;
|&lt;br /&gt;
| A collection of X graphics libraries and tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;xterm&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xpdf&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;Xlibraries&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Extras&lt;br /&gt;
|&lt;br /&gt;
| A collection of standard linux and home-grown tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;bc&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;screen&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xxdiff&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;ish&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
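&lt;br /&gt;
The FFTW entry in the table above warns about combining fftw3 with MKL: the fftw3 library must come first on the link line. A sketch of such a link line (the object file is a placeholder; consult Intel's MKL link line advisor for the exact MKL flags for your case):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ icc myprog.o -L${SCINET_FFTW_LIB} -lfftw3 -mkl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;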
&lt;br /&gt;
=== TCS Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM compilers&lt;br /&gt;
|10.1(c/c++)&amp;lt;br&amp;gt;12.1(fortran)&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlf,xlc_r,xlC_r,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''standard available''&lt;br /&gt;
|-&lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpcc,mpCC,mpxlf,mpcc_r,mpCC_r,mpxlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''standard available''&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 1.2&lt;br /&gt;
| Unified Parallel C&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlupc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|13.1, 14.1&lt;br /&gt;
| newer versions &lt;br /&gt;
| &amp;lt;tt&amp;gt;xlf,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| xlf/13.1&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|11.1, 12.1&lt;br /&gt;
| new versions&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| vacpp&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| MPE2&lt;br /&gt;
| 1.0.6&lt;br /&gt;
| Performance Visualization for Parallel Programs   &lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scalasca&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.5&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc, hdiff, ..., libdf, libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF + ncview&lt;br /&gt;
| 4.0.1*&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf, ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.1.1*&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 3.9.6*&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted&amp;lt;/tt&amp;gt;, etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Fast Fourier transform library&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi,libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK+SCALAPACK&lt;br /&gt;
| 3.4.2+2.0.2&lt;br /&gt;
| Linear algebra package. Note that essl, which comes with the IBM compilers, contains a large part of lapack as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack,libscalapack,libblacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PetSc&lt;br /&gt;
| 3.2&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation. With external packages mumps, chaco, hypre, parmetis, prometheus, plapack, superlu, sprng.&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of libraries to your user environment&amp;lt;br&amp;gt; compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi, libfftw3, libhdf5, liblapack, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gmake&lt;br /&gt;
| 3.82&lt;br /&gt;
| GNU's make. Replaces AIX make or gmake 3.80.&lt;br /&gt;
| &amp;lt;tt&amp;gt;make&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCL&lt;br /&gt;
| 5.1.1&lt;br /&gt;
| NCAR Command Language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl, libncl, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
&lt;br /&gt;
=== P7 Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1, 13.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlf,xlf_r,xlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1, 11.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.23.2&lt;br /&gt;
| &lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|5.2.2&lt;br /&gt;
|IBM's Parallel Environment&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpcc,mpCC,mpfort,mpiexec&amp;lt;/tt&amp;gt;&lt;br /&gt;
|pe&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.6.1, 4.8.1&lt;br /&gt;
| GNU Compiler Collection&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 7.0&lt;br /&gt;
| IBM Java 1.7 implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;jdk&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.7&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap2, ncatted, &amp;lt;/tt&amp;gt; etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.5&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0 , scipy-0.13.2 , matplotlib-1.3.1 , pyfits-3.2 , h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| command-driven interactive function and data plotting program&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| udunits&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of applications and libraries to your user environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;bindlaunch, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=Manuals=&lt;br /&gt;
{{:Manuals}}&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6846</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6846"/>
		<updated>2014-02-13T17:15:45Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Batch Jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) with 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8192 cores), with 2 midplanes per rack, or 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS; these manage the compute nodes and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8192-core development system, and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
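&lt;br /&gt;
For example, assuming the &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt; module is loaded, a compile-and-link line could look as follows (the source file name is only an illustration):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mpixlc -O3 -qarch=qp -qtune=qp -I${SCINET_HDF5_INC} mycode.c -o mycode -L${SCINET_HDF5_LIB} -lhdf5&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;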
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; on the BGQ, however, it is also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimal job-size configurations, further constrained by the fact that each block requires a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane) the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but then they share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which for those familiar with MPI is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; this can be helpful in debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than 1 process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). Using fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
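&lt;br /&gt;
For example, with &amp;lt;tt&amp;gt;bg_size=64&amp;lt;/tt&amp;gt; and 32 ranks per node, the total number of processes is 64 &amp;amp;times; 32 = 2048, so the matching invocation would be along these lines:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 2048 --ranks-per-node=32 --envs OMP_NUM_THREADS=2 --cwd=$SCRATCH/ : $HOME/mycode.exe&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(with 32 ranks &amp;amp;times; 2 OpenMP threads = 64 hardware threads per node, satisfying the constraint above).&lt;br /&gt;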
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-12AM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize the cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The parameter &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024, or 2048.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
ranks-per-node &amp;amp;le; np&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users normally only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.   A &lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' command currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of requiring you to attach a gdb tool to each process by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of mpi processes to run.  Most configure scripts expect only one mpi process, thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. However, this needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job you are trying to run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which will return the appropriate $SHAPE argument and an array of 16 starting ${CORNER} values. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case: 16 sub-blocks of 4 compute nodes each (64 nodes total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. One needs to consider that these sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that, if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the bgq development and production systems, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc) - except through HPSS - nor are the other systems' file systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems. For instance, it can report how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or generate plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members of the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: report as another user in your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
|&amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included: numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6800</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6800"/>
		<updated>2014-01-24T17:58:28Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Batch Jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6GHz PowerPC-based CPU (PowerPC A2) with 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8192 cores), and there are 2 midplanes per rack, or 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the filesystem.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines, including most of the compilers, you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
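&lt;br /&gt;
For example, to compile and link a C code against the GSL library (a sketch only; check the exact module and variable names on the system with &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;env | grep SCINET&amp;lt;/tt&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load gsl&lt;br /&gt;
$ bgxlc mycode.c -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} -lgsl -lgslcblas -o mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;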
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses the IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for Fortran, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries, but on the BGQ it is now also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and Fortran compilers, respectively.&lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
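&lt;br /&gt;
Putting this together, a typical MPI compile line might look like the following (a sketch, assuming the '''mpich2''' module is loaded and a hypothetical source file name):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mpixlc -O3 -qarch=qp -qtune=qp mycode.c -o mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;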
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); such jobs are referred to as sub-block jobs and are described in more detail below.&lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; higher verbosity levels can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more ranks than there are cores on a node, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
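&lt;br /&gt;
As an illustration of this arithmetic (a sketch only, with a hypothetical executable name): with bg_size=64 and 8 ranks per node, --np is 8x64=512, and each rank can then use 8 OpenMP threads, since (ranks-per-node * OMP_NUM_THREADS) = 64 fills the hardware threads of each node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 512 --ranks-per-node=8 --envs OMP_NUM_THREADS=8 --cwd=$PWD : $PWD/mycode.exe&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;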
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM to 12AM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the currently available resources on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The parameter &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024, or 2048.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;np&amp;lt;/span&amp;gt; &amp;amp;le; ranks-per-node * bg_size&lt;br /&gt;
&lt;br /&gt;
(ranks-per-node * OMP_NUM_THREADS ) &amp;amp;le; 64 &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.&lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs,&lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.  A&lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block and set all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on&lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job (instead of requiring you to attach a gdb tool to each process by hand, as explained in the BGQ Application Development guide linked below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;, and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter variable sets the number of MPI processes to run.  Most configure scripts expect only one MPI process; thus &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
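&lt;br /&gt;
A configure-and-build sequence inside a debugjob session might then look like the following (a sketch with a hypothetical package; the exact configure flags depend on the package being built):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
[user@bgqdev-fen1]$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
[user@bgqdev-fen1]$ export RUNJOB_NP=1&lt;br /&gt;
[user@bgqdev-fen1]$ ./configure &amp;amp;&amp;amp; make&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;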
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. However, this needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the jobs being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE argument and an array of 16 starting corners in $CORNER. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e., similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case, 16 sub-blocks of 4 cnodes each (64 total, i.e., bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Qs. These sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also, if your jobs are so small that they have to run in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input or output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
The GPFS file system of the BGQ is shared between the bgq development and production systems, but, except for HPSS, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor are their file systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information on the home and scratch file systems in a number of ways. For instance, it can report how much disk space is being used by yourself and your group (with the -a option) or how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), and it can generate plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6799</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6799"/>
		<updated>2014-01-24T17:51:46Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Batch Jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) with 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8,192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes running a full Red Hat Linux OS; these manage the compute nodes and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the 'module' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc file to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; to the compile flags and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt; to the link flags, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
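As a sketch, using the gsl module as an example (the directory values shown here are purely illustrative; in practice &amp;lt;tt&amp;gt;module load gsl&amp;lt;/tt&amp;gt; sets the real paths), the extra flags can be assembled like this:&lt;br /&gt;

```shell
# Illustrative values; 'module load gsl' defines the real ones.
SCINET_GSL_INC=/scinet/bgq/gsl/include
SCINET_GSL_LIB=/scinet/bgq/gsl/lib
# Extra compile and link flags built from the module's variables:
CFLAGS="-I${SCINET_GSL_INC}"
LDFLAGS="-L${SCINET_GSL_LIB} -lgsl -lgslcblas"
# The resulting compile/link command (shown with echo for illustration):
echo bgxlc -O3 -qarch=qp -qtune=qp $CFLAGS -o mycode mycode.c $LDFLAGS
```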
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, further constrained by the fact that each block requires a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of MPI tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure MPI jobs, it is always advisable to give the number of ranks per node, because the default value of 1 may leave 15 cores on each node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a verbosity level from 1 to 7; higher levels can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than 1 process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
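As an illustration of this trade-off, the per-rank memory and thread counts follow directly from the 16 GB of RAM and 64 hardware threads per node. The sketch below only prints the resulting runjob line rather than executing it, and the executable name is a placeholder.&lt;br /&gt;

```shell
# Hybrid MPI/OpenMP sketch on a 64-node block: 8 ranks per node, each
# rank using 8 of the node's 64 hardware threads as OpenMP threads.
NODES=64
RANKS_PER_NODE=8
THREADS_PER_RANK=$(( 64 / RANKS_PER_NODE ))        # 64 hardware threads per node
MEM_PER_RANK_MB=$(( 16 * 1024 / RANKS_PER_NODE ))  # 16 GB of RAM per node
NP=$(( NODES * RANKS_PER_NODE ))
echo "each rank gets ${MEM_PER_RANK_MB} MB of memory"
# mycode.exe is a placeholder executable name
echo runjob --np $NP --ranks-per-node=$RANKS_PER_NODE \
     --envs OMP_NUM_THREADS=$THREADS_PER_RANK --cwd=\$SCRATCH : \$HOME/mycode.exe
```

With 16 ranks per node (pure MPI on the cores) each rank would get only 1 GB; at 64 ranks per node, only 256 MB, matching the limit noted above.&lt;br /&gt;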
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
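The relation can be checked with a line of shell arithmetic; the values below are illustrative only.&lt;br /&gt;

```shell
# --np must equal bg_size (number of nodes) times ranks-per-node.
BG_SIZE=64
RANKS_PER_NODE=16
NP=$(( BG_SIZE * RANKS_PER_NODE ))            # 1024 total MPI processes
# Doubling ranks-per-node on the same nodes doubles --np:
NP_DOUBLED=$(( BG_SIZE * RANKS_PER_NODE * 2 ))
echo "--np $NP   (doubled ranks-per-node: --np $NP_DOUBLED)"
```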
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-12AM every day, including weekends.  This block is accessed using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30 minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific keywords.  The keyword &amp;quot;bg_size&amp;quot; specifies the number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter &amp;lt;span style=&amp;quot;font-weight: bold;&amp;quot;&amp;gt;bg_size&amp;lt;/span&amp;gt; can only be equal to 64, 128, 256, 512, 1024 and 2048.&lt;br /&gt;
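If you generate submission scripts from your own tooling, a quick guard against an invalid value can be sketched as follows (this check is illustrative and not part of loadleveler itself):&lt;br /&gt;

```shell
# Check a proposed bg_size against the allowed block sizes on this system,
# and report the corresponding core count (16 cores per node).
BG_SIZE=64
case "$BG_SIZE" in
  64|128|256|512|1024|2048)
    CORES=$(( BG_SIZE * 16 ))
    echo "bg_size=$BG_SIZE ok ($CORES cores)" ;;
  *)
    echo "bg_size=$BG_SIZE is not a valid block size" >&2
    exit 1 ;;
esac
```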
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically more convenient when debugging and developing.  A &lt;br /&gt;
script has therefore been written that provides a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes, and when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' it runs in a dedicated reservation, as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of you attaching a gdb tool to each process by hand (as explained in the BGQ Application Development guide, linked below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter sets the number of MPI processes to run.  Most configure scripts expect only one MPI process, so &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
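For instance, preparing an autoconf-based build inside a debugjob session might look like the following sketch; the configure invocation is a hypothetical example, not taken from this page.&lt;br /&gt;

```shell
# Environment so that configure's small test programs launch on the BGQ
# automatically, without an explicit runjob invocation.
export BG_PGM_LAUNCHER=yes
export RUNJOB_NP=1            # configure expects a single MPI process
# ./configure CC=mpicc FC=mpif90    # hypothetical configure invocation
echo "BG_PGM_LAUNCHER=$BG_PGM_LAUNCHER RUNJOB_NP=$RUNJOB_NP"
```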
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner depends on the specific block details provided by loadleveler and on the shape and size of the job.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you are allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16 or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE argument and an array of 16 starting corners in $CORNER. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Using the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e., similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case, 16 sub-blocks of 4 cnodes each (64 total, i.e., bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. The sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this is an inefficient setup.  Also, if your jobs are small enough that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input or output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the bgq development and production systems, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), with the exception of HPSS; nor are the other SciNet file systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption cipher. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information about the home and scratch file systems in a number of ways: how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalpack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included: numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6798</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6798"/>
		<updated>2014-01-24T17:50:58Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Batch Jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6GHz PowerPC-based CPU (PowerPC A2) with 16GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt; to your compile and link commands, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
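For example, a compile-and-link line built from these variables might look as follows (a sketch only; the &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt; module and the paths shown are hypothetical placeholders for whatever &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; actually sets):&lt;br /&gt;

```shell
# Hypothetical sketch: combine the SCINET_* variables that a
# "module load gsl" would set into compile and link flags.
# The paths below are made-up placeholders for illustration only.
SCINET_GSL_INC=/scinet/bgq/Libraries/gsl/include
SCINET_GSL_LIB=/scinet/bgq/Libraries/gsl/lib
CFLAGS="-O3 -qarch=qp -qtune=qp -I${SCINET_GSL_INC}"
LDFLAGS="-L${SCINET_GSL_LIB} -lgsl -lgslcblas"
# The actual compile line on the devel nodes would then be:
echo "mpixlc $CFLAGS mycode.c $LDFLAGS -o mycode"
```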
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but these then share resources (network and I/O); such jobs are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; higher values give more output, which can be helpful in debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than 1 process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). Using fewer ranks per node, the hardware threads are instead used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also change the total number of mpi processes by the same factor.&lt;br /&gt;
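As a sketch, this consistency between bg_size, ranks-per-node, and --np amounts to simple shell arithmetic:&lt;br /&gt;

```shell
# Sketch: the --np argument must equal bg_size (nodes) times ranks-per-node.
BG_SIZE=64          # nodes requested via bg_size in the loadleveler script
RANKS_PER_NODE=16   # runjob's --ranks-per-node argument
NP=$((BG_SIZE * RANKS_PER_NODE))
echo "runjob --np $NP --ranks-per-node=$RANKS_PER_NODE"
```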
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system is 12 hours, and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8 AM to midnight every day, including weekends.  This block is accessed using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30 minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. For example, the command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
The parameter bg_size can only be equal to 64, 128, 256, 512, 1024, or 2048.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs; &lt;br /&gt;
however, an interactive session is typically beneficial when debugging and developing.   As such, a &lt;br /&gt;
script has been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30 minute session on 64 nodes, and when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' it runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job (instead of attaching a gdb tool to each by hand, as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, without the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of mpi processes to run.  Most configure scripts expect only one mpi process, thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
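A (hypothetical) configure step inside a debugjob session might then look as follows; the package name and configure flags are placeholders, not a real recipe:&lt;br /&gt;

```shell
# Hypothetical sketch of preparing a configure run inside a debugjob
# session; the package directory and flags below are placeholders.
export BG_PGM_LAUNCHER=yes   # let configure's test programs launch on the BGQ transparently
export RUNJOB_NP=1           # configure test programs expect a single MPI process
# cd some-package-1.0 && ./configure CC=mpixlc FC=mpixlf90 --prefix=$HOME/sw
```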
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
BGQ allows multiple applications to share the same block, which is referred to as sub-block jobs; however, this needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which will return the appropriate $SHAPE argument and an array of 16 starting corners ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Using subblocks script to set $SHAPE and array of ${CORNERS[n]}&lt;br /&gt;
# with the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case 16 sub-blocks of 4 cnodes each (64 total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 nodes each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. The sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the bgq development and production systems, &lt;br /&gt;
the file system is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), except through HPSS, nor are the other SciNet file systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), or how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;). You may also generate plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6797</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6797"/>
		<updated>2014-01-24T17:37:01Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* BACKFILL scheduling */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes= 2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node with a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8,192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the filesystem.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM XL C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of their library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the library binaries, and the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
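&lt;br /&gt;
As a concrete sketch (the module name &amp;lt;tt&amp;gt;foo&amp;lt;/tt&amp;gt; and the paths below are made up for illustration; a real &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; would set the variables for you):&lt;br /&gt;

```shell
# Hypothetical example: compiling against a module-provided library "foo".
# "module load foo" would normally set these variables; the values below
# are stand-ins so the sketch is self-contained.
SCINET_FOO_INC=/scinet/bgq/foo/include
SCINET_FOO_LIB=/scinet/bgq/foo/lib
# Assemble the compile/link line following the naming convention above.
echo "mpixlc -I${SCINET_FOO_INC} -L${SCINET_FOO_LIB} code.c -lfoo"
```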
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries, though on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
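&lt;br /&gt;
Putting the pieces together, a typical compile line looks like the following (the source and output file names are hypothetical):&lt;br /&gt;

```shell
# Sketch of a BGQ compile line: the mpich2 wrapper plus the recommended flags.
# "mycode.c" is a made-up file name used only for illustration.
CC=mpixlc
CFLAGS="-O3 -qarch=qp -qtune=qp"
echo "$CC $CFLAGS -o mycode mycode.c"
```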
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimal job sizes, further constrained by the requirement that each block have at least one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane) the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, giving the job fully dedicated resources.  Smaller jobs can be run within the same block, but then they share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # ranges from 1 to 7; higher values can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
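&lt;br /&gt;
This bookkeeping can be sketched in shell arithmetic (the variable names are illustrative; a hybrid run with 4 ranks per node is assumed):&lt;br /&gt;

```shell
# Sanity-check the runjob arguments before submitting:
# --np must equal bg_size (nodes) times ranks-per-node.
bg_size=64           # nodes requested via bg_size in the loadleveler script
ranks_per_node=4     # hybrid MPI/OpenMP: 4 MPI ranks per node
threads_per_rank=$((64 / ranks_per_node))   # fill all 64 hardware threads
np=$((bg_size * ranks_per_node))
echo "runjob --np $np --ranks-per-node=$ranks_per_node --envs OMP_NUM_THREADS=$threads_per_rank"
```

Doubling ranks_per_node here doubles the computed --np for the same bg_size, as the text above requires.&lt;br /&gt;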
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8 AM to midnight every day, including weekends.  This block is accessed using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command &amp;lt;span style=&amp;quot;color: red;font-weight: bold;&amp;quot;&amp;gt;llAvailableResources&amp;lt;/span&amp;gt; gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The keyword &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs; &lt;br /&gt;
an interactive session, however, is typically beneficial when debugging and developing.   A &lt;br /&gt;
script has therefore been written to provide a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job instead of requiring you to attach a gdbtool by hand (as explained in the BGQ Application Development guide, linked below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;, and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of mpi processes to run.  Most configure scripts expect only one mpi process, thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which will set the appropriate $SHAPE and an array of 16 starting corners in ${CORNER}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Using the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case, 16 sub-blocks of 4 cnodes each (64 total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are small enough to require sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500 TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the bgq development and production systems, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), except via HPSS; nor is the other file system mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information on the home and scratch file systems in a number of ways: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6796</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6796"/>
		<updated>2014-01-24T17:31:17Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* BACKFILL scheduling */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node with a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8,192 cores), with 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together by a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes running a full Red Hat Linux OS, which manage the compute nodes and mount the file system.  SciNet has two BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines (including most of the compilers), you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI versions, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc file to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
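&lt;br /&gt;
For instance, to compile and link a code against GSL, one could write something like the following (a sketch only; it assumes the gsl and mpich2 modules are loaded, and uses the compiler flags recommended below):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpixlc -O3 -qarch=qp -qtune=qp -I${SCINET_GSL_INC} mycode.c -o mycode -L${SCINET_GSL_LIB} -lgsl -lgslcblas&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;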
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. The compilers produce&lt;br /&gt;
static binaries by default; however, on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e., '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, further constrained by the fact that each block requires a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane) this makes 64 nodes (1024 cores) the smallest block size. Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; this can be helpful in debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (at 64 ranks per node, each rank has only 256 MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
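&lt;br /&gt;
For example, with bg_size=64 in the loadleveler script, a hybrid run with 8 ranks per node (and 8 OpenMP threads per rank, filling the 64 hardware threads of each node) would be launched as follows (a sketch; substitute your own executable and flags):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# bg_size = 64 nodes, 8 ranks per node: --np = 64 x 8 = 512&lt;br /&gt;
runjob --np 512 --ranks-per-node=8 --envs OMP_NUM_THREADS=8 --cwd=$PWD : $PWD/code&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;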
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-12AM every day, including weekends.  This block is accessed using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command '''&amp;lt;tt&amp;gt;llAvailableResources&amp;lt;/tt&amp;gt;''' gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is specified in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs; &lt;br /&gt;
an interactive session, however, is typically beneficial when debugging and developing.   A &lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes, and when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job (instead of attaching a gdb tool to each process by hand, as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.  Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of MPI processes to run.  Most configure scripts expect only one MPI process; thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
BGQ allows multiple applications to share the same block; this is referred to as sub-block jobs. It needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job.&lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which will return the appropriate $SHAPE argument and an array of 16 starting corners in $CORNER.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size).&lt;br /&gt;
&lt;br /&gt;
# In this case, 16 sub-blocks of 4 cnodes each (64 total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500 TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).&lt;br /&gt;
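As a small illustration of this naming convention (the user and group names below are made up, and we assume the single-letter component G is the group's initial):&lt;br /&gt;

```shell
# Hypothetical user 'jdoe' in hypothetical group 'abc'; the single-letter
# path component is assumed to be the group's initial.
GROUP=abc
USER_NAME=jdoe
INITIAL=$(printf '%.1s' "$GROUP")
HOME_DIR=/home/$INITIAL/$GROUP/$USER_NAME
SCRATCH_DIR=/scratch/$INITIAL/$GROUP/$USER_NAME
echo "$HOME_DIR"     # /home/a/abc/jdoe
echo "$SCRATCH_DIR"  # /scratch/a/abc/jdoe
```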
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the BGQ development and production systems, it is,&lt;br /&gt;
except for HPSS, '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor are the other file systems mounted on the BGQ.&lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption cipher. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85 MB/s).  This encryption method is only recommended for copying between the BGQ file system and the regular SciNet GPFS file system.&lt;br /&gt;
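To get a rough feeling for what that rate means in practice, here is a back-of-the-envelope estimate (the file size is made up):&lt;br /&gt;

```shell
# Time to transfer a hypothetical 10 GB (10240 MB) file at ~85 MB/s.
SIZE_MB=10240
RATE_MB_PER_S=85
echo "$((SIZE_MB / RATE_MB_PER_S)) seconds"  # about 120 seconds (2 minutes)
```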
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usage of all members of the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: report as another user in your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalpack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core (each component is numbered from zero).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--!&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6795</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6795"/>
		<updated>2014-01-24T16:29:04Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* BACKFILL scheduling */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer, built around a system-on-a-chip compute node with a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8192 cores), with 2 midplanes per rack, or 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the file system.  SciNet has two BGQ systems: a half-rack 8192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
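The per-rack numbers quoted above follow directly from the hardware hierarchy; as a quick sanity check:&lt;br /&gt;

```shell
# 16 cores/node x 32 nodes/board x 16 boards/midplane x 2 midplanes/rack.
CORES_PER_NODE=16
NODES_PER_BOARD=32
BOARDS_PER_MIDPLANE=16
MIDPLANES_PER_RACK=2
CORES_PER_BOARD=$((CORES_PER_NODE * NODES_PER_BOARD))
CORES_PER_MIDPLANE=$((CORES_PER_BOARD * BOARDS_PER_MIDPLANE))
CORES_PER_RACK=$((CORES_PER_MIDPLANE * MIDPLANES_PER_RACK))
echo "$CORES_PER_BOARD $CORES_PER_MIDPLANE $CORES_PER_RACK"  # 512 8192 16384
```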
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
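Note that in each row the product of the five torus dimensions equals the number of compute nodes; for the full two-rack system, for example:&lt;br /&gt;

```shell
# 4x4x8x8x2 torus over 64 node boards (2 racks):
echo $((4 * 4 * 8 * 8 * 2))  # 2048 compute nodes
```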
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines (including most of the compilers), you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI versions, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.&lt;br /&gt;
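For instance, a compile-and-link line following this convention could look as sketched below. This is illustrative only: the module name (gsl) and paths are placeholders, and in practice the variables are set for you by the corresponding module load command.&lt;br /&gt;

```shell
# Placeholder values; 'module load gsl' would set the real ones.
SCINET_GSL_INC=/path/to/gsl/include
SCINET_GSL_LIB=/path/to/gsl/lib
# Compile line following the -I/-L/-l convention described above
# (printed here rather than run, since the cross-compilers only exist on the BGQ):
echo "mpixlc mycode.c -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} -lgsl -o mycode"
```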
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses the IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries, but on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.&lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, which are further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but this results in shared resources (network and I/O); such jobs are referred to as sub-block jobs and are described in more detail below.&lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which for those familiar with MPI is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set up for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure MPI jobs, it is advisable to always specify the number of ranks per node, because the default value of 1 would leave 15 cores on each node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a verbosity level from 1 to 7; increased verbosity can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than 1 process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
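As a sketch of a hybrid launch (the node count, ranks-per-node value, and executable name are illustrative assumptions, not a prescribed configuration), the OpenMP thread count per rank follows from the 64 hardware threads per node:&lt;br /&gt;

```shell
# Hypothetical hybrid MPI/OpenMP launch: 64 nodes, 8 ranks per node,
# with the remaining hardware threads used as OpenMP threads per rank.
NODES=64                   # bg_size in the loadleveler script
RPN=8                      # --ranks-per-node
THREADS=$(( 64 / RPN ))    # OpenMP threads per rank (64 hardware threads/node)
NP=$(( NODES * RPN ))      # total MPI ranks
echo "runjob --np $NP --ranks-per-node=$RPN --envs OMP_NUM_THREADS=$THREADS : ./mycode.exe"
```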
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
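That consistency check can be sketched in shell (the particular values here are illustrative assumptions):&lt;br /&gt;

```shell
# --np must equal bg_size times ranks-per-node; the values are examples.
BG_SIZE=64          # number of nodes, from the loadleveler script
RANKS_PER_NODE=16
NP=$(( BG_SIZE * RANKS_PER_NODE ))
echo $NP            # total MPI processes to pass to runjob --np
```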
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM to 12AM every day, including weekends.  This block is accessed using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize the cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command '''llAvailableResources''' reports, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
On the Devel system : only a debugjob can start immediately&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 512 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 256 nodes requesting a walltime T &amp;lt;= 21 hours and 11 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands are not run on a BGQ compute node but on the front-end node. Only programs started with runjob run on the BGQ compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs; &lt;br /&gt;
an interactive session, however, is typically beneficial when debugging and developing.  A &lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of requiring you to attach a gdb tool to each process by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, without the need for the runjob command, provided you set the following environment variables: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter variable sets the number of MPI processes to run.  Most configure scripts expect only one MPI process; thus &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
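Under those settings, a configure-based build inside the debugjob session might look like the following sketch (the compiler wrapper choice is an assumption for illustration, not a tested recipe for any particular package):&lt;br /&gt;

```shell
# Inside a debugjob session:
export BG_PGM_LAUNCHER=yes   # compiled test programs auto-launch on the BGQ
export RUNJOB_NP=1           # configure's test programs run as one MPI process
./configure CC=mpixlc FC=mpixlf90
make
```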
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. This must be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner depends on the specific block allocated by loadleveler and on the shape and size of the job to be run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you are allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script that calls &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE argument and an array of 16 starting corners ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case: 16 sub-blocks of 4 cnodes each (64 in total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for job input and output data; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
The GPFS file system of the BGQ is shared between the BGQ development and production systems but, except for HPSS, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor are the other file systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption cipher. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This cipher is only recommended for copying between the BGQ file system and the regular SciNet GPFS file system. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information on the home and scratch file systems in a number of ways: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?| [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalpack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--!&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6794</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6794"/>
		<updated>2014-01-24T15:32:53Z</updated>

		<summary type="html">&lt;p&gt;Brelier: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6GHz PowerPC-based CPU (PowerPC A2) with 16GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes running a full Red Hat Linux OS that manage the compute nodes and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
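As a quick sanity check (an illustrative shell sketch, not part of any BGQ tooling), the product of the five torus dimensions in each row of the table equals the number of compute nodes in the block:&lt;br /&gt;

```shell
# Multiply out a torus shape such as 4x4x8x8x2 and report nodes and cores.
# The shape string and the 16-cores-per-node figure come from the table above.
dims="4x4x8x8x2"
nodes=1
for d in $(echo "$dims" | tr 'x' ' '); do
  nodes=$((nodes * d))
done
cores=$((nodes * 16))
echo "$dims -> $nodes nodes, $cores cores"   # 4x4x8x8x2 -> 2048 nodes, 32768 cores
```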
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines, including most of the compilers, you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package; &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
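For example, a compile line linking against the GSL library might look like this (a sketch only: the program name and exact library list are illustrative, and the mpixlc wrapper comes from the mpich2 module described under Compilers below):&lt;br /&gt;

```
module load mpich2 gsl
# -I and -L point the cross-compiler at the module's header and library locations
mpixlc -O3 -qarch=qp -qtune=qp mycode.c \
    -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} -lgsl -lgslcblas -o mycode
```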
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package whose configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations; these are further constrained by the fact that each block requires a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), this makes 64 nodes (1024 cores) the smallest block size. Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but then resources (network and I/O) are shared; such jobs are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; this can be helpful in debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (at 64 ranks per node, each rank has only 256MB of memory).&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
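This bookkeeping can be sketched in shell arithmetic (the variable names here are illustrative, not runjob options):&lt;br /&gt;

```shell
BG_SIZE=64           # nodes, as set by bg_size in the loadleveler script
RANKS_PER_NODE=16    # must be one of 1, 2, 4, 8, 16, 32, 64
NP=$((BG_SIZE * RANKS_PER_NODE))            # the --np argument: 1024
MEM_PER_RANK=$((16384 / RANKS_PER_NODE))    # MB of the 16 GB/node per rank: 1024
echo "runjob --np $NP --ranks-per-node=$RANKS_PER_NODE : ./mycode"
# at RANKS_PER_NODE=64 the same node would leave only 16384/64 = 256 MB per rank
```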
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-12AM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30 minute maximum wall_clock_limit. The purpose of this reservation is to ensure short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== BACKFILL scheduling ===&lt;br /&gt;
To optimize cluster usage, we encourage users to submit jobs according to the available resources on the BGQ. The command llAvailableResources gives, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
The Devel system is full&lt;br /&gt;
&lt;br /&gt;
On the Prod. system : a job will start immediately if you use 128 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min &lt;br /&gt;
On the Prod. system : a job will start immediately if you use 64 nodes requesting a walltime T &amp;lt;= 24 hours and 0 min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The keyword &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the Blue Gene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users, however, only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
whereas an interactive session is typically beneficial when debugging and developing.   For this reason, a &lt;br /&gt;
script has been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes, and when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' it runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of requiring you to attach a gdb tool to each process by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within a debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of mpi processes to run.  Most configure scripts expect only one mpi process; thus &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
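Putting the above together, a hypothetical configure-and-build session inside a debugjob might look like this (the package, compiler choices, and install prefix are illustrative):&lt;br /&gt;

```
[user@bgqdev-fen1]$ debugjob
$ module load mpich2
$ export BG_PGM_LAUNCHER=yes
$ export RUNJOB_NP=1
$ ./configure CC=mpixlc FC=mpixlf90 --prefix=$HOME/mylib
$ make
$ make install
$ exit
```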
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; such runs are referred to as sub-block jobs.  This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which will set the appropriate $SHAPE argument and an array of 16 starting corners, $CORNER. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Using subblocks script to set $SHAPE and array of ${CORNERS[n]}&lt;br /&gt;
# with the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case 16 sub-blocks of 4 cnodes each (64 total ie bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Qs. Keep in mind that these sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the BGQ development and production systems, &lt;br /&gt;
the file system is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), except through HPSS, nor are their file systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data centre, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
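As a rough guide, you can estimate transfer times from the ~85MB/s arcfour rate quoted above (an illustrative calculation only; actual rates vary):&lt;br /&gt;

```shell
SIZE_MB=10240    # e.g. a 10 GB tarball (illustrative)
RATE_MB_S=85     # approximate arcfour scp throughput
SECONDS_EST=$((SIZE_MB / RATE_MB_S))
echo "about $SECONDS_EST seconds"   # about 120 seconds
```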
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?| [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM C/C++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes, you can use bg_console or the web-based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core (the numbering is zero-based).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=6783</id>
		<title>Software and Libraries</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Software_and_Libraries&amp;diff=6783"/>
		<updated>2014-01-22T15:53:26Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* P7 Software */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Software Module System =&lt;br /&gt;
All the software listed on this page is accessed using a modules system.  This means that much of the software is not &lt;br /&gt;
accessible by default but has to be loaded using the module command. The&lt;br /&gt;
reasons are that&lt;br /&gt;
* it allows us to easily keep multiple versions of software for different users on the system;&lt;br /&gt;
* it allows users to easily switch between versions.&lt;br /&gt;
The module system works similarly on the GPC and the TCS, although different modules are installed on these two systems.&lt;br /&gt;
&lt;br /&gt;
Note that, generally, if you compile a program with a module loaded, you will have to run it with that same module loaded, to make dynamically linked libraries accessible.&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
!{{Hl2}}|Function&lt;br /&gt;
!{{Hl2}}|Command&lt;br /&gt;
!{{Hl2}}|Comments&lt;br /&gt;
|-&lt;br /&gt;
|List available software packages:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If a module is not listed here, it is not supported.&lt;br /&gt;
*The flag &amp;quot;(default)&amp;quot; is never part of the name.&lt;br /&gt;
|-&lt;br /&gt;
|Use particular software:&lt;br /&gt;
|&amp;lt;pre&amp;gt; $ module load [module-name] &amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
*If possible, specify only the short name (the part before the &amp;quot;/&amp;quot;). &lt;br /&gt;
*When ambiguous, this loads the default one. &lt;br /&gt;
|-&lt;br /&gt;
|List available versions of a specific software package:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module avail [short-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|List currently loaded modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module list&amp;lt;/pre&amp;gt;&lt;br /&gt;
|For reproducibility, it is a good idea to put this in your job scripts, so you know exactly which modules (and versions) were used.&lt;br /&gt;
|-&lt;br /&gt;
|Get description of a particular module:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module help [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove a module from your shell:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module unload [module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Remove all modules:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module purge&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Replace one loaded module with another:&lt;br /&gt;
|&amp;lt;pre&amp;gt;$ module switch [old-module-name] [new-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
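As a sketch of how these commands fit together in practice, the following shell fragment mimics a typical job-script preamble. The real &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command only exists on the clusters, so a stub function stands in here purely for illustration, and the module names are hypothetical.&lt;br /&gt;

```shell
# Stub standing in for the real "module" command, for illustration only;
# on GPC/TCS the real command is provided by the modules system.
module() { echo "module $*"; }

# Typical job-script preamble: load what you need, then record it.
module load intel
module load openmpi
module list    # record loaded modules (+versions) for reproducibility
```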
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SCINET_[short-module-name]_BASE&lt;br /&gt;
SCINET_[short-module-name]_LIB&lt;br /&gt;
SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[short-module-name]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[short-module-name]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
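For instance, a compile-and-link line using these variables might look as follows. This is only a sketch: the module name (gsl), the placeholder paths, and the source file name are hypothetical; on the real system the &amp;lt;tt&amp;gt;SCINET_GSL_*&amp;lt;/tt&amp;gt; variables would be set for you by &amp;lt;tt&amp;gt;module load gsl&amp;lt;/tt&amp;gt;.&lt;br /&gt;

```shell
# Placeholder values; on the cluster, "module load gsl" sets these for you.
SCINET_GSL_BASE=/scinet/gpc/Libraries/gsl-1.15
SCINET_GSL_INC=${SCINET_GSL_BASE}/include
SCINET_GSL_LIB=${SCINET_GSL_BASE}/lib

# Compose the compile/link line: -I for the headers, -L for the libraries,
# plus the usual -l flags for the library itself.
CMD="icc -O2 -I${SCINET_GSL_INC} myprog.c -L${SCINET_GSL_LIB} -lgsl -lgslcblas -o myprog"
echo "$CMD"
```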
&lt;br /&gt;
Errors in loaded modules can arise for a few reasons, for instance:&lt;br /&gt;
* A module by that name may not exist.&lt;br /&gt;
* Some modules require other modules to have been loaded; if this requirement is not met when you try to load that module, an error message will be printed explaining what module is needed.&lt;br /&gt;
* Some modules cannot be loaded together: an error message will be printed explaining which modules conflict.&lt;br /&gt;
&lt;br /&gt;
It is no longer recommended to load modules in the file [[Important_.bashrc_guidelines|.bashrc]] in your home directory; rather, load them explicitly on the command line and in your job scripts.&lt;br /&gt;
&lt;br /&gt;
== Default and non-default modules ==&lt;br /&gt;
&lt;br /&gt;
When you load a module with its 'short' name, you will get the ''default'' version, which is usually the most recent, recommended version of that library or piece of software.  In general, using the short module name is the way to go. However, you may have code that depends on the intricacies of a non-default version.  For that reason, the most common older versions are also available as modules.  You can find all available modules using the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
== Naming convention ==&lt;br /&gt;
&lt;br /&gt;
For modules that access applications, the full name of a module is as follows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  [short-module-name]/[version-number]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To have all modules conform to this convention, a number of modules' names changed on Nov 3, 2010:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''old name'''&lt;br /&gt;
| '''new name'''&lt;br /&gt;
| '''remarks'''&lt;br /&gt;
|-&lt;br /&gt;
|autoconf/autoconf-2.64 &amp;amp;nbsp; &amp;amp;nbsp;&amp;amp;nbsp;&lt;br /&gt;
|autoconf/2.64&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.0           &lt;br /&gt;
|cuda/3.0&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|cuda/cuda-3.1          &lt;br /&gt;
|cuda/3.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/ddd-3.3.12   &lt;br /&gt;
|ddd/3.3.12&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|debuggers/gdb-7.1       &lt;br /&gt;
|gdb/7.1&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|editors/nano/2.2.4      &lt;br /&gt;
|nano/2.2.4&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|emacs/emacs-23.1        &lt;br /&gt;
|emacs/23.1.1&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|gcc/gcc-4.4.0           &lt;br /&gt;
|gcc/4.4.0&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|graphics/ncview         &lt;br /&gt;
|ncview/1.93&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|graphics/graphics       &lt;br /&gt;
|grace/5.1.22&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|                        &lt;br /&gt;
|gnuplot/4.2.6&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|svn/svn165              &lt;br /&gt;
|svn/1.6.5&lt;br /&gt;
|''short name unchanged''&lt;br /&gt;
|-&lt;br /&gt;
|visualization/paraview  &lt;br /&gt;
|paraview/3.8&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|amber10/amber10         &lt;br /&gt;
|amber/10.0.30 &lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|gamess/gamess           &lt;br /&gt;
|gamess/May2209 &amp;amp;nbsp;&lt;br /&gt;
|''default's short name unchanged''&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==modulefind - Finding modules by name==&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command will only show you modules whose names start with the argument that you give it, and will also return modules that you cannot load due to conflicts with already loaded modules.&lt;br /&gt;
&lt;br /&gt;
A little SciNet utility called &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt; (one word) searches more flexibly. It will list all installed modules whose names contain the argument, and will determine whether those modules have &lt;br /&gt;
been loaded, could be loaded, cannot be loaded because of conflicts with&lt;br /&gt;
already loaded modules, or have unresolved dependencies &lt;br /&gt;
(i.e. for which other modules need to be loaded first).  This is especially useful in cases like the &amp;quot;boost&amp;quot; libraries, whose module names are cxxlibraries/boost/1.47.0-gcc and cxxlibraries/boost/1.47.0-intel, for the gcc and intel compilers, respectively.  &amp;lt;tt&amp;gt;modulefind boost&amp;lt;/tt&amp;gt; will find those, whereas &amp;lt;tt&amp;gt;module avail boost&amp;lt;/tt&amp;gt; will not.&lt;br /&gt;
&lt;br /&gt;
Note that just 'modulefind' will list all top-level modules.&lt;br /&gt;
&lt;br /&gt;
== Making your own modules ==&lt;br /&gt;
&lt;br /&gt;
Making your own modules (e.g. for local installations, or to access optional perl modules) is possible, and is described on the [[Installing your own modules]] page.&lt;br /&gt;
&lt;br /&gt;
== Deprecated modules ==&lt;br /&gt;
&lt;br /&gt;
Some older software modules for which newer versions exist get deprecated, which means they are no longer maintained.  Since deprecated modules should only be needed in rare exceptional cases, they are not listed by the &amp;lt;tt&amp;gt;module avail&amp;lt;/tt&amp;gt; command.  However, if you have a piece of legacy code that really depends on a deprecated version of a library (and we urge you to check that it does not work with newer versions!), then you can load a deprecated version with &amp;lt;pre&amp;gt;module load use.deprecated [deprecated-module-name]&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Currently (Oct 5,2010), the following modules are deprecated on the GPC: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gcc/gcc-4.3.2          hdf5/184-v16-serial     intel/intel-v11.1.046               openmpi/1.3.3-intel-v11.0-ofed&lt;br /&gt;
hdf5/183-v16-openmpi   hdf5/184-v18-intelmpi   intelmpi/impi-3.2.1.009             openmpi/1.3.2-intel-v11.0-ofed.orig&lt;br /&gt;
hdf5/183-v18-openmpi   hdf5/184-v18-openmpi    intelmpi/impi-3.2.2.006             pgplot/5.2.2-gcc.old            &lt;br /&gt;
hdf5/184-v16-intelmpi  hdf5/184-v18-serial     intelmpi/impi-4.0.0.013             pgplot/5.2.2-intel.old&lt;br /&gt;
hdf5/184-v16-openmpi   intel/intel-v11.0.081   intelmpi/impi-4.0.0.025               &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the TCS, currently (Oct 5,2010) the only deprecated module is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ncl/5.1.1old&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Before using any of these deprecated modules, make sure that there is not a regular module that satisfies your needs, likely by a ''very similar name''.&lt;br /&gt;
&lt;br /&gt;
== Commercial software ==&lt;br /&gt;
&lt;br /&gt;
Apart from the compilers on our systems and the ddt parallel debugger, we generally do not provide licensed application software, e.g., no Gaussian, IDL, Matlab, etc. &lt;br /&gt;
See the [https://support.scinet.utoronto.ca/wiki/index.php/FAQ#How_can_I_run_Matlab_.2F_IDL_.2F_Gaussian_.2F_my_favourite_commercial_software_at_SciNet.3F FAQ].&lt;br /&gt;
&lt;br /&gt;
== Other software and libraries ==&lt;br /&gt;
&lt;br /&gt;
If you want to use a piece of software or a library that is not on the list, you can in principle install it yourself in your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
Note however that building libraries and software from source often uses a lot of files. To avoid running out of disk space, building software is therefore best done in &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, from which&lt;br /&gt;
you can copy/install only the libraries, header files and binaries to your &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; directory.&lt;br /&gt;
&lt;br /&gt;
If you suspect that a particular piece of software or a library would be of use to other users of SciNet as well, contact us, and we will consider adding it to the system.&lt;br /&gt;
&lt;br /&gt;
== Software lists ==&lt;br /&gt;
=== ARC/GPU Software ===&lt;br /&gt;
&lt;br /&gt;
The CPUs in the GPU nodes of the ARC cluster are of the same kind as those of the GPC, so all modules available on the GPC are available on the GPU nodes with a CentOS 6 image. This means that the different cuda variants that are available as modules can be loaded on those GPC nodes as well, although they are of little use on that system.&lt;br /&gt;
&lt;br /&gt;
=== GPC Software ===&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Software  &lt;br /&gt;
!{{Hl2}}| Versions&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-  &lt;br /&gt;
|Intel Compiler&lt;br /&gt;
|12.1.3*, 12.1.5, 13.1.0&lt;br /&gt;
| includes MKL library, which includes BLAS, LAPACK, FFT, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;icpc,icc,ifort&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.6.1*, 4.7.0, 4.7.2&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Cuda&lt;br /&gt;
| 3.2, 4.0, 4.1*, 4.2&lt;br /&gt;
| NVIDIA's extension to C for GPGPU programming&lt;br /&gt;
| &amp;lt;tt&amp;gt;nvcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cuda&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| PGI Compiler&lt;br /&gt;
| 12.5&lt;br /&gt;
| supports OpenACC and CUDA Fortran &lt;br /&gt;
| &amp;lt;tt&amp;gt;pgcc,pgcpp,pgfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| IntelMPI&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| MPICH2 based MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;intelmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| OpenMPI&lt;br /&gt;
| 1.4.4*, 1.5.4&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpicc,mpiCC,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openmpi&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 2.12.2&lt;br /&gt;
| Berkeley Unified Parallel C Implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;upcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Editors'''''&lt;br /&gt;
|- &lt;br /&gt;
| Nano&lt;br /&gt;
| 2.2.4&lt;br /&gt;
| Nano's another editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nano&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Emacs&lt;br /&gt;
| 23.1.1&lt;br /&gt;
| New version of popular text editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;emacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| XEmacs&lt;br /&gt;
| 21.4.22&lt;br /&gt;
| XEmacs editor&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;xemacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Development tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| Autoconf&lt;br /&gt;
| 2.68&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;autoconf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Automake&lt;br /&gt;
| 1.11.2&lt;br /&gt;
|&lt;br /&gt;
| &amp;lt;tt&amp;gt;aclocal, automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;automake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CMake&lt;br /&gt;
| 2.8.6&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scons&lt;br /&gt;
| 2.0&lt;br /&gt;
| Software construction tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scons&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Git&lt;br /&gt;
| 1.7.1&lt;br /&gt;
| Revision control system&lt;br /&gt;
| &amp;lt;tt&amp;gt;git,gitk&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;git&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Intel tools&lt;br /&gt;
| 2011&lt;br /&gt;
| Intel Code Analysis Tools&lt;br /&gt;
| Vtune Amplifier XE, Inspector XE&lt;br /&gt;
| &amp;lt;tt&amp;gt;inteltools&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Mercurial&lt;br /&gt;
| 1.8.2&lt;br /&gt;
| Version control system&amp;lt;br&amp;gt;(part of the python module!)&lt;br /&gt;
| &amp;lt;tt&amp;gt;hg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug and performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool, + MAP MPI Profiler&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt, map&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| DDD&lt;br /&gt;
| 3.3.12&lt;br /&gt;
| Data Display Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GDB&lt;br /&gt;
| 7.3.1&lt;br /&gt;
| GNU debugger (the intel idbc debugger is available by default)&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| MPE2&lt;br /&gt;
| 2.4.5&lt;br /&gt;
| Multi-Processing Environment with intel + OpenMPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpecc, mpefc, jumpshot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#OpenSpeedShop_.28profiling.2C_MPI_tracing:_GPC.29 | OpenSpeedShop]]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| sampling and MPI tracing&lt;br /&gt;
| &amp;lt;tt&amp;gt;openss, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openspeedshop&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Introduction_To_Performance#Scalasca_.28profiling.2C_tracing:_TCS.2C_GPC.29 | Scalasca]]&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications (Compiled with OpenMPI)&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipm-hpc.sourceforge.net IPM]&lt;br /&gt;
| 0.983&lt;br /&gt;
| Integrated Performance Monitoring&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm, ipm_parse, ploticus,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ipm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Performance_And_Debugging_Tools:_GPC#Valgrind | Valgrind]]&lt;br /&gt;
| 3.6.1&lt;br /&gt;
| Memory checking utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind,cachegrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;valgrind&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Padb&lt;br /&gt;
| 3.2 &lt;br /&gt;
| examine and debug parallel programs&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;padb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|&amp;lt;span id=&amp;quot;anchor_viz&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;'''''Visualization tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| Grace&lt;br /&gt;
| 5.1.22&lt;br /&gt;
| Plotting utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;xmgrace&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;grace&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Gnuplot&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Plotting utility&amp;lt;br&amp;gt;Requires 'extras' module if used on compute nodes.&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[ Using_Paraview | ParaView ]]&lt;br /&gt;
| 3.12.0&lt;br /&gt;
| Scientific visualization, server only&lt;br /&gt;
| &amp;lt;tt&amp;gt;pvserver,pvbatch,pvpython&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;paraview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| VMD&lt;br /&gt;
| 1.9&lt;br /&gt;
| Visualization and analysis utility&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;vmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCL/NCARG&lt;br /&gt;
| 6.0.0&lt;br /&gt;
| NCARG graphics and ncl utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| ROOT&lt;br /&gt;
| 5.30.00&lt;br /&gt;
| ROOT Analysis Framework from CERN&lt;br /&gt;
| &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ROOT&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
| ImageMagick&lt;br /&gt;
| 6.6.7&lt;br /&gt;
| Image manipulation tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;convert,animate,composite,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ImageMagick&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PGPLOT&lt;br /&gt;
| 5.2.2&lt;br /&gt;
| Graphics subroutine library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libcpgplot,libpgplot,libtkpgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;pgplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ncview&lt;br /&gt;
| 2.1.1&lt;br /&gt;
| Visualization for NetCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted, etc.&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| CDO&lt;br /&gt;
| 1.5.1&lt;br /&gt;
| Climate Data Operators&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cdo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| UDUNITS&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.6&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc,hdiff,...,libdf,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Hdf5 | HDF5]]&lt;br /&gt;
| 1.8.7-v18*&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.arg0.net/encfs EncFS ]&lt;br /&gt;
| 1.74&lt;br /&gt;
| EncFS provides an encrypted filesystem in user-space, (works ONLY on gpc01..04)&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;encfs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|- &lt;br /&gt;
| [[amber|AMBER 10]]&lt;br /&gt;
| Amber 10 + Amber Tools 1.3&lt;br /&gt;
| Amber Molecular Dynamics Package&lt;br /&gt;
| &amp;lt;tt&amp;gt;sander, sander.MPI&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;amber&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gamess|GAMESS (US)]]&lt;br /&gt;
| August 18, 2011 R1&lt;br /&gt;
| General Atomic and Molecular Electronic Structure System&lt;br /&gt;
| &amp;lt;tt&amp;gt;rungms&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gamess&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[gromacs|GROMACS]]&lt;br /&gt;
| 4.5.5, 4.5.7, 4.6.2&lt;br /&gt;
| GROMACS molecular mechanics, single precision, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;grompp, mdrun&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gromacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[namd|NAMD]]&lt;br /&gt;
| 2.8&lt;br /&gt;
| NAMD - Scalable Molecular Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;namdmpiexec, namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[nwchem|NWChem]]&lt;br /&gt;
| 6.0&lt;br /&gt;
| NWChem Quantum Chemistry&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;nwchem&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 4.3.2, 5.0.3&lt;br /&gt;
| Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;pw.x, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://blast.ncbi.nlm.nih.gov BLAST]&lt;br /&gt;
| 2.2.23+&lt;br /&gt;
| Basic Local Alignment Search Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;blastn,blastp,blastx,psiblast,tblastn...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;blast&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://denovoassembler.sourceforge.net RAY]&lt;br /&gt;
| 2.1.0 (small k-mer)&lt;br /&gt;
| Parallel de novo genome assemblies&lt;br /&gt;
| &amp;lt;tt&amp;gt;Ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ray&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[cpmd|CPMD]]&lt;br /&gt;
| 3.13.2&lt;br /&gt;
| Car-Parrinello molecular dynamics, MPI&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd.x&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cpmd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[R Statistical Package|R]] &lt;br /&gt;
| 2.13.1&lt;br /&gt;
| statistical computing&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;R&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Octave&lt;br /&gt;
| 3.4.3&lt;br /&gt;
| Matlab-like environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;octave&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.mcs.anl.gov/petsc/petsc-as/  PETSc ]&lt;br /&gt;
| 3.1*&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation (PETSc)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc, etc.&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Armadillo C++ linear algebra library | Armadillo]]&lt;br /&gt;
| 3.910.0&lt;br /&gt;
| C++ armadillo libraries (implement Matlab-like syntax)&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;armadillo&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.tacc.utexas.edu/tacc-projects/gotoblas2/ GotoBLAS]&lt;br /&gt;
| 1.13&lt;br /&gt;
| Optimized BLAS implementation &lt;br /&gt;
| &amp;lt;tt&amp;gt;libgoto2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gotoblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13*, 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.3&lt;br /&gt;
| fast Fourier transform library&lt;br /&gt;
''Be careful in combining fftw3 and MKL: you need to link fftw3 first, with'' &amp;lt;tt&amp;gt;-L${SCINET_FFTW_LIB} -lfftw3&amp;lt;/tt&amp;gt;, then link MKL&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| LAPACK&lt;br /&gt;
| &lt;br /&gt;
| Provided by the Intel MKL library&lt;br /&gt;
| See http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/&lt;br /&gt;
| &amp;lt;tt&amp;gt;intel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://freshmeat.net/projects/rlog  RLog ]&lt;br /&gt;
| 1.4&lt;br /&gt;
| RLog provides a flexible message logging facility for C++ programs and libraries.&lt;br /&gt;
| &amp;lt;tt&amp;gt;librlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/rlog&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[GNU Parallel]]&lt;br /&gt;
| 2012-10-22&lt;br /&gt;
| execute commands in parallel&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| Ruby&lt;br /&gt;
| 1.9.1&lt;br /&gt;
| Ruby programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ruby&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 1.6.0&lt;br /&gt;
| IBM's Java JRE and SDK&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;java&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| Xlibraries&lt;br /&gt;
|&lt;br /&gt;
| A collection of X graphics libraries and tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;xterm&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xpdf&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;Xlibraries&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Extras&lt;br /&gt;
|&lt;br /&gt;
| A collection of standard linux and home-grown tools&lt;br /&gt;
| &amp;lt;tt&amp;gt;bc&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;screen&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;xxdiff&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;modulefind&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;ish&amp;lt;/tt&amp;gt;, ...&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
&lt;br /&gt;
=== TCS Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM compilers&lt;br /&gt;
|10.1(c/c++)&amp;lt;br&amp;gt;12.1(fortran)&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlf,xlc_r,xlC_r,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''standard available''&lt;br /&gt;
|-&lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|&lt;br /&gt;
| See [[TCS Quickstart]]&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpcc,mpCC,mpxlf,mpcc_r,mpCC_r,mpxlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| ''standard available''&lt;br /&gt;
|-&lt;br /&gt;
| UPC&lt;br /&gt;
| 1.2&lt;br /&gt;
| Unified Parallel C&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlupc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;upc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|13.1, 14.1&lt;br /&gt;
| newer version &lt;br /&gt;
| &amp;lt;tt&amp;gt;xlf,xlf_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| xlf/13.1&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|11.1, 12.1&lt;br /&gt;
| new versions&lt;br /&gt;
| &amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r&amp;lt;/tt&amp;gt;&lt;br /&gt;
| vacpp&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| MPE2&lt;br /&gt;
| 1.0.6&lt;br /&gt;
| Performance Visualization for Parallel Programs   &lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;mpe&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Scalasca&lt;br /&gt;
| 1.2&lt;br /&gt;
| SCalable performance Analysis of LArge SCale Applications&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;scalasca&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF4&lt;br /&gt;
| 4.2.5&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h4fc, hdiff, ..., libdf, libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF + ncview&lt;br /&gt;
| 4.0.1*&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf, ncview&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.1.1*&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 3.9.6*&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap, ncap2, ncatted, &amp;lt;/tt&amp;gt; etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| FFTW&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Fast Fourier transform library&amp;lt;br&amp;gt;Part of the extras module on the tcs:&amp;lt;br&amp;gt;compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi,libfftw3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK+SCALAPACK&lt;br /&gt;
| 3.4.2+2.0.2&lt;br /&gt;
| Linear algebra package. Note that essl, which comes with the ibm compilers contains a large part of lapack as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack,libscalapack,libblacs&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| PetSc&lt;br /&gt;
| 3.2&lt;br /&gt;
| Portable, Extensible Toolkit for Scientific Computation. With external packages mumps, chaco, hypre, parmetis, prometheus, plapack, superlu, sprng.&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpetsc,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;petsc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of libraries to your user environment&amp;lt;br&amp;gt; compile with &amp;lt;tt&amp;gt;-I${SCINET_EXTRAS_INC}&amp;lt;/tt&amp;gt;&amp;lt;br&amp;gt; link with &amp;lt;tt&amp;gt;-L${SCINET_EXTRAS_LIB}&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;libfftw, libfftw_mpi, libfftw3, libhdf5, liblapack, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gmake&lt;br /&gt;
| 3.82&lt;br /&gt;
| GNU's make. Replaces AIX make or gmake 3.80.&lt;br /&gt;
| &amp;lt;tt&amp;gt;make&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCL&lt;br /&gt;
| 5.1.1&lt;br /&gt;
| NCAR Command Language&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl, libncl, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
''* Several versions of this module are installed; listed is the default version.''&lt;br /&gt;
&lt;br /&gt;
=== P7 Software ===&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1, 13.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlf,xlf_r,xlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1, 11.1&lt;br /&gt;
|See [[P7 Linux Cluster]]&lt;br /&gt;
|&amp;lt;tt&amp;gt;xlc,xlC,xlc_r,xlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.23.2&lt;br /&gt;
| &lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|IBM MPI library&lt;br /&gt;
|5.2.2&lt;br /&gt;
|IBM's Parallel Environment&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpcc,mpCC,mpfort,mpiexec&amp;lt;/tt&amp;gt;&lt;br /&gt;
|pe&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.6.1 , 4.8.1&lt;br /&gt;
| GNU Compiler Collection&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc,g++,gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Java&lt;br /&gt;
| 7.0&lt;br /&gt;
| IBM Java 1.7 implementation&lt;br /&gt;
| &amp;lt;tt&amp;gt;javac&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;jdk&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.0, 4.1*&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools and libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.7&lt;br /&gt;
| Scientific data storage and retrieval, parallel I/O&lt;br /&gt;
| &amp;lt;tt&amp;gt;libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.1.3&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump, ncgen, libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| parallel netCDF&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| Scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NCO&lt;br /&gt;
| 4.0.8&lt;br /&gt;
| NCO utilities to manipulate netCDF files&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncap2, ncatted, &amp;lt;/tt&amp;gt; etc.&lt;br /&gt;
| &amp;lt;tt&amp;gt;nco&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.13&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.5&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0 , scipy-0.13.2 , matplotlib-1.3.1 , pyfits-3.2 , h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Other'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| command-driven interactive function and data plotting program&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| antlr&lt;br /&gt;
| 2.7.7&lt;br /&gt;
| ANother Tool for Language Recognition&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr, antlr-config&amp;lt;br&amp;gt;libantlr, antlr.jar, antlr.py&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;antlr&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| udunits&lt;br /&gt;
| 2.1.11&lt;br /&gt;
| unit conversion utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;libudunits2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;udunits&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| extras&lt;br /&gt;
|&lt;br /&gt;
| Adds paths to a fuller set of applications and libraries to your user environment&lt;br /&gt;
| &amp;lt;tt&amp;gt;bindlaunch, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;extras&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=Manuals=&lt;br /&gt;
{{:Manuals}}&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6748</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6748"/>
		<updated>2014-01-12T21:19:43Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Queue Limits */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048(32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node with a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8192 cores), with 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D-torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the filesystem.  SciNet has 2 BGQ systems: a half-rack 8192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can log in to them from the regular '''login.scinet.utoronto.ca''' login nodes, or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of their library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[short-module-name]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[short-module-name]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
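As a concrete illustration of this pattern, here is a minimal sketch for the GSL library. The paths below are placeholders (in practice they are defined for you by loading the gsl module), and the compile line is echoed rather than executed so the pattern is visible without the module system.

```shell
# Placeholder values; on the cluster, 'module load gsl' defines the real
# SCINET_GSL_INC and SCINET_GSL_LIB variables.
SCINET_GSL_INC=/path/to/gsl/include
SCINET_GSL_LIB=/path/to/gsl/lib

# Typical compile-and-link line for a hypothetical mycode.c,
# echoed here instead of run so the resulting command is shown.
echo cc -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} mycode.c -lgsl -lgslcblas -o mycode
```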
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
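For example, the first approach amounts to a one-line fragment in your shell startup file (the module names here are purely illustrative; pick the ones your work needs):

```shell
# ~/.bashrc fragment: always load the same modules at login.
# (Illustrative module names; substitute your own.)
module load xlf vacpp mpich2
```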
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries, although on the BGQ it is now also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package whose configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimal job sizes, and block sizes are further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane) this makes 64 nodes (1024 cores) the smallest block size. Normally the block size matches the job size, to give the job fully dedicated resources.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure MPI jobs, it is advisable to always specify the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; this can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirement of hybrid runs is typically smaller than that of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of MPI processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also change the total number of MPI processes by a factor of two.&lt;br /&gt;
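This bookkeeping can be sketched as a small shell calculation (variable names are illustrative):

```shell
# Derive the --np argument from bg_size and ranks-per-node.
bg_size=64          # nodes requested via bg_size in the loadleveler script
ranks_per_node=16   # must be 1, 2, 4, 8, 16, 32, or 64
np=$((bg_size * ranks_per_node))
echo "runjob --np $np --ranks-per-node=$ranks_per_node ..."
# prints: runjob --np 1024 --ranks-per-node=16 ...
```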
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-12AM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs. As a consequence, a 512-node job can only be submitted with a wall time of less than 8 hours on the development system.&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The keyword &amp;quot;bg_size&amp;quot; specifies the number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the Blue Gene resources, use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands are not run on a BGQ compute node but on the front-end node. Only programs started with runjob run on the BGQ compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.&lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs,&lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.  For this reason, a&lt;br /&gt;
script has been written that allows a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, sets all the correct environment variables, and then launches a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on&lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of requiring you to attach gdb to each process by hand (as explained in the BGQ Application Development guide, linked below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;, and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.  Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of MPI processes to run.  Most configure scripts expect only one MPI process, so &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. This must be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner depends on the specific block details provided by loadleveler and on the shape and size of the job to be run.&lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16 or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which returns the appropriate $SHAPE argument and an array of 16 starting corners, ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size).&lt;br /&gt;
&lt;br /&gt;
# In this case: 16 sub-blocks of 4 cnodes each (64 nodes total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
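The loop bound and the &amp;lt;tt&amp;gt;--np&amp;lt;/tt&amp;gt; value in this script are tied together by simple arithmetic; here is a runnable sketch of that bookkeeping, using the values from the example above.&lt;br /&gt;

```shell
# Splitting a block of bg_size nodes into sub-blocks of sub_size nodes:
bg_size=64          # total nodes in the allocated block
sub_size=4          # nodes per sub-block (the argument to "source subblocks")
ranks_per_node=16
n_subblocks=$((bg_size / sub_size))             # loop iterations: 16
np_per_subblock=$((sub_size * ranks_per_node))  # runjob --np value: 64
echo "$n_subblocks sub-blocks, --np $np_per_subblock each"
```
&lt;br /&gt;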
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500 TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
The GPFS file system of the BGQ is shared between the bgq development and production systems but, apart from HPSS,&lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor are their file systems mounted on the BGQ.&lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption cipher. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85 MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information on the home and scratch file systems in a number of ways: how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;, with -de), and plots of your usage over time (with -plot). Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members of the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user in your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6737</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6737"/>
		<updated>2014-01-09T14:51:03Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Software modules installed on the BGQ */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node with a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS and that manage the compute nodes and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
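The table can be read as a lookup from block size to torus shape; the following illustrative shell function simply encodes it (the shapes are those listed for SciNet's installation, not computed from first principles).&lt;br /&gt;

```shell
# Map an optimal block size (in compute nodes) to its 5D torus
# dimensions AxBxCxDxE, as listed in the table above.
torus_shape() {
  case "$1" in
    32)   echo "2x2x2x2x2" ;;
    64)   echo "2x2x4x2x2" ;;
    128)  echo "2x2x4x4x2" ;;
    256)  echo "4x2x4x4x2" ;;
    512)  echo "4x4x4x4x2" ;;
    1024) echo "4x4x4x8x2" ;;
    2048) echo "4x4x8x8x2" ;;
    *)    echo "not an optimal block size" ;;
  esac
}
torus_shape 64
```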
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.  &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI versions, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc files to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
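As an illustration, the following sketch assembles such flags for a hypothetical HDF5 build; the two SCINET_* paths are placeholders for the values that &amp;lt;tt&amp;gt;module load hdf5&amp;lt;/tt&amp;gt; would set.&lt;br /&gt;

```shell
# Placeholder values; in practice these are exported by "module load hdf5".
SCINET_HDF5_INC=/path/to/hdf5/include
SCINET_HDF5_LIB=/path/to/hdf5/lib
# -I to find the headers; -L plus -l to find and link the library.
CFLAGS="-O3 -qarch=qp -qtune=qp -I${SCINET_HDF5_INC}"
LDFLAGS="-L${SCINET_HDF5_LIB} -lhdf5"
echo "mpixlc $CFLAGS -o myprog myprog.c $LDFLAGS"
```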
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
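For the first approach, a hypothetical &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; fragment could look as follows; the module names are only examples, to be adjusted to your own workflow.&lt;br /&gt;

```shell
# Hypothetical ~/.bashrc fragment: always load the same toolchain.
module load xlf      # IBM XL Fortran cross-compilers
module load vacpp    # IBM XL C/C++ cross-compilers
module load mpich2   # MPI compiler wrappers (mpixlc, mpixlf90, ...)
```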
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.&lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, giving the job fully dedicated resources.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which for those familiar with MPI is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
The runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a verbosity level from 1 to 7, which can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than 1 process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). Using fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
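That bookkeeping can be sketched as a quick shell calculation (the values below are illustrative, and the runjob line is only echoed, not executed):

```shell
# Sketch: --np must equal bg_size times ranks-per-node.
BG_SIZE=64            # nodes requested via bg_size in the loadleveler script
RANKS_PER_NODE=32     # doubling this from 16 doubles the total rank count
NP=$((BG_SIZE * RANKS_PER_NODE))
echo "runjob --np $NP --ranks-per-node=$RANKS_PER_NODE"
```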
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64 node block is reserved on the devel system for development and interactive testing from 8AM-8PM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30 minute maximum wall_clock_limit. The purpose of this reservation is to ensure short testing jobs run quickly without being held up by longer production-type jobs.&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene specific commands.  The keyword &amp;quot;bg_size&amp;quot; is specified in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.   As such, a &lt;br /&gt;
script has been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows 30 minutes on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of requiring a gdbtool to be attached by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter variable sets the number of mpi processes to run.  Most configure scripts expect only one mpi process; thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE argument and an array of 16 starting corners in ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and an array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case 16 sub-blocks of 4 cnodes each (64 total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all have to share the same I/O nodes, so for I/O intensive jobs this will be an inefficient setup.  Also consider that if your jobs are small enough to require sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input or output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the bgq development and production systems, the &lt;br /&gt;
file system is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc) except through HPSS, nor are the other SciNet file systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other; e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption method. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgqdev.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
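To put the ~85MB/s figure in perspective, a quick back-of-the-envelope estimate (the file size here is an illustrative assumption):

```shell
# Rough transfer-time estimate at the ~85 MB/s arcfour rate quoted above.
SIZE_MB=$((100 * 1024))        # a hypothetical 100 GB archive
RATE_MBS=85
EST_S=$((SIZE_MB / RATE_MBS))  # integer seconds, roughly 20 minutes
echo "approx ${EST_S} s to transfer ${SIZE_MB} MB"
```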
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included: numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, 3rd block, 1st node, and second core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--!&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6736</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6736"/>
		<updated>2014-01-09T14:36:00Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Software modules installed on the BGQ */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node that has a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) with 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 node boards make up a midplane (8,192 cores), with 2 midplanes per rack, or 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS; these manage the compute nodes and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can log in to them from the regular '''login.scinet.utoronto.ca''' login nodes, or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to set up ssh keys for logging in, please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package, while &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM XL C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc file to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt; to your compile and link commands, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
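As a hypothetical sketch (assuming the short module name for the &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt; module listed below is GSL; check the exact variable names with &amp;lt;tt&amp;gt;module show gsl&amp;lt;/tt&amp;gt;), compiling and linking against GSL might look like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load gsl&lt;br /&gt;
$ bgxlc -I${SCINET_GSL_INC} mycode.c -o mycode -L${SCINET_GSL_LIB} -lgsl -lgslcblas&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;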
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries, although on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler and launched using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimum job size configurations, and these are further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest possible block size is 64 nodes (1024 cores). Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but they then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which, for those familiar with MPI, is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
The available runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; higher verbosity levels can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank then only has 256 MB of memory).&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). With fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
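For example, on a bg_size=64 block, a hybrid run using all 64 hardware threads per node, with 32 ranks per node and 2 OpenMP threads per rank, might look as follows (the executable name is a placeholder):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# 64 nodes x 32 ranks-per-node = 2048 mpi processes, each with 2 threads&lt;br /&gt;
runjob --np 2048 --ranks-per-node=32 --envs OMP_NUM_THREADS=2 --cwd=$SCRATCH : $HOME/mycode.exe&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;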
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the development system for development and interactive testing from 8 AM to 8 PM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs.&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific keywords.  The keyword &amp;quot;bg_size&amp;quot; is given in number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To cancel a job, use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To look at details of the Blue Gene resources, use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs, &lt;br /&gt;
but an interactive session is typically beneficial when debugging and developing.   As such, a &lt;br /&gt;
script has been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' command currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of requiring you to attach a gdb tool to each process by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;, and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of mpi processes to run.  Most configure scripts expect only one mpi process, thus, &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner will depend on the specific block details provided by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a sub-block size of 4 nodes, which sets the appropriate $SHAPE argument and an array of 16 starting corners in ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and an array of corners ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case: 16 sub-blocks of 4 compute nodes each (64 nodes total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500 TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the BGQ development and production systems, it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), with the exception of HPSS, nor is the other SciNet file system mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other. E.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption cipher. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85 MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems. For instance, it can report how much disk space is being used by yourself and your group (with the -a option), or how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;); you may also generate plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?| [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
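For instance, to list the usage of all members of your group, including delta information:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ diskUsage -a -de&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;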
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mount point, /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS, [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc*&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use bg_console or the web-based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--!&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6735</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6735"/>
		<updated>2014-01-09T14:35:29Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Software modules installed on the BGQ */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes= 2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient 3rd-generation Blue Gene IBM supercomputer built around a system-on-a-chip compute node with a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) and 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), 16 boards make up a midplane (8,192 cores), and there are 2 midplanes per rack, for 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the filesystem.  SciNet has two BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such, there are only a few optimal block sizes that use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can login to them from the regular '''login.scinet.utoronto.ca''' login nodes or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to setup ssh keys for logging in please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the &amp;lt;tt&amp;gt;module&amp;lt;/tt&amp;gt; command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.  &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI implementations, libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc file to make sure you are using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against the library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
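As an illustration of the convention above, here is a hedged sketch using the &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt; module as an example (the fallback paths below are hypothetical placeholders, not the real install locations); it only prints the compile line one would use:&lt;br /&gt;

```shell
# Hedged sketch: after 'module load gsl', SCINET_GSL_INC and SCINET_GSL_LIB
# point at the GSL header and library directories.  The fallback paths below
# are hypothetical placeholders so the sketch can be run stand-alone.
SCINET_GSL_INC=${SCINET_GSL_INC:-/path/to/gsl/include}
SCINET_GSL_LIB=${SCINET_GSL_LIB:-/path/to/gsl/lib}
# Print the compile/link line following the -I/-L/-l pattern described above.
echo "mpixlc -O3 -qarch=qp -qtune=qp mycode.c -I${SCINET_GSL_INC} -L${SCINET_GSL_LIB} -lgsl -lgslcblas -o mycode"
```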
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now possible to use dynamic libraries as well.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the bgq can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimal job-size configurations, further constrained by the fact that each block requires a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), this makes 64 nodes (1024 cores) the smallest block size. Normally the block size matches the job size, to offer fully dedicated resources to the job.  Smaller jobs can be run within the same block, but then they share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which for those familiar with MPI is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
a particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; higher levels can be helpful in debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). Using fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
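The bookkeeping above can be sketched with simple shell arithmetic (the parameter values here are hypothetical examples, not a recommendation):&lt;br /&gt;

```shell
# Hypothetical job parameters; adjust to match your own submission script.
BG_SIZE=64            # number of nodes requested via bg_size
RANKS_PER_NODE=16     # must be 1, 2, 4, 8, 16, 32, or 64
NP=$((BG_SIZE * RANKS_PER_NODE))   # total MPI ranks, i.e. the --np argument
echo "runjob --np $NP --ranks-per-node=$RANKS_PER_NODE"
# Doubling ranks-per-node for the same bg_size doubles --np:
NP2=$((BG_SIZE * RANKS_PER_NODE * 2))
echo "with ranks-per-node=$((RANKS_PER_NODE * 2)): --np $NP2"
```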
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-8PM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure that short testing jobs run quickly without being held up by longer production-type jobs.&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The keyword &amp;quot;bg_size&amp;quot; specifies the number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users normally have access to the BGQ only through loadleveler, which is appropriate for batch jobs; &lt;br /&gt;
however, an interactive session is typically beneficial when debugging and developing.   A &lt;br /&gt;
script has therefore been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a job, instead of requiring a gdb tool to be attached to each process by hand (as explained in the BGQ Application Development guide, linked below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter sets the number of MPI processes to run.  Most configure scripts expect only one MPI process, so &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
BGQ allows multiple applications to share the same block; these are referred to as sub-block jobs. This needs to be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner depends on the specific block details provided by loadleveler and on the shape and size of the job.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16, or 32.&lt;br /&gt;
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which will return the appropriate $SHAPE argument and an array of 16 starting corners in ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Use the subblocks script to set $SHAPE and the array ${CORNER[n]},&lt;br /&gt;
# given the size of the sub-blocks in nodes (i.e. similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case: 16 sub-blocks of 4 cnodes each (64 total, i.e. bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. All the sub-blocks have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are so small that you have to run them in sub-blocks, it may be more efficient to use other clusters such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input or output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME, and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH, and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the bgq development and production systems, except for HPSS &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), nor is their file system mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh), to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85MB/s).  This encryption method is only recommended for copying from the BGQ file system to the regular SciNet GPFS file system or back. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information in a number of ways on the home and scratch file systems. For instance, it can report how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or generate plots of your usage over time. Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?| [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.12-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/1812-v18-serial-gcc*&amp;lt;br/&amp;gt;hdf5/1812-v18-mpich2-gcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program, to be run on the front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes, you can use bg_console or the web-based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core (numbering starts at zero).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6734</id>
		<title>BGQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=BGQ&amp;diff=6734"/>
		<updated>2014-01-09T14:20:07Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Software modules installed on the BGQ */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox Computer&lt;br /&gt;
|image=[[Image:Blue_Gene_Cabinet.jpeg|center|300px|thumb]]&lt;br /&gt;
|name=Blue Gene/Q (BGQ)&lt;br /&gt;
|installed=August 2012&lt;br /&gt;
|operatingsystem= RH6.2, CNK (Linux) &lt;br /&gt;
|loginnode= bgqdev-fen1,bgq-fen1&lt;br /&gt;
|nnodes=  2048 (32,768 cores), 512 (8,192 cores)&lt;br /&gt;
|rampernode=16 GB &lt;br /&gt;
|corespernode=16 (64 threads)&lt;br /&gt;
|interconnect=5D Torus (jobs), QDR Infiniband (I/O) &lt;br /&gt;
|vendorcompilers= bgxlc, bgxlf&lt;br /&gt;
|queuetype=Loadleveler&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==System Status==&lt;br /&gt;
&lt;br /&gt;
The current BGQ system status can be found on the wiki's [[Main Page]].&lt;br /&gt;
&lt;br /&gt;
==SOSCIP==&lt;br /&gt;
&lt;br /&gt;
The BGQ is a Southern Ontario Smart Computing&lt;br /&gt;
Innovation Platform ([http://soscip.org/ SOSCIP]) BlueGene/Q supercomputer located at the&lt;br /&gt;
University of Toronto's SciNet HPC facility. The SOSCIP &lt;br /&gt;
multi-university/industry consortium is funded by the Ontario Government &lt;br /&gt;
and the Federal Economic Development Agency for Southern Ontario [http://www.research.utoronto.ca/strategic-initiatives/southern-ontario-smart-computing-innovation-platform].&lt;br /&gt;
&lt;br /&gt;
== Support Email ==&lt;br /&gt;
&lt;br /&gt;
Please use [mailto:bgq-support@scinet.utoronto.ca &amp;lt;bgq-support@scinet.utoronto.ca&amp;gt;] for BGQ-specific inquiries.&lt;br /&gt;
&lt;br /&gt;
==Specifications==&lt;br /&gt;
&lt;br /&gt;
BGQ is an extremely dense and energy-efficient third-generation Blue Gene IBM supercomputer, built around a system-on-a-chip compute node that has a 16-core 1.6 GHz PowerPC-based CPU (PowerPC A2) with 16 GB of RAM.  The nodes are bundled in groups of 32 into a node board (512 cores), and 16 boards make up a midplane (8,192 cores), with 2 midplanes per rack, or 16,384 cores and 16 TB of RAM per rack. The compute nodes run a very lightweight Linux-based operating system called CNK.  The compute nodes are all connected together using a custom 5D torus high-speed interconnect. Each rack has 16 I/O nodes that run a full Red Hat Linux OS, manage the compute nodes, and mount the file system.  SciNet has 2 BGQ systems: a half-rack 8,192-core development system and a two-rack 32,768-core production system.&lt;br /&gt;
&lt;br /&gt;
[[Image:BlueGeneQHardware2.png‎ |center]]&lt;br /&gt;
&lt;br /&gt;
=== 5D Torus Network ===&lt;br /&gt;
&lt;br /&gt;
The network topology of BlueGene/Q is a five-dimensional (5D) torus, with direct links between the nearest neighbors in the ±A, ±B, ±C, ±D, and ±E directions.  As such there are only a few optimum block sizes that will use the network efficiently.&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
| '''Node Boards '''&lt;br /&gt;
| '''Compute Nodes'''&lt;br /&gt;
| '''Cores'''&lt;br /&gt;
| '''Torus Dimensions'''&lt;br /&gt;
|-&lt;br /&gt;
| 1&lt;br /&gt;
| 32&lt;br /&gt;
| 512&lt;br /&gt;
| 2x2x2x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 2 (adjacent pairs)&lt;br /&gt;
| 64&lt;br /&gt;
| 1024&lt;br /&gt;
| 2x2x4x2x2&lt;br /&gt;
|-&lt;br /&gt;
| 4 (quadrants)&lt;br /&gt;
| 128&lt;br /&gt;
| 2048&lt;br /&gt;
| 2x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 8 (halves)&lt;br /&gt;
| 256&lt;br /&gt;
| 4096&lt;br /&gt;
| 4x2x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 16 (midplane)&lt;br /&gt;
| 512&lt;br /&gt;
| 8192&lt;br /&gt;
| 4x4x4x4x2&lt;br /&gt;
|-&lt;br /&gt;
| 32 (1 rack)&lt;br /&gt;
| 1024&lt;br /&gt;
| 16384&lt;br /&gt;
| 4x4x4x8x2 &lt;br /&gt;
|-&lt;br /&gt;
| 64 (2 racks)&lt;br /&gt;
| 2048&lt;br /&gt;
| 32768&lt;br /&gt;
| 4x4x8x8x2&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Login/Devel Nodes ==&lt;br /&gt;
&lt;br /&gt;
The development nodes for the BGQ are '''bgqdev-fen1''' for the half-rack development system and '''bgq-fen1''' for the 2-rack production system. &lt;br /&gt;
You can log in to them from the regular '''login.scinet.utoronto.ca''' login nodes, or directly from outside to the half-rack development system using '''bgqdev.scinet.utoronto.ca''', e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -l USERNAME bgqdev.scinet.utoronto.ca -X&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where USERNAME is your username on the BGQ and the &amp;lt;tt&amp;gt;-X&amp;lt;/tt&amp;gt; flag is optional, needed only if you will use X graphics.&amp;lt;br/&amp;gt;&lt;br /&gt;
Note: To learn how to set up ssh keys for logging in, please see [[Ssh keys]].&lt;br /&gt;
&lt;br /&gt;
These development nodes are Power7 machines running Linux which serve as compilation and submission hosts for the BGQ.  Programs are cross-compiled for the BGQ on these nodes and then submitted to the queue using loadleveler.&lt;br /&gt;
&lt;br /&gt;
===Modules and Environment Variables===&lt;br /&gt;
&lt;br /&gt;
To use most packages on the SciNet machines - including most of the compilers - you will have to use the `modules' command.  The command &amp;lt;tt&amp;gt;module load some-package&amp;lt;/tt&amp;gt; will set your environment variables (&amp;lt;tt&amp;gt;PATH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;LD_LIBRARY_PATH&amp;lt;/tt&amp;gt;, etc.) to include the default version of that package.   &amp;lt;tt&amp;gt;module load some-package/specific-version&amp;lt;/tt&amp;gt; will load a specific version of that package.  This makes it very easy for different users to use different versions of compilers, MPI libraries, etc.&lt;br /&gt;
&lt;br /&gt;
A list of the installed software can be seen on the system by typing &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load a module (for example, the default version of the IBM C/C++ compilers)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload a module&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module unload vacpp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To unload all modules&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These commands can go in your .bashrc file to make sure you are always using the correct packages.&lt;br /&gt;
&lt;br /&gt;
Modules that load libraries define environment variables pointing to the location of the library files and include files, for use in Makefiles. These environment variables follow the naming convention&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 $SCINET_[short-module-name]_BASE&lt;br /&gt;
 $SCINET_[short-module-name]_LIB&lt;br /&gt;
 $SCINET_[short-module-name]_INC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
for the base location of the module's files, the location of the library binaries, and the location of the header files, respectively.&lt;br /&gt;
&lt;br /&gt;
So to compile and link against such a library, you will have to add &amp;lt;tt&amp;gt;-I${SCINET_[module-basename]_INC}&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;-L${SCINET_[module-basename]_LIB}&amp;lt;/tt&amp;gt;, respectively, in addition to the usual &amp;lt;tt&amp;gt;-l[libname]&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
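As a sketch of this pattern (the gsl module name is taken from the table above, but the paths shown are illustrative assumptions; on the BGQ the variables are set for you by &amp;lt;tt&amp;gt;module load gsl&amp;lt;/tt&amp;gt;):&lt;br /&gt;

```shell
# Hypothetical values; on the BGQ these are set by "module load gsl".
SCINET_GSL_INC=/scinet/bgq/Libraries/gsl-1.15/include
SCINET_GSL_LIB=/scinet/bgq/Libraries/gsl-1.15/lib

# Assemble the compile/link line: -I for the headers, -L for the
# libraries, plus the usual -l flags for the library itself.
CFLAGS="-I${SCINET_GSL_INC}"
LDFLAGS="-L${SCINET_GSL_LIB} -lgsl -lgslcblas"
echo "bgxlc -O3 myprog.c $CFLAGS $LDFLAGS -o myprog"
```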
&lt;br /&gt;
Note that a &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; command ''only'' sets the environment variables in your current shell (and any subprocesses that the shell launches).   It does ''not'' affect other shell environments.&lt;br /&gt;
&lt;br /&gt;
If you always require the same modules, it is easiest to load those modules in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and then they will always be present in your environment; if you routinely have to flip back and forth between modules, it is easiest to have almost no modules loaded in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt; and simply load them as you need them (and have the required &amp;lt;tt&amp;gt;module load&amp;lt;/tt&amp;gt; commands in your job submission scripts).&lt;br /&gt;
&lt;br /&gt;
=== Compilers ===&lt;br /&gt;
&lt;br /&gt;
The BGQ uses the IBM XL compilers to cross-compile code for the BGQ.  Compilers are available for FORTRAN, C, and C++.  They are accessible by default, or by loading the '''xlf''' and '''vacpp''' modules. By default the compilers produce&lt;br /&gt;
static binaries; however, on the BGQ it is now also possible to use dynamic libraries.  The compilers follow the XL conventions with the prefix '''bg''',&lt;br /&gt;
so '''bgxlc''' and '''bgxlf90''' are the C and FORTRAN compilers, respectively.  &lt;br /&gt;
&lt;br /&gt;
Most users, however, will use the MPI variants, i.e. '''mpixlf90''' and '''mpixlc''', which are available by loading&lt;br /&gt;
the '''mpich2''' module. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mpich2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is recommended to use at least the following flags when compiling and linking&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-O3 -qarch=qp -qtune=qp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to build a package for which the configure script tries to run small test jobs, the cross-compiling nature of the BGQ can get in the way.  In that case, you should use the interactive [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] environment as described below.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
&lt;br /&gt;
As the BGQ architecture is different from that of the development nodes, the only way to test your program is to submit a job to the BGQ.  Jobs are submitted through loadleveler using '''runjob''', which is in many ways similar to mpirun or mpiexec.  As shown above in the network topology overview, there are only a few optimal job size configurations, further constrained by each block requiring a minimum of one I/O node.  In SciNet's configuration (with 8 I/O nodes per midplane), the smallest block size is 64 nodes (1024 cores). Normally the block size matches the job size, giving the job fully dedicated resources.  Smaller jobs can be run within the same block, but then share resources (network and I/O); these are referred to as sub-block jobs and are described in more detail below.  &lt;br /&gt;
&lt;br /&gt;
=== runjob ===&lt;br /&gt;
&lt;br /&gt;
All BGQ jobs are launched using '''runjob''', which for those familiar with MPI is analogous to mpirun/mpiexec.  Jobs run on a block, which is a predefined group of nodes that have already been configured and booted.  When using loadleveler this is set for you, and you do not have to specify the block name.  For example, if your loadleveler script requests 64 nodes, each with 16 cores (for a total of 1024 cores), you run a job with 16 processes per node and 1024 total processes with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Here, &amp;lt;tt&amp;gt;--np 1024&amp;lt;/tt&amp;gt; sets the total number of mpi tasks, while &amp;lt;tt&amp;gt;--ranks-per-node=16&amp;lt;/tt&amp;gt; specifies that 16 processes should run on each node.&lt;br /&gt;
For pure mpi jobs, it is advisable always to give the number of ranks per node, because the default value of 1 may leave 15 cores on the node idle. The argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64. &lt;br /&gt;
&lt;br /&gt;
(Note: If this were not a loadleveler job, and the block ID was R00-M0-N03-64, the command would be &amp;quot;&amp;lt;tt&amp;gt;runjob --block R00-M0-N03-64 --np 1024 --ranks-per-node=16 --cwd=$PWD : $PWD/code -f file.in&amp;lt;/tt&amp;gt;&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
runjob flags are shown with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob -h&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A particularly useful one is&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--verbose #&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where # is a number from 1 to 7; this can be helpful when debugging an application.&lt;br /&gt;
&lt;br /&gt;
=== How to set ranks-per-node ===&lt;br /&gt;
&lt;br /&gt;
There are 16 cores per node, but the argument to ranks-per-node may be 1, 2, 4, 8, 16, 32, or 64.  While it may seem natural to set ranks-per-node to 16, this is not generally recommended.  On the BGQ, one can efficiently run more than one process per core, because each core has four &amp;quot;hardware threads&amp;quot; (similar to HyperThreading on the GPC and Simultaneous Multi-Threading on the TCS and P7), which can keep the different parts of each core busy at the same time. One would therefore ideally use 64 ranks per node.  There are two main reasons why one might not set ranks-per-node to 64:&lt;br /&gt;
# The memory requirements do not allow 64 ranks (each rank only has 256MB of memory)&lt;br /&gt;
# The application is more efficient in a hybrid MPI/OpenMP mode (or MPI/pthreads). Using fewer ranks per node, the hardware threads are used as OpenMP threads within each process.&lt;br /&gt;
Because threads can share memory, the memory requirements of hybrid runs are typically smaller than those of pure MPI runs.&lt;br /&gt;
&lt;br /&gt;
Note that the total number of mpi processes in a runjob (i.e., the --np argument) should be the ranks-per-node times the number of nodes (set by bg_size in the loadleveler script). So for the same number of nodes, if you change ranks-per-node by a factor of two, you should also multiply the total number of mpi processes by two.&lt;br /&gt;
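This bookkeeping can be sketched in shell arithmetic (the bg_size and ranks-per-node values below are just examples):&lt;br /&gt;

```shell
# --np must equal bg_size (number of nodes) times ranks-per-node.
BG_SIZE=64          # set via "# @ bg_size" in the loadleveler script
RANKS_PER_NODE=16   # allowed values: 1, 2, 4, 8, 16, 32, 64
NP=$(( BG_SIZE * RANKS_PER_NODE ))
echo "runjob --np $NP --ranks-per-node=$RANKS_PER_NODE"
```
Doubling RANKS_PER_NODE to 32 on the same 64 nodes would require doubling NP to 2048.&lt;br /&gt;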
&lt;br /&gt;
=== Queue Limits ===&lt;br /&gt;
&lt;br /&gt;
The maximum wall_clock_limit is 12 hours on the development '''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''' system and 24 hours on the production '''&amp;lt;tt&amp;gt;bgq&amp;lt;/tt&amp;gt;''' system.  Official SOSCIP project jobs are prioritized over all other jobs using a fairshare algorithm with a 14-day rolling window.&lt;br /&gt;
&lt;br /&gt;
A 64-node block is reserved on the devel system for development and interactive testing from 8AM-8PM every day, including weekends.  This block is accessed by using the [[BGQ#Interactive_Use_.2F_Debugging | &amp;lt;tt&amp;gt;'''debugjob'''&amp;lt;/tt&amp;gt;]] command, which has a 30-minute maximum wall_clock_limit. The purpose of this reservation is to ensure short testing jobs are run quickly without being held up by longer production-type jobs.&lt;br /&gt;
&lt;br /&gt;
=== Batch Jobs ===&lt;br /&gt;
&lt;br /&gt;
Job submission is done through loadleveler with a few Blue Gene-specific commands.  The keyword &amp;quot;bg_size&amp;quot; specifies the number of nodes, not cores, so bg_size=64 corresponds to 64x16=1024 cores.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# @ job_name           = bgsample&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job By Size&amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue &lt;br /&gt;
&lt;br /&gt;
# Launch all BGQ jobs using runjob&lt;br /&gt;
runjob --np 1024 --ranks-per-node=16 --envs OMP_NUM_THREADS=1 --cwd=$SCRATCH/ : $HOME/mycode.exe myflags&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To submit to the queue use &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llsubmit myscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Monitoring Jobs ===&lt;br /&gt;
&lt;br /&gt;
To see running jobs&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llq -b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
to cancel a job use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llcancel JOBID&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and to look at details of the bluegene resources use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
llbgstatus -M all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Note: the loadleveler script commands  are not run on a bgq compute node but on the front-end node. Only programs started with runjob run on the bgq compute nodes. You should therefore keep scripting in the submission script to a bare minimum.'''&lt;br /&gt;
&lt;br /&gt;
=== Interactive Use / Debugging ===&lt;br /&gt;
&lt;br /&gt;
As BGQ codes are cross-compiled, they cannot be run directly on the front-end nodes.  &lt;br /&gt;
Users only have access to the BGQ through loadleveler, which is appropriate for batch jobs; &lt;br /&gt;
however, an interactive session is typically beneficial when debugging and developing.   As such, a &lt;br /&gt;
script has been written to allow a session in which runjob can be run interactively.  The script&lt;br /&gt;
uses loadleveler to set up a block, set all the correct environment variables, and then launch a spawned shell on&lt;br /&gt;
the front-end node. The '''debugjob''' session currently allows a 30-minute session on 64 nodes and, when run on &lt;br /&gt;
'''&amp;lt;tt&amp;gt;bgqdev&amp;lt;/tt&amp;gt;''', runs in a dedicated reservation as described previously in the [[BGQ#Queue_Limits | queue limits]] section. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@bgqdev-fen1]$ debugjob&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ runjob --np 64 --ranks-per-node=16 --cwd=$PWD : $PWD/my_code -f myflags&lt;br /&gt;
&lt;br /&gt;
[user@bgqdev-fen1]$ exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For debugging, gdb and Allinea DDT are available. The latter is recommended, as it automatically attaches to all the processes of a parallel job, instead of requiring you to attach a gdb tool to each process by hand (as explained in the BGQ Application Development guide, link below). Simply compile with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;, load the &amp;lt;tt&amp;gt;ddt/4.1&amp;lt;/tt&amp;gt; module, type &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt; and follow the graphical user interface.  The DDT user guide can be found below.&lt;br /&gt;
&lt;br /&gt;
Apart from debugging, this environment is also useful for building libraries and applications that need to run small tests as part of their 'configure' step.   Within the debugjob session, applications compiled with the bgxl compilers or the mpcc/mpCC/mpfort wrappers will automatically run on the BGQ, skipping the need for the runjob command, provided you set the following environment variables &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export BG_PGM_LAUNCHER=yes&lt;br /&gt;
$ export RUNJOB_NP=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The latter setting sets the number of MPI processes to run.  Most configure scripts expect only one MPI process; thus &amp;lt;tt&amp;gt;RUNJOB_NP=1&amp;lt;/tt&amp;gt; is appropriate.&lt;br /&gt;
&lt;br /&gt;
=== Sub-block jobs ===&lt;br /&gt;
&lt;br /&gt;
The BGQ allows multiple applications to share the same block; this is referred to as sub-block jobs. It must be done from within the same loadleveler submission script, using multiple calls to runjob.  To run a sub-block job, you need to specify a &amp;quot;--corner&amp;quot; within the block at which to start each job, and a 5D torus AxBxCxDxE &amp;quot;--shape&amp;quot;.  The starting corner depends on the specific block allocated by loadleveler and on the shape and size of the job being run.  &lt;br /&gt;
&lt;br /&gt;
Figuring out what the corners and shapes should be is very tricky (especially since it depends on the block you get allocated).  For that reason, we've created a script called &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; that determines the corners and shape of the sub-blocks.  It only handles the (presumably common) case in which you want to subdivide the block into n equally sized sub-blocks, where n may be 1, 2, 4, 8, 16 or 32.&lt;br /&gt;
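To make the shape bookkeeping concrete, here is a minimal Python sketch of the splitting idea. It is not the actual subblocks script (which inspects the block loadleveler actually allocated); the 64-node block shape 2x2x4x2x2 and the halve-the-largest-dimension strategy are illustrative assumptions.

```python
# Illustrative sketch only: the real `subblocks` script inspects the block
# that loadleveler actually allocated.  Here we assume a 64-node block of
# 5D shape 2x2x4x2x2 (AxBxCxDxE) and split it into n equal sub-block
# shapes by repeatedly halving the largest dimension.

def split_shape(shape, n):
    """Return the shape of one of n equal sub-blocks of `shape` (n a power of 2)."""
    dims = list(shape)
    while n > 1:
        i = dims.index(max(dims))        # halve the largest dimension first
        if dims[i] % 2 != 0:
            raise ValueError("cannot halve dimension %d of %s" % (i, dims))
        dims[i] //= 2
        n //= 2
    return tuple(dims)

block = (2, 2, 4, 2, 2)                  # 64 compute nodes in total
sub = split_shape(block, 16)             # 16 sub-blocks of 4 nodes each
print(sub)                               # -> (1, 1, 1, 2, 2)
```

Enumerating the 16 matching corner coordinates is the other half of the job and depends on the allocated block's own corner, which is why using the wrapper script is the recommended route.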
&lt;br /&gt;
Here is an example script calling &amp;lt;tt&amp;gt;subblocks&amp;lt;/tt&amp;gt; with a size of 4, which sets the appropriate $SHAPE argument and an array of 16 starting corners in ${CORNER[n]}. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# @ job_name           = bgsubblock&lt;br /&gt;
# @ job_type           = bluegene&lt;br /&gt;
# @ comment            = &amp;quot;BGQ Job SUBBLOCK &amp;quot;&lt;br /&gt;
# @ error              = $(job_name).$(Host).$(jobid).err&lt;br /&gt;
# @ output             = $(job_name).$(Host).$(jobid).out&lt;br /&gt;
# @ bg_size            = 64&lt;br /&gt;
# @ wall_clock_limit   = 30:00&lt;br /&gt;
# @ bg_connectivity    = Torus&lt;br /&gt;
# @ queue&lt;br /&gt;
&lt;br /&gt;
# Using subblocks script to set $SHAPE and array of ${CORNERS[n]}&lt;br /&gt;
# with the size of the sub-blocks in nodes (i.e., similar to bg_size)&lt;br /&gt;
&lt;br /&gt;
# In this case, 16 sub-blocks of 4 cnodes each (64 total, i.e., bg_size)&lt;br /&gt;
source subblocks 4&lt;br /&gt;
&lt;br /&gt;
# 16 jobs of 4 each&lt;br /&gt;
for (( i=0; i &amp;lt;  16 ; i++)); do&lt;br /&gt;
   runjob --corner ${CORNER[$i]} --shape $SHAPE --np 64 --ranks-per-node=16 :  your_code_here &amp;gt; $i.out &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Remember that sub-block jobs are not the ideal way to run on the BlueGene/Q. These sub-blocks all have to share the same I/O nodes, so for I/O-intensive jobs this will be an inefficient setup.  Also consider that if your jobs are small enough to require sub-blocks, it may be more efficient to use another cluster such as the GPC.&lt;br /&gt;
&lt;br /&gt;
If you run into any issues with this technique, please contact bgq-support for help.&lt;br /&gt;
&lt;br /&gt;
== Filesystem ==&lt;br /&gt;
&lt;br /&gt;
The BGQ has its own dedicated 500TB file system based on GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for input and output data for jobs; data on /scratch is not backed up. The path to your home directory is in the environment variable $HOME and will look like /home/G/GROUP/USER.  The path to your scratch directory is in the environment variable $SCRATCH and will look like /scratch/G/GROUP/USER (following the conventions of the rest of the SciNet systems).  &lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! {{Hl2}} | file system &lt;br /&gt;
! {{Hl2}} | purpose &lt;br /&gt;
! {{Hl2}} | user quota &lt;br /&gt;
! {{Hl2}} | backed up&lt;br /&gt;
! {{Hl2}} | purged&lt;br /&gt;
|- &lt;br /&gt;
| /home&lt;br /&gt;
| development&lt;br /&gt;
| 50 GB&lt;br /&gt;
| yes&lt;br /&gt;
| never&lt;br /&gt;
|-&lt;br /&gt;
| /scratch&lt;br /&gt;
| computation&lt;br /&gt;
| 20 TB&lt;br /&gt;
| no&lt;br /&gt;
| not currently&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Transferring files===&lt;br /&gt;
Although the GPFS file system of the BGQ is shared between the BGQ development and production systems, &lt;br /&gt;
it is '''not''' shared with the other SciNet systems (gpc, tcs, p7, arc), with the exception of HPSS, nor are the file systems of those other systems mounted on the BGQ.  &lt;br /&gt;
Use scp to copy files from one file system to the other, e.g., from bgqdev-fen1, you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour login.scinet.utoronto.ca:code.tgz .&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or from a login node you could do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  $ scp -c arcfour code.tgz bgqdev.scinet.utoronto.ca:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The flag &amp;lt;tt&amp;gt;-c arcfour&amp;lt;/tt&amp;gt; is optional. It tells scp (or really, ssh) to use a non-default encryption. The one chosen here, arcfour, has been found to speed up the transfer by a factor of two (you may expect around 85 MB/s).  This encryption method is only recommended for copying between the BGQ file system and the regular SciNet GPFS file system. &lt;br /&gt;
 &lt;br /&gt;
Note that although these transfers are within the same data center, you have to use the full names of the systems, login.scinet.utoronto.ca and bgq.scinet.utoronto.ca, respectively, and that you will be asked for your password.&lt;br /&gt;
&lt;br /&gt;
===How much Disk Space Do I have left?===&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available on the bgqdev nodes, provides information on the home and scratch file systems in a number of ways: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;, with -de), or plots of your usage over time (with -plot). Please see the usage help below for more details.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Usage: diskUsage [-h|-?] [-a] [-u &amp;lt;user&amp;gt;] [-de|-plot]&lt;br /&gt;
       -h|-?: help&lt;br /&gt;
       -a: list usages of all members on the group&lt;br /&gt;
       -u &amp;lt;user&amp;gt;: as another user on your group&lt;br /&gt;
       -de: include delta information&lt;br /&gt;
       -plot: create plots of disk usages&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that the information on usage and quota is only updated hourly!&lt;br /&gt;
&lt;br /&gt;
===Bridge to HPSS===&lt;br /&gt;
&lt;br /&gt;
BGQ users may transfer material to/from HPSS via the GPC archive queue. On the HPSS gateway node (gpc-archive01), the BGQ GPFS file systems are mounted under a single mounting point /bgq (/bgq/scratch and /bgq/home). For detailed information on the use of HPSS [https://support.scinet.utoronto.ca/wiki/index.php/HPSS please read the HPSS wiki section.]&lt;br /&gt;
&lt;br /&gt;
== Software modules installed on the BGQ ==&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}} |Software  &lt;br /&gt;
!{{Hl2}}| Version&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
!{{Hl2}}| Command/Library&lt;br /&gt;
!{{Hl2}}| Module Name&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Compilers'''''&lt;br /&gt;
|-&lt;br /&gt;
|IBM fortran compiler&lt;br /&gt;
|14.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlf,bgxlf_r,bgxlf90,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|xlf&lt;br /&gt;
|-&lt;br /&gt;
|IBM c/c++ compilers&lt;br /&gt;
|12.1&lt;br /&gt;
|These are cross compilers&lt;br /&gt;
|&amp;lt;tt&amp;gt;bgxlc,bgxlC,bgxlc_r,bgxlC_r,...&amp;lt;/tt&amp;gt;&lt;br /&gt;
|vacpp&lt;br /&gt;
|-&lt;br /&gt;
|MPICH2 MPI library&lt;br /&gt;
|1.4.1&lt;br /&gt;
|There are 4 versions (see BGQ Applications Development document).&lt;br /&gt;
|&amp;lt;tt&amp;gt;mpicc,mpicxx,mpif77,mpif90&amp;lt;/tt&amp;gt;&lt;br /&gt;
|mpich2&lt;br /&gt;
|- &lt;br /&gt;
| GCC Compiler&lt;br /&gt;
| 4.4.6, 4.8.1&lt;br /&gt;
| GNU Compiler Collection for BGQ&amp;lt;br&amp;gt;(4.8.1 requires binutils/2.23 to be loaded)&lt;br /&gt;
| &amp;lt;tt&amp;gt;powerpc64-bgq-linux-gcc, powerpc64-bgq-linux-g++, powerpc64-bgq-linux-gfortran&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;bgqgcc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Binutils&lt;br /&gt;
| 2.21.1, 2.23&lt;br /&gt;
| Cross-compilation utilities&lt;br /&gt;
| &amp;lt;tt&amp;gt;addr2line, ar, ld, ...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;binutils&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| CMake	&lt;br /&gt;
| 2.8.8&lt;br /&gt;
| cross-platform, open-source build system&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cmake&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Debug/performance tools'''''&lt;br /&gt;
|-&lt;br /&gt;
| gdb&lt;br /&gt;
| 7.2&lt;br /&gt;
| GNU Debugger&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gdb&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [http://www.allinea.com/products/ddt/ DDT]&lt;br /&gt;
| 4.1&lt;br /&gt;
| Allinea's Distributed Debugging Tool&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;ddt&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[HPCTW]]&lt;br /&gt;
| 1.0&lt;br /&gt;
| BGQ MPI and Hardware Counters&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmpihpm.a, libmpihpm_smp.a, libmpitrace.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hptibm&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| [[MemP]]&lt;br /&gt;
| 1.0.3&lt;br /&gt;
| BGQ Memory Stats&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmemP.a &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;memP&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Storage tools/libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
| 1.8.9-v18&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;h5ls, h5diff, ..., libhdf5&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;hdf5/189-v18-serial-xlc*&amp;lt;br/&amp;gt;hdf5/189-v18-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NetCDF&lt;br /&gt;
| 4.2.1.1&lt;br /&gt;
| Scientific data storage and retrieval&lt;br /&gt;
| &amp;lt;tt&amp;gt;ncdump,ncgen,libnetcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;netcdf/4.2.1.1-serial-xlc*&amp;lt;br/&amp;gt;netcdf/4.2.1.1-mpich2-xlc&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Parallel NetCDF&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Parallel scientific data storage and retrieval using MPI-IO&lt;br /&gt;
| &amp;lt;tt&amp;gt;libpnetcdf.a&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parallel-netcdf&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Libraries'''''&lt;br /&gt;
|-&lt;br /&gt;
| ESSL&lt;br /&gt;
| 5.1&lt;br /&gt;
| IBM Engineering and Scientific Subroutine Library (manual below)&lt;br /&gt;
| &amp;lt;tt&amp;gt;libesslbg,libesslsmpbg&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;essl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|- &lt;br /&gt;
| FFTW&lt;br /&gt;
| 2.1.5, 3.3.2, 3.1.2-esslwrapper&lt;br /&gt;
| Fast Fourier transform &lt;br /&gt;
| &amp;lt;tt&amp;gt;libsfftw,libdfftw,libfftw3, libfftw3f&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;fftw/2.1.5, fftw/3.3.2, fftw/3.1.2-esslwrapper&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAPACK + ScaLAPACK&lt;br /&gt;
| 3.4.2 + 2.0.2&lt;br /&gt;
| Linear algebra routines. A subset of Lapack may be found in ESSL as well.&lt;br /&gt;
| &amp;lt;tt&amp;gt;liblapack, libscalapack&amp;lt;/tt&amp;gt;&lt;br /&gt;
| lapack&lt;br /&gt;
|-&lt;br /&gt;
| GSL&lt;br /&gt;
| 1.15&lt;br /&gt;
| GNU Scientific Library&lt;br /&gt;
| &amp;lt;tt&amp;gt;libgsl, libgslcblas&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gsl&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| BOOST&lt;br /&gt;
| 1.47.0, 1.54&lt;br /&gt;
| C++ Boost libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libboost...&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;cxxlibraries/boost&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| bzip2+szip+zlib&lt;br /&gt;
| 1.0.6,2.1,1.2.3&lt;br /&gt;
| compression libraries&lt;br /&gt;
| &amp;lt;tt&amp;gt;libbz2,libz,libsz&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;compression&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| METIS&lt;br /&gt;
| 5.0.2&lt;br /&gt;
| Serial Graph Partitioning and Fill-reducing Matrix Ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;metis&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| ParMETIS&lt;br /&gt;
| 4.0.2&lt;br /&gt;
| Parallel graph partitioning and fill-reducing matrix ordering&lt;br /&gt;
| &amp;lt;tt&amp;gt;libparmetis&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;parmetis&amp;lt;/tt&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Scripting/interpreted languages'''''&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.6.6&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-2.6/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 2.7.3&lt;br /&gt;
| Python programming language. Modules included : numpy-1.8.0, pyFFTW-0.9.2, astropy-0.3, scipy-0.13.1, mpi4py-1.3.1, h5py-2.2.1&lt;br /&gt;
| &amp;lt;tt&amp;gt;/scinet/bgq/tools/Python/python2.7.3-20131205/bin/python&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[Python]]&lt;br /&gt;
| 3.2.2&lt;br /&gt;
| Python programming language&lt;br /&gt;
| &amp;lt;tt&amp;gt;/bgsys/tools/Python-3.2/bin/python3&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|colspan=5 style='background: #E0E0E0'|'''''Applications'''''&lt;br /&gt;
|-&lt;br /&gt;
| gnuplot&lt;br /&gt;
| 4.6.1&lt;br /&gt;
| interactive plotting program to be run on front-end nodes&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;gnuplot&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| LAMMPS&lt;br /&gt;
| Nov 2012&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;lmp_bgq&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;lammps&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| NAMD&lt;br /&gt;
| 2.9&lt;br /&gt;
| Molecular Dynamics &lt;br /&gt;
| &amp;lt;tt&amp;gt;namd2&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;namd&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.quantum-espresso.org/index.php Quantum Espresso]&lt;br /&gt;
| 5.0.3&lt;br /&gt;
| Molecular Structure / Quantum Chemistry &lt;br /&gt;
| &amp;lt;tt&amp;gt;qe_pw.x, etc&amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;espresso&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| [[BGQ_OpenFOAM | OpenFOAM]]&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| Computational Fluid Dynamics&lt;br /&gt;
| &amp;lt;tt&amp;gt;icofoam,etc. &amp;lt;/tt&amp;gt;&lt;br /&gt;
| &amp;lt;tt&amp;gt;openfoam&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
#BGQ Day: Intro to Using the BGQ&amp;lt;br/&amp;gt;[[File:BgqIntro-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html]]&amp;lt;br/&amp;gt;[[Media:Bgqintro.pdf|Slides ]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 (direct link)]&lt;br /&gt;
#BGQ Day: BGQ Hardware Overview&amp;lt;br/&amp;gt;[[File:Bgqhardware-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html]]&amp;lt;br/&amp;gt;[https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.html Video recording] [http://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 (direct link)]&lt;br /&gt;
# [http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUQUEEN/Documentation/Documention_node.html Julich BGQ Documentation]&lt;br /&gt;
# [https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q Argonne Mira BGQ Wiki]&lt;br /&gt;
# [https://computing.llnl.gov/tutorials/bgq/ LLNL Sequoia BGQ Info]&lt;br /&gt;
# [https://www.alcf.anl.gov/presentations Argonne MiraCon Presentations]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247869/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ System Administration Guide]&lt;br /&gt;
# [http://www.redbooks.ibm.com/redbooks/SG247948/wwhelp/wwhimpl/js/html/wwhelp.htm BGQ Application Development ]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqccompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqclangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL C/C++ for Blue Gene/Q: [[Media:bgqcproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfgetstart.pdf|Getting started]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfcompiler.pdf|Compiler reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqflangref.pdf|Language reference]]&lt;br /&gt;
# IBM XL Fortran for Blue Gene/Q: [[Media:bgqfproguide.pdf|Optimization and Programming Guide]]&lt;br /&gt;
# [[Media:essl51.pdf|IBM ESSL (Engineering and Scientific Subroutine Library) 5.1 for Linux on Power]]&lt;br /&gt;
# [http://content.allinea.com/downloads/userguide.pdf Allinea DDT 4.1 User Guide]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--  PUT IN TRAC !!!&lt;br /&gt;
&lt;br /&gt;
=== *Manual Block Creation* ===&lt;br /&gt;
&lt;br /&gt;
To reconfigure the BGQ nodes you can use the bg_console or the web based navigator from the service node &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bg_console&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the&lt;br /&gt;
following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gen_small_block &amp;lt;blockid&amp;gt; &amp;lt;midplane&amp;gt; &amp;lt;cnodes&amp;gt; &amp;lt;nodeboard&amp;gt; &lt;br /&gt;
gen_small_block  R00-M0-N03-32 R00-M0 32 N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The block then needs to be booted using:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allocate R00-M0-N03-32&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If those resources are already booted into another block, that block must be freed before the new block can be &lt;br /&gt;
allocated.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
free R00-M0-N03&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many other functions in bg_console:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
help all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The BGQ default nomenclature for hardware is as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
(R)ack - (M)idplane - (N)ode board or block - (J)node - (C)ore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So R00-M01-N03-J00-C02 would correspond to the first rack, second midplane, fourth node board, first node, and third core (all numbered from zero).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hdf5_table&amp;diff=6729</id>
		<title>Hdf5 table</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hdf5_table&amp;diff=6729"/>
		<updated>2014-01-07T17:30:17Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Storing tables in HDF5 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=='''Storing tables in HDF5'''==&lt;br /&gt;
The HDF5 Table interface condenses the steps needed to create tables in HDF5. The datatype of the dataset that gets created is of type H5T_COMPOUND. The members of the table can have different datatypes.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Writing a table using Python (PyTables)===&lt;br /&gt;
PyTables is a package for managing hierarchical datasets, designed to cope efficiently and easily with extremely large amounts of data. &lt;br /&gt;
PyTables is built on top of the HDF5 library, using the Python language and the NumPy package.&lt;br /&gt;
The following example shows how to store a table of 10 records with 8 members :&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-  &lt;br /&gt;
! {{Hl2}} | name &lt;br /&gt;
! {{Hl2}} | ADCcount &lt;br /&gt;
! {{Hl2}} | grid_i &lt;br /&gt;
! {{Hl2}} | grid_j&lt;br /&gt;
! {{Hl2}} | pressure&lt;br /&gt;
! {{Hl2}} | energy &lt;br /&gt;
! {{Hl2}} | idnumber&lt;br /&gt;
! {{Hl2}} | pressure2&lt;br /&gt;
|- &lt;br /&gt;
|16-character String&lt;br /&gt;
|Unsigned short integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|float  (single-precision) &lt;br /&gt;
|double (double-precision)&lt;br /&gt;
|Signed 64-bit integer&lt;br /&gt;
|2-dim table of float (2*3)&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
The script has been run on gpc with the following modules :&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/4.8.1  intel/14.0.0  python/2.7.2  hdf5/1811-v18-serial-gcc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
PyTables 3.0.0 has been compiled in my scratch directory.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from tables import *&lt;br /&gt;
&lt;br /&gt;
class Particle(IsDescription):&lt;br /&gt;
    name      = StringCol(16)   # 16-character String&lt;br /&gt;
    ADCcount  = UInt16Col()     # Unsigned short integer&lt;br /&gt;
    grid_i    = Int32Col()      # 32-bit integer&lt;br /&gt;
    grid_j    = Int32Col()      # 32-bit integer&lt;br /&gt;
    pressure  = Float32Col()    # float  (single-precision)&lt;br /&gt;
    energy    = Float64Col()    # double (double-precision)&lt;br /&gt;
    idnumber  = Int64Col()      # Signed 64-bit integer&lt;br /&gt;
    pressure2    = Float32Col(shape=(2,3)) # array of floats (single-precision)&lt;br /&gt;
&lt;br /&gt;
h5file = open_file(&amp;quot;tutorial1.h5&amp;quot;, mode = &amp;quot;w&amp;quot;, title = &amp;quot;Test file&amp;quot;)&lt;br /&gt;
group = h5file.create_group(&amp;quot;/&amp;quot;, 'detector', 'Detector information')&lt;br /&gt;
table = h5file.create_table(group, 'readout', Particle, &amp;quot;Readout example&amp;quot;)&lt;br /&gt;
particle = table.row&lt;br /&gt;
for i in xrange(10):&lt;br /&gt;
    particle['name']  = 'Particle: %6d' % (i)&lt;br /&gt;
    particle['ADCcount'] = (i * 256) % (1 &amp;lt;&amp;lt; 16)&lt;br /&gt;
    particle['grid_i'] = i&lt;br /&gt;
    particle['grid_j'] = 10 - i&lt;br /&gt;
    particle['pressure'] = float(i*i)&lt;br /&gt;
    particle['energy'] = float(particle['pressure'] ** 4)&lt;br /&gt;
    particle['idnumber'] = i * (2 ** 34)&lt;br /&gt;
    particle['pressure2'] = [&lt;br /&gt;
        [0.5+float(i),1.5+float(i),2.5+float(i)],&lt;br /&gt;
        [-1.5+float(i),-2.5+float(i),-3.5+float(i)]]&lt;br /&gt;
    # Insert a new particle record                                                                                                                            &lt;br /&gt;
    particle.append()&lt;br /&gt;
&lt;br /&gt;
h5file.close()&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Reading the table with a C++ MPI code===&lt;br /&gt;
The following example shows how to read the table from an MPI program (each MPI process reads one individual record). &lt;br /&gt;
The code has been compiled and tested on the BlueGene with the following modules:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load vacpp/12.1  xlf/14.1  mpich2/xl hdf5/189-v18-mpich2-xlc&lt;br /&gt;
mpixlcxx -I$SCINET_HDF5_INC -L$SCINET_ZLIB_LIB -L$SCINET_SZIP_LIB -L$SCINET_HDF5_LIB Test.cpp -o Test -lhdf5_hl -lhdf5 -lsz -lz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Test.cpp '''(the alphabetical order of the variables is important: the C++ code has to read the variables in the same order as they appear in the HDF5 file; check with h5dump YourFile.h5)''' :&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;quot;hdf5.h&amp;quot;&lt;br /&gt;
#include &amp;quot;hdf5_hl.h&amp;quot;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;stdint.h&amp;gt;&lt;br /&gt;
#include &amp;lt;mpi.h&amp;gt;&lt;br /&gt;
#define NFIELDS  (hsize_t)  8&lt;br /&gt;
#define H5FILE_NAME     &amp;quot;tutorial1.h5&amp;quot;&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char *argv[])&lt;br /&gt;
{&lt;br /&gt;
  // DEF OF SIZE OF VARIABLES TO READ                                                                                                                         &lt;br /&gt;
  typedef struct Particle&lt;br /&gt;
  {&lt;br /&gt;
    unsigned short int    ADCcount;&lt;br /&gt;
    double energy;&lt;br /&gt;
    int grid_i;&lt;br /&gt;
    int grid_j;&lt;br /&gt;
    long  idnumber;&lt;br /&gt;
    char   name[16];&lt;br /&gt;
    float pressure;&lt;br /&gt;
    float pressure2[2][3];&lt;br /&gt;
  } Particle;&lt;br /&gt;
&lt;br /&gt;
  /* Calculate the size and the offsets of our struct members in memory */&lt;br /&gt;
  size_t dst_size =  sizeof( Particle );&lt;br /&gt;
&lt;br /&gt;
  size_t dst_offset[NFIELDS] = {&lt;br /&gt;
    HOFFSET( Particle, ADCcount ),&lt;br /&gt;
    HOFFSET( Particle, energy ),&lt;br /&gt;
    HOFFSET( Particle, grid_i ),&lt;br /&gt;
    HOFFSET( Particle, grid_j ),&lt;br /&gt;
    HOFFSET( Particle, idnumber ),&lt;br /&gt;
    HOFFSET( Particle, name ),&lt;br /&gt;
    HOFFSET( Particle, pressure ),&lt;br /&gt;
    HOFFSET( Particle, pressure2),&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  //////////////////////////////////////////////////////////////////////////////////////////////////////////////                                              &lt;br /&gt;
  //MPI                                                                                                                                                       &lt;br /&gt;
&lt;br /&gt;
  //HDF5 APIs definitions                                                                                                                                     &lt;br /&gt;
  hid_t       file_id;         /* file and dataset identifiers */&lt;br /&gt;
  hid_t plist_id;        /* property list identifier( access template) */&lt;br /&gt;
  herr_t status;&lt;br /&gt;
&lt;br /&gt;
  // MPI variables                                                                                                                                            &lt;br /&gt;
  int mpi_size, mpi_rank;&lt;br /&gt;
  MPI_Comm comm  = MPI_COMM_WORLD;&lt;br /&gt;
  MPI_Info info  = MPI_INFO_NULL;&lt;br /&gt;
&lt;br /&gt;
  //Initialize MPI                                                                                                                                            &lt;br /&gt;
  MPI_Init(&amp;amp;argc, &amp;amp;argv);&lt;br /&gt;
  MPI_Comm_size(comm, &amp;amp;mpi_size);&lt;br /&gt;
  MPI_Comm_rank(comm, &amp;amp;mpi_rank);&lt;br /&gt;
&lt;br /&gt;
  // Set up file access property list with parallel I/O access                                                                                                &lt;br /&gt;
  plist_id = H5Pcreate(H5P_FILE_ACCESS);//creates a new property list as an instance of some property list class                                              &lt;br /&gt;
  H5Pset_fapl_mpio(plist_id, comm, info);&lt;br /&gt;
&lt;br /&gt;
  // Read file collectively.                                                                                                                                  &lt;br /&gt;
  file_id = H5Fopen(H5FILE_NAME, H5F_ACC_RDONLY, plist_id);//H5F_ACC_RDONLY : read-only mode                                                                  &lt;br /&gt;
&lt;br /&gt;
  Particle  dst_buf[1];&lt;br /&gt;
  size_t dst_sizes[NFIELDS] = {&lt;br /&gt;
    sizeof( dst_buf[0].ADCcount),&lt;br /&gt;
    sizeof( dst_buf[0].energy),&lt;br /&gt;
    sizeof( dst_buf[0].grid_i),&lt;br /&gt;
    sizeof( dst_buf[0].grid_j),&lt;br /&gt;
    sizeof( dst_buf[0].idnumber),&lt;br /&gt;
    sizeof( dst_buf[0].name),&lt;br /&gt;
    sizeof( dst_buf[0].pressure),&lt;br /&gt;
    sizeof( dst_buf[0].pressure2)&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  //READ FRACTION OF TABLE : example reading one record per MPI process                                                                                       &lt;br /&gt;
  hsize_t start=mpi_rank;//read Record number mpi_rank                                                                                                        &lt;br /&gt;
  hsize_t nrecords=1;//read 1 record                                                                                                                          &lt;br /&gt;
  status=H5TBread_records(file_id,&amp;quot;/detector/readout&amp;quot;,start,nrecords,dst_size,dst_offset,dst_sizes,dst_buf);&lt;br /&gt;
&lt;br /&gt;
  std::cout&amp;lt;&amp;lt;&amp;quot;Rank = &amp;quot;&amp;lt;&amp;lt;mpi_rank&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,ADCcount = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].ADCcount&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,idnumber = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].idnumber&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,grid_i = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].grid_i&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,grid_j = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].grid_j&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,pressure = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,name = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].name&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,energy = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].energy&lt;br /&gt;
           &amp;lt;&amp;lt;std::endl;&lt;br /&gt;
  for(int j=0;j&amp;lt;2;j++){&lt;br /&gt;
    std::cout&amp;lt;&amp;lt;&amp;quot;Rank = &amp;quot;&amp;lt;&amp;lt;mpi_rank&amp;lt;&amp;lt;&amp;quot; : &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure2[j][0]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure2[j][1]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure2[j][2]&amp;lt;&amp;lt;std::endl;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  //Close property list.                                                                                                                                      &lt;br /&gt;
  H5Pclose(plist_id);&lt;br /&gt;
&lt;br /&gt;
  // Close the file.                                                                                                                                          &lt;br /&gt;
  H5Fclose(file_id);&lt;br /&gt;
&lt;br /&gt;
  MPI_Finalize();&lt;br /&gt;
&lt;br /&gt;
  return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The job was launched on BlueGene as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 10 --ranks-per-node=1 --envs OMP_NUM_THREADS=1 : /PATH_TO_THE_TEST_DIRECTORY/Test&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The output of the job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Rank = 9 ,ADCcount = 2304 ,idnumber = 154618822656 ,grid_i = 9 ,grid_j = 1 ,pressure = 81 ,name = Particle:      9 ,energy = 4.30467e+07&lt;br /&gt;
Rank = 9 : 9.5 10.5 11.5&lt;br /&gt;
Rank = 9 : 7.5 6.5 5.5&lt;br /&gt;
Rank = 8 ,ADCcount = 2048 ,idnumber = 137438953472 ,grid_i = 8 ,grid_j = 2 ,pressure = 64 ,name = Particle:      8 ,energy = 1.67772e+07&lt;br /&gt;
Rank = 8 : 8.5 9.5 10.5&lt;br /&gt;
Rank = 8 : 6.5 5.5 4.5&lt;br /&gt;
Rank = 2 ,ADCcount = 512 ,idnumber = 34359738368 ,grid_i = 2 ,grid_j = 8 ,pressure = 4 ,name = Particle:      2 ,energy = 256&lt;br /&gt;
Rank = 2 : 2.5 3.5 4.5&lt;br /&gt;
Rank = 2 : 0.5 -0.5 -1.5&lt;br /&gt;
Rank = 7 ,ADCcount = 1792 ,idnumber = 120259084288 ,grid_i = 7 ,grid_j = 3 ,pressure = 49 ,name = Particle:      7 ,energy = 5.7648e+06&lt;br /&gt;
Rank = 7 : 7.5 8.5 9.5&lt;br /&gt;
Rank = 7 : 5.5 4.5 3.5&lt;br /&gt;
Rank = 4 ,ADCcount = 1024 ,idnumber = 68719476736 ,grid_i = 4 ,grid_j = 6 ,pressure = 16 ,name = Particle:      4 ,energy = 65536&lt;br /&gt;
Rank = 4 : 4.5 5.5 6.5&lt;br /&gt;
Rank = 4 : 2.5 1.5 0.5&lt;br /&gt;
Rank = 5 ,ADCcount = 1280 ,idnumber = 85899345920 ,grid_i = 5 ,grid_j = 5 ,pressure = 25 ,name = Particle:      5 ,energy = 390625&lt;br /&gt;
Rank = 5 : 5.5 6.5 7.5&lt;br /&gt;
Rank = 5 : 3.5 2.5 1.5&lt;br /&gt;
Rank = 6 ,ADCcount = 1536 ,idnumber = 103079215104 ,grid_i = 6 ,grid_j = 4 ,pressure = 36 ,name = Particle:      6 ,energy = 1.67962e+06&lt;br /&gt;
Rank = 6 : 6.5 7.5 8.5&lt;br /&gt;
Rank = 6 : 4.5 3.5 2.5&lt;br /&gt;
Rank = 0 ,ADCcount = 0 ,idnumber = 0 ,grid_i = 0 ,grid_j = 10 ,pressure = 0 ,name = Particle:      0 ,energy = 0&lt;br /&gt;
Rank = 0 : 0.5 1.5 2.5&lt;br /&gt;
Rank = 0 : -1.5 -2.5 -3.5&lt;br /&gt;
Rank = 1 ,ADCcount = 256 ,idnumber = 17179869184 ,grid_i = 1 ,grid_j = 9 ,pressure = 1 ,name = Particle:      1 ,energy = 1&lt;br /&gt;
Rank = 1 : 1.5 2.5 3.5&lt;br /&gt;
Rank = 1 : -0.5 -1.5 -2.5&lt;br /&gt;
Rank = 3 ,ADCcount = 768 ,idnumber = 51539607552 ,grid_i = 3 ,grid_j = 7 ,pressure = 9 ,name = Particle:      3 ,energy = 6561&lt;br /&gt;
Rank = 3 : 3.5 4.5 5.5&lt;br /&gt;
Rank = 3 : 1.5 0.5 -0.5&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hdf5_table&amp;diff=6728</id>
		<title>Hdf5 table</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hdf5_table&amp;diff=6728"/>
		<updated>2014-01-07T14:35:46Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Reading the table with a C++ code with MPI for parallel programming */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=='''Storing tables in HDF5'''==&lt;br /&gt;
The HDF5 Table interface condenses the steps needed to create tables in HDF5. The datatype of the dataset that gets created is of type H5T_COMPOUND. The members of the table can have different datatypes.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Writing a table using Python (PyTables)===&lt;br /&gt;
PyTables is a package for managing hierarchical datasets, designed to cope efficiently and easily with extremely large amounts of data. &lt;br /&gt;
PyTables is built on top of the HDF5 library, using the Python language and the NumPy package.&lt;br /&gt;
The following example shows how to store a table of 10 records with 8 members:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-  &lt;br /&gt;
! {{Hl2}} | name &lt;br /&gt;
! {{Hl2}} | ADCcount &lt;br /&gt;
! {{Hl2}} | grid_i &lt;br /&gt;
! {{Hl2}} | grid_j&lt;br /&gt;
! {{Hl2}} | pressure&lt;br /&gt;
! {{Hl2}} | energy &lt;br /&gt;
! {{Hl2}} | idnumber&lt;br /&gt;
! {{Hl2}} | pressure2&lt;br /&gt;
|- &lt;br /&gt;
|16-character String&lt;br /&gt;
|Unsigned short integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|float  (single-precision) &lt;br /&gt;
|double (double-precision)&lt;br /&gt;
|Signed 64-bit integer&lt;br /&gt;
|2-dim table of float (2*3)&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
The script was run on the GPC with the following modules:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/4.8.1  intel/14.0.0  python/2.7.2  hdf5/1811-v18-serial-gcc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
PyTables 3.0.0 was compiled in my scratch directory.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from tables import *&lt;br /&gt;
&lt;br /&gt;
class Particle(IsDescription):&lt;br /&gt;
    name      = StringCol(16)   # 16-character String                                                                                                         &lt;br /&gt;
    ADCcount  = UInt16Col()     # Unsigned short integer                                                                                                      &lt;br /&gt;
    grid_i    = Int32Col()      # 32-bit integer                                                                                                              &lt;br /&gt;
    grid_j    = Int32Col()      # 32-bit integer                                                                                                              &lt;br /&gt;
    pressure  = Float32Col()    # float  (single-precision)                                                                                                   &lt;br /&gt;
    energy    = Float64Col()    # double (double-precision)                                                                                                   &lt;br /&gt;
    idnumber  = Int64Col()      # Signed 64-bit integer                                                                                                       &lt;br /&gt;
    pressure2    = Float32Col(shape=(2,3)) # array of floats (single-precision)&lt;br /&gt;
&lt;br /&gt;
h5file = open_file(&amp;quot;tutorial1.h5&amp;quot;, mode = &amp;quot;w&amp;quot;, title = &amp;quot;Test file&amp;quot;)&lt;br /&gt;
group = h5file.create_group(&amp;quot;/&amp;quot;, 'detector', 'Detector information')&lt;br /&gt;
table = h5file.create_table(group, 'readout', Particle, &amp;quot;Readout example&amp;quot;)&lt;br /&gt;
particle = table.row&lt;br /&gt;
for i in xrange(10):&lt;br /&gt;
    particle['name']  = 'Particle: %6d' % (i)&lt;br /&gt;
    particle['ADCcount'] = (i * 256) % (1 &amp;lt;&amp;lt; 16)&lt;br /&gt;
    particle['grid_i'] = i&lt;br /&gt;
    particle['grid_j'] = 10 - i&lt;br /&gt;
    particle['pressure'] = float(i*i)&lt;br /&gt;
    particle['energy'] = float(particle['pressure'] ** 4)&lt;br /&gt;
    particle['idnumber'] = i * (2 ** 34)&lt;br /&gt;
    particle['pressure2'] = [&lt;br /&gt;
        [0.5+float(i),1.5+float(i),2.5+float(i)],&lt;br /&gt;
        [-1.5+float(i),-2.5+float(i),-3.5+float(i)]]&lt;br /&gt;
    # Insert a new particle record                                                                                                                            &lt;br /&gt;
    particle.append()&lt;br /&gt;
&lt;br /&gt;
h5file.close()&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
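As a sanity check, the field values written by the loop above follow directly from its formulas; a minimal sketch in plain Python (no HDF5 required; record index i = 9 is chosen to match rank 9 in the job output shown further down):

```python
# Reproduce the values the PyTables loop assigns for record i = 9
# using only the arithmetic from the write script above.
i = 9
adccount = (i * 256) % 2**16   # same as (i * 256) mod 65536
pressure = float(i * i)
energy = pressure ** 4
idnumber = i * 2**34

assert adccount == 2304            # matches 'ADCcount = 2304' for rank 9
assert pressure == 81.0            # matches 'pressure = 81'
assert energy == 43046721.0        # printed as 4.30467e+07
assert idnumber == 154618822656    # matches 'idnumber = 154618822656'
print(adccount, idnumber, pressure, energy)
```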
&lt;br /&gt;
===Reading the table with a C++ code with MPI for parallel programming===&lt;br /&gt;
The following example shows how to read the table from a C++ code using MPI (each MPI process reads one individual record). &lt;br /&gt;
The code was compiled and tested on BlueGene with the following modules:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load vacpp/12.1  xlf/14.1  mpich2/xl hdf5/189-v18-mpich2-xlc&lt;br /&gt;
mpixlcxx -I$SCINET_HDF5_INC -L$SCINET_ZLIB_LIB -L$SCINET_SZIP_LIB -L$SCINET_HDF5_LIB Test.cpp -o Test -lhdf5_hl -lhdf5 -lsz -lz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Test.cpp '''(the order of the variables (alphabetical) is important: the C++ code has to read the variables in the same order as in the HDF5 file; check with h5dump YourFile.h5)''':&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;quot;hdf5.h&amp;quot;&lt;br /&gt;
#include &amp;quot;hdf5_hl.h&amp;quot;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;stdint.h&amp;gt;&lt;br /&gt;
#include &amp;lt;mpi.h&amp;gt;&lt;br /&gt;
#define NFIELDS  (hsize_t)  8&lt;br /&gt;
#define H5FILE_NAME     &amp;quot;tutorial1.h5&amp;quot;&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char *argv[])&lt;br /&gt;
{&lt;br /&gt;
  // DEF OF SIZE OF VARIABLES TO READ                                                                                                                         &lt;br /&gt;
  typedef struct Particle&lt;br /&gt;
  {&lt;br /&gt;
    unsigned short int    ADCcount;&lt;br /&gt;
    double energy;&lt;br /&gt;
    int grid_i;&lt;br /&gt;
    int grid_j;&lt;br /&gt;
    long  idnumber;&lt;br /&gt;
    char   name[16];&lt;br /&gt;
    float pressure;&lt;br /&gt;
    float pressure2[2][3];&lt;br /&gt;
  } Particle;&lt;br /&gt;
&lt;br /&gt;
  /* Calculate the size and the offsets of our struct members in memory */&lt;br /&gt;
  size_t dst_size =  sizeof( Particle );&lt;br /&gt;
&lt;br /&gt;
  size_t dst_offset[NFIELDS] = {&lt;br /&gt;
    HOFFSET( Particle, ADCcount ),&lt;br /&gt;
    HOFFSET( Particle, energy ),&lt;br /&gt;
    HOFFSET( Particle, grid_i ),&lt;br /&gt;
    HOFFSET( Particle, grid_j ),&lt;br /&gt;
    HOFFSET( Particle, idnumber ),&lt;br /&gt;
    HOFFSET( Particle, name ),&lt;br /&gt;
    HOFFSET( Particle, pressure ),&lt;br /&gt;
    HOFFSET( Particle, pressure2),&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  //////////////////////////////////////////////////////////////////////////////////////////////////////////////                                              &lt;br /&gt;
  //MPI                                                                                                                                                       &lt;br /&gt;
&lt;br /&gt;
  //HDF5 APIs definitions                                                                                                                                     &lt;br /&gt;
  hid_t       file_id;         /* file and dataset identifiers */&lt;br /&gt;
  hid_t plist_id;        /* property list identifier( access template) */&lt;br /&gt;
  herr_t status;&lt;br /&gt;
&lt;br /&gt;
  // MPI variables                                                                                                                                            &lt;br /&gt;
  int mpi_size, mpi_rank;&lt;br /&gt;
  MPI_Comm comm  = MPI_COMM_WORLD;&lt;br /&gt;
  MPI_Info info  = MPI_INFO_NULL;&lt;br /&gt;
&lt;br /&gt;
  //Initialize MPI                                                                                                                                            &lt;br /&gt;
  MPI_Init(&amp;amp;argc, &amp;amp;argv);&lt;br /&gt;
  MPI_Comm_size(comm, &amp;amp;mpi_size);&lt;br /&gt;
  MPI_Comm_rank(comm, &amp;amp;mpi_rank);&lt;br /&gt;
&lt;br /&gt;
  // Set up file access property list with parallel I/O access                                                                                                &lt;br /&gt;
  plist_id = H5Pcreate(H5P_FILE_ACCESS);//creates a new property list as an instance of some property list class                                              &lt;br /&gt;
  H5Pset_fapl_mpio(plist_id, comm, info);&lt;br /&gt;
&lt;br /&gt;
  // Read file collectively.                                                                                                                                  &lt;br /&gt;
  file_id = H5Fopen(H5FILE_NAME, H5F_ACC_RDONLY, plist_id);//H5F_ACC_RDONLY : read-only mode                                                                  &lt;br /&gt;
&lt;br /&gt;
  Particle  dst_buf[1];&lt;br /&gt;
  size_t dst_sizes[NFIELDS] = {&lt;br /&gt;
    sizeof( dst_buf[0].ADCcount),&lt;br /&gt;
    sizeof( dst_buf[0].energy),&lt;br /&gt;
    sizeof( dst_buf[0].grid_i),&lt;br /&gt;
    sizeof( dst_buf[0].grid_j),&lt;br /&gt;
    sizeof( dst_buf[0].idnumber),&lt;br /&gt;
    sizeof( dst_buf[0].name),&lt;br /&gt;
    sizeof( dst_buf[0].pressure),&lt;br /&gt;
    sizeof( dst_buf[0].pressure2)&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  //READ FRACTION OF TABLE : example reading one record per MPI process                                                                                       &lt;br /&gt;
  hsize_t start=mpi_rank;//read Record number mpi_rank                                                                                                        &lt;br /&gt;
  hsize_t nrecords=1;//read 1 record                                                                                                                          &lt;br /&gt;
  status=H5TBread_records(file_id,&amp;quot;/detector/readout&amp;quot;,start,nrecords,dst_size,dst_offset,dst_sizes,dst_buf);&lt;br /&gt;
&lt;br /&gt;
  std::cout&amp;lt;&amp;lt;&amp;quot;Rank = &amp;quot;&amp;lt;&amp;lt;mpi_rank&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,ADCcount = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].ADCcount&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,idnumber = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].idnumber&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,grid_i = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].grid_i&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,grid_j = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].grid_j&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,pressure = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,name = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].name&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,energy = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].energy&lt;br /&gt;
           &amp;lt;&amp;lt;std::endl;&lt;br /&gt;
  for(int j=0;j&amp;lt;2;j++){&lt;br /&gt;
    std::cout&amp;lt;&amp;lt;&amp;quot;Rank = &amp;quot;&amp;lt;&amp;lt;mpi_rank&amp;lt;&amp;lt;&amp;quot; : &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure2[j][0]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure2[j][1]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure2[j][2]&amp;lt;&amp;lt;std::endl;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  //Close property list.                                                                                                                                      &lt;br /&gt;
  H5Pclose(plist_id);&lt;br /&gt;
&lt;br /&gt;
  // Close the file.                                                                                                                                          &lt;br /&gt;
  H5Fclose(file_id);&lt;br /&gt;
&lt;br /&gt;
  MPI_Finalize();&lt;br /&gt;
&lt;br /&gt;
  return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
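The alphabetical ordering the note above insists on can be verified in one line: a case-sensitive (ASCII) sort of the eight field names from the PyTables description reproduces exactly the member order of the Particle struct in Test.cpp. A sketch:

```python
# Field names as declared in the PyTables Particle description above.
fields = ['name', 'ADCcount', 'grid_i', 'grid_j',
          'pressure', 'energy', 'idnumber', 'pressure2']

# Member order used by the C++ Particle struct in Test.cpp.
struct_order = ['ADCcount', 'energy', 'grid_i', 'grid_j',
                'idnumber', 'name', 'pressure', 'pressure2']

# ASCII sort (uppercase before lowercase) yields the struct order,
# which is why ADCcount comes first and name sixth in the struct.
assert sorted(fields) == struct_order
print(sorted(fields))
```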
The job was launched on BlueGene as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 10 --ranks-per-node=1 --envs OMP_NUM_THREADS=1 : /PATH_TO_THE_TEST_DIRECTORY/Test&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The output of the job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Rank = 9 ,ADCcount = 2304 ,idnumber = 154618822656 ,grid_i = 9 ,grid_j = 1 ,pressure = 81 ,name = Particle:      9 ,energy = 4.30467e+07&lt;br /&gt;
Rank = 9 : 9.5 10.5 11.5&lt;br /&gt;
Rank = 9 : 7.5 6.5 5.5&lt;br /&gt;
Rank = 8 ,ADCcount = 2048 ,idnumber = 137438953472 ,grid_i = 8 ,grid_j = 2 ,pressure = 64 ,name = Particle:      8 ,energy = 1.67772e+07&lt;br /&gt;
Rank = 8 : 8.5 9.5 10.5&lt;br /&gt;
Rank = 8 : 6.5 5.5 4.5&lt;br /&gt;
Rank = 2 ,ADCcount = 512 ,idnumber = 34359738368 ,grid_i = 2 ,grid_j = 8 ,pressure = 4 ,name = Particle:      2 ,energy = 256&lt;br /&gt;
Rank = 2 : 2.5 3.5 4.5&lt;br /&gt;
Rank = 2 : 0.5 -0.5 -1.5&lt;br /&gt;
Rank = 7 ,ADCcount = 1792 ,idnumber = 120259084288 ,grid_i = 7 ,grid_j = 3 ,pressure = 49 ,name = Particle:      7 ,energy = 5.7648e+06&lt;br /&gt;
Rank = 7 : 7.5 8.5 9.5&lt;br /&gt;
Rank = 7 : 5.5 4.5 3.5&lt;br /&gt;
Rank = 4 ,ADCcount = 1024 ,idnumber = 68719476736 ,grid_i = 4 ,grid_j = 6 ,pressure = 16 ,name = Particle:      4 ,energy = 65536&lt;br /&gt;
Rank = 4 : 4.5 5.5 6.5&lt;br /&gt;
Rank = 4 : 2.5 1.5 0.5&lt;br /&gt;
Rank = 5 ,ADCcount = 1280 ,idnumber = 85899345920 ,grid_i = 5 ,grid_j = 5 ,pressure = 25 ,name = Particle:      5 ,energy = 390625&lt;br /&gt;
Rank = 5 : 5.5 6.5 7.5&lt;br /&gt;
Rank = 5 : 3.5 2.5 1.5&lt;br /&gt;
Rank = 6 ,ADCcount = 1536 ,idnumber = 103079215104 ,grid_i = 6 ,grid_j = 4 ,pressure = 36 ,name = Particle:      6 ,energy = 1.67962e+06&lt;br /&gt;
Rank = 6 : 6.5 7.5 8.5&lt;br /&gt;
Rank = 6 : 4.5 3.5 2.5&lt;br /&gt;
Rank = 0 ,ADCcount = 0 ,idnumber = 0 ,grid_i = 0 ,grid_j = 10 ,pressure = 0 ,name = Particle:      0 ,energy = 0&lt;br /&gt;
Rank = 0 : 0.5 1.5 2.5&lt;br /&gt;
Rank = 0 : -1.5 -2.5 -3.5&lt;br /&gt;
Rank = 1 ,ADCcount = 256 ,idnumber = 17179869184 ,grid_i = 1 ,grid_j = 9 ,pressure = 1 ,name = Particle:      1 ,energy = 1&lt;br /&gt;
Rank = 1 : 1.5 2.5 3.5&lt;br /&gt;
Rank = 1 : -0.5 -1.5 -2.5&lt;br /&gt;
Rank = 3 ,ADCcount = 768 ,idnumber = 51539607552 ,grid_i = 3 ,grid_j = 7 ,pressure = 9 ,name = Particle:      3 ,energy = 6561&lt;br /&gt;
Rank = 3 : 3.5 4.5 5.5&lt;br /&gt;
Rank = 3 : 1.5 0.5 -0.5&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hdf5_table&amp;diff=6727</id>
		<title>Hdf5 table</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hdf5_table&amp;diff=6727"/>
		<updated>2014-01-07T14:32:56Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Reading the table with a C++ code with MPI for parallel programming */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=='''Storing tables in HDF5'''==&lt;br /&gt;
The HDF5 Table interface condenses the steps needed to create tables in HDF5. The datatype of the dataset that gets created is of type H5T_COMPOUND. The members of the table can have different datatypes.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Writing a table using Python (PyTables)===&lt;br /&gt;
PyTables is a package for managing hierarchical datasets, designed to cope efficiently and easily with extremely large amounts of data. &lt;br /&gt;
PyTables is built on top of the HDF5 library, using the Python language and the NumPy package.&lt;br /&gt;
The following example shows how to store a table of 10 records with 8 members:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-  &lt;br /&gt;
! {{Hl2}} | name &lt;br /&gt;
! {{Hl2}} | ADCcount &lt;br /&gt;
! {{Hl2}} | grid_i &lt;br /&gt;
! {{Hl2}} | grid_j&lt;br /&gt;
! {{Hl2}} | pressure&lt;br /&gt;
! {{Hl2}} | energy &lt;br /&gt;
! {{Hl2}} | idnumber&lt;br /&gt;
! {{Hl2}} | pressure2&lt;br /&gt;
|- &lt;br /&gt;
|16-character String&lt;br /&gt;
|Unsigned short integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|float  (single-precision) &lt;br /&gt;
|double (double-precision)&lt;br /&gt;
|Signed 64-bit integer&lt;br /&gt;
|2-dim table of float (2*3)&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
The script was run on the GPC with the following modules:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/4.8.1  intel/14.0.0  python/2.7.2  hdf5/1811-v18-serial-gcc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
PyTables 3.0.0 was compiled in my scratch directory.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from tables import *&lt;br /&gt;
&lt;br /&gt;
class Particle(IsDescription):&lt;br /&gt;
    name      = StringCol(16)   # 16-character String                                                                                                         &lt;br /&gt;
    ADCcount  = UInt16Col()     # Unsigned short integer                                                                                                      &lt;br /&gt;
    grid_i    = Int32Col()      # 32-bit integer                                                                                                              &lt;br /&gt;
    grid_j    = Int32Col()      # 32-bit integer                                                                                                              &lt;br /&gt;
    pressure  = Float32Col()    # float  (single-precision)                                                                                                   &lt;br /&gt;
    energy    = Float64Col()    # double (double-precision)                                                                                                   &lt;br /&gt;
    idnumber  = Int64Col()      # Signed 64-bit integer                                                                                                       &lt;br /&gt;
    pressure2    = Float32Col(shape=(2,3)) # array of floats (single-precision)&lt;br /&gt;
&lt;br /&gt;
h5file = open_file(&amp;quot;tutorial1.h5&amp;quot;, mode = &amp;quot;w&amp;quot;, title = &amp;quot;Test file&amp;quot;)&lt;br /&gt;
group = h5file.create_group(&amp;quot;/&amp;quot;, 'detector', 'Detector information')&lt;br /&gt;
table = h5file.create_table(group, 'readout', Particle, &amp;quot;Readout example&amp;quot;)&lt;br /&gt;
particle = table.row&lt;br /&gt;
for i in xrange(10):&lt;br /&gt;
    particle['name']  = 'Particle: %6d' % (i)&lt;br /&gt;
    particle['ADCcount'] = (i * 256) % (1 &amp;lt;&amp;lt; 16)&lt;br /&gt;
    particle['grid_i'] = i&lt;br /&gt;
    particle['grid_j'] = 10 - i&lt;br /&gt;
    particle['pressure'] = float(i*i)&lt;br /&gt;
    particle['energy'] = float(particle['pressure'] ** 4)&lt;br /&gt;
    particle['idnumber'] = i * (2 ** 34)&lt;br /&gt;
    particle['pressure2'] = [&lt;br /&gt;
        [0.5+float(i),1.5+float(i),2.5+float(i)],&lt;br /&gt;
        [-1.5+float(i),-2.5+float(i),-3.5+float(i)]]&lt;br /&gt;
    # Insert a new particle record                                                                                                                            &lt;br /&gt;
    particle.append()&lt;br /&gt;
&lt;br /&gt;
h5file.close()&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Reading the table with a C++ code with MPI for parallel programming===&lt;br /&gt;
The following example shows how to read the table from a C++ code using MPI (each MPI process reads one individual record). &lt;br /&gt;
The code was compiled and tested on BlueGene with the following modules:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load vacpp/12.1  xlf/14.1  mpich2/xl hdf5/189-v18-mpich2-xlc&lt;br /&gt;
mpixlcxx -I$SCINET_HDF5_INC -L$SCINET_ZLIB_LIB -L$SCINET_SZIP_LIB -L$SCINET_HDF5_LIB Test.cpp -o Test -lhdf5_hl -lhdf5 -lsz -lz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Test.cpp (the order of the variables is important: the C++ code has to read the variables in the same order as in the HDF5 file):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;quot;hdf5.h&amp;quot;&lt;br /&gt;
#include &amp;quot;hdf5_hl.h&amp;quot;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;stdint.h&amp;gt;&lt;br /&gt;
#include &amp;lt;mpi.h&amp;gt;&lt;br /&gt;
#define NFIELDS  (hsize_t)  8&lt;br /&gt;
#define H5FILE_NAME     &amp;quot;tutorial1.h5&amp;quot;&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char *argv[])&lt;br /&gt;
{&lt;br /&gt;
  // DEF OF SIZE OF VARIABLES TO READ                                                                                                                         &lt;br /&gt;
  typedef struct Particle&lt;br /&gt;
  {&lt;br /&gt;
    unsigned short int    ADCcount;&lt;br /&gt;
    double energy;&lt;br /&gt;
    int grid_i;&lt;br /&gt;
    int grid_j;&lt;br /&gt;
    long  idnumber;&lt;br /&gt;
    char   name[16];&lt;br /&gt;
    float pressure;&lt;br /&gt;
    float pressure2[2][3];&lt;br /&gt;
  } Particle;&lt;br /&gt;
&lt;br /&gt;
  /* Calculate the size and the offsets of our struct members in memory */&lt;br /&gt;
  size_t dst_size =  sizeof( Particle );&lt;br /&gt;
&lt;br /&gt;
  size_t dst_offset[NFIELDS] = {&lt;br /&gt;
    HOFFSET( Particle, ADCcount ),&lt;br /&gt;
    HOFFSET( Particle, energy ),&lt;br /&gt;
    HOFFSET( Particle, grid_i ),&lt;br /&gt;
    HOFFSET( Particle, grid_j ),&lt;br /&gt;
    HOFFSET( Particle, idnumber ),&lt;br /&gt;
    HOFFSET( Particle, name ),&lt;br /&gt;
    HOFFSET( Particle, pressure ),&lt;br /&gt;
    HOFFSET( Particle, pressure2),&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  //////////////////////////////////////////////////////////////////////////////////////////////////////////////                                              &lt;br /&gt;
  //MPI                                                                                                                                                       &lt;br /&gt;
&lt;br /&gt;
  //HDF5 APIs definitions                                                                                                                                     &lt;br /&gt;
  hid_t       file_id;         /* file and dataset identifiers */&lt;br /&gt;
  hid_t plist_id;        /* property list identifier( access template) */&lt;br /&gt;
  herr_t status;&lt;br /&gt;
&lt;br /&gt;
  // MPI variables                                                                                                                                            &lt;br /&gt;
  int mpi_size, mpi_rank;&lt;br /&gt;
  MPI_Comm comm  = MPI_COMM_WORLD;&lt;br /&gt;
  MPI_Info info  = MPI_INFO_NULL;&lt;br /&gt;
&lt;br /&gt;
  //Initialize MPI                                                                                                                                            &lt;br /&gt;
  MPI_Init(&amp;amp;argc, &amp;amp;argv);&lt;br /&gt;
  MPI_Comm_size(comm, &amp;amp;mpi_size);&lt;br /&gt;
  MPI_Comm_rank(comm, &amp;amp;mpi_rank);&lt;br /&gt;
&lt;br /&gt;
  // Set up file access property list with parallel I/O access                                                                                                &lt;br /&gt;
  plist_id = H5Pcreate(H5P_FILE_ACCESS);//creates a new property list as an instance of some property list class                                              &lt;br /&gt;
  H5Pset_fapl_mpio(plist_id, comm, info);&lt;br /&gt;
&lt;br /&gt;
  // Read file collectively.                                                                                                                                  &lt;br /&gt;
  file_id = H5Fopen(H5FILE_NAME, H5F_ACC_RDONLY, plist_id);//H5F_ACC_RDONLY : read-only mode                                                                  &lt;br /&gt;
&lt;br /&gt;
  Particle  dst_buf[1];&lt;br /&gt;
  size_t dst_sizes[NFIELDS] = {&lt;br /&gt;
    sizeof( dst_buf[0].ADCcount),&lt;br /&gt;
    sizeof( dst_buf[0].energy),&lt;br /&gt;
    sizeof( dst_buf[0].grid_i),&lt;br /&gt;
    sizeof( dst_buf[0].grid_j),&lt;br /&gt;
    sizeof( dst_buf[0].idnumber),&lt;br /&gt;
    sizeof( dst_buf[0].name),&lt;br /&gt;
    sizeof( dst_buf[0].pressure),&lt;br /&gt;
    sizeof( dst_buf[0].pressure2)&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  //READ FRACTION OF TABLE : example reading one record per MPI process                                                                                       &lt;br /&gt;
  hsize_t start=mpi_rank;//read Record number mpi_rank                                                                                                        &lt;br /&gt;
  hsize_t nrecords=1;//read 1 record                                                                                                                          &lt;br /&gt;
  status=H5TBread_records(file_id,&amp;quot;/detector/readout&amp;quot;,start,nrecords,dst_size,dst_offset,dst_sizes,dst_buf);&lt;br /&gt;
&lt;br /&gt;
  std::cout&amp;lt;&amp;lt;&amp;quot;Rank = &amp;quot;&amp;lt;&amp;lt;mpi_rank&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,ADCcount = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].ADCcount&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,idnumber = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].idnumber&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,grid_i = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].grid_i&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,grid_j = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].grid_j&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,pressure = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,name = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].name&lt;br /&gt;
           &amp;lt;&amp;lt;&amp;quot; ,energy = &amp;quot;&amp;lt;&amp;lt;dst_buf[0].energy&lt;br /&gt;
           &amp;lt;&amp;lt;std::endl;&lt;br /&gt;
  for(int j=0;j&amp;lt;2;j++){&lt;br /&gt;
    std::cout&amp;lt;&amp;lt;&amp;quot;Rank = &amp;quot;&amp;lt;&amp;lt;mpi_rank&amp;lt;&amp;lt;&amp;quot; : &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure2[j][0]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure2[j][1]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;dst_buf[0].pressure2[j][2]&amp;lt;&amp;lt;std::endl;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  //Close property list.                                                                                                                                      &lt;br /&gt;
  H5Pclose(plist_id);&lt;br /&gt;
&lt;br /&gt;
  // Close the file.                                                                                                                                          &lt;br /&gt;
  H5Fclose(file_id);&lt;br /&gt;
&lt;br /&gt;
  MPI_Finalize();&lt;br /&gt;
&lt;br /&gt;
  return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The job was launched on BlueGene as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
runjob --np 10 --ranks-per-node=1 --envs OMP_NUM_THREADS=1 : /PATH_TO_THE_TEST_DIRECTORY/Test&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The output of the job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Rank = 9 ,ADCcount = 2304 ,idnumber = 154618822656 ,grid_i = 9 ,grid_j = 1 ,pressure = 81 ,name = Particle:      9 ,energy = 4.30467e+07&lt;br /&gt;
Rank = 9 : 9.5 10.5 11.5&lt;br /&gt;
Rank = 9 : 7.5 6.5 5.5&lt;br /&gt;
Rank = 8 ,ADCcount = 2048 ,idnumber = 137438953472 ,grid_i = 8 ,grid_j = 2 ,pressure = 64 ,name = Particle:      8 ,energy = 1.67772e+07&lt;br /&gt;
Rank = 8 : 8.5 9.5 10.5&lt;br /&gt;
Rank = 8 : 6.5 5.5 4.5&lt;br /&gt;
Rank = 2 ,ADCcount = 512 ,idnumber = 34359738368 ,grid_i = 2 ,grid_j = 8 ,pressure = 4 ,name = Particle:      2 ,energy = 256&lt;br /&gt;
Rank = 2 : 2.5 3.5 4.5&lt;br /&gt;
Rank = 2 : 0.5 -0.5 -1.5&lt;br /&gt;
Rank = 7 ,ADCcount = 1792 ,idnumber = 120259084288 ,grid_i = 7 ,grid_j = 3 ,pressure = 49 ,name = Particle:      7 ,energy = 5.7648e+06&lt;br /&gt;
Rank = 7 : 7.5 8.5 9.5&lt;br /&gt;
Rank = 7 : 5.5 4.5 3.5&lt;br /&gt;
Rank = 4 ,ADCcount = 1024 ,idnumber = 68719476736 ,grid_i = 4 ,grid_j = 6 ,pressure = 16 ,name = Particle:      4 ,energy = 65536&lt;br /&gt;
Rank = 4 : 4.5 5.5 6.5&lt;br /&gt;
Rank = 4 : 2.5 1.5 0.5&lt;br /&gt;
Rank = 5 ,ADCcount = 1280 ,idnumber = 85899345920 ,grid_i = 5 ,grid_j = 5 ,pressure = 25 ,name = Particle:      5 ,energy = 390625&lt;br /&gt;
Rank = 5 : 5.5 6.5 7.5&lt;br /&gt;
Rank = 5 : 3.5 2.5 1.5&lt;br /&gt;
Rank = 6 ,ADCcount = 1536 ,idnumber = 103079215104 ,grid_i = 6 ,grid_j = 4 ,pressure = 36 ,name = Particle:      6 ,energy = 1.67962e+06&lt;br /&gt;
Rank = 6 : 6.5 7.5 8.5&lt;br /&gt;
Rank = 6 : 4.5 3.5 2.5&lt;br /&gt;
Rank = 0 ,ADCcount = 0 ,idnumber = 0 ,grid_i = 0 ,grid_j = 10 ,pressure = 0 ,name = Particle:      0 ,energy = 0&lt;br /&gt;
Rank = 0 : 0.5 1.5 2.5&lt;br /&gt;
Rank = 0 : -1.5 -2.5 -3.5&lt;br /&gt;
Rank = 1 ,ADCcount = 256 ,idnumber = 17179869184 ,grid_i = 1 ,grid_j = 9 ,pressure = 1 ,name = Particle:      1 ,energy = 1&lt;br /&gt;
Rank = 1 : 1.5 2.5 3.5&lt;br /&gt;
Rank = 1 : -0.5 -1.5 -2.5&lt;br /&gt;
Rank = 3 ,ADCcount = 768 ,idnumber = 51539607552 ,grid_i = 3 ,grid_j = 7 ,pressure = 9 ,name = Particle:      3 ,energy = 6561&lt;br /&gt;
Rank = 3 : 3.5 4.5 5.5&lt;br /&gt;
Rank = 3 : 1.5 0.5 -0.5&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=NetCDF_table&amp;diff=6726</id>
		<title>NetCDF table</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=NetCDF_table&amp;diff=6726"/>
		<updated>2014-01-06T14:38:20Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Using Parallel netCDF */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=='''Storing tables in NetCDF4'''==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Writing a table using Python ===&lt;br /&gt;
The following example shows how to store a table of 10 records, each with the 8 fields below:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-  &lt;br /&gt;
! {{Hl2}} | name &lt;br /&gt;
! {{Hl2}} | ADCcount &lt;br /&gt;
! {{Hl2}} | grid_i &lt;br /&gt;
! {{Hl2}} | grid_j&lt;br /&gt;
! {{Hl2}} | pressure&lt;br /&gt;
! {{Hl2}} | energy &lt;br /&gt;
! {{Hl2}} | idnumber&lt;br /&gt;
! {{Hl2}} | pressure2&lt;br /&gt;
|- &lt;br /&gt;
|16-character String&lt;br /&gt;
|Unsigned short integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|float  (single-precision) &lt;br /&gt;
|double (double-precision)&lt;br /&gt;
|Signed 64-bit integer&lt;br /&gt;
|2-dim table of float (2*3)&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
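The field layout above can be sketched with Python's standard struct module. This is a hypothetical packed layout for illustration only; the actual on-disk layout is managed by NetCDF/HDF5 and may include alignment padding:&lt;br /&gt;

```python
import struct

# Packed (unpadded) layout of one record, mirroring the table above:
# 16s = 16-char name, H = unsigned short, 2i = two 32-bit ints,
# f = single-precision float, d = double, q = signed 64-bit int,
# 6f = the 2x3 array of floats, flattened.
fmt = '=16sH2ifdq6f'
print(struct.calcsize(fmt))  # 70 bytes of payload per record
```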
The script was run on the GPC with the following modules loaded:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/4.8.1 hdf5/187-v18-serial-gcc netcdf/4.1.3_hdf5_serial-gcc intel/14.0.0 python/2.7.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from netCDF4 import Dataset&lt;br /&gt;
from netCDF4 import chartostring, stringtoarr&lt;br /&gt;
import numpy&lt;br /&gt;
&lt;br /&gt;
f = Dataset('particles.nc','w',format='NETCDF4')&lt;br /&gt;
&lt;br /&gt;
size = 10&lt;br /&gt;
&lt;br /&gt;
Particle = numpy.dtype([('name', 'S1', 16),                       # 16-character String                                                &lt;br /&gt;
                        ('ADCcount',numpy.uint16),                # Unsigned short integer                                             &lt;br /&gt;
                        ('grid_i',numpy.int32),                   # 32-bit integer                                                     &lt;br /&gt;
                        ('grid_j',numpy.int32),                   # 32-bit integer                                                     &lt;br /&gt;
                        ('pressure',numpy.float32),               # float  (single-precision)                                          &lt;br /&gt;
                        ('energy',numpy.float64),                 # double (double-precision)                                          &lt;br /&gt;
                        ('idnumber',numpy.int64),                 # Signed 64-bit integer                                              &lt;br /&gt;
                        ('pressure2' , numpy.float32 , (2,3) )    # array of floats (single-precision) table 2 lines * 3 columns       &lt;br /&gt;
                        ])&lt;br /&gt;
&lt;br /&gt;
Particle_t = f.createCompoundType(Particle,'Particle')&lt;br /&gt;
&lt;br /&gt;
f.createDimension('NRecords',None)&lt;br /&gt;
v = f.createVariable('Data',Particle_t,'NRecords')&lt;br /&gt;
data = numpy.empty(size,Particle)&lt;br /&gt;
&lt;br /&gt;
for i in xrange(size):&lt;br /&gt;
    data['name'][i] = stringtoarr('Particle: %6d' % (i),16)&lt;br /&gt;
    data['ADCcount'][i] = (i * 256) % (1 &amp;lt;&amp;lt; 16)&lt;br /&gt;
    data['grid_i'][i] = i&lt;br /&gt;
    data['grid_j'][i] = 10 - i&lt;br /&gt;
    data['pressure'][i] = float(i*i)&lt;br /&gt;
    data['energy'][i] = float(data['pressure'][i] ** 4)&lt;br /&gt;
    data['idnumber'][i] = i * (2 ** 34)&lt;br /&gt;
    data['pressure2'][i] = [&lt;br /&gt;
        [0.5+float(i),1.5+float(i),2.5+float(i)],&lt;br /&gt;
        [-1.5+float(i),-2.5+float(i),-3.5+float(i)]]&lt;br /&gt;
#Fill data in File                                                                                                                     &lt;br /&gt;
v[:] = data&lt;br /&gt;
&lt;br /&gt;
f.close()&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
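The numbers in the file follow simple closed forms: pressure is i*i, energy is pressure to the 4th power (that is, i to the 8th), idnumber is i times 2**34, and ADCcount is i*256 modulo 2**16. A quick pure-Python check of the last record, with no NetCDF dependency:&lt;br /&gt;

```python
# Recompute the field values generated by the loop above for i = 9,
# the last record in the file (pure Python, no netCDF4 or numpy needed).
i = 9
pressure = float(i * i)           # 81.0
energy = pressure ** 4            # 81.0**4 == 9**8 == 43046721.0
idnumber = i * (2 ** 34)          # 154618822656
adccount = (i * 256) % (2 ** 16)  # 2304
print(pressure, energy, idnumber, adccount)
```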
The contents of the NetCDF file can be dumped with ncdump particles.nc:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
netcdf particles {&lt;br /&gt;
types:&lt;br /&gt;
  compound Particle {&lt;br /&gt;
    char name(16) ;&lt;br /&gt;
    ushort ADCcount ;&lt;br /&gt;
    int grid_i ;&lt;br /&gt;
    int grid_j ;&lt;br /&gt;
    float pressure ;&lt;br /&gt;
    double energy ;&lt;br /&gt;
    int64 idnumber ;&lt;br /&gt;
    float pressure2(2, 3) ;&lt;br /&gt;
  }; // Particle&lt;br /&gt;
dimensions:&lt;br /&gt;
	NRecords = UNLIMITED ; // (10 currently)&lt;br /&gt;
variables:&lt;br /&gt;
	Particle Data(NRecords) ;&lt;br /&gt;
data:&lt;br /&gt;
&lt;br /&gt;
 Data = &lt;br /&gt;
    {{&amp;quot;Particle:      0&amp;quot;}, 0, 0, 10, 0, 0, 0, {0.5, 1.5, 2.5, -1.5, -2.5, -3.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      1&amp;quot;}, 256, 1, 9, 1, 1, 17179869184, {1.5, 2.5, 3.5, -0.5, -1.5, -2.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      2&amp;quot;}, 512, 2, 8, 4, 256, 34359738368, {2.5, 3.5, 4.5, 0.5, -0.5, -1.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      3&amp;quot;}, 768, 3, 7, 9, 6561, 51539607552, {3.5, 4.5, 5.5, 1.5, 0.5, -0.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      4&amp;quot;}, 1024, 4, 6, 16, 65536, 68719476736, {4.5, 5.5, 6.5, 2.5, 1.5, 0.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      5&amp;quot;}, 1280, 5, 5, 25, 390625, 85899345920, {5.5, 6.5, 7.5, 3.5, 2.5, 1.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      6&amp;quot;}, 1536, 6, 4, 36, 1679616, 103079215104, {6.5, 7.5, 8.5, 4.5, 3.5, 2.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      7&amp;quot;}, 1792, 7, 3, 49, 5764801, 120259084288, {7.5, 8.5, 9.5, 5.5, 4.5, 3.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      8&amp;quot;}, 2048, 8, 2, 64, 16777216, 137438953472, {8.5, 9.5, 10.5, 6.5, 5.5, 4.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      9&amp;quot;}, 2304, 9, 1, 81, 43046721, 154618822656, {9.5, 10.5, 11.5, 7.5, 6.5, 5.5}} ;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Reading the table with a C++ code ===&lt;br /&gt;
The code was compiled and tested on the GPC with the following modules loaded:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/4.8.1 hdf5/187-v18-serial-gcc netcdf/4.1.3_hdf5_serial-gcc intel/14.0.0 python/2.7.2&lt;br /&gt;
gcc -I$SCINET_NETCDF_INC Test.cpp -o Test -lnetcdf_c++&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Test.cpp :&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;netcdfcpp.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;string.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
using namespace std;&lt;br /&gt;
&lt;br /&gt;
#define FILE_NAME &amp;quot;particles.nc&amp;quot;&lt;br /&gt;
#define DIM_LEN 10 //number of records in file &lt;br /&gt;
int main(void)&lt;br /&gt;
{&lt;br /&gt;
  typedef struct Particle&lt;br /&gt;
  {&lt;br /&gt;
    char   name[16];&lt;br /&gt;
    unsigned short int    ADCcount;&lt;br /&gt;
    int grid_i;&lt;br /&gt;
    int grid_j;&lt;br /&gt;
    float pressure;&lt;br /&gt;
    double energy ;&lt;br /&gt;
    long  idnumber;&lt;br /&gt;
    float pressure2[2][3];&lt;br /&gt;
  } Particle;&lt;br /&gt;
&lt;br /&gt;
  //Definition of the data_in variable&lt;br /&gt;
  Particle data_in[DIM_LEN];&lt;br /&gt;
&lt;br /&gt;
  //Initialization of the variable&lt;br /&gt;
  for (int i=0; i&amp;lt;DIM_LEN; i++){&lt;br /&gt;
    data_in[i].ADCcount=0;&lt;br /&gt;
    data_in[i].grid_i=0;&lt;br /&gt;
    data_in[i].grid_j=0;&lt;br /&gt;
    data_in[i].pressure=0.;&lt;br /&gt;
    data_in[i].energy=0.;&lt;br /&gt;
    for (int j=0; j&amp;lt;2; j++){&lt;br /&gt;
      for (int k=0; k&amp;lt;3; k++){&lt;br /&gt;
        data_in[i].pressure2[j][k]=0.;&lt;br /&gt;
      }&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  //Variables needed to open and read the NetCDF file:&lt;br /&gt;
  int ncid, varid;&lt;br /&gt;
&lt;br /&gt;
  //Open the NetCDF file (NC_NOWRITE for read only, NC_WRITE for read and write):&lt;br /&gt;
  if (nc_open(FILE_NAME, NC_NOWRITE, &amp;amp;ncid)) cout&amp;lt;&amp;lt;&amp;quot;ERROR&amp;quot;&amp;lt;&amp;lt;endl;&lt;br /&gt;
  //Look up the id of the &amp;quot;Data&amp;quot; variable:&lt;br /&gt;
  if (nc_inq_varid(ncid, &amp;quot;Data&amp;quot;, &amp;amp;varid)) cout&amp;lt;&amp;lt;&amp;quot;ERROR&amp;quot;&amp;lt;&amp;lt;endl;&lt;br /&gt;
  //Read the data and fill the variable data_in:&lt;br /&gt;
  if (nc_get_var(ncid, varid, data_in)) cout&amp;lt;&amp;lt;&amp;quot;ERROR&amp;quot;&amp;lt;&amp;lt;endl;&lt;br /&gt;
&lt;br /&gt;
  for (int i=0; i&amp;lt;DIM_LEN; i++){&lt;br /&gt;
    std::cout&amp;lt;&amp;lt;&amp;quot;ADCcount = &amp;quot;&amp;lt;&amp;lt;data_in[i].ADCcount&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,idnumber = &amp;quot;&amp;lt;&amp;lt;data_in[i].idnumber&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,grid_i = &amp;quot;&amp;lt;&amp;lt;data_in[i].grid_i&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,grid_j = &amp;quot;&amp;lt;&amp;lt;data_in[i].grid_j&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,pressure = &amp;quot;&amp;lt;&amp;lt;data_in[i].pressure&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,energy = &amp;quot;&amp;lt;&amp;lt;data_in[i].energy&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,name = &amp;quot;&amp;lt;&amp;lt;data_in[i].name&lt;br /&gt;
             &amp;lt;&amp;lt;std::endl;&lt;br /&gt;
    for(int j=0;j&amp;lt;2;j++){&lt;br /&gt;
      std::cout&amp;lt;&amp;lt;data_in[i].pressure2[j][0]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;data_in[i].pressure2[j][1]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;data_in[i].pressure2[j][2]&amp;lt;&amp;lt;std::endl;&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
cout &amp;lt;&amp;lt; &amp;quot;*** SUCCESS reading example file &amp;quot;&amp;lt;&amp;lt;FILE_NAME&amp;lt;&amp;lt;&amp;quot;!&amp;quot; &amp;lt;&amp;lt; endl;&lt;br /&gt;
  return 0;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
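For nc_get_var to fill data_in correctly, the C++ struct must declare the same fields, in the same order and with the same widths, as the compound type in the file. As a rough cross-check, here is a hypothetical ctypes mirror of the struct; the size it reports reflects the compiler's in-memory padding, which can differ from the HDF5 compound layout:&lt;br /&gt;

```python
import ctypes

# Illustrative ctypes mirror of the C++ Particle struct above.
class Particle(ctypes.Structure):
    _fields_ = [
        ('name', ctypes.c_char * 16),
        ('ADCcount', ctypes.c_uint16),
        ('grid_i', ctypes.c_int32),
        ('grid_j', ctypes.c_int32),
        ('pressure', ctypes.c_float),
        ('energy', ctypes.c_double),
        ('idnumber', ctypes.c_int64),
        ('pressure2', (ctypes.c_float * 3) * 2),
    ]

# In-memory size including alignment padding (typically 72 on LP64 systems,
# versus 70 bytes of raw payload).
print(ctypes.sizeof(Particle))
```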
&lt;br /&gt;
&lt;br /&gt;
=== Using Parallel netCDF ===&lt;br /&gt;
The compound type used here is a NetCDF-4 feature (stored via HDF5) and is not&lt;br /&gt;
compatible with the parallel-netcdf library. NetCDF-4 itself provides parallel&lt;br /&gt;
I/O facilities; to use them, open the file with nc_open_par() (or create it&lt;br /&gt;
with nc_create_par()).&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=NetCDF_table&amp;diff=6725</id>
		<title>NetCDF table</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=NetCDF_table&amp;diff=6725"/>
		<updated>2014-01-06T14:36:47Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Storing tables in NetCDF4 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=='''Storing tables in NetCDF4'''==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Writing a table using Python ===&lt;br /&gt;
The following example shows how to store a table of 10 records, each with the 8 fields below:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-  &lt;br /&gt;
! {{Hl2}} | name &lt;br /&gt;
! {{Hl2}} | ADCcount &lt;br /&gt;
! {{Hl2}} | grid_i &lt;br /&gt;
! {{Hl2}} | grid_j&lt;br /&gt;
! {{Hl2}} | pressure&lt;br /&gt;
! {{Hl2}} | energy &lt;br /&gt;
! {{Hl2}} | idnumber&lt;br /&gt;
! {{Hl2}} | pressure2&lt;br /&gt;
|- &lt;br /&gt;
|16-character String&lt;br /&gt;
|Unsigned short integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|float  (single-precision) &lt;br /&gt;
|double (double-precision)&lt;br /&gt;
|Signed 64-bit integer&lt;br /&gt;
|2-dim table of float (2*3)&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
The script was run on the GPC with the following modules loaded:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/4.8.1 hdf5/187-v18-serial-gcc netcdf/4.1.3_hdf5_serial-gcc intel/14.0.0 python/2.7.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from netCDF4 import Dataset&lt;br /&gt;
from netCDF4 import chartostring, stringtoarr&lt;br /&gt;
import numpy&lt;br /&gt;
&lt;br /&gt;
f = Dataset('particles.nc','w',format='NETCDF4')&lt;br /&gt;
&lt;br /&gt;
size = 10&lt;br /&gt;
&lt;br /&gt;
Particle = numpy.dtype([('name', 'S1', 16),                       # 16-character String                                                &lt;br /&gt;
                        ('ADCcount',numpy.uint16),                # Unsigned short integer                                             &lt;br /&gt;
                        ('grid_i',numpy.int32),                   # 32-bit integer                                                     &lt;br /&gt;
                        ('grid_j',numpy.int32),                   # 32-bit integer                                                     &lt;br /&gt;
                        ('pressure',numpy.float32),               # float  (single-precision)                                          &lt;br /&gt;
                        ('energy',numpy.float64),                 # double (double-precision)                                          &lt;br /&gt;
                        ('idnumber',numpy.int64),                 # Signed 64-bit integer                                              &lt;br /&gt;
                        ('pressure2' , numpy.float32 , (2,3) )    # array of floats (single-precision) table 2 lines * 3 columns       &lt;br /&gt;
                        ])&lt;br /&gt;
&lt;br /&gt;
Particle_t = f.createCompoundType(Particle,'Particle')&lt;br /&gt;
&lt;br /&gt;
f.createDimension('NRecords',None)&lt;br /&gt;
v = f.createVariable('Data',Particle_t,'NRecords')&lt;br /&gt;
data = numpy.empty(size,Particle)&lt;br /&gt;
&lt;br /&gt;
for i in xrange(size):&lt;br /&gt;
    data['name'][i] = stringtoarr('Particle: %6d' % (i),16)&lt;br /&gt;
    data['ADCcount'][i] = (i * 256) % (1 &amp;lt;&amp;lt; 16)&lt;br /&gt;
    data['grid_i'][i] = i&lt;br /&gt;
    data['grid_j'][i] = 10 - i&lt;br /&gt;
    data['pressure'][i] = float(i*i)&lt;br /&gt;
    data['energy'][i] = float(data['pressure'][i] ** 4)&lt;br /&gt;
    data['idnumber'][i] = i * (2 ** 34)&lt;br /&gt;
    data['pressure2'][i] = [&lt;br /&gt;
        [0.5+float(i),1.5+float(i),2.5+float(i)],&lt;br /&gt;
        [-1.5+float(i),-2.5+float(i),-3.5+float(i)]]&lt;br /&gt;
#Fill data in File                                                                                                                     &lt;br /&gt;
v[:] = data&lt;br /&gt;
&lt;br /&gt;
f.close()&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The contents of the NetCDF file can be dumped with ncdump particles.nc:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
netcdf particles {&lt;br /&gt;
types:&lt;br /&gt;
  compound Particle {&lt;br /&gt;
    char name(16) ;&lt;br /&gt;
    ushort ADCcount ;&lt;br /&gt;
    int grid_i ;&lt;br /&gt;
    int grid_j ;&lt;br /&gt;
    float pressure ;&lt;br /&gt;
    double energy ;&lt;br /&gt;
    int64 idnumber ;&lt;br /&gt;
    float pressure2(2, 3) ;&lt;br /&gt;
  }; // Particle&lt;br /&gt;
dimensions:&lt;br /&gt;
	NRecords = UNLIMITED ; // (10 currently)&lt;br /&gt;
variables:&lt;br /&gt;
	Particle Data(NRecords) ;&lt;br /&gt;
data:&lt;br /&gt;
&lt;br /&gt;
 Data = &lt;br /&gt;
    {{&amp;quot;Particle:      0&amp;quot;}, 0, 0, 10, 0, 0, 0, {0.5, 1.5, 2.5, -1.5, -2.5, -3.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      1&amp;quot;}, 256, 1, 9, 1, 1, 17179869184, {1.5, 2.5, 3.5, -0.5, -1.5, -2.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      2&amp;quot;}, 512, 2, 8, 4, 256, 34359738368, {2.5, 3.5, 4.5, 0.5, -0.5, -1.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      3&amp;quot;}, 768, 3, 7, 9, 6561, 51539607552, {3.5, 4.5, 5.5, 1.5, 0.5, -0.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      4&amp;quot;}, 1024, 4, 6, 16, 65536, 68719476736, {4.5, 5.5, 6.5, 2.5, 1.5, 0.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      5&amp;quot;}, 1280, 5, 5, 25, 390625, 85899345920, {5.5, 6.5, 7.5, 3.5, 2.5, 1.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      6&amp;quot;}, 1536, 6, 4, 36, 1679616, 103079215104, {6.5, 7.5, 8.5, 4.5, 3.5, 2.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      7&amp;quot;}, 1792, 7, 3, 49, 5764801, 120259084288, {7.5, 8.5, 9.5, 5.5, 4.5, 3.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      8&amp;quot;}, 2048, 8, 2, 64, 16777216, 137438953472, {8.5, 9.5, 10.5, 6.5, 5.5, 4.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      9&amp;quot;}, 2304, 9, 1, 81, 43046721, 154618822656, {9.5, 10.5, 11.5, 7.5, 6.5, 5.5}} ;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Reading the table with a C++ code ===&lt;br /&gt;
The code was compiled and tested on the GPC with the following modules loaded:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/4.8.1 hdf5/187-v18-serial-gcc netcdf/4.1.3_hdf5_serial-gcc intel/14.0.0 python/2.7.2&lt;br /&gt;
gcc -I$SCINET_NETCDF_INC Test.cpp -o Test -lnetcdf_c++&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Test.cpp :&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;netcdfcpp.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;string.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
using namespace std;&lt;br /&gt;
&lt;br /&gt;
#define FILE_NAME &amp;quot;particles.nc&amp;quot;&lt;br /&gt;
#define DIM_LEN 10 //number of records in file &lt;br /&gt;
int main(void)&lt;br /&gt;
{&lt;br /&gt;
  typedef struct Particle&lt;br /&gt;
  {&lt;br /&gt;
    char   name[16];&lt;br /&gt;
    unsigned short int    ADCcount;&lt;br /&gt;
    int grid_i;&lt;br /&gt;
    int grid_j;&lt;br /&gt;
    float pressure;&lt;br /&gt;
    double energy ;&lt;br /&gt;
    long  idnumber;&lt;br /&gt;
    float pressure2[2][3];&lt;br /&gt;
  } Particle;&lt;br /&gt;
&lt;br /&gt;
  //Definition of the data_in variable&lt;br /&gt;
  Particle data_in[DIM_LEN];&lt;br /&gt;
&lt;br /&gt;
  //Initialization of the variable&lt;br /&gt;
  for (int i=0; i&amp;lt;DIM_LEN; i++){&lt;br /&gt;
    data_in[i].ADCcount=0;&lt;br /&gt;
    data_in[i].grid_i=0;&lt;br /&gt;
    data_in[i].grid_j=0;&lt;br /&gt;
    data_in[i].pressure=0.;&lt;br /&gt;
    data_in[i].energy=0.;&lt;br /&gt;
    for (int j=0; j&amp;lt;2; j++){&lt;br /&gt;
      for (int k=0; k&amp;lt;3; k++){&lt;br /&gt;
        data_in[i].pressure2[j][k]=0.;&lt;br /&gt;
      }&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  //Variables needed to open and read the NetCDF file:&lt;br /&gt;
  int ncid, varid;&lt;br /&gt;
&lt;br /&gt;
  //Open the NetCDF file (NC_NOWRITE for read only, NC_WRITE for read and write):&lt;br /&gt;
  if (nc_open(FILE_NAME, NC_NOWRITE, &amp;amp;ncid)) cout&amp;lt;&amp;lt;&amp;quot;ERROR&amp;quot;&amp;lt;&amp;lt;endl;&lt;br /&gt;
  //Look up the id of the &amp;quot;Data&amp;quot; variable:&lt;br /&gt;
  if (nc_inq_varid(ncid, &amp;quot;Data&amp;quot;, &amp;amp;varid)) cout&amp;lt;&amp;lt;&amp;quot;ERROR&amp;quot;&amp;lt;&amp;lt;endl;&lt;br /&gt;
  //Read the data and fill the variable data_in:&lt;br /&gt;
  if (nc_get_var(ncid, varid, data_in)) cout&amp;lt;&amp;lt;&amp;quot;ERROR&amp;quot;&amp;lt;&amp;lt;endl;&lt;br /&gt;
&lt;br /&gt;
  for (int i=0; i&amp;lt;DIM_LEN; i++){&lt;br /&gt;
    std::cout&amp;lt;&amp;lt;&amp;quot;ADCcount = &amp;quot;&amp;lt;&amp;lt;data_in[i].ADCcount&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,idnumber = &amp;quot;&amp;lt;&amp;lt;data_in[i].idnumber&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,grid_i = &amp;quot;&amp;lt;&amp;lt;data_in[i].grid_i&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,grid_j = &amp;quot;&amp;lt;&amp;lt;data_in[i].grid_j&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,pressure = &amp;quot;&amp;lt;&amp;lt;data_in[i].pressure&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,energy = &amp;quot;&amp;lt;&amp;lt;data_in[i].energy&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,name = &amp;quot;&amp;lt;&amp;lt;data_in[i].name&lt;br /&gt;
             &amp;lt;&amp;lt;std::endl;&lt;br /&gt;
    for(int j=0;j&amp;lt;2;j++){&lt;br /&gt;
      std::cout&amp;lt;&amp;lt;data_in[i].pressure2[j][0]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;data_in[i].pressure2[j][1]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;data_in[i].pressure2[j][2]&amp;lt;&amp;lt;std::endl;&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
cout &amp;lt;&amp;lt; &amp;quot;*** SUCCESS reading example file &amp;quot;&amp;lt;&amp;lt;FILE_NAME&amp;lt;&amp;lt;&amp;quot;!&amp;quot; &amp;lt;&amp;lt; endl;&lt;br /&gt;
  return 0;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Using Parallel netCDF ===&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=NetCDF_table&amp;diff=6718</id>
		<title>NetCDF table</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=NetCDF_table&amp;diff=6718"/>
		<updated>2013-12-23T18:43:22Z</updated>

		<summary type="html">&lt;p&gt;Brelier: /* Reading the table with a C++ code */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=='''Storing tables in NetCDF4'''==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Writing a table using Python ===&lt;br /&gt;
The following example shows how to store a table of 10 records, each with the 8 fields below:&lt;br /&gt;
{|border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-  &lt;br /&gt;
! {{Hl2}} | name &lt;br /&gt;
! {{Hl2}} | ADCcount &lt;br /&gt;
! {{Hl2}} | grid_i &lt;br /&gt;
! {{Hl2}} | grid_j&lt;br /&gt;
! {{Hl2}} | pressure&lt;br /&gt;
! {{Hl2}} | energy &lt;br /&gt;
! {{Hl2}} | idnumber&lt;br /&gt;
! {{Hl2}} | pressure2&lt;br /&gt;
|- &lt;br /&gt;
|16-character String&lt;br /&gt;
|Unsigned short integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|32-bit integer&lt;br /&gt;
|float  (single-precision) &lt;br /&gt;
|double (double-precision)&lt;br /&gt;
|Signed 64-bit integer&lt;br /&gt;
|2-dim table of float (2*3)&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
The script was run on the GPC with the following modules loaded:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/4.8.1 hdf5/187-v18-serial-gcc netcdf/4.1.3_hdf5_serial-gcc intel/14.0.0 python/2.7.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from netCDF4 import Dataset&lt;br /&gt;
from netCDF4 import chartostring, stringtoarr&lt;br /&gt;
import numpy&lt;br /&gt;
&lt;br /&gt;
f = Dataset('particles.nc','w',format='NETCDF4')&lt;br /&gt;
&lt;br /&gt;
size = 10&lt;br /&gt;
&lt;br /&gt;
Particle = numpy.dtype([('name', 'S1', 16),                       # 16-character String                                                &lt;br /&gt;
                        ('ADCcount',numpy.uint16),                # Unsigned short integer                                             &lt;br /&gt;
                        ('grid_i',numpy.int32),                   # 32-bit integer                                                     &lt;br /&gt;
                        ('grid_j',numpy.int32),                   # 32-bit integer                                                     &lt;br /&gt;
                        ('pressure',numpy.float32),               # float  (single-precision)                                          &lt;br /&gt;
                        ('energy',numpy.float64),                 # double (double-precision)                                          &lt;br /&gt;
                        ('idnumber',numpy.int64),                 # Signed 64-bit integer                                              &lt;br /&gt;
                        ('pressure2' , numpy.float32 , (2,3) )    # array of floats (single-precision) table 2 lines * 3 columns       &lt;br /&gt;
                        ])&lt;br /&gt;
&lt;br /&gt;
Particle_t = f.createCompoundType(Particle,'Particle')&lt;br /&gt;
&lt;br /&gt;
f.createDimension('NRecords',None)&lt;br /&gt;
v = f.createVariable('Data',Particle_t,'NRecords')&lt;br /&gt;
data = numpy.empty(size,Particle)&lt;br /&gt;
&lt;br /&gt;
for i in xrange(size):&lt;br /&gt;
    data['name'][i] = stringtoarr('Particle: %6d' % (i),16)&lt;br /&gt;
    data['ADCcount'][i] = (i * 256) % (1 &amp;lt;&amp;lt; 16)&lt;br /&gt;
    data['grid_i'][i] = i&lt;br /&gt;
    data['grid_j'][i] = 10 - i&lt;br /&gt;
    data['pressure'][i] = float(i*i)&lt;br /&gt;
    data['energy'][i] = float(data['pressure'][i] ** 4)&lt;br /&gt;
    data['idnumber'][i] = i * (2 ** 34)&lt;br /&gt;
    data['pressure2'][i] = [&lt;br /&gt;
        [0.5+float(i),1.5+float(i),2.5+float(i)],&lt;br /&gt;
        [-1.5+float(i),-2.5+float(i),-3.5+float(i)]]&lt;br /&gt;
#Fill data in File                                                                                                                     &lt;br /&gt;
v[:] = data&lt;br /&gt;
&lt;br /&gt;
f.close()&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The contents of the NetCDF file can be dumped with ncdump particles.nc:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
netcdf particles {&lt;br /&gt;
types:&lt;br /&gt;
  compound Particle {&lt;br /&gt;
    char name(16) ;&lt;br /&gt;
    ushort ADCcount ;&lt;br /&gt;
    int grid_i ;&lt;br /&gt;
    int grid_j ;&lt;br /&gt;
    float pressure ;&lt;br /&gt;
    double energy ;&lt;br /&gt;
    int64 idnumber ;&lt;br /&gt;
    float pressure2(2, 3) ;&lt;br /&gt;
  }; // Particle&lt;br /&gt;
dimensions:&lt;br /&gt;
	NRecords = UNLIMITED ; // (10 currently)&lt;br /&gt;
variables:&lt;br /&gt;
	Particle Data(NRecords) ;&lt;br /&gt;
data:&lt;br /&gt;
&lt;br /&gt;
 Data = &lt;br /&gt;
    {{&amp;quot;Particle:      0&amp;quot;}, 0, 0, 10, 0, 0, 0, {0.5, 1.5, 2.5, -1.5, -2.5, -3.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      1&amp;quot;}, 256, 1, 9, 1, 1, 17179869184, {1.5, 2.5, 3.5, -0.5, -1.5, -2.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      2&amp;quot;}, 512, 2, 8, 4, 256, 34359738368, {2.5, 3.5, 4.5, 0.5, -0.5, -1.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      3&amp;quot;}, 768, 3, 7, 9, 6561, 51539607552, {3.5, 4.5, 5.5, 1.5, 0.5, -0.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      4&amp;quot;}, 1024, 4, 6, 16, 65536, 68719476736, {4.5, 5.5, 6.5, 2.5, 1.5, 0.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      5&amp;quot;}, 1280, 5, 5, 25, 390625, 85899345920, {5.5, 6.5, 7.5, 3.5, 2.5, 1.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      6&amp;quot;}, 1536, 6, 4, 36, 1679616, 103079215104, {6.5, 7.5, 8.5, 4.5, 3.5, 2.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      7&amp;quot;}, 1792, 7, 3, 49, 5764801, 120259084288, {7.5, 8.5, 9.5, 5.5, 4.5, 3.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      8&amp;quot;}, 2048, 8, 2, 64, 16777216, 137438953472, {8.5, 9.5, 10.5, 6.5, 5.5, 4.5}}, &lt;br /&gt;
    {{&amp;quot;Particle:      9&amp;quot;}, 2304, 9, 1, 81, 43046721, 154618822656, {9.5, 10.5, 11.5, 7.5, 6.5, 5.5}} ;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Reading the table with a C++ code ===&lt;br /&gt;
The code was compiled and tested on the GPC with the following modules loaded:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc/4.8.1 hdf5/187-v18-serial-gcc netcdf/4.1.3_hdf5_serial-gcc intel/14.0.0 python/2.7.2&lt;br /&gt;
gcc -I$SCINET_NETCDF_INC Test.cpp -o Test -lnetcdf_c++&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Test.cpp :&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;netcdfcpp.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;string.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
using namespace std;&lt;br /&gt;
&lt;br /&gt;
#define FILE_NAME &amp;quot;particles.nc&amp;quot;&lt;br /&gt;
#define DIM_LEN 10 //number of records in file &lt;br /&gt;
int main(void)&lt;br /&gt;
{&lt;br /&gt;
  typedef struct Particle&lt;br /&gt;
  {&lt;br /&gt;
    char   name[16];&lt;br /&gt;
    unsigned short int    ADCcount;&lt;br /&gt;
    int grid_i;&lt;br /&gt;
    int grid_j;&lt;br /&gt;
    float pressure;&lt;br /&gt;
    double energy ;&lt;br /&gt;
    long  idnumber;&lt;br /&gt;
    float pressure2[2][3];&lt;br /&gt;
  } Particle;&lt;br /&gt;
&lt;br /&gt;
  //Definition of the data_in variable&lt;br /&gt;
  Particle data_in[DIM_LEN];&lt;br /&gt;
&lt;br /&gt;
  //Initialization of the variable&lt;br /&gt;
  for (int i=0; i&amp;lt;DIM_LEN; i++){&lt;br /&gt;
    data_in[i].ADCcount=0;&lt;br /&gt;
    data_in[i].grid_i=0;&lt;br /&gt;
    data_in[i].grid_j=0;&lt;br /&gt;
    data_in[i].pressure=0.;&lt;br /&gt;
    data_in[i].energy=0.;&lt;br /&gt;
    for (int j=0; j&amp;lt;2; j++){&lt;br /&gt;
      for (int k=0; k&amp;lt;3; k++){&lt;br /&gt;
        data_in[i].pressure2[j][k]=0.;&lt;br /&gt;
      }&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  //Variables needed to open and read the NetCDF file:&lt;br /&gt;
  int ncid, varid;&lt;br /&gt;
&lt;br /&gt;
  //Open the NetCDF file (NC_NOWRITE for read only, NC_WRITE for read and write):&lt;br /&gt;
  if (nc_open(FILE_NAME, NC_NOWRITE, &amp;amp;ncid)) cout&amp;lt;&amp;lt;&amp;quot;ERROR&amp;quot;&amp;lt;&amp;lt;endl;&lt;br /&gt;
  //Look up the id of the &amp;quot;Data&amp;quot; variable:&lt;br /&gt;
  if (nc_inq_varid(ncid, &amp;quot;Data&amp;quot;, &amp;amp;varid)) cout&amp;lt;&amp;lt;&amp;quot;ERROR&amp;quot;&amp;lt;&amp;lt;endl;&lt;br /&gt;
  //Read the data and fill the variable data_in:&lt;br /&gt;
  if (nc_get_var(ncid, varid, data_in)) cout&amp;lt;&amp;lt;&amp;quot;ERROR&amp;quot;&amp;lt;&amp;lt;endl;&lt;br /&gt;
&lt;br /&gt;
  for (int i=0; i&amp;lt;DIM_LEN; i++){&lt;br /&gt;
    std::cout&amp;lt;&amp;lt;&amp;quot;ADCcount = &amp;quot;&amp;lt;&amp;lt;data_in[i].ADCcount&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,idnumber = &amp;quot;&amp;lt;&amp;lt;data_in[i].idnumber&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,grid_i = &amp;quot;&amp;lt;&amp;lt;data_in[i].grid_i&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,grid_j = &amp;quot;&amp;lt;&amp;lt;data_in[i].grid_j&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,pressure = &amp;quot;&amp;lt;&amp;lt;data_in[i].pressure&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,energy = &amp;quot;&amp;lt;&amp;lt;data_in[i].energy&lt;br /&gt;
             &amp;lt;&amp;lt;&amp;quot; ,name = &amp;quot;&amp;lt;&amp;lt;data_in[i].name&lt;br /&gt;
             &amp;lt;&amp;lt;std::endl;&lt;br /&gt;
    for(int j=0;j&amp;lt;2;j++){&lt;br /&gt;
      std::cout&amp;lt;&amp;lt;data_in[i].pressure2[j][0]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;data_in[i].pressure2[j][1]&amp;lt;&amp;lt;&amp;quot; &amp;quot;&amp;lt;&amp;lt;data_in[i].pressure2[j][2]&amp;lt;&amp;lt;std::endl;&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
cout &amp;lt;&amp;lt; &amp;quot;*** SUCCESS reading example file &amp;quot;&amp;lt;&amp;lt;FILE_NAME&amp;lt;&amp;lt;&amp;quot;!&amp;quot; &amp;lt;&amp;lt; endl;&lt;br /&gt;
  return 0;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Brelier</name></author>
	</entry>
</feed>