<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>https://oldwiki.scinet.utoronto.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ljdursi</id>
	<title>oldwiki.scinet.utoronto.ca - User contributions [en-gb]</title>
	<link rel="self" type="application/atom+xml" href="https://oldwiki.scinet.utoronto.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ljdursi"/>
	<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php/Special:Contributions/Ljdursi"/>
	<updated>2026-05-10T22:52:45Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.12</generator>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=PWC_Python&amp;diff=7434</id>
		<title>PWC Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=PWC_Python&amp;diff=7434"/>
		<updated>2014-12-02T20:38:01Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
This page contains the slides for the PWC Python class.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Slides ==&lt;br /&gt;
&lt;br /&gt;
* [http://wiki.scinethpc.ca/wiki/images/3/3c/PWCintro.pdf Morning of the first day].&lt;br /&gt;
* [http://wiki.scinethpc.ca/wiki/images/2/2d/PWCFirstAfternoon.pdf Afternoon of the first day].&lt;br /&gt;
&lt;br /&gt;
* [[Media:pwcfunctions.pdf | Second day ]]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/CourseVideo/pwcfunctions.zip Code ]&lt;br /&gt;
** [http://support.scinet.utoronto.ca/~ljdursi/pwc/mapreduce.py mapreduce.py]&lt;br /&gt;
** [http://support.scinet.utoronto.ca/~ljdursi/pwc/mapreduce-ans.py mapreduce partial answer]&lt;br /&gt;
* [[Media:pwcobjects.pdf | Second day, objects ]]&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=PWC_Python&amp;diff=7431</id>
		<title>PWC Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=PWC_Python&amp;diff=7431"/>
		<updated>2014-12-02T18:29:27Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Slides */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
This page contains the slides for the PWC Python class.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Slides ==&lt;br /&gt;
&lt;br /&gt;
* [http://wiki.scinethpc.ca/wiki/images/3/3c/PWCintro.pdf Morning of the first day].&lt;br /&gt;
* [http://wiki.scinethpc.ca/wiki/images/2/2d/PWCFirstAfternoon.pdf Afternoon of the first day].&lt;br /&gt;
&lt;br /&gt;
* [[Media:pwcfunctions.pdf | Second day ]]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/CourseVideo/pwcfunctions.zip Code ]&lt;br /&gt;
** [http://support.scinet.utoronto.ca/~ljdursi/pwc/mapreduce.py mapreduce.py]&lt;br /&gt;
* [[Media:pwcobjects.pdf | Second day, objects ]]&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7256</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7256"/>
		<updated>2014-09-16T22:00:24Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Python on the GPC */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.   It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.     &lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet gave an 8-lecture mini-course on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2, 2.7.3, 2.7.5, and 3.3.4 installed, compiled against fast intel math libraries.  To load the python modules, type the following commands:&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! Version&lt;br /&gt;
! Command&lt;br /&gt;
|-&lt;br /&gt;
|2.7.2&lt;br /&gt;
|&amp;lt;tt&amp;gt;module load gcc intel python&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|2.7.3&lt;br /&gt;
|&amp;lt;tt&amp;gt;module load gcc intel/13.1.1 python/2.7.3&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|2.7.5&lt;br /&gt;
|&amp;lt;tt&amp;gt;module load gcc intel/13.1.1 python/2.7.5&amp;lt;/tt&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|3.3.4&lt;br /&gt;
|&amp;lt;tt&amp;gt;module load gcc intel/14.0.1 python/3.3.4&amp;lt;/tt&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
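&lt;br /&gt;
For example, to load the 3.3.4 stack and do a quick sanity check that the right interpreter is found (a minimal sketch, assuming the module names in the table above):&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel/14.0.1 python/3.3.4&lt;br /&gt;
which python       # should point into the python module's install tree, not /usr/bin&lt;br /&gt;
python --version   # should report Python 3.3.4&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;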
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users (e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib]) are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version for Python 2.7.x is linked against the very fast MKL numerical libraries. &lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A pythonic interface to MPI.   Available with openmpi; you must load an openmpi module for this to work.  (There is an issue with openmpi 1.4.x + InfiniBand; it does, however, appear to work fine with IntelMPI.)  A quick smoke test is sketched after this table.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.  No longer supported.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 &lt;br /&gt;
| 0.13.1&lt;br /&gt;
| 1.0.0&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 &lt;br /&gt;
| 2.4.0&lt;br /&gt;
| 3.0.0&lt;br /&gt;
| 3.1.1&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. &lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8&lt;br /&gt;
| 1.0.4&lt;br /&gt;
| 1.1.1&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. &lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.  No longer supported.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.1.3&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| 2.3.0&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| Python interface to the svn version control system. &lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.6.2&lt;br /&gt;
| 2.7.1&lt;br /&gt;
| -&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15.1&lt;br /&gt;
| 0.18&lt;br /&gt;
| 0.19.1&lt;br /&gt;
| 0.20.1&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 1.1&lt;br /&gt;
| 5.1&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
| [http://pandas.pydata.org/ pandas]&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.14.1&lt;br /&gt;
| high-performance, easy-to-use data structures and data analysis tools.&lt;br /&gt;
|- &lt;br /&gt;
|}&lt;br /&gt;
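&lt;br /&gt;
As the quick smoke test mentioned in the mpi4py entry above (a minimal sketch, assuming the 2.7.2 stack plus the openmpi module described in the table), each rank prints its rank and the communicator size:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python openmpi&lt;br /&gt;
mpirun -np 2 python -c 'from mpi4py import MPI; c = MPI.COMM_WORLD; print(c.Get_rank(), c.Get_size())'&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;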
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional  packages for Python people could potentially want (see e.g. http://pypi.python.org/pypi), that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, ensure they're up to date, and ensure that users' package choices don't conflict. &lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing: &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt; then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your &amp;lt;tt&amp;gt;site-packages&amp;lt;/tt&amp;gt; directory.  For each .egg directory, add it to your &amp;lt;tt&amp;gt;PYTHONPATH&amp;lt;/tt&amp;gt; in your .bashrc as well, in the same place where you updated &amp;lt;tt&amp;gt;PYTHONPATH&amp;lt;/tt&amp;gt; before; e.g.,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-py2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!   Re-source your .bashrc and test your new python modules.&lt;br /&gt;
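** One simple way to test (a sketch; &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; below is just a placeholder for whatever you installed) is to import the package and print where it was loaded from:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
# should print a path under ${HOME}/lib/python2.X/site-packages&lt;br /&gt;
python -c 'import packagename; print(packagename.__file__)'&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;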
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7255</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7255"/>
		<updated>2014-09-16T21:42:48Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.   It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.     &lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet gave an 8-lecture mini-course on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against fast intel math libraries.   To use this version,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users (e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib]) are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version for Python 2.7.x is linked against the very fast MKL numerical libraries. &lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A pythonic interface to MPI.   Available with openmpi; you must load an openmpi module for this to work.  (There is an issue with openmpi 1.4.x + InfiniBand; it does, however, appear to work fine with IntelMPI.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.  No longer supported.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 &lt;br /&gt;
| 0.13.1&lt;br /&gt;
| 1.0.0&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 &lt;br /&gt;
| 2.4.0&lt;br /&gt;
| 3.0.0&lt;br /&gt;
| 3.1.1&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. &lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8&lt;br /&gt;
| 1.0.4&lt;br /&gt;
| 1.1.1&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. &lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.  No longer supported.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.1.3&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| 2.3.0&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| Python interface to the svn version control system. &lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.6.2&lt;br /&gt;
| 2.7.1&lt;br /&gt;
| -&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15.1&lt;br /&gt;
| 0.18&lt;br /&gt;
| 0.19.1&lt;br /&gt;
| 0.20.1&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 1.1&lt;br /&gt;
| 5.1&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
| [http://pandas.pydata.org/ pandas]&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.14.1&lt;br /&gt;
| high-performance, easy-to-use data structures and data analysis tools.&lt;br /&gt;
|- &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional  packages for Python people could potentially want (see e.g. http://pypi.python.org/pypi), that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, ensure they're up to date, and ensure that users' package choices don't conflict. &lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing: &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt; then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your &amp;lt;tt&amp;gt;site-packages&amp;lt;/tt&amp;gt; directory.  For each .egg directory, add it to your &amp;lt;tt&amp;gt;PYTHONPATH&amp;lt;/tt&amp;gt; in your .bashrc as well, in the same place where you updated &amp;lt;tt&amp;gt;PYTHONPATH&amp;lt;/tt&amp;gt; before; e.g.,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-py2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!   Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7254</id>
		<title>Python</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Python&amp;diff=7254"/>
		<updated>2014-09-16T21:42:14Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Modules installed system-wide */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.python.org/ Python] is a programming language that continues to grow in popularity for scientific computing.   It is very fast to write code in, but the resulting software is much slower than C or Fortran; one should be wary of doing too much compute-intensive work in Python.     &lt;br /&gt;
&lt;br /&gt;
There is a dizzying amount of documentation available for programming in Python on the [http://python.org/ Python.org webpage]; SciNet gave an 8-lecture mini-course on [[Research Computing with Python]] in the Fall of 2013.&lt;br /&gt;
An excellent set of material for teaching scientists to program in Python is also available at the [http://software-carpentry.org/4_0/python/ Software Carpentry homepage].&lt;br /&gt;
&lt;br /&gt;
__FORCETOC__ &lt;br /&gt;
&lt;br /&gt;
== Python on the GPC ==&lt;br /&gt;
&lt;br /&gt;
We currently have python 2.7.2 installed, compiled against fast intel math libraries.   To use this version,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
module load gcc intel python&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Modules installed system-wide ==&lt;br /&gt;
&lt;br /&gt;
Many optional packages are available for Python which greatly extend the language, adding important new functionality.  Those packages which are likely to be important to all of our users (e.g., [http://numpy.scipy.org/ NumPy], [http://www.scipy.org/ SciPy], and [http://matplotlib.sourceforge.net/ Matplotlib]) are installed system-wide.&lt;br /&gt;
&lt;br /&gt;
Below is a list of the packages currently installed system-wide.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;10&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
!{{Hl2}}| Module  &lt;br /&gt;
!{{Hl2}}| python/2.7.2 &lt;br /&gt;
!{{Hl2}}| python/2.7.3 &lt;br /&gt;
!{{Hl2}}| python/2.7.5 &lt;br /&gt;
!{{Hl2}}| python/3.3.4&lt;br /&gt;
!{{Hl2}}| Comments&lt;br /&gt;
|-  &lt;br /&gt;
|[http://www.scipy.org/ SciPy]&lt;br /&gt;
|  0.10.0&lt;br /&gt;
|  0.11.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
|  0.14.0&lt;br /&gt;
| Open-source software for mathematics, science, and engineering.  The version for Python 2.7.x is linked against the very fast MKL numerical libraries. &lt;br /&gt;
|-&lt;br /&gt;
|[http://numpy.scipy.org/ NumPy]&lt;br /&gt;
| 1.6.1&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.7.0&lt;br /&gt;
| 1.8.1&lt;br /&gt;
| NumPy is the fundamental package needed for scientific computing with Python. Contains fast arrays, tools for integrating C/C++ and Fortran code, linear algebra solvers, etc.  SciPy is built on top of NumPy.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mpi4py.scipy.org/ mpi4py]&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| 1.2.2&lt;br /&gt;
| A pythonic interface to MPI.   Available with openmpi; you must load an openmpi module for this to work.  (There is an issue with openmpi 1.4.x + InfiniBand; it does, however, appear to work fine with IntelMPI.)&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.scipy.org/SciPyPackages/NumExpr Numexpr]&lt;br /&gt;
| 2.0&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.2.1&lt;br /&gt;
| 2.4_rc2&lt;br /&gt;
| Fast, memory-efficient elementwise operations on Numpy arrays.&lt;br /&gt;
|-&lt;br /&gt;
| [http://dirac.cnrs-orleans.fr/plone/software/scientificpython/ ScientificPython]&lt;br /&gt;
| 2.8 &lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| A collection of scientific python utilities.   Does not include MPI support.  No longer supported.&lt;br /&gt;
|-&lt;br /&gt;
| [http://yt.enzotools.org/ yt]&lt;br /&gt;
| 2.2&lt;br /&gt;
| 2.5.3&lt;br /&gt;
| 2.5.5&lt;br /&gt;
| -&lt;br /&gt;
| A collection of python tools for analyzing astrophysical simulation output.&lt;br /&gt;
|-&lt;br /&gt;
| [http://ipython.scipy.org/moin/ iPython]&lt;br /&gt;
| 0.11 &lt;br /&gt;
| 0.13.1&lt;br /&gt;
| 1.0.0&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| An enhanced interactive python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://matplotlib.sourceforge.net/ Matplotlib], pylab&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| 1.2.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.1&lt;br /&gt;
| Matlab-like plotting for python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pytables.org/moin PyTables]&lt;br /&gt;
| 2.3.1 &lt;br /&gt;
| 2.4.0&lt;br /&gt;
| 3.0.0&lt;br /&gt;
| 3.1.1&lt;br /&gt;
| Fast and efficient access to HDF5 files (and HDF5-format NetCDF4 files.)   Requires the &amp;lt;tt&amp;gt;hdf5/184-p1-v18-serial-gcc&amp;lt;/tt&amp;gt; module to be loaded. &lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/netcdf4-python/ NetCDF4-python]&lt;br /&gt;
| 0.9.8&lt;br /&gt;
| 1.0.4&lt;br /&gt;
| 1.1.1&lt;br /&gt;
| 1.1.0&lt;br /&gt;
| Python interface to NetCDF4 files.   Requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module to be loaded. &lt;br /&gt;
|-&lt;br /&gt;
| [http://www.pyngl.ucar.edu/Nio.shtml pyNIO]&lt;br /&gt;
| 1.4.1&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| Yet another Python interface to NetCDF4 files; again, requires the &amp;lt;tt&amp;gt;netcdf/4.0.1_hdf5_v18-serial.shared-nofortran&amp;lt;/tt&amp;gt; module.  No longer supported.&lt;br /&gt;
|-&lt;br /&gt;
| [http://alfven.org/wp/hdf5-for-python/ h5py]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.1.3&lt;br /&gt;
| 2.2.0&lt;br /&gt;
| 2.3.0&lt;br /&gt;
| Yet another Python interface to HDF5 files; again, requires an HDF5 module to be loaded.&lt;br /&gt;
|-&lt;br /&gt;
| [http://pysvn.tigris.org/ PySVN]&lt;br /&gt;
| 1.7.1&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
| Python interface to the svn version control system.  Requires the &amp;lt;tt&amp;gt;svn&amp;lt;/tt&amp;gt; module to be loaded on CentOS5.&lt;br /&gt;
|-&lt;br /&gt;
| [http://mercurial.selenic.com/ Mercurial]&lt;br /&gt;
| 2.0.1&lt;br /&gt;
| 2.6.2&lt;br /&gt;
| 2.7.1&lt;br /&gt;
| -&lt;br /&gt;
| A distributed version-control system written in Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://cython.org/ Cython]&lt;br /&gt;
| 0.15.1&lt;br /&gt;
| 0.18&lt;br /&gt;
| 0.19.1&lt;br /&gt;
| 0.20.1&lt;br /&gt;
| Cython is a compiler which compiles Python-like code files to C code and allows them to be easily called from Python.&lt;br /&gt;
|-&lt;br /&gt;
| [http://code.google.com/p/python-nose/ nose]&lt;br /&gt;
| 1.1.2&lt;br /&gt;
| 1.2.1&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| 1.3.0&lt;br /&gt;
| A unit-testing framework for python.&lt;br /&gt;
|- &lt;br /&gt;
| [http://pypi.python.org/pypi/setuptools setuptools]&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 0.6c11&lt;br /&gt;
| 1.1&lt;br /&gt;
| 5.1&lt;br /&gt;
| Enables easy installation of new python modules&lt;br /&gt;
|-&lt;br /&gt;
| [http://pandas.pydata.org/ pandas]&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.13.0&lt;br /&gt;
| 0.14.1&lt;br /&gt;
| high-performance, easy-to-use data structures and data analysis tools.&lt;br /&gt;
|- &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Installing your own Python Modules ==&lt;br /&gt;
&lt;br /&gt;
Python provides an easy way for users to install the libraries they need in their home directories rather than having them installed system-wide. There are so many optional  packages for Python people could potentially want (see e.g. http://pypi.python.org/pypi), that we recommend users install these additional packages locally in their home directories.  This is almost certainly the easiest way to deal with the wide range of packages, ensure they're up to date, and ensure that users' package choices don't conflict. &lt;br /&gt;
&lt;br /&gt;
To install your own Python modules, follow the instructions below.   Where the instructions say &amp;lt;tt&amp;gt;python2.X&amp;lt;/tt&amp;gt;, type &amp;lt;tt&amp;gt;python2.6&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;python2.7&amp;lt;/tt&amp;gt; depending on the version of python you are using.&lt;br /&gt;
&lt;br /&gt;
* First, create a directory in your home directory, &amp;lt;tt&amp;gt;${HOME}/lib/python2.X/site-packages&amp;lt;/tt&amp;gt;, where the packages will go.&lt;br /&gt;
* Next, in your &amp;lt;tt&amp;gt;.bashrc&amp;lt;/tt&amp;gt;, *after* you &amp;lt;tt&amp;gt;module load python&amp;lt;/tt&amp;gt; and in the &amp;quot;GPC&amp;quot; section, add the following line:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Re-load the modified .bashrc by typing &amp;lt;tt&amp;gt;source ~/.bashrc&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* Now, if it's a standard python package and the instructions say that you can use easy_install to install it,&lt;br /&gt;
** install with the following command, where &amp;lt;tt&amp;gt;packagename&amp;lt;/tt&amp;gt; is the name of the package you are installing: &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
** Continue doing this until all of the packages you need to install are successfully installed.&lt;br /&gt;
** If, upon importing the new python package, you get error messages like &amp;lt;tt&amp;gt;undefined symbol: __stack_chk_guard&amp;lt;/tt&amp;gt;, you may need to use the following command instead:&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
LDFLAGS=-fstack-protector easy_install --prefix=${HOME} -O1 [packagename]&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* If easy_install isn't an option for your package, and the installation instructions instead talk about downloading a file and using &amp;lt;tt&amp;gt;python setup.py install&amp;lt;/tt&amp;gt; then instead:&lt;br /&gt;
** Download the relevant files&lt;br /&gt;
** You will probably have to uncompress and untar them: &amp;lt;tt&amp;gt;tar -xzvf packagename.tgz&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;tar -xjvf packagename.bz2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
** cd into the newly created directory, and run &lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
python setup.py install --prefix=${HOME}&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
* Now, the install process may have added some .egg files or directories to your &amp;lt;tt&amp;gt;site-packages&amp;lt;/tt&amp;gt; directory.  For each .egg directory, add it to your &amp;lt;tt&amp;gt;PYTHONPATH&amp;lt;/tt&amp;gt; in your .bashrc as well, in the same place where you updated &amp;lt;tt&amp;gt;PYTHONPATH&amp;lt;/tt&amp;gt; before; e.g.,&lt;br /&gt;
&amp;lt;source lang=bash&amp;gt;&lt;br /&gt;
export PYTHONPATH=${PYTHONPATH}:${HOME}/lib/python2.X/site-packages:${HOME}/lib/python2.X/site-packages/packagename1-x.y.z-py2.X.egg:${HOME}/lib/python2.X/site-packages/packagename2-a.b.c-py2.X.egg&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* You should now be done!   Re-source your .bashrc and test your new python modules.&lt;br /&gt;
&lt;br /&gt;
* In order to keep your .bashrc relatively uncluttered, and to avoid potential conflicts among software modules, we recommend that users create their own  modules (for the &amp;quot;module&amp;quot; system, not specifically python modules).  &lt;br /&gt;
&lt;br /&gt;
[[Brian|Here]] is an example module for the [[Brian]] package, including instructions for the installation of the python [[Brian]] package itself.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7232</id>
		<title>Using Paraview</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7232"/>
		<updated>2014-09-10T20:04:37Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Connect Client and Server */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.paraview.org/ ParaView] is a powerful, parallel, client-server based visualization system that allows you to render data on SciNet's GPC nodes and manipulate the results interactively on your own desktop.   Using the ParaView server on SciNet is much like using it locally, but there is an additional step: setting up a connection directly between your desktop and the compute nodes.&lt;br /&gt;
&lt;br /&gt;
[[Image:Paraview.png|thumb|right|320px|The ParaView Client GUI]]&lt;br /&gt;
&lt;br /&gt;
===Installing ParaView===&lt;br /&gt;
&lt;br /&gt;
To use ParaView, you will need the client software installed on your system; download ParaView from [http://www.paraview.org/paraview/resources/software.html the Paraview website].  Binaries exist for Linux, Mac, and Windows systems.   The client version must exactly match the version installed on the server, currently 3.12 or 3.14.1.   The client has all the functionality of the server, and can analyze data locally.&lt;br /&gt;
&lt;br /&gt;
===SSH Forwarding For ParaView===&lt;br /&gt;
&lt;br /&gt;
To interactively use the ParaView server on GPC, you will have to work some ssh magic to allow the client on your desktop to connect to the server through the scinet login nodes.  The steps required are&lt;br /&gt;
&lt;br /&gt;
* Have an SSH key that you can use to log into SciNet&lt;br /&gt;
* Submit an interactive job, with a shell on the head node that you'll be running the server on&lt;br /&gt;
* Start ssh forwarding&lt;br /&gt;
* Start paraview server&lt;br /&gt;
* Connecting client and server&lt;br /&gt;
&lt;br /&gt;
====SSH Keys====&lt;br /&gt;
&lt;br /&gt;
To be able to log into the compute nodes where ParaView will be running, you'll have to have an [[Ssh_keys | SSH key]] set up, as password authentication won't work.    Our [[Ssh_keys | SSH Keys and SciNet]] page describes how to do this.&lt;br /&gt;
&lt;br /&gt;
====Log into node====&lt;br /&gt;
&lt;br /&gt;
The first thing to do is to go to the node from which you'll start the ParaView server.   This is typically done by starting an interactive job on the GPC, perhaps on the [[Moab#debug | debug ]] queue or sandybridge [[GPC_Quickstart#Memory_Configuration | large memory]] nodes.   Paraview can in principle make use of as many nodes as you throw at it.  So one might  begin jobs as below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=1:m128g:ppn=16,walltime=1:00:00 -q sandy -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=2:ppn=8,walltime=1:00:00 -q debug -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Once this job has started, you'll be placed in a shell on the head node of the job; typing &amp;lt;tt&amp;gt;hostname&amp;lt;/tt&amp;gt; will tell you the name of the host, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f148n089-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f107n045-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
you will need this hostname in the following steps.&lt;br /&gt;
&lt;br /&gt;
====Start SSH port forwarding====&lt;br /&gt;
&lt;br /&gt;
Once you have the node's hostname, the port forwarding can be started with the following command (on your local machine, in a terminal window), using the hostname from above; here we'll take the example of gpc-f148n089-ib0:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export gpcnode=&amp;quot;gpc-f148n089-ib0&amp;quot;&lt;br /&gt;
$ ssh -N -L 20080:${gpcnode}:22 -L 20090:${gpcnode}:11111 login.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This command will not return anything until the forwarding is terminated, and will just look like it's sitting there.  It doesn't start a remote shell or command (-N), but it will connect to login.scinet.utoronto.ca, and from there it will redirect your local (-L) port 20080 to &amp;lt;tt&amp;gt;${gpcnode}&amp;lt;/tt&amp;gt; port 22, and similarly local port 20090 to &amp;lt;tt&amp;gt;${gpcnode}&amp;lt;/tt&amp;gt; port 11111.  We'll use the first for ssh'ing to the remote node (mainly for testing), and the second to connect the local paraview client to the remote paraview server.&lt;br /&gt;
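&lt;br /&gt;
If you would rather not leave that terminal tied up, a variant (the same forwarding, just letting ssh put itself in the background once the tunnel is up) is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -f -N -L 20080:${gpcnode}:22 -L 20090:${gpcnode}:11111 login.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;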
&lt;br /&gt;
To make sure the port forwarding is working correctly, in another window try sshing directly to the compute node from your desktop:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -p 20080 [your-scinet-username]@localhost&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and this should land you directly on the compute node.   If it does not, then something is wrong with the ssh forwarding.&lt;br /&gt;
&lt;br /&gt;
====Start Server====&lt;br /&gt;
&lt;br /&gt;
Now that the tunnel is set up, on the compute node you can start the paraview server.    To do this, you will have to have the following modules loaded:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load Xlibraries intel gcc python openmpi paraview&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(You can replace openmpi with intelmpi, and of course any module that is already loaded does not have to be loaded again.)&lt;br /&gt;
&lt;br /&gt;
Then start the paraview server with mpirun, as with any MPI job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -np [NP] pvserver --use-offscreen-rendering&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where NP is the number of processors; 16 processors per node on the largemem nodes, or 8 per node otherwise.    &lt;br /&gt;
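&lt;br /&gt;
For instance, for the two-node debug job requested above (2 nodes with 8 processors each), this works out to:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -np 16 pvserver --use-offscreen-rendering&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;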
&lt;br /&gt;
Once running, the ParaView server should output&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Listen on port: 11111&lt;br /&gt;
Waiting for client...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Connect Client and Server====&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure.png|thumb|right|320px|Configuring the client]]&lt;br /&gt;
&lt;br /&gt;
Once the server is running, you can connect the client.   Start the ParaView client on your desktop, and choose File-&amp;gt;Connect.   Click `Add Server', give the server a name (say, GPC), and give the port number 20090.   The other values should be correct by default; host is &amp;lt;tt&amp;gt;localhost&amp;lt;/tt&amp;gt;, and the server type is Client/Server.  Click `Configure'.&lt;br /&gt;
&lt;br /&gt;
On the next window, you'll be asked for a command to start up the server; select `Manual', and ok.&lt;br /&gt;
&lt;br /&gt;
Once the server is selected, click `Connect'.  On the compute node, the server should respond `Client connected'.   In the client window, when you (for instance) select File-&amp;gt;Open, you will see the files on the GPC rather than those on your local host.&lt;br /&gt;
&lt;br /&gt;
From here, the [http://paraview.org/Wiki/ParaView ParaView Wiki] can give you instructions as to how to plot your data.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7231</id>
		<title>Using Paraview</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7231"/>
		<updated>2014-09-10T20:03:15Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* SSH Forwarding For ParaView */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.paraview.org/ ParaView] is a powerful, parallel, client-server based visualization system that allows you to render data on SciNet's GPC nodes and manipulate the results interactively on your own desktop.   Using the ParaView server on SciNet is much like using it locally, but there is an additional step: setting up a connection directly between your desktop and the compute nodes.&lt;br /&gt;
&lt;br /&gt;
[[Image:Paraview.png|thumb|right|320px|The ParaView Client GUI]]&lt;br /&gt;
&lt;br /&gt;
===Installing ParaView===&lt;br /&gt;
&lt;br /&gt;
To use ParaView, you will need the client software installed on your system; download ParaView from [http://www.paraview.org/paraview/resources/software.html the Paraview website].  Binaries exist for Linux, Mac, and Windows systems.   The client version must exactly match the version installed on the server, currently 3.12 or 3.14.1.   The client has all the functionality of the server, and can analyze data locally.&lt;br /&gt;
&lt;br /&gt;
===SSH Forwarding For ParaView===&lt;br /&gt;
&lt;br /&gt;
To interactively use the ParaView server on GPC, you will have to work some ssh magic to allow the client on your desktop to connect to the server through the scinet login nodes.  The steps required are&lt;br /&gt;
&lt;br /&gt;
* Have an SSH key that you can use to log into SciNet&lt;br /&gt;
* Submit an interactive job, with a shell on the head node that you'll be running the server on&lt;br /&gt;
* Start ssh forwarding&lt;br /&gt;
* Start paraview server&lt;br /&gt;
* Connecting client and server&lt;br /&gt;
&lt;br /&gt;
====SSH Keys====&lt;br /&gt;
&lt;br /&gt;
To be able to log into the compute nodes where ParaView will be running, you'll have to have an [[Ssh_keys | SSH key]] set up, as password authentication won't work.    Our [[Ssh_keys | SSH Keys and SciNet]] page describes how to do this.&lt;br /&gt;
&lt;br /&gt;
====Log into node====&lt;br /&gt;
&lt;br /&gt;
The first thing to do is to go to the node from which you'll start the ParaView server.   This is typically done by starting an interactive job on the GPC, perhaps on the [[Moab#debug | debug ]] queue or sandybridge [[GPC_Quickstart#Memory_Configuration | large memory]] nodes.   Paraview can in principle make use of as many nodes as you throw at it.  So one might  begin jobs as below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=1:m128g:ppn=16,walltime=1:00:00 -q sandy -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=2:ppn=8,walltime=1:00:00 -q debug -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Once this job has started, you'll be placed in a shell on the head node of the job; typing &amp;lt;tt&amp;gt;hostname&amp;lt;/tt&amp;gt; will tell you the name of the host, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f148n089-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f107n045-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
you will need this hostname in the following steps.&lt;br /&gt;
&lt;br /&gt;
====Start SSH port forwarding====&lt;br /&gt;
&lt;br /&gt;
Once you have the node's hostname, the port forwarding can be started with the following command (on your local machine, in a terminal window), using the hostname from above; here we'll take the example of gpc-f148n089-ib0:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export gpcnode=&amp;quot;gpc-f148n089-ib0&amp;quot;&lt;br /&gt;
$ ssh -N -L 20080:${gpcnode}:22 -L 20090:${gpcnode}:11111 login.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This command will not return anything until the forwarding is terminated, and will just look like it's sitting there.  It doesn't start a remote shell or command (-N), but it will connect to login.scinet.utoronto.ca, and from there it will redirect your local (-L) port 20080 to &amp;lt;tt&amp;gt;${gpcnode}&amp;lt;/tt&amp;gt; port 22, and similarly local port 20090 to &amp;lt;tt&amp;gt;${gpcnode}&amp;lt;/tt&amp;gt; port 11111.  We'll use the first for ssh'ing to the remote node (mainly for testing), and the second to connect the local paraview client to the remote paraview server.&lt;br /&gt;
&lt;br /&gt;
To make sure the port forwarding is working correctly, in another window try sshing directly to the compute node from your desktop:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -p 20080 [your-scinet-username]@localhost&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and this should land you directly on the compute node.   If it does not, then something is wrong with the ssh forwarding.&lt;br /&gt;
&lt;br /&gt;
====Start Server====&lt;br /&gt;
&lt;br /&gt;
Now that the tunnel is set up, on the compute node you can start the paraview server.    To do this, you will have to have the following modules loaded:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load Xlibraries intel gcc python openmpi paraview&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(You can replace openmpi with intelmpi, and of course any module that is already loaded does not have to be loaded again.)&lt;br /&gt;
&lt;br /&gt;
Then start the paraview server with mpirun, as with any MPI job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -np [NP] pvserver --use-offscreen-rendering&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where NP is the number of processors; 16 processors per node on the largemem nodes, or 8 per node otherwise.    &lt;br /&gt;
&lt;br /&gt;
Once running, the ParaView server should output&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Listen on port: 11111&lt;br /&gt;
Waiting for client...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Connect Client and Server====&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure.png|thumb|right|320px|Configuring the client]]&lt;br /&gt;
&lt;br /&gt;
Once the server is running, you can connect the client.   Start the ParaView client on your desktop, and choose File-&amp;gt;Connect.   Click `Add Server', give the server a name (say, GPC), and give the port number 20090.   The other values should be correct by default; host is &amp;lt;tt&amp;gt;localhost&amp;lt;/tt&amp;gt;, and the server type is Client/Server.  Click `Configure'.&lt;br /&gt;
&lt;br /&gt;
On the next window, you'll be asked for a command to start up the server; select `Manual', and ok.&lt;br /&gt;
&lt;br /&gt;
In future runs, you'll be able to re-use this server, even if the host is different, because the correct host will be set in your &amp;lt;tt&amp;gt;.ssh/config&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Once the server is selected, click `Connect'.  On the compute node, the server should respond `Client connected'.   In the client window, when you (for instance) select File-&amp;gt;Open, you will see the files on the GPC rather than those on your local host.&lt;br /&gt;
&lt;br /&gt;
From here, the [http://paraview.org/Wiki/ParaView ParaView Wiki] can give you instructions as to how to plot your data.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7230</id>
		<title>Using Paraview</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7230"/>
		<updated>2014-09-10T19:57:23Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.paraview.org/ ParaView] is a powerful, parallel, client-server based visualization system that allows you to render data on SciNet's GPC nodes and manipulate the results interactively on your own desktop.   Using the ParaView server on SciNet is much like using it locally, but there is an additional step: setting up a connection directly between your desktop and the compute nodes.&lt;br /&gt;
&lt;br /&gt;
[[Image:Paraview.png|thumb|right|320px|The ParaView Client GUI]]&lt;br /&gt;
&lt;br /&gt;
===Installing ParaView===&lt;br /&gt;
&lt;br /&gt;
To use ParaView, you will need the client software installed on your system; download ParaView from [http://www.paraview.org/paraview/resources/software.html the Paraview website].  Binaries exist for Linux, Mac, and Windows systems.   The client version must exactly match the version installed on the server, currently 3.12 or 3.14.1.   The client has all the functionality of the server, and can analyze data locally.&lt;br /&gt;
&lt;br /&gt;
===SSH Forwarding For ParaView===&lt;br /&gt;
&lt;br /&gt;
To interactively use the ParaView server on GPC, you will have to work some ssh magic to allow the client on your desktop to connect to the server through the scinet login nodes.  The steps required are&lt;br /&gt;
&lt;br /&gt;
* Have an SSH key that you can use to log into SciNet&lt;br /&gt;
* Log into the head node that you'll be using the server on&lt;br /&gt;
* (Mac or Linux): Edit your local &amp;lt;tt&amp;gt;~/.ssh/config&amp;lt;/tt&amp;gt; to enable forwarding to that node&lt;br /&gt;
* Start ssh forwarding&lt;br /&gt;
* Start server&lt;br /&gt;
* Connecting client and server&lt;br /&gt;
&lt;br /&gt;
====SSH Keys====&lt;br /&gt;
&lt;br /&gt;
To be able to log into the compute nodes where ParaView will be running, you'll have to have an [[Ssh_keys | SSH key]] set up, as password authentication won't work.    Our [[Ssh_keys | SSH Keys and SciNet]] page describes how to do this.&lt;br /&gt;
&lt;br /&gt;
====Log into node====&lt;br /&gt;
&lt;br /&gt;
The first thing to do is to go to the node from which you'll start the ParaView server.   This is typically done by starting an interactive job on the GPC, perhaps on the [[Moab#debug | debug ]] queue or sandybridge [[GPC_Quickstart#Memory_Configuration | large memory]] nodes.   Paraview can in principle make use of as many nodes as you throw at it.  So one might  begin jobs as below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=1:m128g:ppn=16,walltime=1:00:00 -q sandy -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=2:ppn=8,walltime=1:00:00 -q debug -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Once this job has started, you'll be placed in a shell on the head node of the job; typing &amp;lt;tt&amp;gt;hostname&amp;lt;/tt&amp;gt; will tell you the name of the host, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f148n089-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f107n045-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
you will need this hostname in the following steps.&lt;br /&gt;
&lt;br /&gt;
====Start SSH port forwarding====&lt;br /&gt;
&lt;br /&gt;
The port forwarding can now be started with the following command (on your local machine, in a terminal window), using the node hostname from above - here we'll take the example of gpc-f148n089-ib0:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ export gpcnode=&amp;quot;gpc-f148n089-ib0&amp;quot;&lt;br /&gt;
$ ssh -N -L 20080:${gpcnode}:22 -L 20090:${gpcnode}:11111 login.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
this command will not return anything until the forwarding is terminated, and will just look like it's sitting there.  It doesn't start a remote shell or command (-N), but it will connect to login.scinet.utoronto.ca, and from there it will redirect your local (-L) port 20080 to &amp;lt;tt&amp;gt;${gpcnode}&amp;lt;/tt&amp;gt; port 22, and similarly local port 20090 to &amp;lt;tt&amp;gt;${gpcnode}&amp;lt;/tt&amp;gt; port 11111.  We'll use the first for ssh'ing to the remote node (mainly for testing), and the second to connect the local paraview client to the remote paraview server.&lt;br /&gt;
&lt;br /&gt;
To make sure the port forwarding is working correctly, in another window try sshing directly to the compute node from your desktop:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -p 20080 [your-scinet-username]@localhost&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and this should land you directly on the compute node.   If it does not, then something is wrong with the ssh forwarding.&lt;br /&gt;
&lt;br /&gt;
====Start Server====&lt;br /&gt;
&lt;br /&gt;
Now that the tunnel is set up, on the compute node you can start the paraview server.    To do this, you will have to have the following modules loaded:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load Xlibraries intel gcc python openmpi paraview&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(You can replace openmpi with intelmpi, and of course any module that is already loaded does not have to be loaded again.)&lt;br /&gt;
&lt;br /&gt;
Then start the paraview server with mpirun, as with any MPI job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -np [NP] pvserver --use-offscreen-rendering&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where NP is the number of processors; 16 processors per node on the largemem nodes, or 8 per node otherwise.    &lt;br /&gt;
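&lt;br /&gt;
For example, for the two-node debug job requested above (2 nodes at 8 processors each), that would be:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -np 16 pvserver --use-offscreen-rendering&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;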
&lt;br /&gt;
Once running, the ParaView server should output&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Listen on port: 11111&lt;br /&gt;
Waiting for client...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Connect Client and Server====&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure.png|thumb|right|320px|Configuring the client]]&lt;br /&gt;
&lt;br /&gt;
Once the server is running, you can connect the client.   Start the ParaView client on your desktop, and choose File-&amp;gt;Connect.   Click `Add Server', give the server a name (say, GPC), and give the port number 20090.   The other values should be correct by default; host is &amp;lt;tt&amp;gt;localhost&amp;lt;/tt&amp;gt;, and the server type is Client/Server.  Click `Configure'.&lt;br /&gt;
&lt;br /&gt;
On the next window, you'll be asked for a command to start up the server; select `Manual', and ok.&lt;br /&gt;
&lt;br /&gt;
In future runs, you'll be able to re-use this server entry, even if the compute node is different, because the client always connects to &amp;lt;tt&amp;gt;localhost&amp;lt;/tt&amp;gt; port 20090; only the &amp;lt;tt&amp;gt;gpcnode&amp;lt;/tt&amp;gt; variable in the forwarding command needs to change.&lt;br /&gt;
&lt;br /&gt;
Once the server is selected, click `Connect'.  On the compute node, the server should respond `Client connected'.   In the client window, when you (for instance) select File-&amp;gt;Open, you will be seeing the files on the GPC, rather than the local host.&lt;br /&gt;
&lt;br /&gt;
From here, the [http://paraview.org/Wiki/ParaView ParaView Wiki] can give you instructions as to how to plot your data.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7229</id>
		<title>Using Paraview</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7229"/>
		<updated>2014-09-10T19:52:50Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.paraview.org/ ParaView] is a powerful, parallel, client-server based visualization system that allows you to use SciNet's GPC nodes to render data on SciNet, and manipulate the results interactively on your own desktop.   To use the paraview server on SciNet is much like using it locally, but there is an additional step in setting up a connection directly between your desktop and the compute nodes.&lt;br /&gt;
&lt;br /&gt;
[[Image:Paraview.png|thumb|right|320px|The ParaView Client GUI]]&lt;br /&gt;
&lt;br /&gt;
===Installing ParaView===&lt;br /&gt;
&lt;br /&gt;
To use ParaView, you must have the client software installed on your system; download it from [http://www.paraview.org/paraview/resources/software.html the ParaView website].  Binaries exist for Linux, Mac, and Windows systems.   The client version must exactly match the version installed on the server, currently 3.12 or 3.14.1.   The client has all the functionality of the server, and can also analyze data locally.&lt;br /&gt;
&lt;br /&gt;
===SSH Forwarding For ParaView===&lt;br /&gt;
&lt;br /&gt;
To interactively use the ParaView server on GPC, you will have to work some ssh magic to allow the client on your desktop to connect to the server through the scinet login nodes.  The steps required are&lt;br /&gt;
&lt;br /&gt;
* Have an SSH key that you can use to log into SciNet&lt;br /&gt;
* Log into the head node that you'll be using the server on&lt;br /&gt;
* (Mac or Linux): Edit your local &amp;lt;tt&amp;gt;~/.ssh/config&amp;lt;/tt&amp;gt; to enable forwarding to that node&lt;br /&gt;
* Start ssh forwarding&lt;br /&gt;
* Start server&lt;br /&gt;
* Connect client and server&lt;br /&gt;
&lt;br /&gt;
====SSH Keys====&lt;br /&gt;
&lt;br /&gt;
To be able to log into the compute nodes where ParaView will be running, you'll have to have an [[Ssh_keys | SSH key]] set up, as password authentication won't work.    Our [[Ssh_keys | SSH Keys and SciNet]] page describes how to do this.&lt;br /&gt;
&lt;br /&gt;
====Log into node====&lt;br /&gt;
&lt;br /&gt;
The first thing to do is to go to the node from which you'll start the ParaView server.   This is typically done by starting an interactive job on the GPC, perhaps on the [[Moab#debug | debug ]] queue or sandybridge [[GPC_Quickstart#Memory_Configuration | large memory]] nodes.   Paraview can in principle make use of as many nodes as you throw at it.  So one might  begin jobs as below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=1:m128g:ppn=16,walltime=1:00:00 -q sandy -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=2:ppn=8,walltime=1:00:00 -q debug -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Once this job has started, you'll be placed in a shell on the head node of the job; typing &amp;lt;tt&amp;gt;hostname&amp;lt;/tt&amp;gt; will tell you the name of the host, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f148n089-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f107n045-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
you will need this hostname in the following steps; on your local machine, in the terminal, set a variable named &amp;lt;tt&amp;gt;gpcnode&amp;lt;/tt&amp;gt; to the remote node name, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export gpcnode=gpc-f148n089-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Start SSH port forwarding====&lt;br /&gt;
&lt;br /&gt;
The port forwarding can then be started with the following command (on your desktop, in the same terminal where you set the gpcnode variable above):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -N -L 20080:${gpcnode}:22 -L 20090:${gpcnode}:11111 login.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
this command will not return anything until the forwarding is terminated, and will just look like it's sitting there.  It doesn't start a remote shell or command (-N), but it will connect to login.scinet.utoronto.ca, and from there it will redirect your local (-L) port 20080 to &amp;lt;tt&amp;gt;${gpcnode}&amp;lt;/tt&amp;gt; port 22, and similarly local port 20090 to &amp;lt;tt&amp;gt;${gpcnode}&amp;lt;/tt&amp;gt; port 11111.  We'll use the first for ssh'ing to the remote node (mainly for testing), and the second to connect the local paraview client to the remote paraview server.&lt;br /&gt;
&lt;br /&gt;
To make sure the port forwarding is working correctly, in another window try sshing directly to the compute node from your desktop:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -p 20080 [your-scinet-username]@localhost&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and this should land you directly on the compute node.   If it does not, then something is wrong with the ssh forwarding.&lt;br /&gt;
&lt;br /&gt;
====Start Server====&lt;br /&gt;
&lt;br /&gt;
Now that the tunnel is set up, on the compute node you can start the paraview server.    To do this, you will have to have the following modules loaded:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load Xlibraries intel gcc python openmpi paraview&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(You can replace openmpi with intelmpi, and of course any module that is already loaded does not have to be loaded again.)&lt;br /&gt;
&lt;br /&gt;
Then start the paraview server with mpirun, as with any MPI job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -np [NP] pvserver --use-offscreen-rendering&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where NP is the number of processors; 16 processors per node on the largemem nodes, or 8 per node otherwise.    &lt;br /&gt;
&lt;br /&gt;
Once running, the ParaView server should output&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Listen on port: 11111&lt;br /&gt;
Waiting for client...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Connect Client and Server====&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure.png|thumb|right|320px|Configuring the client]]&lt;br /&gt;
&lt;br /&gt;
Once the server is running, you can connect the client.   Start the ParaView client on your desktop, and choose File-&amp;gt;Connect.   Click `Add Server', give the server a name (say, GPC), and give the port number 20090.   The other values should be correct by default; host is &amp;lt;tt&amp;gt;localhost&amp;lt;/tt&amp;gt;, and the server type is Client/Server.  Click `Configure'.&lt;br /&gt;
&lt;br /&gt;
On the next window, you'll be asked for a command to start up the server; select `Manual', and ok.&lt;br /&gt;
&lt;br /&gt;
In future runs, you'll be able to re-use this server entry, even if the compute node is different, because the client always connects to &amp;lt;tt&amp;gt;localhost&amp;lt;/tt&amp;gt; port 20090; only the &amp;lt;tt&amp;gt;gpcnode&amp;lt;/tt&amp;gt; variable in the forwarding command needs to change.&lt;br /&gt;
&lt;br /&gt;
Once the server is selected, click `Connect'.  On the compute node, the server should respond `Client connected'.   In the client window, when you (for instance) select File-&amp;gt;Open, you will be seeing the files on the GPC, rather than the local host.&lt;br /&gt;
&lt;br /&gt;
From here, the [http://paraview.org/Wiki/ParaView ParaView Wiki] can give you instructions as to how to plot your data.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7228</id>
		<title>Using Paraview</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Using_Paraview&amp;diff=7228"/>
		<updated>2014-09-10T19:49:21Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Log into node */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[http://www.paraview.org/ ParaView] is a powerful, parallel, client-server based visualization system that allows you to use SciNet's GPC nodes to render data on SciNet, and manipulate the results interactively on your own desktop.   To use the paraview server on SciNet is much like using it locally, but there is an additional step in setting up a connection directly between your desktop and the compute nodes.&lt;br /&gt;
&lt;br /&gt;
[[Image:Paraview.png|thumb|right|320px|The ParaView Client GUI]]&lt;br /&gt;
&lt;br /&gt;
===Installing ParaView===&lt;br /&gt;
&lt;br /&gt;
To use ParaView, you must have the client software installed on your system; download it from [http://www.paraview.org/paraview/resources/software.html the ParaView website].  Binaries exist for Linux, Mac, and Windows systems.   The client version must exactly match the version installed on the server, currently 3.12 or 3.14.1.   The client has all the functionality of the server, and can also analyze data locally.&lt;br /&gt;
&lt;br /&gt;
===SSH Forwarding For ParaView===&lt;br /&gt;
&lt;br /&gt;
To interactively use the ParaView server on GPC, you will have to work some ssh magic to allow the client on your desktop to connect to the server through the scinet login nodes.  The steps required are&lt;br /&gt;
&lt;br /&gt;
* Have an SSH key that you can use to log into SciNet&lt;br /&gt;
* Log into the head node that you'll be using the server on&lt;br /&gt;
* (Mac or Linux): Edit your local &amp;lt;tt&amp;gt;~/.ssh/config&amp;lt;/tt&amp;gt; to enable forwarding to that node&lt;br /&gt;
* Start ssh forwarding&lt;br /&gt;
* Start server&lt;br /&gt;
* Connect client and server&lt;br /&gt;
&lt;br /&gt;
====SSH Keys====&lt;br /&gt;
&lt;br /&gt;
To be able to log into the compute nodes where ParaView will be running, you'll have to have an [[Ssh_keys | SSH key]] set up, as password authentication won't work.    Our [[Ssh_keys | SSH Keys and SciNet]] page describes how to do this.&lt;br /&gt;
&lt;br /&gt;
====Log into node====&lt;br /&gt;
&lt;br /&gt;
The first thing to do is to go to the node from which you'll start the ParaView server.   This is typically done by starting an interactive job on the GPC, perhaps on the [[Moab#debug | debug ]] queue or sandybridge [[GPC_Quickstart#Memory_Configuration | large memory]] nodes.   Paraview can in principle make use of as many nodes as you throw at it.  So one might  begin jobs as below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=1:m128g:ppn=16,walltime=1:00:00 -q sandy -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=2:ppn=8,walltime=1:00:00 -q debug -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Once this job has started, you'll be placed in a shell on the head node of the job; typing &amp;lt;tt&amp;gt;hostname&amp;lt;/tt&amp;gt; will tell you the name of the host, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f148n089-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
or &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ hostname&lt;br /&gt;
gpc-f107n045-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
you will need this hostname in the following steps; on your local machine, in the terminal, set a variable named &amp;lt;tt&amp;gt;gpcnode&amp;lt;/tt&amp;gt; to the remote node name, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export gpcnode=gpc-f148n089-ib0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Edit ssh config (MacOS/Linux)====&lt;br /&gt;
&lt;br /&gt;
You will now need to edit your ssh config to set up ssh forwarding so that you can connect (seemingly directly) to the compute node above.   Add the following lines to your &amp;lt;tt&amp;gt;~/.ssh/config&amp;lt;/tt&amp;gt; file on MacOS or Linux; Windows users will have to consult their ssh client's documentation for how to set up the forwarding:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host gpc_gw&lt;br /&gt;
   HostName login.scinet.utoronto.ca&lt;br /&gt;
   User [username]&lt;br /&gt;
   LocalForward 20080 [hostname]:22&lt;br /&gt;
   LocalForward 20090 [hostname]:11111&lt;br /&gt;
&lt;br /&gt;
Host gpcnode&lt;br /&gt;
   HostName localhost&lt;br /&gt;
   HostKeyAlias gpcnode&lt;br /&gt;
   User [username]&lt;br /&gt;
   Port 20080 &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Replace &amp;lt;tt&amp;gt;[username]&amp;lt;/tt&amp;gt; with your username, and &amp;lt;tt&amp;gt;[hostname]&amp;lt;/tt&amp;gt; with the name of the host from the previous step.  This sets up two ssh port forwards: one to port 11111 of the compute node, which is needed by ParaView, and one to the usual SSH port 22, which can be used for testing.   In future runs of the server, only the hostname in the first stanza needs to be changed.&lt;br /&gt;
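&lt;br /&gt;
As a filled-in illustration, using the example node from above and a made-up username &amp;lt;tt&amp;gt;jdoe&amp;lt;/tt&amp;gt; (substitute your own):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host gpc_gw&lt;br /&gt;
   HostName login.scinet.utoronto.ca&lt;br /&gt;
   User jdoe&lt;br /&gt;
   LocalForward 20080 gpc-f148n089-ib0:22&lt;br /&gt;
   LocalForward 20090 gpc-f148n089-ib0:11111&lt;br /&gt;
&lt;br /&gt;
Host gpcnode&lt;br /&gt;
   HostName localhost&lt;br /&gt;
   HostKeyAlias gpcnode&lt;br /&gt;
   User jdoe&lt;br /&gt;
   Port 20080&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;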
&lt;br /&gt;
====Edit ssh config (Windows, Cygwin)====&lt;br /&gt;
&lt;br /&gt;
If you have Cygwin X installed on Windows, including the &amp;lt;tt&amp;gt;openssh&amp;lt;/tt&amp;gt; package, take the following steps. First run in your Cygwin Bash Shell:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh-user-config&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
There is no need to create any of the keys, but this will create the ssh directory where you need to put the config file. &lt;br /&gt;
&lt;br /&gt;
Next, go to the directory &amp;lt;tt&amp;gt;cygwin\home\[username]\.ssh\&amp;lt;/tt&amp;gt;, where &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; is your computer login name. In this directory, create a file called &amp;lt;tt&amp;gt;config&amp;lt;/tt&amp;gt; containing the stanzas from the previous section, with the placeholders replaced by the appropriate hostname and username. Make the file read-only. &lt;br /&gt;
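One way to make the file read-only is from the Cygwin Bash Shell, for instance (you can equally use the Windows file-properties dialog):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ chmod 400 ~/.ssh/config&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;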
The ssh port forwarding is now set up. Open a Cygwin Bash Shell and follow the rest of the instructions below.&lt;br /&gt;
&lt;br /&gt;
====Start SSH port forwarding====&lt;br /&gt;
&lt;br /&gt;
Once the ssh configuration is set, the port forwarding can be started with the command (on your desktop)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -N gpc_gw&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
this command will not return anything until the forwarding is terminated, and will just look like it's sitting there.  To make sure the port forwarding is working correctly, in another window try sshing directly to the compute node from your desktop:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh gpcnode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and this should land you directly on the compute node.   If it does not, then something is wrong with the ssh forwarding.&lt;br /&gt;
&lt;br /&gt;
====Start Server====&lt;br /&gt;
&lt;br /&gt;
Now that the tunnel is set up, on the compute node you can start the paraview server.    To do this, you will have to have the following modules loaded:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load Xlibraries intel gcc python openmpi paraview&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(You can replace openmpi with intelmpi, and of course any module that is already loaded does not have to be loaded again.)&lt;br /&gt;
&lt;br /&gt;
Then start the paraview server with mpirun, as with any MPI job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -np [NP] pvserver --use-offscreen-rendering&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where NP is the number of processors; 16 processors per node on the largemem nodes, or 8 per node otherwise.    &lt;br /&gt;
&lt;br /&gt;
Once running, the ParaView server should output&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Listen on port: 11111&lt;br /&gt;
Waiting for client...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Connect Client and Server====&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure.png|thumb|right|320px|Configuring the client]]&lt;br /&gt;
&lt;br /&gt;
Once the server is running, you can connect the client.   Start the ParaView client on your desktop, and choose File-&amp;gt;Connect.   Click `Add Server', give the server a name (say, GPC), and give the port number 20090.   The other values should be correct by default; host is &amp;lt;tt&amp;gt;localhost&amp;lt;/tt&amp;gt;, and the server type is Client/Server.  Click `Configure'.&lt;br /&gt;
&lt;br /&gt;
On the next window, you'll be asked for a command to start up the server; select `Manual', and ok.&lt;br /&gt;
&lt;br /&gt;
In future runs, you'll be able to re-use this server, even if the host is different, because the correct host will be set in your &amp;lt;tt&amp;gt;.ssh/config&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Once the server is selected, click `Connect'.  On the compute node, the server should respond `Client connected'.   In the client window, when you (for instance) select File-&amp;gt;Open, you will be seeing the files on the GPC, rather than the local host.&lt;br /&gt;
&lt;br /&gt;
From here, the [http://paraview.org/Wiki/ParaView ParaView Wiki] can give you instructions as to how to plot your data.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7206</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7206"/>
		<updated>2014-09-03T17:28:55Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop and start it. (Note!  At time of writing, the newest version, 4.3.14, is broken on at least Mac and Windows; you'll want to install 4.3.12 from [https://www.virtualbox.org/wiki/Download_Old_Builds_4_3 &amp;quot;older builds&amp;quot;].)&lt;br /&gt;
* Under Settings or Preferences, go to Network, then Host-only networks, and add/create two host-only networks.&lt;br /&gt;
* Then download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (requires a peak of ~8GB of free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (requires a peak of ~6GB of free disk space)]&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
If you get any warnings about shared folders not existing, that's fine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to log in to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from a terminal on your laptop: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; (or &amp;lt;pre&amp;gt;ssh -p 2222 vagrant@localhost&amp;lt;/pre&amp;gt;) or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
(If that particular address pair doesn't work, from a window within the VM, type &amp;quot;ifconfig&amp;quot; to find a line like &amp;quot;inet addr: 192.168....&amp;quot; or &amp;quot;inet addr: 10. ..&amp;quot;; that's the VM's IP address)&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
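&lt;br /&gt;
For reference, a Hadoop-streaming word count in Python looks roughly like the sketch below; the mapper and reducer shipped in &amp;lt;tt&amp;gt;~/examples/wordcount/streaming&amp;lt;/tt&amp;gt; may differ in detail.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/usr/bin/env python&lt;br /&gt;
# mapper.py: read text on stdin, emit one tab-separated (word, 1) pair per word&lt;br /&gt;
import sys&lt;br /&gt;
&lt;br /&gt;
for line in sys.stdin:&lt;br /&gt;
    for word in line.split():&lt;br /&gt;
        print('%s\t%d' % (word, 1))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/usr/bin/env python&lt;br /&gt;
# reducer.py: sum the counts for each word; the shuffle delivers keys in sorted order&lt;br /&gt;
import sys&lt;br /&gt;
&lt;br /&gt;
current, count = None, 0&lt;br /&gt;
for line in sys.stdin:&lt;br /&gt;
    line = line.rstrip('\n')&lt;br /&gt;
    if not line:&lt;br /&gt;
        continue&lt;br /&gt;
    word, _, n = line.partition('\t')&lt;br /&gt;
    if word == current:&lt;br /&gt;
        count += int(n)&lt;br /&gt;
    else:&lt;br /&gt;
        if current is not None:&lt;br /&gt;
            print('%s\t%d' % (current, count))&lt;br /&gt;
        current, count = word, int(n)&lt;br /&gt;
if current is not None:&lt;br /&gt;
    print('%s\t%d' % (current, count))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;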
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.  If you vagrant-up the GUI VM, you will have to &amp;quot;vagrant reload&amp;quot; after installation is completed to restart with all the software installed.&lt;br /&gt;
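&lt;br /&gt;
That is, from the directory holding the downloaded Vagrantfile, something like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ vagrant up&lt;br /&gt;
$ vagrant reload      # GUI VM only, once provisioning has finished&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;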
&lt;br /&gt;
If you can't get the VM working for whatever reason, please contact us and we will make alternate arrangements.&lt;br /&gt;
&lt;br /&gt;
= Updated Examples =&lt;br /&gt;
&lt;br /&gt;
If you've downloaded the image before Wednesday morning, from within the VM you may want to download the updated examples from [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/examples.tgz https://support.scinet.utoronto.ca/~ljdursi/Hadoop/examples.tgz]&lt;br /&gt;
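&lt;br /&gt;
From within the VM, one way to fetch them is, for example (wget is assumed to be available, and the tarball is assumed to unpack over &amp;lt;tt&amp;gt;~/examples&amp;lt;/tt&amp;gt; - you can check its contents first with &amp;lt;tt&amp;gt;tar tzf examples.tgz&amp;lt;/tt&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cd ~&lt;br /&gt;
$ wget https://support.scinet.utoronto.ca/~ljdursi/Hadoop/examples.tgz&lt;br /&gt;
$ tar xzf examples.tgz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;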
&lt;br /&gt;
= Slides =&lt;br /&gt;
&lt;br /&gt;
You can download [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/presentation.pdf the slides from here].&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7205</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7205"/>
		<updated>2014-09-03T16:59:30Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* VM Instructions */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop and start it. (Note!  At time of writing, the newest version, 4.3.14, is broken on at least Mac and Windows; you'll want to install 4.3.12 from [https://www.virtualbox.org/wiki/Download_Old_Builds_4_3 &amp;quot;older builds&amp;quot;].)&lt;br /&gt;
* Under Settings or Preferences, go to Network, then Host-only networks, and add/create two host-only networks.&lt;br /&gt;
* Then download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (require peak of ~8GB free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (require peak of ~6GB free disk space)]&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
If you get any warnings about shared folders not existing, that's fine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to login to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from your laptop from the terminal: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; (or &amp;lt;pre&amp;gt;ssh -p 2222 vagrant@localhost&amp;lt;/pre&amp;gt;) or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
(If that particular address pair doesn't work, from a window within the VM, type &amp;quot;ifconfig&amp;quot; to find a line like &amp;quot;inet addr: 192.168....&amp;quot; or &amp;quot;inet addr: 10. ..&amp;quot;; that's the VM's IP address)&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.  If you vagrant-up the GUI VM, you will have to &amp;quot;vagrant reload&amp;quot; after installation is completed to restart with all the software installed.&lt;br /&gt;
&lt;br /&gt;
If you can't get the VM working for whatever reason, please contact us and we will make alternate arrangements.&lt;br /&gt;
&lt;br /&gt;
= Updated Examples =&lt;br /&gt;
&lt;br /&gt;
If you've downloaded the image before Wednesday morning, from within the VM you may want to download the updated examples from [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/examples.tgz https://support.scinet.utoronto.ca/~ljdursi/Hadoop/examples.tgz]&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7202</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7202"/>
		<updated>2014-09-03T13:08:52Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* VM Instructions */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop and start it. (Note!  At time of writing, the newest version, 4.3.14, is broken on at least Mac and Windows; you'll want to install 4.3.12 from [https://www.virtualbox.org/wiki/Download_Old_Builds_4_3 &amp;quot;older builds&amp;quot;].)&lt;br /&gt;
* Under Settings or Preferences, go to Network, then Host-only networks, and add/create two host-only networks.&lt;br /&gt;
* Then download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (require peak of ~8GB free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (require peak of ~6GB free disk space)]&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
If you get any warnings about shared folders not existing, that's fine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to login to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from your laptop from the terminal: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; (or &amp;lt;pre&amp;gt;ssh -p 2222 vagrant@localhost&amp;lt;/pre&amp;gt;) or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
(If that particular address pair doesn't work, from a window within the VM, type &amp;quot;ifconfig&amp;quot; to find a line like &amp;quot;inet addr: 192.168....&amp;quot; or &amp;quot;inet addr: 10. ..&amp;quot;; that's the VM's IP address)&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.  If you vagrant-up the GUI VM, you will have to &amp;quot;vagrant reload&amp;quot; after installation is completed to restart with all the software installed.&lt;br /&gt;
&lt;br /&gt;
If you can't get the VM working for whatever reason, please contact us and we will make alternate arrangements.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7201</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7201"/>
		<updated>2014-09-03T13:07:33Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* VM Instructions */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop and start it. (Note!  At time of writing, the newest version, 4.3.14, is broken on at least Mac and Windows; you'll want to install 4.3.12 from [https://www.virtualbox.org/wiki/Download_Old_Builds_4_3 &amp;quot;older builds&amp;quot;].)&lt;br /&gt;
* Under Settings or Preferences, go to Network, then Host-only networks, and add/create two host-only networks.&lt;br /&gt;
* Then download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (require peak of ~8GB free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (require peak of ~6GB free disk space)]&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to login to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from your laptop from the terminal: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(If that particular address pair doesn't work, from a window within the VM, type &amp;quot;ifconfig | grep 192&amp;quot; to find a line like &amp;quot;inet addr: 192.168....&amp;quot;; that's the VM's IP address)&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.  If you vagrant-up the GUI VM, you will have to &amp;quot;vagrant reload&amp;quot; after installation is completed to restart with all the software installed.&lt;br /&gt;
&lt;br /&gt;
If you can't get the VM working for whatever reason, please contact us and we will make alternate arrangements.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7200</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7200"/>
		<updated>2014-09-03T13:05:58Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* VM Instructions */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop and start it. (Note!  At time of writing, the newest version, 4.3.14, is broken on at least Mac and Windows; you'll want to install 4.3.12 from &amp;quot;older builds&amp;quot;.)&lt;br /&gt;
* Under Settings or Preferences, go to Network, then Host-only networks, and add/create two host-only networks.&lt;br /&gt;
* Then download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (require peak of ~8GB free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (require peak of ~6GB free disk space)]&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to login to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from your laptop from the terminal: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(If that particular address pair doesn't work, from a window within the VM, type &amp;quot;ifconfig | grep 192&amp;quot; to find a line like &amp;quot;inet addr: 192.168....&amp;quot;; that's the VM's IP address)&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.  If you vagrant-up the GUI VM, you will have to &amp;quot;vagrant reload&amp;quot; after installation is completed to restart with all the software installed.&lt;br /&gt;
&lt;br /&gt;
If you can't get the VM working for whatever reason, please contact us and we will make alternate arrangements.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7187</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7187"/>
		<updated>2014-08-31T17:09:41Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* VM Instructions */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop and start it.&lt;br /&gt;
* Under Settings or Preferences, go to Network, then Host-only networks, and add/create two host-only networks.&lt;br /&gt;
* Then download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (require peak of ~8GB free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (require peak of ~6GB free disk space)]&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to login to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from your laptop from the terminal: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(If that particular address pair doesn't work, from a window within the VM, type &amp;quot;ifconfig | grep 192&amp;quot; to find a line like &amp;quot;inet addr: 192.168....&amp;quot;; that's the VM's IP address)&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.  If you vagrant-up the GUI VM, you will have to &amp;quot;vagrant reload&amp;quot; after installation is completed to restart with all the software installed.&lt;br /&gt;
&lt;br /&gt;
If you can't get the VM working for whatever reason, please contact us and we will make alternate arrangements.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7186</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7186"/>
		<updated>2014-08-29T18:23:06Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* VM Instructions */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop and start it.&lt;br /&gt;
* Under Settings or Preferences, go to Network, then Host-only networks, and add/create two host-only networks.&lt;br /&gt;
* Then download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (require peak of ~8GB free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (require peak of ~6GB free disk space)]&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to login to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from your laptop from the terminal: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(If that particular address pair doesn't work, from a window within the VM, type &amp;quot;ifconfig | grep 192&amp;quot; to find a line like &amp;quot;inet addr: 192.168....&amp;quot;; that's the VM's IP address)&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If you can't get the VM working for whatever reason, please contact us and we will make alternate arrangements.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7185</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7185"/>
		<updated>2014-08-29T14:06:38Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop&lt;br /&gt;
* Download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (require peak of ~8GB free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (require peak of ~6GB free disk space)]&lt;br /&gt;
* Start VirtualBox&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to login to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from your laptop from the terminal: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If you can't get the VM working for whatever reason, please contact us and we will make alternate arrangements.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7184</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7184"/>
		<updated>2014-08-29T12:32:56Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop&lt;br /&gt;
* Download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (require peak of ~8GB free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (require peak of ~6GB free disk space)]&lt;br /&gt;
* Start VirtualBox&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to login to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from your laptop from the terminal: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/Gui/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If you can't get the VM working for whatever reason, please contact us and we will make alternate arrangements.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7183</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7183"/>
		<updated>2014-08-29T12:20:57Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop&lt;br /&gt;
* Download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI/GUI-VM.ova Full Size VM with GUI (requires a peak of ~8GB of free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Text-VM.ova Smaller, Text-only (requires a peak of ~6GB of free disk space)]&lt;br /&gt;
* Start VirtualBox&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to log in to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from a terminal on your laptop: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the Hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;br /&gt;
&lt;br /&gt;
If you'd like, you can also create the virtual machine image yourself by downloading [http://www.vagrantup.com Vagrant] and the Vagrantfile for the [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/Gui/Vagrantfile GUI] or [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/text/Vagrantfile text] image and running &amp;quot;vagrant up&amp;quot;.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7182</id>
		<title>Hadoop for HPCers</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Hadoop_for_HPCers&amp;diff=7182"/>
		<updated>2014-08-29T12:18:44Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: Created page with &amp;quot; =Overview=  This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of: * The Hadoop...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=Overview=&lt;br /&gt;
&lt;br /&gt;
This is a ~3 hour class that will introduce Hadoop to HPC users with a background in numerical simulation.  We will walk through a brief overview of:&lt;br /&gt;
* The Hadoop File System (HDFS)&lt;br /&gt;
* Map Reduce &lt;br /&gt;
* Pig&lt;br /&gt;
* Spark&lt;br /&gt;
&lt;br /&gt;
Most examples will be written in Python.&lt;br /&gt;
&lt;br /&gt;
=VM Instructions=&lt;br /&gt;
&lt;br /&gt;
This course will feature hands-on work with a 1-node Hadoop cluster running on your laptop.  The VMs are created with [http://www.vagrantup.com Vagrant].  Before the course, ensure this is up and running:&lt;br /&gt;
&lt;br /&gt;
* Install [https://www.virtualbox.org VirtualBox] on your laptop&lt;br /&gt;
* Download the virtual machine image you want to use:&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/GUI-VM.ova Full Size VM with GUI (requires a peak of ~8GB of free disk space)]&lt;br /&gt;
** [https://support.scinet.utoronto.ca/~ljdursi/Hadoop/VMs/Text-VM.ova Smaller, Text-only (requires a peak of ~6GB of free disk space)]&lt;br /&gt;
* Start VirtualBox&lt;br /&gt;
* &amp;quot;Import Appliance&amp;quot;, and select the downloaded image; this will uncompress the image which will take some minutes.&lt;br /&gt;
* Start the new virtual machine.&lt;br /&gt;
&lt;br /&gt;
The GUI VM will start up a console with a full desktop environment; you can open a terminal and begin working.  For the text VM, you will have to log in to the console; the username/password is vagrant/vagrant.  For either machine, you can also ssh into the VM from a terminal on your laptop: &amp;lt;pre&amp;gt;ssh vagrant@192.168.33.10&amp;lt;/pre&amp;gt; or to the laptop from the VM with &amp;lt;pre&amp;gt;ssh [username]@192.168.33.1&amp;lt;/pre&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Then make sure everything is working:&lt;br /&gt;
* From a terminal, start up the Hadoop cluster by typing &amp;lt;pre&amp;gt;~/bin/init.sh&amp;lt;/pre&amp;gt;  You may have to answer &amp;quot;yes&amp;quot; a few times to start up some servers.&lt;br /&gt;
* Go to one of the example directories by typing &amp;lt;pre&amp;gt;cd ~/examples/wordcount/streaming&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Then start the example by typing &amp;lt;pre&amp;gt;make&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You've now run your (maybe) first Hadoop job!&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=7168</id>
		<title>FAQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=7168"/>
		<updated>2014-08-20T20:09:52Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: Time to stop talking about 2012 allocations.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==The Basics==&lt;br /&gt;
===Whom do I contact for support?===&lt;br /&gt;
&lt;br /&gt;
Whom do I contact if I have problems or questions about how to use the SciNet systems?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
E-mail [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  &lt;br /&gt;
&lt;br /&gt;
In your email, please include the following information:&lt;br /&gt;
&lt;br /&gt;
* your username on SciNet&lt;br /&gt;
* the cluster that your question pertains to (GPC or TCS; SciNet is not a cluster!),&lt;br /&gt;
* any relevant error messages&lt;br /&gt;
* the commands you typed before the errors occurred&lt;br /&gt;
* the path to your code (if applicable)&lt;br /&gt;
* the location of the job scripts (if applicable)&lt;br /&gt;
* the directory from which it was submitted (if applicable)&lt;br /&gt;
* a description of what it is supposed to do (if applicable)&lt;br /&gt;
* if your problem is about connecting to SciNet, the type of computer you are connecting from.&lt;br /&gt;
&lt;br /&gt;
Note that your password should never, never, never be sent to us, even if your question is about your account.&lt;br /&gt;
&lt;br /&gt;
Try to avoid sending email only to specific individuals at SciNet. Your chances of a quick reply increase significantly if you email our team!&lt;br /&gt;
&lt;br /&gt;
===What does ''code scaling'' mean?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Parallel_Speedup|A Performance Primer]]&lt;br /&gt;
&lt;br /&gt;
===What do you mean by ''throughput''?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Throughput|A Performance Primer]].&lt;br /&gt;
&lt;br /&gt;
Here is a simple example:&lt;br /&gt;
&lt;br /&gt;
Suppose you need to do 10 computations.  Say each of these runs for&lt;br /&gt;
1 day on 8 cores, but they take &amp;quot;only&amp;quot; 18 hours on 16 cores.  What is the&lt;br /&gt;
fastest way to get all 10 computations done - as 8-core jobs or as&lt;br /&gt;
16-core jobs?  Let us assume you have 2 nodes at your disposal.&lt;br /&gt;
The answer, after some simple arithmetic, is that running your 10&lt;br /&gt;
jobs as 8-core jobs will take 5 days: two 8-core jobs fit on your two nodes&lt;br /&gt;
at once, so the 10 jobs finish in 5 rounds of 24 hours.  If you ran them as&lt;br /&gt;
16-core jobs, only one job could run at a time, so the 10 jobs would take&lt;br /&gt;
10 times 18 hours, or 7.5 days.  Draw your own conclusions...&lt;br /&gt;
&lt;br /&gt;
===I changed my .bashrc/.bash_profile and now nothing works===&lt;br /&gt;
&lt;br /&gt;
The default startup scripts provided by SciNet, and guidelines for them, can be found [[Important_.bashrc_guidelines|here]].  Certain things - like sourcing &amp;lt;tt&amp;gt;/etc/profile&amp;lt;/tt&amp;gt;&lt;br /&gt;
and &amp;lt;tt&amp;gt;/etc/bashrc&amp;lt;/tt&amp;gt; are ''required'' for various SciNet routines to work!   &lt;br /&gt;
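&lt;br /&gt;
As a rough sketch only (the guidelines page above has the full recommended startup files), the essential part looks something like the following; your own settings would go below it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# source the system-wide settings that SciNet routines rely on&lt;br /&gt;
if [ -f /etc/profile ]; then . /etc/profile; fi&lt;br /&gt;
if [ -f /etc/bashrc ]; then . /etc/bashrc; fi&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;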
&lt;br /&gt;
If the situation is so bad that you cannot even log in, please send email [mailto:support@scinet.utoronto.ca support].&lt;br /&gt;
&lt;br /&gt;
===Could I have my login shell changed to (t)csh?===&lt;br /&gt;
&lt;br /&gt;
The login shell used on our systems is bash. While the tcsh is available on the GPC and the TCS, we do not support it as the default login shell at present.  So &amp;quot;chsh&amp;quot; will not work, but you can always run tcsh interactively. Also, csh scripts will be executed correctly provided that they have the correct &amp;quot;shebang&amp;quot; &amp;lt;tt&amp;gt;#!/bin/tcsh&amp;lt;/tt&amp;gt; at the top.&lt;br /&gt;
&lt;br /&gt;
===How can I run Matlab / IDL / Gaussian / my favourite commercial software at SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Because SciNet serves such a disparate group of user communities, there is just no way we can buy licenses for everyone's commercial package.   The only commercial software we have purchased is that which in principle can benefit everyone -- fast compilers and math libraries (Intel's on GPC, and IBM's on TCS).&lt;br /&gt;
&lt;br /&gt;
If your research group requires a commercial package that you already have or are willing to buy licenses for, contact us at [mailto:support@scinet.utoronto.ca support@scinet] and we can work together to find out if it is feasible to implement the package's licensing arrangement on the SciNet clusters, and if so, what the best way to do it is.&lt;br /&gt;
&lt;br /&gt;
Note that it is important that you contact us before installing commercially licensed software on SciNet machines, even if you have a way to do it in your own directory without requiring sysadmin intervention.   It puts us in a very awkward position if someone is found to be running unlicensed or invalidly licensed software on our systems, so we need to be aware of what is being installed where.&lt;br /&gt;
&lt;br /&gt;
===Do you have a recommended ssh program that will allow SciNet access from Windows machines?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The [[Ssh#SSH_for_Windows_Users | SSH for Windows users]] programs we recommend are:&lt;br /&gt;
&lt;br /&gt;
* [http://mobaxterm.mobatek.net/en/ MobaXterm] is a tabbed ssh client with some Cygwin tools, including ssh and X, all wrapped up into one executable.&lt;br /&gt;
* [http://www.chiark.greenend.org.uk/~sgtatham/putty/ PuTTY]  - this is a terminal for Windows that connects via ssh.  It is a quick install and will get you up and running quickly.&amp;lt;br&amp;gt;To set up your passphrase-protected ssh key with PuTTY, see [http://the.earth.li/~sgtatham/putty/0.61/htmldoc/Chapter8.html#pubkey here].&lt;br /&gt;
* [http://www.cygwin.com/ Cygwin] - this is a whole Linux-like environment for Windows, which also includes an X window server so that you can display remote windows on your desktop.  Make sure you include the openssh and X window system packages in the installation for full functionality.  This is recommended if you will be doing a lot of work on Linux machines, as it makes a very similar environment available on your computer.&amp;lt;br&amp;gt;To set up your ssh keys, follow the Linux instructions on the [[Ssh keys]] page.&lt;br /&gt;
&lt;br /&gt;
===My ssh key does not work! WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
[[Ssh_keys#Testing_Your_Key | Testing Your Key]]&lt;br /&gt;
&lt;br /&gt;
* If this doesn't work, you should be able to log in using your password, and investigate the problem. For example, if during a login session you get a message similar to the one below, just follow the instructions and delete the offending key on line 3 (you can use vi to jump to that line with ESC plus : plus 3). This only means that you may have logged in from your home computer to SciNet in the past, and that key is obsolete.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!&lt;br /&gt;
Someone could be eavesdropping on you right now (man-in-the-middle&lt;br /&gt;
attack)!&lt;br /&gt;
It is also possible that the RSA host key has just been changed.&lt;br /&gt;
The fingerprint for the RSA key sent by the remote host is&lt;br /&gt;
53:f9:60:71:a8:0b:5d:74:83:52:fe:ea:1a:9e:cc:d3.&lt;br /&gt;
Please contact your system administrator.&lt;br /&gt;
Add correct host key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts to get rid of&lt;br /&gt;
this message.&lt;br /&gt;
Offending key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts:3&lt;br /&gt;
RSA host key for login.scinet.utoronto.ca has&lt;br /&gt;
changed and you have requested&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
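&lt;br /&gt;
Instead of editing known_hosts by hand, recent versions of OpenSSH can remove the offending entry for you:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh-keygen -R login.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;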
&lt;br /&gt;
* If you get the message below, you may need to log out of your gnome session and log back in, since the ssh-agent needs to be&lt;br /&gt;
restarted with the new passphrase-protected ssh key.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
Agent admitted failure to sign using the key.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
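&lt;br /&gt;
Depending on your desktop setup, it may also be enough to hand the key to the running agent yourself rather than logging out, e.g.:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh-add ~/.ssh/id_rsa&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;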
&lt;br /&gt;
===Can't forward X:  &amp;quot;Warning: No xauth data; using fake authentication data&amp;quot;, or &amp;quot;X11 connection rejected because of wrong authentication.&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
I used to be able to forward X11 windows from SciNet to my home machine, but now I'm getting these messages; what's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This very likely means that ssh/xauth can't update your ${HOME}/.Xauthority file. &lt;br /&gt;
&lt;br /&gt;
The simplest possible reason for this is that you've filled your 10GB /home quota and so can't write anything to your home directory.   Use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load extras&lt;br /&gt;
$ diskUsage&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
to check how close you are to your quota on ${HOME}.&lt;br /&gt;
&lt;br /&gt;
Alternately, this could mean your .Xauthority file has become broken/corrupted/confused somehow, in which case you can delete that file, and when you next log in you'll get a similar warning message involving creating .Xauthority, but things should work.&lt;br /&gt;
&lt;br /&gt;
===How come I cannot log in to the TCS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
A SciNet account doesn't automatically entitle you to TCS access. At a minimum, TCS jobs need to run on at least 32 cores (64 preferred because of Simultaneous Multi Threading - [[TCS_Quickstart#Node_configuration|SMT]] - on these nodes) and need the large memory (4GB/core) and bandwidth on the system. Essentially you need to be able to explain why the work can't be done on the GPC.&lt;br /&gt;
&lt;br /&gt;
===How can I reset the password for my Compute Canada account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can reset your password for your Compute Canada account here:&lt;br /&gt;
&lt;br /&gt;
https://ccdb.computecanada.org/security/forgot&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===How can I change or reset the password for my SciNet account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To reset your password at SciNet please go to [https://portal.scinet.utoronto.ca/password_resets Password reset page].&lt;br /&gt;
&lt;br /&gt;
If you know your old password and want to change it, that can be done here:&lt;br /&gt;
&lt;br /&gt;
https://portal.scinet.utoronto.ca/change_password&lt;br /&gt;
&lt;br /&gt;
===Why am I getting the error &amp;quot;Permission denied (publickey,gssapi-with-mic,password)&amp;quot;?===&lt;br /&gt;
&lt;br /&gt;
This error can pop up in a variety of situations: when trying to log in, or, after a job has finished, when the error and output files fail to be copied (there are other possible reasons for this failure as well -- see [[FAQ#My_GPC_job_died.2C_telling_me_.60Copy_Stageout_Files_Failed.27|My GPC job died, telling me:Copy Stageout Files Failed]]).&lt;br /&gt;
In most cases, the &amp;quot;Permission denied&amp;quot; error is caused by incorrect permissions on the (hidden) .ssh directory. Ssh is used for logging in as well as for copying the standard error and output files after a job. &lt;br /&gt;
&lt;br /&gt;
For security reasons, &lt;br /&gt;
the .ssh directory should be readable and writable only by you; if it is &lt;br /&gt;
readable by everybody, ssh refuses to use it and the copy fails.  You can fix &lt;br /&gt;
this with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 700 ~/.ssh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
And to be sure, also do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 600 ~/.ssh/id_rsa ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===ERROR:102: Tcl command execution failed? when loading modules ===&lt;br /&gt;
Modules sometimes require other modules to be loaded first.&lt;br /&gt;
The module command will let you know if you haven't loaded them.&lt;br /&gt;
For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
$ module load python&lt;br /&gt;
python/2.6.2(11):ERROR:151: Module 'python/2.6.2' depends on one of the module(s) 'gcc/4.4.0'&lt;br /&gt;
python/2.6.2(11):ERROR:102: Tcl command execution failed: prereq gcc/4.4.0&lt;br /&gt;
$ module load gcc python&lt;br /&gt;
$&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Compiling your Code==&lt;br /&gt;
&lt;br /&gt;
===How can I get g77 to work?===&lt;br /&gt;
&lt;br /&gt;
The Fortran 77 compilers on the GPC are ifort and gfortran. We have dropped support for g77.  This has been a conscious decision. g77 (and the associated library libg2c) were completely replaced six years ago (Apr 2005) by the gcc 4.x branch, and haven't undergone any updates at all, even bug fixes, for over five years.  &lt;br /&gt;
If we were to install g77 and libg2c, we would have to deal with the inevitable confusion caused when users accidentally link against the old, broken, wrong versions of the gcc libraries instead of the correct current versions.   &lt;br /&gt;
&lt;br /&gt;
If your code for some reason specifically requires five-plus-year-old libraries,  availability, compatibility, and unfixed-known-bug problems are only going to get worse for you over time, and this might be as good an opportunity as any to address those issues. &lt;br /&gt;
&lt;br /&gt;
''A note on porting to gfortran or ifort:''&lt;br /&gt;
&lt;br /&gt;
While gfortran and ifort are rather compatible with g77, one &lt;br /&gt;
important difference is that by default, gfortran does not preserve &lt;br /&gt;
local variables between function calls, while g77 does.   Preserved &lt;br /&gt;
local variables are for instance often used in implementations of quasi-random number &lt;br /&gt;
generators.  Proper Fortran requires such variables to be declared SAVE, &lt;br /&gt;
but not all old code does this.&lt;br /&gt;
Luckily, you can change gfortran's default behavior with the flag &lt;br /&gt;
&amp;lt;tt&amp;gt;-fno-automatic&amp;lt;/tt&amp;gt;.   For ifort, the corresponding flag is &amp;lt;tt&amp;gt;-noautomatic&amp;lt;/tt&amp;gt;.&lt;br /&gt;
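&lt;br /&gt;
For example (the file names here are just placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# gfortran: keep local variables static between calls, as g77 did by default&lt;br /&gt;
gfortran -O2 -fno-automatic -o mycode mycode.f&lt;br /&gt;
# ifort: the corresponding flag&lt;br /&gt;
ifort -O2 -noautomatic -o mycode mycode.f&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;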
&lt;br /&gt;
===Where is libg2c.so?===&lt;br /&gt;
&lt;br /&gt;
libg2c.so is part of the g77 compiler, for which we dropped support. See [[#How can I get g77 to work?]] for our reasons.&lt;br /&gt;
&lt;br /&gt;
===Autoparallelization does not work!===&lt;br /&gt;
&lt;br /&gt;
I compiled my code with the &amp;lt;tt&amp;gt;-qsmp=omp,auto&amp;lt;/tt&amp;gt; option, and then I specified that it should be run with 64 threads - with &lt;br /&gt;
 export OMP_NUM_THREADS=64&lt;br /&gt;
&lt;br /&gt;
However, when I check the load using &amp;lt;tt&amp;gt;llq1 -n&amp;lt;/tt&amp;gt;, it shows a load on the node of 1.37.  Why?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Using the autoparallelization will only get you so far.  In fact, it usually does not do too much.  What is helpful is to run the compiler with the &amp;lt;tt&amp;gt;-qreport&amp;lt;/tt&amp;gt; option, and then read the output listing carefully to see where the compiler thought it could parallelize, where it could not, and the reasons for this.  Then you can go back to your code and carefully try to address each of the issues brought up by the compiler.&lt;br /&gt;
We ''emphasize'' that this is just a rough first guide, and that the compilers are still not magical!   For more sophisticated approaches to parallelizing your code, email us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  to set up an appointment with one&lt;br /&gt;
of our technical analysts.&lt;br /&gt;
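&lt;br /&gt;
As a rough illustration (the file name is a placeholder; check the XL compiler documentation for exactly where the listing file ends up), a TCS compile line with the report enabled might look like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# ask the compiler to report what it did and did not manage to parallelize&lt;br /&gt;
xlf_r -O3 -qsmp=omp,auto -qreport -c mycode.f&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;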
&lt;br /&gt;
===How do I link against the Intel Math Kernel Library?===&lt;br /&gt;
&lt;br /&gt;
If you need to link in the Intel Math Kernel Library (MKL) libraries, you are well advised to use the Intel(R) Math Kernel Library Link Line Advisor: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/ for help in devising the list of libraries to link with your code.&lt;br /&gt;
&lt;br /&gt;
'''''Note that this gives the link line for the command line. When using this in Makefiles, replace $MKLPATH by ${MKLPATH}.'''''&lt;br /&gt;
&lt;br /&gt;
'''''Note too that, unless the integer arguments you will be passing to the MKL libraries are actually 64-bit integers, rather than the normal int or INTEGER types, you want to specify 32-bit integers (lp64).'''''&lt;br /&gt;
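&lt;br /&gt;
Purely as an illustration (do not copy this blindly; the Link Line Advisor's output for your compiler and MKL version takes precedence), a sequential, 32-bit-integer (lp64) link line might look like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
icc mycode.c -o mycode -L${MKLPATH} -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;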
&lt;br /&gt;
===Can the compilers on the login nodes be disabled to prevent accidentally using them?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can accomplish this by modifying your .bashrc to not load the compiler modules. See [[Important .bashrc guidelines]].&lt;br /&gt;
&lt;br /&gt;
===&amp;quot;relocation truncated to fit: R_X86_64_PC32&amp;quot;: Huh?===&lt;br /&gt;
&lt;br /&gt;
What does this mean, and why can't I compile this code?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Welcome to the joys of the x86 architecture!  You're probably having trouble building arrays larger than 2GB, individually or together.   Generally, you have to use the medium or large x86 `memory model'.   For the Intel compilers, this is specified with the compile options&lt;br /&gt;
&lt;br /&gt;
  -mcmodel=medium -shared-intel&lt;br /&gt;
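&lt;br /&gt;
For example, with the Intel Fortran compiler (the file names are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ifort -O2 -mcmodel=medium -shared-intel -o bigrun bigarrays.f90&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;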
&lt;br /&gt;
===&amp;quot;feupdateenv is not implemented and will always fail&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
How do I get rid of this and what does it mean?&lt;br /&gt;
 &lt;br /&gt;
'''Answer:'''&lt;br /&gt;
First note that, as ominous as it sounds, this is really just a warning, and has to do with the Intel math library. You can ignore it (unless you really are trying to manually change the exception handlers for floating point exceptions such as divide by zero), or take the safe road and get rid of it by linking with the Intel math functions library:&amp;lt;pre&amp;gt;-limf&amp;lt;/pre&amp;gt;See also [[#How do I link against the Intel Math Kernel Library?]]&lt;br /&gt;
&lt;br /&gt;
===Cannot find rdmacm library when compiling on GPC===&lt;br /&gt;
&lt;br /&gt;
I get the following error building my code on GPC: &amp;quot;&amp;lt;tt&amp;gt;ld: cannot find -lrdmacm&amp;lt;/tt&amp;gt;&amp;quot;.  Where can I find this library?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This library is part of the MPI libraries; if your compiler is having problems picking it up, it probably means you are mistakenly trying to compile on the login nodes (scinet01..scinet04).  The login nodes aren't part of the GPC; they are for logging into the data centre only.  From there you must go to the GPC or TCS development nodes to do any real work.&lt;br /&gt;
&lt;br /&gt;
=== Why do I get this error when I try to compile: &amp;quot;icpc: error #10001: could not find directory in which /usr/bin/g++41 resides&amp;quot; ?===&lt;br /&gt;
&lt;br /&gt;
You are trying to compile on the login nodes.   As described in the wiki (https://support.scinet.utoronto.ca/wiki/index.php/GPC_Quickstart#Login) and in the user's guide you received with your account, SciNet supports two main clusters with very different architectures.  Compilation must be done on the development nodes of the appropriate cluster (in this case, gpc01-04).   Thus, log into gpc01, gpc02, gpc03, or gpc04, and compile from there.&lt;br /&gt;
&lt;br /&gt;
==Testing your Code==&lt;br /&gt;
&lt;br /&gt;
=== Can I run something for a short time on the development nodes? ===&lt;br /&gt;
&lt;br /&gt;
I am in the process of playing around with the MPI calls in my code to get it to work. I do a lot of tests, and each of them takes only a couple of seconds.  Can I do this on the development nodes?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes, as long as it's very brief (a few minutes).   Other people use the development nodes&lt;br /&gt;
for their work, and you don't want to bog the nodes down for them; testing a real&lt;br /&gt;
code can chew up a lot more resources than compiling, etc.    The procedures differ&lt;br /&gt;
depending on which machine you're using.&lt;br /&gt;
&lt;br /&gt;
==== TCS ====&lt;br /&gt;
&lt;br /&gt;
On the TCS you can run small MPI jobs on the tcs02 node, which is meant for &lt;br /&gt;
development use.  But even for this test run on one node, you'll need a host file --&lt;br /&gt;
a list of hosts (in this case, all tcs-f11n06, which is the `real' name of tcs02)&lt;br /&gt;
that the job will run on.  Create a file called `hostfile' containing the following:&lt;br /&gt;
&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
&lt;br /&gt;
for a 4-task run.  When you invoke &amp;quot;poe&amp;quot; or &amp;quot;mpirun&amp;quot;, there are runtime&lt;br /&gt;
arguments that you specify pointing to this file.  You can also specify it&lt;br /&gt;
in an environment variable MP_HOSTFILE, so, if your file is in your /scratch directory, say &lt;br /&gt;
${SCRATCH}/hostfile, then you would do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 export MP_HOSTFILE=${SCRATCH}/hostfile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
in your shell.  You will also need to create a &amp;lt;tt&amp;gt;.rhosts&amp;lt;/tt&amp;gt; file in your &lt;br /&gt;
home directory, again listing &amp;lt;tt&amp;gt;tcs-f11n06&amp;lt;/tt&amp;gt; so that &amp;lt;tt&amp;gt;poe&amp;lt;/tt&amp;gt;&lt;br /&gt;
can start jobs.   After that you can simply run your program.  You can use&lt;br /&gt;
mpiexec:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 mpiexec -n 4 my_test_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
adding &amp;lt;tt&amp;gt; -hostfile /path/to/my/hostfile&amp;lt;/tt&amp;gt; if you did not set the environment&lt;br /&gt;
variable above.  Alternatively, you can run it with the poe command (do a &amp;quot;man poe&amp;quot; for details), or even by&lt;br /&gt;
just directly running it.  In this case the number of MPI processes will by default&lt;br /&gt;
be the number of entries in your hostfile.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPC ====&lt;br /&gt;
&lt;br /&gt;
On the GPC, one can run short test jobs on the [[GPC_Quickstart#Compile.2FDevel_Nodes | development nodes ]] &amp;lt;tt&amp;gt;gpc01&amp;lt;/tt&amp;gt;..&amp;lt;tt&amp;gt;gpc04&amp;lt;/tt&amp;gt;;&lt;br /&gt;
if they are single-node jobs (which they should be) they don't need a hostfile.  Even better, though, is to request an [[ Moab#Interactive | interactive ]] job and run the tests either in the regular batch queue or in the short, high-availability [[ Moab#debug | debug ]] queue that is reserved for this purpose.&lt;br /&gt;
&lt;br /&gt;
=== How do I run a longer (but still shorter than an hour) test job quickly ? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer'''&lt;br /&gt;
&lt;br /&gt;
On the GPC there is a high turnover short queue called [[ Moab#debug | debug ]] that is designed for&lt;br /&gt;
this purpose.  You can use it by adding &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -q debug&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
to your submission script.&lt;br /&gt;
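&lt;br /&gt;
You can also select the debug queue directly on the qsub command line, or combine it with an interactive session (the script name is a placeholder):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -q debug -l nodes=1:ppn=8,walltime=0:45:00 myscript.sh&lt;br /&gt;
qsub -q debug -l nodes=1:ppn=8,walltime=0:45:00 -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;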
&lt;br /&gt;
==Running your jobs==&lt;br /&gt;
&lt;br /&gt;
===My job can't write to /home===&lt;br /&gt;
&lt;br /&gt;
My code works fine when I test on the development nodes, but when I submit a job, or even run interactively in the development queue on GPC, it fails.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
As [[Data_Management#Home_Disk_Space | discussed]] [https://support.scinet.utoronto.ca/wiki/images/5/54/SciNet_Tutorial.pdf elsewhere], &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; is mounted read-only on the compute nodes; you can only write to &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; from the login nodes and devel nodes.  (The [[GPC_Quickstart#128Glargemem | largemem nodes]] on GPC, in this respect, are more like devel nodes than compute nodes).   In general, to run jobs you can read from &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; but you'll have to write to &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt; (or, if you were allocated space through the RAC process, on &amp;lt;tt&amp;gt;/project&amp;lt;/tt&amp;gt;).  More information on SciNet filesytems can be found on our [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===Error Submitting My Job: qsub: Bad UID for job execution MSG=ruserok failed ===&lt;br /&gt;
&lt;br /&gt;
I write up a submission script as in the examples, but when I attempt to submit the job, I get the above error.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This error will occur if you try to submit a job from the login nodes.   The login nodes are the gateway to all of SciNet's systems (GPC, TCS, P7, ARC), which have different hardware and queueing systems.  To submit a job, you must log into a development node for the particular cluster you are submitting to and submit from there.&lt;br /&gt;
&lt;br /&gt;
===OpenMP on the TCS===&lt;br /&gt;
&lt;br /&gt;
How do I run an OpenMP job on the TCS?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[TCS_Quickstart#Submission_Script_for_an_OpenMP_Job | TCS Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===Can I use hybrid codes consisting of MPI and OpenMP on the GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes. Please look at the [[GPC_Quickstart#Hybrid_MPI.2FOpenMP_jobs | GPC Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I run serial jobs on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
So it should be said first that SciNet is a parallel computing resource, &lt;br /&gt;
and our priority will always be parallel jobs.   Having said that, if &lt;br /&gt;
you can make efficient use of the resources using serial jobs and get &lt;br /&gt;
good science done, that's good too, and we're happy to help you.&lt;br /&gt;
&lt;br /&gt;
The GPC nodes each have 8 processing cores, and making efficient use of these &lt;br /&gt;
nodes means using all eight cores.  As a result, we'd like to have the &lt;br /&gt;
users take up whole nodes (e.g., run jobs in multiples of 8) at a time.  &lt;br /&gt;
&lt;br /&gt;
It depends on the nature of your job what the best strategy is. Several approaches are presented on the [[User_Serial|serial run wiki page]].&lt;br /&gt;
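&lt;br /&gt;
For instance, a minimal sketch of the whole-node approach (the serial run page has more robust versions; &amp;lt;tt&amp;gt;./myprog&amp;lt;/tt&amp;gt; and the run directories are placeholders) simply starts 8 serial processes in the background, one per core, and waits for all of them:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
# launch 8 independent serial runs, one per core, then wait for all of them to finish&lt;br /&gt;
for i in $(seq 1 8); do&lt;br /&gt;
  (cd run$i &amp;amp;&amp;amp; ./myprog &amp;gt; out.txt) &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;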
&lt;br /&gt;
===Why can't I request only a single cpu for my job on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
On the GPC, compute resources are allocated by the node - that is, in chunks of 8 processors.   If you want to run jobs that each require only one processor, you need to bundle them into groups of 8, so as not to waste the other 7 cores for up to 48 hours. See the [[User_Serial|serial run wiki page]].&lt;br /&gt;
&lt;br /&gt;
===How do I run serial jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''': You don't.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===But in the queue I found a user who is running jobs on GPC, each of which is using only one processor, so why can't I?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
The pradat* and atlaspt* jobs, amongst others, are jobs of the ATLAS high energy physics project. That they are reported as single-cpu jobs is an artifact of the Moab scheduler. They are in fact automatically bundled into groups of 8, but have to run individually to be compatible with their international grid-based systems.&lt;br /&gt;
&lt;br /&gt;
===How do I use the ramdisk on GPC?===&lt;br /&gt;
&lt;br /&gt;
To use the ramdisk, create, write to, and read from files in /dev/shm just as one would in (e.g.) ${SCRATCH}. Only the amount of RAM needed to store the files will be taken up by the temporary file system; thus if you have 8 serial jobs each requiring 1 GB of RAM, and 1GB is taken up by various OS services, you would still have approximately 7GB available to use as ramdisk on a 16GB node. However, if you were to write 8 GB of data to the RAM disk, this would exceed the available memory and your job would likely crash.&lt;br /&gt;
&lt;br /&gt;
It is very important to delete your files from ram disk at the end of your job. If you do not do this, the next user to use that node will have less RAM available than they might expect, and this might kill their jobs.&lt;br /&gt;
&lt;br /&gt;
''More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].''&lt;br /&gt;
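&lt;br /&gt;
As a minimal sketch (program and file names are placeholders; see the Ramdisk page for the full recommended setup), the relevant part of a job script might stage data into the ramdisk, run there, copy the results back, and clean up:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# stage input into the ramdisk and run from there&lt;br /&gt;
mkdir -p /dev/shm/$USER&lt;br /&gt;
cp $SCRATCH/input.dat /dev/shm/$USER/&lt;br /&gt;
cd /dev/shm/$USER&lt;br /&gt;
$SCRATCH/mycode input.dat output.dat&lt;br /&gt;
# copy results back to scratch and always clean up the ramdisk&lt;br /&gt;
cp output.dat $SCRATCH/&lt;br /&gt;
cd $SCRATCH&lt;br /&gt;
rm -rf /dev/shm/$USER&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;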
&lt;br /&gt;
===How can I automatically resubmit a job?===&lt;br /&gt;
&lt;br /&gt;
Commonly you may have a job that you know will take longer to run than what is &lt;br /&gt;
permissible in the queue.  As long as your program contains [[Checkpoints|checkpoint]] or &lt;br /&gt;
restart capability, you can have one job automatically submit the next. In&lt;br /&gt;
the following example it is assumed that the program finishes before &lt;br /&gt;
the 48 hour limit and then resubmits itself by logging into one&lt;br /&gt;
of the development nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example submission script for auto resubmission&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code&lt;br /&gt;
&lt;br /&gt;
# RESUBMIT 10 TIMES HERE&lt;br /&gt;
# default NUM to 0 if it was not passed in on the very first submission&lt;br /&gt;
num=${NUM:-0}&lt;br /&gt;
if [ $num -lt 10 ]; then&lt;br /&gt;
      num=$(($num+1))&lt;br /&gt;
      ssh gpc01 &amp;quot;cd $PBS_O_WORKDIR; qsub ./script_name.sh -v NUM=$num&amp;quot;;&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub script_name.sh -v NUM=0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can alternatively use [[ Moab#Job_Dependencies | Job dependencies ]] through the queuing system which will not start one job until another job has completed.&lt;br /&gt;
&lt;br /&gt;
If your job can't be made to automatically stop before the 48 hour queue window, but it does write out checkpoints, you can use the timeout command to stop the program while you still have time to resubmit; for instance&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
    timeout 2850m ./run_my_code argument1 argument2&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
will run the program for 47.5 hours (2850 minutes), and then send it SIGTERM to exit the program.&lt;br /&gt;
&lt;br /&gt;
===How can I pass in arguments to my submission script?===&lt;br /&gt;
&lt;br /&gt;
If you wish to make your scripts more generic, you can use qsub's ability &lt;br /&gt;
to pass environment variables into your script as arguments.&lt;br /&gt;
The following example shows a case where an input and an output &lt;br /&gt;
file are passed in on the qsub line. Multiple variables can be &lt;br /&gt;
passed in using the qsub &amp;quot;-v&amp;quot; option, comma-delimited. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example of passing in arguments&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
# &lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code -f $INFILE -o $OUTFILE&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub script_name.sh -v INFILE=input.txt,OUTFILE=outfile.txt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== How can I run a job longer than 48 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The SciNet queues have a walltime limit of 48 hours.   This is pretty typical for systems of this size in Canada and elsewhere, and larger systems commonly have shorter limits.   The limits are there to ensure that every user gets a fair share of the system (so that no one user ties up lots of nodes for a long time), and for safety (so that if one memory board in one node fails in the middle of a very long job, you haven't lost a month's worth of work).&lt;br /&gt;
&lt;br /&gt;
Since many of us have simulations that require more than that much time, most widely-used scientific applications have &amp;quot;checkpoint-restart&amp;quot; functionality, where every so often the complete state of the calculation is stored as a checkpoint file, and one can restart a simulation from one of these.   In fact, these restart files tend to be quite useful for a number of purposes.&lt;br /&gt;
&lt;br /&gt;
If your job will take longer, you will have to submit your job in multiple parts, restarting from a checkpoint each time.  In this way, one can run a simulation much longer than the queue limit.  In fact, one can even write job scripts which automatically re-submit themselves until a run is completed, using [[FAQ#How_can_I_automatically_resubmit_a_job.3F | automatic resubmission. ]]&lt;br /&gt;
&lt;br /&gt;
=== Why did showstart say it would take 3 hours for my job to start before, and now it says my job will start in 10 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[FAQ#How_do_priorities_work.2Fwhy_did_that_job_jump_ahead_of_mine_in_the_queue.3F | How do priorities work/why did that job jump ahead of mine in the queue? ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do priorities work/why did that job jump ahead of mine in the queue?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The [[Moab | queueing system]] used on SciNet machines is a [http://en.wikipedia.org/wiki/Priority_queue Priority Queue].  Jobs enter the queue at the back of the queue, and slowly make their way to the front as those ahead of them are run; but a job that enters the queue with a higher priority can `cut in line'.&lt;br /&gt;
&lt;br /&gt;
The main factor which determines priority is whether or not the user (or their PI) has an [http://wiki.scinethpc.ca/wiki/index.php/Application_Process RAC allocation].  These are competitively allocated grants of computer time; there is a call for proposals towards the end of every calendar year.    Users with an allocation have high priorities in an attempt to make sure that they can use the amount of computer time the committees granted them.   Their priority decreases as they approach their allotted usage over the current window of time; by the time that they have exhausted that allotted usage, their priority is the same as users with no allocation (unallocated, or `default' users).    Unallocated users have a fixed, low, priority.&lt;br /&gt;
&lt;br /&gt;
This priority system is called `fairshare'; the scheduler attempts to make sure everyone has their fair share of the machines, where the share that's fair has been determined by the allocation committee.    The fairshare window is a rolling window of two weeks; that is, any time you have a job in the queue, the fairshare calculation of its priority is given by how much of your allocation of the machine has been used in the last 14 days.&lt;br /&gt;
&lt;br /&gt;
A particular allocation might have some fraction of GPC - say 4% of the machine (if the PI had been allocated 10 million CPU hours on GPC). The allocations have labels (called `Resource Allocation Proposal Identifiers', or RAPIs); they look something like&lt;br /&gt;
&lt;br /&gt;
  abc-123-ab&lt;br /&gt;
&lt;br /&gt;
where abc-123 is the PI's CCRI, and the suffix specifies which of the allocations granted to the PI is to be used.  These can be specified on a job-by-job basis.  On GPC, one adds the line&lt;br /&gt;
 #PBS -A RAPI&lt;br /&gt;
to your script; on TCS, one uses&lt;br /&gt;
 # @ account_no = RAPI&lt;br /&gt;
If the allocation to charge isn't specified, a default is used; each user has such a default, which can be changed at the same portal where one changes one's password, &lt;br /&gt;
&lt;br /&gt;
 https://portal.scinet.utoronto.ca/&lt;br /&gt;
&lt;br /&gt;
A job's priority is determined primarily by the fairshare priority of the allocation it is being charged to; the previous 14 days' worth of use under that allocation is calculated and compared to the allocated fraction (here, 4%) of the machine over that window (here, 14 days).   The fairshare priority is a decreasing function of the allocation left; if there is no allocation left (eg, jobs running under that allocation have already used 379,038 CPU hours in the past 14 days), the priority is the same as that of a user with no granted allocation.   (This last part has been the topic of some debate; as the machine gets more utilized, it will probably be the case that we allow RAC users who have greatly overused their quota to have their priorities drop below those of unallocated users, to give the unallocated users some chance to run on our increasingly crowded system; this would have no undue effect on our allocated users as they still would be able to use the amount of resources they had been allocated by the committees.)   Note that all jobs charging the same allocation get the same fairshare priority.&lt;br /&gt;
&lt;br /&gt;
There are other factors that go into calculating priority, but fairshare is the most significant.   Other factors include&lt;br /&gt;
* amount of time waiting in queue (measured in units of the requested runtime). A waiting queue job gains priority as it sits in the queue to avoid job starvation. &lt;br /&gt;
* User adjustment of priorities ( See below ).&lt;br /&gt;
&lt;br /&gt;
The major effect of these subdominant terms is to shuffle the order of jobs running under the same allocation.&lt;br /&gt;
&lt;br /&gt;
===How do we manage job priorities within our research group?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Obviously, managing shared resources within a large group - whether it &lt;br /&gt;
is conference funding or CPU time - takes some doing.   &lt;br /&gt;
&lt;br /&gt;
It's important to note that the fairshare periods are intentionally kept &lt;br /&gt;
quite short - just two weeks long. So, for example, let us say that in your resource &lt;br /&gt;
allocation you have about 10% of the machine.   Then for someone to use &lt;br /&gt;
up the whole two week amount of time in 2 days, they'd have to use 70% &lt;br /&gt;
of the machine in those two days - which is unlikely to happen by &lt;br /&gt;
accident.  If that does happen,  &lt;br /&gt;
those using the same allocation as the person who used 70% of the &lt;br /&gt;
machine over the two days will suffer by having much lower priority for &lt;br /&gt;
their jobs, but only for the next 12 days - and even then, if there are &lt;br /&gt;
idle cpus they'll still be able to compute.&lt;br /&gt;
&lt;br /&gt;
There will be online tools for seeing how the allocation is being used, &lt;br /&gt;
and those people who are in charge in your group will be able to use &lt;br /&gt;
that information to manage the users, telling them to dial it down or &lt;br /&gt;
up.   We know that managing a large research group is hard, and we want &lt;br /&gt;
to make sure we provide you the information you need to do your job &lt;br /&gt;
effectively.&lt;br /&gt;
&lt;br /&gt;
One way for users within a group to manage their priorities within the group&lt;br /&gt;
is with [[Moab#Adjusting_Job_Priority | user-adjusted priorities]]; this is&lt;br /&gt;
described in more detail on the [[Moab | Scheduling System]] page.&lt;br /&gt;
&lt;br /&gt;
=== How do I charge jobs to my RAC allocation? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see the [[Moab#Accounting|accounting section of Moab page]].&lt;br /&gt;
&lt;br /&gt;
=== How does one check the amount of used CPU-hours in a project, and how does one get statistics for each user in the project? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This information is available on the SciNet portal, https://portal.scinet.utoronto.ca. See also [[SciNet Usage Reports]].&lt;br /&gt;
&lt;br /&gt;
==Monitoring jobs in the queue==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Why hasn't my job started?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Use the moab command &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
checkjob -v jobid&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and the last couple of lines should explain why a job hasn't started.  &lt;br /&gt;
&lt;br /&gt;
Please see [[Moab| Job Scheduling System (Moab) ]] for more detailed information&lt;br /&gt;
&lt;br /&gt;
===How do I figure out when my job will run?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Moab#Available_Resources| Job Scheduling System (Moab) ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- ===My GPC job is Held, and checkjob says &amp;quot;Batch:PolicyViolation&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When this happens, you'll see your job stuck in a BatchHold state.  &lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues, and is being held until you modify it or kill it and re-submit a conforming job.  The most common problems are:&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===I submit my GPC job, and I get an email saying it was rejected===&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues and is rejected. An email&lt;br /&gt;
is sent with the JOBID, JOBNAME, and the reason it was rejected.  The following is an example where a job&lt;br /&gt;
requested more than 48 hours and was rejected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462493.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'wclimit too high for class 'batch_ib' (345600 &amp;gt; 172800)'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Jobs on the TCS or GPC may only run for 48 hours at a time; this restriction greatly increases responsiveness of the queue and queue throughput for all our users.  If your computation requires longer than that, as many do, you will have to [[ Checkpoints | checkpoint ]] your job and restart it after each 48-hour queue window.   You can manually re-submit jobs, or if you can have your job cleanly exit before the 48 hour window, there are ways to [[ FAQ#How_can_I_automatically_resubmit_a_job.3F | automatically resubmit jobs ]].&lt;br /&gt;
&lt;br /&gt;
Other rejections return a more cryptic error saying &amp;quot;job violates class configuration&amp;quot; such as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462409.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'user required by class 'batch''&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The most common problems that result in this error are:&lt;br /&gt;
&lt;br /&gt;
* '''Incorrect number of processors per node''': Jobs on the GPC are scheduled per-node not per-core and since each node has 8 processor cores (ppn=8) the smallest job allowed is one node with 8 cores (nodes=1:ppn=8).  For serial jobs users must bundle or batch them together in groups of 8. See [[ FAQ#How_do_I_run_serial_jobs_on_GPC.3F | How do I run serial jobs on GPC? ]]&lt;br /&gt;
* '''No number of nodes specified''': Jobs submitted to the main queue must request a specific number of nodes, either in the submission script (with a line like &amp;lt;tt&amp;gt;#PBS -l nodes=2:ppn=8&amp;lt;/tt&amp;gt;) or on the command line (eg, &amp;lt;tt&amp;gt;qsub -l nodes=2:ppn=8,walltime=5:00:00 script.pbs&amp;lt;/tt&amp;gt;).  Note that for the debug queue, you can get away without specifying a number of nodes and a default of one will be assigned; for both technical and policy reasons, we do not enforce such a default for the main (&amp;quot;batch&amp;quot;) queue.&lt;br /&gt;
* '''There is a 15-minute walltime minimum''' on all queues except debug; if you request less walltime than this, the job will be rejected.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Running checkjob on my job gives me messages about JobFail and rejected===&lt;br /&gt;
&lt;br /&gt;
Running checkjob on my job gives me messages that suggest my job has failed, as below: what did I do wrong?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AName: test&lt;br /&gt;
State: Idle &lt;br /&gt;
Creds:  user:xxxxxx  group:xxxxxxxx  account:xxxxxxxx  class:batch_ib  qos:ibqos&lt;br /&gt;
WallTime:   00:00:00 of 8:00:00&lt;br /&gt;
BecameEligible: Wed Jul 23 10:39:27&lt;br /&gt;
SubmitTime: Wed Jul 23 10:38:22&lt;br /&gt;
  (Time Queued  Total: 00:01:47  Eligible: 00:01:05)&lt;br /&gt;
&lt;br /&gt;
Total Requested Tasks: 8&lt;br /&gt;
&lt;br /&gt;
Req[0]  TaskCount: 8  Partition: ALL  &lt;br /&gt;
Opsys: centos6computeA  Arch: ---  Features: ---&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Notification Events: JobFail&lt;br /&gt;
&lt;br /&gt;
IWD:            /scratch/x/xxxxxxxx/xxxxxxx/xxxxxxx&lt;br /&gt;
Partition List: torque,DDR&lt;br /&gt;
Flags:          RESTARTABLE&lt;br /&gt;
Attr:           checkpoint&lt;br /&gt;
StartPriority:  76&lt;br /&gt;
rejected for Opsys        - (null)&lt;br /&gt;
rejected for State        - (null)&lt;br /&gt;
rejected for Reserved     - (null)&lt;br /&gt;
NOTE:  job req cannot run in partition torque (available procs do not meet requirements : 0 of 8 procs found)&lt;br /&gt;
idle procs: 793  feasible procs:   0&lt;br /&gt;
&lt;br /&gt;
Node Rejection Summary: [Opsys: 117][State: 2895][Reserved: 19]&lt;br /&gt;
&lt;br /&gt;
NOTE:  job violates constraints for partition SANDY (partition SANDY not in job partition mask)&lt;br /&gt;
&lt;br /&gt;
NOTE:  job violates constraints for partition GRAVITY (partition GRAVITY not in job partition mask)&lt;br /&gt;
&lt;br /&gt;
rejected for State        - (null)&lt;br /&gt;
NOTE:  &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The output from checkjob is a little cryptic in places, and if you are wondering why your job hasn't started yet, you might think that &amp;quot;rejection&amp;quot; and &amp;quot;JobFail&amp;quot; suggest that there's something wrong.  But the above message is actually normal; you can use the &amp;lt;tt&amp;gt;showstart&amp;lt;/tt&amp;gt; command on your job to get a (preliminary, subject to change) estimate as to when the job will start, and you'll find that it is in fact scheduled to start up in the near future.&lt;br /&gt;
&lt;br /&gt;
In the above message:&lt;br /&gt;
&lt;br /&gt;
* `Notification Events: JobFail` just means that, if notifications are enabled, you'll get a message if the job fails;&lt;br /&gt;
* `job req cannot run in partition torque` just means that the job cannot run just yet (that's why it's queued);&lt;br /&gt;
* `job req cannot run in dynamic partition DDR now (insufficient procs available: 0 &amp;lt; 8)` says why: there aren't processors available; and&lt;br /&gt;
* `job violates constraints for partition SANDY/GRAVITY` just means that the job isn't eligible to run in those particular (small) sections of the cluster.&lt;br /&gt;
&lt;br /&gt;
that is, the above output is the normal and expected (if somewhat cryptic) explanation as to why the job is waiting - nothing to worry about.&lt;br /&gt;
&lt;br /&gt;
===How can I monitor my running jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
How can I monitor the load of TCS jobs?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can get more information with the command &lt;br /&gt;
 /xcat/tools/tcs-scripts/LL/jobState.sh&lt;br /&gt;
which I alias as:&lt;br /&gt;
 alias llq1='/xcat/tools/tcs-scripts/LL/jobState.sh'&lt;br /&gt;
If you run &amp;quot;llq1 -n&amp;quot; you will see a listing of jobs together with a lot of information, including the load.&lt;br /&gt;
&lt;br /&gt;
==Errors in running jobs==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
===On GPC, `Job cannot be executed'===&lt;br /&gt;
&lt;br /&gt;
I get error messages like this trying to run on GPC:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
Aborted by PBS Server &lt;br /&gt;
Job cannot be executed&lt;br /&gt;
See Administrator for help&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node 'gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0' for job 30414.gpc-sched&lt;br /&gt;
&lt;br /&gt;
Unable to copy file 30414.gpc-sched.OU to USER@gpc-f101n084.scinet.local:/scratch/G/GROUP/USER/projects/sim-performance-test/runtime/l/namd/8/namd.o30414&lt;br /&gt;
*** error from copy&lt;br /&gt;
30414.gpc-sched.OU: No such file or directory&lt;br /&gt;
*** end error output&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Try doing the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir ${SCRATCH}/.pbs_spool&lt;br /&gt;
ln -s ${SCRATCH}/.pbs_spool ~/.pbs_spool&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is how all new accounts are setup on SciNet.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; on GPC for compute jobs is mounted as a read-only file system.   &lt;br /&gt;
PBS by default tries to spool its output  files to &amp;lt;tt&amp;gt;${HOME}/.pbs_spool&amp;lt;/tt&amp;gt;&lt;br /&gt;
which fails as it tries to write to a read-only file  &lt;br /&gt;
system.    New accounts at SciNet  get around this by having ${HOME}/.pbs_spool  &lt;br /&gt;
point to somewhere appropriate on &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, but if you've deleted that link&lt;br /&gt;
or directory, or had an old account, you will see errors like the above.&lt;br /&gt;
&lt;br /&gt;
'''On Feb 24, the input/output mechanism has been reconfigured to use a local ramdisk as the temporary location, which means that .pbs_spool is no longer needed and this error should not occur anymore.'''&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== I couldn't find the  .o output file in the .pbs_spool directory as I used to ===&lt;br /&gt;
&lt;br /&gt;
On Feb 24 2011, the temporary location of standard input and output files was moved from the shared file system ${SCRATCH}/.pbs_spool to the&lt;br /&gt;
node-local directory /var/spool/torque/spool (which resides in ram). The final location after a job has finished is unchanged,&lt;br /&gt;
but to check the output/error of running jobs, users will now have to ssh into the (first) node assigned to the job and look in&lt;br /&gt;
/var/spool/torque/spool.&lt;br /&gt;
&lt;br /&gt;
This alleviates access contention to the temporary directory, especially for those users that are running a lot of jobs, and  reduces the burden on the file system in general.&lt;br /&gt;
&lt;br /&gt;
Note that it is good practice to redirect output to a file rather than to count on the scheduler to do this for you.&lt;br /&gt;
&lt;br /&gt;
=== My GPC job died, telling me `Copy Stageout Files Failed' ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When a job runs on GPC, the script's standard output and error are redirected to &lt;br /&gt;
&amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.OU&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.ER&amp;lt;/tt&amp;gt; in&lt;br /&gt;
/var/spool/torque/spool on the (first) node on which your job is running.  At the end of the job, those .OU and .ER files are copied to where the batch script tells them to be copied, by default &amp;lt;tt&amp;gt;$PBS_JOBNAME.o$PBS_JOBID&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBNAME.e$PBS_JOBID&amp;lt;/tt&amp;gt;.   (You can set those filenames to be something clearer with the -e and -o options in your PBS script.)&lt;br /&gt;
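&lt;br /&gt;
For example, a minimal sketch of the relevant directives (the filenames here are just placeholders; pick whatever suits your run):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
#PBS -o my_job.out&lt;br /&gt;
#PBS -e my_job.err&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;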
&lt;br /&gt;
When you get errors like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
it means that the copying-back process has failed in some way.  There could be a few reasons for this. The first thing to do is to '''make sure that your .bashrc does not produce any output''', as the output stageout is performed by bash and extra output can cause it to fail.&lt;br /&gt;
But it could also have just been a random filesystem error, or it could be that your job failed spectacularly enough to short-circuit the normal job-termination process, so that those files just never got copied.&lt;br /&gt;
&lt;br /&gt;
Write to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] if your input/output files got lost, as we will probably be able to retrieve them for you (please supply at least the jobid, and any other information that may be relevant). &lt;br /&gt;
&lt;br /&gt;
Note that it is good practice to redirect output to a file rather than depending on the job scheduler to do this for you.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&lt;br /&gt;
===Another transport will be used instead===&lt;br /&gt;
&lt;br /&gt;
I get error messages like the following when running on the GPC at the start of the run, although the job seems to proceed OK.   Is this a problem?&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
[[45588,1],0]: A high-performance Open MPI point-to-point messaging module&lt;br /&gt;
was unable to find any relevant network interfaces:&lt;br /&gt;
&lt;br /&gt;
Module: OpenFabrics (openib)&lt;br /&gt;
  Host: gpc-f101n005&lt;br /&gt;
&lt;br /&gt;
Another transport will be used instead, although this may result in&lt;br /&gt;
lower performance.&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Everything's fine.   The two MPI libraries SciNet provides work for both the InfiniBand and the Gigabit Ethernet interconnects, and will always try to use the fastest interconnect available.   In this case, you ran on normal gigabit GPC nodes with no infiniband; but the MPI libraries have no way of knowing this, and try the infiniband first anyway.  This is just a harmless `failover' message; it tried to use the infiniband, which doesn't exist on this node, then fell back on using Gigabit ethernet (`another transport').&lt;br /&gt;
&lt;br /&gt;
With OpenMPI, this can be avoided by not looking for infiniband; eg, by using the option&lt;br /&gt;
&lt;br /&gt;
--mca btl ^openib&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===IB Memory Errors, eg &amp;lt;tt&amp;gt; reg_mr Cannot allocate memory &amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
Infiniband requires more memory than ethernet; it can use RDMA (remote direct memory access) transport for which it sets aside registered memory to transfer data.&lt;br /&gt;
&lt;br /&gt;
In our current network configuration, it requires a ''lot'' more memory, particularly as you go to larger process counts; unfortunately, that means you can't get around the &amp;quot;I need more memory&amp;quot; problem the usual way, by running on more nodes.   Machines with different memory or &lt;br /&gt;
network configurations may exhibit this problem at higher or lower MPI &lt;br /&gt;
task counts.&lt;br /&gt;
&lt;br /&gt;
Right now, the best workaround is to reduce the number and size of OpenIB queues, using XRC: with OpenMPI, add the following options to your mpirun command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-mca btl_openib_receive_queues X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32 -mca btl_openib_max_send_size 12288&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
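&lt;br /&gt;
For instance, a full command line might look like the following sketch (the executable name and process count are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mpirun -np 256 \&lt;br /&gt;
  -mca btl_openib_receive_queues X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32 \&lt;br /&gt;
  -mca btl_openib_max_send_size 12288 ./mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;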
&lt;br /&gt;
With Intel MPI, you should be able to do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load intelmpi/4.0.3.008&lt;br /&gt;
mpirun -genv I_MPI_FABRICS=shm:ofa  -genv I_MPI_OFA_USE_XRC=1 -genv I_MPI_OFA_DYNAMIC_QPS=1 -genv I_MPI_DEBUG=5 -np XX ./mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
to the same end.  &lt;br /&gt;
&lt;br /&gt;
For more information see [[GPC MPI Versions]].&lt;br /&gt;
&lt;br /&gt;
===My compute job fails, saying &amp;lt;tt&amp;gt;libpng12.so.0: cannot open shared object file&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;libjpeg.so.62: cannot open shared object file&amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To maximize the amount of memory available for compute jobs, the compute nodes have a less complete system image than the development nodes.   In particular, since interactive graphics libraries like matplotlib and gnuplot are usually used interactively, the libraries for their use are included in the devel nodes' image but not the compute nodes.&lt;br /&gt;
&lt;br /&gt;
Many of these extra libraries are, however, available in the &amp;quot;extras&amp;quot; module.   So adding a &amp;quot;module load extras&amp;quot; to your job submission  script - or, for overkill, to your .bashrc - should enable these scripts to run on the compute nodes.&lt;br /&gt;
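&lt;br /&gt;
A minimal sketch of where that line would go in a submission script (the program name is just a placeholder):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
module load extras&lt;br /&gt;
./my_plotting_script&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;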
&lt;br /&gt;
==Data on SciNet disks==&lt;br /&gt;
&lt;br /&gt;
===How do I find out my disk usage?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The standard unix/linux utilities for finding the amount of disk space used by a directory are very slow, and notoriously inefficient on the GPFS filesystems that we run on the SciNet systems.  There are utilities that very quickly report your disk usage:&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available with the 'extras' module on the login nodes, datamovers and the GPC devel nodes, provides information in a number of ways on the home, scratch, and project file systems. For instance, how much disk space is being used by yourself and your group (with the -a option), or how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;) or you may generate plots of your usage over time.&lt;br /&gt;
This information is only updated hourly!&lt;br /&gt;
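&lt;br /&gt;
For example, on a devel node (the &amp;lt;tt&amp;gt;-a&amp;lt;/tt&amp;gt; option shows your group's usage as well as your own):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load extras&lt;br /&gt;
$ diskUsage&lt;br /&gt;
$ diskUsage -a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;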
&lt;br /&gt;
More information about these filesystems is available on the [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I transfer data to/from SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
All incoming connections to SciNet go through relatively low-speed connections to the &amp;lt;tt&amp;gt;login.scinet&amp;lt;/tt&amp;gt; gateways, so using scp to copy files the same way you ssh in is not an effective way to move lots of data.  Better tools are described in our page on [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
&lt;br /&gt;
===My group works with data files of size 1-2 GB.  Is this too large to  transfer by scp to login.scinet.utoronto.ca ?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Generally, occasional transfers of data of less than 10GB are perfectly acceptable to go through the login nodes. See [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
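&lt;br /&gt;
For a transfer of that size, plain scp through the login nodes is fine; a minimal sketch (the filename, username and target path are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
scp mydata.tar.gz USERNAME@login.scinet.utoronto.ca:/path/to/your/scratch/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;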
&lt;br /&gt;
===How can I check if I have files in /scratch that are scheduled for automatic deletion?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Storage_Quickstart#Scratch_Disk_Purging_Policy | Storage At SciNet]]&lt;br /&gt;
&lt;br /&gt;
===How to allow my supervisor to manage files for me using ACL-based commands?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Data_Management#File.2FOwnership_Management_.28ACL.29 | File/Ownership Management]]&lt;br /&gt;
&lt;br /&gt;
===Can we buy extra storage space on SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
Yes, please see [[Data_Management#Buying_storage_space_on_GPFS_or_HPSS | Buying storage space on GPFS or HPSS ]] for more details.&lt;br /&gt;
&lt;br /&gt;
===Can I transfer files between BGQ and HPSS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
Yes, please see [https://support.scinet.utoronto.ca/wiki/index.php/BGQ#Bridge_to_HPSS Bridge to HPSS ]  for more details.&lt;br /&gt;
&lt;br /&gt;
==Keep 'em Coming!==&lt;br /&gt;
&lt;br /&gt;
===Next question, please===&lt;br /&gt;
&lt;br /&gt;
Send your question to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;];  we'll answer it asap!&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=7165</id>
		<title>FAQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=7165"/>
		<updated>2014-08-20T15:45:38Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* I submit my GPC job, and I get an email saying it was rejected */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==The Basics==&lt;br /&gt;
===Whom do I contact for support?===&lt;br /&gt;
&lt;br /&gt;
Whom do I contact if I have problems or questions about how to use the SciNet systems?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
E-mail [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  &lt;br /&gt;
&lt;br /&gt;
In your email, please include the following information:&lt;br /&gt;
&lt;br /&gt;
* your username on SciNet&lt;br /&gt;
* the cluster that your question pertains to (GPC or TCS; SciNet is not a cluster!),&lt;br /&gt;
* any relevant error messages&lt;br /&gt;
* the commands you typed before the errors occurred&lt;br /&gt;
* the path to your code (if applicable)&lt;br /&gt;
* the location of the job scripts (if applicable)&lt;br /&gt;
* the directory from which it was submitted (if applicable)&lt;br /&gt;
* a description of what it is supposed to do (if applicable)&lt;br /&gt;
* if your problem is about connecting to SciNet, the type of computer you are connecting from.&lt;br /&gt;
&lt;br /&gt;
Note that your password should never, never, never be sent to us, even if your question is about your account.&lt;br /&gt;
&lt;br /&gt;
Try to avoid sending email only to specific individuals at SciNet. Your chances of a quick reply increase significantly if you email our team!&lt;br /&gt;
&lt;br /&gt;
===What does ''code scaling'' mean?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Parallel_Speedup|A Performance Primer]]&lt;br /&gt;
&lt;br /&gt;
===What do you mean by ''throughput''?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Throughput|A Performance Primer]].&lt;br /&gt;
&lt;br /&gt;
Here is a simple example:&lt;br /&gt;
&lt;br /&gt;
Suppose you need to do 10 computations.  Say each of these runs for&lt;br /&gt;
1 day on 8 cores, but they take &amp;quot;only&amp;quot; 18 hours on 16 cores.  What is the&lt;br /&gt;
fastest way to get all 10 computations done - as 8-core jobs or as&lt;br /&gt;
16-core jobs?  Let us assume you have 2 nodes at your disposal.&lt;br /&gt;
The answer, after some simple arithmetic: with two 8-core nodes you can run two&lt;br /&gt;
8-core jobs at once, so the 10 jobs finish in 5 rounds of 1 day each, i.e. 5 days;&lt;br /&gt;
a 16-core job needs both nodes, so the 10 jobs run one after another,&lt;br /&gt;
10 x 18 hours = 180 hours = 7.5 days.  Draw your own conclusions...&lt;br /&gt;
&lt;br /&gt;
===I changed my .bashrc/.bash_profile and now nothing works===&lt;br /&gt;
&lt;br /&gt;
The default startup scripts provided by SciNet, and guidelines for them, can be found [[Important_.bashrc_guidelines|here]].  Certain things - like sourcing &amp;lt;tt&amp;gt;/etc/profile&amp;lt;/tt&amp;gt;&lt;br /&gt;
and &amp;lt;tt&amp;gt;/etc/bashrc&amp;lt;/tt&amp;gt; - are ''required'' for various SciNet routines to work!&lt;br /&gt;
&lt;br /&gt;
If the situation is so bad that you cannot even log in, please send email [mailto:support@scinet.utoronto.ca support].&lt;br /&gt;
&lt;br /&gt;
===Could I have my login shell changed to (t)csh?===&lt;br /&gt;
&lt;br /&gt;
The login shell used on our systems is bash. While the tcsh is available on the GPC and the TCS, we do not support it as the default login shell at present.  So &amp;quot;chsh&amp;quot; will not work, but you can always run tcsh interactively. Also, csh scripts will be executed correctly provided that they have the correct &amp;quot;shebang&amp;quot; &amp;lt;tt&amp;gt;#!/bin/tcsh&amp;lt;/tt&amp;gt; at the top.&lt;br /&gt;
&lt;br /&gt;
===How can I run Matlab / IDL / Gaussian / my favourite commercial software at SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Because SciNet serves such a disparate group of user communities, there is just no way we can buy licenses for everyone's commercial package.   The only commercial software we have purchased is that which in principle can benefit everyone -- fast compilers and math libraries (Intel's on GPC, and IBM's on TCS).&lt;br /&gt;
&lt;br /&gt;
If your research group requires a commercial package that you already have or are willing to buy licenses for, contact us at [mailto:support@scinet.utoronto.ca support@scinet] and we can work together to find out if it is feasible to implement the package's licensing arrangement on the SciNet clusters, and if so, what is the best way to do it.&lt;br /&gt;
&lt;br /&gt;
Note that it is important that you contact us before installing commercially licensed software on SciNet machines, even if you have a way to do it in your own directory without requiring sysadmin intervention.   It puts us in a very awkward position if someone is found to be running unlicensed or invalidly licensed software on our systems, so we need to be aware of what is being installed where.&lt;br /&gt;
&lt;br /&gt;
===Do you have a recommended ssh program that will allow scinet access from Windows machines?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The [[Ssh#SSH_for_Windows_Users | SSH for Windows users]] programs we recommend are:&lt;br /&gt;
&lt;br /&gt;
* [http://mobaxterm.mobatek.net/en/ MobaXterm] is a tabbed ssh client with some Cygwin tools, including ssh and X, all wrapped up into one executable.&lt;br /&gt;
* [http://www.chiark.greenend.org.uk/~sgtatham/putty/ PuTTY]  - this is a terminal for windows that connects via ssh.  It is a quick install and will get you up and running quickly.&amp;lt;br&amp;gt;To set up your passphrase protected ssh key with putty, see [http://the.earth.li/~sgtatham/putty/0.61/htmldoc/Chapter8.html#pubkey here].&lt;br /&gt;
* [http://www.cygwin.com/ CygWin] - this is a whole Linux-like environment for Windows, which also includes an X window server so that you can display remote windows on your desktop.  Make sure you include the openssh and X window system in the installation for full functionality.  This is recommended if you will be doing a lot of work on Linux machines, as it makes a very similar environment available on your computer.&amp;lt;br&amp;gt;To set up your ssh keys, follow the Linux instructions on the [[Ssh keys]] page.&lt;br /&gt;
&lt;br /&gt;
===My ssh key does not work! WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
[[Ssh_keys#Testing_Your_Key | Testing Your Key]]&lt;br /&gt;
&lt;br /&gt;
* If this doesn't work, you should be able to log in using your password and investigate the problem. For example, if during a login session you get a message similar to the one below, just follow the instructions and delete the offending key on line 3 (you can use vi to jump to that line with ESC plus : plus 3). That only means that you may have logged in from your home computer to SciNet in the past, and that key is obsolete.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!&lt;br /&gt;
Someone could be eavesdropping on you right now (man-in-the-middle&lt;br /&gt;
attack)!&lt;br /&gt;
It is also possible that the RSA host key has just been changed.&lt;br /&gt;
The fingerprint for the RSA key sent by the remote host is&lt;br /&gt;
53:f9:60:71:a8:0b:5d:74:83:52:fe:ea:1a:9e:cc:d3.&lt;br /&gt;
Please contact your system administrator.&lt;br /&gt;
Add correct host key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts to get rid of&lt;br /&gt;
this message.&lt;br /&gt;
Offending key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts:3&lt;br /&gt;
RSA host key for login.scinet.utoronto.ca has&lt;br /&gt;
changed and you have requested&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* If you get the message below you may need to log out of your GNOME session and log back in, since the ssh-agent needs to be&lt;br /&gt;
restarted with the new passphrase-protected ssh key.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
Agent admitted failure to sign using the key.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Can't forward X:  &amp;quot;Warning: No xauth data; using fake authentication data&amp;quot;, or &amp;quot;X11 connection rejected because of wrong authentication.&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
I used to be able to forward X11 windows from SciNet to my home machine, but now I'm getting these messages; what's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This very likely means that ssh/xauth can't update your ${HOME}/.Xauthority file. &lt;br /&gt;
&lt;br /&gt;
The simplest possible reason for this is that you've filled your 10GB /home quota and so can't write anything to your home directory.   Use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load extras&lt;br /&gt;
$ diskUsage&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
to check how close you are to your quota on ${HOME}.&lt;br /&gt;
&lt;br /&gt;
Alternatively, this could mean your .Xauthority file has somehow become broken/corrupted/confused, in which case you can delete that file; when you next log in you'll get a similar warning message about creating .Xauthority, but things should work.&lt;br /&gt;
&lt;br /&gt;
===How come I can not login to TCS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
A SciNet account doesn't automatically entitle you to TCS access. At a minimum, TCS jobs need to run on at least 32 cores (64 preferred because of Simultaneous Multi Threading - [[TCS_Quickstart#Node_configuration|SMT]] - on these nodes) and need the large memory (4GB/core) and bandwidth on the system. Essentially you need to be able to explain why the work can't be done on the GPC.&lt;br /&gt;
&lt;br /&gt;
===How can I reset the password for my Compute Canada account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can reset your password for your Compute Canada account here:&lt;br /&gt;
&lt;br /&gt;
https://ccdb.computecanada.org/security/forgot&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===How can I change or reset the password for my SciNet account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To reset your password at SciNet please go to [https://portal.scinet.utoronto.ca/password_resets Password reset page].&lt;br /&gt;
&lt;br /&gt;
If you know your old password and want to change it, that can be done here:&lt;br /&gt;
&lt;br /&gt;
https://portal.scinet.utoronto.ca/change_password&lt;br /&gt;
&lt;br /&gt;
===Why am I getting the error &amp;quot;Permission denied (publickey,gssapi-with-mic,password)&amp;quot;?===&lt;br /&gt;
&lt;br /&gt;
This error can pop up in a variety of situations: when trying to log in, or, after a job has finished, when the error and output files fail to be copied (there are other possible reasons for this failure as well -- see [[FAQ#My_GPC_job_died.2C_telling_me_.60Copy_Stageout_Files_Failed.27|My GPC job died, telling me: Copy Stageout Files Failed]]).&lt;br /&gt;
In most cases, the &amp;quot;Permission denied&amp;quot; error is caused by incorrect permissions on the (hidden) .ssh directory. Ssh is used for logging in as well as for copying the standard error and output files after a job.&lt;br /&gt;
&lt;br /&gt;
For security reasons, &lt;br /&gt;
the directory .ssh should only be writable and readable by you; if yours &lt;br /&gt;
has read permission for everybody, the authentication fails.  You can change &lt;br /&gt;
this by&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 700 ~/.ssh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
And to be sure, also do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 600 ~/.ssh/id_rsa ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
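&lt;br /&gt;
You can verify the result with &amp;lt;tt&amp;gt;ls -ld&amp;lt;/tt&amp;gt;; the directory should show permissions like the sketch below (username, group, size, date and path are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ls -ld ~/.ssh&lt;br /&gt;
drwx------ 2 USERNAME GROUP 4096 Jan  1 12:00 /home/USERNAME/.ssh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;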
&lt;br /&gt;
===ERROR:102: Tcl command execution failed? when loading modules ===&lt;br /&gt;
Modules sometimes require other modules to be loaded first.&lt;br /&gt;
The module command will let you know if you didn't load them.&lt;br /&gt;
For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
$ module load python&lt;br /&gt;
python/2.6.2(11):ERROR:151: Module ’python/2.6.2’ depends on one of the module(s) ’gcc/4.4.0’&lt;br /&gt;
python/2.6.2(11):ERROR:102: Tcl command execution failed: prereq gcc/4.4.0&lt;br /&gt;
$ module load gcc python&lt;br /&gt;
$&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Compiling your Code==&lt;br /&gt;
&lt;br /&gt;
===How can I get g77 to work?===&lt;br /&gt;
&lt;br /&gt;
The fortran 77 compilers on the GPC are ifort and gfortran. We have dropped support for g77.  This has been a conscious decision. g77 (and the associated library libg2c) were completely replaced six years ago (Apr 2005) by the gcc 4.x branch, and haven't undergone any updates at all, even bug fixes, for over five years.  &lt;br /&gt;
If we would install g77 and libg2c, we would have to deal with the inevitable confusion caused when users accidentally link against the old, broken, wrong versions of the gcc libraries instead of the correct current versions.   &lt;br /&gt;
&lt;br /&gt;
If your code for some reason specifically requires five-plus-year-old libraries,  availability, compatibility, and unfixed-known-bug problems are only going to get worse for you over time, and this might be as good an opportunity as any to address those issues. &lt;br /&gt;
&lt;br /&gt;
''A note on porting to gfortran or ifort:''&lt;br /&gt;
&lt;br /&gt;
While gfortran and ifort are rather compatible with g77, one &lt;br /&gt;
important difference is that by default, gfortran does not preserve &lt;br /&gt;
local variables between function calls, while g77 does.   Preserved &lt;br /&gt;
local variables are for instance often used in implementations of quasi-random number &lt;br /&gt;
generators.  Proper Fortran requires such variables to be declared as SAVE, &lt;br /&gt;
but not all old code does this.&lt;br /&gt;
Luckily, you can change gfortran's default behavior with the flag &lt;br /&gt;
&amp;lt;tt&amp;gt;-fno-automatic&amp;lt;/tt&amp;gt;.   For ifort, the corresponding flag is &amp;lt;tt&amp;gt;-noautomatic&amp;lt;/tt&amp;gt;.&lt;br /&gt;
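&lt;br /&gt;
For example, a minimal sketch of compile lines using these flags (the source and program names are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gfortran -fno-automatic -O2 -o mycode mycode.f&lt;br /&gt;
ifort -noautomatic -O2 -o mycode mycode.f&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;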
&lt;br /&gt;
===Where is libg2c.so?===&lt;br /&gt;
&lt;br /&gt;
libg2c.so is part of the g77 compiler, for which we dropped support. See [[#How can I get g77 to work?]] for our reasons.&lt;br /&gt;
&lt;br /&gt;
===Autoparallelization does not work!===&lt;br /&gt;
&lt;br /&gt;
I compiled my code with the &amp;lt;tt&amp;gt;-qsmp=omp,auto&amp;lt;/tt&amp;gt; option, and then I specified that it should be run with 64 threads - with &lt;br /&gt;
 export OMP_NUM_THREADS=64&lt;br /&gt;
&lt;br /&gt;
However, when I check the load using &amp;lt;tt&amp;gt;llq1 -n&amp;lt;/tt&amp;gt;, it shows a load on the node of 1.37.  Why?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Using the autoparallelization will only get you so far.  In fact, it usually does not do too much.  What is helpful is to run the compiler with the &amp;lt;tt&amp;gt;-qreport&amp;lt;/tt&amp;gt; option, and then read the output listing carefully to see where the compiler thought it could parallelize, where it could not, and the reasons for this.  Then you can go back to your code and carefully try to address each of the issues brought up by the compiler.&lt;br /&gt;
We ''emphasize'' that this is just a rough first guide, and that the compilers are still not magical!   For more sophisticated approaches to parallelizing your code, email us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  to set up an appointment with one&lt;br /&gt;
of our technical analysts.&lt;br /&gt;
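&lt;br /&gt;
As a starting point, here is a sketch of what the compile and run steps might look like on the TCS; the compiler invocation &amp;lt;tt&amp;gt;xlf90_r&amp;lt;/tt&amp;gt; and the file names are assumptions, so use whatever you normally compile with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
xlf90_r -qsmp=omp,auto -qreport -O3 -o mycode mycode.f90&lt;br /&gt;
# read the compiler listing (typically mycode.lst) to see what was, and was not, parallelized; then:&lt;br /&gt;
export OMP_NUM_THREADS=64&lt;br /&gt;
./mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;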
&lt;br /&gt;
===How do I link against the Intel Math Kernel Library?===&lt;br /&gt;
&lt;br /&gt;
If you need to link in the Intel Math Kernel Library (MKL) libraries, you are well advised to use the Intel(R) Math Kernel Library Link Line Advisor: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/ for help in devising the list of libraries to link with your code.&lt;br /&gt;
&lt;br /&gt;
'''''Note that this gives the link line for the command line. When using this in Makefiles, replace $MKLPATH by ${MKLPATH}.'''''&lt;br /&gt;
&lt;br /&gt;
'''''Note too that, unless the integer arguments you will be passing to the MKL libraries are actually 64-bit integers, rather than the normal int or INTEGER types, you want to specify 32-bit integers (lp64) .'''''&lt;br /&gt;
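&lt;br /&gt;
As an illustration only, a sequential, 32-bit-integer (lp64) link line produced by the advisor typically looks something like the sketch below; always check the advisor for the exact line for your compiler and MKL version:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
icc myprog.c -o myprog -L${MKLPATH} -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;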
&lt;br /&gt;
===Can the compilers on the login nodes be disabled to prevent accidentally using them?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can accomplish this by modifying your .bashrc to not load the compiler modules. See [[Important .bashrc guidelines]].&lt;br /&gt;
&lt;br /&gt;
===&amp;quot;relocation truncated to fit: R_X86_64_PC32&amp;quot;: Huh?===&lt;br /&gt;
&lt;br /&gt;
What does this mean, and why can't I compile this code?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Welcome to the joys of the x86 architecture!  You're probably having trouble building arrays larger than 2GB, individually or together.   Generally, you have to try to use the medium or large x86 `memory model'.   For the intel compilers, this is specified with the compile options&lt;br /&gt;
&lt;br /&gt;
  -mcmodel=medium -shared-intel&lt;br /&gt;
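&lt;br /&gt;
For example (the source and program names are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
icc -mcmodel=medium -shared-intel -O2 -o big_arrays big_arrays.c&lt;br /&gt;
ifort -mcmodel=medium -shared-intel -O2 -o big_arrays big_arrays.f90&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;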
&lt;br /&gt;
===&amp;quot;feupdateenv is not implemented and will always fail&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
How do I get rid of this and what does it mean?&lt;br /&gt;
 &lt;br /&gt;
'''Answer:'''&lt;br /&gt;
First note that, as ominous as it sounds, this is really just a warning, and has to do with the Intel math library. You can ignore it (unless you really are trying to manually change the exception handlers for floating point exceptions such as divide by zero), or take the safe road and get rid of it by linking with the Intel math functions library:&amp;lt;pre&amp;gt;-limf&amp;lt;/pre&amp;gt;See also [[#How do I link against the Intel Math Kernel Library?]]&lt;br /&gt;
&lt;br /&gt;
===Cannot find rdmacm library when compiling on GPC===&lt;br /&gt;
&lt;br /&gt;
I get the following error building my code on GPC: &amp;quot;&amp;lt;tt&amp;gt;ld: cannot find -lrdmacm&amp;lt;/tt&amp;gt;&amp;quot;.  Where can I find this library?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This library is part of the MPI libraries; if your compiler is having problems picking it up, it probably means you are mistakenly trying to compile on the login nodes (scinet01..scinet04).  The login nodes aren't part of the GPC; they are for logging into the data centre only.  From there you must go to the GPC or TCS development nodes to do any real work.&lt;br /&gt;
&lt;br /&gt;
=== Why do I get this error when I try to compile: &amp;quot;icpc: error #10001: could not find directory in which /usr/bin/g++41 resides&amp;quot; ?===&lt;br /&gt;
&lt;br /&gt;
You are trying to compile on the login nodes.   As described in the wiki ( https://support.scinet.utoronto.ca/wiki/index.php/GPC_Quickstart#Login ) and in the user's guide you would have received with your account, SciNet supports two main clusters, with very different architectures.  Compilation must be done on the development nodes of the appropriate cluster (in this case, gpc01-04).   Thus, log into gpc01, gpc02, gpc03, or gpc04, and compile from there.&lt;br /&gt;
&lt;br /&gt;
==Testing your Code==&lt;br /&gt;
&lt;br /&gt;
=== Can I run a something for a short time on the development nodes? ===&lt;br /&gt;
&lt;br /&gt;
I am in the process of playing around with the mpi calls in my code to get it to work. I do a lot of tests and each of them takes a couple of seconds only.  Can I do this on the development nodes?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes, as long as it's very brief (a few minutes).   People use the development nodes&lt;br /&gt;
for their work, and you don't want to bog the nodes down for them; testing a real&lt;br /&gt;
code can chew up a lot more resources than compiling, etc.    The procedures differ&lt;br /&gt;
depending on which machine you're using.&lt;br /&gt;
&lt;br /&gt;
==== TCS ====&lt;br /&gt;
&lt;br /&gt;
On the TCS you can run small MPI jobs on the tcs02 node, which is meant for &lt;br /&gt;
development use.  But even for this test run on one node, you'll need a host file --&lt;br /&gt;
a list of hosts (in this case, all tcs-f11n06, which is the `real' name of tcs02)&lt;br /&gt;
that the job will run on.  Create a file called `hostfile' containing the following:&lt;br /&gt;
&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
&lt;br /&gt;
for a 4-task run.  When you invoke &amp;quot;poe&amp;quot; or &amp;quot;mpirun&amp;quot;, there are runtime&lt;br /&gt;
arguments that you specify pointing to this file.  You can also specify it&lt;br /&gt;
in an environment variable MP_HOSTFILE, so, if your file is in your /scratch directory, say &lt;br /&gt;
${SCRATCH}/hostfile, then you would do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 export MP_HOSTFILE=${SCRATCH}/hostfile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
in your shell.  You will also need to create a &amp;lt;tt&amp;gt;.rhosts&amp;lt;/tt&amp;gt; file in your &lt;br /&gt;
home directory, again listing &amp;lt;tt&amp;gt;tcs-f11n06&amp;lt;/tt&amp;gt; so that &amp;lt;tt&amp;gt;poe&amp;lt;/tt&amp;gt;&lt;br /&gt;
can start jobs.   After that you can simply run your program.  You can use&lt;br /&gt;
mpiexec:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 mpiexec -n 4 my_test_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
adding &amp;lt;tt&amp;gt; -hostfile /path/to/my/hostfile&amp;lt;/tt&amp;gt; if you did not set the environment&lt;br /&gt;
variable above.  Alternatively, you can run it with the poe command (do a &amp;quot;man poe&amp;quot; for details), or even by&lt;br /&gt;
just directly running it.  In this case the number of MPI processes will by default&lt;br /&gt;
be the number of entries in your hostfile.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPC ====&lt;br /&gt;
&lt;br /&gt;
On the GPC one can run short test jobs on the GPC [[GPC_Quickstart#Compile.2FDevel_Nodes | development nodes ]]&amp;lt;tt&amp;gt;gpc01&amp;lt;/tt&amp;gt;..&amp;lt;tt&amp;gt;gpc04&amp;lt;/tt&amp;gt;;&lt;br /&gt;
if they are single-node jobs (which they should be) they don't need a hostfile.  Even better, though, is to request an [[ Moab#Interactive | interactive ]] job and run the tests either in the regular batch queue or in the short, high-availability [[ Moab#debug | debug ]] queue that is reserved for this purpose.&lt;br /&gt;
&lt;br /&gt;
=== How do I run a longer (but still shorter than an hour) test job quickly ? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer'''&lt;br /&gt;
&lt;br /&gt;
On the GPC there is a high turnover short queue called [[ Moab#debug | debug ]] that is designed for&lt;br /&gt;
this purpose.  You can use it by adding &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -q debug&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
to your submission script.&lt;br /&gt;
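&lt;br /&gt;
Alternatively, you can request the queue on the command line when you submit; for example (the walltime here is just an illustration, keep debug jobs short):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -q debug -l nodes=1:ppn=8,walltime=0:30:00 script_name.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;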
&lt;br /&gt;
==Running your jobs==&lt;br /&gt;
&lt;br /&gt;
===My job can't write to /home===&lt;br /&gt;
&lt;br /&gt;
My code works fine when I test on the development nodes, but when I submit a job, or even run interactively in the development queue on GPC, it fails.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
As [[Data_Management#Home_Disk_Space | discussed]] [https://support.scinet.utoronto.ca/wiki/images/5/54/SciNet_Tutorial.pdf elsewhere], &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; is mounted read-only on the compute nodes; you can only write to &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; from the login nodes and devel nodes.  (The [[GPC_Quickstart#128Glargemem | largemem nodes]] on GPC, in this respect, are more like devel nodes than compute nodes).   In general, to run jobs you can read from &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; but you'll have to write to &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt; (or, if you were allocated space through the RAC process, on &amp;lt;tt&amp;gt;/project&amp;lt;/tt&amp;gt;).  More information on SciNet filesytems can be found on our [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===Error Submitting My Job: qsub: Bad UID for job execution MSG=ruserok failed ===&lt;br /&gt;
&lt;br /&gt;
I write up a submission script as in the examples, but when I attempt to submit the job, I get the above error.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This error will occur if you try to submit a job from the login nodes.   The login nodes are the gateway to all of SciNet's systems (GPC, TCS, P7, ARC), which have different hardware and queueing systems.  To submit a job, you must log into a development node for the particular cluster you are submitting to and submit from there.&lt;br /&gt;
&lt;br /&gt;
===OpenMP on the TCS===&lt;br /&gt;
&lt;br /&gt;
How do I run an OpenMP job on the TCS?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[TCS_Quickstart#Submission_Script_for_an_OpenMP_Job | TCS Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===Can I use hybrid codes consisting of MPI and OpenMP on the GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes. Please look at the [[GPC_Quickstart#Hybrid_MPI.2FOpenMP_jobs | GPC Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I run serial jobs on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
So it should be said first that SciNet is a parallel computing resource, &lt;br /&gt;
and our priority will always be parallel jobs.   Having said that, if &lt;br /&gt;
you can make efficient use of the resources using serial jobs and get &lt;br /&gt;
good science done, that's good too, and we're happy to help you.&lt;br /&gt;
&lt;br /&gt;
The GPC nodes each have 8 processing cores, and making efficient use of these &lt;br /&gt;
nodes means using all eight cores.  As a result, we'd like to have the &lt;br /&gt;
users take up whole nodes (eg, run multiples of 8 jobs) at a time.  &lt;br /&gt;
&lt;br /&gt;
It depends on the nature of your job what the best strategy is. Several approaches are presented on the [[User_Serial|serial run wiki page]].&lt;br /&gt;
&lt;br /&gt;
===Why can't I request only a single cpu for my job on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
On GPC, resources are allocated by the node - that is, in chunks of 8 processors.   If you want to run jobs that each require only one processor, you need to bundle them into groups of 8, so as not to waste the other 7 cores for up to 48 hours; a minimal sketch is shown below. See the [[User_Serial|serial run wiki page]].&lt;br /&gt;
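&lt;br /&gt;
A minimal sketch of bundling 8 serial runs into one job (the program name and run directories are placeholders; the [[User_Serial|serial run wiki page]] has more robust approaches):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N serial_bundle&lt;br /&gt;
&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# start 8 serial runs at once, one per core, each in its own directory&lt;br /&gt;
for i in $(seq 1 8); do&lt;br /&gt;
  (cd run$i &amp;amp;&amp;amp; ./my_serial_code &amp;gt; output.txt) &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
# wait for all 8 runs to finish before the job exits&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;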
&lt;br /&gt;
===How do I run serial jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''': You don't.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===But in the queue I found a user who is running jobs on GPC, each of which is using only one processor, so why can't I?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
The pradat* and atlaspt* jobs, amongst others, are jobs of the ATLAS high energy physics project. That they are reported as single cpu jobs is an artifact of the moab scheduler. They are in fact being automatically bundled into 8-job bundles but have to run individually to be compatible with their international grid-based systems.&lt;br /&gt;
&lt;br /&gt;
===How do I use the ramdisk on GPC?===&lt;br /&gt;
&lt;br /&gt;
To use the ramdisk, create, write to, and read from files in /dev/shm/.. just as one would in (eg) ${SCRATCH}. Only the amount of RAM needed to store the files will be taken up by the temporary file system; thus if you have 8 serial jobs each requiring 1 GB of RAM, and 1GB is taken up by various OS services, you would still have approximately 7GB available to use as ramdisk on a 16GB node. However, if you were to write 8 GB of data to the RAM disk, this would exceed available memory and your job would likely crash.&lt;br /&gt;
&lt;br /&gt;
It is very important to delete your files from ram disk at the end of your job. If you do not do this, the next user to use that node will have less RAM available than they might expect, and this might kill their jobs.&lt;br /&gt;
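&lt;br /&gt;
A minimal sketch of the pattern, assuming your job reads one input file and writes one output file (all file and program names are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
cp input.dat /dev/shm/                          # stage input into the ramdisk&lt;br /&gt;
./my_code /dev/shm/input.dat /dev/shm/output.dat&lt;br /&gt;
cp /dev/shm/output.dat .                        # copy results back to the submission directory&lt;br /&gt;
rm -f /dev/shm/input.dat /dev/shm/output.dat    # always clean up the ramdisk&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;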
&lt;br /&gt;
''More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].''&lt;br /&gt;
&lt;br /&gt;
===How can I automatically resubmit a job?===&lt;br /&gt;
&lt;br /&gt;
Commonly you may have a job that you know will take longer to run than what is &lt;br /&gt;
permissible in the queue.  As long as your program contains [[Checkpoints|checkpoint]] or &lt;br /&gt;
restart capability, you can have one job automatically submit the next. In&lt;br /&gt;
the following example it is assumed that the program finishes before &lt;br /&gt;
the 48 hour limit and then resubmits itself by logging into one&lt;br /&gt;
of the development nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example submission script for auto resubmission&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code&lt;br /&gt;
&lt;br /&gt;
# RESUBMIT 10 TIMES HERE&lt;br /&gt;
num=${NUM:-0}     # NUM is passed in with qsub -v; default to 0 on the first submission&lt;br /&gt;
if [ $num -lt 10 ]; then&lt;br /&gt;
      num=$(($num+1))&lt;br /&gt;
      ssh gpc01 &amp;quot;cd $PBS_O_WORKDIR; qsub ./script_name.sh -v NUM=$num&amp;quot;;&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub script_name.sh -v NUM=0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can alternatively use [[ Moab#Job_Dependencies | Job dependencies ]] through the queuing system which will not start one job until another job has completed.&lt;br /&gt;
&lt;br /&gt;
If your job can't be made to automatically stop before the 48 hour queue window, but it does write out checkpoints, you can use the timeout command to stop the program while you still have time to resubmit; for instance&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
    timeout 2850m ./run_my_code argument1 argument2&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
will run the program for 47.5 hours (2850 minutes), and then send it a SIGTERM so that it exits with time left to resubmit.&lt;br /&gt;
&lt;br /&gt;
===How can I pass in arguments to my submission script?===&lt;br /&gt;
&lt;br /&gt;
If you wish to make your scripts more generic you can use qsub's ability &lt;br /&gt;
to pass in environment variables to pass in arguments to your script.&lt;br /&gt;
The following example shows a case where an input and an output &lt;br /&gt;
file are passed in on the qsub line. Multiple variables can be &lt;br /&gt;
passed in using the qsub &amp;quot;-v&amp;quot; option and comma delimited. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example of passing in arguments&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
# &lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code -f $INFILE -o $OUTFILE&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub script_name.sh -v INFILE=input.txt,OUTFILE=outfile.txt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== How can I run a job longer than 48 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The SciNet queues have a queue limit of 48 hours.   This is pretty typical for systems of its size in Canada and elsewhere, and larger systems commonly have shorter limits.   The limits are there to ensure that every user gets a fair share of the system (so that no one user ties up lots of nodes for a long time), and for safety (so that if one memory board in one node fails in the middle of a very long job, you haven't lost a month's worth of work).&lt;br /&gt;
&lt;br /&gt;
Since many of us have simulations that require more than that much time, most widely-used scientific applications have &amp;quot;checkpoint-restart&amp;quot; functionality, where every so often the complete state of the calculation is stored as a checkpoint file, and one can restart a simulation from one of these.   In fact, these restart files tend to be quite useful for a number of purposes.&lt;br /&gt;
&lt;br /&gt;
If your job will take longer, you will have to submit your job in multiple parts, restarting from a checkpoint each time.  In this way, one can run a simulation much longer than the queue limit.  In fact, one can even write job scripts which automatically re-submit themselves until a run is completed, using [[FAQ#How_can_I_automatically_resubmit_a_job.3F | automatic resubmission. ]]&lt;br /&gt;
&lt;br /&gt;
=== Why did showstart say it would take 3 hours for my job to start before, and now it says my job will start in 10 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[FAQ#How_do_priorities_work.2Fwhy_did_that_job_jump_ahead_of_mine_in_the_queue.3F | How do priorities work/why did that job jump ahead of mine in the queue? ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do priorities work/why did that job jump ahead of mine in the queue?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The [[Moab | queueing system]] used on SciNet machines is a [http://en.wikipedia.org/wiki/Priority_queue Priority Queue].  Jobs enter the queue at the back of the queue, and slowly make their way to the front as those ahead of them are run; but a job that enters the queue with a higher priority can `cut in line'.&lt;br /&gt;
&lt;br /&gt;
The main factor which determines priority is whether or not the user (or their PI) has an [http://wiki.scinethpc.ca/wiki/index.php/Application_Process RAC allocation].  These are competitively allocated grants of computer time; there is a call for proposals towards the end of every calendar year.    Users with an allocation have high priorities in an attempt to make sure that they can use the amount of computer time the committees granted them.   Their priority decreases as they approach their allotted usage over the current window of time; by the time that they have exhausted that allotted usage, their priority is the same as users with no allocation (unallocated, or `default' users).    Unallocated users have a fixed, low, priority.&lt;br /&gt;
&lt;br /&gt;
This priority system is called `fairshare'; the scheduler attempts to make sure everyone has their fair share of the machines, where the share that's fair has been determined by the allocation committee.    The fairshare window is a rolling window of two weeks; that is, any time you have a job in the queue, the fairshare calculation of its priority is given by how much of your allocation of the machine has been used in the last 14 days.&lt;br /&gt;
&lt;br /&gt;
A particular allocation might have some fraction of GPC - say 4% of the machine (if the PI had been allocated 10 million CPU hours on GPC). The allocations have labels (called `Resource Allocation Proposal Identifiers', or RAPIs); they look something like&lt;br /&gt;
&lt;br /&gt;
  abc-123-ab&lt;br /&gt;
&lt;br /&gt;
where abc-123 is the PI's CCRI, and the suffix specifies which of the allocations granted to the PI is to be used.  These can be specified on a job-by-job basis.  On GPC, one adds the line&lt;br /&gt;
 #PBS -A RAPI&lt;br /&gt;
to your script; on TCS, one uses&lt;br /&gt;
 # @ account_no = RAPI&lt;br /&gt;
If the allocation to charge isn't specified, a default is used; each user has such a default, which can be changed at the same portal where one changes one's password, &lt;br /&gt;
&lt;br /&gt;
 https://portal.scinet.utoronto.ca/&lt;br /&gt;
&lt;br /&gt;
A job's priority is determined primarily by the fairshare priority of the allocation it is being charged to; the previous 14 days' worth of use under that allocation is calculated and compared to the allocated fraction (here, 4%) of the machine over that window (here, 14 days).   The fairshare priority is a decreasing function of the allocation left; if there is no allocation left (eg, jobs running under that allocation have already used 379,038 CPU hours in the past 14 days), the priority is the same as that of a user with no granted allocation.   (This last part has been the topic of some debate; as the machine gets more utilized, it will probably be the case that we allow RAC users who have greatly overused their quota to have their priorities drop below that of unallocated users, to give the unallocated users some chance to run on our increasingly crowded system; this would have no undue effect on our allocated users as they still would be able to use the amount of resources they had been allocated by the committees.)   Note that all jobs charging the same allocation get the same fairshare priority.&lt;br /&gt;
&lt;br /&gt;
There are other factors that go into calculating priority, but fairshare is the most significant.   Other factors include&lt;br /&gt;
* amount of time waiting in queue (measured in units of the requested runtime). A waiting queue job gains priority as it sits in the queue to avoid job starvation. &lt;br /&gt;
* User adjustment of priorities ( See below ).&lt;br /&gt;
&lt;br /&gt;
The major effect of these subdominant terms is to shuffle the order of jobs running under the same allocation.&lt;br /&gt;
&lt;br /&gt;
===How do we manage job priorities within our research group?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Obviously, managing shared resources within a large group - whether it &lt;br /&gt;
is conference funding or CPU time - takes some doing.   &lt;br /&gt;
&lt;br /&gt;
It's important to note that the fairshare periods are intentionally kept &lt;br /&gt;
quite short - just two weeks long. So, for example, let us say that in your resource &lt;br /&gt;
allocation you have about 10% of the machine.   Then for someone to use &lt;br /&gt;
up the whole two week amount of time in 2 days, they'd have to use 70% &lt;br /&gt;
of the machine in those two days - which is unlikely to happen by &lt;br /&gt;
accident.  If that does happen,  &lt;br /&gt;
those using the same allocation as the person who used 70% of the &lt;br /&gt;
machine over the two days will suffer by having much lower priority for &lt;br /&gt;
their jobs, but only for the next 12 days - and even then, if there are &lt;br /&gt;
idle cpus they'll still be able to compute.&lt;br /&gt;
&lt;br /&gt;
There will be online tools for seeing how the allocation is being used, &lt;br /&gt;
and those people who are in charge in your group will be able to use &lt;br /&gt;
that information to manage the users, telling them to dial it down or &lt;br /&gt;
up.   We know that managing a large research group is hard, and we want &lt;br /&gt;
to make sure we provide you the information you need to do your job &lt;br /&gt;
effectively.&lt;br /&gt;
&lt;br /&gt;
One way for users within a group to manage their priorities within the group&lt;br /&gt;
is with [[Moab#Adjusting_Job_Priority | user-adjusted priorities]]; this is&lt;br /&gt;
described in more detail on the [[Moab | Scheduling System]] page.&lt;br /&gt;
&lt;br /&gt;
=== How do I charge jobs to my RAC allocation? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see the [[Moab#Accounting|accounting section of Moab page]].&lt;br /&gt;
&lt;br /&gt;
=== How does one check the amount of used CPU-hours in a project, and how does one get statistics for each user in the project? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This information is available on the SciNet portal, https://portal.scinet.utoronto.ca. See also [[SciNet Usage Reports]].&lt;br /&gt;
&lt;br /&gt;
=== How does the Infiniband Upgrade affect my 2012 RAC allocation ?===&lt;br /&gt;
&lt;br /&gt;
The RAC allocations for the current (2012) year that were based on ethernet and infiniband will carry over; however, the allocation will be on the full GPC, not just the subsection.  So if you were allocated 500 hours on Infiniband, your fairshare allocation will still be 500 hours, just 500 out of 30,000 instead of 500 out of 7,000.  If you received two allocations, one on gigE and one on IB, they will simply be combined. This should benefit all users, as the desegregation of the GPC provides a greater pool of nodes, increasing the probability that your job will run.&lt;br /&gt;
&lt;br /&gt;
==Monitoring jobs in the queue==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Why hasn't my job started?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Use the moab command &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
checkjob -v jobid&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and the last couple of lines should explain why a job hasn't started.  &lt;br /&gt;
&lt;br /&gt;
Please see [[Moab| Job Scheduling System (Moab) ]] for more detailed information&lt;br /&gt;
&lt;br /&gt;
===How do I figure out when my job will run?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Moab#Available_Resources| Job Scheduling System (Moab) ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- ===My GPC job is Held, and checkjob says &amp;quot;Batch:PolicyViolation&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When this happens, you'll see your job stuck in a BatchHold state.  &lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues, and is being held until you modify it or kill it and re-submit a conforming job.  The most common problems are:&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===I submit my GPC job, and I get an email saying it was rejected===&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues and is rejected. An email&lt;br /&gt;
is sent with the JOBID, JOBNAME, and the reason it was rejected.  The following is an example where a job&lt;br /&gt;
requests more than 48 hours and was rejected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462493.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'wclimit too high for class 'batch_ib' (345600 &amp;gt; 172800)'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Jobs on the TCS or GPC may only run for 48 hours at a time; this restriction greatly increases responsiveness of the queue and queue throughput for all our users.  If your computation requires longer than that, as many do, you will have to [[ Checkpoints | checkpoint ]] your job and restart it after each 48-hour queue window.   You can manually re-submit jobs, or if you can have your job cleanly exit before the 48 hour window, there are ways to [[ FAQ#How_can_I_automatically_resubmit_a_job.3F | automatically resubmit jobs ]].&lt;br /&gt;
&lt;br /&gt;
Other rejections return a more cryptic error saying &amp;quot;job violates class configuration&amp;quot; such as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462409.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'user required by class 'batch''&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The most common problems that result in this error are:&lt;br /&gt;
&lt;br /&gt;
* '''Incorrect number of processors per node''': Jobs on the GPC are scheduled per-node not per-core and since each node has 8 processor cores (ppn=8) the smallest job allowed is one node with 8 cores (nodes=1:ppn=8).  For serial jobs users must bundle or batch them together in groups of 8. See [[ FAQ#How_do_I_run_serial_jobs_on_GPC.3F | How do I run serial jobs on GPC? ]]&lt;br /&gt;
* '''No number of nodes specified''': Jobs submitted to the main queue must request a specific number of nodes, either in the submission script (with a line like &amp;lt;tt&amp;gt;#PBS -l nodes=2:ppn=8&amp;lt;/tt&amp;gt;) or on the command line (eg, &amp;lt;tt&amp;gt;qsub -l nodes=2:ppn=8,walltime=5:00:00 script.pbs&amp;lt;/tt&amp;gt;).  Note that for the debug queue, you can get away without specifying a number of nodes and a default of one will be assigned; for both technical and policy reasons, we do not enforce such a default for the main (&amp;quot;batch&amp;quot;) queue.&lt;br /&gt;
* '''There is a 15 minute walltime minimum''' on all queues except debug; if you request a walltime shorter than this, your job will be rejected (see the example script below).&lt;br /&gt;
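&lt;br /&gt;
For reference, here is a minimal sketch of a submission script that satisfies the rules above (the job name and executable are hypothetical placeholders):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N myjob&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
./mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;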
&lt;br /&gt;
&lt;br /&gt;
===Running checkjob on my job gives me messages about JobFail and rejected===&lt;br /&gt;
&lt;br /&gt;
Running checkjob on my job gives me messages that suggest my job has failed, as below: what did I do wrong?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AName: test&lt;br /&gt;
State: Idle &lt;br /&gt;
Creds:  user:xxxxxx  group:xxxxxxxx  account:xxxxxxxx  class:batch_ib  qos:ibqos&lt;br /&gt;
WallTime:   00:00:00 of 8:00:00&lt;br /&gt;
BecameEligible: Wed Jul 23 10:39:27&lt;br /&gt;
SubmitTime: Wed Jul 23 10:38:22&lt;br /&gt;
  (Time Queued  Total: 00:01:47  Eligible: 00:01:05)&lt;br /&gt;
&lt;br /&gt;
Total Requested Tasks: 8&lt;br /&gt;
&lt;br /&gt;
Req[0]  TaskCount: 8  Partition: ALL  &lt;br /&gt;
Opsys: centos6computeA  Arch: ---  Features: ---&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Notification Events: JobFail&lt;br /&gt;
&lt;br /&gt;
IWD:            /scratch/x/xxxxxxxx/xxxxxxx/xxxxxxx&lt;br /&gt;
Partition List: torque,DDR&lt;br /&gt;
Flags:          RESTARTABLE&lt;br /&gt;
Attr:           checkpoint&lt;br /&gt;
StartPriority:  76&lt;br /&gt;
rejected for Opsys        - (null)&lt;br /&gt;
rejected for State        - (null)&lt;br /&gt;
rejected for Reserved     - (null)&lt;br /&gt;
NOTE:  job req cannot run in partition torque (available procs do not meet requirements : 0 of 8 procs found)&lt;br /&gt;
idle procs: 793  feasible procs:   0&lt;br /&gt;
&lt;br /&gt;
Node Rejection Summary: [Opsys: 117][State: 2895][Reserved: 19]&lt;br /&gt;
&lt;br /&gt;
NOTE:  job violates constraints for partition SANDY (partition SANDY not in job partition mask)&lt;br /&gt;
&lt;br /&gt;
NOTE:  job violates constraints for partition GRAVITY (partition GRAVITY not in job partition mask)&lt;br /&gt;
&lt;br /&gt;
rejected for State        - (null)&lt;br /&gt;
NOTE:  &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The output from checkjob is a little cryptic in places, and if you are wondering why your job hasn't started yet, you might think that &amp;quot;rejection&amp;quot; and &amp;quot;JobFail&amp;quot; suggest that there's something wrong.  But the above message is actually normal; you can use the &amp;lt;tt&amp;gt;showstart&amp;lt;/tt&amp;gt; command on your job to get a (preliminary, subject to change) estimate of when the job will start, and you'll find that it is in fact scheduled to start in the near future.&lt;br /&gt;
&lt;br /&gt;
In the above message:&lt;br /&gt;
&lt;br /&gt;
* `Notification Events: JobFail` just means that, if notifications are enabled, you'll get a message if the job fails;&lt;br /&gt;
* `job req cannot run in partition torque` just means that the job cannot run just yet (that's why it's queued);&lt;br /&gt;
* `job req cannot run in dynamic partition DDR now (insufficient procs available: 0 &amp;lt; 8)` says why: there aren't processors available; and&lt;br /&gt;
* `job violates constraints for partition SANDY/GRAVITY` just means that the job isn't eligible to run in those particular (small) sections of the cluster.&lt;br /&gt;
&lt;br /&gt;
That is, the above output is the normal and expected (if somewhat cryptic) explanation of why the job is waiting - nothing to worry about.&lt;br /&gt;
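&lt;br /&gt;
For example, to ask the scheduler for its current start-time estimate (using the same job id you would pass to &amp;lt;tt&amp;gt;checkjob&amp;lt;/tt&amp;gt;):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
showstart jobid&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;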
&lt;br /&gt;
===How can I monitor my running jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
How can I monitor the load of TCS jobs?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can get more information with the command &lt;br /&gt;
 /xcat/tools/tcs-scripts/LL/jobState.sh&lt;br /&gt;
which I alias as:&lt;br /&gt;
 alias llq1='/xcat/tools/tcs-scripts/LL/jobState.sh'&lt;br /&gt;
If you run &amp;quot;llq1 -n&amp;quot; you will see a listing of jobs together with a lot of information, including the load.&lt;br /&gt;
&lt;br /&gt;
==Errors in running jobs==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
===On GPC, `Job cannot be executed'===&lt;br /&gt;
&lt;br /&gt;
I get error messages like this trying to run on GPC:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
Aborted by PBS Server &lt;br /&gt;
Job cannot be executed&lt;br /&gt;
See Administrator for help&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node 'gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0' for job 30414.gpc-sched&lt;br /&gt;
&lt;br /&gt;
Unable to copy file 30414.gpc-sched.OU to USER@gpc-f101n084.scinet.local:/scratch/G/GROUP/USER/projects/sim-performance-test/runtime/l/namd/8/namd.o30414&lt;br /&gt;
*** error from copy&lt;br /&gt;
30414.gpc-sched.OU: No such file or directory&lt;br /&gt;
*** end error output&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Try doing the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir ${SCRATCH}/.pbs_spool&lt;br /&gt;
ln -s ${SCRATCH}/.pbs_spool ~/.pbs_spool&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is how all new accounts are setup on SciNet.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; on GPC for compute jobs is mounted as a read-only file system.   &lt;br /&gt;
PBS by default tries to spool its output  files to &amp;lt;tt&amp;gt;${HOME}/.pbs_spool&amp;lt;/tt&amp;gt;&lt;br /&gt;
which fails as it tries to write to a read-only file  &lt;br /&gt;
system.    New accounts at SciNet  get around this by having ${HOME}/.pbs_spool  &lt;br /&gt;
point to somewhere appropriate on &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, but if you've deleted that link&lt;br /&gt;
or directory, or had an old account, you will see errors like the above.&lt;br /&gt;
&lt;br /&gt;
'''On Feb 24, the input/output mechanism has been reconfigured to use a local ramdisk as the temporary location, which means that .pbs_spool is no longer needed and this error should not occur anymore.'''&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== I couldn't find the  .o output file in the .pbs_spool directory as I used to ===&lt;br /&gt;
&lt;br /&gt;
On Feb 24 2011, the temporary location of standard input and output files was moved from the shared file system ${SCRATCH}/.pbs_spool to the&lt;br /&gt;
node-local directory /var/spool/torque/spool (which resides in ram). The final location after a job has finished is unchanged,&lt;br /&gt;
but to check the output/error of running jobs, users will now have to ssh into the (first) node assigned to the job and look in&lt;br /&gt;
/var/spool/torque/spool.&lt;br /&gt;
&lt;br /&gt;
This alleviates access contention to the temporary directory, especially for those users that are running a lot of jobs, and  reduces the burden on the file system in general.&lt;br /&gt;
&lt;br /&gt;
Note that it is good practice to redirect output to a file rather than to count on the scheduler to do this for you.&lt;br /&gt;
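&lt;br /&gt;
As a concrete sketch of the procedure described above (the job id and node name are hypothetical placeholders; &amp;lt;tt&amp;gt;qstat -n&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;checkjob&amp;lt;/tt&amp;gt; will show the nodes actually assigned to your job):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qstat -n 1234567                  # list the nodes assigned to the job&lt;br /&gt;
ssh gpc-f101n005                  # log in to the first of those nodes&lt;br /&gt;
ls /var/spool/torque/spool/       # the temporary .OU and .ER files live here&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;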
&lt;br /&gt;
=== My GPC job died, telling me `Copy Stageout Files Failed' ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When a job runs on GPC, the script's standard output and error are redirected to &lt;br /&gt;
&amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.OU&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.ER&amp;lt;/tt&amp;gt; in&lt;br /&gt;
/var/spool/torque/spool on the (first) node on which your job is running.  At the end of the job, those .OU and .ER files are copied to where the batch script tells them to be copied, by default &amp;lt;tt&amp;gt;$PBS_JOBNAME.o$PBS_JOBID&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBNAME.e$PBS_JOBID&amp;lt;/tt&amp;gt;.   (You can set those filenames to be something clearer with the -e and -o options in your PBS script.)&lt;br /&gt;
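&lt;br /&gt;
For example, your submission script could contain lines like the following (the file names are hypothetical placeholders):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -o myrun.out&lt;br /&gt;
#PBS -e myrun.err&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;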
&lt;br /&gt;
When you get errors like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
it means that the copying-back process has failed in some way.  There could be a few reasons for this. The first thing to check is to '''make sure that your .bashrc does not produce any output''', as the output stageout is performed by bash, and extra output can cause it to fail.&lt;br /&gt;
But it could also have been a transient filesystem error, or it could be that your job failed spectacularly enough to short-circuit the normal job-termination process, so those files simply never got copied.&lt;br /&gt;
&lt;br /&gt;
Write to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] if your input/output files got lost, as we will probably be able to retrieve them for you (please supply at least the jobid, and any other information that may be relevant). &lt;br /&gt;
&lt;br /&gt;
Note that it is good practice to redirect output to a file rather than depending on the job scheduler to do this for you.&lt;br /&gt;
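&lt;br /&gt;
For example, inside your submission script you could redirect your program's standard output and error yourself (the program and file names are hypothetical placeholders):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./mycode &amp;gt; ${SCRATCH}/myrun/output.log 2&amp;gt;&amp;amp;1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;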
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&lt;br /&gt;
===Another transport will be used instead===&lt;br /&gt;
&lt;br /&gt;
I get error messages like the following when running on the GPC at the start of the run, although the job seems to proceed OK.   Is this a problem?&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
[[45588,1],0]: A high-performance Open MPI point-to-point messaging module&lt;br /&gt;
was unable to find any relevant network interfaces:&lt;br /&gt;
&lt;br /&gt;
Module: OpenFabrics (openib)&lt;br /&gt;
  Host: gpc-f101n005&lt;br /&gt;
&lt;br /&gt;
Another transport will be used instead, although this may result in&lt;br /&gt;
lower performance.&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Everything's fine.   The two MPI libraries scinet provides work for both the InifiniBand and the Gigabit Ethernet interconnects, and will always try to use the fastest interconnect available.   In this case, you ran on normal gigabit GPC nodes with no infiniband; but the MPI libraries have no way of knowing this, and try the infiniband first anyway.  This is just a harmless `failover' message; it tried to use the infiniband, which doesn't exist on this node, then fell back on using Gigabit ethernet (`another transport').&lt;br /&gt;
&lt;br /&gt;
With OpenMPI, this can be avoided by not looking for infiniband; eg, by using the option&lt;br /&gt;
&lt;br /&gt;
--mca btl ^openib&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===IB Memory Errors, eg &amp;lt;tt&amp;gt; reg_mr Cannot allocate memory &amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
Infiniband requires more memory than ethernet; it can use RDMA (remote direct memory access) transport for which it sets aside registered memory to transfer data.&lt;br /&gt;
&lt;br /&gt;
In our current network configuration, it requires a _lot_ more memory, particularly as you go to larger process counts; unfortunately, that means you can't get around the &amp;quot;I need more memory&amp;quot; problem the usual way, by running on more nodes.   Machines with different memory or &lt;br /&gt;
network configurations may exhibit this problem at higher or lower MPI &lt;br /&gt;
task counts.&lt;br /&gt;
&lt;br /&gt;
Right now, the best workaround is to reduce the number and size of OpenIB queues, using XRC: with the OpenMPI, add the following options to your mpirun command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-mca btl_openib_receive_queues X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32 -mca btl_openib_max_send_size 12288&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With Intel MPI, you should be able to do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load intelmpi/4.0.3.008&lt;br /&gt;
mpirun -genv I_MPI_FABRICS=shm:ofa  -genv I_MPI_OFA_USE_XRC=1 -genv I_MPI_OFA_DYNAMIC_QPS=1 -genv I_MPI_DEBUG=5 -np XX ./mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
to the same end.  &lt;br /&gt;
&lt;br /&gt;
For more information see [[GPC MPI Versions]].&lt;br /&gt;
&lt;br /&gt;
===My compute job fails, saying &amp;lt;tt&amp;gt;libpng12.so.0: cannot open shared object file&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;libjpeg.so.62: cannot open shared object file&amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To maximize the amount of memory available for compute jobs, the compute nodes have a less complete system image than the development nodes.   In particular, since graphics packages like matplotlib and gnuplot are usually used interactively, the libraries they need are included in the devel nodes' image but not in the compute nodes'.&lt;br /&gt;
&lt;br /&gt;
Many of these extra libraries are, however, available in the &amp;quot;extras&amp;quot; module.   So adding a &amp;quot;module load extras&amp;quot; to your job submission  script - or, for overkill, to your .bashrc - should enable these scripts to run on the compute nodes.&lt;br /&gt;
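&lt;br /&gt;
For example, adding the following line to your submission script before your program is launched should make those shared libraries available on the compute nodes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load extras&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;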
&lt;br /&gt;
==Data on SciNet disks==&lt;br /&gt;
&lt;br /&gt;
===How do I find out my disk usage?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The standard unix/linux utilities for finding the amount of disk space used by a directory are very slow, and notoriously inefficient on the GPFS filesystems that we run on the SciNet systems.  There are utilities that very quickly report your disk usage:&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''diskUsage'''&amp;lt;/tt&amp;gt; command, available with the 'extras' module on the login nodes, datamovers and the GPC devel nodes, reports usage on the home, scratch, and project file systems in a number of ways: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time.&lt;br /&gt;
This information is only updated hourly!&lt;br /&gt;
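&lt;br /&gt;
For example, to report usage for yourself and your group (as described above):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load extras&lt;br /&gt;
diskUsage -a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;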
&lt;br /&gt;
More information about these filesystems is available on the [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I transfer data to/from SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
All incoming connections to SciNet go through relatively low-speed connections to the &amp;lt;tt&amp;gt;login.scinet&amp;lt;/tt&amp;gt; gateways, so using scp to copy files the same way you ssh in is not an effective way to move lots of data.  Better tools are described in our page on [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
&lt;br /&gt;
===My group works with data files of size 1-2 GB.  Is this too large to  transfer by scp to login.scinet.utoronto.ca ?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Generally, occasional transfers of data smaller than 10GB are perfectly acceptable to go through the login nodes. See [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
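&lt;br /&gt;
For example, a single file of that size can be copied with scp in the usual way (the user name and destination path are hypothetical placeholders):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
scp mydata.tar.gz USER@login.scinet.utoronto.ca:/scratch/g/group/USER/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;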
&lt;br /&gt;
===How can I check if I have files in /scratch that are scheduled for automatic deletion?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Storage_Quickstart#Scratch_Disk_Purging_Policy | Storage At SciNet]]&lt;br /&gt;
&lt;br /&gt;
===How can I allow my supervisor to manage files for me using ACL-based commands?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Data_Management#File.2FOwnership_Management_.28ACL.29 | File/Ownership Management]]&lt;br /&gt;
&lt;br /&gt;
===Can we buy extra storage space on SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
Yes, please see [[Data_Management#Buying_storage_space_on_GPFS_or_HPSS | Buying storage space on GPFS or HPSS ]] for more details.&lt;br /&gt;
&lt;br /&gt;
===Can I transfer files between BGQ and HPSS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
Yes, please see [https://support.scinet.utoronto.ca/wiki/index.php/BGQ#Bridge_to_HPSS Bridge to HPSS ]  for more details.&lt;br /&gt;
&lt;br /&gt;
==Keep 'em Coming!==&lt;br /&gt;
&lt;br /&gt;
===Next question, please===&lt;br /&gt;
&lt;br /&gt;
Send your question to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;];  we'll answer it asap!&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Introduction_To_Performance&amp;diff=7156</id>
		<title>Introduction To Performance</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Introduction_To_Performance&amp;diff=7156"/>
		<updated>2014-08-13T15:54:36Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Strong Scaling Tests */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==The Concepts of Parallel Performance==&lt;br /&gt;
&lt;br /&gt;
Parallel computing used to be a very specialized domain; but now even making the best use of your laptop, which almost certainly has multiple independent computing cores, requires understanding the basic concepts of performance in a parallel environment.&lt;br /&gt;
&lt;br /&gt;
Most fundamentally, parallel programming allows three possible ways of getting more and better science done:&lt;br /&gt;
;Running your computation many times&lt;br /&gt;
:If you have a program that works in serial, having many processors available to you allows you to run many copies of the same program at once, improving your [[#Throughput|throughput]].   This can be a somewhat trivial use of parallel computing and doesn't require very specialized hardware, but it can be extremely useful for running, for instance, parameter studies or sensitivity studies.   Best of all, this is essentially guaranteed to run efficiently if your serial code runs efficiently!  Because this doesn't require fancy hardware, it is a waste of resources to use the [[TCS_Quickstart|Tightly Coupled System]] for these sorts of tasks; instead, they must be run on the [[GPC_Quickstart|General Purpose Cluster]].&lt;br /&gt;
;Running your computation faster&lt;br /&gt;
:This is what most people think of as parallel computing.  It can take a lot of work to make an existing code run efficiently on many processors, or to design a new code to make use of these resources, but when it works, one can achieve a substantial [[#Parallel_Speedup|speedup]] of individual jobs.  This might mean the difference between a computation running in a feasible length of time for a research project or taking years to complete --- so while it may be a lot of work, it may be your only option.    To determine whether your code runs well on many processors, you need to measure [[#Parallel_Speedup|speedup]] and [[#Efficiency|efficiency]]; to see how many processors one should use for a given problem you must run [[#Strong_Scaling_Tests|strong scaling tests]].&lt;br /&gt;
;Running your computation on larger problems&lt;br /&gt;
:One achieves speedup by using more processors on the same problem.  But by running your job in parallel you may have access to more resources other than just processors --- for instance, more memory, or more disks.   In this case, you may be able to run problems that simply wouldn't be possible on a single processor or a single computer; one can achieve significant '''''sizeup'''''.  To find how large a problem one can efficiently run, one measures [[#Efficiency|efficiency]] and runs [[#Weak_Scaling_Tests|weak scaling tests]].&lt;br /&gt;
&lt;br /&gt;
Of course, these aren't exclusive; one can take advantage of any combination of the above.   It may be that your problem runs efficiently on 8 cores but no more; however, you may be able to make use of more processors by running many jobs to explore parameter space, and already on 8 cores you may be able to consider larger problems than you can with just one!&lt;br /&gt;
&lt;br /&gt;
===Throughput===&lt;br /&gt;
&lt;br /&gt;
Throughput is the most fundamental measure of performance, and the one that ultimately matters most to computational scientists -- if you have N computations that you need to have done for your research project, how quickly can you get them done?   Everything else we'll consider here is just a&lt;br /&gt;
way of increasing throughput T:&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
T = \frac{\mathrm{Number}\,\mathrm{of}\,\mathrm{computations}}{\mathrm{Unit}\,\mathrm{time}} .&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have many independent computations to perform (such as a parameter study or a sensitivity study) you can increase throughput almost arbitrarily by running them alongside each other at the same time, limited only by the number of processors available (or the wait time in the queue, or the disk space available, or some other external resource constraint).   This approach obviously doesn't work if you only have one computation to perform, or if later&lt;br /&gt;
computations require the output from previous ones.   In these cases, or when  the individual jobs take infeasibly long, or cannot be performed on only one processor,  one must resort to ''also'' using parallel programming techniques to parallelize the individual jobs.&lt;br /&gt;
&lt;br /&gt;
===Compute Time===&lt;br /&gt;
&lt;br /&gt;
Fundamental to everything else that follows is measuring the amount of time a computation takes on some problem size/amount of work &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; and some number of processors &amp;lt;math&amp;gt;P&amp;lt;/math&amp;gt;.   We'll denote this by &amp;lt;math&amp;gt;t(N,P)&amp;lt;/math&amp;gt;.   The easiest way to measure this time is with the &amp;lt;tt&amp;gt;time&amp;lt;/tt&amp;gt; command that comes on most flavours of Unix in &amp;lt;tt&amp;gt;/bin/time&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;/usr/bin/time&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/bin/time myprogram&lt;br /&gt;
...normal program output...&lt;br /&gt;
658.44user 0.85system 10:59.41elapsed &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The format of the times output at the end may vary from system to system, but the basic information returned will be the same.  The ''real'' or ''elapsed'' time listed is the actual [[Wallclock time]] that elapsed during the run, ''user'' or ''cpu'' is the [[CPU time]] that was actually spent doing your computation, and the ''system'' time is the system time that was spent doing system-related things during the run, such as waiting for file input/output.   Our goal will be to reduce the real wallclock time that the simulation takes as much as possible while still making efficient use of the resources available.&lt;br /&gt;
&lt;br /&gt;
===Parallel Speedup===&lt;br /&gt;
&lt;br /&gt;
The speedup of an individual job with some amount of work &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; as you go from running it serially to running it on &amp;lt;math&amp;gt;P&amp;lt;/math&amp;gt; processors is simply:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;S(N,P) = \frac{t(N,P=1)}{t(N,P)} .&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That is, the ratio of the time the computation takes on one processor to the time it takes on P.   The way this is usually done is to run the parallel code on &amp;lt;math&amp;gt;P&amp;lt;/math&amp;gt; and&lt;br /&gt;
on &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; processor and take the ratio of the two times; but this is a form of cheating, as the parallel version of the code will generally have&lt;br /&gt;
overheads (even in the one-processor case) compared to the best available serial-only version of the code.   The best thing to do in considering the efficiency of the parallelization is to compare the parallel code to the best available serial code that does the same job.&lt;br /&gt;
&lt;br /&gt;
If you are considering the speedup of a problem that doesn't fit onto one processor, of course, the concept of speedup can be generalized; one needn't start at &amp;lt;math&amp;gt;P=1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
It should go without saying that, while developing your parallel code and during performance tuning, you should check that you get the same results with multiple processors as with some `known good' serial test case; it is even easier to introduce bugs in parallel code than in serial code!&lt;br /&gt;
&lt;br /&gt;
===Efficiency===&lt;br /&gt;
&lt;br /&gt;
Once you have a parallel code and some timing results, you can look at how efficiently you are making use of the resources as you use more and more processors.&lt;br /&gt;
The parallel efficiency of a computation of some fixed work size running on &amp;lt;math&amp;gt;P&amp;lt;/math&amp;gt; processors as compared to the &amp;lt;math&amp;gt;P=1&amp;lt;/math&amp;gt; case is &lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;E = \frac{S(N,P)}{P}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That is, if you get a speedup of &amp;lt;math&amp;gt;8 \times&amp;lt;/math&amp;gt; in going from one to eight processors, you are at 1.00 or 100% efficiency; anything less and you are at lower efficiency.  It isn't uncommon to achieve greater than 100% parallel efficiency for small numbers of processors for some types of problems; as you go to more processors, you also have more processor cache, and thus more of the problem's data can fit into fast cache.  This is called ''super-linear speedup'' and sadly seldom extends out to very many processors.  &lt;br /&gt;
&lt;br /&gt;
===Strong Scaling Tests===&lt;br /&gt;
&lt;br /&gt;
[[Image:scaling-example.png|thumb|right|320px|An example of a strong scaling test]]&lt;br /&gt;
&lt;br /&gt;
The figure to the right and data below shows an example of a result of a small strong scaling test --- running a fixed-size problem on a varying number of processors to see how the timing of the computation scales with the number of processors.   The code was an OpenMP code run on a node of the GPC.  The quantitative results follow below; the times were measured and then speedups and efficiencies were calculated as above.   &lt;br /&gt;
&lt;br /&gt;
{| &lt;br /&gt;
! P&lt;br /&gt;
! t(N,P)&lt;br /&gt;
! S(N,P)&lt;br /&gt;
! E(N,P)&lt;br /&gt;
|-&lt;br /&gt;
| 1 || 3:50 ||  -  ||  -   &lt;br /&gt;
|-&lt;br /&gt;
| 2 || 2:02 || 1.87x || 94 % &lt;br /&gt;
|-&lt;br /&gt;
| 4 || 1:05 || 3.52x || 88 %&lt;br /&gt;
|-&lt;br /&gt;
| 6 || 47.8 || 4.81x || 80 %&lt;br /&gt;
|-&lt;br /&gt;
| 8 || 43.6 || 5.28x || 66%&lt;br /&gt;
|}&lt;br /&gt;
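&lt;br /&gt;
As a worked example, take the last row of the table above: 3:50 is 230 seconds, so&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
S(N,P=8) = \frac{230\,\mathrm{s}}{43.6\,\mathrm{s}} \approx 5.28, \qquad E = \frac{5.28}{8} \approx 0.66 = 66\% .&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;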
&lt;br /&gt;
The plot shows the compute time &amp;lt;math&amp;gt;t(N,P)&amp;lt;/math&amp;gt; as a function of P; if the code maintained 100% parallel efficiency, we would expect the scaling to be&lt;br /&gt;
as 1/P, so we plot it on a log-log scale.  Also shown is the ideal scaling case -- what the times would be if, using the &amp;lt;math&amp;gt;P=1&amp;lt;/math&amp;gt; timing as a normalization, we did get 100% efficiency.   We can see that past 4 cores the measured case starts to significantly deviate from the ideal, and it looks like things would only get worse past 8 cores.&lt;br /&gt;
&lt;br /&gt;
It's important to note here that scaling tests should be done on realistic problem sizes and for realistic lengths of time.   Generally, for either serial or parallel programs there will be some overhead both at initialization time and during the course of the computation; if the problem size is too small, the overhead during the course of the run might be a significant fraction of the real work, and the program will behave needlessly poorly.  Similarly, if the number of timesteps or iterations is too small, the initialization overhead will play a spuriously large role in the performance.&lt;br /&gt;
&lt;br /&gt;
The above behaviour is typical for a small computation; it won't scale to too many cores, and the efficiency becomes monotonically worse as one increases the number of cores in use.   The rate at which this happens will depend on the problem size and the type of computation.   How is one to tell where to stop;&lt;br /&gt;
how good an efficiency is good enough?    Certainly there are rules of thumb --- one shudders to see efficiencies below 50% --- but one can arrive at more meaningful and quantitative results by considering throughput.   Let's imagine we had 64 cores at our disposal, and we wanted to run 96 jobs as quickly as possible.   Our total time to completion of the 96 jobs would vary with the number of cores we ran per job as follows:&lt;br /&gt;
&lt;br /&gt;
{| &lt;br /&gt;
! P&lt;br /&gt;
! Time for one job&lt;br /&gt;
! Time for all 96 jobs&lt;br /&gt;
|-&lt;br /&gt;
| 1 || 3:50 || 7:40  (2 batches, 64 jobs then 32)&lt;br /&gt;
|- &lt;br /&gt;
| 2 || 2:02 || 7:08 (3 batches, 32,32,32)&lt;br /&gt;
|-&lt;br /&gt;
| 4 || 1:05 || 6:30 (6 batches, 6x16)&lt;br /&gt;
|-&lt;br /&gt;
| 6 || 47.8 || 7:58 (10 batches, 9x10, 6)&lt;br /&gt;
|-&lt;br /&gt;
| 8 || 43.6 || 8:43 (12 batches)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
If we use more than 4 processes per job in this case, it will actually take us longer to do all our runs!  For jobs that scale better with the number of processes (this could be a different program, or the same program with different problem size), we will find this turnover point to be at higher &amp;lt;math&amp;gt;P&amp;lt;/math&amp;gt;; for jobs that scale worse, lower &amp;lt;math&amp;gt;P&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
===Weak Scaling Tests===&lt;br /&gt;
&lt;br /&gt;
[[Image:weak-scaling-example.png|thumb|right|320px|An example of a weak scaling test]]&lt;br /&gt;
&lt;br /&gt;
The strong scaling test described above considers the performance of a parallel code with a fixed work size as the number of processors varies; this tells us how the parallel overhead behaves as you go to more and more processors.   A weak scaling test fixes the amount of work '''per processor''' and compares the execution time over number of processors.   Since each processor has the same amount to do, in the ideal case the execution time should remain constant.   While the strong scaling test tells you how the parallel overhead scales with &amp;lt;math&amp;gt;P&amp;lt;/math&amp;gt;, the weak scaling test tells you something weaker -- whether the parallel overhead varies faster or slower than the amount of work.   &lt;br /&gt;
&lt;br /&gt;
Nonetheless, the weak scaling test can be the relevant one for determining how large a problem size one can efficiently compute with a given parallel code and system.    An example of results for a weak scaling test on the GPC and TCS up to 256 processors (8 nodes of the TCS, 32 of the GPC) is shown to the right.   In this case we are maintaining extremely good efficiency up to at least 128 processors with constant work per process on both architectures.  It is possible to see different behaviour when first filling up a node (eg, for less than 8 processes for the GPC, or 64 for TCS) than when one starts crossing  nodes; one should understand this but it doesn't necessarily indicate problems.&lt;br /&gt;
&lt;br /&gt;
==Performance Tuning==&lt;br /&gt;
&lt;br /&gt;
'''You cannot improve what you cannot measure.'''   Performance tuning is an iterative process of running an '''instrumented''' version of your code, getting data on performance throughout the code, and attempting to make changes to the code that will make it run more efficiently.&lt;br /&gt;
&lt;br /&gt;
There are three main ways of instrumenting a code to find its performance.  The first is '''manually adding timers''' around important parts of the code to find out how much time is spent in each part.   This is worth thinking about doing when putting together a new code, as it means that you'll have a very robust way of finding out how well the different parts of the code perform on different platforms and with different compiler options, etc..  The results are, however, necessarily very coarse-grained; they are very useful for comparing performance under different situations, but give very little information about whether or not there are performance problems or what they might be.&lt;br /&gt;
&lt;br /&gt;
The second technique is '''sampling''', sometimes called `program counter sampling' or `statistical sampling'.   In this case, the program is run in an environment where it is interrupted briefly at some set frequency (typically something like 100 times per second) and the location of the program counter is jotted down before the program is resumed.  At the end of the program, these locations are translated into locations in the source code, and one has a statistical profile of where the program has spent its time.  &lt;br /&gt;
&lt;br /&gt;
Statistical sampling has several advantages.  It has a very low overhead --- the sampling procedure for instance takes much less time than a function call to a timer routine --- so that the program runs much as it would without the measurement process.  If the samples are taken often enough, the result is a very accurate picture of where your program is spending its time, allowing you to very quickly identify `hotspots' in the code and focus your attention on the most costly areas of the program.   This combination of relevant information and low-overhead makes statistical sampling the first resort for serious performance measurement.&lt;br /&gt;
&lt;br /&gt;
Sampling, however, has drawbacks.  While it lets you know where the program is spending its time, it doesn't tell you why, or how it got there in the first place.   For instance, in a parallel program you may be spending too much time in barriers of one sort or another (perhaps at &amp;lt;tt&amp;gt;MPI_WAITALL&amp;lt;/tt&amp;gt; calls in MPI, or implicit barriers at the end of &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; sections in OpenMP) but unless you know where in the code that routine was called from, you can't address the problem.   In this case you need some sort of '''trace''' through the program which keeps track of which routine called what.   This is generally a much heavier-weight process, which can substantially increase the runtime of the code, running the risk of 'the Heisenberg effect' - measurement changing the system under observation.  On the other hand, sometimes you just need that level of information, so tracing packages or libraries must be used.&lt;br /&gt;
&lt;br /&gt;
A related method is the use of '''hardware counters''' --- counters within the CPU itself which keep track of performance-related information, such as the number of cache misses or branch mis-predictions within your code.   Using this information, either regularly throughout the code or once for the entire code run can give very specific information about performance problems.   Right now these counters are available on the TCS system but not on the GPC system, as the mainstream Linux kernel does not provide access to these counters.&lt;br /&gt;
&lt;br /&gt;
===Simple Timer Wrappers: C, FORTRAN===&lt;br /&gt;
&lt;br /&gt;
Below are some simple examples of timers in C and FORTRAN which can be called before and after blocks of code to give wallclock times (in seconds), providing coarse-grained timings for sections of your code.   Other approaches are possible.&lt;br /&gt;
&lt;br /&gt;
Simple timers in C:&lt;br /&gt;
&amp;lt;source lang=c&amp;gt;&lt;br /&gt;
#include &amp;lt;sys/time.h&amp;gt;  /* for struct timeval and gettimeofday() */&lt;br /&gt;
&lt;br /&gt;
/* record the current time in *t */&lt;br /&gt;
void tick(struct timeval *t) {&lt;br /&gt;
    gettimeofday(t, NULL);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
/* returns time in seconds from now to time described by t */&lt;br /&gt;
double tock(struct timeval *t) {&lt;br /&gt;
    struct timeval now;&lt;br /&gt;
    gettimeofday(&amp;amp;now, NULL);&lt;br /&gt;
    return (double)(now.tv_sec - t-&amp;gt;tv_sec) + ((double)(now.tv_usec - t-&amp;gt;tv_usec)/1000000.);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and how to use them:&lt;br /&gt;
&amp;lt;source lang=c&amp;gt;&lt;br /&gt;
#include &amp;lt;sys/time.h&amp;gt;&lt;br /&gt;
struct timeval init, calc, io;&lt;br /&gt;
double inittime, calctime, iotime;&lt;br /&gt;
&lt;br /&gt;
    /*... */&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
tick(&amp;amp;init);&lt;br /&gt;
/* do initialization */&lt;br /&gt;
inittime = tock(&amp;amp;init);&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
tick(&amp;amp;calc);&lt;br /&gt;
/* do big computation */&lt;br /&gt;
calctime = tock(&amp;amp;calc);&lt;br /&gt;
&lt;br /&gt;
tick(&amp;amp;io);&lt;br /&gt;
/* do IO */&lt;br /&gt;
iotime = tock(&amp;amp;io);&lt;br /&gt;
&lt;br /&gt;
printf(&amp;quot;Timing summary:\n\tInit: %8.5f sec\n\tCalc: %8.5f sec\n\tI/O : %8.5f sec\n&amp;quot;,&lt;br /&gt;
        inittime, calctime, iotime);&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Simple timers in FORTRAN:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=fortran&amp;gt;&lt;br /&gt;
subroutine tick(t)&lt;br /&gt;
    integer, intent(OUT) :: t&lt;br /&gt;
&lt;br /&gt;
    call system_clock(t)&lt;br /&gt;
end subroutine tick&lt;br /&gt;
&lt;br /&gt;
! returns time in seconds from now to time described by t &lt;br /&gt;
real function tock(t)&lt;br /&gt;
    integer, intent(in) :: t&lt;br /&gt;
    integer :: now, clock_rate&lt;br /&gt;
&lt;br /&gt;
    call system_clock(now,clock_rate)&lt;br /&gt;
&lt;br /&gt;
    tock = real(now - t)/real(clock_rate)&lt;br /&gt;
end function tock&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
And using them:&lt;br /&gt;
&amp;lt;source lang=fortran&amp;gt;&lt;br /&gt;
 call tick(calc)&lt;br /&gt;
!  do big calculation&lt;br /&gt;
 calctime = tock(calc)&lt;br /&gt;
&lt;br /&gt;
 print *,'Timing summary'&lt;br /&gt;
 print *,'Calc: ', calctime&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Command-line Performance Tools==&lt;br /&gt;
&lt;br /&gt;
Many of the tools below can be used to examine both serial and parallel performance problems with a code.  We'd like to encourage you to tune serial performance first.  Worrying about parallel performance before the code performs well with a single task doesn't make much sense!  Profiling your code when running with one task allows you to spot serial `hot spots' for optimization, as well as giving you a more detailed understanding of where your program spends its time.   Further, any performance improvements you make in the serial code will automatically speed up your parallel code.&lt;br /&gt;
&lt;br /&gt;
We've already talked about coarse-grained measurements such as timers within the code and using tools such as &amp;lt;tt&amp;gt;/bin/time&amp;lt;/tt&amp;gt;.  These are very useful for comparing overall performance between different platforms/parameters, but we won't need to discuss them further here.&lt;br /&gt;
&lt;br /&gt;
===gprof (profiling: everywhere)===&lt;br /&gt;
&lt;br /&gt;
A statistical sampling workhorse is &amp;lt;tt&amp;gt;gprof&amp;lt;/tt&amp;gt;, the GNU version of an old common Unix utility called prof.  To use this, the code must be re-compiled with both source-code symbols intact (&amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt;) and with profiling information available (for most compilers, this is &amp;lt;tt&amp;gt;-pg&amp;lt;/tt&amp;gt;; for the IBM compilers (xlf, xlc, xlC) it is &amp;lt;tt&amp;gt;-p&amp;lt;/tt&amp;gt;).  It is worth knowing because of its ubiquity, and because it contains much of the functionality of newer tools, so the same concepts occur in other tools.&lt;br /&gt;
&lt;br /&gt;
So let's consider the following trivial program &amp;lt;tt&amp;gt;pi.c&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;c&amp;quot;&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;
#include &amp;lt;time.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double calc_pi(long n) {&lt;br /&gt;
    long in = 0;&lt;br /&gt;
    long out = 0;&lt;br /&gt;
    long i;&lt;br /&gt;
    double x,y;&lt;br /&gt;
&lt;br /&gt;
    for (i=0; i&amp;lt;n; i++) {&lt;br /&gt;
        x = drand48();&lt;br /&gt;
        y = drand48();&lt;br /&gt;
        if (x*x+y*y &amp;lt; 1) {&lt;br /&gt;
            in++;&lt;br /&gt;
        } else {&lt;br /&gt;
            out++;&lt;br /&gt;
        }&lt;br /&gt;
    }&lt;br /&gt;
&lt;br /&gt;
    return 4.*(double)in/(double)(in+out);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char **argv) {&lt;br /&gt;
    long n, defaultn=100000;&lt;br /&gt;
    double pi;&lt;br /&gt;
    time_t t;&lt;br /&gt;
&lt;br /&gt;
    /* seed random number generator */&lt;br /&gt;
    srand48(time(&amp;amp;t));&lt;br /&gt;
&lt;br /&gt;
    /* get number of tries */&lt;br /&gt;
    if (argc &amp;lt; 2 || (n=atoi(argv[1]))&amp;lt;1) {&lt;br /&gt;
        n = defaultn;&lt;br /&gt;
        printf(&amp;quot;Using default n = %ld\n&amp;quot;, n);&lt;br /&gt;
    }&lt;br /&gt;
&lt;br /&gt;
    pi = calc_pi(n);&lt;br /&gt;
    printf(&amp;quot;Pi = %lf\n&amp;quot;, pi);&lt;br /&gt;
&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;   &lt;br /&gt;
&lt;br /&gt;
We can compile this with profiling on and run it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ gcc -g -pg -o pi pi.c&lt;br /&gt;
$ ./pi 100000000&lt;br /&gt;
Pi = 3.141804&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Note that this isn't a very good way of calculating pi!).  On exit, this program creates a file called &amp;lt;tt&amp;gt;gmon.out&amp;lt;/tt&amp;gt;; this contains the profiling information about the run of the code.  We can take a look at this by using &amp;lt;tt&amp;gt;gprof&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ gprof pi gmon.out &lt;br /&gt;
Flat profile:&lt;br /&gt;
&lt;br /&gt;
Each sample counts as 0.01 seconds.&lt;br /&gt;
  %   cumulative   self              self     total           &lt;br /&gt;
 time   seconds   seconds    calls  ms/call  ms/call  name    &lt;br /&gt;
100.88      1.00     1.00        1   998.76   998.76  calc_pi&lt;br /&gt;
&lt;br /&gt;
index % time    self  children    called     name&lt;br /&gt;
                1.00    0.00       1/1           main [2]&lt;br /&gt;
[1]    100.0    1.00    0.00       1         calc_pi [1]&lt;br /&gt;
-----------------------------------------------&lt;br /&gt;
                                                 &amp;lt;spontaneous&amp;gt;&lt;br /&gt;
[2]    100.0    0.00    1.00                 main [2]&lt;br /&gt;
                1.00    0.00       1/1           calc_pi [1]&lt;br /&gt;
-----------------------------------------------&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first part tells us that essentially all of the time spent running was in the &amp;lt;tt&amp;gt;calc_pi()&amp;lt;/tt&amp;gt; routine (of course), and the second part attempts to be a call graph, showing that &amp;lt;tt&amp;gt;main&amp;lt;/tt&amp;gt; called &amp;lt;tt&amp;gt;calc_pi()&amp;lt;/tt&amp;gt; once.    An important concept in the timing is the `self' and `children' times for each routine, sometimes called the exclusive and inclusive times.   Because most routines call many other routines, it's often useful to distinguish between the total amount of time spent between starting and ending the routine (the `inclusive' time) and that same time excluding the time spent in child routines (the `exclusive' time).  &lt;br /&gt;
&lt;br /&gt;
The above results are fairly trivial and not very useful for this simple program, but in more complicated routines it can be very valuable to narrow down hotspots to particular regions of code.&lt;br /&gt;
&lt;br /&gt;
[[Image:Xprofiler.png|thumb|300px|The AIX tool &amp;lt;tt&amp;gt;Xprof&amp;lt;/tt&amp;gt; gives a visual representation of the &amp;lt;tt&amp;gt;gprof&amp;lt;/tt&amp;gt; output.]]&lt;br /&gt;
&lt;br /&gt;
In fact, gprof also allows you to view the time spent in the code by lines of code.   As you chop the program up finer, the statistical sampling gets less accurate; thus to look at the results by line of code you must be sure that your sample run was long enough to get meaningful data.  But the results can be extremely useful:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ gprof --line pi gmon.out&lt;br /&gt;
Flat profile:&lt;br /&gt;
&lt;br /&gt;
Each sample counts as 0.01 seconds.&lt;br /&gt;
  %   cumulative   self              self     total           &lt;br /&gt;
 time   seconds   seconds    calls  Ts/call  Ts/call  name    &lt;br /&gt;
 70.31      0.70     0.70                             calc_pi (pi.c:14 @ 40078b)&lt;br /&gt;
 14.27      0.84     0.14                             calc_pi (pi.c:17 @ 4007bc)&lt;br /&gt;
  5.10      0.89     0.05                             calc_pi (pi.c:11 @ 4007c1)&lt;br /&gt;
  4.08      0.93     0.04                             calc_pi (pi.c:15 @ 4007b5)&lt;br /&gt;
  3.06      0.96     0.03                             calc_pi (pi.c:13 @ 400781)&lt;br /&gt;
  2.55      0.98     0.03                             calc_pi (pi.c:12 @ 400777)&lt;br /&gt;
  1.53      1.00     0.02                             calc_pi (pi.c:11 @ 40076d)&lt;br /&gt;
  0.00      1.00     0.00        1     0.00     0.00  calc_pi (pi.c:5 @ 40074c)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where now we can see that the single line containing the radius calculation (&amp;lt;tt&amp;gt;if (x*x+y*y &amp;lt; 1)&amp;lt;/tt&amp;gt;) is 70% of the work for the entire program.  This tells you where you should spend your time to optimize the code.   Other tools exist for this sort of line-by-line analysis; &amp;lt;tt&amp;gt;gcov&amp;lt;/tt&amp;gt; in the gcc compiler suite counts the number of times a given source line is executed - the idea was for coverage analysis for test suites, but it certainly can be used for profiling as well; however, usually the amount of time spent at a line is more important than the number of executions.&lt;br /&gt;
&lt;br /&gt;
For parallel programs, &amp;lt;tt&amp;gt;gprof&amp;lt;/tt&amp;gt; will generally output a separate &amp;lt;tt&amp;gt;gmon.out&amp;lt;/tt&amp;gt; file for each process; for threaded applications, output for all threads will be summed into the same &amp;lt;tt&amp;gt;gmon.out&amp;lt;/tt&amp;gt;.   It may be useful to sum up all the results and view them with gprof, or to look at them individually.&lt;br /&gt;
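&lt;br /&gt;
As a sketch of the summing approach (assuming the per-process profiles have been renamed to gmon.out.0, gmon.out.1, and so on, which is not automatic): gprof's &amp;lt;tt&amp;gt;-s&amp;lt;/tt&amp;gt; option merges profile data into a file called gmon.sum, which can then be viewed as usual:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gprof -s ./myprogram gmon.out.*&lt;br /&gt;
gprof ./myprogram gmon.sum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;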
&lt;br /&gt;
There are other tools for looking at the same data.   For instance, on the TCS system, the command &amp;lt;tt&amp;gt;Xprof&amp;lt;/tt&amp;gt; &lt;br /&gt;
(run the same way as &amp;lt;tt&amp;gt;gprof&amp;lt;/tt&amp;gt;; &amp;lt;tt&amp;gt;Xprof program_name gmon.out&amp;lt;/tt&amp;gt;) lets you look at the call tree as a graphical tree.  Each routine is shown by a block with a size proportional to the time spent in each routine; the width is the inclusive time, and the height is the exclusive time.&lt;br /&gt;
&lt;br /&gt;
===hpmcount (performance counters: TCS)===&lt;br /&gt;
&lt;br /&gt;
On the TCS, &amp;lt;tt&amp;gt;hpmcount&amp;lt;/tt&amp;gt; allows the querying of the performance counter values over the course of a run.  Since here we are simply asking the CPU to report values it obtains during the run of a program, the code does not need to be instrumented; simply typing&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
hpmcount hpmcount_args program_name program_args&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
will run the program and output the results from the hardware performance counters at the end.  So for instance, with our trivial pi program above,&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tcs-f11n05-$ hpmcount ./pi&lt;br /&gt;
Using default n = 100000&lt;br /&gt;
Pi = 3.144240&lt;br /&gt;
 Execution time (wall clock time): 0.020325 seconds&lt;br /&gt;
&lt;br /&gt;
 ########  Resource Usage Statistics  ########  &lt;br /&gt;
&lt;br /&gt;
 Total amount of time in user mode            : 0.012754 seconds&lt;br /&gt;
 Total amount of time in system mode          : 0.001486 seconds&lt;br /&gt;
 Maximum resident set size                    : 440 Kbytes&lt;br /&gt;
 Average shared memory use in text segment    : 0 Kbytes*sec&lt;br /&gt;
 Average unshared memory use in data segment  : 0 Kbytes*sec&lt;br /&gt;
 Number of page faults without I/O activity   : 53&lt;br /&gt;
 Number of page faults with I/O activity      : 1&lt;br /&gt;
 Number of times process was swapped out      : 0&lt;br /&gt;
 Number of times file system performed INPUT  : 0&lt;br /&gt;
 Number of times file system performed OUTPUT : 0&lt;br /&gt;
 Number of IPC messages sent                  : 0&lt;br /&gt;
 Number of IPC messages received              : 0&lt;br /&gt;
 Number of signals delivered                  : 0&lt;br /&gt;
 Number of voluntary context switches         : 6&lt;br /&gt;
 Number of involuntary context switches       : 0&lt;br /&gt;
&lt;br /&gt;
 #######  End of Resource Statistics  ########&lt;br /&gt;
&lt;br /&gt;
 Set: 1&lt;br /&gt;
 Counting duration: 0.014947083 seconds&lt;br /&gt;
  PM_FPU_1FLOP (FPU executed one flop instruction )          :          400093&lt;br /&gt;
  PM_FPU_FMA (FPU executed multiply-add instruction)         :          500030&lt;br /&gt;
  PM_FPU_FSQRT_FDIV (FPU executed FSQRT or FDIV instruction) :               1&lt;br /&gt;
  PM_CYC (Processor cycles)                                  :        58485795&lt;br /&gt;
  PM_RUN_INST_CMPL (Run instructions completed)              :        24238152&lt;br /&gt;
  PM_RUN_CYC (Run cycles)                                    :        70307511&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
  Utilization rate                                 :          61.172 %&lt;br /&gt;
  Flop                                             :           1.400 Mflop&lt;br /&gt;
  Flop rate (flops / WCT)                          :          68.888 Mflop/s&lt;br /&gt;
  Flops / user time                                :         112.614 Mflop/s&lt;br /&gt;
  FMA percentage                                   :         111.103 %&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
There are a variety of sets of performance counters that can be reported; the default set isn't especially helpful for HPC-type computations; sets of performance counters can be specified on the commandline in the format  &amp;lt;tt&amp;gt;-d -s item,item,item&amp;lt;/tt&amp;gt;.  Sets 5 and 12 are very useful for showing memory performance (showing L1 and L2 cache misses) and set 6 is especially useful for shared memory profiling, giving statistics about how often off-processor memory had to be accessed.&lt;br /&gt;
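&lt;br /&gt;
For example, following the option format described above, one might request the memory-related counter sets for the toy pi program (this is a sketch; check the hpmcount documentation on the TCS for the exact syntax):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
hpmcount -s 5,12 ./pi&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;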
&lt;br /&gt;
Showing the counters for the entire program will often tell you if there's a problem or not, but won't tell you where it is.  For more detailed information, one can [http://www.ncsa.uiuc.edu/UserInfo/Resources/Software/Tools/HPMToolkit/HPM_2_5_2.AIX.html  use the hpm library] to manually instrument different regions of your code, and get similar outputs to above for several different, smaller, regions of code.&lt;br /&gt;
&lt;br /&gt;
On the linux side, &amp;lt;tt&amp;gt;oprofile&amp;lt;/tt&amp;gt; allows the reporting of similar information, but to use it one must have root access to the linux machine.&lt;br /&gt;
&lt;br /&gt;
===cachegrind (Memory use analysis: GPC)===&lt;br /&gt;
&lt;br /&gt;
[[Image:Kcachegrind.png|thumb|kcachegrind, part of the KDE development package, can give graphical overviews of the output from cachegrind]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;[http://valgrind.org/ valgrind]&amp;lt;/tt&amp;gt; is a memory tool that is usually thought of in terms of finding memory-access bugs in large programs.  Rather than instrumenting a code or measuring counters, valgrind takes a fairly extreme approach -- it emulates your program running on a computer, essentially running a simulation of your program running on the same kind of computer valgrind is running on.   This has enormous overhead (runtimes can be up to 20x as long as normal) but the result is exquisitely detailed information about what your program is doing.&lt;br /&gt;
&lt;br /&gt;
Memory access is often a bottleneck for HPC codes, and cachegrind is a tool for valgrind which simulates the use of cache in your program, giving you line-by-line information on which parts of the code have cache performance issues.  Your code does not need to be recompiled, although compiling with &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt; is necessary for the output to be useful.   Cachegrind is run as shown:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
valgrind --tool=cachegrind myprogram myprogram_arguments&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Overall results for the whole program are given at the end of the program's normal output, and more detailed information is saved in a file whose name begins with &amp;lt;tt&amp;gt;cachegrind.out&amp;lt;/tt&amp;gt;.   These output files are plain text - readable in principle by humans, but it is much easier to see what is going on with visual tools like kcachegrind (shown to the right) or, eventually, valkyrie (which can also be used for &amp;lt;tt&amp;gt;memcheck&amp;lt;/tt&amp;gt; output).&lt;br /&gt;
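&lt;br /&gt;
If a graphical viewer isn't available, valgrind's &amp;lt;tt&amp;gt;cg_annotate&amp;lt;/tt&amp;gt; script will print a per-function summary of a cachegrind output file on the terminal; for example (the process id in the file name is a placeholder):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cg_annotate cachegrind.out.12345&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;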
&lt;br /&gt;
===IPM (MPI Tracing and hardware counters: GPC, TCS)===&lt;br /&gt;
&lt;br /&gt;
[[Image:IPM.png|thumb|IPM generates a series of webpages and graphs summarizing performance of your code which can then be viewed in a web browser]]&lt;br /&gt;
&lt;br /&gt;
IPM is the [http://ipm-hpc.sourceforge.net/ Integrated Performance Monitoring] framework, which monitors a variety of MPI and hardware performance information.   There are a number of IPM modules&lt;br /&gt;
on the TCS and GPC depending on which machine and compilers/MPI you use.&lt;br /&gt;
&lt;br /&gt;
These can be linked in to your MPI executable at link- or run-time, and generate &lt;br /&gt;
detailed output at the end of your run which can be parsed and produce&lt;br /&gt;
a nice set of HTML + graphics.&lt;br /&gt;
&lt;br /&gt;
Running IPM varies slightly on the GPC and the TCS.   In the GPC, because the default for the MPI libraries is to be&lt;br /&gt;
compiled in as dynamic, shared libraries, it is easiest just to compile your program as normal and link in these&lt;br /&gt;
libraries only when you are about to run your program.   In your submission script, run your program as so:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load ipm/[appropriate-version]&lt;br /&gt;
export LD_PRELOAD=${SCINET_IPM_LIB}/libipm.so&lt;br /&gt;
mpirun [...]&lt;br /&gt;
export LD_PRELOAD=&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the other hand, on TCS, it is easiest to link in the IPM libraries at link time, with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
...-L${SCINET_IPM_LIB} -lipm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Either way, once the program is finished, in the directory it was run in&lt;br /&gt;
there will be a large XML file with an ugly name like&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;[username].[longnumber]&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To turn this into useful data, do:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load ipm/[appropriate-version]&lt;br /&gt;
$ ipm_parse -html [username].[longnumber]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That will produce a directory named something like &lt;br /&gt;
&amp;lt;pre&amp;gt;[executable].[username].[number].[node]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
which you can copy back to your &lt;br /&gt;
computer and view the index.html file within for lots of detailed info about your run.&lt;br /&gt;
&lt;br /&gt;
Note that on the GPC, there are IPM modules for &amp;quot;posix&amp;quot; and &amp;quot;mpiio&amp;quot;.  Which of these you use only matters if you want to do I/O profiling &lt;br /&gt;
as well as MPI profiling.   If you do want to do I/O tracing, you will have to be sure to choose the right module variant (mpiio if you do parallel I/O, either manually with MPI-I/O or through a parallel HDF5 or NetCDF library, and posix otherwise) and statically link the library into your executable at compile time:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
...-L${SCINET_IPM_LIB} -lipm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Graphical Performance Tools==&lt;br /&gt;
&lt;br /&gt;
While graphical performance tools typically measure the same things as their command-line counterparts, a graphical display opens up the possibility of aggregating much more information and displaying it flexibly in a variety of ways; this can be very helpful, especially in the initial stages of finding performance problems.&lt;br /&gt;
&lt;br /&gt;
===OpenSpeedShop (profiling, MPI tracing: GPC)===&lt;br /&gt;
&lt;br /&gt;
[[Image:Speedshop2.png|thumb|OpenSpeedShop, like gprof, will tell you where the hotspots are in the code, by function]]&lt;br /&gt;
[[Image:Speedshop1.png|thumb|...or by line of code]]&lt;br /&gt;
&lt;br /&gt;
[http://www.openspeedshop.org OpenSpeedShop] is a tool that is installed on the GPC; it is currently compiled with support only for gcc and OpenMPI.   To use it, &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load gcc python intel openmpi openspeedshop &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
It provides the functionality of gprof, with the addition of hardware counter measurements (not currently supported on GPC machines) and options for both lightweight and more detailed, heavier-weight profiling.  OpenSpeedShop also contains enhanced support for dealing with parallel runs, and for tracing MPI or I/O calls to find performance problems in those areas.   The parallel support goes considerably beyond what &amp;lt;tt&amp;gt;gprof&amp;lt;/tt&amp;gt; offers; bundling the data from thousands of tasks into one set of results is a significant algorithmic challenge in itself.  &lt;br /&gt;
&lt;br /&gt;
Another important addition, shared by many of the other graphical tools, is the idea of bundling results into different `experiments' --- bundles of an executable, measurement type, and resulting data --- which makes the iterative process of performance tuning much easier.  OpenSpeedShop, as with some other tools, has the ability to directly compare the results of different experiments, so one can more easily see if a particular change made things better or worse, and if so where.&lt;br /&gt;
&lt;br /&gt;
OpenSpeedShop does not require re-compilation of the executable (although, as with all these tools, for the correlation with the source code to be useful, the code should be compiled with debugging symbols; the option for this is almost universally &amp;lt;tt&amp;gt;-g&amp;lt;/tt&amp;gt; to the compiler and linker).   The code is then either instrumented, or run in an instrumented environment.   Shown to the right are two of the views available for examining the timing results of an OpenMP code.&lt;br /&gt;
&lt;br /&gt;
OpenSpeedShop can be launched from the command line and then used entirely through the GUI; there are a variety of `wizards' which guide you through choosing how to instrument and run your experiment:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ openss&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This should be done from a directory containing the source code and the executable.   This is an excellent way to get started with the tool.  Once one is more familiar with it, one can run a variety of experiments from the command line:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openss -f program_name pcsamp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where the above runs the &amp;lt;tt&amp;gt;pcsamp&amp;lt;/tt&amp;gt; (program counter sampling, as in gprof) measurement on the executable &amp;lt;tt&amp;gt;program_name&amp;lt;/tt&amp;gt;.  Then one can launch the GUI to view the results.   There are options for instrumenting the executable in a variety of ways, and for taking different measurements; the [http://www.openspeedshop.org OpenSpeedShop] web page contains links to documentation and tutorials.&lt;br /&gt;
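&lt;br /&gt;
Other experiments follow the same pattern.  For example, a sketch of a more detailed call-stack-sampling run -- assuming the &amp;lt;tt&amp;gt;usertime&amp;lt;/tt&amp;gt; experiment is enabled in the installed version -- would be:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openss -f program_name usertime&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;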
&lt;br /&gt;
===PeekPerf (profiling, TCS)===&lt;br /&gt;
&lt;br /&gt;
[[Image:PeekPerf.png|thumb|An example of using PeekPerf]]&lt;br /&gt;
&lt;br /&gt;
[http://domino.research.ibm.com/comm/research_projects.nsf/pages/hpct.index.html Peekperf] is IBM's single graphical `dashboard' providing access to many performance measurement tools for examining hardware counter data, threads, message passing, I/O, and memory access, several of which are available separately as command-line tools.  Like OpenSpeedShop, it does not require re-compilation of the executable; an instrumented version of the code is generated at run time, and this instrumented version is executed with whatever options you care to pass to it.   It does not have the same support for comparing experiments that OpenSpeedShop does; however, it allows running several different types of measurements at once and seeing how they correlate in a given run, which is something OpenSpeedShop doesn't offer.&lt;br /&gt;
&lt;br /&gt;
One starts peekperf at the command line&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ peekperf&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and tell peekperf which executable you wish to run measurements on.   You then highlight which sorts of measurements you wish to make (which sorts are available depends on the type of program - threaded, OpenMP, etc.), select `generate an instrumented executable', and then `run the instrumented executable', giving it the name of either the instrumented executable or a script that runs it; peekperf will then display the resulting data as soon as the run has completed.  &lt;br /&gt;
&lt;br /&gt;
Understanding the interface and resulting data takes some practice, and the documentation is quite sparse; however, the flexibility in the range of measurements it can take makes this an excellent source of performance information for programs running on the TCS system.&lt;br /&gt;
&lt;br /&gt;
===Scalasca (profiling, tracing: TCS, GPC)===&lt;br /&gt;
[[Image:Scalasca.png|thumb|An example of using Scalasca]]&lt;br /&gt;
&lt;br /&gt;
[http://www.scalasca.org  Scalasca] is a sophisticated tool which takes the aggregation of data shown in the above graphical tools one step further and analyzes the results to pinpoint and display common performance problems; it scales extremely well and the graphical display makes it very easy for the user to find out where the performance issues are.   To use scalasca, on either the TCS or the GPC load the scalasca module:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load scalasca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Scalasca requires the code to be recompiled, and it provides wrapper scripts to choose the right options for you.   If, for instance, your code is normally compiled with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ifort -c myprog.f &lt;br /&gt;
ifort -o myprog myprog.o -lm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
then one can instead use&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
scalasca -instrument ifort -c myprog.f&lt;br /&gt;
scalasca -instrument ifort -o myprog myprog.o -lm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that with the Intel compilers, if you're using OpenMP, you'll have to add the -pomp flag, e.g. &amp;lt;tt&amp;gt;scalasca -instrument -pomp&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Scalasca then parses the rest of the command line and adds the necessary flags.   (If you are curious, &amp;lt;tt&amp;gt;scalasca -instrument -v&amp;lt;/tt&amp;gt; will show you what the resulting command line actually is.)   There is also a shortcut, &amp;lt;tt&amp;gt;skin&amp;lt;/tt&amp;gt;, which is equivalent to &amp;lt;tt&amp;gt;scalasca -instrument&amp;lt;/tt&amp;gt;.  &lt;br /&gt;
&lt;br /&gt;
When the new executable is generated, it's run in a similar way; if you normally run your program as&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./myprog&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
or&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mpirun -np 5 ./myprog&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
you'd instead do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
scalasca -analyze ./myprog&lt;br /&gt;
scalasca -analyze mpirun -np 5 ./myprog&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The program will run as usual, with only a few additional lines of output about the measurement files being written.   (Again, there is a shortcut available; &amp;lt;tt&amp;gt;scan&amp;lt;/tt&amp;gt; is equivalent to &amp;lt;tt&amp;gt;scalasca -analyze&amp;lt;/tt&amp;gt;.)  To then look at the results with a graphical user interface, one uses &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
scalasca -examine [epik directory name]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
where the directory name is the one created by the analysis step.   This tries to pop up an X window; from an xterm on a Linux or Mac machine, you will have to log in via&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -Y login.scinet.utoronto.ca&lt;br /&gt;
scinet01$ ssh -Y [whichever devel node you're using]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and you will have to run this from one of the devel nodes, rather than one of the compute nodes.&lt;br /&gt;
&lt;br /&gt;
A screenshot of the results is shown to the right for an OpenMP program, where wait times at implicit barriers at the end of parallel sections are selected as the metric to show on the left; the middle panel shows the call tree indicating the context in which the delays occurred, and the panel on the right gives the breakdown for each thread.&lt;br /&gt;
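&lt;br /&gt;
Putting the pieces together, a minimal end-to-end sketch using the &amp;lt;tt&amp;gt;skin&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;scan&amp;lt;/tt&amp;gt; shortcuts might look like the following; the compiler wrapper and the epik directory name are illustrative, so check the name your run actually produces:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load scalasca&lt;br /&gt;
skin mpif77 -o myprog myprog.f        # instrument at compile/link time&lt;br /&gt;
scan mpirun -np 5 ./myprog            # run the instrumented executable&lt;br /&gt;
scalasca -examine epik_myprog_5_sum   # examine the resulting epik directory&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;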
&lt;br /&gt;
&amp;lt;!-- ===VTune/Thread Profiler (GPC)=== don't have license up for this yet --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Common Serial Performance Problems==&lt;br /&gt;
&lt;br /&gt;
===Poor use of cache===&lt;br /&gt;
A classic problem for scientific codes is memory bandwidth; the capacity to do on-chip floating-point or integer operations has grown much faster than the ability to get numbers onto the chip in the first place.   One way around this is to use various levels of memory cache: when one number is needed from memory, a whole line of data is brought in from (slow) external memory to fast on-chip cache.  This makes that first memory access modestly slower, but tends to greatly speed up overall performance, since if you are going to do something to data in one part of memory you are typically also going to do something to the neighboring values.&lt;br /&gt;
&lt;br /&gt;
If you take advantage of data locality --- accessing memory in some kind of order rather than jumping around in memory --- cache can greatly increase the performance of your code.  On the other hand, if you '''do''' jump around in memory a lot, cache will actually hurt your performance.    &lt;br /&gt;
&lt;br /&gt;
The classic way this comes up is in accessing multidimensional arrays.  The example below is simplified; most cases aren't this extreme (or obvious!) but the idea is the same.  Let's consider the following FORTRAN code, which simply iterates a few times through a modestly sized multidimensional array:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;f&amp;quot;&amp;gt;  &lt;br /&gt;
      program memaccess&lt;br /&gt;
&lt;br /&gt;
      integer, parameter :: li=32,lj=32,lk=32,ll=32,lm=32&lt;br /&gt;
      real, dimension(li,lj,lk,ll,lm) :: a&lt;br /&gt;
      integer :: i,j,k,l,m&lt;br /&gt;
      integer :: iter&lt;br /&gt;
      &lt;br /&gt;
      a = 0.&lt;br /&gt;
&lt;br /&gt;
      do iter=1,10&lt;br /&gt;
      do m=1,lm &lt;br /&gt;
         do l=1,ll&lt;br /&gt;
             do k=1,lk&lt;br /&gt;
                 do j=1,lj&lt;br /&gt;
                     do i=1,li&lt;br /&gt;
                        a(i,j,k,l,m) = a(i,j,k,l,m)+ i+j+k+l+m&lt;br /&gt;
                     enddo&lt;br /&gt;
                 enddo&lt;br /&gt;
             enddo&lt;br /&gt;
         enddo&lt;br /&gt;
      enddo&lt;br /&gt;
      enddo&lt;br /&gt;
&lt;br /&gt;
      end program&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
The above program, which we'll suggestively call &amp;lt;tt&amp;gt;memaccess-good.f&amp;lt;/tt&amp;gt;, accesses array elements in the order that FORTRAN places them in the computer's memory; FORTRAN lays out this array in memory as &amp;lt;tt&amp;gt;[a(1,1,1,1,1), a(2,1,1,1,1),... a(32,1,1,1,1), a(1,2,1,1,1)...]&amp;lt;/tt&amp;gt; and so on.  So by ordering our loops that way we are marching through memory in order, making maximum use of cache.    The resulting code can be timed:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ gfortran -O3 -o memaccess-good memaccess-good.f &lt;br /&gt;
$ time ./memaccess-good&lt;br /&gt;
&lt;br /&gt;
real    0m2.478s&lt;br /&gt;
user    0m2.337s&lt;br /&gt;
sys     0m0.094s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If we reverse the order of the loops, so that they go &amp;lt;tt&amp;gt;do i=1,li ... do j=1,lj ... do m=1,lm&amp;lt;/tt&amp;gt; from outermost to innermost, however, we get&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ gfortran -O3 -o memaccess-bad memaccess-bad.f&lt;br /&gt;
$ time ./memaccess-bad&lt;br /&gt;
&lt;br /&gt;
real    0m19.622s&lt;br /&gt;
user    0m19.101s&lt;br /&gt;
sys     0m0.098s&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
A factor of 8 worse!   Thus tools such as cachegrind can be extremely important for finding significant performance problems in memory-heavy codes.  &lt;br /&gt;
&lt;br /&gt;
C-based languages arrange their arrays the opposite way in memory, so that the equivalent array in C would go as &amp;lt;tt&amp;gt;[a[0][0][0][0][0], a[0][0][0][0][1], ... a[0][0][0][0][31], a[0][0][0][1][0], ... ]&amp;lt;/tt&amp;gt;; thus `bad' array access in FORTRAN looks like `good' array access in C, and vice versa.&lt;br /&gt;
&lt;br /&gt;
==Common OpenMP Performance Problems==&lt;br /&gt;
&lt;br /&gt;
==Common MPI Performance Problems==&lt;br /&gt;
===Overuse of MPI_BARRIER===&lt;br /&gt;
===Many Small Messages===&lt;br /&gt;
Typically, the time it takes for a message of size ''n'' to get from one node to another can be expressed in terms of a [[latency]] ''l'' and a [[bandwidth]] ''b'',&lt;br /&gt;
&amp;lt;math&amp;gt;t_c = l + \frac{n}{b} .&amp;lt;/math&amp;gt;&lt;br /&gt;
For small messages, the latency can dominate the cost of sending (and processing!) the message.  By&lt;br /&gt;
bundling many small messages into one, you can amortize that cost over many messages,  reducing&lt;br /&gt;
the time spent communicating.&lt;br /&gt;
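For instance, with an illustrative latency of &amp;lt;math&amp;gt;l = 1\,\mu\mathrm{s}&amp;lt;/math&amp;gt; and bandwidth of &amp;lt;math&amp;gt;b = 1&amp;lt;/math&amp;gt; GB/s, one 100-byte message takes about &amp;lt;math&amp;gt;1.1\,\mu\mathrm{s}&amp;lt;/math&amp;gt;, so 100 such messages sent separately cost roughly &amp;lt;math&amp;gt;110\,\mu\mathrm{s}&amp;lt;/math&amp;gt;, almost all of it latency; bundled into a single 10 kB message they cost only about &amp;lt;math&amp;gt;1\,\mu\mathrm{s} + 10\,\mu\mathrm{s} = 11\,\mu\mathrm{s}&amp;lt;/math&amp;gt;.  (These numbers are purely illustrative; the actual latency and bandwidth depend on the interconnect.)&lt;br /&gt;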
===Not overlapping computation and communications===&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7135</id>
		<title>User Serial</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7135"/>
		<updated>2014-07-30T20:10:35Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Use a whole node... */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===General considerations===&lt;br /&gt;
&lt;br /&gt;
====Use whole nodes...====&lt;br /&gt;
&lt;br /&gt;
When you submit a job on a SciNet system, it is run on one (or more than one) entire node - meaning that your job is occupying at least 8 processors for the duration of its run.  The SciNet systems are usually busy, with many researchers waiting in the queue for computational resources, so we require that you make full use of the nodes that your job is allocated, so that other researchers don't have to wait unnecessarily, and so that your jobs get as much work done for you while they run as possible.&lt;br /&gt;
&lt;br /&gt;
Often, the best way to make full use of the node is to run one large parallel computation; but sometimes it is beneficial to run several serial codes at the same time.  On this page, we discuss ways to run suites of serial computations at once, as efficiently as possible, using the full resources of the node.&lt;br /&gt;
&lt;br /&gt;
====...but not more.====&lt;br /&gt;
&lt;br /&gt;
When running multiple jobs on the same node, it is essential to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available &lt;br /&gt;
to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes &amp;lt;tt&amp;gt;gpc01..04&amp;lt;/tt&amp;gt;, and [[GPC_Quickstart#Memory_Configuration|somewhat more for some compute nodes]]).&lt;br /&gt;
So the jobs also have to be bunched in ways that will fit into 14GB.  If they use more than this, they will crash the node, inconveniencing you and other researchers waiting for that node.&lt;br /&gt;
&lt;br /&gt;
If that's not possible -- each individual job requires significantly in excess of ~1.75GB -- then it's possible to just run fewer jobs so that they do fit; but then, again there is an under-utilization problem.   In that case, the jobs are likely candidates for parallelization, and you can contact us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] and arrange a meeting with one of the technical analysts to help you do just that.&lt;br /&gt;
&lt;br /&gt;
If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores has, for certain types of tasks, increased some users' overall throughput by 10 to 30 percent.&lt;br /&gt;
&lt;br /&gt;
====Is your job really serial?====&lt;br /&gt;
&lt;br /&gt;
While your program may not be explicitly parallel, it may use some of SciNet's threaded libraries for numerical computations, which can make use of multiple processors.  In particular, SciNet's [[Python]] and [[R_Statistical_Package | R]] modules are compiled with aggressive optimization and use threaded numerical libraries which by default will make use of multiple cores for computations such as large matrix operations.  This can greatly speed up individual runs, but by less (usually much less) than a factor of 8.  If you do have many such computations to do, your [[Introduction_To_Performance#Throughput | throughput]] will be better - you will get more calculations done per unit time - if you turn off the threading and run multiple such computations at once.  Threading is turned off with the shell script line &amp;lt;tt&amp;gt;export OMP_NUM_THREADS=1&amp;lt;/tt&amp;gt;; that line will be included in the scripts below.  &lt;br /&gt;
&lt;br /&gt;
If your calculations do implicitly use threading, you may want to experiment to see what gives you the best performance - you may find that running 4 (or even 8) jobs with 2 threads each (&amp;lt;tt&amp;gt;OMP_NUM_THREADS=2&amp;lt;/tt&amp;gt;), or 2 jobs with 4 threads, gives better performance than 8 jobs with 1 thread (and almost certainly better than 1 job with 8 threads).  We'd encourage you to perform exactly such a [[Introduction_To_Performance#Strong_Scaling_Tests | scaling test]]; for a small up-front investment in time you may significantly speed up all the computations you need to do.&lt;br /&gt;
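&lt;br /&gt;
As a concrete illustration, a minimal sketch of the 4-jobs-with-2-threads-each variant (assuming job scripts &amp;lt;tt&amp;gt;dojob1&amp;lt;/tt&amp;gt;..&amp;lt;tt&amp;gt;dojob4&amp;lt;/tt&amp;gt; in directories &amp;lt;tt&amp;gt;jobdir1&amp;lt;/tt&amp;gt;..&amp;lt;tt&amp;gt;jobdir4&amp;lt;/tt&amp;gt;, as in the script in the next section) would replace the execution commands with:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# give each job 2 threads, and run 4 jobs at once to fill the 8 cores&lt;br /&gt;
export OMP_NUM_THREADS=2&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;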
&lt;br /&gt;
===Serial jobs of similar duration===&lt;br /&gt;
&lt;br /&gt;
The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a &lt;br /&gt;
bit like this&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; ampersand off 8 jobs and wait&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
(cd jobdir5; ./dojob5) &amp;amp;&lt;br /&gt;
(cd jobdir6; ./dojob6) &amp;amp;&lt;br /&gt;
(cd jobdir7; ./dojob7) &amp;amp;&lt;br /&gt;
(cd jobdir8; ./dojob8) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are a few important things to take note of here.  First, the &amp;lt;tt&amp;gt;'''wait'''&amp;lt;/tt&amp;gt;&lt;br /&gt;
command at the end is crucial; without it the job will terminate &lt;br /&gt;
immediately, killing the 8 programs you just started.&lt;br /&gt;
&lt;br /&gt;
Second is that it is important to group the programs by how long they &lt;br /&gt;
will take.   If (say) &amp;lt;tt&amp;gt;dojob8&amp;lt;/tt&amp;gt; takes 2 hours and the rest only take 1, &lt;br /&gt;
then for one hour 7 of the 8 cores on the GPC node are wasted; they are &lt;br /&gt;
sitting idle but are unavailable for other users, and the utilization of &lt;br /&gt;
this node over the whole run is only 56%.   This is the sort of thing &lt;br /&gt;
we'll notice, and users who don't make efficient use of the machine will &lt;br /&gt;
have their ability to use SciNet resources reduced.  If you have many serial jobs of varying length, &lt;br /&gt;
use the submission script to balance the computational load, as explained [[ #Serial jobs of varying duration | below]].&lt;br /&gt;
&lt;br /&gt;
Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel===&lt;br /&gt;
&lt;br /&gt;
GNU parallel is a really nice tool written by Ole Tange to run multiple serial jobs in&lt;br /&gt;
parallel. It allows you to keep the processors on each 8-core node busy, if you provide enough jobs to do.&lt;br /&gt;
&lt;br /&gt;
GNU parallel is accessible on the GPC in the module&lt;br /&gt;
&amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that there are several versions of gnu-parallel installed on the GPC; we recommend using the newest version. &lt;br /&gt;
&lt;br /&gt;
The citation for GNU Parallel is: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
It is easiest to demonstrate the usage of GNU parallel by&lt;br /&gt;
example. Suppose you have 16 jobs to do, that the durations of these jobs vary quite a bit, but that the average job duration is around 10 hours. You could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N gnu-parallel-example&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND&lt;br /&gt;
parallel -j 8 &amp;lt;&amp;lt;EOF&lt;br /&gt;
  cd jobdir1; ./dojob1; echo &amp;quot;job 1 finished&amp;quot;&lt;br /&gt;
  cd jobdir2; ./dojob2; echo &amp;quot;job 2 finished&amp;quot;&lt;br /&gt;
  cd jobdir3; ./dojob3; echo &amp;quot;job 3 finished&amp;quot;&lt;br /&gt;
  cd jobdir4; ./dojob4; echo &amp;quot;job 4 finished&amp;quot;&lt;br /&gt;
  cd jobdir5; ./dojob5; echo &amp;quot;job 5 finished&amp;quot;&lt;br /&gt;
  cd jobdir6; ./dojob6; echo &amp;quot;job 6 finished&amp;quot;&lt;br /&gt;
  cd jobdir7; ./dojob7; echo &amp;quot;job 7 finished&amp;quot;&lt;br /&gt;
  cd jobdir8; ./dojob8; echo &amp;quot;job 8 finished&amp;quot;&lt;br /&gt;
  cd jobdir9; ./dojob9; echo &amp;quot;job 9 finished&amp;quot;&lt;br /&gt;
  cd jobdir10; ./dojob10; echo &amp;quot;job 10 finished&amp;quot;&lt;br /&gt;
  cd jobdir11; ./dojob11; echo &amp;quot;job 11 finished&amp;quot;&lt;br /&gt;
  cd jobdir12; ./dojob12; echo &amp;quot;job 12 finished&amp;quot;&lt;br /&gt;
  cd jobdir13; ./dojob13; echo &amp;quot;job 13 finished&amp;quot;&lt;br /&gt;
  cd jobdir14; ./dojob14; echo &amp;quot;job 14 finished&amp;quot;&lt;br /&gt;
  cd jobdir15; ./dojob15; echo &amp;quot;job 15 finished&amp;quot;&lt;br /&gt;
  cd jobdir16; ./dojob16; echo &amp;quot;job 16 finished&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; parameter sets the number of jobs to run at the same time, but 16 jobs are lined up. Initially, 8 jobs are given to the 8 processors on the node. When one of the processors is done with its assigned job, it will get the next job instead of sitting idle until the other processors are done. While you would expect that on average this script should take 20 hours (each processor on average has to complete two jobs of 10 hours), there's a good chance that one of the processors gets two jobs that take more than 10 hours, so the job script requests 24 hours. How much more time you should ask for in practice depends on the spread in run times of the separate jobs.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of varying duration===&lt;br /&gt;
&lt;br /&gt;
If you have a lot (50+) of relatively short serial runs to do, '''of which the walltime varies''', and if you know that eight jobs fit in memory without issues, then writing all the commands explicitly in the jobscript can get tedious. If you follow the convention that the jobs are all started by auxiliary scripts called job&amp;lt;something&amp;gt;.sh, the following strategy in your submission script will maximize the CPU utilization. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamic&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*.sh&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Notes:&lt;br /&gt;
* As before, GNU Parallel keeps 8 jobs running at a time, and if one finishes, starts the next. This is an easy way to do ''load balancing''.&lt;br /&gt;
* You can in fact run more or fewer than 8 processes per node by modifying &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; argument.&lt;br /&gt;
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to the performance. In that case, running from the ramdisk may be an option (see the sketch after these notes).  &lt;br /&gt;
* When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch after the runs, or when the job is killed because time has run out.&lt;br /&gt;
* More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
* This script optimizes resource utility, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.&lt;br /&gt;
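&lt;br /&gt;
A minimal sketch of the ramdisk pattern mentioned above -- assuming the ramdisk is mounted at &amp;lt;tt&amp;gt;/dev/shm&amp;lt;/tt&amp;gt; (see the [[User_Ramdisk|Ramdisk wiki page]] for the actual setup); the directory and script names are illustrative:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# stage the run directory to the ramdisk, run there, then copy results back to scratch&lt;br /&gt;
mkdir -p /dev/shm/$USER&lt;br /&gt;
cp -r $SCRATCH/run1 /dev/shm/$USER/&lt;br /&gt;
(cd /dev/shm/$USER/run1; ./dojob1)&lt;br /&gt;
cp -r /dev/shm/$USER/run1 $SCRATCH/run1-done&lt;br /&gt;
rm -rf /dev/shm/$USER/run1&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;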
&lt;br /&gt;
===Version for more than 8 cores at once (still serial)===&lt;br /&gt;
&lt;br /&gt;
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one node and to use the following routine to distribute your processes amongst the cores. In this case, it is important to use the newer version of GNU parallel installed on the GPC.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=25:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamicMulti&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&lt;br /&gt;
# START PARALLEL JOBS USING NODE LIST IN $PBS_NODEFILE&lt;br /&gt;
seq 800 | parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Explanation:&lt;br /&gt;
* &amp;lt;tt&amp;gt;seq 800&amp;lt;/tt&amp;gt; outputs the numbers 1 through 800 on separate lines. This output is piped to (ie becomes the input of) the &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
* The point of the &amp;quot;seq 800&amp;quot; is that each line that you give to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; defines a new job. So here, there are 800 jobs.&lt;br /&gt;
* Each job runs a command, but because the numbers generated by seq are not commands, a real command is constructed, in this case, by the argument &amp;lt;tt&amp;gt;./myrun {}&amp;lt;/tt&amp;gt;. Here &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; is supposed to be the name of the application to run. The two curly brackets &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; get replaced by the line from the input, that is, by one of the numbers.&lt;br /&gt;
* So parallel will run the 800 commands:&amp;lt;br/&amp;gt;./myrun 1&amp;lt;br/&amp;gt;./myrun 2&amp;lt;br/&amp;gt;...&amp;lt;br/&amp;gt;./myrun 800&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;--sshloginfile $PBS_NODEFILE&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to look for the file named $PBS_NODEFILE which contains the host names of the nodes assigned to the current job (as stated above, it is automatically generated).&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to run 8 of these at a time on each of the hosts.&lt;br /&gt;
* The &amp;lt;tt&amp;gt;--workdir $PWD&amp;lt;/tt&amp;gt; sets the working directory on the other nodes to the working directory on the first node. Without this, the run tries to start from the wrong place and will most likely fail (unless using the latest gnu parallel module, 20130422, which by default puts you in $PWD on the remote node).&lt;br /&gt;
* Loaded modules should get automatically loaded on the remote nodes too for the latest gnu parallel module, but not for earlier ones.&lt;br /&gt;
* If you need an environment variable to be transferred from the job script to the remotely running subjobs, use &amp;lt;tt&amp;gt;--env ENVIRONMENTVARIABLE&amp;lt;/tt&amp;gt;.  SciNet's gnu-parallel modules automatically transfer &amp;lt;tt&amp;gt;OMP_NUM_THREADS&amp;lt;/tt&amp;gt;, and typical environment variables set by most modules.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* Of course, this is just an example of what you could do with GNU parallel. How you set up your specific run depends on how each of the runs would be started. One could, for instance, also prepare a file of commands to run and make that the input to parallel (see the sketch after these notes).&lt;br /&gt;
* Note that submitting several bunches to single nodes, as in the section above, is a more failsafe way of proceeding, since a node failure would only affect one of these bunches, rather than all runs. &lt;br /&gt;
* GNU Parallel can be passed a file with the list of nodes to which to ssh, using &amp;lt;tt&amp;gt;--sshloginfile&amp;lt;/tt&amp;gt; (thanks to Ole Tange for pointing this out). This list is automatically generated by the scheduler and its name is made available in the environment variable $PBS_NODEFILE.&lt;br /&gt;
* Alternatively, GNU Parallel can take a comma separated list of nodes given to its -S argument, but this would need to be constructed from the file $PBS_NODEFILE which contains all nodes assigned to the job, with each node duplicated 8x for the number of cores on each node.&lt;br /&gt;
* GNU Parallel reads lines of input and converts them to arguments in the execution command. The execution command is the last argument given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;, with &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; replaced by the lines of input.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;The --workdir argument is essential: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would likely not get any output at all!&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;This is no longer true for the latest GNU Parallel modules (20130422), which put you in the current directory on the remote nodes.&lt;br /&gt;
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.&lt;br /&gt;
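&lt;br /&gt;
A minimal sketch of the commands-file approach mentioned in the notes above; the file name &amp;lt;tt&amp;gt;commands.txt&amp;lt;/tt&amp;gt; and the commands in it are purely illustrative:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# one command per line; parallel runs up to 8 of them at a time on each node&lt;br /&gt;
cat &amp;gt; commands.txt &amp;lt;&amp;lt;EOF&lt;br /&gt;
./myrun input1.dat&lt;br /&gt;
./myrun input2.dat&lt;br /&gt;
./myrun input3.dat&lt;br /&gt;
EOF&lt;br /&gt;
parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD &amp;lt; commands.txt&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;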
&lt;br /&gt;
===More on GNU parallel=== &lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page contains options, such as how to make sure the output is not all scrambled, and examples.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel Reference===&lt;br /&gt;
* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
===Older scripts===&lt;br /&gt;
&lt;br /&gt;
Older scripts, which mimicked some of GNU parallel functionality, can be found on the [[Deprecated scripts]] page.&lt;br /&gt;
&lt;br /&gt;
--[[User:Rzon|Rzon]] 02:22, 14 Nov 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7134</id>
		<title>User Serial</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7134"/>
		<updated>2014-07-30T20:10:10Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* ...but not more. */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===General considerations===&lt;br /&gt;
&lt;br /&gt;
====Use a whole node...====&lt;br /&gt;
&lt;br /&gt;
When you submit a job on a SciNet system, it is run on one (or more than one) entire node - meaning that your job is occupying at least 8 processors for the duration of its run.  The SciNet systems are usually busy, with many researchers waiting in the queue for computational resources, so we require that you make full use of the nodes that your job is allocated, so that other researchers don't have to wait unnecessarily, and so that your jobs get as much work done for you while they run as possible.&lt;br /&gt;
&lt;br /&gt;
Often, the best way to make full use of the node is to run one large parallel computation; but sometimes it is beneficial to run several serial codes at the same time.  On this page, we discuss ways to run suites of serial computations at once, as efficiently as possible, using the full resources of the node.&lt;br /&gt;
&lt;br /&gt;
====...but not more.====&lt;br /&gt;
&lt;br /&gt;
When running multiple jobs on the same node, it is essential to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available &lt;br /&gt;
to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes &amp;lt;tt&amp;gt;gpc01..04&amp;lt;/tt&amp;gt;, and [[GPC_Quickstart#Memory_Configuration|somewhat more for some compute nodes]]).&lt;br /&gt;
So the jobs also have to be bunched in ways that will fit into 14GB.  If they use more than this, they will crash the node, inconveniencing you and other researchers waiting for that node.&lt;br /&gt;
&lt;br /&gt;
If that's not possible -- each individual job requires significantly in excess of ~1.75GB -- then it's possible to just run fewer jobs so that they do fit; but then, again there is an under-utilization problem.   In that case, the jobs are likely candidates for parallelization, and you can contact us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] and arrange a meeting with one of the technical analysts to help you do just that.&lt;br /&gt;
&lt;br /&gt;
If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores for certain types of tasks has increased some users overall throughput by 10 to 30 percent.&lt;br /&gt;
&lt;br /&gt;
====Is your job really serial?====&lt;br /&gt;
&lt;br /&gt;
While your program may not be explicitly parallel, it may use some of SciNet's threaded libraries for numerical computations, which can make use of multiple processors.  In particular, SciNet's [[Python]] and [[R_Statistical_Package | R]] modules are compiled with aggressive optimization and using threaded numerical libraries which by default will make use of multiple cores for computations such as large matrix operations.  This can greatly speed up individual runs, but by less (usually much less) than a factor of 8.  If you do have many such computations to do, your [[Introduction_To_Performance#Throughput | throughput]] will be better - you will get more calculations done per unit time -if you turn off the threading and run multiple such computations at once.  Threading is turned off with the shell script line &amp;lt;tt&amp;gt;export OMP_NUM_THREADS=1&amp;lt;/tt&amp;gt;; that line will be included in the scripts below.  &lt;br /&gt;
&lt;br /&gt;
If your calculations do implicitly use threading, you may want to experiment to see what gives you the best performance - you may find that running 4 (or even 8) jobs with 2 threads each (&amp;lt;tt&amp;gt;OMP_NUM_THREADS=2&amp;lt;/tt&amp;gt;), or 2 jobs with 4 threads, gives better performance than 8 jobs with 1 thread (and almost certainly better than 1 job with 8 threads).  We'd encourage you to perform exactly such a [[Introduction_To_Performance#Strong_Scaling_Tests | scaling test]]; for a small up-front investment in time you may significantly speed up all the computations you need to do.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of similar duration===&lt;br /&gt;
&lt;br /&gt;
The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a &lt;br /&gt;
bit like this&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; ampersand off 8 jobs and wait&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
(cd jobdir5; ./dojob5) &amp;amp;&lt;br /&gt;
(cd jobdir6; ./dojob6) &amp;amp;&lt;br /&gt;
(cd jobdir7; ./dojob7) &amp;amp;&lt;br /&gt;
(cd jobdir8; ./dojob8) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are a few important things to take note of here.  First, the &amp;lt;tt&amp;gt;'''wait'''&amp;lt;/tt&amp;gt;&lt;br /&gt;
command at the end is crucial; without it the job will terminate &lt;br /&gt;
immediately, killing the 8 programs you just started.&lt;br /&gt;
&lt;br /&gt;
Second is that it is important to group the programs by how long they &lt;br /&gt;
will take.   If (say) &amp;lt;tt&amp;gt;dojob8&amp;lt;/tt&amp;gt; takes 2 hours and the rest only take 1, &lt;br /&gt;
then for one hour 7 of the 8 cores on the GPC node are wasted; they are &lt;br /&gt;
sitting idle but are unavailable for other users, and the utilization of &lt;br /&gt;
this node over the whole run is only 56%.   This is the sort of thing &lt;br /&gt;
we'll notice, and users who don't make efficient use of the machine will &lt;br /&gt;
have their ability to use scinet resources reduced.  If you have many serial jobs of varying length, &lt;br /&gt;
use the submission script to balance the computational load, as explained [[ #Serial jobs of varying duration | below]].&lt;br /&gt;
&lt;br /&gt;
Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel===&lt;br /&gt;
&lt;br /&gt;
GNU parallel is a really nice tool written by Ole Tange to run multiple serial jobs in&lt;br /&gt;
parallel. It allows you to keep the processors on each 8core node busy, if you provide enough jobs to do.&lt;br /&gt;
&lt;br /&gt;
GNU parallel is accessible on the GPC in the module&lt;br /&gt;
&amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that there are several versions of gnu-parallel installed on the GPC; we recommend using the newer version. &lt;br /&gt;
&lt;br /&gt;
The citation for GNU Parallel is: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
It is easiest to demonstrate the usage of GNU parallel by&lt;br /&gt;
examples. Suppose you have 16 jobs to do, that these jobs duration varies quite a bit, but that the average job duration is around 10 hours. You could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N gnu-parallel-example&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND&lt;br /&gt;
parallel -j 8 &amp;lt;&amp;lt;EOF&lt;br /&gt;
  cd jobdir1; ./dojob1; echo &amp;quot;job 1 finished&amp;quot;&lt;br /&gt;
  cd jobdir2; ./dojob2; echo &amp;quot;job 2 finished&amp;quot;&lt;br /&gt;
  cd jobdir3; ./dojob3; echo &amp;quot;job 3 finished&amp;quot;&lt;br /&gt;
  cd jobdir4; ./dojob4; echo &amp;quot;job 4 finished&amp;quot;&lt;br /&gt;
  cd jobdir5; ./dojob5; echo &amp;quot;job 5 finished&amp;quot;&lt;br /&gt;
  cd jobdir6; ./dojob6; echo &amp;quot;job 6 finished&amp;quot;&lt;br /&gt;
  cd jobdir7; ./dojob7; echo &amp;quot;job 7 finished&amp;quot;&lt;br /&gt;
  cd jobdir8; ./dojob8; echo &amp;quot;job 8 finished&amp;quot;&lt;br /&gt;
  cd jobdir9; ./dojob9; echo &amp;quot;job 9 finished&amp;quot;&lt;br /&gt;
  cd jobdir10; ./dojob10; echo &amp;quot;job 10 finished&amp;quot;&lt;br /&gt;
  cd jobdir11; ./dojob11; echo &amp;quot;job 11 finished&amp;quot;&lt;br /&gt;
  cd jobdir12; ./dojob12; echo &amp;quot;job 12 finished&amp;quot;&lt;br /&gt;
  cd jobdir13; ./dojob13; echo &amp;quot;job 13 finished&amp;quot;&lt;br /&gt;
  cd jobdir14; ./dojob14; echo &amp;quot;job 14 finished&amp;quot;&lt;br /&gt;
  cd jobdir15; ./dojob15; echo &amp;quot;job 15 finished&amp;quot;&lt;br /&gt;
  cd jobdir16; ./dojob16; echo &amp;quot;job 16 finished&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; parameter sets the number of jobs to run at the same time, but 16 jobs are lined up. Initially, 8 jobs are given to the 8 processors on the node. When one of the processors is done with its assigned job, it will get a next job instead of sitting idle until the other processors are done. While you would expect that on average this script should take 20 hours (each processor on average has to complete two jobs of 10hours), there's a good chance that one of the processors gets two jobs that take more than 10 hours, so the job script requests 24 hours. How much more time you should ask for in practice depends on the spread in run times of the separate jobs.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of varying duration===&lt;br /&gt;
&lt;br /&gt;
If you have a lot (50+) of relatively short serial runs to do, '''of which the walltime varies''', and if you know that eight jobs fit in memory without issues, then writing all the commands explicitly in the jobscript can get tedious. If you follow the convention that the jobs are all started by auxiliary scripts called job&amp;lt;something&amp;gt;.sh, the following strategy in your submission script will maximize the CPU utilization. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamic&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*.sh&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Notes:&lt;br /&gt;
* As before, GNU Parallel keeps 8 jobs running at a time, and if one finishes, starts the next. This is an easy way to do ''load balancing''.&lt;br /&gt;
* You can in fact run more or less than 8 processes per node by modifying &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; argument.&lt;br /&gt;
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to the performance. In that case, running from the ramdisk may be an option.  &lt;br /&gt;
* When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch after the runs, or when the job is killed because time has run out.&lt;br /&gt;
* More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
* This script optimizes resource utility, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.&lt;br /&gt;
&lt;br /&gt;
===Version for more than 8 cores at once (still serial)===&lt;br /&gt;
&lt;br /&gt;
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one node and to use the following routine to distribute your processes amongst the cores. In this case, it is important to use the newer version of GNU parallel installed on the GPC.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=25:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamicMulti&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&lt;br /&gt;
# START PARALLEL JOBS USING NODE LIST IN $PBS_NODEFILE&lt;br /&gt;
seq 800 | parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Explanation:&lt;br /&gt;
* &amp;lt;tt&amp;gt;seq 800&amp;lt;/tt&amp;gt; outputs the numbers 1 through 800 on separate lines. This output is piped to (ie becomes the input of) the &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
* The use of the &amp;quot;seq 800&amp;quot; is that each line that you give to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; defines a new job. So here, there are 800 jobs.&lt;br /&gt;
* Each job runs a command, but because the numbers generated by seq are not commands, a real command is constructed, in this case, by the argument &amp;lt;tt&amp;gt;./myrun {}&amp;lt;/tt&amp;gt;. Here &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; is supposed to be the name of the application to run. The two curly brackets &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; get replaced by the line from the input, that is, by one of the numbers.&lt;br /&gt;
* So parallel will run the 800 commands:&amp;lt;br/&amp;gt;./myrun 1&amp;lt;br/&amp;gt;./myrun 2&amp;lt;br/&amp;gt;...&amp;lt;br/&amp;gt;./myrun 800&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;--sshloginfile $PBS_NODEFILE&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to look for the file named $PBS_NODEFILE which contains the host names of the nodes assigned to the current job (as stated above, it is automatically generated).&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to run 8 of these at a time on each of the hosts.&lt;br /&gt;
* The &amp;lt;tt&amp;gt;--workdir $PWD&amp;lt;/tt&amp;gt; sets the working directory on the other nodes to the working directory on the first node. Without this, the run tries to start from the wrong place and will most likely fail (unless using the latest gnu parallel module, 20130422, which by default puts you in $PWD on the remote node).&lt;br /&gt;
* Loaded modules should get automatically loaded on the remote nodes too for the latest gnu parallel module, but not for earlier ones.&lt;br /&gt;
* If you need an environment variable to be transfered from the job script to the remotely running subjobs, use &amp;lt;tt&amp;gt;--env ENVIRONMENTVARIABLE&amp;lt;/tt&amp;gt;.  SciNet's gnu-parallel modules automatically transfer &amp;lt;tt&amp;gt;OMP_NUM_THREADS&amp;lt;/tt&amp;gt;, and typical environment variables set by most modules.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* Of course, this is just an example of what you could do with gnu parallel. How you set up your specific run depends on how each of the runs would be started. One could for instance also prepare a file of commands to run and make that the input to parallel as well.&lt;br /&gt;
* Note that submitting several bunches to single nodes, as in the section above, is a more failsafe way of proceeding, since a node failure would only affect one of these bunches, rather than all runs. &lt;br /&gt;
* GNU Parallel can be passed a file with the list of nodes to which to ssh, using &amp;lt;tt&amp;gt;--sshloginfile&amp;lt;/tt&amp;gt; (thanks to Ole Tange for pointing this out). This list is automatically generated by the scheduler and its name is made available in the environment variable $PBS_NODEFILE.&lt;br /&gt;
* Alternatively, GNU Parallel can take a comma separated list of nodes given to its -S argument, but this would need to be constructed from the file $PBS_NODEFILE which contains all nodes assigned to the job, with each node duplicated 8x for the number of cores on each node.&lt;br /&gt;
* GNU Parallel reads lines of input and converts them to arguments in the execution command. The execution command is the last argument given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;, with &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; replaced by the lines of input.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;The --workdir argument is essential: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would likely not get any output at all!&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;This is no longer true for the latest GNU Parallel modules (20130422), which put you in the current directory on the remote nodes.&lt;br /&gt;
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.&lt;br /&gt;
&lt;br /&gt;
===More on GNU parallel=== &lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page contains options, such as how to make sure the output is not all scrambled, and examples.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel Reference===&lt;br /&gt;
* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
===Older scripts===&lt;br /&gt;
&lt;br /&gt;
Older scripts, which mimicked some of GNU parallel functionality, can be found on the [[Deprecated scripts]] page.&lt;br /&gt;
&lt;br /&gt;
--[[User:Rzon|Rzon]] 02:22, 14 Nov 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7132</id>
		<title>User Serial</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7132"/>
		<updated>2014-07-30T19:25:50Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Version for more than 8 cores at once (still serial) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===General considerations===&lt;br /&gt;
&lt;br /&gt;
====Use a whole node...====&lt;br /&gt;
&lt;br /&gt;
When you submit a job on a SciNet system, it is run on one (or more than one) entire node - meaning that your job is occupying at least 8 processors for the duration of its run.  The SciNet systems are usually busy, with many researchers waiting in the queue for computational resources, so we require that you make full use of the nodes that your job is allocated, so that other researchers don't have to wait unnecessarily, and so that your jobs get as much work done for you while they run as possible.&lt;br /&gt;
&lt;br /&gt;
Often, the best way to make full use of the node is to run one large parallel computation; but sometimes it is beneficial to run several serial codes at the same time.  On this page, we discuss ways to run suites of serial computations at once, as efficiently as possible, using the full resources of the node.&lt;br /&gt;
&lt;br /&gt;
====...but not more.====&lt;br /&gt;
&lt;br /&gt;
When running multiple jobs on the same node, it is essential to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available &lt;br /&gt;
to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes &amp;lt;tt&amp;gt;gpc01..04&amp;lt;/tt&amp;gt;, and [[GPC_Quickstart#Memory_Configuration|somewhat more for some compute nodes]]).&lt;br /&gt;
So the jobs also have to be bunched in ways that will fit into 14GB.  If they use more than this, they will crash the node, inconveniencing you and other researchers waiting for that node.&lt;br /&gt;
&lt;br /&gt;
If that's not possible -- each individual job requires significantly in excess of ~1.75GB -- then it's possible to just run fewer jobs so that they do fit; but then, again there is an under-utilization problem.   In that case, the jobs are likely candidates for parallelization, and you can contact us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] and arrange a meeting with one of the technical analysts to help you do just that.&lt;br /&gt;
&lt;br /&gt;
If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores for certain types of tasks has increased some users' overall throughput by 10 to 30 percent.&lt;br /&gt;
&lt;br /&gt;
====Is your job really serial?====&lt;br /&gt;
&lt;br /&gt;
While your program may not be explicitly parallel, it may use some of SciNet's threaded libraries for numerical computations, which can make use of multiple processors.  In particular, SciNet's [[Python]] and [[R_Statistical_Package | R]] modules are compiled with aggressive optimization and use threaded numerical libraries which by default will make use of multiple cores for computations such as large matrix operations.  This can greatly speed up individual runs, but by less (usually much less) than a factor of 8.  If you do have many such computations to do, your [[Introduction_To_Performance#Throughput | throughput]] will be better - you will get more calculations done per unit time - if you turn off the threading and run multiple such computations at once.  Threading is turned off with the shell script line &amp;lt;tt&amp;gt;export OMP_NUM_THREADS=1&amp;lt;/tt&amp;gt;; that line will be included in the scripts below.  &lt;br /&gt;
&lt;br /&gt;
If your calculations do implicitly use threading, you may want to experiment to see what gives you the best performance - you may find that running 4 (or even 8) jobs with 2 threads each (&amp;lt;tt&amp;gt;OMP_NUM_THREADS=2&amp;lt;/tt&amp;gt;), or 2 jobs with 4 threads, gives better performance than 8 jobs with 1 thread (and almost certainly better than 1 job with 8 threads).  We'd encourage you to perform exactly such a [[Introduction_To_Performance#Strong_Scaling_Tests | scaling test]]; for a small up-front investment in time you may significantly speed up all the computations you need to do.&lt;br /&gt;
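&lt;br /&gt;
As an illustration, a minimal sketch of one point in such a scaling test - 4 simultaneous jobs with 2 threads each - could look like the following (the &amp;lt;tt&amp;gt;jobdir&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;dojob&amp;lt;/tt&amp;gt; names are placeholders, as in the example scripts below):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Allow each job to use 2 threads ...&lt;br /&gt;
export OMP_NUM_THREADS=2&lt;br /&gt;
&lt;br /&gt;
# ... and start only 4 jobs at a time: 4 jobs x 2 threads = 8 cores&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;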
&lt;br /&gt;
===Serial jobs of similar duration===&lt;br /&gt;
&lt;br /&gt;
The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a &lt;br /&gt;
bit like this&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; ampersand off 8 jobs and wait&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
(cd jobdir5; ./dojob5) &amp;amp;&lt;br /&gt;
(cd jobdir6; ./dojob6) &amp;amp;&lt;br /&gt;
(cd jobdir7; ./dojob7) &amp;amp;&lt;br /&gt;
(cd jobdir8; ./dojob8) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are four important things to take note of here.  First, the &amp;lt;tt&amp;gt;'''wait'''&amp;lt;/tt&amp;gt;&lt;br /&gt;
command at the end is crucial; without it the job will terminate &lt;br /&gt;
immediately, killing the 8 programs you just started.&lt;br /&gt;
&lt;br /&gt;
Second is that it is important to group the programs by how long they &lt;br /&gt;
will take.   If (say) &amp;lt;tt&amp;gt;dojob8&amp;lt;/tt&amp;gt; takes 2 hours and the rest only take 1, &lt;br /&gt;
then for one hour 7 of the 8 cores on the GPC node are wasted; they are &lt;br /&gt;
sitting idle but are unavailable for other users, and the utilization of &lt;br /&gt;
this node over the whole run is only 56% (9 of the 16 core-hours the job occupies are actually used).   This is the sort of thing &lt;br /&gt;
we'll notice, and users who don't make efficient use of the machine will &lt;br /&gt;
have their ability to use SciNet resources reduced.  If you have many serial jobs of varying length, &lt;br /&gt;
use the submission script to balance the computational load, as explained [[ #Serial jobs of varying duration | below]].&lt;br /&gt;
&lt;br /&gt;
Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel===&lt;br /&gt;
&lt;br /&gt;
GNU parallel is a really nice tool written by Ole Tange to run multiple serial jobs in&lt;br /&gt;
parallel. It allows you to keep the processors on each 8-core node busy, provided you give it enough jobs to do.&lt;br /&gt;
&lt;br /&gt;
GNU parallel is accessible on the GPC in the module&lt;br /&gt;
&amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that there are several versions of gnu-parallel installed on the GPC; we recommend using the most recent version. &lt;br /&gt;
&lt;br /&gt;
The citation for GNU Parallel is: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
It is easiest to demonstrate the usage of GNU parallel by&lt;br /&gt;
examples. Suppose you have 16 jobs to do, that these jobs' durations vary quite a bit, but that the average job duration is around 10 hours. You could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N gnu-parallel-example&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND&lt;br /&gt;
parallel -j 8 &amp;lt;&amp;lt;EOF&lt;br /&gt;
  cd jobdir1; ./dojob1; echo &amp;quot;job 1 finished&amp;quot;&lt;br /&gt;
  cd jobdir2; ./dojob2; echo &amp;quot;job 2 finished&amp;quot;&lt;br /&gt;
  cd jobdir3; ./dojob3; echo &amp;quot;job 3 finished&amp;quot;&lt;br /&gt;
  cd jobdir4; ./dojob4; echo &amp;quot;job 4 finished&amp;quot;&lt;br /&gt;
  cd jobdir5; ./dojob5; echo &amp;quot;job 5 finished&amp;quot;&lt;br /&gt;
  cd jobdir6; ./dojob6; echo &amp;quot;job 6 finished&amp;quot;&lt;br /&gt;
  cd jobdir7; ./dojob7; echo &amp;quot;job 7 finished&amp;quot;&lt;br /&gt;
  cd jobdir8; ./dojob8; echo &amp;quot;job 8 finished&amp;quot;&lt;br /&gt;
  cd jobdir9; ./dojob9; echo &amp;quot;job 9 finished&amp;quot;&lt;br /&gt;
  cd jobdir10; ./dojob10; echo &amp;quot;job 10 finished&amp;quot;&lt;br /&gt;
  cd jobdir11; ./dojob11; echo &amp;quot;job 11 finished&amp;quot;&lt;br /&gt;
  cd jobdir12; ./dojob12; echo &amp;quot;job 12 finished&amp;quot;&lt;br /&gt;
  cd jobdir13; ./dojob13; echo &amp;quot;job 13 finished&amp;quot;&lt;br /&gt;
  cd jobdir14; ./dojob14; echo &amp;quot;job 14 finished&amp;quot;&lt;br /&gt;
  cd jobdir15; ./dojob15; echo &amp;quot;job 15 finished&amp;quot;&lt;br /&gt;
  cd jobdir16; ./dojob16; echo &amp;quot;job 16 finished&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; parameter sets the number of jobs to run at the same time, but 16 jobs are lined up. Initially, 8 jobs are given to the 8 processors on the node. When one of the processors is done with its assigned job, it will get the next job instead of sitting idle until the other processors are done. While you would expect that on average this script should take 20 hours (each processor on average has to complete two jobs of 10 hours each), there's a good chance that one of the processors gets two jobs that take more than 10 hours, so the job script requests 24 hours. How much more time you should ask for in practice depends on the spread in run times of the separate jobs.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of varying duration===&lt;br /&gt;
&lt;br /&gt;
If you have a lot (50+) of relatively short serial runs to do, '''whose walltimes vary''', and if you know that eight jobs fit in memory without issues, then writing all the commands explicitly in the job script can get tedious. If you follow a convention in which the jobs are all started by auxiliary scripts called job&amp;lt;something&amp;gt;.sh, the following strategy in your submission script will maximize the CPU utilization. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamic&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*.sh&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Notes:&lt;br /&gt;
* As before, GNU Parallel keeps 8 jobs running at a time, and if one finishes, starts the next. This is an easy way to do ''load balancing''.&lt;br /&gt;
* You can in fact run more or fewer than 8 processes per node by modifying &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; argument.&lt;br /&gt;
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to the performance. In that case, running from the ramdisk may be an option.  &lt;br /&gt;
* When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch after the runs, or when the job is killed because time has run out.&lt;br /&gt;
* More details on how to set up your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
* This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.&lt;br /&gt;
&lt;br /&gt;
===Version for more than 8 cores at once (still serial)===&lt;br /&gt;
&lt;br /&gt;
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one node and to use the following routine to distribute your processes amongst the cores. In this case, it is important to use the newer version of GNU parallel installed on the GPC.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=25:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamicMulti&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&lt;br /&gt;
# START PARALLEL JOBS USING NODE LIST IN $PBS_NODEFILE&lt;br /&gt;
seq 800 | parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Explanation:&lt;br /&gt;
* &amp;lt;tt&amp;gt;seq 800&amp;lt;/tt&amp;gt; outputs the numbers 1 through 800 on separate lines. This output is piped to (ie becomes the input of) the &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
* The point of the &amp;quot;seq 800&amp;quot; is that each line that you give to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; defines a new job. So here, there are 800 jobs.&lt;br /&gt;
* Each job runs a command, but because the numbers generated by seq are not commands, a real command is constructed, in this case, by the argument &amp;lt;tt&amp;gt;./myrun {}&amp;lt;/tt&amp;gt;. Here &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; is supposed to be the name of the application to run. The two curly brackets &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; get replaced by the line from the input, that is, by one of the numbers.&lt;br /&gt;
* So parallel will run the 800 commands:&amp;lt;br/&amp;gt;./myrun 1&amp;lt;br/&amp;gt;./myrun 2&amp;lt;br/&amp;gt;...&amp;lt;br/&amp;gt;./myrun 800&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;--sshloginfile $PBS_NODEFILE&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to look for the file named $PBS_NODEFILE which contains the host names of the nodes assigned to the current job (as stated above, it is automatically generated).&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to run 8 of these at a time on each of the hosts.&lt;br /&gt;
* The &amp;lt;tt&amp;gt;--workdir $PWD&amp;lt;/tt&amp;gt; sets the working directory on the other nodes to the working directory on the first node. Without this, the run tries to start from the wrong place and will most likely fail (unless using the latest gnu parallel module, 20130422, which by default puts you in $PWD on the remote node).&lt;br /&gt;
* Loaded modules should get automatically loaded on the remote nodes too for the latest gnu parallel module, but not for earlier ones.&lt;br /&gt;
* If you need an environment variable to be transferred from the job script to the remotely running subjobs, use &amp;lt;tt&amp;gt;--env ENVIRONMENTVARIABLE&amp;lt;/tt&amp;gt;.  SciNet's gnu-parallel modules automatically transfer &amp;lt;tt&amp;gt;OMP_NUM_THREADS&amp;lt;/tt&amp;gt;, and typical environment variables set by most modules.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* Of course, this is just an example of what you could do with gnu parallel. How you set up your specific run depends on how each of the runs would be started. One could for instance also prepare a file of commands to run and make that the input to parallel, as in the sketch after these notes.&lt;br /&gt;
* Note that submitting several bunches to single nodes, as in the section above, is a more failsafe way of proceeding, since a node failure would only affect one of these bunches, rather than all runs. &lt;br /&gt;
* GNU Parallel can be passed a file with the list of nodes to which to ssh, using &amp;lt;tt&amp;gt;--sshloginfile&amp;lt;/tt&amp;gt; (thanks to Ole Tange for pointing this out). This list is automatically generated by the scheduler and its name is made available in the environment variable $PBS_NODEFILE.&lt;br /&gt;
* Alternatively, GNU Parallel can take a comma-separated list of nodes given to its -S argument, but this list would need to be constructed from the file $PBS_NODEFILE, which contains all nodes assigned to the job, with each node duplicated 8x for the number of cores on each node.&lt;br /&gt;
* GNU Parallel reads lines of input and converts them into arguments for the execution command. The execution command is the last argument given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;, with &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; replaced by the lines of input.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;The --workdir argument is essential: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would likely not get any output at all!&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;This is no longer true for the latest GNU Parallel modules (20130422), which put you in the current directory on the remote nodes.&lt;br /&gt;
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.&lt;br /&gt;
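&lt;br /&gt;
A minimal sketch of the file-of-commands variant mentioned in the notes above (where &amp;lt;tt&amp;gt;commands.txt&amp;lt;/tt&amp;gt; is a hypothetical file you would prepare yourself, with one complete command per line) could look like this:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# commands.txt contains lines such as:&lt;br /&gt;
#   cd jobdir1 &amp;amp;&amp;amp; ./dojob1&lt;br /&gt;
#   cd jobdir2 &amp;amp;&amp;amp; ./dojob2&lt;br /&gt;
# With no command argument, parallel runs each input line as a command,&lt;br /&gt;
# 8 at a time on each node listed in $PBS_NODEFILE.&lt;br /&gt;
parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD &amp;lt; commands.txt&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;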
&lt;br /&gt;
===More on GNU parallel=== &lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page contains options, such as how to make sure the output is not all scrambled, and examples.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel Reference===&lt;br /&gt;
* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
===Older scripts===&lt;br /&gt;
&lt;br /&gt;
Older scripts, which mimicked some of GNU parallel functionality, can be found on the [[Deprecated scripts]] page.&lt;br /&gt;
&lt;br /&gt;
--[[User:Rzon|Rzon]] 02:22, 14 Nov 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7131</id>
		<title>User Serial</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7131"/>
		<updated>2014-07-30T19:23:25Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Version for more than 8 cores at once (still serial) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===General considerations===&lt;br /&gt;
&lt;br /&gt;
====Use a whole node...====&lt;br /&gt;
&lt;br /&gt;
When you submit a job on a SciNet system, it is run on one (or more than one) entire node - meaning that your job is occupying at least 8 processors for the duration of its run.  The SciNet systems are usually busy, with many researchers waiting in the queue for computational resources, so we require that you make full use of the nodes that your job is allocated, so other researchers don't have to wait unnecessarily, and so that your jobs get as much work done for you as possible while they run.&lt;br /&gt;
&lt;br /&gt;
Often, the best way to make full use of the node is to run one large parallel computation; but sometimes it is beneficial to run several serial codes at the same time.  On this page, we discuss ways to run suites of serial computations at once, as efficiently as possible, using the full resources of the node.&lt;br /&gt;
&lt;br /&gt;
====...but not more.====&lt;br /&gt;
&lt;br /&gt;
When running multiple jobs on the same node, it is essential to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available &lt;br /&gt;
to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes &amp;lt;tt&amp;gt;gpc01..04&amp;lt;/tt&amp;gt;, and [[GPC_Quickstart#Memory_Configuration|somewhat more for some compute nodes]]).&lt;br /&gt;
So the jobs also have to be bunched in ways that will fit into 14GB.  If they use more than this, they will crash the node, inconveniencing you and other researchers waiting for that node.&lt;br /&gt;
&lt;br /&gt;
If that's not possible -- each individual job requires significantly in excess of ~1.75GB -- then it's possible to just run fewer jobs so that they do fit; but then, again there is an under-utilization problem.   In that case, the jobs are likely candidates for parallelization, and you can contact us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] and arrange a meeting with one of the technical analysts to help you do just that.&lt;br /&gt;
&lt;br /&gt;
If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores for certain types of tasks has increased some users' overall throughput by 10 to 30 percent.&lt;br /&gt;
&lt;br /&gt;
====Is your job really serial?====&lt;br /&gt;
&lt;br /&gt;
While your program may not be explicitly parallel, it may use some of SciNet's threaded libraries for numerical computations, which can make use of multiple processors.  In particular, SciNet's [[Python]] and [[R_Statistical_Package | R]] modules are compiled with aggressive optimization and using threaded numerical libraries which by default will make use of multiple cores for computations such as large matrix operations.  This can greatly speed up individual runs, but by less (usually much less) than a factor of 8.  If you do have many such computations to do, your [[Introduction_To_Performance#Throughput | throughput]] will be better - you will get more calculations done per unit time -if you turn off the threading and run multiple such computations at once.  Threading is turned off with the shell script line &amp;lt;tt&amp;gt;export OMP_NUM_THREADS=1&amp;lt;/tt&amp;gt;; that line will be included in the scripts below.  &lt;br /&gt;
&lt;br /&gt;
If your calculations do implicitly use threading, you may want to experiment to see what gives you the best performance - you may find that running 4 (or even 8) jobs with 2 threads each (&amp;lt;tt&amp;gt;OMP_NUM_THREADS=2&amp;lt;/tt&amp;gt;), or 2 jobs with 4 threads, gives better performance than 8 jobs with 1 thread (and almost certainly better than 1 job with 8 threads).  We'd encourage you to perform exactly such a [[Introduction_To_Performance#Strong_Scaling_Tests | scaling test]]; for a small up-front investment in time you may significantly speed up all the computations you need to do.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of similar duration===&lt;br /&gt;
&lt;br /&gt;
The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a &lt;br /&gt;
bit like this&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; ampersand off 8 jobs and wait&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
(cd jobdir5; ./dojob5) &amp;amp;&lt;br /&gt;
(cd jobdir6; ./dojob6) &amp;amp;&lt;br /&gt;
(cd jobdir7; ./dojob7) &amp;amp;&lt;br /&gt;
(cd jobdir8; ./dojob8) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are four important things to take note of here.  First, the &amp;lt;tt&amp;gt;'''wait'''&amp;lt;/tt&amp;gt;&lt;br /&gt;
command at the end is crucial; without it the job will terminate &lt;br /&gt;
immediately, killing the 8 programs you just started.&lt;br /&gt;
&lt;br /&gt;
Second is that it is important to group the programs by how long they &lt;br /&gt;
will take.   If (say) &amp;lt;tt&amp;gt;dojob8&amp;lt;/tt&amp;gt; takes 2 hours and the rest only take 1, &lt;br /&gt;
then for one hour 7 of the 8 cores on the GPC node are wasted; they are &lt;br /&gt;
sitting idle but are unavailable for other users, and the utilization of &lt;br /&gt;
this node over the whole run is only 56%.   This is the sort of thing &lt;br /&gt;
we'll notice, and users who don't make efficient use of the machine will &lt;br /&gt;
have their ability to use scinet resources reduced.  If you have many serial jobs of varying length, &lt;br /&gt;
use the submission script to balance the computational load, as explained [[ #Serial jobs of varying duration | below]].&lt;br /&gt;
&lt;br /&gt;
Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel===&lt;br /&gt;
&lt;br /&gt;
GNU parallel is a really nice tool written by Ole Tange to run multiple serial jobs in&lt;br /&gt;
parallel. It allows you to keep the processors on each 8-core node busy, provided you give it enough jobs to do.&lt;br /&gt;
&lt;br /&gt;
GNU parallel is accessible on the GPC in the module&lt;br /&gt;
&amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that there are several versions of gnu-parallel installed on the GPC; we recommend using the newer version. &lt;br /&gt;
&lt;br /&gt;
The citation for GNU Parallel is: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
It is easiest to demonstrate the usage of GNU parallel by&lt;br /&gt;
examples. Suppose you have 16 jobs to do, that these jobs' durations vary quite a bit, but that the average job duration is around 10 hours. You could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N gnu-parallel-example&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND&lt;br /&gt;
parallel -j 8 &amp;lt;&amp;lt;EOF&lt;br /&gt;
  cd jobdir1; ./dojob1; echo &amp;quot;job 1 finished&amp;quot;&lt;br /&gt;
  cd jobdir2; ./dojob2; echo &amp;quot;job 2 finished&amp;quot;&lt;br /&gt;
  cd jobdir3; ./dojob3; echo &amp;quot;job 3 finished&amp;quot;&lt;br /&gt;
  cd jobdir4; ./dojob4; echo &amp;quot;job 4 finished&amp;quot;&lt;br /&gt;
  cd jobdir5; ./dojob5; echo &amp;quot;job 5 finished&amp;quot;&lt;br /&gt;
  cd jobdir6; ./dojob6; echo &amp;quot;job 6 finished&amp;quot;&lt;br /&gt;
  cd jobdir7; ./dojob7; echo &amp;quot;job 7 finished&amp;quot;&lt;br /&gt;
  cd jobdir8; ./dojob8; echo &amp;quot;job 8 finished&amp;quot;&lt;br /&gt;
  cd jobdir9; ./dojob9; echo &amp;quot;job 9 finished&amp;quot;&lt;br /&gt;
  cd jobdir10; ./dojob10; echo &amp;quot;job 10 finished&amp;quot;&lt;br /&gt;
  cd jobdir11; ./dojob11; echo &amp;quot;job 11 finished&amp;quot;&lt;br /&gt;
  cd jobdir12; ./dojob12; echo &amp;quot;job 12 finished&amp;quot;&lt;br /&gt;
  cd jobdir13; ./dojob13; echo &amp;quot;job 13 finished&amp;quot;&lt;br /&gt;
  cd jobdir14; ./dojob14; echo &amp;quot;job 14 finished&amp;quot;&lt;br /&gt;
  cd jobdir15; ./dojob15; echo &amp;quot;job 15 finished&amp;quot;&lt;br /&gt;
  cd jobdir16; ./dojob16; echo &amp;quot;job 16 finished&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; parameter sets the number of jobs to run at the same time, but 16 jobs are lined up. Initially, 8 jobs are given to the 8 processors on the node. When one of the processors is done with its assigned job, it will get the next job instead of sitting idle until the other processors are done. While you would expect that on average this script should take 20 hours (each processor on average has to complete two jobs of 10 hours each), there's a good chance that one of the processors gets two jobs that take more than 10 hours, so the job script requests 24 hours. How much more time you should ask for in practice depends on the spread in run times of the separate jobs.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of varying duration===&lt;br /&gt;
&lt;br /&gt;
If you have a lot (50+) of relatively short serial runs to do, '''whose walltimes vary''', and if you know that eight jobs fit in memory without issues, then writing all the commands explicitly in the job script can get tedious. If you follow a convention in which the jobs are all started by auxiliary scripts called job&amp;lt;something&amp;gt;.sh, the following strategy in your submission script will maximize the CPU utilization. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamic&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*.sh&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Notes:&lt;br /&gt;
* As before, GNU Parallel keeps 8 jobs running at a time, and if one finishes, starts the next. This is an easy way to do ''load balancing''.&lt;br /&gt;
* You can in fact run more or fewer than 8 processes per node by modifying &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; argument.&lt;br /&gt;
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to the performance. In that case, running from the ramdisk may be an option.  &lt;br /&gt;
* When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch after the runs, or when the job is killed because time has run out.&lt;br /&gt;
* More details on how to set up your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
* This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.&lt;br /&gt;
&lt;br /&gt;
===Version for more than 8 cores at once (still serial)===&lt;br /&gt;
&lt;br /&gt;
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one node and to use the following routine to distribute your processes amongst the cores. In this case, it is important to use the newer version of GNU parallel installed on the GPC.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=25:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamicMulti&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&lt;br /&gt;
# START PARALLEL JOBS USING NODE LIST IN $PBS_NODEFILE&lt;br /&gt;
seq 800 | parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Explanation:&lt;br /&gt;
* &amp;lt;tt&amp;gt;seq 800&amp;lt;/tt&amp;gt; outputs the numbers 1 through 800 on separate lines. This output is piped to (ie becomes the input of) the &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
* The point of the &amp;quot;seq 800&amp;quot; is that each line that you give to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; defines a new job. So here, there are 800 jobs.&lt;br /&gt;
* Each job runs a command, but because the numbers generated by seq are not commands, a real command is constructed, in this case, by the argument &amp;lt;tt&amp;gt;./myrun {}&amp;lt;/tt&amp;gt;. Here &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; is supposed to be the name of the application to run. The two curly brackets &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; get replaced by the line from the input, that is, by one of the numbers.&lt;br /&gt;
* So parallel will run the 800 commands:&amp;lt;br/&amp;gt;./myrun 1&amp;lt;br/&amp;gt;./myrun 2&amp;lt;br/&amp;gt;...&amp;lt;br/&amp;gt;./myrun 800&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;--sshloginfile $PBS_NODEFILE&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to look for the file named $PBS_NODEFILE which contains the host names of the nodes assigned to the current job (as stated above, it is automatically generated).&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to run 8 of these at a time on each of the hosts.&lt;br /&gt;
* The &amp;lt;tt&amp;gt;--workdir $PWD&amp;lt;/tt&amp;gt; sets the working directory on the other nodes to the working directory on the first node. Without this, the run tries to start from the wrong place and will most likely fail (unless using the latest gnu parallel module, 20130422, which by default puts you in $PWD on the remote node).&lt;br /&gt;
* Loaded modules should get automatically loaded on the remote nodes too for the latest gnu parallel module, but not for earlier ones.&lt;br /&gt;
* If you need an environment variable to be transferred from the job script to the remotely running subjobs, use &amp;lt;tt&amp;gt;--env ENVIRONMENTVARIABLE&amp;lt;/tt&amp;gt;.&lt;br /&gt;
Notes:&lt;br /&gt;
* Of course, this is just an example of what you could do with gnu parallel. How you set up your specific run depends on how each of the runs would be started. One could for instance also prepare a file of commands to run and make that the input to parallel as well.&lt;br /&gt;
* Note that submitting several bunches to single nodes, as in the section above, is a more failsafe way of proceeding, since a node failure would only affect one of these bunches, rather than all runs. &lt;br /&gt;
* GNU Parallel can be passed a file with the list of nodes to which to ssh, using &amp;lt;tt&amp;gt;--sshloginfile&amp;lt;/tt&amp;gt; (thanks to Ole Tange for pointing this out). This list is automatically generated by the scheduler and its name is made available in the environment variable $PBS_NODEFILE.&lt;br /&gt;
* Alternatively, GNU Parallel can take a comma-separated list of nodes given to its -S argument, but this list would need to be constructed from the file $PBS_NODEFILE, which contains all nodes assigned to the job, with each node duplicated 8x for the number of cores on each node.&lt;br /&gt;
* GNU Parallel reads lines of input and converts them into arguments for the execution command. The execution command is the last argument given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;, with &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; replaced by the lines of input.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;The --workdir argument is essential: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would likely not get any output at all!&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;This is no longer true for the latest GNU Parallel modules (20130422), which put you in the current directory on the remote nodes.&lt;br /&gt;
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.&lt;br /&gt;
&lt;br /&gt;
===More on GNU parallel=== &lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page contains options, such as how to make sure the output is not all scrambled, and examples.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel Reference===&lt;br /&gt;
* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
===Older scripts===&lt;br /&gt;
&lt;br /&gt;
Older scripts, which mimicked some of GNU parallel functionality, can be found on the [[Deprecated scripts]] page.&lt;br /&gt;
&lt;br /&gt;
--[[User:Rzon|Rzon]] 02:22, 14 Nov 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7130</id>
		<title>User Serial</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7130"/>
		<updated>2014-07-30T19:21:48Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Serial jobs of varying duration */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===General considerations===&lt;br /&gt;
&lt;br /&gt;
====Use a whole node...====&lt;br /&gt;
&lt;br /&gt;
When you submit a job on a SciNet system, it is run on one (or more than one) entire node - meaning that your job is occupying at least 8 processors for the duration of its run.  The SciNet systems are usually busy, with many researchers waiting in the queue for computational resources, so we require that you make full use of the nodes that your job is allocated, so other researchers don't have to wait unnecessarily, and so that your jobs get as much work done for you as possible while they run.&lt;br /&gt;
&lt;br /&gt;
Often, the best way to make full use of the node is to run one large parallel computation; but sometimes it is beneficial to run several serial codes at the same time.  On this page, we discuss ways to run suites of serial computations at once, as efficiently as possible, using the full resources of the node.&lt;br /&gt;
&lt;br /&gt;
====...but not more.====&lt;br /&gt;
&lt;br /&gt;
When running multiple jobs on the same node, it is essential to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available &lt;br /&gt;
to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes &amp;lt;tt&amp;gt;gpc01..04&amp;lt;/tt&amp;gt;, and [[GPC_Quickstart#Memory_Configuration|somewhat more for some compute nodes]]).&lt;br /&gt;
So the jobs also have to be bunched in ways that will fit into 14GB.  If they use more than this, they will crash the node, inconveniencing you and other researchers waiting for that node.&lt;br /&gt;
&lt;br /&gt;
If that's not possible -- each individual job requires significantly in excess of ~1.75GB -- then it's possible to just run fewer jobs so that they do fit; but then, again there is an under-utilization problem.   In that case, the jobs are likely candidates for parallelization, and you can contact us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] and arrange a meeting with one of the technical analysts to help you do just that.&lt;br /&gt;
&lt;br /&gt;
If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores for certain types of tasks has increased some users' overall throughput by 10 to 30 percent.&lt;br /&gt;
&lt;br /&gt;
====Is your job really serial?====&lt;br /&gt;
&lt;br /&gt;
While your program may not be explicitly parallel, it may use some of SciNet's threaded libraries for numerical computations, which can make use of multiple processors.  In particular, SciNet's [[Python]] and [[R_Statistical_Package | R]] modules are compiled with aggressive optimization and using threaded numerical libraries which by default will make use of multiple cores for computations such as large matrix operations.  This can greatly speed up individual runs, but by less (usually much less) than a factor of 8.  If you do have many such computations to do, your [[Introduction_To_Performance#Throughput | throughput]] will be better - you will get more calculations done per unit time -if you turn off the threading and run multiple such computations at once.  Threading is turned off with the shell script line &amp;lt;tt&amp;gt;export OMP_NUM_THREADS=1&amp;lt;/tt&amp;gt;; that line will be included in the scripts below.  &lt;br /&gt;
&lt;br /&gt;
If your calculations do implicitly use threading, you may want to experiment to see what gives you the best performance - you may find that running 4 (or even 8) jobs with 2 threads each (&amp;lt;tt&amp;gt;OMP_NUM_THREADS=2&amp;lt;/tt&amp;gt;), or 2 jobs with 4 threads, gives better performance than 8 jobs with 1 thread (and almost certainly better than 1 job with 8 threads).  We'd encourage you to perform exactly such a [[Introduction_To_Performance#Strong_Scaling_Tests | scaling test]]; for a small up-front investment in time you may significantly speed up all the computations you need to do.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of similar duration===&lt;br /&gt;
&lt;br /&gt;
The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a &lt;br /&gt;
bit like this&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; ampersand off 8 jobs and wait&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
(cd jobdir5; ./dojob5) &amp;amp;&lt;br /&gt;
(cd jobdir6; ./dojob6) &amp;amp;&lt;br /&gt;
(cd jobdir7; ./dojob7) &amp;amp;&lt;br /&gt;
(cd jobdir8; ./dojob8) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are four important things to take note of here.  First, the &amp;lt;tt&amp;gt;'''wait'''&amp;lt;/tt&amp;gt;&lt;br /&gt;
command at the end is crucial; without it the job will terminate &lt;br /&gt;
immediately, killing the 8 programs you just started.&lt;br /&gt;
&lt;br /&gt;
Second is that it is important to group the programs by how long they &lt;br /&gt;
will take.   If (say) &amp;lt;tt&amp;gt;dojob8&amp;lt;/tt&amp;gt; takes 2 hours and the rest only take 1, &lt;br /&gt;
then for one hour 7 of the 8 cores on the GPC node are wasted; they are &lt;br /&gt;
sitting idle but are unavailable for other users, and the utilization of &lt;br /&gt;
this node over the whole run is only 56%.   This is the sort of thing &lt;br /&gt;
we'll notice, and users who don't make efficient use of the machine will &lt;br /&gt;
have their ability to use scinet resources reduced.  If you have many serial jobs of varying length, &lt;br /&gt;
use the submission script to balance the computational load, as explained [[ #Serial jobs of varying duration | below]].&lt;br /&gt;
&lt;br /&gt;
Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel===&lt;br /&gt;
&lt;br /&gt;
GNU parallel is a really nice tool written by Ole Tange to run multiple serial jobs in&lt;br /&gt;
parallel. It allows you to keep the processors on each 8-core node busy, provided you give it enough jobs to do.&lt;br /&gt;
&lt;br /&gt;
GNU parallel is accessible on the GPC in the module&lt;br /&gt;
&amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that there are several versions of gnu-parallel installed on the GPC; we recommend using the newer version. &lt;br /&gt;
&lt;br /&gt;
The citation for GNU Parallel is: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
It is easiest to demonstrate the usage of GNU parallel by&lt;br /&gt;
examples. Suppose you have 16 jobs to do, that these jobs' durations vary quite a bit, but that the average job duration is around 10 hours. You could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N gnu-parallel-example&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND&lt;br /&gt;
parallel -j 8 &amp;lt;&amp;lt;EOF&lt;br /&gt;
  cd jobdir1; ./dojob1; echo &amp;quot;job 1 finished&amp;quot;&lt;br /&gt;
  cd jobdir2; ./dojob2; echo &amp;quot;job 2 finished&amp;quot;&lt;br /&gt;
  cd jobdir3; ./dojob3; echo &amp;quot;job 3 finished&amp;quot;&lt;br /&gt;
  cd jobdir4; ./dojob4; echo &amp;quot;job 4 finished&amp;quot;&lt;br /&gt;
  cd jobdir5; ./dojob5; echo &amp;quot;job 5 finished&amp;quot;&lt;br /&gt;
  cd jobdir6; ./dojob6; echo &amp;quot;job 6 finished&amp;quot;&lt;br /&gt;
  cd jobdir7; ./dojob7; echo &amp;quot;job 7 finished&amp;quot;&lt;br /&gt;
  cd jobdir8; ./dojob8; echo &amp;quot;job 8 finished&amp;quot;&lt;br /&gt;
  cd jobdir9; ./dojob9; echo &amp;quot;job 9 finished&amp;quot;&lt;br /&gt;
  cd jobdir10; ./dojob10; echo &amp;quot;job 10 finished&amp;quot;&lt;br /&gt;
  cd jobdir11; ./dojob11; echo &amp;quot;job 11 finished&amp;quot;&lt;br /&gt;
  cd jobdir12; ./dojob12; echo &amp;quot;job 12 finished&amp;quot;&lt;br /&gt;
  cd jobdir13; ./dojob13; echo &amp;quot;job 13 finished&amp;quot;&lt;br /&gt;
  cd jobdir14; ./dojob14; echo &amp;quot;job 14 finished&amp;quot;&lt;br /&gt;
  cd jobdir15; ./dojob15; echo &amp;quot;job 15 finished&amp;quot;&lt;br /&gt;
  cd jobdir16; ./dojob16; echo &amp;quot;job 16 finished&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; parameter sets the number of jobs to run at the same time, but 16 jobs are lined up. Initially, 8 jobs are given to the 8 processors on the node. When one of the processors is done with its assigned job, it will get the next job instead of sitting idle until the other processors are done. While you would expect that on average this script should take 20 hours (each processor on average has to complete two jobs of 10 hours each), there's a good chance that one of the processors gets two jobs that take more than 10 hours, so the job script requests 24 hours. How much more time you should ask for in practice depends on the spread in run times of the separate jobs.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of varying duration===&lt;br /&gt;
&lt;br /&gt;
If you have a lot (50+) of relatively short serial runs to do, '''whose walltimes vary''', and if you know that eight jobs fit in memory without issues, then writing all the commands explicitly in the job script can get tedious. If you follow a convention in which the jobs are all started by auxiliary scripts called job&amp;lt;something&amp;gt;.sh, the following strategy in your submission script will maximize the CPU utilization. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamic&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*.sh&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Notes:&lt;br /&gt;
* As before, GNU Parallel keeps 8 jobs running at a time, and if one finishes, starts the next. This is an easy way to do ''load balancing''.&lt;br /&gt;
* You can in fact run more or fewer than 8 processes per node by modifying &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; argument.&lt;br /&gt;
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to the performance. In that case, running from the ramdisk may be an option.  &lt;br /&gt;
* When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch after the runs, or when the job is killed because time has run out.&lt;br /&gt;
* More details on how to set up your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
* This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.&lt;br /&gt;
&lt;br /&gt;
===Version for more than 8 cores at once (still serial)===&lt;br /&gt;
&lt;br /&gt;
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one node and to use the following routine to distribute your processes amongst the cores. In this case, it is important to use the newer version of GNU parallel installed on the GPC.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=25:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamicMulti&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422&lt;br /&gt;
&lt;br /&gt;
# START PARALLEL JOBS USING NODE LIST IN $PBS_NODEFILE&lt;br /&gt;
seq 800 | parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Explanation:&lt;br /&gt;
* &amp;lt;tt&amp;gt;seq 800&amp;lt;/tt&amp;gt; outputs the numbers 1 through 800 on separate lines. This output is piped to (ie becomes the input of) the &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
* The point of the &amp;quot;seq 800&amp;quot; is that each line that you give to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; defines a new job. So here, there are 800 jobs.&lt;br /&gt;
* Each job runs a command, but because the numbers generated by seq are not commands, a real command is constructed, in this case, by the argument &amp;lt;tt&amp;gt;./myrun {}&amp;lt;/tt&amp;gt;. Here &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; is supposed to be the name of the application to run. The two curly brackets &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; get replaced by the line from the input, that is, by one of the numbers.&lt;br /&gt;
* So parallel will run the 800 commands:&amp;lt;br/&amp;gt;./myrun 1&amp;lt;br/&amp;gt;./myrun 2&amp;lt;br/&amp;gt;...&amp;lt;br/&amp;gt;./myrun 800&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;--sshloginfile $PBS_NODEFILE&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to look for the file named $PBS_NODEFILE which contains the host names of the nodes assigned to the current job (as stated above, it is automatically generated).&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to run 8 of these at a time on each of the hosts.&lt;br /&gt;
* The &amp;lt;tt&amp;gt;--workdir $PWD&amp;lt;/tt&amp;gt; sets the working directory on the other nodes to the working directory on the first node. Without this, the run tries to start from the wrong place and will most likely fail (unless using the latest gnu parallel module, 20130422, which by default puts you in $PWD on the remote node).&lt;br /&gt;
* Loaded modules should get automatically loaded on the remote nodes too for the latest gnu parallel module, but not for earlier ones.&lt;br /&gt;
* If you need an environment variable to be transferred from the job script to the remotely running subjobs, use &amp;lt;tt&amp;gt;--env ENVIRONMENTVARIABLE&amp;lt;/tt&amp;gt;.&lt;br /&gt;
Notes:&lt;br /&gt;
* Of course, this is just an example of what you could do with gnu parallel. How you set up your specific run depends on how each of the runs would be started. One could for instance also prepare a file of commands to run and make that the input to parallel as well.&lt;br /&gt;
* Note that submitting several bunches to single nodes, as in the section above, is a more failsafe way of proceeding, since a node failure would only affect one of these bunches, rather than all runs. &lt;br /&gt;
* GNU Parallel can be passed a file with the list of nodes to which to ssh, using &amp;lt;tt&amp;gt;--sshloginfile&amp;lt;/tt&amp;gt; (thanks to Ole Tange for pointing this out). This list is automatically generated by the scheduler and its name is made available in the environment variable $PBS_NODEFILE.&lt;br /&gt;
* Alternatively, GNU Parallel can take a comma separated list of nodes given to its -S argument, but this would need to be constructed from the file $PBS_NODEFILE which contains all nodes assigned to the job, with each node duplicated 8x for the number of cores on each node.&lt;br /&gt;
* GNU Parallel can reads lines of input and convert those to arguments in the execution command. The execution command is the last argument given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;, with &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; replaces by the lines on input.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;The --workdir argument is essential: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would likely not get any output at all!&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;This is no longer true for the latest GNU Parallel module (20130422), which puts you in the current directory on the remote nodes.&lt;br /&gt;
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.&lt;br /&gt;
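&lt;br /&gt;
As a minimal sketch of the commands-file approach mentioned in the notes above (the file name &amp;lt;tt&amp;gt;mycommands.txt&amp;lt;/tt&amp;gt; and the job commands are hypothetical, not part of any SciNet setup):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Hypothetical sketch: one full command per line in a plain text file&lt;br /&gt;
cat &amp;gt; mycommands.txt &amp;lt;&amp;lt;EOF&lt;br /&gt;
cd jobdir1; ./dojob1&lt;br /&gt;
cd jobdir2; ./dojob2&lt;br /&gt;
cd jobdir3; ./dojob3&lt;br /&gt;
EOF&lt;br /&gt;
# Feed the file to parallel; each line becomes one job&lt;br /&gt;
parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD &amp;lt; mycommands.txt&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;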
&lt;br /&gt;
===More on GNU parallel=== &lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page describes all the options, including how to make sure the output of different jobs does not get scrambled together, and gives examples.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel Reference===&lt;br /&gt;
* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
===Older scripts===&lt;br /&gt;
&lt;br /&gt;
Older scripts, which mimicked some of GNU parallel functionality, can be found on the [[Deprecated scripts]] page.&lt;br /&gt;
&lt;br /&gt;
--[[User:Rzon|Rzon]] 02:22, 14 Nov 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7129</id>
		<title>User Serial</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7129"/>
		<updated>2014-07-30T19:20:47Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* GNU Parallel */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===General considerations===&lt;br /&gt;
&lt;br /&gt;
====Use a whole node...====&lt;br /&gt;
&lt;br /&gt;
When you submit a job on a SciNet system, it is run on one (or more than one) entire node - meaning that your job is occupying at least 8 processors for the duration of its run.  The SciNet systems are usually busy, with many researchers waiting in the queue for computational resources, so we require that you make full use of the nodes that your job is allocated, so other researchers don't have to wait unnecessarily, and so that your jobs get as much work done for you as possible while they run.&lt;br /&gt;
&lt;br /&gt;
Often, the best way to make full use of the node is to run one large parallel computation; but sometimes it is beneficial to run several serial codes at the same time.  On this page, we discuss ways to run suites of serial computations at once, as efficiently as possible, using the full resources of the node.&lt;br /&gt;
&lt;br /&gt;
====...but not more.====&lt;br /&gt;
&lt;br /&gt;
When running multiple jobs on the same node, it is essential to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available &lt;br /&gt;
to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes &amp;lt;tt&amp;gt;gpc01..04&amp;lt;/tt&amp;gt;, and [[GPC_Quickstart#Memory_Configuration|somewhat more for some compute nodes]]).&lt;br /&gt;
So the jobs also have to be bunched in ways that will fit into 14GB.  If they use more than this, they will crash the node, inconveniencing you and other researchers waiting for that node.&lt;br /&gt;
&lt;br /&gt;
If that's not possible -- that is, if each individual job requires significantly more than ~1.75GB -- then you can just run fewer jobs at a time so that they do fit; but then there is again an under-utilization problem.   In that case, the jobs are likely candidates for parallelization, and you can contact us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] and arrange a meeting with one of the technical analysts to help you do just that.&lt;br /&gt;
&lt;br /&gt;
If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores for certain types of tasks has increased some users' overall throughput by 10 to 30 percent.&lt;br /&gt;
&lt;br /&gt;
====Is your job really serial?====&lt;br /&gt;
&lt;br /&gt;
While your program may not be explicitly parallel, it may use some of SciNet's threaded libraries for numerical computations, which can make use of multiple processors.  In particular, SciNet's [[Python]] and [[R_Statistical_Package | R]] modules are compiled with aggressive optimization and with threaded numerical libraries which by default will make use of multiple cores for computations such as large matrix operations.  This can greatly speed up individual runs, but by less (usually much less) than a factor of 8.  If you do have many such computations to do, your [[Introduction_To_Performance#Throughput | throughput]] will be better - you will get more calculations done per unit time - if you turn off the threading and run multiple such computations at once.  Threading is turned off with the shell script line &amp;lt;tt&amp;gt;export OMP_NUM_THREADS=1&amp;lt;/tt&amp;gt;; that line will be included in the scripts below.  &lt;br /&gt;
&lt;br /&gt;
If your calculations do implicitly use threading, you may want to experiment to see what gives you the best performance - you may find that running 4 (or even 8) jobs with 2 threads each (&amp;lt;tt&amp;gt;OMP_NUM_THREADS=2&amp;lt;/tt&amp;gt;), or 2 jobs with 4 threads, gives better performance than 8 jobs with 1 thread (and almost certainly better than 1 job with 8 threads).  We'd encourage you to perform exactly such a [[Introduction_To_Performance#Strong_Scaling_Tests | scaling test]]; for a small up-front investment in time you may significantly speed up all the computations you need to do.&lt;br /&gt;
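&lt;br /&gt;
A minimal sketch of such an experiment, using the same hypothetical job directories and executables as in the script below: four 2-threaded runs at once instead of eight 1-threaded ones.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Hypothetical sketch: time this against 8 single-threaded runs to see which gives better throughput&lt;br /&gt;
export OMP_NUM_THREADS=2&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;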
&lt;br /&gt;
===Serial jobs of similar duration===&lt;br /&gt;
&lt;br /&gt;
The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a &lt;br /&gt;
bit like this&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; ampersand off 8 jobs and wait&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
(cd jobdir5; ./dojob5) &amp;amp;&lt;br /&gt;
(cd jobdir6; ./dojob6) &amp;amp;&lt;br /&gt;
(cd jobdir7; ./dojob7) &amp;amp;&lt;br /&gt;
(cd jobdir8; ./dojob8) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are three important things to take note of here.  First, the &amp;lt;tt&amp;gt;'''wait'''&amp;lt;/tt&amp;gt;&lt;br /&gt;
command at the end is crucial; without it the job will terminate &lt;br /&gt;
immediately, killing the 8 programs you just started.&lt;br /&gt;
&lt;br /&gt;
Second is that it is important to group the programs by how long they &lt;br /&gt;
will take.   If (say) &amp;lt;tt&amp;gt;dojob8&amp;lt;/tt&amp;gt; takes 2 hours and the rest only take 1, &lt;br /&gt;
then for one hour 7 of the 8 cores on the GPC node are wasted; they are &lt;br /&gt;
sitting idle but are unavailable for other users, and the utilization of &lt;br /&gt;
this node over the whole run is only 56%.   This is the sort of thing &lt;br /&gt;
we'll notice, and users who don't make efficient use of the machine will &lt;br /&gt;
have their ability to use SciNet resources reduced.  If you have many serial jobs of varying length, &lt;br /&gt;
use the submission script to balance the computational load, as explained [[ #Serial jobs of varying duration | below]].&lt;br /&gt;
&lt;br /&gt;
Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel===&lt;br /&gt;
&lt;br /&gt;
GNU parallel is a really nice tool written by Ole Tange to run multiple serial jobs in&lt;br /&gt;
parallel. It allows you to keep the processors on each 8-core node busy, provided you give it enough jobs to do.&lt;br /&gt;
&lt;br /&gt;
GNU parallel is accessible on the GPC in the module&lt;br /&gt;
&amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load gnu-parallel/20140622&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that there are several versions of gnu-parallel installed on the GPC; we recommend using the newer version. &lt;br /&gt;
&lt;br /&gt;
The citation for GNU Parallel is: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
It is easiest to demonstrate the usage of GNU parallel by&lt;br /&gt;
examples. Suppose you have 16 jobs to do, that the duration of these jobs varies quite a bit, but that the average job duration is around 10 hours. You could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N gnu-parallel-example&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20140622  &lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND&lt;br /&gt;
parallel -j 8 &amp;lt;&amp;lt;EOF&lt;br /&gt;
  cd jobdir1; ./dojob1; echo &amp;quot;job 1 finished&amp;quot;&lt;br /&gt;
  cd jobdir2; ./dojob2; echo &amp;quot;job 2 finished&amp;quot;&lt;br /&gt;
  cd jobdir3; ./dojob3; echo &amp;quot;job 3 finished&amp;quot;&lt;br /&gt;
  cd jobdir4; ./dojob4; echo &amp;quot;job 4 finished&amp;quot;&lt;br /&gt;
  cd jobdir5; ./dojob5; echo &amp;quot;job 5 finished&amp;quot;&lt;br /&gt;
  cd jobdir6; ./dojob6; echo &amp;quot;job 6 finished&amp;quot;&lt;br /&gt;
  cd jobdir7; ./dojob7; echo &amp;quot;job 7 finished&amp;quot;&lt;br /&gt;
  cd jobdir8; ./dojob8; echo &amp;quot;job 8 finished&amp;quot;&lt;br /&gt;
  cd jobdir9; ./dojob9; echo &amp;quot;job 9 finished&amp;quot;&lt;br /&gt;
  cd jobdir10; ./dojob10; echo &amp;quot;job 10 finished&amp;quot;&lt;br /&gt;
  cd jobdir11; ./dojob11; echo &amp;quot;job 11 finished&amp;quot;&lt;br /&gt;
  cd jobdir12; ./dojob12; echo &amp;quot;job 12 finished&amp;quot;&lt;br /&gt;
  cd jobdir13; ./dojob13; echo &amp;quot;job 13 finished&amp;quot;&lt;br /&gt;
  cd jobdir14; ./dojob14; echo &amp;quot;job 14 finished&amp;quot;&lt;br /&gt;
  cd jobdir15; ./dojob15; echo &amp;quot;job 15 finished&amp;quot;&lt;br /&gt;
  cd jobdir16; ./dojob16; echo &amp;quot;job 16 finished&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; parameter sets the number of jobs to run at the same time, but 16 jobs are lined up. Initially, 8 jobs are given to the 8 processors on the node. When one of the processors is done with its assigned job, it will get the next job instead of sitting idle until the other processors are done. While you would expect that on average this script should take 20 hours (each processor on average has to complete two jobs of 10 hours), there's a good chance that one of the processors gets two jobs that take more than 10 hours, so the job script requests 24 hours. How much more time you should ask for in practice depends on the spread in run times of the separate jobs.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of varying duration===&lt;br /&gt;
&lt;br /&gt;
If you have a lot (50+) of relatively short serial runs to do, '''of which the walltime varies''', and if you know that eight jobs fit in memory without issues, then writing all the commands explicitly in the jobscript can get tedious. If you follow the convention that the jobs are all started by auxiliary scripts called job&amp;lt;something&amp;gt;.sh, the following strategy in your submission script maximizes the cpu utilization. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamic&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422  &lt;br /&gt;
&lt;br /&gt;
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*.sh&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Notes:&lt;br /&gt;
* As before, GNU Parallel keeps 8 jobs running at a time, and if one finishes, starts the next. This is an easy way to do ''load balancing''.&lt;br /&gt;
* You can in fact run more or fewer than 8 processes per node by modifying &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; argument.&lt;br /&gt;
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to the performance. In that case, running from the ramdisk may be an option.  &lt;br /&gt;
* When using a ramdisk, make sure you copy your results from the ramdisk back to scratch after the runs, or when the job is killed because time has run out (a minimal sketch follows this list).&lt;br /&gt;
* More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
* This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.&lt;br /&gt;
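&lt;br /&gt;
A minimal sketch of the ramdisk copy-back mentioned above, assuming (as described on the [[User_Ramdisk|Ramdisk wiki page]]) that the ramdisk is mounted at /dev/shm; the directory and file names are hypothetical:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Hypothetical sketch: stage one run on the ramdisk, then copy results back before the job ends&lt;br /&gt;
RAMDIR=/dev/shm/$USER/run1&lt;br /&gt;
mkdir -p $RAMDIR&lt;br /&gt;
cp -r $PBS_O_WORKDIR/jobdir1/. $RAMDIR/&lt;br /&gt;
cd $RAMDIR&lt;br /&gt;
./dojob1&lt;br /&gt;
# Copy the (hypothetical) output directory back to the submission directory on scratch&lt;br /&gt;
cp -r $RAMDIR/output $PBS_O_WORKDIR/jobdir1/&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;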
&lt;br /&gt;
===Version for more than 8 cores at once (still serial)===&lt;br /&gt;
&lt;br /&gt;
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one node and to use the following routine to distribute your processes amongst the cores. In this case, it is important to use the newer version of GNU parallel installed on the GPC.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=25:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamicMulti&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422&lt;br /&gt;
&lt;br /&gt;
# START PARALLEL JOBS USING NODE LIST IN $PBS_NODEFILE&lt;br /&gt;
seq 800 | parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Explanation:&lt;br /&gt;
* &amp;lt;tt&amp;gt;seq 800&amp;lt;/tt&amp;gt; outputs the numbers 1 through 800 on separate lines. This output is piped to (ie becomes the input of) the &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
* The point of the &amp;quot;seq 800&amp;quot; is that each line given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; defines a new job; so here, there are 800 jobs.&lt;br /&gt;
* Each job runs a command, but because the numbers generated by seq are not commands, a real command is constructed, in this case, by the argument &amp;lt;tt&amp;gt;./myrun {}&amp;lt;/tt&amp;gt;. Here &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; is supposed to be the name of the application to run. The two curly brackets &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; get replaced by the line from the input, that is, by one of the numbers.&lt;br /&gt;
* So parallel will run the 800 commands:&amp;lt;br/&amp;gt;./myrun 1&amp;lt;br/&amp;gt;./myrun 2&amp;lt;br/&amp;gt;...&amp;lt;br/&amp;gt;./myrun 800&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;--sshloginfile $PBS_NODEFILE&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to look for the file named $PBS_NODEFILE which contains the host names of the nodes assigned to the current job (as stated above, it is automatically generated).&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to run 8 of these at a time on each of the hosts.&lt;br /&gt;
* The &amp;lt;tt&amp;gt;--workdir $PWD&amp;lt;/tt&amp;gt; sets the working directory on the other nodes to the working directory on the first node. Without this, the run tries to start from the wrong place and will most likely fail (unless using the latest gnu parallel module, 20130422, which by default puts you in $PWD on the remote node).&lt;br /&gt;
* With the latest gnu parallel module, loaded modules should automatically get loaded on the remote nodes too; this is not the case for earlier versions.&lt;br /&gt;
* If you need an environment variable to be transferred from the job script to the remotely running subjobs, use &amp;lt;tt&amp;gt;--env ENVIRONMENTVARIABLE&amp;lt;/tt&amp;gt; (a minimal sketch follows this list).&lt;br /&gt;
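&lt;br /&gt;
For instance, a minimal sketch of the &amp;lt;tt&amp;gt;--env&amp;lt;/tt&amp;gt; option (the variable name MYPARAM is hypothetical):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Hypothetical sketch: MYPARAM is set in the job script and passed on to the remote subjobs&lt;br /&gt;
export MYPARAM=42&lt;br /&gt;
seq 800 | parallel -j8 --env MYPARAM --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;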
Notes:&lt;br /&gt;
* Of course, this is just an example of what you could do with gnu parallel. How you set up your specific run depends on how each of the runs would be started. One could for instance also prepare a file of commands to run and make that the input to parallel as well.&lt;br /&gt;
* Note that submitting several bunches to single nodes, as in the section above, is a more failsafe way of proceeding, since a node failure would only affect one of these bunches, rather than all runs. &lt;br /&gt;
* GNU Parallel can be passed a file with the list of nodes to which to ssh, using &amp;lt;tt&amp;gt;--sshloginfile&amp;lt;/tt&amp;gt; (thanks to Ole Tange for pointing this out). This list is automatically generated by the scheduler and its name is made available in the environment variable $PBS_NODEFILE.&lt;br /&gt;
* Alternatively, GNU Parallel can take a comma-separated list of nodes given to its -S argument, but this would need to be constructed from the file $PBS_NODEFILE, which contains all nodes assigned to the job, with each node duplicated 8x for the number of cores on each node (see the sketch after this list).&lt;br /&gt;
* GNU Parallel reads lines of input and converts them into arguments of the execution command. The execution command is the last argument given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;, with &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; replaced by the line of input.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;The --workdir argument is essential: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would likely not get any output at all!&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;This is no longer true for the latest GNU Parallel module (20130422), which puts you in the current directory on the remote nodes.&lt;br /&gt;
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.&lt;br /&gt;
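&lt;br /&gt;
A minimal sketch of the alternative &amp;lt;tt&amp;gt;-S&amp;lt;/tt&amp;gt; form mentioned in the notes, using standard shell tools to build the host list (this is an illustration, not the recommended recipe):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Hypothetical sketch: build a comma-separated list of the unique host names&lt;br /&gt;
# ($PBS_NODEFILE lists each node 8 times, once per core)&lt;br /&gt;
HOSTS=$(sort -u $PBS_NODEFILE | paste -s -d,)&lt;br /&gt;
seq 800 | parallel -j8 -S $HOSTS --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;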
&lt;br /&gt;
===More on GNU parallel=== &lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page describes all the options, including how to make sure the output of different jobs does not get scrambled together, and gives examples (a small sketch of one such option follows below).&lt;br /&gt;
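&lt;br /&gt;
For instance, a small sketch of one such option (a hypothetical local example, not taken from the man page verbatim): &amp;lt;tt&amp;gt;-k&amp;lt;/tt&amp;gt; (&amp;lt;tt&amp;gt;--keep-order&amp;lt;/tt&amp;gt;) prints the output of the jobs in the order of the input lines rather than in the order of completion.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Hypothetical sketch: output appears as for run 1, then run 2, ... regardless of which finishes first&lt;br /&gt;
seq 8 | parallel -j8 -k ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;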
&lt;br /&gt;
===GNU Parallel Reference===&lt;br /&gt;
* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
===Older scripts===&lt;br /&gt;
&lt;br /&gt;
Older scripts, which mimicked some of GNU parallel functionality, can be found on the [[Deprecated scripts]] page.&lt;br /&gt;
&lt;br /&gt;
--[[User:Rzon|Rzon]] 02:22, 14 Nov 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7128</id>
		<title>User Serial</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7128"/>
		<updated>2014-07-30T19:19:12Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Serial jobs of similar duration */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===General considerations===&lt;br /&gt;
&lt;br /&gt;
====Use a whole node...====&lt;br /&gt;
&lt;br /&gt;
When you submit a job on a SciNet system, it is run on one (or more than one) entire node - meaning that your job is occupying at least 8 processors for the duration of its run.  The SciNet systems are usually busy, with many researchers waiting in the queue for computational resources, so we require that you make full use of the nodes that your job is allocated, so other researchers don't have to wait unnecessarily, and so that your jobs get as much work done for you as possible while they run.&lt;br /&gt;
&lt;br /&gt;
Often, the best way to make full use of the node is to run one large parallel computation; but sometimes it is beneficial to run several serial codes at the same time.  On this page, we discuss ways to run suites of serial computations at once, as efficiently as possible, using the full resources of the node.&lt;br /&gt;
&lt;br /&gt;
====...but not more.====&lt;br /&gt;
&lt;br /&gt;
When running multiple jobs on the same node, it is essential to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available &lt;br /&gt;
to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes &amp;lt;tt&amp;gt;gpc01..04&amp;lt;/tt&amp;gt;, and [[GPC_Quickstart#Memory_Configuration|somewhat more for some compute nodes]]).&lt;br /&gt;
So the jobs also have to be bunched in ways that will fit into 14GB.  If they use more than this, they will crash the node, inconveniencing you and other researchers waiting for that node.&lt;br /&gt;
&lt;br /&gt;
If that's not possible -- that is, if each individual job requires significantly more than ~1.75GB -- then you can just run fewer jobs at a time so that they do fit; but then there is again an under-utilization problem.   In that case, the jobs are likely candidates for parallelization, and you can contact us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] and arrange a meeting with one of the technical analysts to help you do just that.&lt;br /&gt;
&lt;br /&gt;
If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores for certain types of tasks has increased some users' overall throughput by 10 to 30 percent.&lt;br /&gt;
&lt;br /&gt;
====Is your job really serial?====&lt;br /&gt;
&lt;br /&gt;
While your program may not be explicitly parallel, it may use some of SciNet's threaded libraries for numerical computations, which can make use of multiple processors.  In particular, SciNet's [[Python]] and [[R_Statistical_Package | R]] modules are compiled with aggressive optimization and with threaded numerical libraries which by default will make use of multiple cores for computations such as large matrix operations.  This can greatly speed up individual runs, but by less (usually much less) than a factor of 8.  If you do have many such computations to do, your [[Introduction_To_Performance#Throughput | throughput]] will be better - you will get more calculations done per unit time - if you turn off the threading and run multiple such computations at once.  Threading is turned off with the shell script line &amp;lt;tt&amp;gt;export OMP_NUM_THREADS=1&amp;lt;/tt&amp;gt;; that line will be included in the scripts below.  &lt;br /&gt;
&lt;br /&gt;
If your calculations do implicitly use threading, you may want to experiment to see what gives you the best performance - you may find that running 4 (or even 8) jobs with 2 threads each (&amp;lt;tt&amp;gt;OMP_NUM_THREADS=2&amp;lt;/tt&amp;gt;), or 2 jobs with 4 threads, gives better performance than 8 jobs with 1 thread (and almost certainly better than 1 job with 8 threads).  We'd encourage you to perform exactly such a [[Introduction_To_Performance#Strong_Scaling_Tests | scaling test]]; for a small up-front investment in time you may significantly speed up all the computations you need to do.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of similar duration===&lt;br /&gt;
&lt;br /&gt;
The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a &lt;br /&gt;
bit like this&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# Turn off implicit threading in Python, R&lt;br /&gt;
export OMP_NUM_THREADS=1&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; ampersand off 8 jobs and wait&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
(cd jobdir5; ./dojob5) &amp;amp;&lt;br /&gt;
(cd jobdir6; ./dojob6) &amp;amp;&lt;br /&gt;
(cd jobdir7; ./dojob7) &amp;amp;&lt;br /&gt;
(cd jobdir8; ./dojob8) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are three important things to take note of here.  First, the &amp;lt;tt&amp;gt;'''wait'''&amp;lt;/tt&amp;gt;&lt;br /&gt;
command at the end is crucial; without it the job will terminate &lt;br /&gt;
immediately, killing the 8 programs you just started.&lt;br /&gt;
&lt;br /&gt;
Second is that it is important to group the programs by how long they &lt;br /&gt;
will take.   If (say) &amp;lt;tt&amp;gt;dojob8&amp;lt;/tt&amp;gt; takes 2 hours and the rest only take 1, &lt;br /&gt;
then for one hour 7 of the 8 cores on the GPC node are wasted; they are &lt;br /&gt;
sitting idle but are unavailable for other users, and the utilization of &lt;br /&gt;
this node over the whole run is only 56%.   This is the sort of thing &lt;br /&gt;
we'll notice, and users who don't make efficient use of the machine will &lt;br /&gt;
have their ability to use SciNet resources reduced.  If you have many serial jobs of varying length, &lt;br /&gt;
use the submission script to balance the computational load, as explained [[ #Serial jobs of varying duration | below]].&lt;br /&gt;
&lt;br /&gt;
Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel===&lt;br /&gt;
&lt;br /&gt;
GNU parallel is a really nice tool written by Ole Tange to run multiple serial jobs in&lt;br /&gt;
parallel. It allows you to keep the processors on each 8-core node busy, provided you give it enough jobs to do.&lt;br /&gt;
&lt;br /&gt;
GNU parallel is accessible on the GPC in the module&lt;br /&gt;
&amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load gnu-parallel/20130422&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that there are currently (May 2013) four versions of gnu-parallel installed on the GPC, with the oldest version, gnu-parallel/2010, as the default, although we'd recommend using a newer version. &lt;br /&gt;
&lt;br /&gt;
Note that the citation for GNU Parallel is: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
It is easiest to demonstrate the usage of GNU parallel by&lt;br /&gt;
examples. Suppose you have 16 jobs to do, that the duration of these jobs varies quite a bit, but that the average job duration is around 10 hours. You could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N gnu-parallel-example&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422  &lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND&lt;br /&gt;
parallel -j 8 &amp;lt;&amp;lt;EOF&lt;br /&gt;
  cd jobdir1; ./dojob1; echo &amp;quot;job 1 finished&amp;quot;&lt;br /&gt;
  cd jobdir2; ./dojob2; echo &amp;quot;job 2 finished&amp;quot;&lt;br /&gt;
  cd jobdir3; ./dojob3; echo &amp;quot;job 3 finished&amp;quot;&lt;br /&gt;
  cd jobdir4; ./dojob4; echo &amp;quot;job 4 finished&amp;quot;&lt;br /&gt;
  cd jobdir5; ./dojob5; echo &amp;quot;job 5 finished&amp;quot;&lt;br /&gt;
  cd jobdir6; ./dojob6; echo &amp;quot;job 6 finished&amp;quot;&lt;br /&gt;
  cd jobdir7; ./dojob7; echo &amp;quot;job 7 finished&amp;quot;&lt;br /&gt;
  cd jobdir8; ./dojob8; echo &amp;quot;job 8 finished&amp;quot;&lt;br /&gt;
  cd jobdir9; ./dojob9; echo &amp;quot;job 9 finished&amp;quot;&lt;br /&gt;
  cd jobdir10; ./dojob10; echo &amp;quot;job 10 finished&amp;quot;&lt;br /&gt;
  cd jobdir11; ./dojob11; echo &amp;quot;job 11 finished&amp;quot;&lt;br /&gt;
  cd jobdir12; ./dojob12; echo &amp;quot;job 12 finished&amp;quot;&lt;br /&gt;
  cd jobdir13; ./dojob13; echo &amp;quot;job 13 finished&amp;quot;&lt;br /&gt;
  cd jobdir14; ./dojob14; echo &amp;quot;job 14 finished&amp;quot;&lt;br /&gt;
  cd jobdir15; ./dojob15; echo &amp;quot;job 15 finished&amp;quot;&lt;br /&gt;
  cd jobdir16; ./dojob16; echo &amp;quot;job 16 finished&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; parameter sets the number of jobs to run at the same time, but 16 jobs are lined up. Initially, 8 jobs are given to the 8 processors on the node. When one of the processors is done with its assigned job, it will get the next job instead of sitting idle until the other processors are done. While you would expect that on average this script should take 20 hours (each processor on average has to complete two jobs of 10 hours), there's a good chance that one of the processors gets two jobs that take more than 10 hours, so the job script requests 24 hours. How much more time you should ask for in practice depends on the spread in run times of the separate jobs.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of varying duration===&lt;br /&gt;
&lt;br /&gt;
If you have a lot (50+) of relatively short serial runs to do, '''of which the walltime varies''', and if you know that eight jobs fit in memory without issues, then writing all the commands explicitly in the jobscript can get tedious. If you follow the convention that the jobs are all started by auxiliary scripts called job&amp;lt;something&amp;gt;.sh, the following strategy in your submission script maximizes the cpu utilization. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamic&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422  &lt;br /&gt;
&lt;br /&gt;
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*.sh&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Notes:&lt;br /&gt;
* As before, GNU Parallel keeps 8 jobs running at a time, and if one finishes, starts the next. This is an easy way to do ''load balancing'' (a sketch after this list shows how to also keep a log of the completed jobs).&lt;br /&gt;
* You can in fact run more or fewer than 8 processes per node by modifying &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; argument.&lt;br /&gt;
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to the performance. In that case, running from the ramdisk may be an option.  &lt;br /&gt;
* When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch after the runs, or when the job is killed because time has run out.&lt;br /&gt;
* More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
* This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.&lt;br /&gt;
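&lt;br /&gt;
A minimal sketch of keeping a record of the completed jobs with &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;--joblog&amp;lt;/tt&amp;gt; option (the log file name is arbitrary):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Hypothetical sketch: progress.log will list the start time, runtime, exit status and command of each job&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8 --joblog progress.log&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;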
&lt;br /&gt;
===Version for more than 8 cores at once (still serial)===&lt;br /&gt;
&lt;br /&gt;
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one node and to use the following routine to distribute your processes amongst the cores. In this case, it is important to use the newer version of GNU parallel installed on the GPC.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=25:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamicMulti&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422&lt;br /&gt;
&lt;br /&gt;
# START PARALLEL JOBS USING NODE LIST IN $PBS_NODEFILE&lt;br /&gt;
seq 800 | parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Explanation:&lt;br /&gt;
* &amp;lt;tt&amp;gt;seq 800&amp;lt;/tt&amp;gt; outputs the numbers 1 through 800 on separate lines. This output is piped to (ie becomes the input of) the &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
* The point of the &amp;quot;seq 800&amp;quot; is that each line given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; defines a new job; so here, there are 800 jobs.&lt;br /&gt;
* Each job runs a command, but because the numbers generated by seq are not commands, a real command is constructed, in this case, by the argument &amp;lt;tt&amp;gt;./myrun {}&amp;lt;/tt&amp;gt;. Here &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; is supposed to be the name of the application to run. The two curly brackets &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; get replaced by the line from the input, that is, by one of the numbers.&lt;br /&gt;
* So parallel will run the 800 commands:&amp;lt;br/&amp;gt;./myrun 1&amp;lt;br/&amp;gt;./myrun 2&amp;lt;br/&amp;gt;...&amp;lt;br/&amp;gt;./myrun 800&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;--sshloginfile $PBS_NODEFILE&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to look for the file named $PBS_NODEFILE which contains the host names of the nodes assigned to the current job (as stated above, it is automatically generated).&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to run 8 of these at a time on each of the hosts.&lt;br /&gt;
* The &amp;lt;tt&amp;gt;--workdir $PWD&amp;lt;/tt&amp;gt; sets the working directory on the other nodes to the working directory on the first node. Without this, the run tries to start from the wrong place and will most likely fail (unless using the latest gnu parallel module, 20130422, which by default puts you in $PWD on the remote node).&lt;br /&gt;
* With the latest gnu parallel module, loaded modules should automatically get loaded on the remote nodes too; this is not the case for earlier versions.&lt;br /&gt;
* If you need an environment variable to be transferred from the job script to the remotely running subjobs, use &amp;lt;tt&amp;gt;--env ENVIRONMENTVARIABLE&amp;lt;/tt&amp;gt;.&lt;br /&gt;
Notes:&lt;br /&gt;
* Of course, this is just an example of what you could do with gnu parallel. How you set up your specific run depends on how each of the runs would be started. One could for instance also prepare a file of commands to run and make that the input to parallel as well.&lt;br /&gt;
* Note that submitting several bunches to single nodes, as in the section above, is a more failsafe way of proceeding, since a node failure would only affect one of these bunches, rather than all runs. &lt;br /&gt;
* GNU Parallel can be passed a file with the list of nodes to which to ssh, using &amp;lt;tt&amp;gt;--sshloginfile&amp;lt;/tt&amp;gt; (thanks to Ole Tange for pointing this out). This list is automatically generated by the scheduler and its name is made available in the environment variable $PBS_NODEFILE.&lt;br /&gt;
* Alternatively, GNU Parallel can take a comma separated list of nodes given to its -S argument, but this would need to be constructed from the file $PBS_NODEFILE which contains all nodes assigned to the job, with each node duplicated 8x for the number of cores on each node.&lt;br /&gt;
* GNU Parallel reads lines of input and converts them into arguments of the execution command. The execution command is the last argument given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;, with &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; replaced by the line of input.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;The --workdir argument is essential: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would likely not get any output at all!&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;This is no longer true for the latest GNU Parallel module (20130422), which puts you in the current directory on the remote nodes.&lt;br /&gt;
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.&lt;br /&gt;
&lt;br /&gt;
===More on GNU parallel=== &lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page describes all the options, including how to make sure the output of different jobs does not get scrambled together, and gives examples.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel Reference===&lt;br /&gt;
* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
===Older scripts===&lt;br /&gt;
&lt;br /&gt;
Older scripts, which mimicked some of GNU parallel functionality, can be found on the [[Deprecated scripts]] page.&lt;br /&gt;
&lt;br /&gt;
--[[User:Rzon|Rzon]] 02:22, 14 Nov 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7127</id>
		<title>User Serial</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7127"/>
		<updated>2014-07-30T19:18:37Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* General considerations */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===General considerations===&lt;br /&gt;
&lt;br /&gt;
====Use a whole node...====&lt;br /&gt;
&lt;br /&gt;
When you submit a job on a SciNet system, it is run on one (or more than one) entire node - meaning that your job is occupying at least 8 processors for the duration of its run.  The SciNet systems are usually busy, with many researchers waiting in the queue for computational resources, so we require that you make full use of the nodes that your job is allocated, so other researchers don't have to wait unnecessarily, and so that your jobs get as much work done for you as possible while they run.&lt;br /&gt;
&lt;br /&gt;
Often, the best way to make full use of the node is to run one large parallel computation; but sometimes it is beneficial to run several serial codes at the same time.  On this page, we discuss ways to run suites of serial computations at once, as efficiently as possible, using the full resources of the node.&lt;br /&gt;
&lt;br /&gt;
====...but not more.====&lt;br /&gt;
&lt;br /&gt;
When running multiple jobs on the same node, it is essential to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available &lt;br /&gt;
to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes &amp;lt;tt&amp;gt;gpc01..04&amp;lt;/tt&amp;gt;, and [[GPC_Quickstart#Memory_Configuration|somewhat more for some compute nodes]]).&lt;br /&gt;
So the jobs also have to be bunched in ways that will fit into 14GB.  If they use more than this, they will crash the node, inconveniencing you and other researchers waiting for that node.&lt;br /&gt;
&lt;br /&gt;
If that's not possible -- that is, if each individual job requires significantly more than ~1.75GB -- then you can just run fewer jobs at a time so that they do fit; but then there is again an under-utilization problem.   In that case, the jobs are likely candidates for parallelization, and you can contact us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] and arrange a meeting with one of the technical analysts to help you do just that.&lt;br /&gt;
&lt;br /&gt;
If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores for certain types of tasks has increased some users' overall throughput by 10 to 30 percent.&lt;br /&gt;
&lt;br /&gt;
====Is your job really serial?====&lt;br /&gt;
&lt;br /&gt;
While your program may not be explicitly parallel, it may use some of SciNet's threaded libraries for numerical computations, which can make use of multiple processors.  In particular, SciNet's [[Python]] and [[R_Statistical_Package | R]] modules are compiled with aggressive optimization and with threaded numerical libraries which by default will make use of multiple cores for computations such as large matrix operations.  This can greatly speed up individual runs, but by less (usually much less) than a factor of 8.  If you do have many such computations to do, your [[Introduction_To_Performance#Throughput | throughput]] will be better - you will get more calculations done per unit time - if you turn off the threading and run multiple such computations at once.  Threading is turned off with the shell script line &amp;lt;tt&amp;gt;export OMP_NUM_THREADS=1&amp;lt;/tt&amp;gt;; that line will be included in the scripts below.  &lt;br /&gt;
&lt;br /&gt;
If your calculations do implicitly use threading, you may want to experiment to see what gives you the best performance - you may find that running 4 (or even 8) jobs with 2 threads each (&amp;lt;tt&amp;gt;OMP_NUM_THREADS=2&amp;lt;/tt&amp;gt;), or 2 jobs with 4 threads, gives better performance than 8 jobs with 1 thread (and almost certainly better than 1 job with 8 threads).  We'd encourage you to perform exactly such a [[Introduction_To_Performance#Strong_Scaling_Tests | scaling test]]; for a small up-front investment in time you may significantly speed up all the computations you need to do.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of similar duration===&lt;br /&gt;
&lt;br /&gt;
The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a &lt;br /&gt;
bit like this&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; ampersand off 8 jobs and wait&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
(cd jobdir5; ./dojob5) &amp;amp;&lt;br /&gt;
(cd jobdir6; ./dojob6) &amp;amp;&lt;br /&gt;
(cd jobdir7; ./dojob7) &amp;amp;&lt;br /&gt;
(cd jobdir8; ./dojob8) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are three important things to take note of here.  First, the &amp;lt;tt&amp;gt;'''wait'''&amp;lt;/tt&amp;gt;&lt;br /&gt;
command at the end is crucial; without it the job will terminate &lt;br /&gt;
immediately, killing the 8 programs you just started.&lt;br /&gt;
&lt;br /&gt;
Second is that it is important to group the programs by how long they &lt;br /&gt;
will take.   If (say) &amp;lt;tt&amp;gt;dojob8&amp;lt;/tt&amp;gt; takes 2 hours and the rest only take 1, &lt;br /&gt;
then for one hour 7 of the 8 cores on the GPC node are wasted; they are &lt;br /&gt;
sitting idle but are unavailable for other users, and the utilization of &lt;br /&gt;
this node over the whole run is only 56%.   This is the sort of thing &lt;br /&gt;
we'll notice, and users who don't make efficient use of the machine will &lt;br /&gt;
have their ability to use SciNet resources reduced.  If you have many serial jobs of varying length, &lt;br /&gt;
use the submission script to balance the computational load, as explained [[ #Serial jobs of varying duration | below]].&lt;br /&gt;
&lt;br /&gt;
Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel===&lt;br /&gt;
&lt;br /&gt;
GNU parallel is a really nice tool written by Ole Tange to run multiple serial jobs in&lt;br /&gt;
parallel. It allows you to keep the processors on each 8-core node busy, provided you give it enough jobs to do.&lt;br /&gt;
&lt;br /&gt;
GNU parallel is accessible on the GPC in the module&lt;br /&gt;
&amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load gnu-parallel/20130422&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that there are currently (May 2013) four versions of gnu-parallel installed on the GPC, with the oldest version, gnu-parallel/2010, as the default, although we'd recommend using a newer version. &lt;br /&gt;
&lt;br /&gt;
Note that the citation for GNU Parallel is: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
It is easiest to demonstrate the usage of GNU parallel by&lt;br /&gt;
examples. Suppose you have 16 jobs to do, that the duration of these jobs varies quite a bit, but that the average job duration is around 10 hours. You could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N gnu-parallel-example&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422  &lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND&lt;br /&gt;
parallel -j 8 &amp;lt;&amp;lt;EOF&lt;br /&gt;
  cd jobdir1; ./dojob1; echo &amp;quot;job 1 finished&amp;quot;&lt;br /&gt;
  cd jobdir2; ./dojob2; echo &amp;quot;job 2 finished&amp;quot;&lt;br /&gt;
  cd jobdir3; ./dojob3; echo &amp;quot;job 3 finished&amp;quot;&lt;br /&gt;
  cd jobdir4; ./dojob4; echo &amp;quot;job 4 finished&amp;quot;&lt;br /&gt;
  cd jobdir5; ./dojob5; echo &amp;quot;job 5 finished&amp;quot;&lt;br /&gt;
  cd jobdir6; ./dojob6; echo &amp;quot;job 6 finished&amp;quot;&lt;br /&gt;
  cd jobdir7; ./dojob7; echo &amp;quot;job 7 finished&amp;quot;&lt;br /&gt;
  cd jobdir8; ./dojob8; echo &amp;quot;job 8 finished&amp;quot;&lt;br /&gt;
  cd jobdir9; ./dojob9; echo &amp;quot;job 9 finished&amp;quot;&lt;br /&gt;
  cd jobdir10; ./dojob10; echo &amp;quot;job 10 finished&amp;quot;&lt;br /&gt;
  cd jobdir11; ./dojob11; echo &amp;quot;job 11 finished&amp;quot;&lt;br /&gt;
  cd jobdir12; ./dojob12; echo &amp;quot;job 12 finished&amp;quot;&lt;br /&gt;
  cd jobdir13; ./dojob13; echo &amp;quot;job 13 finished&amp;quot;&lt;br /&gt;
  cd jobdir14; ./dojob14; echo &amp;quot;job 14 finished&amp;quot;&lt;br /&gt;
  cd jobdir15; ./dojob15; echo &amp;quot;job 15 finished&amp;quot;&lt;br /&gt;
  cd jobdir16; ./dojob16; echo &amp;quot;job 16 finished&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; parameter sets the number of jobs to run at the same time, but 16 jobs are lined up. Initially, 8 jobs are given to the 8 processors on the node. When one of the processors is done with its assigned job, it will get the next job instead of sitting idle until the other processors are done. While you would expect that on average this script should take 20 hours (each processor on average has to complete two jobs of 10 hours), there's a good chance that one of the processors gets two jobs that take more than 10 hours, so the job script requests 24 hours. How much more time you should ask for in practice depends on the spread in run times of the separate jobs.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of varying duration===&lt;br /&gt;
&lt;br /&gt;
If you have a lot (50+) of relatively short serial runs to do, '''of which the walltime varies''', and if you know that eight jobs fit in memory without issues, then writing all the commands explicitly in the jobscript can get tedious. If you follow the convention that the jobs are all started by auxiliary scripts called job&amp;lt;something&amp;gt;.sh, the following strategy in your submission script maximizes the cpu utilization. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamic&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422  &lt;br /&gt;
&lt;br /&gt;
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*.sh&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Notes:&lt;br /&gt;
* As before, GNU Parallel keeps 8 jobs running at a time, and if one finishes, starts the next. This is an easy way to do ''load balancing''.&lt;br /&gt;
* You can in fact run more or fewer than 8 processes per node by modifying &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; argument.&lt;br /&gt;
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to the performance. In that case, running from the ramdisk may be an option.  &lt;br /&gt;
* When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch after the runs, or when the job is killed because time has run out.&lt;br /&gt;
* More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
* This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.&lt;br /&gt;
&lt;br /&gt;
===Version for more than 8 cores at once (still serial)===&lt;br /&gt;
&lt;br /&gt;
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one node and to use the following routine to distribute your processes amongst the cores. In this case, it is important to use the newer version of GNU parallel installed on the GPC.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=25:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamicMulti&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422&lt;br /&gt;
&lt;br /&gt;
# START PARALLEL JOBS USING NODE LIST IN $PBS_NODEFILE&lt;br /&gt;
seq 800 | parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Explanation:&lt;br /&gt;
* &amp;lt;tt&amp;gt;seq 800&amp;lt;/tt&amp;gt; outputs the numbers 1 through 800 on separate lines. This output is piped to (ie becomes the input of) the &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
* The point of the &amp;quot;seq 800&amp;quot; is that each line given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; defines a new job; so here, there are 800 jobs.&lt;br /&gt;
* Each job runs a command, but because the numbers generated by seq are not commands, a real command is constructed, in this case, by the argument &amp;lt;tt&amp;gt;./myrun {}&amp;lt;/tt&amp;gt;. Here &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; is supposed to be the name of the application to run. The two curly brackets &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; get replaced by the line from the input, that is, by one of the numbers.&lt;br /&gt;
* So parallel will run the 800 commands:&amp;lt;br/&amp;gt;./myrun 1&amp;lt;br/&amp;gt;./myrun 2&amp;lt;br/&amp;gt;...&amp;lt;br/&amp;gt;./myrun 800&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;--sshloginfile $PBS_NODEFILE&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to look for the file named $PBS_NODEFILE which contains the host names of the nodes assigned to the current job (as stated above, it is automatically generated).&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to run 8 of these at a time on each of the hosts.&lt;br /&gt;
* The &amp;lt;tt&amp;gt;--workdir $PWD&amp;lt;/tt&amp;gt; sets the working directory on the other nodes to the working directory on the first node. Without this, the run tries to start from the wrong place and will most likely fail (unless using the latest gnu parallel module, 20130422, which by default puts you in $PWD on the remote node).&lt;br /&gt;
* Loaded modules should get automatically loaded on the remote nodes too for the latest gnu parallel module, but not for earlier ones.&lt;br /&gt;
* If you need an environment variable to be transferred from the job script to the remotely running subjobs, use &amp;lt;tt&amp;gt;--env ENVIRONMENTVARIABLE&amp;lt;/tt&amp;gt;.&lt;br /&gt;
Notes:&lt;br /&gt;
* Of course, this is just an example of what you could do with GNU parallel. How you set up your specific run depends on how each of the runs would be started. One could for instance also prepare a file of commands to run and make that the input to parallel (as sketched below).&lt;br /&gt;
* Note that submitting several bunches to single nodes, as in the section above, is a more fail-safe way of proceeding, since a node failure would then only affect one of these bunches, rather than all runs. &lt;br /&gt;
* GNU Parallel can be passed a file with the list of nodes to which to ssh, using &amp;lt;tt&amp;gt;--sshloginfile&amp;lt;/tt&amp;gt; (thanks to Ole Tange for pointing this out). This list is automatically generated by the scheduler and its name is made available in the environment variable $PBS_NODEFILE.&lt;br /&gt;
* Alternatively, GNU Parallel can take a comma separated list of nodes given to its -S argument, but this would need to be constructed from the file $PBS_NODEFILE which contains all nodes assigned to the job, with each node duplicated 8x for the number of cores on each node.&lt;br /&gt;
* GNU Parallel reads lines of input and converts them into arguments of the execution command. The execution command is the last argument given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;, with &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; replaced by the input lines.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;The --workdir argument is essential: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would likely not get any output at all!&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;This is no longer true for the latest GNU Parallel module (20130422), which puts you in the current directory on the remote nodes.&lt;br /&gt;
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.&lt;br /&gt;
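As an illustration of the file-of-commands approach mentioned above, a minimal sketch could replace the &amp;lt;tt&amp;gt;seq&amp;lt;/tt&amp;gt; line by the following; the file name &amp;lt;tt&amp;gt;commands.txt&amp;lt;/tt&amp;gt; and the commands in it are hypothetical:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# commands.txt contains one complete command per line, e.g.&lt;br /&gt;
#   ./myrun --input case001.dat&lt;br /&gt;
#   ./myrun --input case002.dat&lt;br /&gt;
# Each line becomes one job; parallel spreads them over the nodes.&lt;br /&gt;
parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD &amp;lt; commands.txt&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;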
&lt;br /&gt;
===More on GNU parallel=== &lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, using the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. It describes all the options (such as how to make sure the output is not scrambled) and contains examples.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel Reference===&lt;br /&gt;
* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
===Older scripts===&lt;br /&gt;
&lt;br /&gt;
Older scripts, which mimicked some of GNU parallel functionality, can be found on the [[Deprecated scripts]] page.&lt;br /&gt;
&lt;br /&gt;
--[[User:Rzon|Rzon]] 02:22, 14 Nov 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7126</id>
		<title>User Serial</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=User_Serial&amp;diff=7126"/>
		<updated>2014-07-30T19:05:56Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* General considerations */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===General considerations===&lt;br /&gt;
&lt;br /&gt;
====Use a whole node...====&lt;br /&gt;
&lt;br /&gt;
When you submit a job on a SciNet system, it is run on one (or more than one) entire node - meaning that your job is occupying at least 8 processors for the duration of its run.  The SciNet systems are usually busy, with many researchers waiting in the queue for computational resources, so we require that you make full use of the nodes that your job is allocated, so other researchers don't have to wait unnecessarily, and so that your jobs get as much work done for you while they run as possible.&lt;br /&gt;
&lt;br /&gt;
Often, the best way to make full use of the node is to run one large parallel computation; but sometimes it is beneficial to run several serial codes at the same time.  On this page, we discuss ways to run suites of serial computations at once, as efficiently as possible, using the full resources of the node.&lt;br /&gt;
&lt;br /&gt;
====...but not more.====&lt;br /&gt;
&lt;br /&gt;
When running multiple jobs on the same node, it is essential to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available &lt;br /&gt;
to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes &amp;lt;tt&amp;gt;gpc01..04&amp;lt;/tt&amp;gt;, and [[GPC_Quickstart#Memory_Configuration|somewhat more for some compute nodes]]).&lt;br /&gt;
So the jobs also have to be bunched in ways that will fit into 14GB.   If that's not possible -- &lt;br /&gt;
because each individual job requires significantly in excess of ~1.75GB -- then &lt;br /&gt;
it is possible in principle to just run fewer jobs so that they do fit; &lt;br /&gt;
but then, again, there is an under-utilization problem.   In that case, &lt;br /&gt;
the jobs are likely candidates for parallelization, and you can contact &lt;br /&gt;
us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] and arrange a meeting with one of the &lt;br /&gt;
technical analysts to help you do just that.&lt;br /&gt;
&lt;br /&gt;
If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores has increased some users' overall throughput by 10 to 30 percent.&lt;br /&gt;
&lt;br /&gt;
====Is your job really serial?====&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of similar duration===&lt;br /&gt;
&lt;br /&gt;
The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a &lt;br /&gt;
bit like this:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND; ampersand off 8 jobs and wait&lt;br /&gt;
(cd jobdir1; ./dojob1) &amp;amp;&lt;br /&gt;
(cd jobdir2; ./dojob2) &amp;amp;&lt;br /&gt;
(cd jobdir3; ./dojob3) &amp;amp;&lt;br /&gt;
(cd jobdir4; ./dojob4) &amp;amp;&lt;br /&gt;
(cd jobdir5; ./dojob5) &amp;amp;&lt;br /&gt;
(cd jobdir6; ./dojob6) &amp;amp;&lt;br /&gt;
(cd jobdir7; ./dojob7) &amp;amp;&lt;br /&gt;
(cd jobdir8; ./dojob8) &amp;amp;&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are three important things to take note of here.  First, the &amp;lt;tt&amp;gt;'''wait'''&amp;lt;/tt&amp;gt;&lt;br /&gt;
command at the end is crucial; without it the job will terminate &lt;br /&gt;
immediately, killing the 8 programs you just started.&lt;br /&gt;
&lt;br /&gt;
Second, it is important to group the programs by how long they &lt;br /&gt;
will take.   If (say) &amp;lt;tt&amp;gt;dojob8&amp;lt;/tt&amp;gt; takes 2 hours and the rest only take 1, &lt;br /&gt;
then for one hour 7 of the 8 cores on the GPC node are wasted; they are &lt;br /&gt;
sitting idle but are unavailable for other users, and the utilization of &lt;br /&gt;
this node over the whole run is only 56%.   This is the sort of thing &lt;br /&gt;
we'll notice, and users who don't make efficient use of the machine will &lt;br /&gt;
have their ability to use SciNet resources reduced.  If you have many serial jobs of varying length, &lt;br /&gt;
use the submission script to balance the computational load, as explained [[ #Serial jobs of varying duration | below]].&lt;br /&gt;
&lt;br /&gt;
Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel===&lt;br /&gt;
&lt;br /&gt;
GNU parallel is a really nice tool written by Ole Tange to run multiple serial jobs in&lt;br /&gt;
parallel. It allows you to keep the processors on each 8-core node busy, provided you give it enough jobs to do.&lt;br /&gt;
&lt;br /&gt;
GNU parallel is accessible on the GPC in the module&lt;br /&gt;
&amp;lt;tt&amp;gt;gnu-parallel&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load gnu-parallel/20130422&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that there are currently (May 2013) four versions of gnu-parallel installed on the GPC, with the oldest version, gnu-parallel/2010, as the default, although we'd recommend using one of the newer versions (such as gnu-parallel/20130422). &lt;br /&gt;
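A quick way to see which versions are installed and to pick a specific one (assuming the standard module commands on the GPC):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# list the gnu-parallel modules that are installed&lt;br /&gt;
module avail gnu-parallel&lt;br /&gt;
# load a specific (newer) version rather than the default&lt;br /&gt;
module load gnu-parallel/20130422&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;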
&lt;br /&gt;
Note that the citation for GNU Parallel is: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
It is easiest to demonstrate the usage of GNU parallel by&lt;br /&gt;
examples. Suppose you have 16 jobs to do, that the durations of these jobs vary quite a bit, but that the average job duration is around 10 hours. You could use the following script:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=24:00:00&lt;br /&gt;
#PBS -N gnu-parallel-example&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422  &lt;br /&gt;
&lt;br /&gt;
# EXECUTION COMMAND&lt;br /&gt;
parallel -j 8 &amp;lt;&amp;lt;EOF&lt;br /&gt;
  cd jobdir1; ./dojob1; echo &amp;quot;job 1 finished&amp;quot;&lt;br /&gt;
  cd jobdir2; ./dojob2; echo &amp;quot;job 2 finished&amp;quot;&lt;br /&gt;
  cd jobdir3; ./dojob3; echo &amp;quot;job 3 finished&amp;quot;&lt;br /&gt;
  cd jobdir4; ./dojob4; echo &amp;quot;job 4 finished&amp;quot;&lt;br /&gt;
  cd jobdir5; ./dojob5; echo &amp;quot;job 5 finished&amp;quot;&lt;br /&gt;
  cd jobdir6; ./dojob6; echo &amp;quot;job 6 finished&amp;quot;&lt;br /&gt;
  cd jobdir7; ./dojob7; echo &amp;quot;job 7 finished&amp;quot;&lt;br /&gt;
  cd jobdir8; ./dojob8; echo &amp;quot;job 8 finished&amp;quot;&lt;br /&gt;
  cd jobdir9; ./dojob9; echo &amp;quot;job 9 finished&amp;quot;&lt;br /&gt;
  cd jobdir10; ./dojob10; echo &amp;quot;job 10 finished&amp;quot;&lt;br /&gt;
  cd jobdir11; ./dojob11; echo &amp;quot;job 11 finished&amp;quot;&lt;br /&gt;
  cd jobdir12; ./dojob12; echo &amp;quot;job 12 finished&amp;quot;&lt;br /&gt;
  cd jobdir13; ./dojob13; echo &amp;quot;job 13 finished&amp;quot;&lt;br /&gt;
  cd jobdir14; ./dojob14; echo &amp;quot;job 14 finished&amp;quot;&lt;br /&gt;
  cd jobdir15; ./dojob15; echo &amp;quot;job 15 finished&amp;quot;&lt;br /&gt;
  cd jobdir16; ./dojob16; echo &amp;quot;job 16 finished&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; parameter sets the number of jobs to run at the same time, but 16 jobs are lined up. Initially, 8 jobs are given to the 8 processors on the node. When one of the processors is done with its assigned job, it will get the next job instead of sitting idle until the other processors are done. While you would expect that on average this script should take 20 hours (each processor on average has to complete two jobs of 10 hours), there's a good chance that one of the processors gets two jobs that take more than 10 hours, so the job script requests 24 hours. How much more time you should ask for in practice depends on the spread in run times of the separate jobs.&lt;br /&gt;
&lt;br /&gt;
===Serial jobs of varying duration===&lt;br /&gt;
&lt;br /&gt;
If you have a lot (50+) of relatively short serial runs to do, '''whose walltimes vary''', and if you know that eight jobs fit in memory without issues, then writing all the commands explicitly in the job script can get tedious. If you follow the convention that the jobs are all started by auxiliary scripts called job&amp;lt;something&amp;gt;.sh, the following strategy in your submission script maximizes the CPU utilization. &lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamic&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422  &lt;br /&gt;
&lt;br /&gt;
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*.sh&lt;br /&gt;
echo job*.sh | tr ' ' '\n' | parallel -j 8&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Notes:&lt;br /&gt;
* As before, GNU Parallel keeps 8 jobs running at a time, and if one finishes, starts the next. This is an easy way to do ''load balancing''.&lt;br /&gt;
* You can in fact run more or fewer than 8 processes per node by modifying &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;'s &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; argument.&lt;br /&gt;
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to performance. In that case, running from the ramdisk may be an option.&lt;br /&gt;
* When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch file system after the runs, or when the job is killed because its time has run out.&lt;br /&gt;
* More details on how to set up your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].&lt;br /&gt;
* This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.&lt;br /&gt;
&lt;br /&gt;
===Version for more than 8 cores at once (still serial)===&lt;br /&gt;
&lt;br /&gt;
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one node and to use the following routine to distribute your processes amongst the cores. In this case, it is important to use the newer version of GNU parallel installed on the GPC.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque submission script for multiple, dynamically-run &lt;br /&gt;
# serial jobs on SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=25:ppn=8,walltime=1:00:00&lt;br /&gt;
#PBS -N serialdynamicMulti&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
module load gnu-parallel/20130422&lt;br /&gt;
&lt;br /&gt;
# START PARALLEL JOBS USING NODE LIST IN $PBS_NODEFILE&lt;br /&gt;
seq 800 | parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PWD ./myrun {}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Explanation:&lt;br /&gt;
* &amp;lt;tt&amp;gt;seq 800&amp;lt;/tt&amp;gt; outputs the numbers 1 through 800 on separate lines. This output is piped to (i.e., becomes the input of) the &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; command.&lt;br /&gt;
* The point of using &amp;quot;seq 800&amp;quot; is that each line given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; defines a new job. So here, there are 800 jobs.&lt;br /&gt;
* Each job runs a command, but because the numbers generated by seq are not commands, a real command is constructed, in this case, from the argument &amp;lt;tt&amp;gt;./myrun {}&amp;lt;/tt&amp;gt;. Here &amp;lt;tt&amp;gt;myrun&amp;lt;/tt&amp;gt; stands for the name of the application to run. The curly brackets &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; get replaced by the line from the input, that is, by one of the numbers.&lt;br /&gt;
* So parallel will run the 800 commands:&amp;lt;br/&amp;gt;./myrun 1&amp;lt;br/&amp;gt;./myrun 2&amp;lt;br/&amp;gt;...&amp;lt;br/&amp;gt;./myrun 800&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;--sshloginfile $PBS_NODEFILE&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to look in the file $PBS_NODEFILE, which contains the host names of the nodes assigned to the current job (it is automatically generated by the scheduler).&lt;br /&gt;
* The parameter &amp;lt;tt&amp;gt;-j8&amp;lt;/tt&amp;gt; tells &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt; to run 8 of these at a time on each of the hosts.&lt;br /&gt;
* The &amp;lt;tt&amp;gt;--workdir $PWD&amp;lt;/tt&amp;gt; sets the working directory on the other nodes to the working directory on the first node. Without this, the run tries to start from the wrong place and will most likely fail (unless using the latest gnu parallel module, 20130422, which by default puts you in $PWD on the remote node).&lt;br /&gt;
* Loaded modules should get automatically loaded on the remote nodes too for the latest gnu parallel module, but not for earlier ones.&lt;br /&gt;
* If you need an environment variable to be transferred from the job script to the remotely running subjobs, use &amp;lt;tt&amp;gt;--env ENVIRONMENTVARIABLE&amp;lt;/tt&amp;gt;.&lt;br /&gt;
Notes:&lt;br /&gt;
* Of course, this is just an example of what you could do with GNU parallel. How you set up your specific run depends on how each of the runs would be started. One could for instance also prepare a file of commands to run and make that the input to parallel as well.&lt;br /&gt;
* Note that submitting several bunches to single nodes, as in the section above, is a more fail-safe way of proceeding, since a node failure would then only affect one of these bunches, rather than all runs. &lt;br /&gt;
* GNU Parallel can be passed a file with the list of nodes to which to ssh, using &amp;lt;tt&amp;gt;--sshloginfile&amp;lt;/tt&amp;gt; (thanks to Ole Tange for pointing this out). This list is automatically generated by the scheduler and its name is made available in the environment variable $PBS_NODEFILE.&lt;br /&gt;
* Alternatively, GNU Parallel can take a comma separated list of nodes given to its -S argument, but this would need to be constructed from the file $PBS_NODEFILE which contains all nodes assigned to the job, with each node duplicated 8x for the number of cores on each node.&lt;br /&gt;
* GNU Parallel reads lines of input and converts them into arguments of the execution command. The execution command is the last argument given to &amp;lt;tt&amp;gt;parallel&amp;lt;/tt&amp;gt;, with &amp;lt;tt&amp;gt;{}&amp;lt;/tt&amp;gt; replaced by the input lines.&lt;br /&gt;
* &amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;The --workdir argument is essential: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would likely not get any output at all!&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;This is no longer true for the latest GNU Parallel module (20130422), which puts you in the current directory on the remote nodes.&lt;br /&gt;
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.&lt;br /&gt;
&lt;br /&gt;
===More on GNU parallel=== &lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, using the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. It describes all the options (such as how to make sure the output is not scrambled) and contains examples.&lt;br /&gt;
&lt;br /&gt;
===GNU Parallel Reference===&lt;br /&gt;
* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, '';login: The USENIX Magazine,'' February 2011:42-47.&lt;br /&gt;
&lt;br /&gt;
===Older scripts===&lt;br /&gt;
&lt;br /&gt;
Older scripts, which mimicked some of GNU parallel functionality, can be found on the [[Deprecated scripts]] page.&lt;br /&gt;
&lt;br /&gt;
--[[User:Rzon|Rzon]] 02:22, 14 Nov 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Knowledge_Base:_Tutorials_and_Manuals&amp;diff=6956</id>
		<title>Knowledge Base: Tutorials and Manuals</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Knowledge_Base:_Tutorials_and_Manuals&amp;diff=6956"/>
		<updated>2014-04-07T21:23:09Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Programming */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
=Training material=&lt;br /&gt;
&lt;br /&gt;
For upcoming classes, see our [https://support.scinet.utoronto.ca/education/ Training and Education website]!&lt;br /&gt;
==SciNet Basics==&lt;br /&gt;
* [[Media:SciNet_Tutorial.pdf|SciNet User Tutorial]]&lt;br /&gt;
* Intro to SciNet: [http://support.scinet.utoronto.ca/CourseVideo/scinetintro/scinetintro.html Video]/[[Media:Introscinet.pdf|Slides]], SciNet, November 2012&lt;br /&gt;
* SciNet Resources: [http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Monday_Morning_SciNet_Resources/Monday_Morning_SciNet_Resources.mp4 Video]/ [[Media:Monday_Morning_SciNet_Resources.pdf|Slides]] &lt;br /&gt;
* [[Essentials]]&lt;br /&gt;
* [[FAQ|Frequently asked questions]]&lt;br /&gt;
* [[Ssh]]&lt;br /&gt;
* [[GPC_Quickstart|GPC quickstart]]&lt;br /&gt;
* [[TCS_Quickstart|TCS quickstart]]&lt;br /&gt;
* [[GPU_Devel_Nodes|ARC/GPU quickstart]]&lt;br /&gt;
* [[Cell_Devel_Nodes|ARC/Cell quickstart]]&lt;br /&gt;
* [[Important .bashrc guidelines]]&lt;br /&gt;
* [[Media:LargeScaleBio.pdf‎|Workflow Optimization (w/focus on Large Scale BioInformatics)]]&lt;br /&gt;
* [[Software_and_Libraries | Software and libraries]]&lt;br /&gt;
* [[Installing your own modules]]&lt;br /&gt;
* [[Media:SNUGlocalsetup.pdf|User-space modules and packages (April 2011 SNUG TechTalk)]]&lt;br /&gt;
* [[Media:HPSS_rationale.pdf|HPSS - SciNet's storage capacity expansion]]&lt;br /&gt;
* BGQ Hardware Overview [https://support.scinet.utoronto.ca/~northrup/bgqhardware.pdf Slides ]/ [https://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqhardware/bgqhardware.mp4 Video Recording ]&lt;br /&gt;
* Intro to Using the BGQ [[Media:Bgqintro.pdf|Slides ]]/[https://support.scinet.utoronto.ca/CourseVideo/BGQ/bgqintro/bgqintro.mp4 Video Recording ]&lt;br /&gt;
&lt;br /&gt;
==Linux==&lt;br /&gt;
* Linux Command Line: A Primer (June 2012) [[Media:SS_IntroToShell.pdf|Slides,]] [[Media:SS_IntroToShell.tgz|Files]]&lt;br /&gt;
* Introduction to the Linux Shell, SciNet, Mar 2012: [[Media:IntroToShell.pdf|Slides]] and [[Media:Shell-data.tgz|Data files]]&lt;br /&gt;
&lt;br /&gt;
==Batch job management==&lt;br /&gt;
* [[Media:LargeScaleBio.pdf‎|Workflow Optimization (w/focus on Large Scale BioInformatics)]]&lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|GNU Parallel (Techtalk Nov 14, 2012)]]&lt;br /&gt;
* [[Media:TechTalkJobMonitoring.pdf|Job Monitoring on SciNet and Job Efficiency]]&lt;br /&gt;
&amp;lt;!-- * [[Media:Snugtrackjob.pdf|Job Monitoring on SciNet and Job Efficiency]] --&amp;gt;&lt;br /&gt;
* [[Wallclock time]]&lt;br /&gt;
* [[Checkpoints]]&lt;br /&gt;
* [[Using_Signals|Signals]]&lt;br /&gt;
* [[Moab]]&lt;br /&gt;
* [[User_Serial|Serial Jobs (including GNU Parallel)]]&lt;br /&gt;
* [[User_Ramdisk|Ramdisk]]&lt;br /&gt;
* [http://www.clusterresources.com/products/mwm/docs/index.shtml Moab workload manager]&lt;br /&gt;
* [http://www.clusterresources.com/products/mwm/docs/a.gcommandoverview.shtml Moab commands]&lt;br /&gt;
* [http://www.clusterresources.com/products/torque/docs/ Torque resource manager] &lt;br /&gt;
* [http://www.clusterresources.com/products/torque/docs/a.acommands.shtml Torque PBS commands]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/PE5.1-operationanduse.pdf Parallel environment]&lt;br /&gt;
* [http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp Cluster information center] (with error codes)&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/LL-usingandadministering.pdf LoadLeveler: using &amp;amp; administering]&lt;br /&gt;
&lt;br /&gt;
==Programming==&lt;br /&gt;
===General===&lt;br /&gt;
* [[Media:SciDev-XLCompilers.pdf|Performance Tuning with the IBM XL Compilers]]: Slides from the SciNet Developer Seminar by Kit Barton, Sep 17, 2012.&lt;br /&gt;
* [[Media:Remotescinet.pdf‎|Remote Development]], slides from TechTalk Jun 13, 2012&lt;br /&gt;
* [[Scientific Software Development Course]], part I of the SciNet's Scientific Computing Course&lt;br /&gt;
* [http://software-carpentry.org Software Carpentry Resources]&lt;br /&gt;
* Version Control: [http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Thursday_Morning_BP_Revision_Control/Thursday_Morning_BP_Revision_Control.mp4 Video]/ [[Media:Snug_techtalk_revcontrol.pdf | Slides]]&lt;br /&gt;
* [[IBM_Nov_Workshop | IBM AIX Workshop, SciNet, Nov 2008 ]] &lt;br /&gt;
* [[IBM_Compiler_Workshop | IBM Compiler Workshop, SciNet, Feb 2009]]&lt;br /&gt;
* SNUG Techtalk Dec 2011 [[Media:Snug_techtalk_compiler.pdf | Intel Compiler Optimizations]]&lt;br /&gt;
&lt;br /&gt;
===Fortran===&lt;br /&gt;
* Modern Fortran Course (1 day), SciNet, 19 Apr 2011&lt;br /&gt;
** [[Media:ModernFortran.pdf | Slides]]&lt;br /&gt;
** [[Media:ModernFortran.tgz | Source Code]]&lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/fortran/lin/compiler_f/index.htm Intel Fortran compiler]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/xlf-compiler.pdf IBM Fortran compiler], [http://support.scinet.utoronto.ca/Manuals/xlf-langref.pdf language], [http://support.scinet.utoronto.ca/Manuals/xlf-proguide.pdf optimization]&lt;br /&gt;
&lt;br /&gt;
===C++===&lt;br /&gt;
* [[Media:Cpp11.pdf|Slides]] and [http://support.scinet.utoronto.ca/CourseVideo/Cpp11/cpp11.html recording] of the SciNet Developer Seminar on C++11, March 20, 2013&lt;br /&gt;
* Scientific C++ Course (1 day), SciNet, 15 March 2011 &lt;br /&gt;
** [[Media:Scientific-c%2B%2B.pdf|Slides]] (updated on Apr 26, 2012)&lt;br /&gt;
** [[Media:Scinetcppexamples.tgz|Example source code]]&lt;br /&gt;
** [[Videos_of_the_One-Day_Scientific_C%2B%2B_Class | Videos of the Scientific C++ class]] &lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/lin/compiler_c/index.htm Intel C &amp;amp; C++ compiler]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/xlC++-compiler.pdf IBM C++ compiler], [http://support.scinet.utoronto.ca/Manuals/xlC++-langref.pdf language], [http://support.scinet.utoronto.ca/Manuals/xlC++-proguide.pdf optimization]&lt;br /&gt;
&lt;br /&gt;
===C===&lt;br /&gt;
* C refresher: [http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Monday_Morning_C_Review/Monday_Morning_C_Review.mp4 Video]/ [[Media:Monday_Morning_C_Review.pdf| Slides]]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/xlc-compiler.pdf IBM C compiler], [http://support.scinet.utoronto.ca/Manuals/xlc-langref.pdf language], [http://support.scinet.utoronto.ca/Manuals/xlc-proguide.pdf optimization]&lt;br /&gt;
&lt;br /&gt;
===Hadoop===&lt;br /&gt;
* Introduction to Hadoop for HPCers, Part I - MapReduce: [[Media:Hadoop-PartI.pdf|Slides]], [[Media:HadoopPart1examples.tgz|Source Code]], [http://support.scinet.utoronto.ca/~ljdursi/SciNetHadoopVM.zip Virtual Machine]&lt;br /&gt;
&lt;br /&gt;
===Perl===&lt;br /&gt;
* [[Perl]]&lt;br /&gt;
===Python===&lt;br /&gt;
* [[Python]]&lt;br /&gt;
* [[IPython Notebook on GPC]] (January 2014 TechTalk)&lt;br /&gt;
* [[Research Computing with Python]] (Modular Course, Fall 2013)&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Snug/scinet-f2py/scinet-f2py.html f2py: Fortran and Python] (June 2011 TechTalk by Pierre de Buyl)&lt;br /&gt;
&lt;br /&gt;
===R===&lt;br /&gt;
* [[R Statistical Package]]&lt;br /&gt;
===Lua===&lt;br /&gt;
* [[Media:PeterColberg_Lua_scinet.pdf | Scripting HALMD with Lua and Luabind]] (May 2011 TechTalk by Peter Colberg)&lt;br /&gt;
&lt;br /&gt;
==Parallel Programming==&lt;br /&gt;
* [[Ontario Summerschool on High Performance Computing Central]]&lt;br /&gt;
* [[High Performance Scientific Computing]], part 3 of SciNet's Scientific Computing Course (Winter 2012)&lt;br /&gt;
* Parallel Programming Course (5 days), SciNet, May 2011&lt;br /&gt;
** [[Parallel_Scientific_Computing_-_May_2011 | Videos, slides and code]]&lt;br /&gt;
* Parallel Computing for Computational Fluid Dynamics (CFD), SciNet, 23 March 2011&lt;br /&gt;
** [[Media:parCFD-mpi.pdf | Slides]]&lt;br /&gt;
** [[Media:parCFD.tgz | Source Code]]&lt;br /&gt;
* Intro to Practical Parallel Programming (1 day), SciNet, 22 Sept 2010: &lt;br /&gt;
**[[Media:PPP-Intro-Morning.pdf|Morning Slides, Intro and OpenMP ]]&lt;br /&gt;
**[[Media:PPP-Intro-Afternoon.pdf|Afternoon Slides, MPI]]&lt;br /&gt;
**[[Media:Intro-ppp.tgz|Example source code]]&lt;br /&gt;
* Parallel Scientific Computing Workshop (5 days), SciNet, Aug 2009: &lt;br /&gt;
**[[ Parallel_Scientific_Computing_-_Aug_09 | Slides ]]&lt;br /&gt;
**[http://www.cita.utoronto.ca/~ljdursi/PSP/ Video]&lt;br /&gt;
* [http://www.vscse.org/  Virtual School for CSE] Web courses (Jul/Aug 2010):&lt;br /&gt;
** Petascale programming environments and tools&lt;br /&gt;
** Big data for science&lt;br /&gt;
** Proven algorithmic techniques for many-core processors&lt;br /&gt;
* [https://computing.llnl.gov/tutorials/mpi/ LLNL MPI Tutorial]: This was the basis for the MPI workshop at SciNet. &lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/hpc/mpi/linux/reference_manual.pdf Intel MPI library]&lt;br /&gt;
* [[GPC MPI Versions]]&lt;br /&gt;
* [[Co-array Fortran on the GPC]]&lt;br /&gt;
* [[IBM_Feb_Workshop | IBM MPI Workshop, SciNet, Feb 2009]]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/UPC/compiler.pdf IBM UPC compiler], [http://support.scinet.utoronto.ca/Manuals/UPC/langref.pdf language], [http://support.scinet.utoronto.ca/Manuals/UPC/upcopt.pdf optimization], [http://support.scinet.utoronto.ca/Manuals/UPC/standlib.pdf library], [http://support.scinet.utoronto.ca/Manuals/UPC/upcusersguide.pdf user's guide], [http://support.scinet.utoronto.ca/Manuals/UPC/proguide.pdf programmer's guide]&lt;br /&gt;
&lt;br /&gt;
==GPU Computing==&lt;br /&gt;
&lt;br /&gt;
* [[Media:Westgrid_CUDA.pdf | Intro to GPU Computing Using CUDA]] (WestGrid Spring 2014 Seminar Series)&lt;br /&gt;
* 1.5 hour intro to CUDA, March 2013: [[Media:CUDA-Graphics-Intro-2013.pdf | Slides]] and [[Media:CUDA-Graphics-Intro-2013.tgz | Source Code]]&lt;br /&gt;
* [[CUDA_Minicourse_Fall_2012 | CITA/SciNet CUDA Minicourse, Fall 2012]]&lt;br /&gt;
* [[SciNet GPU Workshop July 2010]]&lt;br /&gt;
* Intro to GPGPU Programming: [http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Friday_Morning_GPGPU/Friday_Morning_GPGPU.mp4 Video]/ [[Media:Gpgpu.pdf | Slides]]&amp;lt;br /&amp;gt;(from 5 day parallel programming course at SciNet, May 2011)&lt;br /&gt;
* 1-day intro to GPGPU using CUDA Course (Aug 2011): [[Media:Intro-gpu.tgz | Source Code]], [[Media:IntroGPGPU-Aug2011.pdf | Slides]].&lt;br /&gt;
* [http://developer.nvidia.com/object/cuda_training.html  NVidia archived courses for GPGPU Programming]&lt;br /&gt;
* [http://www.pgroup.com/doc/pgiug.pdf PGI Compiler User's Guide]&lt;br /&gt;
* [http://www.pgroup.com/doc/pgiref.pdf PGI Compiler Reference Manual]&lt;br /&gt;
* [http://www.pgroup.com/doc/pgifortref.pdf PGI Fortran reference]&lt;br /&gt;
* [http://www.pgroup.com/doc/pgicudaforug.pdf PGI CUDA Fortran Programming Guide and Reference]&lt;br /&gt;
* [http://www.pgroup.com/doc/openACC_gs.pdf PGI OpenACC Getting Started Guide]&lt;br /&gt;
&lt;br /&gt;
==Performance Tuning==&lt;br /&gt;
* [[Performance and Profiling Course, April 2013]]&lt;br /&gt;
* [[Introduction To Performance]]&lt;br /&gt;
* Performance tools for [[Performance_And_Debugging_Tools:_GPC | GPC ]] and [[Performance_And_Debugging_Tools:_TCS | TCS ]]&lt;br /&gt;
* Dec 2010 SNUG TechTalk: [[Media:ProfillingTechTalk-Dec2010.pdf | Profiling Tools on GPC]]&lt;br /&gt;
* [http://cnx.org/content/col11136/latest/  High Performance Computing Book]&amp;lt;br /&amp;gt;Online version of an older O'Reilly book which covers the basics of (mostly serial) programming for performance.  Covers the most important issues today very clearly.&lt;br /&gt;
* [http://www.ece.cmu.edu/~franzf/papers/gttse07.pdf  How to Write Fast Numerical Code ]&amp;lt;br /&amp;gt;Good introduction to thinking about performance.&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/JUMP-AIX-POWER6-AppsPerformanceTuning-wp032008.pdf Performance tuning]&lt;br /&gt;
* [[Media:Mpi-tuning-parameters.pdf‎ | MPI Tuning Parameters]] - SNUG TechTalk, Feb 2012&lt;br /&gt;
&lt;br /&gt;
==Debugging==&lt;br /&gt;
* [[Media:SS_Debug.pdf|Debugging with GDB and DDT, half-day session at the Ontario HPC Summerschool 2012 Central&amp;lt;br&amp;gt;Slides]], [[Media:SS_Debug.tgz|Code]].&lt;br /&gt;
* [[Media:Snugdebug.pdf|TechTalk: Debuggers &amp;amp; Parallel Debugging on SciNet - gdb, ddd, padb]], SciNet User Group Meeting, Nov 2010&amp;lt;br/&amp;gt;[http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Thursday_Morning_Debugging/Thursday_Morning_Debugging.mp4 Video]&lt;br /&gt;
* [http://www.allinea.com/downloads/userguide.pdf Allinea DDT (Distributed Debugging Tool) User Guide]&lt;br /&gt;
&lt;br /&gt;
==Math libraries (BLAS, LAPACK, FFT)==&lt;br /&gt;
* [[Media:MKLTechTalkMarch2012.pdf|Intel Math Kernel Library (MKL): An overview]] (TechTalk, March, 2012)&lt;br /&gt;
* [[Numerical Tools for Physical Scientists]], part 2 of SciNet's Scientific Computing Course; covers random numbers, BLAS, LAPACK, FFT, ...&lt;br /&gt;
* [[Media:FP_Consistency.pdf|Intel Compiler Floating Point Consistency]]&lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/hpc/mkl/lin/index.htm Math Kernel Library (MKL)] &lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/hpc/mkl/vsl/vslnotes.pdf Math Kernel Library's Vector Statistical Library]&lt;br /&gt;
* [http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor Math Kernel Library link line advisor]&amp;lt;br/&amp;gt;($MKLPATH &amp;amp;rarr; ${MKLPATH} in makefiles)&lt;br /&gt;
* [http://publib.boulder.ibm.com/epubs/pdf/am501405.pdf ESSL high performance math library V4] ([http://publib.boulder.ibm.com/epubs/pdf/am601305.pdf V3])&lt;br /&gt;
* [http://publib.boulder.ibm.com/epubs/pdf/am601305.pdf Parallel ESSL high performance math library V3.3]&lt;br /&gt;
* [http://hal.inria.fr/inria-00576469 Linear Algebra Libraries] by Claire Mouton. 2009 INRIA Technical Report on existing linear algebra libraries for C++ (also here: [http://arxiv.org/abs/1103.3020])&lt;br /&gt;
&lt;br /&gt;
==I/O==&lt;br /&gt;
&lt;br /&gt;
* [[Media:NetCDF.pdf|Introduction to NetCDF4 binary files with Python, C++ and R (TechTalk March 2014)]]&lt;br /&gt;
* [[Media:SCIENCEDATA.pdf‎|Sep 2012 SNUG TechTalk: Science=Data]]&lt;br /&gt;
* [[Data_Management|Data management]]&lt;br /&gt;
* Intro to Parallel I/O, SciNet, Oct 6th, 2010: &lt;br /&gt;
**[[Media:Parallel_io_course.pdf‎|Morning &amp;amp; MPI-IO Slides]]&lt;br /&gt;
**[[Media:Netcdfhdf5.pdf|NetCDF/HDF5 Slides]]&lt;br /&gt;
**[[Media:ParIO.tgz|Source Code]].  &lt;br /&gt;
* Half-day HPCS2012 Parallel I/O tutorial, covering MPI-IO, HDF5, NetCDF, based on the above:  [[Media:ParIO-HPCS2012.pdf|slides (pdf)]] and [[Media:ParIO-HPCS2012.tgz|source code]].&lt;br /&gt;
* [[Media:Snugio.pdf|Sept 2010 SNUG TechTalk: Parallel File System and IO]] &amp;lt;br/ &amp;gt;[http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Friday_Morning_IO/Friday_Morning_IO.mp4 Video]&lt;br /&gt;
* [[File System and I/O dos and don'ts]]&lt;br /&gt;
* [[Media:40TB.pdf|So you have 40TB of Data]] -- an overview of things to consider with large data sets.&lt;br /&gt;
* [[Media:Adios-techtalk-may2012.pdf|May 2012 SNUG TechTalk: ADIOS for Parallel IO slides]] and [[Media:Adios-techtalk-may2012-src.tgz|source code]]&lt;br /&gt;
* [[hdf5_table|Writing / Reading a table in hdf5]]&lt;br /&gt;
* [[NetCDF_table|Writing / Reading a table in NetCDF]]&lt;br /&gt;
&lt;br /&gt;
==Infiniband Networking==&lt;br /&gt;
* [[Media:Snug_techtalk_Infiniband.pdf | TechTalk on SciNet's Infiniband Network &amp;amp; MPI options ]] &lt;br /&gt;
&lt;br /&gt;
==Visualization==&lt;br /&gt;
* [[Using Paraview]]&lt;br /&gt;
* [[Media:Ttvnc.pdf|TechTalk on VNC (slides)]]&lt;br /&gt;
* [[Software_and_Libraries#anchor_viz|Visualization Software on the GPC]]&lt;br /&gt;
* [http://scienceillustrated.ca Science Illustrated:] Two-day symposium on Visualizing Science, Feb 2011&lt;br /&gt;
* [http://www.kmdi.utoronto.ca/story/2011/03/si-science-illustrated-symposium-success Videos of the talks given at Science Illustrated] (recorded by [http://www.kmdi.utoronto.ca KMDI] at [http://www.utoronto.ca UoT]):&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=94ff5cd5-be6e-4fc6-9be6-dd2222342bcd Opening remarks] by Paul Young&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=4255c34e-15e7-4b24-ba99-78f5c8fa4381 Information Visualization and the Myth of Information Overload] by Christopher Collins&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=adcf02bf-16cb-46cc-8cdc-1a65e9071d6b Beyond Basic Visualization] by Ramses Van Zon&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=47baf346-3599-4fa3-9b10-6a58faa6b33c Network Visualization &amp;amp; Analysis] by Igor Jurisica&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=7e754a2e-7be5-476e-bb54-37def37bc07e Simulation and Visualization of Blood Flow] by David Steinman&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=d7622587-2c31-49c1-99d7-f0f16c078801 Scientific Visualizations: Does the Science Matter?] by Thomas Lucas&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=9963c637-6840-454f-a57f-9a1be6456616 How can visualization impact public perception of science?] Panelists: Jay Ingram, Peter Calamai, Reni Barlow, Hooley McLaughlin&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=88e50cff-db9b-4c71-b10d-781fec60a2c0 How Info Graphics are Created for the Mainstream Media] by Peter Calamai&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=3897e2a3-1fda-42be-ab78-edbab090fd9e Design Boot Camp] by Graham Huber&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=4c27ed76-7292-407e-83a6-814e1461eccd Visualization Large Datasets] by Jonathan Dursi&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=7d49c845-3937-44e2-a300-9b8ffe57a857 Visualizing Colliding Black Holes] by Harald Pfeiffer&lt;br /&gt;
** [http://itube.ischool.utoronto.ca/Panopto/Pages/Viewer/Default.aspx?id=7d7e4803-39cd-4c8e-a443-bc2a7b1b3c28 Closing remarks] by Mubdi Rahman&lt;br /&gt;
&lt;br /&gt;
==Applications==&lt;br /&gt;
{{:Knowledge Base: Applications}}&lt;br /&gt;
* See also [[User Codes]]&lt;br /&gt;
&lt;br /&gt;
=Manuals=&lt;br /&gt;
&lt;br /&gt;
==Intel compilers and libraries (GPC)==&lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/compiler/cpp-lin/index.htm C &amp;amp; C++ compiler]&lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/compiler/fortran-lin/index.htm Fortran compiler]&lt;br /&gt;
* [[Media:FP_Consistency.pdf|Intel Compiler Floating Point Consistency]]&lt;br /&gt;
* [[Media:Compiler_qrg12.pdf‎|Intel Compiler Optimization Guide]]&lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/hpc/mkl/lin/index.htm Math Kernel Library (MKL)] &lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/hpc/mkl/vsl/vslnotes.pdf Math Kernel Library's Vector Statistical Library]&lt;br /&gt;
* [http://software.intel.com/sites/products/documentation/hpc/mpi/linux/reference_manual.pdf Intel MPI library]&lt;br /&gt;
* [http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor Math Kernel Library link line advisor]&amp;lt;br/&amp;gt;($MKLPATH &amp;amp;rarr; ${MKLPATH} in makefiles)&lt;br /&gt;
&lt;br /&gt;
==IBM compilers and libraries (TCS/P7)==&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/xlc-compiler.pdf C compiler], [http://support.scinet.utoronto.ca/Manuals/xlc-langref.pdf language], [http://support.scinet.utoronto.ca/Manuals/xlc-proguide.pdf optimization]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/xlC++-compiler.pdf C++ compiler], [http://support.scinet.utoronto.ca/Manuals/xlC++-langref.pdf language], [http://support.scinet.utoronto.ca/Manuals/xlC++-proguide.pdf optimization]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/xlf-compiler.pdf Fortran compiler] [http://support.scinet.utoronto.ca/Manuals/xlf-langref.pdf language], [http://support.scinet.utoronto.ca/Manuals/xlf-proguide.pdf optimization]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/UPC/compiler.pdf UPC compiler], [http://support.scinet.utoronto.ca/Manuals/UPC/langref.pdf language], [http://support.scinet.utoronto.ca/Manuals/UPC/upcopt.pdf optimization], [http://support.scinet.utoronto.ca/Manuals/UPC/standlib.pdf library], [http://support.scinet.utoronto.ca/Manuals/UPC/upcusersguide.pdf user's guide], [http://support.scinet.utoronto.ca/Manuals/UPC/proguide.pdf programmer's guide]&lt;br /&gt;
* [http://publib.boulder.ibm.com/epubs/pdf/am501405.pdf ESSL high performance math library V4] ([http://publib.boulder.ibm.com/epubs/pdf/am601305.pdf V3])&lt;br /&gt;
* [[Media:essl51.pdf|ESSL high performance math library V5.1 for Linux on Power]]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/JUMP-AIX-POWER6-AppsPerformanceTuning-wp032008.pdf Performance tuning]&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/PE5.1-operationanduse.pdf Parallel environment]&lt;br /&gt;
* [http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp Cluster information center] (with error codes)&lt;br /&gt;
* [http://support.scinet.utoronto.ca/Manuals/LL-usingandadministering.pdf LoadLeveler: using &amp;amp; administering]&lt;br /&gt;
&lt;br /&gt;
==PGI compilers (ARC)==&lt;br /&gt;
* [http://www.pgroup.com/doc/pgiug.pdf Compiler User's Guide]&lt;br /&gt;
* [http://www.pgroup.com/doc/pgiref.pdf Compiler Reference Manual]&lt;br /&gt;
* [http://www.pgroup.com/doc/pgifortref.pdf Fortran reference]&lt;br /&gt;
* [http://www.pgroup.com/doc/pgicudaforug.pdf CUDA Fortran Programming Guide and Reference]&lt;br /&gt;
* [http://www.pgroup.com/doc/openACC_gs.pdf OpenACC Getting Started Guide]&amp;lt;br&amp;gt;(Note: $PGI/linux86-64/12.5/doc contains a newer version.)&lt;br /&gt;
&lt;br /&gt;
==Scheduler (Adaptive Computing/Cluster Resources)==&lt;br /&gt;
* [http://www.clusterresources.com/products/mwm/docs/index.shtml Moab workload manager]&lt;br /&gt;
* [http://www.clusterresources.com/products/mwm/docs/a.gcommandoverview.shtml Moab commands]&lt;br /&gt;
* [http://www.clusterresources.com/products/torque/docs/ Torque resource manager] &lt;br /&gt;
* [http://www.clusterresources.com/products/torque/docs/a.acommands.shtml Torque PBS commands]&lt;br /&gt;
&lt;br /&gt;
==DDT Debugger (Allinea)==&lt;br /&gt;
* [http://www.allinea.com/downloads/userguide.pdf Distributed Debugging Tool User Guide]&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=File:HadoopPart1examples.tgz&amp;diff=6955</id>
		<title>File:HadoopPart1examples.tgz</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=File:HadoopPart1examples.tgz&amp;diff=6955"/>
		<updated>2014-04-07T21:19:03Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: Source code examples for Hadoop part 1&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Source code examples for Hadoop part 1&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=File:Hadoop-PartI.pdf&amp;diff=6954</id>
		<title>File:Hadoop-PartI.pdf</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=File:Hadoop-PartI.pdf&amp;diff=6954"/>
		<updated>2014-04-07T21:18:05Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: Slides for Introduction to Hadoop for HPCers, Part I&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Slides for Introduction to Hadoop for HPCers, Part I&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Ssh_keys&amp;diff=6278</id>
		<title>Ssh keys</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Ssh_keys&amp;diff=6278"/>
		<updated>2013-07-16T17:14:59Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* SSH tunnel */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Ssh | SSH]] has an alternative to passwords to authenticate your login; you can generate a key file on a trusted machine and tell a remote machine to trust logins from a machine that presents that key.   This can be both convenient and secure, and may be necessary for some tasks (such as connecting directly to compute nodes to use [[Using_Paraview | some visualization packages]]).  Here we describe how to set up keys for logging into SciNet.&lt;br /&gt;
&lt;br /&gt;
==SSH Keys and SciNet==&lt;br /&gt;
&lt;br /&gt;
[[Ssh | SSH]] is a secure protocol for logging into or copying data to/from remote machines.  In addition to using passwords to [http://en.wikipedia.org/wiki/Authentication authenticate] users, one can use cryptographically secure keys to guarantee that a login request is coming from a trusted account on a remote machine, and automatically allow such requests.   Done properly, this is as secure as requiring a password, but can be more convenient, and is necessary for some operations.&lt;br /&gt;
&lt;br /&gt;
On this page, we will assume you are using Linux, Mac OS X, or a similar environment such as [http://www.cygwin.com/ Cygwin] under Windows.  If not, the steps will be the same, but how they are done (for instance, generating keys) may differ; look up the documentation for your ssh package for details.&lt;br /&gt;
&lt;br /&gt;
==Using SSH keys==&lt;br /&gt;
===How SSH keys work===&lt;br /&gt;
&lt;br /&gt;
SSH relies on [http://en.wikipedia.org/wiki/Public-key_cryptography public key cryptography] for its encryption.  These cryptosystems have a private key, which must be kept secret, and a public key, which may be disseminated freely.   In these systems, anyone may use the public key to encode a message; but only the owner of the private key can decode the message.  This can also be used to verify identities; if someone is claiming to be Alice, the owner of some private key, Bob can send Alice a message encoded with Alice's well-known public key.  If the person claiming to be Alice can then tell Bob what the message really was, then that person at the very least has access to Alice's private key.&lt;br /&gt;
&lt;br /&gt;
To use keys for authentication, we:&lt;br /&gt;
* Generate a key pair (Private and Public)&lt;br /&gt;
* Copy the public key to the remote sites we wish to be able to log in to, and mark it as an authorized key for each system&lt;br /&gt;
* Ensure permissions are set properly&lt;br /&gt;
* Test.&lt;br /&gt;
&lt;br /&gt;
===Generating an SSH key pair===&lt;br /&gt;
&lt;br /&gt;
{|border=&amp;quot;1&amp;quot;&lt;br /&gt;
|&lt;br /&gt;
Note: This describes creating ssh key pairs on '''your''' machine, not on SciNet.  On SciNet, you already have key pairs generated, sitting in &amp;lt;tt&amp;gt;${HOME}/.ssh/&amp;lt;/tt&amp;gt;, and modifying them is likely to cause problems.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The first stage is to create an SSH key pair.   On most systems, this is done using the command&lt;br /&gt;
&lt;br /&gt;
 ssh-keygen&lt;br /&gt;
&lt;br /&gt;
This will prompt you for two pieces of information: where to save the key, and a passphrase for the key.  The passphrase is like a password, but rather than letting you in to some particular account, it allows you to use the key you've generated to log into other systems.  &lt;br /&gt;
&lt;br /&gt;
There is a series of options to &amp;lt;tt&amp;gt;ssh-keygen&amp;lt;/tt&amp;gt; which allow for more cryptographically secure keys (by increasing the number of bits used in the key) or for different encryption systems.  The defaults are fine, and we won't discuss other options here.&lt;br /&gt;
&lt;br /&gt;
The default location to save the private key is in &amp;lt;tt&amp;gt;${HOME}/.ssh/id_rsa&amp;lt;/tt&amp;gt; (for an RSA key); unless you have some specific reason for placing it elsewhere, use this option.  The public key will be &amp;lt;tt&amp;gt;id_rsa.pub&amp;lt;/tt&amp;gt; in the same directory.&lt;br /&gt;
&lt;br /&gt;
Your passphrase can be any string, and of any length.   It is best not to make it the same as any of your passwords.&lt;br /&gt;
&lt;br /&gt;
A sample session of generating a key would go like this:&lt;br /&gt;
&lt;br /&gt;
 $ ssh-keygen&lt;br /&gt;
 Generating public/private rsa key pair.&lt;br /&gt;
 Enter file in which to save the key (${HOME}/.ssh/id_rsa): &lt;br /&gt;
 Enter passphrase (empty for no passphrase): &lt;br /&gt;
 Enter same passphrase again: &lt;br /&gt;
 Your identification has been saved in ${HOME}/.ssh/id_rsa.&lt;br /&gt;
 Your public key has been saved in ${HOME}/.ssh/id_rsa.pub.&lt;br /&gt;
 The key fingerprint is:&lt;br /&gt;
 79:8e:36:6a:78:7d:cf:80:94:90:92:0e:74:0b:f1:b7 USERNAME@YOURMACHINE&lt;br /&gt;
&lt;br /&gt;
====Don't Use Passphraseless Keys!====&lt;br /&gt;
&lt;br /&gt;
If you do not specify a passphrase, you will have a completely &amp;quot;exposed&amp;quot; private key.  '''This is a terrible idea.'''   If you then use this key for anything, it means that anyone who sits down at your desk, or anyone who borrows or steals your laptop, can log in to anywhere you use that key (good guesses could come from just looking at your history) without needing any password, and could do anything they wanted with your account or data.  Don't use passphraseless keys.&lt;br /&gt;
&lt;br /&gt;
We should note that we do, in fact, have one necessary and reasonable exception here -- the keys used within SciNet itself.  The SciNet key used for within-SciNet operations (you already have one in your account in &amp;lt;tt&amp;gt;~/.ssh/id_rsa&amp;lt;/tt&amp;gt;) is passphraseless, for two good reasons.  One is that, once you are on one SciNet machine (like the login node), you already have read/write access to all your data; all the nodes mount the same file systems.  So there is little to be gained in protecting the SciNet nodes from each other.   The second is practical; ssh is used to log in to compute nodes to start your compute jobs.  You obviously can't be asked to type in a passphrase every time one of your jobs starts; you may not be at your computer at that moment.  So passphraseless keys are OK ''within'' a controlled environment; but don't use them for remote access.&lt;br /&gt;
&lt;br /&gt;
===Copying the Public Key to SciNet (and elsewhere)===&lt;br /&gt;
&lt;br /&gt;
Now that you have this SSH &amp;quot;identity&amp;quot;, you use the public (''not'' the private) key for access to remote machines.  The public key must be put as one line in the file &amp;lt;tt&amp;gt;/home/USERNAME/.ssh/authorized_keys&amp;lt;/tt&amp;gt;.  Do not delete the lines already there, or you may end up with strange problems using SciNet machines.&lt;br /&gt;
&lt;br /&gt;
You can copy your new public key to the SciNet systems with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
scp /home/LOCAL_USERNAME/.ssh/id_rsa.pub SCINET_USERNAME@login.scinet.utoronto.ca:newkey&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then log in to SciNet and append the contents of &amp;lt;tt&amp;gt;~/newkey&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;~/.ssh/authorized_keys&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cat ~/newkey &amp;gt;&amp;gt; ~/.ssh/authorized_keys&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
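Alternatively, on many systems these two steps can be combined with &amp;lt;tt&amp;gt;ssh-copy-id&amp;lt;/tt&amp;gt;, if your local ssh installation provides it; it appends the public key to the remote &amp;lt;tt&amp;gt;~/.ssh/authorized_keys&amp;lt;/tt&amp;gt; for you:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh-copy-id -i ~/.ssh/id_rsa.pub SCINET_USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;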
&lt;br /&gt;
===&amp;lt;tt&amp;gt;.ssh&amp;lt;/tt&amp;gt; Permissions===&lt;br /&gt;
&lt;br /&gt;
Note that &amp;lt;tt&amp;gt;SSH&amp;lt;/tt&amp;gt; is very fussy about file permissions; your &amp;lt;tt&amp;gt;~/.ssh&amp;lt;/tt&amp;gt; directory must only be accessible by you, and your various key files must not be writable (or in some cases, readable) by anyone else.  Sometimes users accidentally reset these file permissions while editing the files, and run into problems as a result.   If you look at the &amp;lt;tt&amp;gt;~/.ssh&amp;lt;/tt&amp;gt; directory itself, it should not be readable at all by anyone else:&lt;br /&gt;
&lt;br /&gt;
 ls -ld ~/.ssh&lt;br /&gt;
 drwx------ 2 USERNAME GROUPNAME 7 Aug  9 15:43 /home/USERNAME/.ssh&lt;br /&gt;
&lt;br /&gt;
and &amp;lt;tt&amp;gt;authorized_keys&amp;lt;/tt&amp;gt; must not be writable by anyone else:&lt;br /&gt;
&lt;br /&gt;
 $ ls -l ~/.ssh/authorized_keys &lt;br /&gt;
 -rw-r--r-- 1 USERNAME GROUPNAME 1213 May 29  2009 /home/USERNAME/.ssh/authorized_keys&lt;br /&gt;
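If the permissions have ended up wrong, they can typically be restored along these lines (a minimal sketch; apply it to whichever key files you actually have):&lt;br /&gt;
&lt;br /&gt;
 chmod 700 ~/.ssh&lt;br /&gt;
 chmod 600 ~/.ssh/id_rsa&lt;br /&gt;
 chmod 644 ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys&lt;br /&gt;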
&lt;br /&gt;
===Testing Your Key===&lt;br /&gt;
&lt;br /&gt;
Now you should be able to login to the remote system (say, SciNet):&lt;br /&gt;
&lt;br /&gt;
 $ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
 Enter passphrase for key '/home/USERNAME/.ssh/id_rsa': &lt;br /&gt;
 Last login: Tue Aug 17 11:24:48 2010 from HOMEMACHINE&lt;br /&gt;
 &lt;br /&gt;
 ===================================================&lt;br /&gt;
 &lt;br /&gt;
 This SciNet login node is to be used only as a&lt;br /&gt;
 gateway to the GPC and TCS.&lt;br /&gt;
 &lt;br /&gt;
 [...]&lt;br /&gt;
 scinet04-$&lt;br /&gt;
&lt;br /&gt;
If this doesn't work, you should still be able to log in using your password and investigate the problem. For example, if during a login session you get a message similar to the one below, just follow the instructions and delete the offending key on the indicated line (line 3 in this example; in vi you can jump to that line by typing ESC, then :3). This only means that you may have logged in to SciNet from your home computer in the past, and that key is now obsolete.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!&lt;br /&gt;
Someone could be eavesdropping on you right now (man-in-the-middle&lt;br /&gt;
attack)!&lt;br /&gt;
It is also possible that the RSA host key has just been changed.&lt;br /&gt;
The fingerprint for the RSA key sent by the remote host is&lt;br /&gt;
53:f9:60:71:a8:0b:5d:74:83:52:fe:ea:1a:9e:cc:d3.&lt;br /&gt;
Please contact your system administrator.&lt;br /&gt;
Add correct host key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts to get rid of&lt;br /&gt;
this message.&lt;br /&gt;
Offending key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts:3&lt;br /&gt;
RSA host key for login.scinet.utoronto.ca has changed and you have&lt;br /&gt;
requested strict checking.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* If you get the message below, you may need to log out of your GNOME session and log back in, since the ssh-agent needs to be&lt;br /&gt;
restarted before it will pick up the new passphrase-protected ssh key.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Agent admitted failure to sign using the key.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
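&lt;br /&gt;
Depending on your setup, it may also be enough to simply re-add the key to the already-running agent:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# you will be asked for the new passphrase once&lt;br /&gt;
ssh-add ~/.ssh/id_rsa&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;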
&lt;br /&gt;
===(Optional) Using &amp;lt;tt&amp;gt;ssh-agent&amp;lt;/tt&amp;gt; to Remember Your Passphrase===&lt;br /&gt;
&lt;br /&gt;
But now you've just replaced having to type a password for login with having to type a passphrase for your key; what have you gained?  &lt;br /&gt;
&lt;br /&gt;
It turns out that there's an automated way to manage ssh &amp;quot;identities&amp;quot;, using the &amp;lt;tt&amp;gt;ssh-agent&amp;lt;/tt&amp;gt; command, which should automatically be running on newer Linux or Mac&amp;amp;nbsp;OS&amp;amp;nbsp;X machines.   You can add keys to this agent for the duration of your login using the &amp;lt;tt&amp;gt;ssh-add&amp;lt;/tt&amp;gt; command:&lt;br /&gt;
&lt;br /&gt;
 $ ssh-add&lt;br /&gt;
 Enter passphrase for /home/USERNAME/.ssh/id_rsa: &lt;br /&gt;
 Identity added: /home/USERNAME/.ssh/id_rsa (/home/USERNAME/.ssh/id_rsa)&lt;br /&gt;
&lt;br /&gt;
and then logins will not require the passphrase, as &amp;lt;tt&amp;gt;ssh-agent&amp;lt;/tt&amp;gt; will provide access to the key.&lt;br /&gt;
&lt;br /&gt;
When you log out of your home computer, the ssh agent will close, and next time you log in, you will have to &amp;lt;tt&amp;gt;ssh-add&amp;lt;/tt&amp;gt; your key.  You can also set a timeout of (say) an hour by using &amp;lt;tt&amp;gt;ssh-add -t 3600&amp;lt;/tt&amp;gt;.  This minimizes the number of times you have to type your passphrase, while still maintaining some degree of key security.&lt;br /&gt;
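&lt;br /&gt;
If no agent is running (for instance, in a plain console session), you can start one yourself for the current shell; a minimal sketch:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
eval &amp;quot;$(ssh-agent -s)&amp;quot;       # start an agent and set its environment variables in this shell&lt;br /&gt;
ssh-add -t 3600 ~/.ssh/id_rsa    # add the key for one hour only&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;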
&lt;br /&gt;
&lt;br /&gt;
=== Multiple ssh private keys ===&lt;br /&gt;
In quite a few situations it is preferable to have a dedicated ssh key for each service, role, or domain.  For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh-keygen -t rsa -f ~/.ssh/id_rsa.SciNet   -C &amp;quot;Key for SciNet&amp;quot;&lt;br /&gt;
ssh-keygen -t rsa -f ~/.ssh/id_rsa.SHARCNET -C &amp;quot;Key for SHARCNET&amp;quot;&lt;br /&gt;
ssh-keygen -t rsa -f ~/.ssh/id_rsa.DCS      -C &amp;quot;Key for Dept. Of Computer Science&amp;quot;&lt;br /&gt;
ssh-keygen -t rsa -f ~/.ssh/id_rsa.CITA     -C &amp;quot;Key for CITA&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Use a different file name for each key. Let's assume that there are two keys, ~/.ssh/id_rsa.SciNet and ~/.ssh/id_rsa.SHARCNET. The simplest way of making sure each of the keys works all the time is to create a config file for ssh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
touch ~/.ssh/config&lt;br /&gt;
chmod 600 ~/.ssh/config&lt;br /&gt;
echo &amp;quot;IdentityFile ~/.ssh/id_rsa.SciNet&amp;quot;   &amp;gt;&amp;gt; ~/.ssh/config&lt;br /&gt;
echo &amp;quot;IdentityFile ~/.ssh/id_rsa.SHARCNET&amp;quot; &amp;gt;&amp;gt; ~/.ssh/config&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This makes sure that both keys are tried whenever ssh makes a connection. However, the ssh config file gives you much finer control over keys and other per-connection settings. The recommended approach is to select the key based on the hostname. For example, with a ~/.ssh/config that looks like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host SciNet&lt;br /&gt;
  Hostname login.scinet.utoronto.ca&lt;br /&gt;
  IdentityFile ~/.ssh/id_rsa.SciNet&lt;br /&gt;
  User pinto&lt;br /&gt;
&lt;br /&gt;
Host SHARCNET&lt;br /&gt;
  Hostname sharcnet.ca&lt;br /&gt;
  IdentityFile ~/.ssh/id_rsa.SHARCNET&lt;br /&gt;
  User jchong&lt;br /&gt;
  Port 44787&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
you can then just log in with the shortcut:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh SciNet&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
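&lt;br /&gt;
The same shortcut works for other tools that run over ssh, such as &amp;lt;tt&amp;gt;scp&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;rsync&amp;lt;/tt&amp;gt;; for example (the file and destination path are hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
scp results.tar.gz SciNet:/scratch/USERNAME/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;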
&lt;br /&gt;
= SSH tunnel =&lt;br /&gt;
A lesser-known use of ssh is to create a communication tunnel. As an example, assume you want to access a website running on a remote host from your local host, but there is a firewall between the two systems blocking every port except incoming ssh.&lt;br /&gt;
&lt;br /&gt;
The basic syntax of the ssh command for such a purpose is: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -f -N -L localport:localhost:remoteport user@remotehost&lt;br /&gt;
# -f puts ssh in background&lt;br /&gt;
# -N makes it not execute a remote command&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If the remote website listens on the default port 80, you could do the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -f -N -L 8080:localhost:80 tunneluser@remotehost&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
... and point your local browser to http://localhost:8080&lt;br /&gt;
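&lt;br /&gt;
You can also check from the command line that the tunnel is up; for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# fetch the remote site's front page through the local end of the tunnel&lt;br /&gt;
curl http://localhost:8080/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;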
&lt;br /&gt;
If you don't want to remember the above sequence of flags all the time, you can add an entry to your ~/.ssh/config:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host tunnel&lt;br /&gt;
    HostName remotehost&lt;br /&gt;
    IdentityFile ~/.ssh/id_rsa.tunnel&lt;br /&gt;
    LocalForward 8080 127.0.0.1:80&lt;br /&gt;
    User tunneluser&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To open the tunnel just issue the command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh -f -N tunnel&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=6277</id>
		<title>FAQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=6277"/>
		<updated>2013-07-16T17:07:34Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* How do we manage job priorities within our research group? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==The Basics==&lt;br /&gt;
===Whom do I contact for support?===&lt;br /&gt;
&lt;br /&gt;
Whom do I contact if I have problems or questions about how to use the SciNet systems?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
E-mail [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  &lt;br /&gt;
&lt;br /&gt;
In your email, please include the following information:&lt;br /&gt;
&lt;br /&gt;
* your username on SciNet&lt;br /&gt;
* the cluster that your question pertains to (GPC or TCS; SciNet is not a cluster!),&lt;br /&gt;
* any relevant error messages&lt;br /&gt;
* the commands you typed before the errors occurred&lt;br /&gt;
* the path to your code (if applicable)&lt;br /&gt;
* the location of the job scripts (if applicable)&lt;br /&gt;
* the directory from which it was submitted (if applicable)&lt;br /&gt;
* a description of what it is supposed to do (if applicable)&lt;br /&gt;
* if your problem is about connecting to SciNet, the type of computer you are connecting from.&lt;br /&gt;
&lt;br /&gt;
Note that your password should never, never, never be sent to us, even if your question is about your account.&lt;br /&gt;
&lt;br /&gt;
Try to avoid sending email only to specific individuals at SciNet. Your chances of a quick reply increase significantly if you email our team!&lt;br /&gt;
&lt;br /&gt;
===What does ''code scaling'' mean?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Parallel_Speedup|A Performance Primer]]&lt;br /&gt;
&lt;br /&gt;
===What do you mean by ''throughput''?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Throughput|A Performance Primer]].&lt;br /&gt;
&lt;br /&gt;
Here is a simple example:&lt;br /&gt;
&lt;br /&gt;
Suppose you need to do 10 computations.  Say each of these runs for&lt;br /&gt;
1 day on 8 cores, but they take &amp;quot;only&amp;quot; 18 hours on 16 cores.  What is the&lt;br /&gt;
fastest way to get all 10 computations done - as 8-core jobs or as&lt;br /&gt;
16-core jobs?  Let us assume you have 2 nodes at your disposal.&lt;br /&gt;
The answer, after some simple arithmetic, is that running your 10&lt;br /&gt;
jobs as 8-core jobs will take 5 days, whereas running them&lt;br /&gt;
as 16-core jobs would take 7.5 days: with two 8-core nodes you can run two&lt;br /&gt;
8-core jobs at once (5 rounds of 24 hours = 5 days), but only one 16-core job&lt;br /&gt;
at a time (10 x 18 hours = 7.5 days).  Draw your own conclusions...&lt;br /&gt;
&lt;br /&gt;
===I changed my .bashrc/.bash_profile and now nothing works===&lt;br /&gt;
&lt;br /&gt;
The default startup scripts provided by SciNet, and guidelines for them, can be found [[Important_.bashrc_guidelines|here]].  Certain things - like sourcing &amp;lt;tt&amp;gt;/etc/profile&amp;lt;/tt&amp;gt;&lt;br /&gt;
and &amp;lt;tt&amp;gt;/etc/bashrc&amp;lt;/tt&amp;gt; - are ''required'' for various SciNet routines to work!   &lt;br /&gt;
&lt;br /&gt;
If the situation is so bad that you cannot even log in, please send an email to [mailto:support@scinet.utoronto.ca support].&lt;br /&gt;
&lt;br /&gt;
===Could I have my login shell changed to (t)csh?===&lt;br /&gt;
&lt;br /&gt;
The login shell used on our systems is bash. While tcsh is available on the GPC and the TCS, we do not support it as the default login shell at present.  So &amp;quot;chsh&amp;quot; will not work, but you can always run tcsh interactively. Also, csh scripts will be executed correctly provided that they have the correct &amp;quot;shebang&amp;quot; &amp;lt;tt&amp;gt;#!/bin/tcsh&amp;lt;/tt&amp;gt; at the top.&lt;br /&gt;
&lt;br /&gt;
===How can I run Matlab / IDL / Gaussian / my favourite commercial software at SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Because SciNet serves such a disparate group of user communities, there is just no way we can buy licenses for everyone's commercial package.   The only commercial software we have purchased is that which in principle can benefit everyone -- fast compilers and math libraries (Intel's on GPC, and IBM's on TCS).&lt;br /&gt;
&lt;br /&gt;
If your research group requires a commercial package that you already have or are willing to buy licenses for, contact us at [mailto:support@scinet.utoronto.ca support@scinet] and we can work together to find out if it is feasible to implement the package's licensing arrangement on the SciNet clusters, and if so, what is the best way to do it.&lt;br /&gt;
&lt;br /&gt;
Note that it is important that you contact us before installing commercially licensed software on SciNet machines, even if you have a way to do it in your own directory without requiring sysadmin intervention.   It puts us in a very awkward position if someone is found to be running unlicensed or invalidly licensed software on our systems, so we need to be aware of what is being installed where.&lt;br /&gt;
&lt;br /&gt;
===Do you have a recommended ssh program that will allow scinet access from Windows machines?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The programs we recommend for [[Ssh#SSH_for_Windows_Users | SSH for Windows users]] are:&lt;br /&gt;
&lt;br /&gt;
* [http://mobaxterm.mobatek.net/en/ MobaXterm] is a tabbed ssh client with some Cygwin tools, including ssh and X, all wrapped up into one executable.&lt;br /&gt;
* [http://www.chiark.greenend.org.uk/~sgtatham/putty/ PuTTY]  - this is a terminal for Windows that connects via ssh.  It is a quick install and will get you up and running quickly.&amp;lt;br&amp;gt;To set up your passphrase-protected ssh key with PuTTY, see [http://the.earth.li/~sgtatham/putty/0.61/htmldoc/Chapter8.html#pubkey here].&lt;br /&gt;
* [http://www.cygwin.com/ CygWin] - this is a whole Linux-like environment for Windows, which also includes an X window server so that you can display remote windows on your desktop.  Make sure you include openssh and the X window system in the installation for full functionality.  This is recommended if you will be doing a lot of work on Linux machines, as it makes a very similar environment available on your computer.&amp;lt;br&amp;gt;To set up your ssh keys, follow the Linux instructions on the [[Ssh keys]] page.&lt;br /&gt;
&lt;br /&gt;
===My ssh key does not work! WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
[[Ssh_keys#Testing_Your_Key | Testing Your Key]]&lt;br /&gt;
&lt;br /&gt;
* If this doesn't work, you should still be able to log in using your password and investigate the problem. For example, if during a login session you get a message similar to the one below, just follow the instructions and delete the offending key on line 3 of &amp;lt;tt&amp;gt;known_hosts&amp;lt;/tt&amp;gt; (in vi you can jump to that line by typing ESC and then :3). It only means that you have logged in from your home computer to SciNet in the past, and that host key is now obsolete.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!&lt;br /&gt;
Someone could be eavesdropping on you right now (man-in-the-middle&lt;br /&gt;
attack)!&lt;br /&gt;
It is also possible that the RSA host key has just been changed.&lt;br /&gt;
The fingerprint for the RSA key sent by the remote host is&lt;br /&gt;
53:f9:60:71:a8:0b:5d:74:83:52:fe:ea:1a:9e:cc:d3.&lt;br /&gt;
Please contact your system administrator.&lt;br /&gt;
Add correct host key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts to get rid of&lt;br /&gt;
this message.&lt;br /&gt;
Offending key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts:3&lt;br /&gt;
RSA host key for login.scinet.utoronto.ca has changed and you have&lt;br /&gt;
requested strict checking.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* If you get the message below, you may need to log out of your GNOME session and log back in, since the ssh-agent needs to be&lt;br /&gt;
restarted before it will pick up the new passphrase-protected ssh key.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
Agent admitted failure to sign using the key.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Can't forward X:  &amp;quot;Warning: No xauth data; using fake authentication data&amp;quot;, or &amp;quot;X11 connection rejected because of wrong authentication.&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
I used to be able to forward X11 windows from SciNet to my home machine, but now I'm getting these messages; what's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This very likely means that ssh/xauth can't update your ${HOME}/.Xauthority file. &lt;br /&gt;
&lt;br /&gt;
The simplest possible reason for this is that you've filled your 10GB /home quota and so can't write anything to your home directory.   Use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load extras&lt;br /&gt;
$ diskUsage&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
to check how close you are to your disk quota on ${HOME}.&lt;br /&gt;
&lt;br /&gt;
Alternatively, this could mean your .Xauthority file has become broken/corrupted/confused somehow, in which case you can delete that file; when you next log in you'll get a similar warning message about creating .Xauthority, but things should work.&lt;br /&gt;
&lt;br /&gt;
===How come I can not login to TCS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
A SciNet account doesn't automatically entitle you to TCS access. At a minimum, TCS jobs need to run on at least 32 cores (64 preferred because of Simultaneous Multi Threading - [[TCS_Quickstart#Node_configuration|SMT]] - on these nodes) and need the large memory (4GB/core) and bandwidth on the system. Essentially you need to be able to explain why the work can't be done on the GPC.&lt;br /&gt;
&lt;br /&gt;
===How can I reset the password for my Compute Canada account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can reset your password for your Compute Canada account here:&lt;br /&gt;
&lt;br /&gt;
https://ccdb.computecanada.org/security/forgot&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===How can I change or reset the password for my SciNet account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To reset your password at SciNet please e-mail [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]&lt;br /&gt;
&lt;br /&gt;
If you know your old password and want to change it, that can be done here:&lt;br /&gt;
&lt;br /&gt;
https://portal.scinet.utoronto.ca/&lt;br /&gt;
&lt;br /&gt;
===Why am I getting the error &amp;quot;Permission denied (publickey,gssapi-with-mic,password)&amp;quot;?===&lt;br /&gt;
&lt;br /&gt;
This error can pop up in a variety of situations: when trying to log in, or after a job has finished, when the error and output files fail to be copied (there are other possible reasons for this failure as well -- see [[FAQ#My_GPC_job_died.2C_telling_me_.60Copy_Stageout_Files_Failed.27|My GPC job died, telling me:Copy Stageout Files Failed]]).&lt;br /&gt;
In most cases, the &amp;quot;Permission denied&amp;quot; error is caused by incorrect permissions on the (hidden) .ssh directory. Ssh is used for logging in as well as for copying the standard error and output files after a job. &lt;br /&gt;
&lt;br /&gt;
For security reasons, &lt;br /&gt;
the directory .ssh should be readable and writable only by you; if it &lt;br /&gt;
has read permission for everybody, ssh refuses to use it and fails.  You can fix &lt;br /&gt;
this with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 700 ~/.ssh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
And to be sure, also do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 600 ~/.ssh/id_rsa ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===ERROR:102: Tcl command execution failed? when loading modules ===&lt;br /&gt;
Modules sometimes require other modules to be loaded first.&lt;br /&gt;
The module command will let you know if you didn't load them first.&lt;br /&gt;
For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
$ module load python&lt;br /&gt;
python/2.6.2(11):ERROR:151: Module ’python/2.6.2’ depends on one of the module(s) ’gcc/4.4.0’&lt;br /&gt;
python/2.6.2(11):ERROR:102: Tcl command execution failed: prereq gcc/4.4.0&lt;br /&gt;
$ module load gcc python&lt;br /&gt;
$&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Compiling your Code==&lt;br /&gt;
&lt;br /&gt;
===How can I get g77 to work?===&lt;br /&gt;
&lt;br /&gt;
The Fortran 77 compilers on the GPC are ifort and gfortran. We have dropped support for g77.  This has been a conscious decision: g77 (and the associated library libg2c) were completely replaced six years ago (Apr 2005) by the gcc 4.x branch, and haven't undergone any updates at all, even bug fixes, for over five years.  &lt;br /&gt;
If we were to install g77 and libg2c, we would have to deal with the inevitable confusion caused when users accidentally link against the old, broken, wrong versions of the gcc libraries instead of the correct current versions.   &lt;br /&gt;
&lt;br /&gt;
If your code for some reason specifically requires five-plus-year-old libraries,  availability, compatibility, and unfixed-known-bug problems are only going to get worse for you over time, and this might be as good an opportunity as any to address those issues. &lt;br /&gt;
&lt;br /&gt;
''A note on porting to gfortran or ifort:''&lt;br /&gt;
&lt;br /&gt;
While gfortran and ifort are rather compatible with g77, one &lt;br /&gt;
important difference is that by default, gfortran does not preserve &lt;br /&gt;
local variables between function calls, while g77 does.   Preserved &lt;br /&gt;
local variables are for instance often used in implementations of quasi-random number &lt;br /&gt;
generators.  Proper Fortran requires such variables to be declared as SAVE, &lt;br /&gt;
but not all old code does this.&lt;br /&gt;
Luckily, you can change gfortran's default behavior with the flag &lt;br /&gt;
&amp;lt;tt&amp;gt;-fno-automatic&amp;lt;/tt&amp;gt;.   For ifort, the corresponding flag is &amp;lt;tt&amp;gt;-noautomatic&amp;lt;/tt&amp;gt;.&lt;br /&gt;
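&lt;br /&gt;
For example (the source file name is hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# keep local variables static between calls, as g77 did by default&lt;br /&gt;
gfortran -O2 -fno-automatic -o mycode mycode.f&lt;br /&gt;
ifort    -O2 -noautomatic   -o mycode mycode.f&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;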
&lt;br /&gt;
===Where is libg2c.so?===&lt;br /&gt;
&lt;br /&gt;
libg2c.so is part of the g77 compiler, for which we dropped support. See [[#How can I get g77 to work?]] for our reasons.&lt;br /&gt;
&lt;br /&gt;
===Autoparallelization does not work!===&lt;br /&gt;
&lt;br /&gt;
I compiled my code with the &amp;lt;tt&amp;gt;-qsmp=omp,auto&amp;lt;/tt&amp;gt; option, and then I specified that it should be run with 64 threads - with &lt;br /&gt;
 export OMP_NUM_THREADS=64&lt;br /&gt;
&lt;br /&gt;
However, when I check the load using &amp;lt;tt&amp;gt;llq1 -n&amp;lt;/tt&amp;gt;, it shows a load on the node of 1.37.  Why?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Using the autoparallelization will only get you so far.  In fact, it usually does not do too much.  What is helpful is to run the compiler with the &amp;lt;tt&amp;gt;-qreport&amp;lt;/tt&amp;gt; option, and then read the output listing carefully to see where the compiler thought it could parallelize, where it could not, and the reasons for this.  Then you can go back to your code and carefully try to address each of the issues brought up by the compiler.&lt;br /&gt;
We ''emphasize'' that this is just a rough first guide, and that the compilers are still not magical!   For more sophisticated approaches to parallelizing your code, email us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  to set up an appointment with one&lt;br /&gt;
of our technical analysts.&lt;br /&gt;
&lt;br /&gt;
===How do I link against the Intel Math Kernel Library?===&lt;br /&gt;
&lt;br /&gt;
If you need to link in the Intel Math Kernel Library (MKL) libraries, you are well advised to use the Intel(R) Math Kernel Library Link Line Advisor: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/ for help in devising the list of libraries to link with your code.&lt;br /&gt;
&lt;br /&gt;
'''''Note that this gives the link line for the command line. When using it in Makefiles, replace $MKLPATH by ${MKLPATH}.'''''&lt;br /&gt;
&lt;br /&gt;
'''''Note too that, unless the integer arguments you will be passing to the MKL routines are actually 64-bit integers rather than the normal int or INTEGER types, you want to specify 32-bit integers (lp64).'''''&lt;br /&gt;
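&lt;br /&gt;
As a rough illustration only -- the exact library list depends on your MKL version and the choices you make in the link advisor -- a sequential, lp64 link line might look something like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# sequential (non-threaded) MKL with 32-bit integers (lp64); verify against the link advisor&lt;br /&gt;
icc mycode.c -o mycode -L${MKLPATH} -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;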
&lt;br /&gt;
===Can the compilers on the login nodes be disabled to prevent accidentally using them?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can accomplish this by modifying your .bashrc to not load the compiler modules. See [[Important .bashrc guidelines]].&lt;br /&gt;
&lt;br /&gt;
===&amp;quot;relocation truncated to fit: R_X86_64_PC32&amp;quot;: Huh?===&lt;br /&gt;
&lt;br /&gt;
What does this mean, and why can't I compile this code?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Welcome to the joys of the x86 architecture!  You're probably having trouble building arrays larger than 2GB, individually or together.   Generally, you have to try to use the medium or large x86 `memory model'.   For the intel compilers, this is specified with the compile options&lt;br /&gt;
&lt;br /&gt;
  -mcmodel=medium -shared-intel&lt;br /&gt;
&lt;br /&gt;
===&amp;quot;feupdateenv is not implemented and will always fail&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
How do I get rid of this and what does it mean?&lt;br /&gt;
 &lt;br /&gt;
'''Answer:'''&lt;br /&gt;
First note that, as ominous as it sounds, this is really just a warning, and has to do with the Intel math library. You can ignore it (unless you really are trying to manually change the exception handlers for floating point exceptions such as divide by zero), or take the safe road and get rid of it by linking with the Intel math functions library:&amp;lt;pre&amp;gt;-limf&amp;lt;/pre&amp;gt;See also [[#How do I link against the Intel Math Kernel Library?]]&lt;br /&gt;
&lt;br /&gt;
===Cannot find rdmacm library when compiling on GPC===&lt;br /&gt;
&lt;br /&gt;
I get the following error building my code on GPC: &amp;quot;&amp;lt;tt&amp;gt;ld: cannot find -lrdmacm&amp;lt;/tt&amp;gt;&amp;quot;.  Where can I find this library?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This library is part of the MPI libraries; if your compiler is having problems picking it up, it probably means you are mistakenly trying to compile on the login nodes (scinet01..scinet04).  The login nodes aren't part of the GPC; they are for logging into the data centre only.  From there you must go to the GPC or TCS development nodes to do any real work.&lt;br /&gt;
&lt;br /&gt;
=== Why do I get this error when I try to compile: &amp;quot;icpc: error #10001: could not find directory in which /usr/bin/g++41 resides&amp;quot; ?===&lt;br /&gt;
&lt;br /&gt;
You are trying to compile on the login nodes.   As described in the wiki ( https://support.scinet.utoronto.ca/wiki/index.php/GPC_Quickstart#Login ), or in the user's guide you received with your account, SciNet supports two main clusters with very different architectures.  Compilation must be done on the development nodes of the appropriate cluster (in this case, gpc01-04).   Thus, log into gpc01, gpc02, gpc03, or gpc04, and compile from there.&lt;br /&gt;
&lt;br /&gt;
==Testing your Code==&lt;br /&gt;
&lt;br /&gt;
=== Can I run something for a short time on the development nodes? ===&lt;br /&gt;
&lt;br /&gt;
I am in the process of playing around with the MPI calls in my code to get it to work. I do a lot of tests and each of them takes only a couple of seconds.  Can I do this on the development nodes?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes, as long as it's very brief (a few minutes).   People use the development nodes&lt;br /&gt;
for their work, and you don't want to bog them down for others; testing a real&lt;br /&gt;
code can chew up a lot more resources than compiling, etc.    The procedure differs&lt;br /&gt;
depending on which machine you're using.&lt;br /&gt;
&lt;br /&gt;
==== TCS ====&lt;br /&gt;
&lt;br /&gt;
On the TCS you can run small MPI jobs on the tcs02 node, which is meant for &lt;br /&gt;
development use.  But even for this test run on one node, you'll need a host file --&lt;br /&gt;
a list of hosts (in this case, all tcs-f11n06, which is the `real' name of tcs02)&lt;br /&gt;
that the job will run on.  Create a file called `hostfile' containing the following:&lt;br /&gt;
&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
&lt;br /&gt;
for a 4-task run.  When you invoke &amp;quot;poe&amp;quot; or &amp;quot;mpirun&amp;quot;, there are runtime&lt;br /&gt;
arguments that you specify pointing to this file.  You can also specify it&lt;br /&gt;
in an environment variable MP_HOSTFILE, so, if your file is in your /scratch directory, say &lt;br /&gt;
${SCRATCH}/hostfile, then you would do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 export MP_HOSTFILE=${SCRATCH}/hostfile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
in your shell.  You will also need to create a &amp;lt;tt&amp;gt;.rhosts&amp;lt;/tt&amp;gt; file in your &lt;br /&gt;
home directory, again listing &amp;lt;tt&amp;gt;tcs-f11n06&amp;lt;/tt&amp;gt;, so that &amp;lt;tt&amp;gt;poe&amp;lt;/tt&amp;gt;&lt;br /&gt;
can start jobs.   After that you can simply run your program.  You can use&lt;br /&gt;
mpiexec:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 mpiexec -n 4 my_test_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
adding &amp;lt;tt&amp;gt; -hostfile /path/to/my/hostfile&amp;lt;/tt&amp;gt; if you did not set the environment&lt;br /&gt;
variable above.  Alternatively, you can run it with the poe command (do a &amp;quot;man poe&amp;quot; for details), or even by&lt;br /&gt;
just directly running it.  In this case the number of MPI processes will by default&lt;br /&gt;
be the number of entries in your hostfile.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPC ====&lt;br /&gt;
&lt;br /&gt;
On the GPC one can run short test jobs on the GPC [[GPC_Quickstart#Compile.2FDevel_Nodes | development nodes ]]&amp;lt;tt&amp;gt;gpc01&amp;lt;/tt&amp;gt;..&amp;lt;tt&amp;gt;gpc04&amp;lt;/tt&amp;gt;;&lt;br /&gt;
if they are single-node jobs (which they should be) they don't need a hostfile.  Even better, though, is to request an [[ Moab#Interactive | interactive ]] job and run the tests either in the regular batch queue or in the short, high-availability [[ Moab#debug | debug ]] queue that is reserved for this purpose.&lt;br /&gt;
&lt;br /&gt;
=== How do I run a longer (but still shorter than an hour) test job quickly ? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer'''&lt;br /&gt;
&lt;br /&gt;
On the GPC there is a high turnover short queue called [[ Moab#debug | debug ]] that is designed for&lt;br /&gt;
this purpose.  You can use it by adding &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -q debug&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
to your submission script.&lt;br /&gt;
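&lt;br /&gt;
You can also request an interactive session in the debug queue for short tests; a minimal sketch (adjust the node count and walltime to your test):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# run from a GPC development node; gives you a shell on a compute node&lt;br /&gt;
qsub -I -q debug -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;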
&lt;br /&gt;
==Running your jobs==&lt;br /&gt;
&lt;br /&gt;
===My job can't write to /home===&lt;br /&gt;
&lt;br /&gt;
My code works fine when I test on the development nodes, but when I submit a job, or even run interactively in the development queue on GPC, it fails.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
As [[Data_Management#Home_Disk_Space | discussed]] [https://support.scinet.utoronto.ca/wiki/images/5/54/SciNet_Tutorial.pdf elsewhere], &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; is mounted read-only on the compute nodes; you can only write to &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; from the login nodes and devel nodes.  (The [[GPC_Quickstart#128Glargemem | largemem nodes]] on GPC, in this respect, are more like devel nodes than compute nodes).   In general, to run jobs you can read from &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; but you'll have to write to &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt; (or, if you were allocated space through the LRAC/NRAC process, on &amp;lt;tt&amp;gt;/project&amp;lt;/tt&amp;gt;).  More information on SciNet filesytems can be found on our [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===Error Submitting My Job: qsub: Bad UID for job execution MSG=ruserok failed ===&lt;br /&gt;
&lt;br /&gt;
I write up a submission script as in the examples, but when I attempt to submit the job, I get the above error.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This error will occur if you try to submit a job from the login nodes.   The login nodes are the gateway to all of SciNet's systems (GPC, TCS, P7, ARC), which have different hardware and queueing systems.  To submit a job, you must log into a development node for the particular cluster you are submitting to and submit from there.&lt;br /&gt;
&lt;br /&gt;
===OpenMP on the TCS===&lt;br /&gt;
&lt;br /&gt;
How do I run an OpenMP job on the TCS?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[TCS_Quickstart#Submission_Script_for_an_OpenMP_Job | TCS Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===Can I use hybrid codes consisting of MPI and OpenMP on the GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes. Please look at the [[GPC_Quickstart#Hybrid_MPI.2FOpenMP_jobs | GPC Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I run serial jobs on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
It should be said first that SciNet is a parallel computing resource, &lt;br /&gt;
and our priority will always be parallel jobs.   Having said that, if &lt;br /&gt;
you can make efficient use of the resources using serial jobs and get &lt;br /&gt;
good science done, that's good too, and we're happy to help you.&lt;br /&gt;
&lt;br /&gt;
The GPC nodes each have 8 processing cores, and making efficient use of these &lt;br /&gt;
nodes means using all eight cores.  As a result, we'd like to have the &lt;br /&gt;
users take up whole nodes (eg, run multiples of 8 jobs) at a time.  &lt;br /&gt;
&lt;br /&gt;
The best strategy depends on the nature of your job. Several approaches are presented on the [[User_Serial|serial run wiki page]]; the simplest is sketched below.&lt;br /&gt;
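&lt;br /&gt;
As an illustration only (the [[User_Serial|serial run wiki page]] has the recommended recipes), the basic idea is to start 8 serial runs in the background inside a single job and wait for all of them to finish; the program and file names below are hypothetical:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# Illustration: bundle 8 serial runs on one 8-core GPC node&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=12:00:00&lt;br /&gt;
#PBS -N serial_bundle&lt;br /&gt;
&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
for i in $(seq 1 8); do&lt;br /&gt;
    ./serial_code input.$i &amp;gt; output.$i &amp;amp;   # start each run in the background&lt;br /&gt;
done&lt;br /&gt;
wait   # do not let the job exit until all 8 runs have finished&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;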
&lt;br /&gt;
===Why can't I request only a single cpu for my job on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
On GPC, resources are allocated by the node - that is, in chunks of 8 processors.   If you want to run jobs that each require only one processor, you need to bundle them into groups of 8, so as not to waste the other 7 cores for up to 48 hours. See the [[User_Serial|serial run wiki page]].&lt;br /&gt;
&lt;br /&gt;
===How do I run serial jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''': You don't.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===But in the queue I found a user who is running jobs on GPC, each of which is using only one processor, so why can't I?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
The pradat* and atlaspt* jobs, amongst others, are jobs of the ATLAS high energy physics project. That they are reported as single cpu jobs is an artifact of the moab scheduler. They are in fact being automatically bundled into 8-job bundles but have to run individually to be compatible with their international grid-based systems.&lt;br /&gt;
&lt;br /&gt;
===How do I use the ramdisk on GPC?===&lt;br /&gt;
&lt;br /&gt;
To use the ramdisk, create, write to, and read from files in /dev/shm/.. just as one would in (eg) ${SCRATCH}. Only the amount of RAM needed to store the files will be taken up by the temporary file system; thus if you have 8 serial jobs each requiring 1 GB of RAM, and 1GB is taken up by various OS services, you would still have approximately 7GB available to use as ramdisk on a 16GB node. However, if you were to write 8 GB of data to the RAM disk, this would exceed available memory and your job would likely crash.&lt;br /&gt;
&lt;br /&gt;
It is very important to delete your files from ram disk at the end of your job. If you do not do this, the next user to use that node will have less RAM available than they might expect, and this might kill their jobs.&lt;br /&gt;
&lt;br /&gt;
''More details on how to set up your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].''&lt;br /&gt;
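&lt;br /&gt;
A minimal sketch of the pattern (the program and file names are hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir -p /dev/shm/$USER                    # your own area in the ramdisk&lt;br /&gt;
cp $SCRATCH/input.dat /dev/shm/$USER/      # stage input into RAM&lt;br /&gt;
./my_code /dev/shm/$USER/input.dat /dev/shm/$USER/output.dat&lt;br /&gt;
cp /dev/shm/$USER/output.dat $SCRATCH/     # copy results back to disk&lt;br /&gt;
rm -rf /dev/shm/$USER                      # always clean up for the next user&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;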
&lt;br /&gt;
===How can I automatically resubmit a job?===&lt;br /&gt;
&lt;br /&gt;
Commonly you may have a job that you know will take longer to run than what is &lt;br /&gt;
permissible in the queue.  As long as your program contains [[Checkpoints|checkpoint]] or &lt;br /&gt;
restart capability, you can have one job automatically submit the next. In&lt;br /&gt;
the following example it is assumed that the program finishes before &lt;br /&gt;
the 48 hour limit and then resubmits itself by logging into one&lt;br /&gt;
of the development nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example submission script for auto resubmission&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code&lt;br /&gt;
&lt;br /&gt;
# RESUBMIT 10 TIMES HERE&lt;br /&gt;
num=${NUM:-0}   # NUM is passed in via qsub -v; default to 0 for the first submission&lt;br /&gt;
if [ $num -lt 10 ]; then&lt;br /&gt;
      num=$(($num+1))&lt;br /&gt;
      ssh gpc01 &amp;quot;cd $PBS_O_WORKDIR; qsub ./script_name.sh -v NUM=$num&amp;quot;;&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first job is then submitted with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub script_name.sh -v NUM=0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can alternatively use [[ Moab#Job_Dependencies | Job dependencies ]] through the queuing system which will not start one job until another job has completed.&lt;br /&gt;
&lt;br /&gt;
If your job can't be made to automatically stop before the 48 hour queue window, but it does write out checkpoints, you can use the timeout command to stop the program while you still have time to resubmit; for instance&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
    timeout 2850m ./run_my_code argument1 argument2&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
will run the program for 47.5 hours (2850 minutes), and then send it SIGTERM to exit the program.&lt;br /&gt;
&lt;br /&gt;
===How can I pass in arguments to my submission script?===&lt;br /&gt;
&lt;br /&gt;
If you wish to make your scripts more generic, you can use qsub's ability &lt;br /&gt;
to pass in environment variables to supply arguments to your script.&lt;br /&gt;
The following example shows a case where an input and an output &lt;br /&gt;
file are passed in on the qsub line. Multiple variables can be &lt;br /&gt;
passed in using the qsub &amp;quot;-v&amp;quot; option, comma-delimited. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example of passing in arguments&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
# &lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code -f $INFILE -o $OUTFILE&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub script_name.sh -v INFILE=input.txt,OUTFILE=outfile.txt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== How can I run a job longer than 48 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The SciNet queues have a queue limit of 48 hours.   This is pretty typical for systems of this size in Canada and elsewhere, and larger systems commonly have shorter limits.   The limits are there to ensure that every user gets a fair share of the system (so that no one user ties up lots of nodes for a long time), and for safety (so that if one memory board in one node fails in the middle of a very long job, you haven't lost a month's worth of work).&lt;br /&gt;
&lt;br /&gt;
Since many of us have simulations that require more than that much time, most widely-used scientific applications have &amp;quot;checkpoint-restart&amp;quot; functionality, where every so often the complete state of the calculation is stored as a checkpoint file, and one can restart a simulation from one of these.   In fact, these restart files tend to be quite useful for a number of purposes.&lt;br /&gt;
&lt;br /&gt;
If your job will take longer, you will have to submit your job in multiple parts, restarting from a checkpoint each time.  In this way, one can run a simulation much longer than the queue limit.  In fact, one can even write job scripts which automatically re-submit themselves until a run is completed, using [[FAQ#How_can_I_automatically_resubmit_a_job.3F | automatic resubmission. ]]&lt;br /&gt;
&lt;br /&gt;
=== Why did showstart say it would take 3 hours for my job to start before, and now it says my job will start in 10 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[FAQ#How_do_priorities_work.2Fwhy_did_that_job_jump_ahead_of_mine_in_the_queue.3F | How do priorities work/why did that job jump ahead of mine in the queue? ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do priorities work/why did that job jump ahead of mine in the queue?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The [[Moab | queueing system]] used on SciNet machines is a [http://en.wikipedia.org/wiki/Priority_queue Priority Queue].  Jobs enter the queue at the back of the queue, and slowly make their way to the front as those ahead of them are run; but a job that enters the queue with a higher priority can `cut in line'.&lt;br /&gt;
&lt;br /&gt;
The main factor which determines priority is whether or not the user (or their PI) has an [http://wiki.scinethpc.ca/wiki/index.php/Application_Process LRAC or NRAC allocation].  These are competitively allocated grants of computer time; there is a call for proposals towards the end of every calendar year.    Users with an allocation have high priorities in an attempt to make sure that they can use the amount of computer time the committees granted them.   Their priority decreases as they approach their allotted usage over the current window of time; by the time that they have exhausted that allotted usage, their priority is the same as users with no allocation (unallocated, or `default' users).    Unallocated users have a fixed, low, priority.&lt;br /&gt;
&lt;br /&gt;
This priority system is called `fairshare'; the scheduler attempts to make sure everyone has their fair share of the machines, where the share that's fair has been determined by the allocation committee.    The fairshare window is a rolling window of two weeks; that is, any time you have a job in the queue, the fairshare calculation of its priority is given by how much of your allocation of the machine has been used in the last 14 days.&lt;br /&gt;
&lt;br /&gt;
A particular allocation might have some fraction of GPC - say 4% of the machine (if the PI had been allocated 10 million CPU hours on GPC). The allocations have labels; (called `Resource Allocation Proposal Identifiers', or RAPIs) they look something like&lt;br /&gt;
&lt;br /&gt;
  abc-123-ab&lt;br /&gt;
&lt;br /&gt;
where abc-123 is the PI's CCRI, and the suffix specifies which of the allocations granted to the PI is to be used.  These can be specified on a job-by-job basis.  On GPC, one adds the line&lt;br /&gt;
 #PBS -A RAPI&lt;br /&gt;
to your script; on TCS, one uses&lt;br /&gt;
 # @ account_no = RAPI&lt;br /&gt;
If the allocation to charge isn't specified, a default is used; each user has such a default, which can be changed at the same portal where one changes one's password:&lt;br /&gt;
&lt;br /&gt;
 https://portal.scinet.utoronto.ca/&lt;br /&gt;
&lt;br /&gt;
A job's priority is determined primarily by the fairshare priority of the allocation it is being charged to; the previous 14 days' worth of use under that allocation is calculated and compared to the allocated fraction (here, 4%) of the machine over that window (here, 14 days).   The fairshare priority is a decreasing function of the allocation left; if there is no allocation left (eg, jobs running under that allocation have already used 379,038 CPU hours in the past 14 days), the priority is the same as that of a user with no granted allocation.   (This last part has been the topic of some debate; as the machine gets more utilized, it will probably be the case that we allow RAC users who have greatly overused their quota to have their priorities drop below those of unallocated users, to give the unallocated users some chance to run on our increasingly crowded system; this would have no undue effect on our allocated users, as they would still be able to use the amount of resources they had been allocated by the committees.)   Note that all jobs charging the same allocation get the same fairshare priority.&lt;br /&gt;
&lt;br /&gt;
There are other factors that go into calculating priority, but fairshare is the most significant.   Other factors include&lt;br /&gt;
* amount of time waiting in queue (measured in units of the requested runtime).   A job that requests 1 hour in the queue and has been waiting 2 days will get a bump in its priority larger than a job that requests 2 days and has been waiting the same time.&lt;br /&gt;
* User adjustment of priorities ( See below ).&lt;br /&gt;
&lt;br /&gt;
The major effect of these subdominant terms is to shuffle the order of jobs running under the same allocation.&lt;br /&gt;
&lt;br /&gt;
===How do we manage job priorities within our research group?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Obviously, managing shared resources within a large group - whether it &lt;br /&gt;
is conference funding or CPU time - takes some doing.   &lt;br /&gt;
&lt;br /&gt;
It's important to note that the fairshare periods are intentionally kept &lt;br /&gt;
quite short - just two weeks long. So, for example, let us say that in your resource &lt;br /&gt;
allocation you have about 10% of the machine.   Then for someone to use &lt;br /&gt;
up the whole two week amount of time in 2 days, they'd have to use 70% &lt;br /&gt;
of the machine in those two days - which is unlikely to happen by &lt;br /&gt;
accident.  If that does happen,  &lt;br /&gt;
those using the same allocation as the person who used 70% of the &lt;br /&gt;
machine over the two days will suffer by having much lower priority for &lt;br /&gt;
their jobs, but only for the next 12 days - and even then, if there are &lt;br /&gt;
idle cpus they'll still be able to compute.&lt;br /&gt;
&lt;br /&gt;
There will be online tools for seeing how the allocation is being used, &lt;br /&gt;
and those people who are in charge in your group will be able to use &lt;br /&gt;
that information to manage the users, telling them to dial it down or &lt;br /&gt;
up.   We know that managing a large research group is hard, and we want &lt;br /&gt;
to make sure we provide you the information you need to do your job &lt;br /&gt;
effectively.&lt;br /&gt;
&lt;br /&gt;
One way for users within a group to manage their priorities within the group&lt;br /&gt;
is with [[Moab#Adjusting_Job_Priority | user-adjusted priorities]]; this is&lt;br /&gt;
described in more detail on the [[Moab | Scheduling System]] page.&lt;br /&gt;
&lt;br /&gt;
=== How do I charge jobs to my NRAC/LRAC allocation? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see the [[Moab#Accounting|accounting section of Moab page]].&lt;br /&gt;
&lt;br /&gt;
=== How does one check the amount of used CPU-hours in a project, and how does one get statistics for each user in the project? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This information is available on the SciNet portal, https://portal.scinet.utoronto.ca. See also [[SciNet Usage Reports]].&lt;br /&gt;
&lt;br /&gt;
=== How does the Infiniband Upgrade affect my 2012 NRAC allocation ?===&lt;br /&gt;
&lt;br /&gt;
The NRAC allocations for the current (2012) year that were based on ethernet and infiniband will carry over; however, the allocation will be on the full GPC, not just the subsection.  So if you were allocated 500 hours on InfiniBand, your fairshare allocation will still be 500 hours, just 500 out of 30,000 instead of 500 out of 7,000.  If you received two allocations, one on gigE and one on IB, they will simply be combined. This should benefit all users, as the desegregation of the GPC provides a greater pool of nodes, increasing the probability that your job will run.&lt;br /&gt;
&lt;br /&gt;
==Monitoring jobs in the queue==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Why hasn't my job started?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Use the moab command &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
checkjob -v jobid&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and the last couple of lines should explain why a job hasn't started.  &lt;br /&gt;
&lt;br /&gt;
Please see [[Moab| Job Scheduling System (Moab) ]] for more detailed information&lt;br /&gt;
&lt;br /&gt;
===How do I figure out when my job will run?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Moab#Available_Resources| Job Scheduling System (Moab) ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- ===My GPC job is Held, and checkjob says &amp;quot;Batch:PolicyViolation&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When this happens, you'll see your job stuck in a BatchHold state.  &lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues, and is being held until you modify it or kill it and re-submit a conforming job.  The most common problems are:&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===I submit my GPC job, and I get an email saying it was rejected===&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues and is rejected. An email&lt;br /&gt;
is sent with the JOBID, JOBNAME, and the reason it was rejected.  The following is an example where a job&lt;br /&gt;
requests more than 48 hours and was rejected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462493.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'wclimit too high for class 'batch_ib' (345600 &amp;gt; 172800)'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Jobs on the TCS or GPC may only run for 48 hours at a time; this restriction greatly increases responsiveness of the queue and queue throughput for all our users.  If your computation requires longer than that, as many do, you will have to [[ Checkpoints | checkpoint ]] your job and restart it after each 48-hour queue window.   You can manually re-submit jobs, or if you can have your job cleanly exit before the 48 hour window, there are ways to [[ FAQ#How_can_I_automatically_resubmit_a_job.3F | automatically resubmit jobs ]].&lt;br /&gt;
&lt;br /&gt;
Other rejections return a more cryptic error saying &amp;quot;job violates class configuration&amp;quot; such as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462409.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'user required by class 'batch''&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The most common problems that result in this error are:&lt;br /&gt;
&lt;br /&gt;
* '''Incorrect number of processors per node''': Jobs on the GPC are scheduled per-node not per-core and since each node has 8 processor cores (ppn=8) the smallest job allowed is one node with 8 cores (nodes=1:ppn=8).  For serial jobs users must bundle or batch them together in groups of 8. See [[ FAQ#How_do_I_run_serial_jobs_on_GPC.3F | How do I run serial jobs on GPC? ]]&lt;br /&gt;
* '''No number of nodes specified''': Jobs submitted to the main queue must request a specific number of nodes, either in the submission script (with a line like &amp;lt;tt&amp;gt;#PBS -l nodes=2:ppn=8&amp;lt;/tt&amp;gt;) or on the command line (eg, &amp;lt;tt&amp;gt;qsub -l nodes=2:ppn=8,walltime=5:00:00 script.pbs&amp;lt;/tt&amp;gt;).  Note that for the debug queue, you can get away without specifying a number of nodes and a default of one will be assigned; for both technical and policy reasons, we do not enforce such a default for the main (&amp;quot;batch&amp;quot;) queue.&lt;br /&gt;
* '''There is a 15 minute walltime minimum''' on all queues except debug and if you set your walltime less than this, it will be rejected.&lt;br /&gt;
&lt;br /&gt;
===How can I monitor my running jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
How can I monitor the load of TCS jobs?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can get more information with the command &lt;br /&gt;
 /xcat/tools/tcs-scripts/LL/jobState.sh&lt;br /&gt;
which you can alias as:&lt;br /&gt;
 alias llq1='/xcat/tools/tcs-scripts/LL/jobState.sh'&lt;br /&gt;
If you run &amp;quot;llq1 -n&amp;quot; you will see a listing of jobs together with a lot of information, including the load.&lt;br /&gt;
&lt;br /&gt;
==Errors in running jobs==&lt;br /&gt;
&lt;br /&gt;
===On GPC, `Job cannot be executed'===&lt;br /&gt;
&lt;br /&gt;
I get error messages like this trying to run on GPC:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
Aborted by PBS Server &lt;br /&gt;
Job cannot be executed&lt;br /&gt;
See Administrator for help&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node 'gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0' for job 30414.gpc-sched&lt;br /&gt;
&lt;br /&gt;
Unable to copy file 30414.gpc-sched.OU to USER@gpc-f101n084.scinet.local:/scratch/G/GROUP/USER/projects/sim-performance-test/runtime/l/namd/8/namd.o30414&lt;br /&gt;
*** error from copy&lt;br /&gt;
30414.gpc-sched.OU: No such file or directory&lt;br /&gt;
*** end error output&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Try doing the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir ${SCRATCH}/.pbs_spool&lt;br /&gt;
ln -s ${SCRATCH}/.pbs_spool ~/.pbs_spool&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is how all new accounts are setup on SciNet.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; on GPC for compute jobs is mounted as a read-only file system.   &lt;br /&gt;
PBS by default tries to spool its output  files to &amp;lt;tt&amp;gt;${HOME}/.pbs_spool&amp;lt;/tt&amp;gt;&lt;br /&gt;
which fails as it tries to write to a read-only file  &lt;br /&gt;
system.    New accounts at SciNet  get around this by having ${HOME}/.pbs_spool  &lt;br /&gt;
point to somewhere appropriate on &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, but if you've deleted that link&lt;br /&gt;
or directory, or had an old account, you will see errors like the above.&lt;br /&gt;
&lt;br /&gt;
'''On Feb 24, the input/output mechanism has been reconfigured to use a local ramdisk as the temporary location, which means that .pbs_spool is no longer needed and this error should not occur anymore.'''&lt;br /&gt;
&lt;br /&gt;
=== I couldn't find the  .o output file in the .pbs_spool directory as I used to ===&lt;br /&gt;
&lt;br /&gt;
On Feb 24 2011, the temporary location of standard input and output files was moved from the shared file system ${SCRATCH}/.pbs_spool to the&lt;br /&gt;
node-local directory /var/spool/torque/spool (which resides in ram). The final location after a job has finished is unchanged,&lt;br /&gt;
but to check the output/error of running jobs, users will now have to ssh into the (first) node assigned to the job and look in&lt;br /&gt;
/var/spool/torque/spool.&lt;br /&gt;
&lt;br /&gt;
This alleviates access contention to the temporary directory, especially for those users that are running a lot of jobs, and  reduces the burden on the file system in general.&lt;br /&gt;
&lt;br /&gt;
Note that it is good practice to redirect output to a file rather than to count on the scheduler to do this for you.&lt;br /&gt;
&lt;br /&gt;
=== My GPC job died, telling me `Copy Stageout Files Failed' ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When a job runs on GPC, the script's standard output and error are redirected to &lt;br /&gt;
&amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.OU&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.ER&amp;lt;/tt&amp;gt; in&lt;br /&gt;
/var/spool/torque/spool on the (first) node on which your job is running.  At the end of the job, those .OU and .ER files are copied to where the batch script tells them to be copied, by default &amp;lt;tt&amp;gt;$PBS_JOBNAME.o$PBS_JOBID&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBNAME.e$PBS_JOBID&amp;lt;/tt&amp;gt;.   (You can set those filenames to be something clearer with the -e and -o options in your PBS script.)&lt;br /&gt;
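For example, to give the copied files clearer names you could add something like the following to your submission script (the file names here are hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -N namd&lt;br /&gt;
#PBS -o namd_run.log&lt;br /&gt;
#PBS -e namd_run.err&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;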
&lt;br /&gt;
When you get errors like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
it means that the copy-back process has failed in some way.  There could be a few reasons for this. The first thing to do is to '''make sure that your .bashrc does not produce any output''', as the output stageout is performed by bash and any extra output can cause it to fail.&lt;br /&gt;
But it could also have been a transient filesystem error, or your job may have failed spectacularly enough to short-circuit the normal job-termination process, so that those files never got copied.&lt;br /&gt;
&lt;br /&gt;
Write to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] if your input/output files got lost, as we will probably be able to retrieve them for you (please supply at least the jobid, and any other information that may be relevant). &lt;br /&gt;
&lt;br /&gt;
Keep in mind that it is good practice to redirect output to a file rather than depending on the job scheduler to do this for you.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&lt;br /&gt;
===Another transport will be used instead===&lt;br /&gt;
&lt;br /&gt;
I get error messages like the following when running on the GPC at the start of the run, although the job seems to proceed OK.   Is this a problem?&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
[[45588,1],0]: A high-performance Open MPI point-to-point messaging module&lt;br /&gt;
was unable to find any relevant network interfaces:&lt;br /&gt;
&lt;br /&gt;
Module: OpenFabrics (openib)&lt;br /&gt;
  Host: gpc-f101n005&lt;br /&gt;
&lt;br /&gt;
Another transport will be used instead, although this may result in&lt;br /&gt;
lower performance.&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Everything's fine.   The two MPI libraries SciNet provides work for both the InfiniBand and the Gigabit Ethernet interconnects, and will always try to use the fastest interconnect available.   In this case, you ran on normal gigabit GPC nodes with no InfiniBand; but the MPI libraries have no way of knowing this, and try the InfiniBand first anyway.  This is just a harmless `failover' message; it tried to use the InfiniBand, which doesn't exist on this node, then fell back on using Gigabit Ethernet (`another transport').&lt;br /&gt;
&lt;br /&gt;
With OpenMPI, this can be avoided by not looking for infiniband; eg, by using the option&lt;br /&gt;
&lt;br /&gt;
--mca btl ^openib&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===IB Memory Errors, eg &amp;lt;tt&amp;gt; reg_mr Cannot allocate memory &amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
Infiniband requires more memory than ethernet; it can use RDMA (remote direct memory access) transport for which it sets aside registered memory to transfer data.&lt;br /&gt;
&lt;br /&gt;
In our current network configuration, it requires a _lot_ more memory, particularly as you go to larger process counts; unfortunately, that means you can't get around the &amp;quot;I need more memory&amp;quot; problem the usual way, by running on more nodes.   Machines with different memory or &lt;br /&gt;
network configurations may exhibit this problem at higher or lower MPI &lt;br /&gt;
task counts.&lt;br /&gt;
&lt;br /&gt;
Right now, the best workaround is to reduce the number and size of the OpenIB queues by using XRC: with OpenMPI, add the following options to your mpirun command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-mca btl_openib_receive_queues X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32 -mca btl_openib_max_send_size 12288&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With Intel MPI, you should be able to do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load intelmpi/4.0.3.008&lt;br /&gt;
mpirun -genv I_MPI_FABRICS=shm:ofa  -genv I_MPI_OFA_USE_XRC=1 -genv I_MPI_OFA_DYNAMIC_QPS=1 -genv I_MPI_DEBUG=5 -np XX ./mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
to the same end.  &lt;br /&gt;
&lt;br /&gt;
For more information see [[GPC MPI Versions]].&lt;br /&gt;
&lt;br /&gt;
===My compute job fails, saying &amp;lt;tt&amp;gt;libpng12.so.0: cannot open shared object file&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;libjpeg.so.62: cannot open shared object file&amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To maximize the amount of memory available for compute jobs, the compute nodes have a less complete system image than the development nodes.   In particular, since graphics packages like matplotlib and gnuplot are usually used interactively, the libraries they need are included in the devel nodes' image but not in the compute nodes' image.&lt;br /&gt;
&lt;br /&gt;
Many of these extra libraries are, however, available in the &amp;quot;extras&amp;quot; module.   So adding a &amp;quot;module load extras&amp;quot; to your job submission  script - or, for overkill, to your .bashrc - should enable these scripts to run on the compute nodes.&lt;br /&gt;
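A minimal sketch of a submission script that does this (the program name is hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
module load extras&lt;br /&gt;
./my_plotting_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;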
&lt;br /&gt;
==Data on SciNet disks==&lt;br /&gt;
&lt;br /&gt;
===How do I find out my disk usage?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The standard unix/linux utilities for finding the amount of disk space used by a directory are very slow, and notoriously inefficient on the GPFS filesystems that we run on the SciNet systems.  There are utilities that very quickly report your disk usage:&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''/scinet/gpc/bin/diskUsage'''&amp;lt;/tt&amp;gt; command, available on the login nodes, datamovers and the GPC devel nodes, provides information in a number of ways on the home, scratch, and project file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), and plots of your usage over time.&lt;br /&gt;
Note that this information is only updated hourly!&lt;br /&gt;
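For example, to see the usage for yourself and your group:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load extras&lt;br /&gt;
$ diskUsage -a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;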
&lt;br /&gt;
More information about these filesystems is available on the [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I transfer data to/from SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
All incoming connections to SciNet go through relatively low-speed connections to the &amp;lt;tt&amp;gt;login.scinet&amp;lt;/tt&amp;gt; gateways, so using scp to copy files the same way you ssh in is not an effective way to move lots of data.  Better tools are described in our page on [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
&lt;br /&gt;
===My group works with data files of size 1-2 GB.  Is this too large to  transfer by scp to login.scinet.utoronto.ca ?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Generally, occasional transfers of data of less than 10GB are perfectly acceptable to do through the login nodes. See [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
&lt;br /&gt;
===How can I check if I have files in /scratch that are scheduled for automatic deletion?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Storage_Quickstart#Scratch_Disk_Purging_Policy | Storage At SciNet]]&lt;br /&gt;
&lt;br /&gt;
===How do I allow my supervisor to manage files for me using ACL-based commands?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Data_Management#File.2FOwnership_Management_.28ACL.29 | File/Ownership Management]]&lt;br /&gt;
&lt;br /&gt;
===Can we buy extra storage space on SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
Yes, please see [[Data_Management#Buying_storage_space_on_GPFS_or_HPSS | Buying storage space on GPFS or HPSS ]] for more details.&lt;br /&gt;
&lt;br /&gt;
==Keep 'em Coming!==&lt;br /&gt;
&lt;br /&gt;
===Next question, please===&lt;br /&gt;
&lt;br /&gt;
Send your question to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;];  we'll answer it asap!&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=6276</id>
		<title>FAQ</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=FAQ&amp;diff=6276"/>
		<updated>2013-07-16T17:04:42Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* When will the 2011 NRAC disk space allocation be ready? */ - time to delete this.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==The Basics==&lt;br /&gt;
===Whom do I contact for support?===&lt;br /&gt;
&lt;br /&gt;
Whom do I contact if I have problems or questions about how to use the SciNet systems?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
E-mail [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  &lt;br /&gt;
&lt;br /&gt;
In your email, please include the following information:&lt;br /&gt;
&lt;br /&gt;
* your username on SciNet&lt;br /&gt;
* the cluster that your question pertains to (GPC or TCS; SciNet is not a cluster!),&lt;br /&gt;
* any relevant error messages&lt;br /&gt;
* the commands you typed before the errors occurred&lt;br /&gt;
* the path to your code (if applicable)&lt;br /&gt;
* the location of the job scripts (if applicable)&lt;br /&gt;
* the directory from which it was submitted (if applicable)&lt;br /&gt;
* a description of what it is supposed to do (if applicable)&lt;br /&gt;
* if your problem is about connecting to SciNet, the type of computer you are connecting from.&lt;br /&gt;
&lt;br /&gt;
Note that your password should never, never, never be sent to us, even if your question is about your account.&lt;br /&gt;
&lt;br /&gt;
Try to avoid sending email only to specific individuals at SciNet. Your chances of a quick reply increase significantly if you email our team!&lt;br /&gt;
&lt;br /&gt;
===What does ''code scaling'' mean?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Parallel_Speedup|A Performance Primer]]&lt;br /&gt;
&lt;br /&gt;
===What do you mean by ''throughput''?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Introduction_To_Performance#Throughput|A Performance Primer]].&lt;br /&gt;
&lt;br /&gt;
Here is a simple example:&lt;br /&gt;
&lt;br /&gt;
Suppose you need to do 10 computations.  Say each of these runs for&lt;br /&gt;
1 day on 8 cores, but they take &amp;quot;only&amp;quot; 18 hours on 16 cores.  What is the&lt;br /&gt;
fastest way to get all 10 computations done - as 8-core jobs or as&lt;br /&gt;
16-core jobs?  Let us assume you have 2 nodes (16 cores in total) at your disposal.&lt;br /&gt;
The answer, after some simple arithmetic, is that running your 10&lt;br /&gt;
jobs as 8-core jobs (two at a time, so 5 rounds of 1 day each) will take 5 days, whereas running them&lt;br /&gt;
as 16-core jobs (one at a time, so 10 x 18 hours = 180 hours) would take 7.5 days.  Draw your own conclusions...&lt;br /&gt;
&lt;br /&gt;
===I changed my .bashrc/.bash_profile and now nothing works===&lt;br /&gt;
&lt;br /&gt;
The default startup scripts provided by SciNet, and guidelines for them, can be found [[Important_.bashrc_guidelines|here]].  Certain things - like sourcing &amp;lt;tt&amp;gt;/etc/profile&amp;lt;/tt&amp;gt;&lt;br /&gt;
and &amp;lt;tt&amp;gt;/etc/bashrc&amp;lt;/tt&amp;gt; - are ''required'' for various SciNet routines to work!   &lt;br /&gt;
&lt;br /&gt;
If the situation is so bad that you cannot even log in, please send email to [mailto:support@scinet.utoronto.ca support].&lt;br /&gt;
&lt;br /&gt;
===Could I have my login shell changed to (t)csh?===&lt;br /&gt;
&lt;br /&gt;
The login shell used on our systems is bash. While the tcsh is available on the GPC and the TCS, we do not support it as the default login shell at present.  So &amp;quot;chsh&amp;quot; will not work, but you can always run tcsh interactively. Also, csh scripts will be executed correctly provided that they have the correct &amp;quot;shebang&amp;quot; &amp;lt;tt&amp;gt;#!/bin/tcsh&amp;lt;/tt&amp;gt; at the top.&lt;br /&gt;
&lt;br /&gt;
===How can I run Matlab / IDL / Gaussian / my favourite commercial software at SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Because SciNet serves such a disparate group of user communities, there is just no way we can buy licenses for everyone's commercial package.   The only commercial software we have purchased is that which in principle can benefit everyone -- fast compilers and math libraries (Intel's on GPC, and IBM's on TCS).&lt;br /&gt;
&lt;br /&gt;
If your research group requires a commercial package that you already have or are willing to buy licenses for, contact us at [mailto:support@scinet.utoronto.ca support@scinet] and we can work together to find out if it is feasible to implement the package's licensing arrangement on the SciNet clusters, and if so, what the best way to do it is.&lt;br /&gt;
&lt;br /&gt;
Note that it is important that you contact us before installing commercially licensed software on SciNet machines, even if you have a way to do it in your own directory without requiring sysadmin intervention.   It puts us in a very awkward position if someone is found to be running unlicensed or invalidly licensed software on our systems, so we need to be aware of what is being installed where.&lt;br /&gt;
&lt;br /&gt;
===Do you have a recommended ssh program that will allow scinet access from Windows machines?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The [[Ssh#SSH_for_Windows_Users | SSH for Windows users]] programs we recommend are:&lt;br /&gt;
&lt;br /&gt;
* [http://mobaxterm.mobatek.net/en/ MobaXterm] is a tabbed ssh client with some Cygwin tools, including ssh and X, all wrapped up into one executable.&lt;br /&gt;
* [http://www.chiark.greenend.org.uk/~sgtatham/putty/ PuTTY]  - this is a terminal for windows that connects via ssh.  It is a quick install and will get you up and running quickly.&amp;lt;br&amp;gt;To set up your passphrase protected ssh key with putty, see [http://the.earth.li/~sgtatham/putty/0.61/htmldoc/Chapter8.html#pubkey here].&lt;br /&gt;
* [http://www.cygwin.com/ CygWin] - this is a whole Linux-like environment for Windows, which also includes an X window server so that you can display remote windows on your desktop.  Make sure you include openssh and the X window system in the installation for full functionality.  This is recommended if you will be doing a lot of work on Linux machines, as it makes a very similar environment available on your computer.&amp;lt;br&amp;gt;To set up your ssh keys, follow the Linux instructions on the [[Ssh keys]] page.&lt;br /&gt;
&lt;br /&gt;
===My ssh key does not work! WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
[[Ssh_keys#Testing_Your_Key | Testing Your Key]]&lt;br /&gt;
&lt;br /&gt;
* If this doesn't work, you should be able to log in using your password and investigate the problem. For example, if during a login session you get a message similar to the one below, just follow the instructions and delete the offending key on line 3 of your known_hosts file (in vi, you can jump to that line with ESC, then :3). This usually just means that you have logged in to SciNet from your home computer in the past, and that stored host key is now obsolete.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @&lt;br /&gt;
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@&lt;br /&gt;
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!&lt;br /&gt;
Someone could be eavesdropping on you right now (man-in-the-middle&lt;br /&gt;
attack)!&lt;br /&gt;
It is also possible that the RSA host key has just been changed.&lt;br /&gt;
The fingerprint for the RSA key sent by the remote host is&lt;br /&gt;
53:f9:60:71:a8:0b:5d:74:83:52:**fe:ea:1a:9e:cc:d3.&lt;br /&gt;
Please contact your system administrator.&lt;br /&gt;
Add correct host key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts to get rid of&lt;br /&gt;
this message.&lt;br /&gt;
Offending key in /home/&amp;lt;user&amp;gt;/.ssh/known_hosts:3&lt;br /&gt;
RSA host key for login.scinet.utoronto.ca has&lt;br /&gt;
changed and you have requested&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* If you get the message below, you may need to log out of your gnome session and log back in, since the ssh-agent needs to be&lt;br /&gt;
restarted with the new passphrase-protected ssh key.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh USERNAME@login.scinet.utoronto.ca&lt;br /&gt;
&lt;br /&gt;
Agent admitted failure to sign using the key.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Can't forward X:  &amp;quot;Warning: No xauth data; using fake authentication data&amp;quot;, or &amp;quot;X11 connection rejected because of wrong authentication.&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
I used to be able to forward X11 windows from SciNet to my home machine, but now I'm getting these messages; what's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This very likely means that ssh/xauth can't update your ${HOME}/.Xauthority file. &lt;br /&gt;
&lt;br /&gt;
The simplest possible reason for this is that you've filled your 10GB /home quota and so can't write anything to your home directory.   Use&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load extras&lt;br /&gt;
$ diskUsage&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
to check how close you are to your quota on ${HOME}.&lt;br /&gt;
&lt;br /&gt;
Alternately, this could mean your .Xauthority file has become broken, corrupted, or confused somehow, in which case you can delete that file; when you next log in you'll get a similar warning message about creating .Xauthority, but things should work.&lt;br /&gt;
&lt;br /&gt;
===How come I cannot log in to TCS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
A SciNet account doesn't automatically entitle you to TCS access. At a minimum, TCS jobs need to run on at least 32 cores (64 preferred because of Simultaneous Multi Threading - [[TCS_Quickstart#Node_configuration|SMT]] - on these nodes) and need the large memory (4GB/core) and bandwidth on the system. Essentially you need to be able to explain why the work can't be done on the GPC.&lt;br /&gt;
&lt;br /&gt;
===How can I reset the password for my Compute Canada account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can reset your password for your Compute Canada account here:&lt;br /&gt;
&lt;br /&gt;
https://ccdb.computecanada.org/security/forgot&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===How can I change or reset the password for my SciNet account?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To reset your password at SciNet please e-mail [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]&lt;br /&gt;
&lt;br /&gt;
If you know your old password and want to change it, that can be done here:&lt;br /&gt;
&lt;br /&gt;
https://portal.scinet.utoronto.ca/&lt;br /&gt;
&lt;br /&gt;
===Why am I getting the error &amp;quot;Permission denied (publickey,gssapi-with-mic,password)&amp;quot;?===&lt;br /&gt;
&lt;br /&gt;
This error can pop up in a variety of situations: when trying to log in, or after a job has finished, when the error and output files fail to be copied (there are other possible reasons for this failure as well -- see [[FAQ#My_GPC_job_died.2C_telling_me_.60Copy_Stageout_Files_Failed.27|My GPC job died, telling me: Copy Stageout Files Failed]]).&lt;br /&gt;
In most cases, the &amp;quot;Permission denied&amp;quot; error is caused by incorrect permissions on the (hidden) .ssh directory. Ssh is used for logging in as well as for copying the standard error and output files back after a job. &lt;br /&gt;
&lt;br /&gt;
For security reasons, &lt;br /&gt;
the .ssh directory should be readable and writable only by you.  If it &lt;br /&gt;
has read permission for everybody, ssh refuses to use it, and these operations fail.  You can change &lt;br /&gt;
this by&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 700 ~/.ssh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
And to be sure, also do&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   chmod 600 ~/.ssh/id_rsa ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===ERROR:102: Tcl command execution failed? when loading modules ===&lt;br /&gt;
Modules sometimes require other modules to be loaded first.&lt;br /&gt;
The module command will let you know if you didn't load the prerequisites.&lt;br /&gt;
For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module purge&lt;br /&gt;
$ module load python&lt;br /&gt;
python/2.6.2(11):ERROR:151: Module ’python/2.6.2’ depends on one of the module(s) ’gcc/4.4.0’&lt;br /&gt;
python/2.6.2(11):ERROR:102: Tcl command execution failed: prereq gcc/4.4.0&lt;br /&gt;
$ module load gcc python&lt;br /&gt;
$&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Compiling your Code==&lt;br /&gt;
&lt;br /&gt;
===How can I get g77 to work?===&lt;br /&gt;
&lt;br /&gt;
The fortran 77 compilers on the GPC are ifort and gfortran. We have dropped support for g77.  This has been a conscious decision. g77 (and the associated library libg2c) were completely replaced six years ago (Apr 2005) by the gcc 4.x branch, and haven't undergone any updates at all, even bug fixes, for over five years.  &lt;br /&gt;
If we were to install g77 and libg2c, we would have to deal with the inevitable confusion caused when users accidentally link against the old, broken versions of the gcc libraries instead of the correct current versions.   &lt;br /&gt;
&lt;br /&gt;
If your code for some reason specifically requires five-plus-year-old libraries,  availability, compatibility, and unfixed-known-bug problems are only going to get worse for you over time, and this might be as good an opportunity as any to address those issues. &lt;br /&gt;
&lt;br /&gt;
''A note on porting to gfortran or ifort:''&lt;br /&gt;
&lt;br /&gt;
While gfortran and ifort are rather compatible with g77, one &lt;br /&gt;
important difference is that by default, gfortran does not preserve &lt;br /&gt;
local variables between function calls, while g77 does.   Preserved &lt;br /&gt;
local variables are for instance often used in implementations of quasi-random number &lt;br /&gt;
generators.  Proper Fortran requires such variables to be declared SAVE, &lt;br /&gt;
but not all old code does this.&lt;br /&gt;
Luckily, you can change gfortran's default behavior with the flag &lt;br /&gt;
&amp;lt;tt&amp;gt;-fno-automatic&amp;lt;/tt&amp;gt;.   For ifort, the corresponding flag is &amp;lt;tt&amp;gt;-noautomatic&amp;lt;/tt&amp;gt;.&lt;br /&gt;
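For example (the source file name is hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gfortran -fno-automatic -O2 -o mycode mycode.f&lt;br /&gt;
ifort -noautomatic -O2 -o mycode mycode.f&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;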
&lt;br /&gt;
===Where is libg2c.so?===&lt;br /&gt;
&lt;br /&gt;
libg2c.so is part of the g77 compiler, for which we dropped support. See [[#How can I get g77 to work?]] for our reasons.&lt;br /&gt;
&lt;br /&gt;
===Autoparallelization does not work!===&lt;br /&gt;
&lt;br /&gt;
I compiled my code with the &amp;lt;tt&amp;gt;-qsmp=omp,auto&amp;lt;/tt&amp;gt; option, and then I specified that it should be run with 64 threads - with &lt;br /&gt;
 export OMP_NUM_THREADS=64&lt;br /&gt;
&lt;br /&gt;
However, when I check the load using &amp;lt;tt&amp;gt;llq1 -n&amp;lt;/tt&amp;gt;, it shows a load on the node of 1.37.  Why?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Using the autoparallelization will only get you so far.  In fact, it usually does not do too much.  What is helpful is to run the compiler with the &amp;lt;tt&amp;gt;-qreport&amp;lt;/tt&amp;gt; option, and then read the output listing carefully to see where the compiler thought it could parallelize, where it could not, and the reasons for this.  Then you can go back to your code and carefully try to address each of the issues brought up by the compiler.&lt;br /&gt;
We ''emphasize'' that this is just a rough first guide, and that the compilers are still not magical!   For more sophisticated approaches to parallelizing your code, email us at [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;]  to set up an appointment with one&lt;br /&gt;
of our technical analysts.&lt;br /&gt;
&lt;br /&gt;
===How do I link against the Intel Math Kernel Library?===&lt;br /&gt;
&lt;br /&gt;
If you need to link in the Intel Math Kernel Library (MKL) libraries, you are well advised to use the Intel(R) Math Kernel Library Link Line Advisor: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/ for help in devising the list of libraries to link with your code.&lt;br /&gt;
&lt;br /&gt;
'''''Note that this gives the link line for the command line. When using this in Makefiles, replace $MKLPATH by ${MKLPATH}.'''''&lt;br /&gt;
&lt;br /&gt;
'''''Note too that, unless the integer arguments you will be passing to the MKL libraries are actually 64-bit integers, rather than the normal int or INTEGER types, you want to specify 32-bit integers (lp64) .'''''&lt;br /&gt;
&lt;br /&gt;
===Can the compilers on the login nodes be disabled to prevent accidentally using them?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can accomplish this by modifying your .bashrc to not load the compiler modules. See [[Important .bashrc guidelines]].&lt;br /&gt;
&lt;br /&gt;
===&amp;quot;relocation truncated to fit: R_X86_64_PC32&amp;quot;: Huh?===&lt;br /&gt;
&lt;br /&gt;
What does this mean, and why can't I compile this code?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Welcome to the joys of the x86-64 architecture!  You're probably having trouble building arrays larger than 2GB, individually or together.   Generally, you have to use the medium or large x86-64 `memory model'.   For the Intel compilers, this is specified with the compile options&lt;br /&gt;
&lt;br /&gt;
  -mcmodel=medium -shared-intel&lt;br /&gt;
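For instance, a full compile line might look like this (file names hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ifort -mcmodel=medium -shared-intel -O2 -o bigarrays bigarrays.f90&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;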
&lt;br /&gt;
===&amp;quot;feupdateenv is not implemented and will always fail&amp;quot;===&lt;br /&gt;
&lt;br /&gt;
How do I get rid of this and what does it mean?&lt;br /&gt;
 &lt;br /&gt;
'''Answer:'''&lt;br /&gt;
First note that, as ominous as it sounds, this is really just a warning, and has to do with the Intel math library. You can ignore it (unless you really are trying to manually change the exception handlers for floating point exceptions such as divide by zero), or take the safe road and get rid of it by linking with the Intel math functions library:&amp;lt;pre&amp;gt;-limf&amp;lt;/pre&amp;gt;See also [[#How do I link against the Intel Math Kernel Library?]]&lt;br /&gt;
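For example, a link line might then look like this (program and file names hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
icc -O2 -o mycode mycode.c -limf&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;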
&lt;br /&gt;
===Cannot find rdmacm library when compiling on GPC===&lt;br /&gt;
&lt;br /&gt;
I get the following error building my code on GPC: &amp;quot;&amp;lt;tt&amp;gt;ld: cannot find -lrdmacm&amp;lt;/tt&amp;gt;&amp;quot;.  Where can I find this library?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This library is part of the MPI libraries; if your compiler is having problems picking it up, it probably means you are mistakenly trying to compile on the login nodes (scinet01..scinet04).  The login nodes aren't part of the GPC; they are for logging into the data centre only.  From there you must go to the GPC or TCS development nodes to do any real work.&lt;br /&gt;
&lt;br /&gt;
=== Why do I get this error when I try to compile: &amp;quot;icpc: error #10001: could not find directory in which /usr/bin/g++41 resides&amp;quot; ?===&lt;br /&gt;
&lt;br /&gt;
You are trying to compile on the login nodes.   As described in the wiki ( https://support.scinet.utoronto.ca/wiki/index.php/GPC_Quickstart#Login ), or in the user's guide you received with your account, SciNet supports two main clusters with very different architectures.  Compilation must be done on the development nodes of the appropriate cluster (in this case, gpc01-04).   Thus, log into gpc01, gpc02, gpc03, or gpc04, and compile from there.&lt;br /&gt;
&lt;br /&gt;
==Testing your Code==&lt;br /&gt;
&lt;br /&gt;
=== Can I run something for a short time on the development nodes? ===&lt;br /&gt;
&lt;br /&gt;
I am in the process of playing around with the mpi calls in my code to get it to work. I do a lot of tests and each of them takes a couple of seconds only.  Can I do this on the development nodes?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes, as long as it's very brief (a few minutes).   People use the development nodes&lt;br /&gt;
for their work, and you don't want to bog the nodes down for everyone else; testing a real&lt;br /&gt;
code can chew up a lot more resources than compiling, etc.    The procedure differs&lt;br /&gt;
depending on what machine you're using.&lt;br /&gt;
&lt;br /&gt;
==== TCS ====&lt;br /&gt;
&lt;br /&gt;
On the TCS you can run small MPI jobs on the tcs02 node, which is meant for &lt;br /&gt;
development use.  But even for this test run on one node, you'll need a host file --&lt;br /&gt;
a list of hosts (in this case, all tcs-f11n06, which is the `real' name of tcs02)&lt;br /&gt;
that the job will run on.  Create a file called `hostfile' containing the following:&lt;br /&gt;
&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
 tcs-f11n06&lt;br /&gt;
&lt;br /&gt;
for a 4-task run.  When you invoke &amp;quot;poe&amp;quot; or &amp;quot;mpirun&amp;quot;, there are runtime&lt;br /&gt;
arguments that you specify pointing to this file.  You can also specify it&lt;br /&gt;
in an environment variable MP_HOSTFILE, so, if your file is in your /scratch directory, say &lt;br /&gt;
${SCRATCH}/hostfile, then you would do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 export MP_HOSTFILE=${SCRATCH}/hostfile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
in your shell.  You will also need to create a &amp;lt;tt&amp;gt;.rhosts&amp;lt;/tt&amp;gt; file in your &lt;br /&gt;
home directory, again listing &amp;lt;tt&amp;gt;tcs-f11n06&amp;lt;/tt&amp;gt; so that &amp;lt;tt&amp;gt;poe&amp;lt;/tt&amp;gt;&lt;br /&gt;
can start jobs.   After that you can simply run your program.  You can use&lt;br /&gt;
mpiexec:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 mpiexec -n 4 my_test_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
adding &amp;lt;tt&amp;gt; -hostfile /path/to/my/hostfile&amp;lt;/tt&amp;gt; if you did not set the environment&lt;br /&gt;
variable above.  Alternatively, you can run it with the poe command (do a &amp;quot;man poe&amp;quot; for details), or even by&lt;br /&gt;
just directly running it.  In this case the number of MPI processes will by default&lt;br /&gt;
be the number of entries in your hostfile.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GPC ====&lt;br /&gt;
&lt;br /&gt;
On the GPC one can run short test jobs on the [[GPC_Quickstart#Compile.2FDevel_Nodes | development nodes ]] &amp;lt;tt&amp;gt;gpc01&amp;lt;/tt&amp;gt;..&amp;lt;tt&amp;gt;gpc04&amp;lt;/tt&amp;gt;;&lt;br /&gt;
if they are single-node jobs (which they should be) they don't need a hostfile.  Even better, though, is to request an [[ Moab#Interactive | interactive ]] job and run the tests either in the regular batch queue or in the short, high-availability [[ Moab#debug | debug ]] queue that is reserved for this purpose.&lt;br /&gt;
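For instance, an interactive one-node session in the debug queue can be requested with something like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -l nodes=1:ppn=8,walltime=1:00:00 -q debug -I&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;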
&lt;br /&gt;
=== How do I run a longer (but still shorter than an hour) test job quickly ? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer'''&lt;br /&gt;
&lt;br /&gt;
On the GPC there is a high turnover short queue called [[ Moab#debug | debug ]] that is designed for&lt;br /&gt;
this purpose.  You can use it by adding &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -q debug&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
to your submission script.&lt;br /&gt;
&lt;br /&gt;
==Running your jobs==&lt;br /&gt;
&lt;br /&gt;
===My job can't write to /home===&lt;br /&gt;
&lt;br /&gt;
My code works fine when I test on the development nodes, but when I submit a job, or even run interactively in the development queue on GPC, it fails.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
As [[Data_Management#Home_Disk_Space | discussed]] [https://support.scinet.utoronto.ca/wiki/images/5/54/SciNet_Tutorial.pdf elsewhere], &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; is mounted read-only on the compute nodes; you can only write to &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; from the login nodes and devel nodes.  (The [[GPC_Quickstart#128Glargemem | largemem nodes]] on GPC, in this respect, are more like devel nodes than compute nodes).   In general, to run jobs you can read from &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; but you'll have to write to &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt; (or, if you were allocated space through the LRAC/NRAC process, on &amp;lt;tt&amp;gt;/project&amp;lt;/tt&amp;gt;).  More information on SciNet filesytems can be found on our [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===Error Submitting My Job: qsub: Bad UID for job execution MSG=ruserok failed ===&lt;br /&gt;
&lt;br /&gt;
I write up a submission script as in the examples, but when I attempt to submit the job, I get the above error.  What's wrong?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This error will occur if you try to submit a job from the login nodes.   The login nodes are the gateway to all of SciNet's systems (GPC, TCS, P7, ARC), which have different hardware and queueing systems.  To submit a job, you must log into a development node for the particular cluster you are submitting to and submit from there.&lt;br /&gt;
&lt;br /&gt;
===OpenMP on the TCS===&lt;br /&gt;
&lt;br /&gt;
How do I run an OpenMP job on the TCS?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[TCS_Quickstart#Submission_Script_for_an_OpenMP_Job | TCS Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===Can I can use hybrid codes consisting of MPI and openMP on the GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Yes. Please look at the [[GPC_Quickstart#Hybrid_MPI.2FOpenMP_jobs | GPC Quickstart ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I run serial jobs on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
It should be said first that SciNet is a parallel computing resource, &lt;br /&gt;
and our priority will always be parallel jobs.   Having said that, if &lt;br /&gt;
you can make efficient use of the resources using serial jobs and get &lt;br /&gt;
good science done, that's good too, and we're happy to help you.&lt;br /&gt;
&lt;br /&gt;
The GPC nodes each have 8 processing cores, and making efficient use of these &lt;br /&gt;
nodes means using all eight cores.  As a result, we'd like to have the &lt;br /&gt;
users take up whole nodes (eg, run multiples of 8 jobs) at a time.  &lt;br /&gt;
&lt;br /&gt;
The best strategy depends on the nature of your job. Several approaches are presented on the [[User_Serial|serial run wiki page]]; a minimal sketch of the basic idea is shown below.&lt;br /&gt;
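This is only a rough sketch, assuming a hypothetical serial program &amp;lt;tt&amp;gt;my_serial_code&amp;lt;/tt&amp;gt; and numbered input files; see the [[User_Serial|serial run wiki page]] for robust recipes:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=12:00:00&lt;br /&gt;
#PBS -N serialx8&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
# start 8 serial runs, one per core, in the background&lt;br /&gt;
for i in $(seq 1 8); do&lt;br /&gt;
    ./my_serial_code input.$i &amp;gt; output.$i 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
done&lt;br /&gt;
# wait for all 8 runs to finish before the job exits&lt;br /&gt;
wait&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;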
&lt;br /&gt;
===Why can't I request only a single cpu for my job on GPC?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
On the GPC, resources are allocated by the node - that is, in chunks of 8 processors.   If you want to run jobs that require only one processor each, you need to bundle them into groups of 8, so as not to waste the other 7 cores for up to 48 hours. See the [[User_Serial|serial run wiki page]].&lt;br /&gt;
&lt;br /&gt;
===How do I run serial jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''': You don't.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===But in the queue I found a user who is running jobs on GPC, each of which is using only one processor, so why can't I?===&lt;br /&gt;
&lt;br /&gt;
'''Answer''':&lt;br /&gt;
&lt;br /&gt;
The pradat* and atlaspt* jobs, amongst others, are jobs of the ATLAS high energy physics project. That they are reported as single cpu jobs is an artifact of the moab scheduler. They are in fact being automatically bundled into 8-job bundles but have to run individually to be compatible with their international grid-based systems.&lt;br /&gt;
&lt;br /&gt;
===How do I use the ramdisk on GPC?===&lt;br /&gt;
&lt;br /&gt;
To use the ramdisk, create, write, and read files in /dev/shm/.. just as one would in (eg) ${SCRATCH}. Only the amount of RAM needed to store the files will be taken up by the temporary file system; thus if you have 8 serial jobs each requiring 1 GB of RAM, and 1GB is taken up by various OS services, you would still have approximately 7GB available to use as ramdisk on a 16GB node. However, if you were to write 8 GB of data to the RAM disk, this would exceed the available memory and your job would likely crash.&lt;br /&gt;
&lt;br /&gt;
It is very important to delete your files from ram disk at the end of your job. If you do not do this, the next user to use that node will have less RAM available than they might expect, and this might kill their jobs.&lt;br /&gt;
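A minimal sketch (file and program names are hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cp ${SCRATCH}/mydata.in /dev/shm/&lt;br /&gt;
./my_code /dev/shm/mydata.in /dev/shm/mydata.out&lt;br /&gt;
cp /dev/shm/mydata.out ${SCRATCH}/&lt;br /&gt;
rm -f /dev/shm/mydata.*      # always clean up the ramdisk at the end of the job&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;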
&lt;br /&gt;
''More details on how to setup your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].''&lt;br /&gt;
&lt;br /&gt;
===How can I automatically resubmit a job?===&lt;br /&gt;
&lt;br /&gt;
Commonly you may have a job that you know will take longer to run than what is &lt;br /&gt;
permissible in the queue.  As long as your program contains [[Checkpoints|checkpoint]] or &lt;br /&gt;
restart capability, you can have one job automatically submit the next. In&lt;br /&gt;
the following example it is assumed that the program finishes before &lt;br /&gt;
the 48 hour limit and then resubmits itself by logging into one&lt;br /&gt;
of the development nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example submission script for auto resubmission&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
#&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code&lt;br /&gt;
&lt;br /&gt;
# RESUBMIT 10 TIMES HERE&lt;br /&gt;
num=${NUM:-0}     # NUM is passed in via qsub -v; default to 0 on the first, manual submission&lt;br /&gt;
if [ $num -lt 10 ]; then&lt;br /&gt;
      num=$(($num+1))&lt;br /&gt;
      ssh gpc01 &amp;quot;cd $PBS_O_WORKDIR; qsub -v NUM=$num ./script_name.sh&amp;quot;;&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub script_name.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can alternatively use [[ Moab#Job_Dependencies | Job dependencies ]] through the queuing system which will not start one job until another job has completed.&lt;br /&gt;
&lt;br /&gt;
If your job can't be made to automatically stop before the 48 hour queue window, but it does write out checkpoints, you can use the timeout command to stop the program while you still have time to resubmit; for instance&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
    timeout 2850m ./run_my_code argument1 argument2&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
will run the program for 47.5 hours (2850 minutes), and then send it a SIGTERM to make it exit, leaving time to resubmit within the 48-hour window.&lt;br /&gt;
&lt;br /&gt;
===How can I pass in arguments to my submission script?===&lt;br /&gt;
&lt;br /&gt;
If you wish to make your scripts more generic you can use qsub's ability &lt;br /&gt;
to pass in environment variables to pass in arguments to your script.&lt;br /&gt;
The following example shows a case where an input and an output &lt;br /&gt;
file are passed in on the qsub line. Multiple variables can be &lt;br /&gt;
passed in using the qsub &amp;quot;-v&amp;quot; option and comma delimited. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# MOAB/Torque example of passing in arguments&lt;br /&gt;
# SciNet GPC&lt;br /&gt;
# &lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=48:00:00&lt;br /&gt;
#PBS -N my_job&lt;br /&gt;
&lt;br /&gt;
# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
&lt;br /&gt;
# YOUR CODE HERE&lt;br /&gt;
./run_my_code -f $INFILE -o $OUTFILE&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
qsub -v INFILE=input.txt,OUTFILE=outfile.txt script_name.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== How can I run a job longer than 48 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The SciNet queues have a queue limit of 48 hours.   This is pretty typical for systems of its size in Canada and elsewhere, and larger systems commonly have shorter limits.   The limits are there to ensure that every user gets a fair share of the system (so that no one user ties up lots of nodes for a long time), and for safety (so that if one memory board in one node fails in the middle of a very long job, you haven't lost a month's worth of work).&lt;br /&gt;
&lt;br /&gt;
Since many of us have simulations that require more than that much time, most widely-used scientific applications have &amp;quot;checkpoint-restart&amp;quot; functionality, where every so often the complete state of the calculation is stored as a checkpoint file, and one can restart a simulation from one of these.   In fact, these restart files tend to be quite useful for a number of purposes.&lt;br /&gt;
&lt;br /&gt;
If your job will take longer, you will have to submit your job in multiple parts, restarting from a checkpoint each time.  In this way, one can run a simulation much longer than the queue limit.  In fact, one can even write job scripts which automatically re-submit themselves until a run is completed, using [[FAQ#How_can_I_automatically_resubmit_a_job.3F | automatic resubmission. ]]&lt;br /&gt;
&lt;br /&gt;
=== Why did showstart say it would take 3 hours for my job to start before, and now it says my job will start in 10 hours? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please look at the [[FAQ#How_do_priorities_work.2Fwhy_did_that_job_jump_ahead_of_mine_in_the_queue.3F | How do priorities work/why did that job jump ahead of mine in the queue? ]] page.&lt;br /&gt;
&lt;br /&gt;
===How do priorities work/why did that job jump ahead of mine in the queue?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The [[Moab | queueing system]] used on SciNet machines is a [http://en.wikipedia.org/wiki/Priority_queue Priority Queue].  Jobs enter the queue at the back of the queue, and slowly make their way to the front as those ahead of them are run; but a job that enters the queue with a higher priority can `cut in line'.&lt;br /&gt;
&lt;br /&gt;
The main factor which determines priority is whether or not the user (or their PI) has an [http://wiki.scinethpc.ca/wiki/index.php/Application_Process LRAC or NRAC allocation].  These are competitively allocated grants of computer time; there is a call for proposals towards the end of every calendar year.    Users with an allocation have high priorities in an attempt to make sure that they can use the amount of computer time the committees granted them.   Their priority decreases as they approach their allotted usage over the current window of time; by the time that they have exhausted that allotted usage, their priority is the same as users with no allocation (unallocated, or `default' users).    Unallocated users have a fixed, low, priority.&lt;br /&gt;
&lt;br /&gt;
This priority system is called `fairshare'; the scheduler attempts to make sure everyone has their fair share of the machines, where the share that's fair has been determined by the allocation committee.    The fairshare window is a rolling window of two weeks; that is, any time you have a job in the queue, the fairshare calculation of its priority is given by how much of your allocation of the machine has been used in the last 14 days.&lt;br /&gt;
&lt;br /&gt;
A particular allocation might have some fraction of the GPC - say 4% of the machine (if the PI had been allocated 10 million CPU hours on GPC). The allocations have labels (called `Resource Allocation Proposal Identifiers', or RAPIs) that look something like&lt;br /&gt;
&lt;br /&gt;
  abc-123-ab&lt;br /&gt;
&lt;br /&gt;
where abc-123 is the PI's CCRI, and the suffix specifies which of the allocations granted to the PI is to be used.  These can be specified on a job-by-job basis.  On GPC, one adds the line&lt;br /&gt;
 #PBS -A RAPI&lt;br /&gt;
to your script; on TCS, one uses&lt;br /&gt;
 # @ account_no = RAPI&lt;br /&gt;
If the allocation to charge isn't specified, a default is used; each user has such a default, which can be changed at the same portal where one changes one's password:&lt;br /&gt;
&lt;br /&gt;
 https://portal.scinet.utoronto.ca/&lt;br /&gt;
&lt;br /&gt;
A job's priority is determined primarily by the fairshare priority of the allocation it is being charged to; the previous 14 days' worth of use under that allocation is calculated and compared to the allocated fraction (here, 4%) of the machine over that window (here, 14 days).   The fairshare priority is a decreasing function of the allocation left; if there is no allocation left (eg, jobs running under that allocation have already used 379,038 CPU hours in the past 14 days), the priority is the same as that of a user with no granted allocation.   (This last part has been the topic of some debate; as the machine gets more utilized, it will probably be the case that we allow RAC users who have greatly overused their quota to have their priorities drop below that of unallocated users, to give the unallocated users some chance to run on our increasingly crowded system; this would have no undue effect on our allocated users, as they would still be able to use the amount of resources they had been allocated by the committees.)   Note that all jobs charging the same allocation get the same fairshare priority.&lt;br /&gt;
&lt;br /&gt;
There are other factors that go into calculating priority, but fairshare is the most significant.   Other factors include&lt;br /&gt;
* amount of time waiting in queue (measured in units of the requested runtime).   A job that requests 1 hour in the queue and has been waiting 2 days will get a bump in its priority larger than a job that requests 2 days and has been waiting the same time.&lt;br /&gt;
* User adjustment of priorities ( See below ).&lt;br /&gt;
&lt;br /&gt;
The major effect of these subdominant terms is to shuffle the order of jobs running under the same allocation.&lt;br /&gt;
&lt;br /&gt;
===How do we manage job priorities within our research group?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Obviously, managing shared resources within a large group - whether it &lt;br /&gt;
is conference funding or CPU time - takes some doing.   &lt;br /&gt;
&lt;br /&gt;
It's important to note that the fairshare periods are intentionally kept &lt;br /&gt;
quite short - just two weeks long.   (These exact numbers are subject to &lt;br /&gt;
change as the year goes on and we better understand use patterns, but &lt;br /&gt;
they're unlikely to change radically).   So, for example, let us say that in your resource &lt;br /&gt;
allocation you have about 10% of the machine.   Then for someone to use &lt;br /&gt;
up the whole two week amount of time in 2 days, they'd have to use 70% &lt;br /&gt;
of the machine in those two days - which is unlikely to happen by &lt;br /&gt;
accident.  If that does happen,  &lt;br /&gt;
those using the same allocation as the person who used 70% of the &lt;br /&gt;
machine over the two days will suffer by having much lower priority for &lt;br /&gt;
their jobs, but only for the next 12 days - and even then, if there are &lt;br /&gt;
idle cpus they'll still be able to compute.&lt;br /&gt;
&lt;br /&gt;
There will be online tools for seeing how the allocation is being used, &lt;br /&gt;
and those people who are in charge in your group will be able to use &lt;br /&gt;
that information to manage the users, telling them to dial it down or &lt;br /&gt;
up.   We know that managing a large research group is hard, and we want &lt;br /&gt;
to make sure we provide you the information you need to do your job &lt;br /&gt;
effectively.&lt;br /&gt;
&lt;br /&gt;
One way for users within a group to manage their priorities within the group&lt;br /&gt;
is with [[Moab#Adjusting_Job_Priority | user-adjusted priorities]]; this is&lt;br /&gt;
described in more detail on the [[Moab | Scheduling System]] page.&lt;br /&gt;
&lt;br /&gt;
=== How do I charge jobs to my NRAC/LRAC allocation? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see the [[Moab#Accounting|accounting section of Moab page]].&lt;br /&gt;
&lt;br /&gt;
=== How does one check the amount of used CPU-hours in a project, and how does one get statistics for each user in the project? ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This information is available on the SciNet portal, https://portal.scinet.utoronto.ca. See also [[SciNet Usage Reports]].&lt;br /&gt;
&lt;br /&gt;
=== How does the Infiniband Upgrade affect my 2012 NRAC allocation ?===&lt;br /&gt;
&lt;br /&gt;
The NRAC allocations for the current (2012) year that were based on ethernet and infiniband will carry over; however, the allocation will be on the full GPC, not just the subsection.  So if you were allocated 500 hours on InfiniBand, your fairshare allocation will still be 500 hours, just 500 out of 30,000 instead of 500 out of 7,000.  If you received two allocations, one on gigE and one on IB, they will simply be combined. This should benefit all users, as the desegregation of the GPC provides a greater pool of nodes, increasing the probability that your job will run.&lt;br /&gt;
&lt;br /&gt;
==Monitoring jobs in the queue==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Why hasn't my job started?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Use the moab command &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
checkjob -v jobid&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and the last couple of lines should explain why a job hasn't started.  &lt;br /&gt;
&lt;br /&gt;
Please see [[Moab| Job Scheduling System (Moab) ]] for more detailed information&lt;br /&gt;
&lt;br /&gt;
===How do I figure out when my job will run?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Moab#Available_Resources| Job Scheduling System (Moab) ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- ===My GPC job is Held, and checkjob says &amp;quot;Batch:PolicyViolation&amp;quot; ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When this happens, you'll see your job stuck in a BatchHold state.  &lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues, and is being held until you modify it or kill it and re-submit a conforming job.  The most common problems are:&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===I submit my GPC job, and I get an email saying it was rejected===&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
This happens because the job you've submitted breaks one of the rules of the queues and is rejected. An email&lt;br /&gt;
is sent with the JOBID, JOBNAME, and the reason it was rejected.  The following is an example where a job&lt;br /&gt;
requests more than 48 hours and was rejected.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462493.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'wclimit too high for class 'batch_ib' (345600 &amp;gt; 172800)'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Jobs on the TCS or GPC may only run for 48 hours at a time; this restriction greatly increases responsiveness of the queue and queue throughput for all our users.  If your computation requires longer than that, as many do, you will have to [[ Checkpoints | checkpoint ]] your job and restart it after each 48-hour queue window.   You can manually re-submit jobs, or if you can have your job cleanly exit before the 48 hour window, there are ways to [[ FAQ#How_can_I_automatically_resubmit_a_job.3F | automatically resubmit jobs ]].&lt;br /&gt;
&lt;br /&gt;
Other rejections return a more cryptic error saying &amp;quot;job violates class configuration&amp;quot; such as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 3462409.gpc-sched&lt;br /&gt;
Job Name:   STDIN&lt;br /&gt;
job deleted&lt;br /&gt;
Job deleted at request of root@gpc-sched&lt;br /&gt;
MOAB_INFO:  job was rejected - job violates class configuration 'user required by class 'batch''&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The most common problems that result in this error are:&lt;br /&gt;
&lt;br /&gt;
* '''Incorrect number of processors per node''': Jobs on the GPC are scheduled per-node not per-core and since each node has 8 processor cores (ppn=8) the smallest job allowed is one node with 8 cores (nodes=1:ppn=8).  For serial jobs users must bundle or batch them together in groups of 8. See [[ FAQ#How_do_I_run_serial_jobs_on_GPC.3F | How do I run serial jobs on GPC? ]]&lt;br /&gt;
* '''No number of nodes specified''': Jobs submitted to the main queue must request a specific number of nodes, either in the submission script (with a line like &amp;lt;tt&amp;gt;#PBS -l nodes=2:ppn=8&amp;lt;/tt&amp;gt;) or on the command line (eg, &amp;lt;tt&amp;gt;qsub -l nodes=2:ppn=8,walltime=5:00:00 script.pbs&amp;lt;/tt&amp;gt;).  Note that for the debug queue, you can get away without specifying a number of nodes and a default of one will be assigned; for both technical and policy reasons, we do not enforce such a default for the main (&amp;quot;batch&amp;quot;) queue.&lt;br /&gt;
* '''There is a 15 minute walltime minimum''' on all queues except debug and if you set your walltime less than this, it will be rejected.&lt;br /&gt;
&lt;br /&gt;
===How can I monitor my running jobs on TCS?===&lt;br /&gt;
&lt;br /&gt;
How can I monitor the load of TCS jobs?&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
You can get more information with the command &lt;br /&gt;
 /xcat/tools/tcs-scripts/LL/jobState.sh&lt;br /&gt;
which I alias as:&lt;br /&gt;
 alias llq1='/xcat/tools/tcs-scripts/LL/jobState.sh'&lt;br /&gt;
If you run &amp;quot;llq1 -n&amp;quot; you will see a listing of jobs together with a lot of information, including the load.&lt;br /&gt;
&lt;br /&gt;
==Errors in running jobs==&lt;br /&gt;
&lt;br /&gt;
===On GPC, `Job cannot be executed'===&lt;br /&gt;
&lt;br /&gt;
I get error messages like this trying to run on GPC:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
Aborted by PBS Server &lt;br /&gt;
Job cannot be executed&lt;br /&gt;
See Administrator for help&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
PBS Job Id: 30414.gpc-sched&lt;br /&gt;
Job Name:   namd&lt;br /&gt;
Exec host:  gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node 'gpc-f120n011/7+gpc-f120n011/6+gpc-f120n011/5+gpc-f120n011/4+gpc-f120n011/3+gpc-f120n011/2+gpc-f120n011/1+gpc-f120n011/0' for job 30414.gpc-sched&lt;br /&gt;
&lt;br /&gt;
Unable to copy file 30414.gpc-sched.OU to USER@gpc-f101n084.scinet.local:/scratch/G/GROUP/USER/projects/sim-performance-test/runtime/l/namd/8/namd.o30414&lt;br /&gt;
*** error from copy&lt;br /&gt;
30414.gpc-sched.OU: No such file or directory&lt;br /&gt;
*** end error output&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Try doing the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir ${SCRATCH}/.pbs_spool&lt;br /&gt;
ln -s ${SCRATCH}/.pbs_spool ~/.pbs_spool&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is how all new accounts are set up on SciNet.&lt;br /&gt;
&lt;br /&gt;
On the GPC compute nodes, &amp;lt;tt&amp;gt;/home&amp;lt;/tt&amp;gt; is mounted as a read-only file system.  PBS by default tries to spool its output files to &amp;lt;tt&amp;gt;${HOME}/.pbs_spool&amp;lt;/tt&amp;gt;, which fails because it cannot write to a read-only file system.&lt;br /&gt;
New accounts at SciNet get around this by having ${HOME}/.pbs_spool point to somewhere appropriate on &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt;, but if you've deleted that link or directory, or have an old account, you will see errors like the above.&lt;br /&gt;
&lt;br /&gt;
'''On Feb 24, the input/output mechanism was reconfigured to use a local ramdisk as the temporary location, which means that .pbs_spool is no longer needed and this error should no longer occur.'''&lt;br /&gt;
&lt;br /&gt;
=== I couldn't find the  .o output file in the .pbs_spool directory as I used to ===&lt;br /&gt;
&lt;br /&gt;
On Feb 24 2011, the temporary location of standard output and error files was moved from the shared file system ${SCRATCH}/.pbs_spool to the&lt;br /&gt;
node-local directory /var/spool/torque/spool (which resides in ram). The final location after a job has finished is unchanged,&lt;br /&gt;
but to check the output/error of running jobs, users will now have to ssh into the (first) node assigned to the job and look in&lt;br /&gt;
/var/spool/torque/spool.&lt;br /&gt;
&lt;br /&gt;
This alleviates access contention to the temporary directory, especially for those users that are running a lot of jobs, and  reduces the burden on the file system in general.&lt;br /&gt;
&lt;br /&gt;
Note that it is good practice to redirect output to a file rather than to count on the scheduler to do this for you.&lt;br /&gt;
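&lt;br /&gt;
For example, inside your job script you could capture the program's output yourself with a redirection like the following (the file name is just an example; $PBS_JOBID is the job identifier assigned by the scheduler):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# capture stdout and stderr of the run in a log file in the working directory&lt;br /&gt;
./mycode &amp;gt; mycode-${PBS_JOBID}.log 2&amp;gt;&amp;amp;1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;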
&lt;br /&gt;
=== My GPC job died, telling me `Copy Stageout Files Failed' ===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
When a job runs on GPC, the script's standard output and error are redirected to &lt;br /&gt;
&amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.OU&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBID.gpc-sched.ER&amp;lt;/tt&amp;gt; in&lt;br /&gt;
/var/spool/torque/spool on the (first) node on which your job is running.  At the end of the job, those .OU and .ER files are copied to where the batch script tells them to be copied, by default &amp;lt;tt&amp;gt;$PBS_JOBNAME.o$PBS_JOBID&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;$PBS_JOBNAME.e$PBS_JOBID&amp;lt;/tt&amp;gt;.   (You can set those filenames to be something clearer with the -e and -o options in your PBS script.)&lt;br /&gt;
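&lt;br /&gt;
For instance, a job script header along the following lines would send the output and error streams to files of your choosing (the file names here are only illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -N myrun&lt;br /&gt;
#PBS -o myrun.out&lt;br /&gt;
#PBS -e myrun.err&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;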
&lt;br /&gt;
When you get errors like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
An error has occurred processing your job, see below.&lt;br /&gt;
request to copy stageout files failed on node&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
it means that the copying back process has failed in some way.  There could be a few reasons for this. The first thing to check is to '''make sure that your .bashrc does not produce any output''', as the output stageout is performed by bash and extra output can cause it to fail.&lt;br /&gt;
It could also have been a random filesystem error, or your job may have failed spectacularly enough to short-circuit the normal job-termination process, so that those files simply never got copied.&lt;br /&gt;
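&lt;br /&gt;
One common way to keep a .bashrc quiet for batch jobs is to print messages only in interactive shells; a sketch (the echoed message is just an example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# in ~/.bashrc: only produce output when the shell is interactive,&lt;br /&gt;
# so batch jobs and the output stageout see nothing extra&lt;br /&gt;
if [[ $- == *i* ]]; then&lt;br /&gt;
    echo &amp;quot;Welcome to SciNet&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;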
&lt;br /&gt;
Write to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;] if your input/output files got lost, as we will probably be able to retrieve them for you (please supply at least the jobid, and any other information that may be relevant). &lt;br /&gt;
&lt;br /&gt;
Keep in mind that it is good practice to redirect output to a file rather than depending on the job scheduler to do this for you.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&lt;br /&gt;
===Another transport will be used instead===&lt;br /&gt;
&lt;br /&gt;
I get error messages like the following when running on the GPC at the start of the run, although the job seems to proceed OK.   Is this a problem?&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
[[45588,1],0]: A high-performance Open MPI point-to-point messaging module&lt;br /&gt;
was unable to find any relevant network interfaces:&lt;br /&gt;
&lt;br /&gt;
Module: OpenFabrics (openib)&lt;br /&gt;
  Host: gpc-f101n005&lt;br /&gt;
&lt;br /&gt;
Another transport will be used instead, although this may result in&lt;br /&gt;
lower performance.&lt;br /&gt;
--------------------------------------------------------------------------&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Everything's fine.   The two MPI libraries SciNet provides work for both the InfiniBand and the Gigabit Ethernet interconnects, and will always try to use the fastest interconnect available.   In this case, you ran on normal gigabit GPC nodes with no InfiniBand; but the MPI libraries have no way of knowing this, and try InfiniBand first anyway.  This is just a harmless `failover' message; it tried to use InfiniBand, which doesn't exist on this node, then fell back on using Gigabit Ethernet (`another transport').&lt;br /&gt;
&lt;br /&gt;
With OpenMPI, this can be avoided by not looking for infiniband; eg, by using the option&lt;br /&gt;
&lt;br /&gt;
--mca btl ^openib&lt;br /&gt;
&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===IB Memory Errors, eg &amp;lt;tt&amp;gt; reg_mr Cannot allocate memory &amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
Infiniband requires more memory than ethernet; it can use RDMA (remote direct memory access) transport for which it sets aside registered memory to transfer data.&lt;br /&gt;
&lt;br /&gt;
In our current network configuration, it requires a _lot_ more memory, particularly as you go to larger process counts; unfortunately, that means you can't get around the &amp;quot;I need more memory&amp;quot; problem the usual way, by running on more nodes.   Machines with different memory or &lt;br /&gt;
network configurations may exhibit this problem at higher or lower MPI &lt;br /&gt;
task counts.&lt;br /&gt;
&lt;br /&gt;
Right now, the best workaround is to reduce the number and size of OpenIB queues using XRC: with OpenMPI, add the following options to your mpirun command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-mca btl_openib_receive_queues X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32 -mca btl_openib_max_send_size 12288&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With Intel MPI, you should be able to do&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load intelmpi/4.0.3.008&lt;br /&gt;
mpirun -genv I_MPI_FABRICS=shm:ofa  -genv I_MPI_OFA_USE_XRC=1 -genv I_MPI_OFA_DYNAMIC_QPS=1 -genv I_MPI_DEBUG=5 -np XX ./mycode&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
to the same end.  &lt;br /&gt;
&lt;br /&gt;
For more information see [[GPC MPI Versions]].&lt;br /&gt;
&lt;br /&gt;
===My compute job fails, saying &amp;lt;tt&amp;gt;libpng12.so.0: cannot open shared object file&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;libjpeg.so.62: cannot open shared object file&amp;lt;/tt&amp;gt;===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
To maximize the amount of memory available for compute jobs, the compute nodes have a less complete system image than the development nodes.   In particular, since graphics packages like matplotlib and gnuplot are usually used interactively, the libraries they depend on are included in the devel nodes' image but not in the compute nodes' image.&lt;br /&gt;
&lt;br /&gt;
Many of these extra libraries are, however, available in the &amp;quot;extras&amp;quot; module.   So adding a &amp;quot;module load extras&amp;quot; to your job submission  script - or, for overkill, to your .bashrc - should enable these scripts to run on the compute nodes.&lt;br /&gt;
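&lt;br /&gt;
As a sketch, the relevant lines of a submission script might look like this (the executable name is a placeholder):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#PBS -l nodes=1:ppn=8,walltime=1:00:00&lt;br /&gt;
cd $PBS_O_WORKDIR&lt;br /&gt;
# pull in the extra shared libraries (libpng, libjpeg, ...) on the compute nodes&lt;br /&gt;
module load extras&lt;br /&gt;
./my_plotting_script&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;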
&lt;br /&gt;
==Data on SciNet disks==&lt;br /&gt;
&lt;br /&gt;
===How do I find out my disk usage?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
The standard unix/linux utilities for finding the amount of disk space used by a directory are very slow, and notoriously inefficient on the GPFS filesystems that we run on the SciNet systems.  There are utilities that very quickly report your disk usage:&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;'''/scinet/gpc/bin/diskUsage'''&amp;lt;/tt&amp;gt; command, available on the login nodes, datamovers and the GPC devel nodes, provides information in a number of ways on the home, scratch, and project file systems: for instance, how much disk space is being used by yourself and your group (with the -a option), how much your usage has changed over a certain period (&amp;quot;delta information&amp;quot;), or plots of your usage over time.&lt;br /&gt;
Note that this information is only updated hourly.&lt;br /&gt;
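&lt;br /&gt;
For example (a sketch; only the -a option is described above, other flags may exist):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# usage summary for your own directories&lt;br /&gt;
/scinet/gpc/bin/diskUsage&lt;br /&gt;
# include the usage of everyone in your group&lt;br /&gt;
/scinet/gpc/bin/diskUsage -a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;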
&lt;br /&gt;
More information about these filesystems is available on the [[Data_Management | Data Management]] page.&lt;br /&gt;
&lt;br /&gt;
===How do I transfer data to/from SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
All incoming connections to SciNet go through relatively low-speed connections to the &amp;lt;tt&amp;gt;login.scinet&amp;lt;/tt&amp;gt; gateways, so using scp to copy files the same way you ssh in is not an effective way to move lots of data.  Better tools are described in our page on [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
&lt;br /&gt;
===My group works with data files of size 1-2 GB.  Is this too large to  transfer by scp to login.scinet.utoronto.ca ?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Generally, occasional transfers of less than 10GB are perfectly acceptable to do through the login nodes. See [[Data_Management#Data_Transfer | Data Transfer]].&lt;br /&gt;
&lt;br /&gt;
===How can I check if I have files in /scratch that are scheduled for automatic deletion?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Storage_Quickstart#Scratch_Disk_Purging_Policy | Storage At SciNet]]&lt;br /&gt;
&lt;br /&gt;
===How to allow my supervisor to manage files for me using ACL-based commands?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
&lt;br /&gt;
Please see [[Data_Management#File.2FOwnership_Management_.28ACL.29 | File/Ownership Management]]&lt;br /&gt;
&lt;br /&gt;
===Can we buy extra storage space on SciNet?===&lt;br /&gt;
&lt;br /&gt;
'''Answer:'''&lt;br /&gt;
Yes, please see [[Data_Management#Buying_storage_space_on_GPFS_or_HPSS | Buying storage space on GPFS or HPSS ]] for more details.&lt;br /&gt;
&lt;br /&gt;
==Keep 'em Coming!==&lt;br /&gt;
&lt;br /&gt;
===Next question, please===&lt;br /&gt;
&lt;br /&gt;
Send your question to [mailto:support@scinet.utoronto.ca &amp;lt;support@scinet.utoronto.ca&amp;gt;];  we'll answer it asap!&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=6073</id>
		<title>Oldwiki.scinet.utoronto.ca:System Alerts</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=6073"/>
		<updated>2013-05-04T15:18:10Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== System Status==&lt;br /&gt;
[[File:down.png|down|link=GPC Quickstart]]GPC&lt;br /&gt;
[[File:down.png|down|link=TCS Quickstart]]TCS&lt;br /&gt;
[[File:down.png|down|link=GPU Devel Nodes]]ARC&lt;br /&gt;
[[File:down.png|down|link=P7 Linux Cluster]]P7&lt;br /&gt;
[[File:down.png|down|link=BGQ]]BGQ&lt;br /&gt;
[[File:down.png|down|link=HPSS]]HPSS&lt;br /&gt;
&lt;br /&gt;
Systems unavailable due to power glitch at data center; will update shortly&lt;br /&gt;
&lt;br /&gt;
Last updated:  Tue May 4 11:17:52 EDT 2013&lt;br /&gt;
&lt;br /&gt;
([[Previous_messages:|Previous messages]])&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=6072</id>
		<title>Oldwiki.scinet.utoronto.ca:System Alerts</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Oldwiki.scinet.utoronto.ca:System_Alerts&amp;diff=6072"/>
		<updated>2013-05-04T15:17:54Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* System Status */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== System Status==&lt;br /&gt;
[[File:down.png|down|link=GPC Quickstart]]GPC&lt;br /&gt;
[[File:down.png|down|link=TCS Quickstart]]TCS&lt;br /&gt;
[[File:down.png|down|link=GPU Devel Nodes]]ARC&lt;br /&gt;
[[File:down.png|down|link=P7 Linux Cluster]]P7&lt;br /&gt;
[[File:down..png|down|link=BGQ]]BGQ&lt;br /&gt;
[[File:down..png|down|link=HPSS]]HPSS&lt;br /&gt;
&lt;br /&gt;
Systems unavailable due to power glitch at data center; will update shortly&lt;br /&gt;
&lt;br /&gt;
Last updated:  Tue May 4 11:17:52 EDT 2013&lt;br /&gt;
&lt;br /&gt;
([[Previous_messages:|Previous messages]])&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=File:Lecture23-2013.pdf&amp;diff=5938</id>
		<title>File:Lecture23-2013.pdf</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=File:Lecture23-2013.pdf&amp;diff=5938"/>
		<updated>2013-04-09T14:05:43Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: uploaded a new version of &amp;amp;quot;File:Lecture23-2013.pdf&amp;amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=File:Lecture23-2013.pdf&amp;diff=5937</id>
		<title>File:Lecture23-2013.pdf</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=File:Lecture23-2013.pdf&amp;diff=5937"/>
		<updated>2013-04-09T14:04:02Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5936</id>
		<title>Scientific Computing Course</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5936"/>
		<updated>2013-04-09T14:03:41Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Topics */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;''This wiki page concerns the 2013 installment of SciNet's Scientific Computing course. Material from the previous installment can be found on [[Scientific Software Development Course]], [[Numerical Tools for Physical Scientists (course)]], and [[High Performance Scientific Computing]]''&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
=Syllabus=&lt;br /&gt;
&lt;br /&gt;
==About the course==&lt;br /&gt;
* Whole-term graduate course&lt;br /&gt;
* Prerequisite: basic C, C++ or Fortran experience.&lt;br /&gt;
* Will use `C++ light' and Python&lt;br /&gt;
* Topics include: Scientific computing and programming skills, Parallel programming, and Hybrid programming.  &lt;br /&gt;
&lt;br /&gt;
There are three parts to this course:&lt;br /&gt;
&lt;br /&gt;
# Scientific Software Development: Jan/Feb 2013&amp;lt;br&amp;gt;''python, C++, git, make, modular programming, debugging''&lt;br /&gt;
# Numerical Tools for Physical Scientists: Feb/Mar 2013&amp;lt;br&amp;gt;''modelling, floating point, Monte Carlo, ODE, linear algebra, fft''&lt;br /&gt;
# High Performance Scientific Computing: Mar/Apr 2013&amp;lt;br&amp;gt;''openmp, mpi and hybrid programming''&lt;br /&gt;
&lt;br /&gt;
Each part consists of eight one-hour lectures, two per week.&lt;br /&gt;
&lt;br /&gt;
These can be taken separately by astrophysics graduate students at the University of Toronto as mini-courses, and by physics graduate students at the University of Toronto as modular courses.&lt;br /&gt;
&lt;br /&gt;
The first two parts count towards the SciNet Certificate in Scientific Computing, while the third part can count towards the SciNet HPC Certificate. For more info about the SciNet Certificates, see http://www.scinethpc.ca/2012/12/scinet-hpc-certificate-program.&lt;br /&gt;
&lt;br /&gt;
==Location and Times==&lt;br /&gt;
[http://www.scinethpc.ca/2010/08/contact-us SciNet HeadQuarters]&amp;lt;br&amp;gt;&lt;br /&gt;
256 McCaul Street, Toronto, ON&amp;lt;br&amp;gt;&lt;br /&gt;
Room 229 (Conference Room)&amp;lt;br&amp;gt;&lt;br /&gt;
Tuesdays 11:00 am - 12:00 noon&amp;lt;br&amp;gt;&lt;br /&gt;
Thursdays 11:00 am - 12:00 noon&lt;br /&gt;
&lt;br /&gt;
==Instructors and office hours==&lt;br /&gt;
&lt;br /&gt;
* Ramses van Zon - 256 McCaul Street, Rm 228 - Mondays 3-4pm&lt;br /&gt;
* L. Jonathan Dursi - 256 McCaul Street, Rm 216 - Wednesdays 3-4pm&lt;br /&gt;
&lt;br /&gt;
==Grading scheme==&lt;br /&gt;
&lt;br /&gt;
Attendance at lectures.&lt;br /&gt;
&lt;br /&gt;
Four homework sets (i.e., one per week), to be returned by email by 9:00 am the next Thursday.&lt;br /&gt;
&lt;br /&gt;
==Sign up==&lt;br /&gt;
Sign up for this graduate course goes through SciNet's course website.&amp;lt;br&amp;gt;The direct link is https://support.scinet.utoronto.ca/courses/?q=node/99.&amp;lt;br&amp;gt;  If you do not have a SciNet account but wish to register for this course, please email support@scinet.utoronto.ca . &amp;lt;br&amp;gt;&lt;br /&gt;
Sign up is closed.&lt;br /&gt;
&lt;br /&gt;
=Part 1: Scientific Software Development=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Some programming experience. Some unix prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need:'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin), and Python 2, IPython, Numpy, SciPy and Matplotlib (all of which you get if you use the Enthought distribution) installed on your laptop. Links are given at the bottom of this page.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
January 15, 17, 22, 24, 29, and 31, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
February 5 and 7, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics (with lecture slides and recordings)==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' C++ introduction===&lt;br /&gt;
:::[[File:Lecture1-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.html]]&lt;br /&gt;
:::[[Media:Lecture1-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' More C++, build and version control&amp;lt;br&amp;gt;===&lt;br /&gt;
:::[[File:Lecture2-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.html]]&lt;br /&gt;
:::Guest lecturer: Michael Nolta (CITA) for the git portion of the lecture.&lt;br /&gt;
:::[[Media:Lecture2-2013.pdf|C++ and Make slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.mp4 C++ and Make video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[Media:Git-Nolta.pdf|Git slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1|Homework assignment 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Python and visualization===&lt;br /&gt;
:::[[File:Lecture3-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.html]]&lt;br /&gt;
:::[[Media:Lecture3-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Modular programming, refactoring, testing===&lt;br /&gt;
:::[[File:Lecture4-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.html]]&lt;br /&gt;
:::[[Media:Lecture4-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;  [[#HW2|Homework assignment 2]]&lt;br /&gt;
:::[http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc (course project source file)] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py (corresponding python movie generator)]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Object oriented programming===&lt;br /&gt;
:::[[Media:Lecture5-2013.pdf|Slides]]&lt;br /&gt;
:::Recordings of this lecture are missing, but you could view the videos of SciNet's [[One-Day Scientific C++ Class]], in particular the parts on classes, polymorphism, and inheritance.&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' ODE, interpolation===&lt;br /&gt;
:::[[File:Lecture6-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Lecture5-ODE.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW3|Homework assignment 3]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Development tools: debugging and profiling===&lt;br /&gt;
:::[[File:Lecture7-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Debugging.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' Objects in Python, linking C++ and Python===&lt;br /&gt;
:::[[File:Lecture8-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.html]]&lt;br /&gt;
:::[[Media:Lecture8-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
==Homework assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
&lt;br /&gt;
'''''Multi-file C++ program to create a data file'''''&lt;br /&gt;
&lt;br /&gt;
We’ve learned programming in basic C++, use of make and Makefiles to build projects, and local use of git for version control. In this first assignment, you’ll use these to make a multi-file C++ program, built with make, which computes and outputs a data file.&lt;br /&gt;
&lt;br /&gt;
* Start a git repository, and begin writing a C++ program to&lt;br /&gt;
:# Get an array size and a standard deviation from user input,&lt;br /&gt;
:# Allocate a 2d array (use the code given in lecture 2),&lt;br /&gt;
:# Store a 2d Gaussian with a maximum at the centre of the array and given standard deviation (in units of grid points),&lt;br /&gt;
:# Output that array to a text file,&lt;br /&gt;
:# Free the array, and exit. &lt;br /&gt;
* The output text file should contain just the data in text format, with a row of the file corresponding to a row of the array and with whitespace between the numbers. &lt;br /&gt;
* The 2d array creation/freeing routines should be in one file (with an associated header file), the gaussian calculation be in another (ditto), and the output routine be in a third, with the main program calling each of these. &lt;br /&gt;
* Use a makefile to build your code (add it to the repository).&lt;br /&gt;
* You can start with everything in one file, with hardcoded values for sizes and standard deviation and a static array, then refactor things into multiple files, adding the other features.&lt;br /&gt;
* As a test, use the ipython executable that came with your Enthought python distribution to read your data and plot it.&amp;lt;br&amp;gt;If your data file is named ‘data.txt’, running the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ipython --pylab&lt;br /&gt;
In [1]: data = numpy.genfromtxt('data.txt') &lt;br /&gt;
In [2]: contour(data) &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
should give a nice contour plot of a 2-dimensional gaussian.&lt;br /&gt;
* Email your source code, makefile and the &amp;quot;git log&amp;quot; output of all your commits by 9:00 am Thursday Jan 24th, 2013. Please zip or tar these files together as one attachment, with a file name that includes your name and &amp;quot;HW1&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
===HW2===&lt;br /&gt;
'''''Refactor legacy code to a modular project with unit tests'''''&lt;br /&gt;
&lt;br /&gt;
In class, today, we talked about modular programming and testing, and the project we’ll be working on for the next three weeks. This homework will start advancing on that project by working on the “legacy” code given to us by our supervisor ([http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc]), with a corresponding python plotting script ([http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py]), and whipping it into shape before we start adding new physics.&lt;br /&gt;
* Start a git repository for this project, and add the two files.&lt;br /&gt;
* Create a Makefile and add it to the repository.&lt;br /&gt;
* Since we have no tests, run the program with console output redirected to a file:&lt;br /&gt;
:&amp;lt;pre&amp;gt;$ diffuse &amp;gt; original-output.txt&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;''It turns out the code has a bug that can make the output different when the same code is run again, which obviously would not be good for a baseline test. Replace 'float error;' by 'float error=0.0;' to fix this.''&lt;br /&gt;
* Also save the two .npy output files, e.g. to original-data.npy and original-theory.npy. The triplet of files (original-output.txt, original-data.npy and original-theory.npy) serves as a baseline integrated test (add these to the repository). &lt;br /&gt;
* Then write a 'test' target in your makefile that:&lt;br /&gt;
** Runs 'diffuse' with output to a new file.&lt;br /&gt;
** Compares the file with the baseline test file, and compares the .npy files.&lt;br /&gt;
:: (hint: the unix commands diff or cmp can compare files; a sketch of such a comparison is given after this list).&lt;br /&gt;
* First refactoring: Move the global variables into the main routine.&lt;br /&gt;
* ''Chorus: Test your modified code, and commit.''&lt;br /&gt;
* Second refactoring: Extract a diffusion operator routine, that gets called from main.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Create a .cc/.h module for the diffusion operator.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Add two tests for the diffusion operator: for a constant and for a linear input field (&amp;lt;tt&amp;gt;rho[i][j]=a*i+b*j&amp;lt;/tt&amp;gt;). Add these to the test target in the makefile.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* More refactoring: Extract three more .cc/.h modules:&lt;br /&gt;
** for output (should not contain hardcoded filenames)    &lt;br /&gt;
** computation of the theory&lt;br /&gt;
** and for the array allocation stuff.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Describe, but don't implement in the .h and .cc, what would be appropriate unit tests for these three modules.&lt;br /&gt;
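&lt;br /&gt;
As a rough sketch, the comparison commands run by the 'test' target mentioned above could look like the following (the new .npy file names are placeholders for whatever names diffuse actually writes):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# regenerate the output and compare it against the saved baseline&lt;br /&gt;
diffuse &amp;gt; new-output.txt&lt;br /&gt;
diff new-output.txt original-output.txt&lt;br /&gt;
cmp data.npy original-data.npy&lt;br /&gt;
cmp theory.npy original-theory.npy&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;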
&lt;br /&gt;
Email your source code and the git log of all your commits as a .zip or .tar file to rzon@scinethpc.ca and ljdursi@scinethpc.ca by 9:00 am on Thursday January 31, 2013.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
This week, we learned about object oriented programming, which fits nicely within the modular programming idea.  In this homework, we are going to use some of it to restructure our code and get it ready to add the tracer particle, the goal of the course project. &lt;br /&gt;
&lt;br /&gt;
The goal will be to have an instance of a &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt; class,&lt;br /&gt;
as well as an instance of &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt;, which for now will be a&lt;br /&gt;
free particle moving as ('''x'''(t),'''y'''(t)) = ('''x'''(0) +&lt;br /&gt;
'''vx''' t, '''y'''(0) + '''vy''' t), without any coupling yet (we&lt;br /&gt;
will handle this next week).&lt;br /&gt;
&lt;br /&gt;
To be more specific:&lt;br /&gt;
* Clean up your code, using the feedback from your HW2 grading, such that the modules are as independent as possible. &lt;br /&gt;
* If you have not done so yet, add comments to the header files of your modules to explain exactly what each function does (without going into implementation details), what its arguments mean and what it returns (unless it's a void function, of course). &lt;br /&gt;
* Objectify the &amp;lt;tt&amp;gt;main&amp;lt;/tt&amp;gt; routine, by creating a class &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt;.&lt;br /&gt;
* Put this class in its own module (declaration in .h, implementation in .cc). For instance, the declaration could be&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.h&lt;br /&gt;
#ifndef DIFFUSIONH&lt;br /&gt;
#define DIFFUSIONH&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
class Diffusion {&lt;br /&gt;
  public:&lt;br /&gt;
    Diffusion(float x1, float x2, float D, int numPoints);&lt;br /&gt;
    void init(float a0, float sigma0); // set initial field&lt;br /&gt;
    void timeStep(float dt);           // solve diff. equation over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    float getRho(int i, int j);        // get a value of the field&lt;br /&gt;
    ~Diffusion();&lt;br /&gt;
  private:&lt;br /&gt;
    float*** rho;&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
#endif&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(this is not supposed to be prescriptive.)&lt;br /&gt;
* In the implementation file you'd have things like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.cc&lt;br /&gt;
#include &amp;quot;diffusion.h&amp;quot;&lt;br /&gt;
...&lt;br /&gt;
void Diffusion::timeStep(float dt) &lt;br /&gt;
{&lt;br /&gt;
   // code for the timeStep ...&lt;br /&gt;
}&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(note the inclusion of the module's header file at the top of the implementation, so the class is declared).&lt;br /&gt;
* Let &amp;lt;tt&amp;gt;int main()&amp;lt;/tt&amp;gt; have the same functionality as before, but now by defining the parameters of the run, creating an object of this class, setting up file streams, and taking time steps and writing out by using calls to member functions of this object. &lt;br /&gt;
* Additionally, write a class &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt; which for now implements a free particle in 2d. Something like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class Tracer {&lt;br /&gt;
  public:&lt;br /&gt;
    Tracer(float x1, float x2);&lt;br /&gt;
    void init(float x0, float y0, float vx, float vy);&lt;br /&gt;
    void timeStep(float dt);           // solve diff. equation over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    ~Tracer();&lt;br /&gt;
  private:&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
:The timeStep implementation can in this case use the infamous forward Euler integration scheme, because it happens to be exact here.&lt;br /&gt;
:When it comes to output to an npy file, let's view the data of the tracer particle at one point in time as a 2x2 matrix &amp;lt;tt&amp;gt;[[x,y],[vx,vy]]&amp;lt;/tt&amp;gt;, so we can reuse much of the npy output code that we used for the diffusion field, which was a (numPoints+2)x(numPoints+2) matrix.&lt;br /&gt;
* This class too should be its own module (Often, &amp;quot;one class, one module&amp;quot; is a good paradigm, though occasionally you'll have closely related classes).&lt;br /&gt;
* Add some code to int main to  have the Tracer particle evolve at the same time as the diffusion field (although the two are completely uncoupled).&lt;br /&gt;
* Keep using git and make, run the tests that you have regularly to make sure your program still works.&lt;br /&gt;
&lt;br /&gt;
Note that because we've now set up our program in a modular fashion, you can do&lt;br /&gt;
the different parts of this assignment in any order you want.  For instance, to wrap your head around object-oriented programming, you may prefer to implement the tracer particle first, so that your diffusion code stays intact.  Or you might want to postpone commenting until the end if you think you'll have to change a module for this assignment.&lt;br /&gt;
&lt;br /&gt;
Email your source code and the git log file of all your commits as a .zip or .tar file to rzon@scinethpc.ca and ljdursi@scinethpc.ca by &lt;br /&gt;
&amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;3:00 pm on Friday February 8, 2013&amp;lt;/span&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
In this homework, we are going to implement the class project of a tracer particle coupled to a diffusion equation. &lt;br /&gt;
The full specification of the physical problem is [[Media:ScClassProject.pdf|here]].  &lt;br /&gt;
* Augment the tracer particle to include a force in the x and in the y direction, and a friction coefficient alpha, which at first can be constant.&lt;br /&gt;
* Implement the so-called leapfrog integration algorithm for the tracer particle&lt;br /&gt;
:::v &amp;amp;larr; v + f(v) &amp;amp;Delta;t / m&lt;br /&gt;
:::r &amp;amp;larr; r + v &amp;amp;Delta;t&lt;br /&gt;
:where v, r, and f are 2d vectors and f(v) is the total, velocity-dependent force specified in the class project, i.e., the sum of the external force F=qE and the friction force -&amp;amp;alpha;v.&amp;lt;br/&amp;gt;(Note: the v dependence of f makes this not strictly a leapfrog integration, but we'll ignore that here.)&lt;br /&gt;
* Further augment the tracer class with a member function 'couple' which takes a diffusion field as input, and adjusts the friction constant. &lt;br /&gt;
* Your implementation of the 'couple' member function will need to interpolate the diffusion field to the current position of the particle. Use [[Media:CppInterpolation.tgz|this interpolation module]].&lt;br /&gt;
* Rewrite your main routine so that the coupling is called before the tracer's time step. You may need to modify the Diffusion class a bit to get &amp;lt;tt&amp;gt;rho[active]&amp;lt;/tt&amp;gt; out.&lt;br /&gt;
* For simplicity, use the same time step for both the diffusion and the tracer particle.&lt;br /&gt;
* Keep using git and make.&lt;br /&gt;
&lt;br /&gt;
You will hand in your source code, makefiles and the git log file of all your commits by email by &amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;9:00 am on Thursday February 21, 2013&amp;lt;/span&amp;gt;.  Email the files, preferably zipped or tarred, to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
=Part 2: Numerical Tools for Physical Scientists=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or solid C++ programming skills, including make and unix/linux prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin), and Python (Enthought) installed on your laptop.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
February 12, 14, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
March 5, 7, 12, and 14, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' Numerics ===&lt;br /&gt;
:::[[File:Lecture9-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.html]]&lt;br /&gt;
:::[[Media:Lecture9-2013-Numerics.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Random numbers ===&lt;br /&gt;
:::[[File:Lecture10-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.html]]&lt;br /&gt;
:::[[Media:Lecture10-2013-PRNG.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW1_2 Homework assignment 1]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Numerical integration and ODEs ===&lt;br /&gt;
:::[[File:Lecture11-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.html]]&lt;br /&gt;
:::[[Media:Lecture11-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Molecular Dynamics ===&lt;br /&gt;
:::[[File:Lecture12-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.html]]&lt;br /&gt;
:::[[Media:Lecture12-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW2_2 Homework assignment 2]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Linear Algebra part I ===&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 6)]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' Linear Algebra part II and PDEs===&lt;br /&gt;
:::[[File:Lecture14-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.html]]&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 5)]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW3_2 Homework assignment 3]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Fast Fourier Transform===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.html]]&lt;br /&gt;
:::[[Media:Lecture15-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[[Media:Sincfftw.cc|example code]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' FFT for real and multidimensional data===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.html]]&lt;br /&gt;
:::[[Media:Lecture16-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW4_2 Homework assignment 4]&lt;br /&gt;
&lt;br /&gt;
==Homework Assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
This week's homework consists of two assignments.&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Consider the sequence of numbers: 1 followed by 10&amp;lt;sup&amp;gt;8&amp;lt;/sup&amp;gt; values of 10&amp;lt;sup&amp;gt;-8&amp;lt;/sup&amp;gt;&lt;br /&gt;
* The sequence should sum to 2.&lt;br /&gt;
* Write code which sums up those values in order. What answer does it get?&lt;br /&gt;
* Add a routine to the program which sums up the values in reverse order. Does it get the correct answer?&lt;br /&gt;
* How would you get the correct answer?&lt;br /&gt;
* Submit code, Makefile, text file with answers.&lt;br /&gt;
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Implement a linear congruential generator with a = 106, c = 1283, m = 6075 that generates random numbers from 0..1&lt;br /&gt;
* Using that and MT: generate 10,000 pairs (dx, dy) with dx, dy each in -0.1 .. +0.1. Generate histograms of dx and dy (say 200 bins). Does it look okay? What would you expect the variation to be?&lt;br /&gt;
* For 10,000 points: take random walks from (x,y)=(0,0) until the radius exceeds 2, then stop. Plot a histogram of the final angles for the two pseudo-random number generators. What do you see?&lt;br /&gt;
* Submit makefile, code, plots, git log.&lt;br /&gt;
&lt;br /&gt;
Both assignments due on Thursday Feb 28th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW2===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Compute numerically (using the GSL):&lt;br /&gt;
&lt;br /&gt;
::&amp;amp;int;&amp;lt;sub&amp;gt;0&amp;lt;/sub&amp;gt;&amp;lt;sup&amp;gt;3&amp;lt;/sup&amp;gt; f(x) &amp;amp;nbsp;dx&lt;br /&gt;
&lt;br /&gt;
:(that is the integral of f(x) from x=0 to x=3)&lt;br /&gt;
&lt;br /&gt;
:with&lt;br /&gt;
&lt;br /&gt;
::f(x) = ln(x) sin(x) e&amp;lt;sup&amp;gt;-x&amp;lt;/sup&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:using three different methods:&lt;br /&gt;
# Extended Simpson's rule&lt;br /&gt;
# Gauss-Legendre quadrature&lt;br /&gt;
# Monte Carlo sampling &lt;br /&gt;
&lt;br /&gt;
*Hint: what is f(0)?&lt;br /&gt;
&lt;br /&gt;
*Compare the convergence of these methods by increasing the number of function evaluations.&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log. &lt;br /&gt;
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Using an adaptive 4th order Runge-Kutta approach, with a relative accuracy of 1e-4, compute the solution for t = [0,100] of the following set of coupled ODEs (Lorenz oscillator)&lt;br /&gt;
&lt;br /&gt;
::dx/dt = &amp;amp;sigma;(y - x)&lt;br /&gt;
&lt;br /&gt;
::dy/dt = (&amp;amp;rho;-z)x-y&lt;br /&gt;
&lt;br /&gt;
::dz/dt = xy - &amp;amp;beta;z&lt;br /&gt;
&lt;br /&gt;
:with &amp;amp;sigma;=10; &amp;amp;beta;=8/3; &amp;amp;rho; = 28, and with initial conditions&lt;br /&gt;
&lt;br /&gt;
::x(0) = 10&lt;br /&gt;
&lt;br /&gt;
::y(0) = 20&lt;br /&gt;
&lt;br /&gt;
::z(0) = 30&lt;br /&gt;
&lt;br /&gt;
* Hint: study the GSL documentation.&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log.&lt;br /&gt;
&lt;br /&gt;
Both assignments due on Thursday Mar 7th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
&lt;br /&gt;
Part 1:&lt;br /&gt;
&lt;br /&gt;
The time-explicit formulation of the 1d diffusion equation looks like this:&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{eqnarray*}&lt;br /&gt;
q^{n+1} &amp;amp; = &amp;amp; q^n + \frac{D \Delta t}{\Delta x^2} &lt;br /&gt;
\left (&lt;br /&gt;
\begin{matrix}&lt;br /&gt;
-2 &amp;amp; 1 \\&lt;br /&gt;
1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; \cdots &amp;amp; \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; &amp;amp; 1 &amp;amp; -2 \\&lt;br /&gt;
\end{matrix}&lt;br /&gt;
\right ) q^n \\&lt;br /&gt;
&amp;amp; = &amp;amp; \left ( 1 + \frac{D \Delta t}{\Delta x^2} A \right ) q^n&lt;br /&gt;
\end{eqnarray*}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What are the eigenvalues of the matrix A?   What modes would we expect to be amplified/damped by this operator?&lt;br /&gt;
&lt;br /&gt;
* Consider 100 points in the discretization (eg, A is 100x100)&lt;br /&gt;
* Calculate the eigenvalues and eigenvectors (using D__EV ; which sort of matrix are we using here?)&lt;br /&gt;
* Plot the modes with the largest and smallest absolute-value of eigenvalues, and explain their physical significance&lt;br /&gt;
* The numerical method will become unstable when one eigenmode $v$ begins to grow uncontrollably whenever it is present, e.g.&lt;br /&gt;
$ \frac{D \Delta t}{\Delta x^2} A v = \frac{D \Delta t}{\Delta x^2} \lambda v &amp;gt; v$.   In a timestepping solution, the only way to avoid this for a given physical set of parameters and grid size is to reduce the timestep, $\Delta t$.   Use the eigenvalue with the largest absolute value to place a constraint on $\Delta t$ for stability.&lt;br /&gt;
&lt;br /&gt;
Part 2:&lt;br /&gt;
&lt;br /&gt;
Using the above constraint on $\Delta t$, for a 1d grid of size 100 (eg, a 100x100 matrix A), using lapack, evolve this PDE. Plot and explain results.&lt;br /&gt;
&lt;br /&gt;
* Have an initial condition of $q(x=0,t=0) = 1$, and $q(t=0)$ everywhere else being zero (eg, hot plate just turned on at the left)&lt;br /&gt;
* Take ~100 timesteps and plot the evolution of $q(x,t)$ at 5 times over that period.&lt;br /&gt;
* You'll want to use a BLAS routine to compute the matrix-vector multiply ( http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html). Do the multiply in double precision (D__MV). Which one should you use?&lt;br /&gt;
* The GSL has a cblas interface, http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html ; an example of its use can be found here http://www.gnu.org/software/gsl/manual/html_node/GSL-CBLAS-Examples.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Important things to know about lapack:&lt;br /&gt;
* If you are using an nxn array, the “leading dimension” of the array is n. (This argument is so that you could work on sub-matrices if you wanted)&lt;br /&gt;
* You have to make sure the 2d array is a contiguous block of memory&lt;br /&gt;
* You'll (presumably) want to use the C bindings for LAPACK - [http://www.netlib.org/lapack/lapacke.html lapacke].  Note that the usual C arrays are row-major.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Here's a simple example of calling a LAPACKE routine; note that how the matrix is described (here with a pointer to the data, a leading dimension, and the number of rows and columns) will vary with different types of matrix:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;mkl_lapacke.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m);&lt;br /&gt;
void free_matrix(double **a);&lt;br /&gt;
&lt;br /&gt;
int main (int argc, const char * argv[])&lt;br /&gt;
{&lt;br /&gt;
&lt;br /&gt;
   const int n=5;             // number of rows, columns of the matrix&lt;br /&gt;
   const int m = n;           // nrows&lt;br /&gt;
   const int leading_dim_A=n; // leading dimension (# of cols for row major);&lt;br /&gt;
                              // lets us operate on sub-matrices in principle&lt;br /&gt;
   const int leading_dim_b=n; // similarly for b&lt;br /&gt;
   double **A;&lt;br /&gt;
   double *b;&lt;br /&gt;
&lt;br /&gt;
   b = new double[leading_dim_b];&lt;br /&gt;
   A = matrix(n,leading_dim_A);&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;n; i++)&lt;br /&gt;
       for (int j=0; j&amp;lt;leading_dim_A; j++)&lt;br /&gt;
            A[i][j] = 0.;&lt;br /&gt;
&lt;br /&gt;
   // let's do a trivial solve&lt;br /&gt;
   // It should be pretty clear that the solution to this system&lt;br /&gt;
   // is x = {0,1,2...n-1}&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_A; i++) {&lt;br /&gt;
        A[i][i] = 2.;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_b; i++) {&lt;br /&gt;
        b[i]    = 2*i;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   const char transpose='N';     //solve Ax=b, not A^T x = b&lt;br /&gt;
   const int  nrhs = 1;          //  we're only solving 1 right hand side&lt;br /&gt;
   int info;&lt;br /&gt;
&lt;br /&gt;
   // Call DGELS; b will be overwritten with the value of x.&lt;br /&gt;
   info = LAPACKE_dgels(LAPACK_COL_MAJOR,transpose,m,n,nrhs,&lt;br /&gt;
                          &amp;amp;(A[0][0]),leading_dim_A, &amp;amp;(b[0]),leading_dim_b);&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
   // print results&lt;br /&gt;
   for(int i=0;i&amp;lt;n;i++)&lt;br /&gt;
   {&lt;br /&gt;
      if (i != n/2)&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;    &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
      else&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;x = &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
   }&lt;br /&gt;
   return(info);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m) {&lt;br /&gt;
   double **a = new double * [n];&lt;br /&gt;
   a[0] = new double [n*m];&lt;br /&gt;
&lt;br /&gt;
   for (int i=1; i&amp;lt;n; i++)&lt;br /&gt;
         a[i] = &amp;amp;a[0][i*m];&lt;br /&gt;
&lt;br /&gt;
   return a;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
void free_matrix(double **a) {&lt;br /&gt;
   delete[] a[0];&lt;br /&gt;
   delete[] a;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
Trigonometric interpolation uses an n-point Fourier series to find values at intermediate points. It is one way of downscaling data, and was a motivation for Gauss, who applied it to planetary motion.&lt;br /&gt;
&lt;br /&gt;
The way it works is:&lt;br /&gt;
&lt;br /&gt;
# You fourier-transform your data&lt;br /&gt;
# You add frequencies above the Nyquist frequency (in absolute value), but set all the amplitudes of the new frequencies to zero.&lt;br /&gt;
# Note that the frequencies are stored such that eg. f&amp;lt;sub&amp;gt;n-1&amp;lt;/sub&amp;gt; is a low frequency -1.&lt;br /&gt;
# The resulting 2n array can be back transformed, and now gives an interpolated signal.&lt;br /&gt;
&lt;br /&gt;
For this assignment, write an application that reads in an image from a binary file into a 2d double precision array (this will require converting from bytes to doubles), and creates an image twice the size in all directions using trigonometric interpolation. Use a real-to-half-complex version of the fftw (note: in 2d, this version of the fftw mixes fourier components with the same physical magnitude of their wave number k, so this will work).&lt;br /&gt;
You can process the red, green and blue values separately. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
Write an application which reads an image and performs a low pass filter on the image, i.e., any fourier components with magnitudes k larger than n/8 are to be set to zero, after which the fourier inverse is taken and the image is written out to disk again. Use the same fft technique as in the first assignment.&lt;br /&gt;
&lt;br /&gt;
'''Input image'''&lt;br /&gt;
&lt;br /&gt;
Use [[Media:gauss256.tgz|this image of Gauss]].&lt;br /&gt;
&lt;br /&gt;
'''Image format:'''&lt;br /&gt;
&lt;br /&gt;
Use the following simple PPM format:&lt;br /&gt;
&lt;br /&gt;
First line (ascii): 'P6\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Second line, in ascii, 'width height\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Third line, in ascii, 'maxcolorvalue\n' (this is typically just 255)&amp;lt;br&amp;gt;&lt;br /&gt;
Following that, in binary, are byte-triplets with the red, green and blue values of each pixel.&amp;lt;br&amp;gt;&lt;br /&gt;
Note: in C, the 'unsigned char' data type matches the concept of a byte best (for most machines anyway).&lt;br /&gt;
&lt;br /&gt;
In fact, between the first and second line, one can have comment lines that start with '#'.&lt;br /&gt;
&lt;br /&gt;
=Part 3: High Performance Scientific Computing=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or good C++ programming skills, including make and unix/linux prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
You will need to bring a laptop with an ssh facility. Hands-on parts will be done on SciNet's GPC cluster.&lt;br /&gt;
&lt;br /&gt;
For those who don't have a SciNet account yet, the instructions can be found at http://wiki.scinethpc.ca/wiki/index.php/Essentials#Accounts&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
March 19, 21, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
April 2, 4, 9, and 11, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
===''Lecture 1:'' Introduction to Parallel Programming ===&lt;br /&gt;
:::[[File:Lecture17-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.html]]&lt;br /&gt;
:::[[Media:Lecture17-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Parallel Computing Paradigms ===&lt;br /&gt;
&lt;br /&gt;
:::[[File:Lecture18-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.html]]&lt;br /&gt;
:::[[Media:Lecture18-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1_3|homework 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lectures 3,4:''  Shared Memory Programming with OpenMP, part 1,2===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture19-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Distributed Parallel Programming with MPI, part 1===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture21-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' Distributed Parallel Programming with MPI, part 2===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture22-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7''&amp;amp;nbsp;&amp;amp;nbsp; Distributed Parallel Programming with MPI, part 3===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture23-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8''&amp;amp;nbsp;&amp;amp;nbsp; Hybrid OpenMP+MPI Programming===&lt;br /&gt;
&lt;br /&gt;
== Homework assignments ==&lt;br /&gt;
&lt;br /&gt;
=== HW1 ===&lt;br /&gt;
&lt;br /&gt;
* Read the SciNet tutorial (as it pertains to the GPC)&lt;br /&gt;
* Read the GPC Quick Start.&lt;br /&gt;
* Get the first set of code:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
   $ cd $SCRATCH&lt;br /&gt;
   $ git clone /scinet/course/sc3/homework1&lt;br /&gt;
   $ cd homework1&lt;br /&gt;
   $ source setup&lt;br /&gt;
   $ make&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
* This contains the threaded program 'blurppm' and 266 ppm images to be blurred. Usage:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  blurppm INPUTPPM OUTPUTPPM BLURRADIUS NUMBEROFTHREADS&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
* Simple test:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  $ qsub -l nodes=1:ppn=8,walltime=2:00:00 -I -X -qdebug&lt;br /&gt;
  $ cd $SCRATCH/homework1&lt;br /&gt;
  $ time blurppm 001.ppm new001.ppm 30 1&lt;br /&gt;
  real  0m52.900s&lt;br /&gt;
  user  0m52.881s&lt;br /&gt;
  sys   0m0.008s&lt;br /&gt;
  $ display 001.ppm &amp;amp;&lt;br /&gt;
  $ display new001.ppm &amp;amp;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
* Time blurppm with a BLURRADIUS ranging from 1 to 41 in steps of 4, and for NUMBEROFTHREADS ranging from 1 to 16.  Record the (real) duration of each run.&lt;br /&gt;
* Plot the duration as a function of NUMBEROFTHREADS, as well as  the speed-up and efficiency.&lt;br /&gt;
* Submit the script and plots of the duration, speedup and efficiency as a function of NUMBEROFTHREADS.&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
* Use GNU parallel to run blurppm on all 266 images with a radius of 41.&lt;br /&gt;
* Investigate different scenarios:&lt;br /&gt;
:# Have GNU parallel run 16 at a time with just 1 thread.&lt;br /&gt;
:# Have GNU parallel run 8 at a time with 2 threads.&lt;br /&gt;
:# Have GNU parallel run 4 at a time with 4 threads.&lt;br /&gt;
:# Have GNU parallel run 2 at a time with 8 threads.&lt;br /&gt;
:# Have GNU parallel run 1 at a time with 16 threads.&lt;br /&gt;
:Record the total time it takes in each of these scenarios.&lt;br /&gt;
* Repeat this with a BLURRADIUS of 3.&lt;br /&gt;
* Submit scripts, timing data  and plots.&lt;br /&gt;
&lt;br /&gt;
=== HW2 ===&lt;br /&gt;
&lt;br /&gt;
In the course materials ( /scinet/course/ppp/nbodyc or nbodyf ) there is the source code for a serial N-body integrator.  This, like the molecular dynamics code you've seen earlier, calculates the long-range forces exerted by each particle on all of the other particles.&lt;br /&gt;
&lt;br /&gt;
Parallelize the force calculation with OpenMP, and present timing results for 1, 4, and 8 threads compared to the serial version.  Note that you can turn off graphic output by removing the &amp;quot;USEPGPLOT = -DPGPLOT&amp;quot; line in Makefile.inc in the top level directory.&lt;br /&gt;
&lt;br /&gt;
Begin by doubling the work by _not_ calculating two forces at once (eg, not making use of f&amp;lt;sub&amp;gt;ji&amp;lt;/sub&amp;gt; = -f&amp;lt;sub&amp;gt;ij&amp;lt;/sub&amp;gt;), and simply parallelizing the outer force loop.  Then find a way to implement the forces efficiently but also in parallel.  Is there any other part of the problem which could usefully be parallelized?&lt;br /&gt;
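&lt;br /&gt;
A minimal sketch of the first, work-doubling version is given below; the variable names and the softened force law are illustrative assumptions, not the actual code in the course directory:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Illustrative only: names and the force law are assumptions, not the course's nbody code.&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
&lt;br /&gt;
void forces(int n, const double *x, const double *y, const double *m,&lt;br /&gt;
            double *fx, double *fy)&lt;br /&gt;
{&lt;br /&gt;
    const double G = 1.0, eps = 1.e-3;   // units and softening: assumed&lt;br /&gt;
    #pragma omp parallel for schedule(static)&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++) {&lt;br /&gt;
        fx[i] = fy[i] = 0.;&lt;br /&gt;
        for (int j = 0; j &amp;lt; n; j++) {   // every pair visited twice (2x work),&lt;br /&gt;
            if (j == i) continue;        // but each thread only writes its own i&lt;br /&gt;
            double dx = x[j]-x[i], dy = y[j]-y[i];&lt;br /&gt;
            double r  = sqrt(dx*dx + dy*dy + eps*eps);&lt;br /&gt;
            fx[i] += G*m[i]*m[j]*dx/(r*r*r);&lt;br /&gt;
            fy[i] += G*m[i]*m[j]*dy/(r*r*r);&lt;br /&gt;
        }&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
For the more efficient version that exploits f&amp;lt;sub&amp;gt;ji&amp;lt;/sub&amp;gt; = -f&amp;lt;sub&amp;gt;ij&amp;lt;/sub&amp;gt;, one common approach is to give each thread its own private force accumulators (or to use atomic updates), so that updating both particles of a pair does not cause race conditions.&lt;br /&gt;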
&lt;br /&gt;
=== HW3 ===&lt;br /&gt;
&lt;br /&gt;
In the same area (  /scinet/course/ppp/diffusion/diffusion.c ) there is a 1d diffusion problem of a form you may recognize from earlier modules.   Your task is to parallelize this with MPI.  I'd encourage you to use the graphics output while you're developing this (use -DPGPLOT on the compile line); then omit the graphics, use a larger totpoints (say 16000), and run timings on 1, 2, 4, and 8 processors.   Note that for this simple problem, we don't necessarily expect a huge speedup; it's already pretty fast.&lt;br /&gt;
&lt;br /&gt;
I'd suggest doing this in steps:&lt;br /&gt;
&lt;br /&gt;
* include mpi.h, add mpi_init/finalize/comm_size/comm_rank, and compile with mpicc and make sure it runs;&lt;br /&gt;
* calculate the local number of points from the total number of points and the size (and possibly rank); treat the case where the total number of points isn't divisible by the number of processors however you like, but make sure you're consistent about it.&lt;br /&gt;
* Once you've calculated the local number of points, you won't need the variable totpoints; no arrays will be declared of that size, no plots will be made of that size, etc.   Make those changes, compile, and run on (say) 2 procs.&lt;br /&gt;
* Now you'll find that you'll need to figure out the local xleft and local xright of the domain; again, once this is done you won't need to know the global variables any more.   Make those changes, compile, and run.&lt;br /&gt;
* Finally, after the &amp;quot;old&amp;quot; boundary condition setting, do the internal boundary conditions, as in our example in class on Tuesday, sending messages around to our neighbours (a minimal sketch of such an exchange follows below).&lt;br /&gt;
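&lt;br /&gt;
A minimal sketch of that guard-cell exchange is given below.  It assumes each rank stores its local points in x[1..localn] with guard cells x[0] and x[localn+1]; that layout, and the function name, are assumptions for illustration, not necessarily what diffusion.c uses:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Illustrative guard-cell exchange; array layout and names are assumptions.&lt;br /&gt;
#include &amp;lt;mpi.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
void exchange_guardcells(double *x, int localn, int rank, int size)&lt;br /&gt;
{&lt;br /&gt;
    const int tag = 1;&lt;br /&gt;
    int left  = (rank == 0)      ? MPI_PROC_NULL : rank - 1;  // no-ops at the edges&lt;br /&gt;
    int right = (rank == size-1) ? MPI_PROC_NULL : rank + 1;&lt;br /&gt;
    MPI_Status status;&lt;br /&gt;
&lt;br /&gt;
    // send my last real point right, receive my left guard cell from the left&lt;br /&gt;
    MPI_Sendrecv(&amp;amp;x[localn], 1, MPI_DOUBLE, right, tag,&lt;br /&gt;
                 &amp;amp;x[0],      1, MPI_DOUBLE, left,  tag,&lt;br /&gt;
                 MPI_COMM_WORLD, &amp;amp;status);&lt;br /&gt;
    // send my first real point left, receive my right guard cell from the right&lt;br /&gt;
    MPI_Sendrecv(&amp;amp;x[1],        1, MPI_DOUBLE, left,  tag,&lt;br /&gt;
                 &amp;amp;x[localn+1], 1, MPI_DOUBLE, right, tag,&lt;br /&gt;
                 MPI_COMM_WORLD, &amp;amp;status);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Using MPI_PROC_NULL for the ranks at the physical boundaries turns those sends and receives into no-ops, so the same call works on every rank.&lt;br /&gt;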
&lt;br /&gt;
=Links=&lt;br /&gt;
&lt;br /&gt;
==Unix==&lt;br /&gt;
* Cygwin: http://www.cygwin.com&lt;br /&gt;
* Linux Command Line: A Primer (June 2012) [[Media:SS_IntroToShell.pdf|Slides,]] [[Media:SS_IntroToShell.tgz|Files]]&lt;br /&gt;
* Intro to unix shell from software carpentry: http://software-carpentry.org/4_0/shell&lt;br /&gt;
&lt;br /&gt;
==C/C++==&lt;br /&gt;
* [[One-Day Scientific C++ Class]] at SciNet&lt;br /&gt;
* C++ library reference: http://www.cplusplus.com/reference&lt;br /&gt;
* C preprocessor: http://www.cprogramming.com/tutorial/cpreprocessor.html&lt;br /&gt;
* Boost: http://www.boost.org&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==Git==&lt;br /&gt;
* Git: http://git-scm.com&lt;br /&gt;
* Version Control: [http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Thursday_Morning_BP_Revision_Control/Thursday_Morning_BP_Revision_Control.mp4 Video]/ [[Media:Snug_techtalk_revcontrol.pdf | Slides]]&lt;br /&gt;
* Git cheat sheet from Git Tower: http://www.git-tower.com/files/cheatsheet/Git_Cheat_Sheet_grey.pdf&lt;br /&gt;
&lt;br /&gt;
==Python==&lt;br /&gt;
* Python: http://www.python.org&lt;br /&gt;
* IPython: http://ipython.org&lt;br /&gt;
* Matplotlib: http://www.matplotlib.org&lt;br /&gt;
* Enthought python distribution: http://www.enthought.com/products/edudownload.php&amp;lt;br/&amp;gt;&lt;br /&gt;
(this gives you numpy, matplotlib and ipython all installed in one fell swoop)&lt;br /&gt;
&lt;br /&gt;
* Intro to python from software carpentry: http://software-carpentry.org/4_0/python&lt;br /&gt;
* Tutorial on matplotlib: http://conference.scipy.org/scipy2011/tutorials.php#jonathan&lt;br /&gt;
* Npy file format: https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==ODEs==&lt;br /&gt;
* Integrators for particle based ODEs (i.e. molecular dynamics): http://www.chem.utoronto.ca/~rzon/simcourse/partmd.pdf. &amp;lt;br&amp;gt;'''Focus on 4.1.4 - 4.1.6 for practical aspects.'''&lt;br /&gt;
* Numerical algorithm to solve ODEs (General) in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Chapter 16&lt;br /&gt;
&lt;br /&gt;
==Interpolation (2D) ==&lt;br /&gt;
* Interpolation in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Pages 123-128&lt;br /&gt;
* Wikipedia pages on [http://en.wikipedia.org/wiki/Bilinear_interpolation Bilinear Interpolation] and [http://en.wikipedia.org/wiki/Bicubic_interpolation Bicubic Interpolation] are not bad either.&lt;br /&gt;
&lt;br /&gt;
==BLAS==&lt;br /&gt;
* [http://www.tacc.utexas.edu/tacc-projects/gotoblas2 gotoblas]&lt;br /&gt;
* [http://math-atlas.sourceforge.net/ ATLAS]&lt;br /&gt;
&lt;br /&gt;
==LAPACK==&lt;br /&gt;
* http://www.netlib.org/lapack&lt;br /&gt;
&lt;br /&gt;
==GSL==&lt;br /&gt;
* GNU Scientific Library: http://www.gnu.org/s/gsl&lt;br /&gt;
&lt;br /&gt;
==FFT==&lt;br /&gt;
* FFTW: http://www.fftw.org&lt;br /&gt;
&lt;br /&gt;
==Top500==&lt;br /&gt;
* TOP500 Supercomputing Sites: http://top500.org&lt;br /&gt;
&lt;br /&gt;
==OpenMP==&lt;br /&gt;
* OpenMP (open multi-processing) application programming interface for shared memory programming: http://openmp.org&lt;br /&gt;
&lt;br /&gt;
==GNU parallel==&lt;br /&gt;
* Official citation: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.&lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page contains options, such as how to make sure the output is not all scrambled, and examples.&lt;br /&gt;
&lt;br /&gt;
==SciNet==&lt;br /&gt;
&lt;br /&gt;
Anything on this wiki, really, but specifically:&lt;br /&gt;
* [[Essentials|SciNet Essentials]]&lt;br /&gt;
* [[GPC Quickstart]]&lt;br /&gt;
* [[Media:SciNet_Tutorial.pdf |SciNet User Tutorial]]&lt;br /&gt;
* [[Software and Libraries]]&lt;br /&gt;
&lt;br /&gt;
==Other Resources==&lt;br /&gt;
* [http://galileo.phys.virginia.edu/classes/551.jvn.fall01/goldberg.pdf What Every Computer Scientist Should Know About Floating-Point Arithmetic] - the classic (and extremely comprehensive) overview of the basics of floating point math.   The first few pages, in particular, are very useful.&lt;br /&gt;
* [http://arxiv.org/abs/1005.4117 Random Numbers In Scientific Computing: An Introduction] by Katzgraber.   A very lucid discussion of pseudo random number generators for science.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5927</id>
		<title>Scientific Computing Course</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5927"/>
		<updated>2013-04-05T20:23:32Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Homework assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;''This wiki page concerns the 2013 installment of SciNet's Scientific Computing course. Material from the previous installment can be found on [[Scientific Software Development Course]], [[Numerical Tools for Physical Scientists (course)]], and [[High Performance Scientific Computing]]''&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
=Syllabus=&lt;br /&gt;
&lt;br /&gt;
==About the course==&lt;br /&gt;
* Whole-term graduate course&lt;br /&gt;
* Prerequisite: basic C, C++ or Fortran experience.&lt;br /&gt;
* Will use `C++ light' and Python&lt;br /&gt;
* Topics include: Scientific computing and programming skills, Parallel programming, and Hybrid programming.  &lt;br /&gt;
&lt;br /&gt;
There are three parts to this course:&lt;br /&gt;
&lt;br /&gt;
# Scientific Software Development: Jan/Feb 2013&amp;lt;br&amp;gt;''python, C++, git, make, modular programming, debugging''&lt;br /&gt;
# Numerical Tools for Physical Scientists: Feb/Mar 2013&amp;lt;br&amp;gt;''modelling, floating point, Monte Carlo, ODE, linear algebra,fft''&lt;br /&gt;
# High Performance Scientific Computing: Mar/Apr 2013&amp;lt;br&amp;gt;''openmp, mpi and hybrid programming''&lt;br /&gt;
&lt;br /&gt;
Each part consists of eight one-hour lectures, two per week.&lt;br /&gt;
&lt;br /&gt;
These can be taken separately by astrophysics graduate students at the University of Toronto as mini-courses, and by physics graduate students at the University of Toronto as modular courses.&lt;br /&gt;
&lt;br /&gt;
The first two parts count towards the SciNet Certificate in Scientific Computing, while the third part can count towards the SciNet HPC Certificate. For more info about the SciNet Certificates, see http://www.scinethpc.ca/2012/12/scinet-hpc-certificate-program.&lt;br /&gt;
&lt;br /&gt;
==Location and Times==&lt;br /&gt;
[http://www.scinethpc.ca/2010/08/contact-us SciNet HeadQuarters]&amp;lt;br&amp;gt;&lt;br /&gt;
256 McCaul Street, Toronto, ON&amp;lt;br&amp;gt;&lt;br /&gt;
Room 229 (Conference Room)&amp;lt;br&amp;gt;&lt;br /&gt;
Tuesdays 11:00 am - 12:00 noon&amp;lt;br&amp;gt;&lt;br /&gt;
Thursdays 11:00 am - 12:00 noon&lt;br /&gt;
&lt;br /&gt;
==Instructors and office hours==&lt;br /&gt;
&lt;br /&gt;
* Ramses van Zon - 256 McCaul Street, Rm 228 - Mondays 3-4pm&lt;br /&gt;
* L. Jonathan Dursi - 256 McCaul Street, Rm 216 - Wednesdays 3-4pm&lt;br /&gt;
&lt;br /&gt;
==Grading scheme==&lt;br /&gt;
&lt;br /&gt;
Attendance at lectures.&lt;br /&gt;
&lt;br /&gt;
Four homework sets (i.e., one per week), to be returned by email by 9:00 am the next Thursday.&lt;br /&gt;
&lt;br /&gt;
==Sign up==&lt;br /&gt;
Sign up for this graduate course goes through SciNet's course website.&amp;lt;br&amp;gt;The direct link is https://support.scinet.utoronto.ca/courses/?q=node/99.&amp;lt;br&amp;gt;  If you do not have a SciNet account but wish to register for this course, please email support@scinet.utoronto.ca . &amp;lt;br&amp;gt;&lt;br /&gt;
Sign up is closed.&lt;br /&gt;
&lt;br /&gt;
=Part 1: Scientific Software Development=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Some programming experience. Some unix prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need:'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin), and Python 2, IPython, Numpy, SciPy and Matplotlib (which you all get if you use the Enthought distribution) installed on your laptop. Links are given at the bottom of this page.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
January 15, 17, 22, 24, 29, and 31, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
February 5 and 7, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics (with lecture slides and recordings)==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' C++ introduction===&lt;br /&gt;
:::[[File:Lecture1-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.html]]&lt;br /&gt;
:::[[Media:Lecture1-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' More C++, build and version control&amp;lt;br&amp;gt;===&lt;br /&gt;
:::[[File:Lecture2-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.html]]&lt;br /&gt;
:::Guest lecturer: Michael Nolta (CITA) for the git portion of the lecture.&lt;br /&gt;
:::[[Media:Lecture2-2013.pdf|C++ and Make slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.mp4 C++ and Make video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[Media:Git-Nolta.pdf|Git slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1|Homework assignment 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Python and visualization===&lt;br /&gt;
:::[[File:Lecture3-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.html]]&lt;br /&gt;
:::[[Media:Lecture3-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Modular programming, refactoring, testing===&lt;br /&gt;
:::[[File:Lecture4-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.html]]&lt;br /&gt;
:::[[Media:Lecture4-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;  [[#HW2|Homework assignment 2]]&lt;br /&gt;
:::[http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc (course project source file)] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py (corresponding python movie generator)]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Object oriented programming===&lt;br /&gt;
:::[[Media:Lecture5-2013.pdf|Slides]]&lt;br /&gt;
:::Recordings of this lecture are missing, but you could view the videos of SciNet's [[One-Day Scientific C++ Class]], in particular the parts on classes, polymorphism, and inheritance.&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' ODE, interpolation===&lt;br /&gt;
:::[[File:Lecture6-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Lecture5-ODE.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW3|Homework assignment 3]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Development tools: debugging and profiling===&lt;br /&gt;
:::[[File:Lecture7-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Debugging.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' Objects in Python, linking C++ and Python===&lt;br /&gt;
:::[[File:Lecture8-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.html]]&lt;br /&gt;
:::[[Media:Lecture8-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
==Homework assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
&lt;br /&gt;
'''''Multi-file C++ program to create a data file'''''&lt;br /&gt;
&lt;br /&gt;
We’ve learned programming in basic C++, use of make and Makefiles to build projects, and local use of git for version control. In this first assignment, you’ll use these to make a multi-file C++ program, built with make, which computes and outputs a data file.&lt;br /&gt;
&lt;br /&gt;
* Start a git repository, and begin writing a C++ program to&lt;br /&gt;
:# Get an array size and a standard deviation from user input,&lt;br /&gt;
:# Allocate a 2d array (use the code given in lecture 2),&lt;br /&gt;
:# Store a 2d Gaussian with a maximum at the centre of the array and the given standard deviation (in units of grid points; a minimal sketch of this step is given at the end of this assignment),&lt;br /&gt;
:# Output that array to a text file,&lt;br /&gt;
:# Free the array, and exit. &lt;br /&gt;
* The output text file should contain just the data in text format, with a row of the file corresponding to a row of the array and with whitespace between the numbers. &lt;br /&gt;
* The 2d array creation/freeing routines should be in one file (with an associated header file), the gaussian calculation be in another (ditto), and the output routine be in a third, with the main program calling each of these. &lt;br /&gt;
* Use a makefile to build your code (add it to the repository).&lt;br /&gt;
* You can start with everything in one file, with hardcoded values for sizes and standard deviation and a static array, then refactor things into multiple files, adding the other features.&lt;br /&gt;
* As a test, use the ipython executable that came with your Enthought python distribution to read your data and plot it.&amp;lt;br&amp;gt;If your data file is named ‘data.txt’, running the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ipython --pylab&lt;br /&gt;
In [1]: data = numpy.genfromtxt('data.txt') &lt;br /&gt;
In [2]: contour(data) &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
should give a nice contour plot of a 2-dimensional gaussian.&lt;br /&gt;
* Email in your source code, makefile and the &amp;quot;git log&amp;quot; output of all your commits by 9:00 am Thursday Jan 24th, 2013. Please zip or tar these files together as one attachment, with a file name that includes your name and &amp;quot;HW1&amp;quot;.&lt;br /&gt;
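&lt;br /&gt;
For reference, a minimal sketch of the Gaussian-fill step; the function name, the use of &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt;, and the square-grid assumption are illustrative choices, not requirements:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Illustrative sketch: fill an n x n array with a Gaussian centred on the grid.&lt;br /&gt;
// Names and types are assumptions; adapt to your own allocation routines.&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
&lt;br /&gt;
void fill_gaussian(float **a, int n, float sigma)&lt;br /&gt;
{&lt;br /&gt;
    float xc = 0.5f*(n-1), yc = 0.5f*(n-1);     // centre of the grid&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++)&lt;br /&gt;
        for (int j = 0; j &amp;lt; n; j++) {&lt;br /&gt;
            float dx = i - xc, dy = j - yc;     // distances in grid points&lt;br /&gt;
            a[i][j] = exp(-(dx*dx + dy*dy)/(2.0f*sigma*sigma));&lt;br /&gt;
        }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;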
&lt;br /&gt;
===HW2===&lt;br /&gt;
'''''Refactor legacy code to a modular project with unit tests'''''&lt;br /&gt;
&lt;br /&gt;
In class, today, we talked about modular programming and testing, and the project we’ll be working on for the next three weeks. This homework will start advancing on that project by working on the “legacy” code given to us by our supervisor ([http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc]), with a corresponding python plotting script ([http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py]), and whipping it into shape before we start adding new physics.&lt;br /&gt;
* Start a git repository for this project, and add the two files.&lt;br /&gt;
* Create a Makefile and add it to the repository.&lt;br /&gt;
* Since we have no tests, run the program with console output redirected to a file:&lt;br /&gt;
:&amp;lt;pre&amp;gt;$ diffuse &amp;gt; original-output.txt&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;''It turns out the code has a bug that can make the output different when the same code is run again, which obviously would not be good for a baseline test. Replace 'float error;' by 'float error=0.0;' to fix this.''&lt;br /&gt;
* Also save the two .npy output files, e.g. to original-data.npy and original-theory.npy. The triplet of files (original-output.txt, original-data.npy and original-theory.npy) serve as a baseline integrated test (add these to repository). &lt;br /&gt;
* Then write a 'test' target in your makefile that:&lt;br /&gt;
** Runs 'diffuse' with output to a new file.&lt;br /&gt;
** Compares the file with the baseline test file, and compares the .npy files.&lt;br /&gt;
:: (hint: the unix command diff or cmp can compare files).&lt;br /&gt;
* First refactoring: Move the global variables into the main routine.&lt;br /&gt;
* ''Chorus: Test your modified code, and commit.''&lt;br /&gt;
* Second refactoring: Extract a diffusion operator routine, that gets called from main.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Create a .cc/.h module for the diffusion operator.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Add two tests for the diffusion operator: for a constant and for a linear input field (&amp;lt;tt&amp;gt;rho[i][j]=a*i+b*j&amp;lt;/tt&amp;gt;). Add these to the test target in the makefile.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* More refactoring: Extract three more .cc/.h modules:&lt;br /&gt;
** for output (should not contain hardcoded filenames)    &lt;br /&gt;
** computation of the theory&lt;br /&gt;
** and for the array allocation stuff.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Describe, but don't implement in the .h and .cc, what would be appropriate unit tests for these three modules.&lt;br /&gt;
&lt;br /&gt;
Email in your source code and the git log file of all your commits as a .zip or .tar file by email to rzon@scinethpc.ca and ljdursi@scinethpc.ca by 9:00 am on Thursday January 31, 2013.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
This week, we learned about object oriented programming, which fits nicely within the modular programming idea.  In this homework, we are going to use some of it to restructure our code and get it ready to add the tracer particle, the goal of the course project. &lt;br /&gt;
&lt;br /&gt;
The goal will be to have an instance of a &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt; class,&lt;br /&gt;
as well as an instance of &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt;, which for now will be a&lt;br /&gt;
free particle moving as ('''x'''(t),'''y'''(t)) = ('''x'''(0) +&lt;br /&gt;
'''vx''' t, '''y'''(0) + '''vy''' t), without any coupling yet (we&lt;br /&gt;
will handle this next week).&lt;br /&gt;
&lt;br /&gt;
To be more specific:&lt;br /&gt;
* Clean up your code, using the feedback from your HW2 grading, such that the modules are as independent as possible. &lt;br /&gt;
* If you have not done so yet, add comments to the header files of your modules to explain exactly what each function does (without going into implementation details), what its arguments mean and what it returns (unless it's a void function, of course). &lt;br /&gt;
* Objectify the &amp;lt;tt&amp;gt;main&amp;lt;/tt&amp;gt; routine, by creating a class &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt;.&lt;br /&gt;
* Put this class in its own module (declaration in .h, implementation in .cc). For instance, the declaration could be&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.h&lt;br /&gt;
#ifndef DIFFUSIONH&lt;br /&gt;
#define DIFFUSIONH&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
class Diffusion {&lt;br /&gt;
  public:&lt;br /&gt;
    Diffusion(float x1, float x2, float D, int numPoints);&lt;br /&gt;
    void init(float a0, float sigma0); // set initial field&lt;br /&gt;
    void timeStep(float dt);           // solve diff. equation over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    float getRho(int i, int j);        // get a value of the field&lt;br /&gt;
    ~Diffusion();&lt;br /&gt;
  private:&lt;br /&gt;
    float*** rho;&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
#endif&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(this is not supposed to be prescriptive.)&lt;br /&gt;
* In the implementation file you'd have things like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.cc&lt;br /&gt;
#include &amp;quot;diffusion.h&amp;quot;&lt;br /&gt;
...&lt;br /&gt;
void Diffusion::timeStep(float dt) &lt;br /&gt;
{&lt;br /&gt;
   // code for the timeStep ...&lt;br /&gt;
}&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(note the inclusion of the module's header file on the top of the implementation, so the class is declared).&lt;br /&gt;
* Let &amp;lt;tt&amp;gt;int main()&amp;lt;/tt&amp;gt; have the same functionality as before, but now by defining the parameters of the run, creating an object of this class, setting up file streams, and taking time steps and writing out by using calls to member functions of this object. &lt;br /&gt;
* Additionally, write a class &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt; which for now implements a free particle in 2d. Something like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class Tracer {&lt;br /&gt;
  public:&lt;br /&gt;
    Tracer(float x1, float x2);&lt;br /&gt;
    void init(float x0, float y0, float vx, float vy);&lt;br /&gt;
    void timeStep(float dt);           // solve diff. equation over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    ~Tracer();&lt;br /&gt;
  private:&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
:The timeStep implementation can in this case use the infamous forward Euler integration scheme, because it happens to be exact here.&lt;br /&gt;
:When it comes to output to an npy file, let's view the data of the tracer particle at one point in time as a 2x2 matrix &amp;lt;tt&amp;gt;[[x,y],[vx,vy]]&amp;lt;/tt&amp;gt;, so we can use much of the npy output code that we used for the diffusion field, which was a (numPoints+2)x(numPoints+2) matrix.&lt;br /&gt;
* This class too should be its own module (Often, &amp;quot;one class, one module&amp;quot; is a good paradigm, though occasionally you'll have closely related classes).&lt;br /&gt;
* Add some code to int main to  have the Tracer particle evolve at the same time as the diffusion field (although the two are completely uncoupled).&lt;br /&gt;
* Keep using git and make, run the tests that you have regularly to make sure your program still works.&lt;br /&gt;
&lt;br /&gt;
Note that because we've now set up our program in a modular fashion, you can do&lt;br /&gt;
different parts of this assignment in any order you want.  For instance, to wrap your head around object oriented programming, you may like implementing the tracer particle first, so that your diffusion code stays intact.  Or you might want to wait with commenting until the end if you think you'll have to change a module for this assignment.&lt;br /&gt;
&lt;br /&gt;
Email in your source code and the git log file of all your commits as a .zip or .tar file by email to rzon@scinethpc.ca and ljdursi@scinethpc.ca by &lt;br /&gt;
&amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;3:00 pm on Friday February 8, 2013&amp;lt;/span&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
In this homework, we are going to implement the class project of a tracer particle coupled to a diffusion equation. &lt;br /&gt;
The full specification of the physical problem is [[Media:ScClassProject.pdf|here]].  &lt;br /&gt;
* Augment the tracer particle to include a force in the x and in the y direction, and a friction coefficient alpha, which at first can be constant.&lt;br /&gt;
* Implement the so-called leapfrog integration algorithm for the tracer particle&lt;br /&gt;
:::v &amp;amp;larr; v + f(v) &amp;amp;Delta;t / m&lt;br /&gt;
:::r &amp;amp;larr; r + v &amp;amp;Delta;t&lt;br /&gt;
:where v, r, and f are 2d vectors and f(v) is the total, velocity-dependent force specified in the class project, i.e., the sum of the external force F=qE and the friction force -&amp;amp;alpha;v.&amp;lt;br/&amp;gt;(Note: the v dependence of f makes this strictly not a leapfrog integration, but we'll ignore that here.) A minimal sketch of this update step is given after this list.&lt;br /&gt;
* Further augment the tracer class with a member function 'couple' which takes a diffusion field as input, and adjusts the friction constant. &lt;br /&gt;
* Your implementation of the 'couple' member function will need to interpolate the diffusion field to the current position of the particle. Use [[Media:CppInterpolation.tgz|this interpolation module]].&lt;br /&gt;
* Rewrite your main routine so that the coupling is called before the tracer's time step. You may need to modify the Diffusion class a bit to get &amp;lt;tt&amp;gt;rho[active]&amp;lt;/tt&amp;gt; out.&lt;br /&gt;
* For simplicity, use the same time step for both the diffusion and the tracer particle.&lt;br /&gt;
* Keep using git and make.&lt;br /&gt;
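&lt;br /&gt;
A minimal sketch of that update step for the tracer is given below; the member names (q, Ex, Ey, alpha, m, x, y, vx, vy) are assumptions about your own Tracer class, and the force is the F=qE plus friction combination described above:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Illustrative update step; member names and the constant external field are assumptions.&lt;br /&gt;
void Tracer::timeStep(float dt)&lt;br /&gt;
{&lt;br /&gt;
    float fx = q*Ex - alpha*vx;   // total x force: external plus friction&lt;br /&gt;
    float fy = q*Ey - alpha*vy;   // total y force&lt;br /&gt;
    vx += fx*dt/m;                // kick:  v goes to v + f(v) dt / m&lt;br /&gt;
    vy += fy*dt/m;&lt;br /&gt;
    x  += vx*dt;                  // drift: r goes to r + v dt&lt;br /&gt;
    y  += vy*dt;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;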
&lt;br /&gt;
You will hand in your source code, makefiles and the git log file of all your commits by email by &amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;9:00 am on Thursday February 21, 2013&amp;lt;/span&amp;gt;.  Email the files, preferably zipped or tarred, to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
=Part 2: Numerical Tools for Physical Scientists=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or solid c++ programming skills, including make and unix/linux prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin), and Python (Enthought) installed on your laptop.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
February 12, 14, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
March 5, 7, 12, and 14, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' Numerics ===&lt;br /&gt;
:::[[File:Lecture9-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.html]]&lt;br /&gt;
:::[[Media:Lecture9-2013-Numerics.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Random numbers ===&lt;br /&gt;
:::[[File:Lecture10-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.html]]&lt;br /&gt;
:::[[Media:Lecture10-2013-PRNG.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW1_2 Homework assignment 1]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Numerical integration and ODEs ===&lt;br /&gt;
:::[[File:Lecture11-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.html]]&lt;br /&gt;
:::[[Media:Lecture11-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Molecular Dynamics ===&lt;br /&gt;
:::[[File:Lecture12-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.html]]&lt;br /&gt;
:::[[Media:Lecture12-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW2_2 Homework assignment 2]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Linear Algebra part I ===&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 6)]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' Linear Algebra part II and PDEs===&lt;br /&gt;
:::[[File:Lecture14-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.html]]&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 5)]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW3_2 Homework assignment 3]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Fast Fourier Transform===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.html]]&lt;br /&gt;
:::[[Media:Lecture15-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[[Media:Sincfftw.cc|example code]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' FFT for real and multidimensional data===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.html]]&lt;br /&gt;
:::[[Media:Lecture16-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW4_2 Homework assignment 4]&lt;br /&gt;
&lt;br /&gt;
==Homework Assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
This week's homework consists of two assignments.&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Consider the sequence of numbers: 1 followed by 10&amp;lt;sup&amp;gt;8&amp;lt;/sup&amp;gt; values of 10&amp;lt;sup&amp;gt;-8&amp;lt;/sup&amp;gt;&lt;br /&gt;
* Should sum to 2&lt;br /&gt;
* Write code which sums up those values in order. What answer does it get?&lt;br /&gt;
* Add to the program a routine which sums up the values in reverse order. Does it get the correct answer?&lt;br /&gt;
* How would you get the correct answer? (A minimal sketch of the forward sum is given after this list.)&lt;br /&gt;
* Submit code, Makefile, text file with answers.&lt;br /&gt;
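&lt;br /&gt;
A minimal sketch of the forward (in-order) sum; single precision is assumed here, since that is where the effect is most dramatic:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Sum 1 followed by 1e8 values of 1e-8, in order; the exact answer is 2.&lt;br /&gt;
// Single precision is assumed to make the round-off behaviour easy to see.&lt;br /&gt;
#include &amp;lt;cstdio&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    const int n = 100000000;     // 1e8 terms of 1e-8&lt;br /&gt;
    float sum = 1.0f;            // the leading 1&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++)&lt;br /&gt;
        sum += 1.e-8f;           // each term is tiny relative to sum: what happens?&lt;br /&gt;
    std::printf(&amp;quot;forward sum = %f\n&amp;quot;, sum);&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;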
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Implement a linear congruential generator with a = 106, c = 1283, m = 6075 that generates random numbers in the range 0..1 (a minimal sketch is given after this list).&lt;br /&gt;
* Using that and MT: generate 10,000 pairs (dx, dy) with dx, dy each in -0.1 .. +0.1. Generate histograms of dx and dy (say 200 bins). Do they look okay? What would you expect the variation to be?&lt;br /&gt;
* For 10,000 points: take random walks from (x,y)=(0,0) until the walk exceeds a radius of 2, then stop. Plot a histogram of the final angles for the two pseudo random number generators. What do you see?&lt;br /&gt;
* Submit makefile, code, plots, git log.&lt;br /&gt;
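&lt;br /&gt;
A minimal sketch of such a generator; the seed handling and the decision to return a double are assumptions:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Linear congruential generator with the given (deliberately modest) parameters;&lt;br /&gt;
// returns uniform deviates in [0,1).  Seed handling is an assumption.&lt;br /&gt;
static unsigned int lcg_state = 1;&lt;br /&gt;
&lt;br /&gt;
double lcg_uniform()&lt;br /&gt;
{&lt;br /&gt;
    const unsigned int a = 106, c = 1283, m = 6075;&lt;br /&gt;
    lcg_state = (a*lcg_state + c) % m;               // x_{n+1} = (a x_n + c) mod m&lt;br /&gt;
    return lcg_state/static_cast&amp;lt;double&amp;gt;(m);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Note that the period of this generator can be at most m = 6075.&lt;br /&gt;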
&lt;br /&gt;
Both assignments due on Thursday Feb 28th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW2===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Compute numerically (using the GSL):&lt;br /&gt;
&lt;br /&gt;
::&amp;amp;int;&amp;lt;sub&amp;gt;0&amp;lt;/sub&amp;gt;&amp;lt;sup&amp;gt;3&amp;lt;/sup&amp;gt; f(x) &amp;amp;nbsp;dx&lt;br /&gt;
&lt;br /&gt;
:(that is the integral of f(x) from x=0 to x=3)&lt;br /&gt;
&lt;br /&gt;
:with&lt;br /&gt;
&lt;br /&gt;
::f(x) = ln(x) sin(x) e&amp;lt;sup&amp;gt;-x&amp;lt;/sup&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:using three different methods:&lt;br /&gt;
# Extended Simpson's rule (a minimal sketch is given at the end of this assignment)&lt;br /&gt;
# Gauss-Legendre quadrature&lt;br /&gt;
# Monte Carlo sampling &lt;br /&gt;
&lt;br /&gt;
*Hint: what is f(0)?&lt;br /&gt;
&lt;br /&gt;
*Compare the convergence of these methods by increasing the number of function evaluations.&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log. &lt;br /&gt;
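&lt;br /&gt;
For the Simpson part, a minimal sketch of the composite (extended) rule applied to this integrand is given below; note the special handling of the x = 0 endpoint, where the limit of f is 0 even though ln(0) is not finite:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Composite (extended) Simpson's rule for f(x) = ln(x) sin(x) exp(-x) on [a,b].&lt;br /&gt;
// n must be even; the x = 0 endpoint is handled via the limit f(0+) = 0.&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double f(double x)&lt;br /&gt;
{&lt;br /&gt;
    if (x == 0.0) return 0.0;            // ln(x) sin(x) exp(-x) goes to 0 as x goes to 0&lt;br /&gt;
    return log(x)*sin(x)*exp(-x);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
double simpson(double a, double b, int n)&lt;br /&gt;
{&lt;br /&gt;
    double h = (b - a)/n;&lt;br /&gt;
    double s = f(a) + f(b);&lt;br /&gt;
    for (int i = 1; i &amp;lt; n; i++)&lt;br /&gt;
        s += f(a + i*h) * ((i % 2) ? 4.0 : 2.0);   // interior weights 4,2,4,...,2,4&lt;br /&gt;
    return s*h/3.0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;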
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Using an adaptive 4th order Runge-Kutta approach, with a relative accuracy of 1e-4, compute the solution for t = [0,100] of the following set of coupled ODEs (Lorenz oscillator)&lt;br /&gt;
&lt;br /&gt;
::dx/dt = &amp;amp;sigma;(y - x)&lt;br /&gt;
&lt;br /&gt;
::dy/dt = (&amp;amp;rho;-z)x-y&lt;br /&gt;
&lt;br /&gt;
::dz/dt = xy - &amp;amp;beta;z&lt;br /&gt;
&lt;br /&gt;
:with &amp;amp;sigma;=10; &amp;amp;beta;=8/3; &amp;amp;rho; = 28, and with initial conditions&lt;br /&gt;
&lt;br /&gt;
::x(0) = 10&lt;br /&gt;
&lt;br /&gt;
::y(0) = 20&lt;br /&gt;
&lt;br /&gt;
::z(0) = 30&lt;br /&gt;
&lt;br /&gt;
* Hint: study the GSL documentation (a minimal sketch of the right-hand side, in the form the GSL ODE routines expect, is given below).&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log.&lt;br /&gt;
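&lt;br /&gt;
A minimal sketch of the Lorenz right-hand side, written with the call signature the GSL ODE routines expect; the struct name and the way parameters are passed are assumptions, and the driver/stepper setup itself is left to the GSL manual, as the hint suggests:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Lorenz right-hand side in the form used by the GSL ODE solvers;&lt;br /&gt;
// sigma, rho, beta are passed through the params pointer (an assumed struct).&lt;br /&gt;
struct LorenzParams { double sigma, rho, beta; };&lt;br /&gt;
&lt;br /&gt;
int lorenz_rhs(double t, const double y[], double dydt[], void *params)&lt;br /&gt;
{&lt;br /&gt;
    (void)t;                                    // the system is autonomous&lt;br /&gt;
    const LorenzParams *p = static_cast&amp;lt;const LorenzParams*&amp;gt;(params);&lt;br /&gt;
    dydt[0] = p-&amp;gt;sigma*(y[1] - y[0]);           // dx/dt&lt;br /&gt;
    dydt[1] = (p-&amp;gt;rho - y[2])*y[0] - y[1];      // dy/dt&lt;br /&gt;
    dydt[2] = y[0]*y[1] - p-&amp;gt;beta*y[2];         // dz/dt&lt;br /&gt;
    return 0;                                   // GSL_SUCCESS&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;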
&lt;br /&gt;
Both assignments due on Thursday Mar 7th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
&lt;br /&gt;
Part 1:&lt;br /&gt;
&lt;br /&gt;
The time-explicit formulation of the 1d diffusion equation looks like this:&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{eqnarray*}&lt;br /&gt;
q^{n+1} &amp;amp; = &amp;amp; q^n + \frac{D \Delta t}{\Delta x^2} &lt;br /&gt;
\left (&lt;br /&gt;
\begin{matrix}&lt;br /&gt;
-2 &amp;amp; 1 \\&lt;br /&gt;
1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; \cdots &amp;amp; \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; &amp;amp; 1 &amp;amp; -2 \\&lt;br /&gt;
\end{matrix}&lt;br /&gt;
\right ) q^n \\&lt;br /&gt;
&amp;amp; = &amp;amp; \left ( 1 + \frac{D \Delta t}{\Delta x^2} A \right ) q^n&lt;br /&gt;
\end{eqnarray*}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What are the eigenvalues of the matrix A?   What modes would we expect to be amplified/damped by this operator?&lt;br /&gt;
&lt;br /&gt;
* Consider 100 points in the discretization (eg, A is 100x100)&lt;br /&gt;
* Calculate the eigenvalues and eigenvectors (using D__EV ; which sort of matrix are we using here?)&lt;br /&gt;
* Plot the modes with the largest and smallest absolute-value of eigenvalues, and explain their physical significance&lt;br /&gt;
* The numerical method becomes unstable when one eigenmode $v$ begins to grow uncontrollably whenever it is present, i.e.&lt;br /&gt;
$ \frac{D \Delta t}{\Delta x^2} A v = \frac{D \Delta t}{\Delta x^2} \lambda v &amp;gt; v$.   In a timestepping solution, the only way to avoid this for a given physical set of parameters and grid size is to reduce the timestep, $\Delta t$.   Use the largest-magnitude eigenvalue to place a constraint on $\Delta t$ for stability.&lt;br /&gt;
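&lt;br /&gt;
For reference, a sketch of the standard reasoning you are being asked to confirm numerically: the eigenvalues of this tridiagonal matrix are known in closed form,&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\lambda_k = -2 + 2\cos\left(\frac{k\pi}{N+1}\right), \qquad k = 1,\ldots,N,&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
so that $-4 &amp;lt; \lambda_k &amp;lt; 0$.  Each step multiplies the mode $v_k$ by the factor&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
g_k = 1 + \frac{D \Delta t}{\Delta x^2}\lambda_k ,&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
and requiring $|g_k| \le 1$ for the most negative eigenvalue (which approaches -4) gives the constraint&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\Delta t \le \frac{\Delta x^2}{2 D}.&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;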
&lt;br /&gt;
Part 2:&lt;br /&gt;
&lt;br /&gt;
Using the above constraint on $\Delta t$, for a 1d grid of size 100 (eg, a 100x100 matrix A), using lapack, evolve this PDE. Plot and explain results.&lt;br /&gt;
&lt;br /&gt;
* Have an initial condition of $q(x=0,t=0) = 1$, and $q(t=0)$ everywhere else being zero (eg, hot plate just turned on at the left)&lt;br /&gt;
* Take ~100 timesteps and plot the evolution of $q(x,t)$ at 5 times over that period.&lt;br /&gt;
* You’ll want to use a BLAS routine to compute the matrix-vector multiply ( http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html). Do the multiply in double precision (D__MV). Which one should you use?&lt;br /&gt;
* The GSL has a cblas interface, http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html ; an example of its use can be found here http://www.gnu.org/software/gsl/manual/html_node/GSL-CBLAS-Examples.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Important things to know about lapack:&lt;br /&gt;
* If you are using an nxn array, the “leading dimension” of the array is n. (This argument is so that you could work on sub-matrices if you wanted)&lt;br /&gt;
* You have to make sure the 2d array is a contiguous block of memory.&lt;br /&gt;
* You'll (presumably) want to use the C bindings for LAPACK - [http://www.netlib.org/lapack/lapacke.html lapacke].  Note that the usual C arrays are row-major.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Here's a simple example of calling a LAPACKE routine; note that how the matrix is described (here with a pointer to the data, a leading dimension, and the number of rows and columns) will vary with different types of matrix:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;mkl_lapacke.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m);&lt;br /&gt;
void free_matrix(double **a);&lt;br /&gt;
&lt;br /&gt;
int main (int argc, const char * argv[])&lt;br /&gt;
{&lt;br /&gt;
&lt;br /&gt;
   const int n=5;             // number of rows, columns of the matrix&lt;br /&gt;
   const int m = n;           // nrows&lt;br /&gt;
   const int leading_dim_A=n; // leading dimension (# of cols for row major);&lt;br /&gt;
                              // lets us operate on sub-matrices in principle&lt;br /&gt;
   const int leading_dim_b=n; // similarly for b&lt;br /&gt;
   double **A;&lt;br /&gt;
   double *b;&lt;br /&gt;
&lt;br /&gt;
   b = new double[leading_dim_b];&lt;br /&gt;
   A = matrix(n,leading_dim_A);&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;n; i++)&lt;br /&gt;
       for (int j=0; j&amp;lt;leading_dim_A; j++)&lt;br /&gt;
            A[i][j] = 0.;&lt;br /&gt;
&lt;br /&gt;
   // let's do a trivial solve&lt;br /&gt;
   // It should be pretty clear that the solution to this system&lt;br /&gt;
   // is x = {0,1,2...n-1}&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_A; i++) {&lt;br /&gt;
        A[i][i] = 2.;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_b; i++) {&lt;br /&gt;
        b[i]    = 2*i;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   const char transpose='N';     //solve Ax=b, not A^T x = b&lt;br /&gt;
   const int  nrhs = 1;          //  we're only solving 1 right hand side&lt;br /&gt;
   int info;&lt;br /&gt;
&lt;br /&gt;
   // Call DGELS; b will be overwritten with the value of x.&lt;br /&gt;
   info = LAPACKE_dgels(LAPACK_COL_MAJOR,transpose,m,n,nrhs,&lt;br /&gt;
                          &amp;amp;(A[0][0]),leading_dim_A, &amp;amp;(b[0]),leading_dim_b);&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
   // print results&lt;br /&gt;
   for(int i=0;i&amp;lt;n;i++)&lt;br /&gt;
   {&lt;br /&gt;
      if (i != n/2)&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;    &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
      else&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;x = &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
   }&lt;br /&gt;
   free_matrix(A);            // release the matrix and the right-hand side&lt;br /&gt;
   delete[] b;&lt;br /&gt;
&lt;br /&gt;
   return(info);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m) {&lt;br /&gt;
   double **a = new double * [n];&lt;br /&gt;
   a[0] = new double [n*m];&lt;br /&gt;
&lt;br /&gt;
   for (int i=1; i&amp;lt;n; i++)&lt;br /&gt;
         a[i] = &amp;amp;a[0][i*m];&lt;br /&gt;
&lt;br /&gt;
   return a;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
void free_matrix(double **a) {&lt;br /&gt;
   delete[] a[0];&lt;br /&gt;
   delete[] a;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
Trigonometric interpolation uses an n-point Fourier series to find values at intermediate points. It is one way of interpolating data onto a finer grid, and it was a motivation for Gauss, who applied it to planetary motion.&lt;br /&gt;
&lt;br /&gt;
The way it works is:&lt;br /&gt;
&lt;br /&gt;
# You fourier-transform your data&lt;br /&gt;
# You add frequencies above the Nyquist frequency (in absolute value), but set all the amplitudes of the new frequencies to zero.&lt;br /&gt;
# Note that the frequencies are stored such that e.g. f&amp;lt;sub&amp;gt;n-1&amp;lt;/sub&amp;gt; corresponds to the low frequency -1.&lt;br /&gt;
# The resulting 2n array can be back transformed, and now gives an interpolated signal.&lt;br /&gt;
&lt;br /&gt;
For this assignment, write an application that will read in an image from a binary file into a 2d double precision array (this will require converting from bytes to doubles), and creates an image twice the size in all directions using trigonometric interpolation. Use a real-to-half-complex version of the fftw (note: in 2d, this version of the fftw mixes fourier components with the same physical magnitude of their wave number k, so this will work).&lt;br /&gt;
You can process the red, green and blue values separately. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
Write an application which reads an image and performs a low pass filter on the image, i.e., any fourier components with magnitudes k larger than n/8 are to be set to zero, after which the fourier inverse is taken and the image is written out to disk again. Use the same fft technique as in the first assignment.&lt;br /&gt;
&lt;br /&gt;
'''Input image'''&lt;br /&gt;
&lt;br /&gt;
Use [[Media:gauss256.tgz|this image of Gauss]].&lt;br /&gt;
&lt;br /&gt;
'''Image format:'''&lt;br /&gt;
&lt;br /&gt;
Use the following simple PPM format:&lt;br /&gt;
&lt;br /&gt;
First line (ascii): 'P6\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Second line, in ascii, 'width height\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Third line, in ascii, 'maxcolorvalue\n' (this is typically just 255)&amp;lt;br&amp;gt;&lt;br /&gt;
Following that, in binary, are byte-triplets with the red, green and blue values of each pixel.&amp;lt;br&amp;gt;&lt;br /&gt;
Note: in C, the 'unsigned char' data type matches the concept of a byte best (for most machines anyway).&lt;br /&gt;
&lt;br /&gt;
In fact, between the first and second line, one can have comment lines that start with '#'.&lt;br /&gt;
&lt;br /&gt;
=Part 3: High Performance Scientific Computing=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or good c++ programming skills, including make and unix/linux prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
You will need to bring a laptop with a ssh facility. Hands-on parts will be done on SciNet's GPC cluster.&lt;br /&gt;
&lt;br /&gt;
For those who don't have a SciNet account yet, the instructions can be found at http://wiki.scinethpc.ca/wiki/index.php/Essentials#Accounts&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
March 19, 21, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
April 2, 4, 9, and 11, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
===''Lecture 1:'' Introduction to Parallel Programming ===&lt;br /&gt;
:::[[File:Lecture17-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.html]]&lt;br /&gt;
:::[[Media:Lecture17-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Parallel Computing Paradigms ===&lt;br /&gt;
&lt;br /&gt;
:::[[File:Lecture18-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.html]]&lt;br /&gt;
:::[[Media:Lecture18-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1_3|homework 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lectures 3,4:''  Shared Memory Programming with OpenMP, part 1,2===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture19-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Distributed Parallel Programming with MPI, part 1===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture21-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' Distributed Parallel Programming with MPI, part 2===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture22-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Distributed Parallel Programming with MPI, part 3===&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' Hybrid OpenMP+MPI Programming===&lt;br /&gt;
&lt;br /&gt;
== Homework assignments ==&lt;br /&gt;
&lt;br /&gt;
=== HW1 ===&lt;br /&gt;
&lt;br /&gt;
* Read the SciNet tutorial (as it pertains to the GPC)&lt;br /&gt;
* Read the GPC Quick Start.&lt;br /&gt;
* Get the first set of code:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
   $ cd $SCRATCH&lt;br /&gt;
   $ git clone /scinet/course/sc3/homework1&lt;br /&gt;
   $ cd homework1&lt;br /&gt;
   $ source setup&lt;br /&gt;
   $ make&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
*This contains threaded program 'blurppm' and 266 ppm images to be blurred. Usage:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  blurppm INPUTPPM OUTPUTPPM BLURRADIUS NUMBEROFTHREADS&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
* Simple test:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  $ qsub -l nodes=1:ppn=8,walltime=2:00:00 -I -X -qdebug&lt;br /&gt;
  $ cd $SCRATCH/homework1&lt;br /&gt;
  $ time blurppm 001.ppm new001.ppm 30 1&lt;br /&gt;
  real  0m52.900s&lt;br /&gt;
  user  0m52.881s&lt;br /&gt;
  sys   0m0.008s&lt;br /&gt;
  $ display 001.ppm &amp;amp;&lt;br /&gt;
  $ display new001.ppm &amp;amp;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
* Time blurppm with a BLURRADIUS ranging from 1 to 41 in steps of 4, and for NUMBEROFTHREADS ranging from 1 to 16.  Record the (real) duration of each run.&lt;br /&gt;
* Plot the duration as a function of NUMBEROFTHREADS, as well as  the speed-up and efficiency.&lt;br /&gt;
* Submit the script and plots of the duration, speedup and efficiency as a function of NUMBEROFTHREADS.&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
* Use GNU parallel to run blurppm on all 266 images with a radius of 41.&lt;br /&gt;
* Investigate different scenarios:&lt;br /&gt;
:# Have GNU parallel run 16 at a time with just 1 thread.&lt;br /&gt;
:# Have GNU parallel run 8 at a time with 2 threads.&lt;br /&gt;
:# Have GNU parallel run 4 at a time with 4 threads.&lt;br /&gt;
:# Have GNU parallel run 2 at a time with 8 threads.&lt;br /&gt;
:# Have GNU parallel run 1 at a time with 16 threads.&lt;br /&gt;
:Record the total time it takes in each of these scenarios.&lt;br /&gt;
* Repeat this with a BLURRADIUS of 3.&lt;br /&gt;
* Submit scripts, timing data  and plots.&lt;br /&gt;
&lt;br /&gt;
=== HW2 ===&lt;br /&gt;
&lt;br /&gt;
In the course materials ( /scinet/course/ppp/nbodyc or nbodyf ) there is the source code for a serial N-body integrator.  This, like the molecular dynamics code you've seen earlier, calculates the long-range forces exerted by each particle on all of the other particles.&lt;br /&gt;
&lt;br /&gt;
Parallelize the force calculation with OpenMP, and present timing results for 1, 4, and 8 threads compared to the serial version.  Note that you can turn off graphic output by removing the &amp;quot;USEPGPLOT = -DPGPLOT&amp;quot; line in Makefile.inc in the top level directory.&lt;br /&gt;
&lt;br /&gt;
Begin by doubling the work by _not_ calculating two forces at once (eg, not making use of f&amp;lt;sub&amp;gt;ji&amp;lt;/sub&amp;gt; = -f&amp;lt;sub&amp;gt;ij&amp;lt;/sub&amp;gt;), and simply parallelizing the outer force loop.  Then find a way to implement the forces efficiently but also in parallel.  Is there any other part of the problem which could usefully be parallelized?&lt;br /&gt;
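&lt;br /&gt;
A minimal sketch of the first, work-doubling version is given below; the variable names and the softened force law are illustrative assumptions, not the actual code in the course directory:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Illustrative only: names and the force law are assumptions, not the course's nbody code.&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
&lt;br /&gt;
void forces(int n, const double *x, const double *y, const double *m,&lt;br /&gt;
            double *fx, double *fy)&lt;br /&gt;
{&lt;br /&gt;
    const double G = 1.0, eps = 1.e-3;   // units and softening: assumed&lt;br /&gt;
    #pragma omp parallel for schedule(static)&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++) {&lt;br /&gt;
        fx[i] = fy[i] = 0.;&lt;br /&gt;
        for (int j = 0; j &amp;lt; n; j++) {   // every pair visited twice (2x work),&lt;br /&gt;
            if (j == i) continue;        // but each thread only writes its own i&lt;br /&gt;
            double dx = x[j]-x[i], dy = y[j]-y[i];&lt;br /&gt;
            double r  = sqrt(dx*dx + dy*dy + eps*eps);&lt;br /&gt;
            fx[i] += G*m[i]*m[j]*dx/(r*r*r);&lt;br /&gt;
            fy[i] += G*m[i]*m[j]*dy/(r*r*r);&lt;br /&gt;
        }&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
For the more efficient version that exploits f&amp;lt;sub&amp;gt;ji&amp;lt;/sub&amp;gt; = -f&amp;lt;sub&amp;gt;ij&amp;lt;/sub&amp;gt;, one common approach is to give each thread its own private force accumulators (or to use atomic updates), so that updating both particles of a pair does not cause race conditions.&lt;br /&gt;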
&lt;br /&gt;
=== HW3 ===&lt;br /&gt;
&lt;br /&gt;
In the same area (  /scinet/course/ppp/diffusion/diffusion.c ) there is a 1d diffusion problem of a form you may recognize from earlier modules.   Your task is to parallelize this with MPI.  I'd encourage you to use the graphics output while you're developing this (use -DPGPLOT on the compile line); then omit the graphics, use a larger totpoints (say 16000), and run timings on 1, 2, 4, and 8 processors.   Note that for this simple problem, we don't necessarily expect a huge speedup; it's already pretty fast.&lt;br /&gt;
&lt;br /&gt;
I'd suggest doing this in steps:&lt;br /&gt;
&lt;br /&gt;
* include mpi.h, add mpi_init/finalize/comm_size/comm_rank, and compile with mpicc and make sure it runs;&lt;br /&gt;
* calculate the local number of points from the total number of points and the size (and possibly rank); treat the case where the total number of points isn't divisible by the number of processors however you like, but make sure you're consistent about it.&lt;br /&gt;
* Once you've calculated the local number of points, you won't need the variable totpoints; no arrays will be declared of that size, no plots will be made of that size, etc.   Make those changes, compile, and run on (say) 2 procs.&lt;br /&gt;
* Now you'll find that you'll need to figure out the local xleft and local xright of the domain; again, once this is done you won't need to know the global variables any more.   Make those changes, compile, and run.&lt;br /&gt;
* Finally, after the &amp;quot;old&amp;quot; boundary condition setting, do the internal boundary conditions, as in our example in class on Tuesday, sending messages around to our neighbours (a minimal sketch of such an exchange follows below).&lt;br /&gt;
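&lt;br /&gt;
A minimal sketch of that guard-cell exchange is given below.  It assumes each rank stores its local points in x[1..localn] with guard cells x[0] and x[localn+1]; that layout, and the function name, are assumptions for illustration, not necessarily what diffusion.c uses:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Illustrative guard-cell exchange; array layout and names are assumptions.&lt;br /&gt;
#include &amp;lt;mpi.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
void exchange_guardcells(double *x, int localn, int rank, int size)&lt;br /&gt;
{&lt;br /&gt;
    const int tag = 1;&lt;br /&gt;
    int left  = (rank == 0)      ? MPI_PROC_NULL : rank - 1;  // no-ops at the edges&lt;br /&gt;
    int right = (rank == size-1) ? MPI_PROC_NULL : rank + 1;&lt;br /&gt;
    MPI_Status status;&lt;br /&gt;
&lt;br /&gt;
    // send my last real point right, receive my left guard cell from the left&lt;br /&gt;
    MPI_Sendrecv(&amp;amp;x[localn], 1, MPI_DOUBLE, right, tag,&lt;br /&gt;
                 &amp;amp;x[0],      1, MPI_DOUBLE, left,  tag,&lt;br /&gt;
                 MPI_COMM_WORLD, &amp;amp;status);&lt;br /&gt;
    // send my first real point left, receive my right guard cell from the right&lt;br /&gt;
    MPI_Sendrecv(&amp;amp;x[1],        1, MPI_DOUBLE, left,  tag,&lt;br /&gt;
                 &amp;amp;x[localn+1], 1, MPI_DOUBLE, right, tag,&lt;br /&gt;
                 MPI_COMM_WORLD, &amp;amp;status);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Using MPI_PROC_NULL for the ranks at the physical boundaries turns those sends and receives into no-ops, so the same call works on every rank.&lt;br /&gt;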
&lt;br /&gt;
=Links=&lt;br /&gt;
&lt;br /&gt;
==Unix==&lt;br /&gt;
* Cygwin: http://www.cygwin.com&lt;br /&gt;
* Linux Command Line: A Primer (June 2012) [[Media:SS_IntroToShell.pdf|Slides,]] [[Media:SS_IntroToShell.tgz|Files]]&lt;br /&gt;
* Intro to unix shell from software carpentry: http://software-carpentry.org/4_0/shell&lt;br /&gt;
&lt;br /&gt;
==C/C++==&lt;br /&gt;
* [[One-Day Scientific C++ Class]] at SciNet&lt;br /&gt;
* C++ library reference: http://www.cplusplus.com/reference&lt;br /&gt;
* C preprocessor: http://www.cprogramming.com/tutorial/cpreprocessor.html&lt;br /&gt;
* Boost: http://www.boost.org&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==Git==&lt;br /&gt;
* Git: http://git-scm.com&lt;br /&gt;
* Version Control: [http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Thursday_Morning_BP_Revision_Control/Thursday_Morning_BP_Revision_Control.mp4 Video]/ [[Media:Snug_techtalk_revcontrol.pdf | Slides]]&lt;br /&gt;
* Git cheat sheet from Git Tower: http://www.git-tower.com/files/cheatsheet/Git_Cheat_Sheet_grey.pdf&lt;br /&gt;
&lt;br /&gt;
==Python==&lt;br /&gt;
* Python: http://www.python.org&lt;br /&gt;
* IPython: http://ipython.org&lt;br /&gt;
* Matplotlib: http://www.matplotlib.org&lt;br /&gt;
* Enthought python distribution: http://www.enthought.com/products/edudownload.php&amp;lt;br/&amp;gt;&lt;br /&gt;
(this gives you numpy, matplotlib and ipython all installed in one fell swoop)&lt;br /&gt;
&lt;br /&gt;
* Intro to python from software carpentry: http://software-carpentry.org/4_0/python&lt;br /&gt;
* Tutorial on matplotlib: http://conference.scipy.org/scipy2011/tutorials.php#jonathan&lt;br /&gt;
* Npy file format: https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==ODEs==&lt;br /&gt;
* Integrators for particle based ODEs (i.e. molecular dynamics): http://www.chem.utoronto.ca/~rzon/simcourse/partmd.pdf. &amp;lt;br&amp;gt;'''Focus on 4.1.4 - 4.1.6 for practical aspects.'''&lt;br /&gt;
* Numerical algorithm to solve ODEs (General) in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Chapter 16&lt;br /&gt;
&lt;br /&gt;
==Interpolation (2D) ==&lt;br /&gt;
* Interpolation in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Pages 123-128&lt;br /&gt;
* Wikipedia pages on [http://en.wikipedia.org/wiki/Bilinear_interpolation Bilinear Interpolation] and [http://en.wikipedia.org/wiki/Bicubic_interpolation Bicubic Interpolation] are not bad either.&lt;br /&gt;
&lt;br /&gt;
==BLAS==&lt;br /&gt;
* [http://www.tacc.utexas.edu/tacc-projects/gotoblas2 gotoblas]&lt;br /&gt;
* [http://math-atlas.sourceforge.net/ ATLAS]&lt;br /&gt;
&lt;br /&gt;
==LAPACK==&lt;br /&gt;
* http://www.netlib.org/lapack&lt;br /&gt;
&lt;br /&gt;
==GSL==&lt;br /&gt;
* GNU Scientific Library: http://www.gnu.org/s/gsl&lt;br /&gt;
&lt;br /&gt;
==FFT==&lt;br /&gt;
* FFTW: http://www.fftw.org&lt;br /&gt;
&lt;br /&gt;
==Top500==&lt;br /&gt;
* TOP500 Supercomputing Sites: http://top500.org&lt;br /&gt;
&lt;br /&gt;
==OpenMP==&lt;br /&gt;
* OpenMP (open multi-processing) application programming interface for shared memory programming: http://openmp.org&lt;br /&gt;
&lt;br /&gt;
==GNU parallel==&lt;br /&gt;
* Official citation: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.&lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page contains options, such as how to make sure the output is not all scrambled, and examples.&lt;br /&gt;
&lt;br /&gt;
==SciNet==&lt;br /&gt;
&lt;br /&gt;
Anything on this wiki, really, but specifically:&lt;br /&gt;
* [[Essentials|SciNet Essentials]]&lt;br /&gt;
* [[GPC Quickstart]]&lt;br /&gt;
* [[Media:SciNet_Tutorial.pdf |SciNet User Tutorial]]&lt;br /&gt;
* [[Software and Libraries]]&lt;br /&gt;
&lt;br /&gt;
==Other Resources==&lt;br /&gt;
* [http://galileo.phys.virginia.edu/classes/551.jvn.fall01/goldberg.pdf What Every Computer Scientist Should Know About Floating-Point Arithmetic] - the classic (and extremely comprehensive) overview of the basics of floating point math.   The first few pages, in particular, are very useful.&lt;br /&gt;
* [http://arxiv.org/abs/1005.4117 Random Numbers In Scientific Computing: An Introduction] by Katzgraber.   A very lucid discussion of pseudo random number generators for science.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=File:Lecture22-2013.pdf&amp;diff=5922</id>
		<title>File:Lecture22-2013.pdf</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=File:Lecture22-2013.pdf&amp;diff=5922"/>
		<updated>2013-04-04T13:38:37Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5921</id>
		<title>Scientific Computing Course</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5921"/>
		<updated>2013-04-04T13:38:12Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Lecture 5: Distributed Parallel Programming with MPI, part 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;''This wiki page concerns the 2013 installment of SciNet's Scientific Computing course. Material from the previous installment can be found on [[Scientific Software Development Course]], [[Numerical Tools for Physical Scientists (course)]], and [[High Performance Scientific Computing]]''&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
=Syllabus=&lt;br /&gt;
&lt;br /&gt;
==About the course==&lt;br /&gt;
* Whole-term graduate course&lt;br /&gt;
* Prerequisite: basic C, C++ or Fortran experience.&lt;br /&gt;
* Will use `C++ light' and Python&lt;br /&gt;
* Topics include: Scientific computing and programming skills, Parallel programming, and Hybrid programming.  &lt;br /&gt;
&lt;br /&gt;
There are three parts to this course:&lt;br /&gt;
&lt;br /&gt;
# Scientific Software Development: Jan/Feb 2013&amp;lt;br&amp;gt;''python, C++, git, make, modular programming, debugging''&lt;br /&gt;
# Numerical Tools for Physical Scientists: Feb/Mar 2013&amp;lt;br&amp;gt;''modelling, floating point, Monte Carlo, ODE, linear algebra, fft''&lt;br /&gt;
# High Performance Scientific Computing: Mar/Apr 2013&amp;lt;br&amp;gt;''openmp, mpi and hybrid programming''&lt;br /&gt;
&lt;br /&gt;
Each part consists of eight one-hour lectures, two per week.&lt;br /&gt;
&lt;br /&gt;
These can be taken separately by astrophysics graduate students at the University of Toronto as mini-courses, and by physics graduate students at the University of Toronto as modular courses.&lt;br /&gt;
&lt;br /&gt;
The first two parts count towards the SciNet Certificate in Scientific Computing, while the third part can count towards the SciNet HPC Certificate. For more info about the SciNet Certificates, see http://www.scinethpc.ca/2012/12/scinet-hpc-certificate-program.&lt;br /&gt;
&lt;br /&gt;
==Location and Times==&lt;br /&gt;
[http://www.scinethpc.ca/2010/08/contact-us SciNet HeadQuarters]&amp;lt;br&amp;gt;&lt;br /&gt;
256 McCaul Street, Toronto, ON&amp;lt;br&amp;gt;&lt;br /&gt;
Room 229 (Conference Room)&amp;lt;br&amp;gt;&lt;br /&gt;
Tuesdays 11:00 am - 12:00 noon&amp;lt;br&amp;gt;&lt;br /&gt;
Thursdays 11:00 am - 12:00 noon&lt;br /&gt;
&lt;br /&gt;
==Instructors and office hours==&lt;br /&gt;
&lt;br /&gt;
* Ramses van Zon - 256 McCaul Street, Rm 228 - Mondays 3-4pm&lt;br /&gt;
* L. Jonathan Dursi - 256 McCaul Street, Rm 216 - Wednesdays 3-4pm&lt;br /&gt;
&lt;br /&gt;
==Grading scheme==&lt;br /&gt;
&lt;br /&gt;
Attendance at lectures.&lt;br /&gt;
&lt;br /&gt;
Four homework sets (one per week), to be returned by email by 9:00 am the following Thursday.&lt;br /&gt;
&lt;br /&gt;
==Sign up==&lt;br /&gt;
Sign up for this graduate course goes through SciNet's course website.&amp;lt;br&amp;gt;The direct link is https://support.scinet.utoronto.ca/courses/?q=node/99.&amp;lt;br&amp;gt;  If you do not have a SciNet account but wish to register for this course, please email support@scinet.utoronto.ca . &amp;lt;br&amp;gt;&lt;br /&gt;
Sign up is closed.&lt;br /&gt;
&lt;br /&gt;
=Part 1: Scientific Software Development=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Some programming experience. Some unix prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need:'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin), and Python 2, IPython, Numpy, SciPy and Matplotlib (all of which you get if you use the Enthought distribution) installed on your laptop. Links are given at the bottom of this page.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
January 15, 17, 22, 24, 29, and 31, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
February 5 and 7, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics (with lecture slides and recordings)==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' C++ introduction===&lt;br /&gt;
:::[[File:Lecture1-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.html]]&lt;br /&gt;
:::[[Media:Lecture1-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' More C++, build and version control&amp;lt;br&amp;gt;===&lt;br /&gt;
:::[[File:Lecture2-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.html]]&lt;br /&gt;
:::Guest lecturer: Michael Nolta (CITA) for the git portion of the lecture.&lt;br /&gt;
:::[[Media:Lecture2-2013.pdf|C++ and Make slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.mp4 C++ and Make video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[Media:Git-Nolta.pdf|Git slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1|Homework assignment 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Python and visualization===&lt;br /&gt;
:::[[File:Lecture3-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.html]]&lt;br /&gt;
:::[[Media:Lecture3-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Modular programming, refactoring, testing===&lt;br /&gt;
:::[[File:Lecture4-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.html]]&lt;br /&gt;
:::[[Media:Lecture4-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;  [[#HW2|Homework assignment 2]]&lt;br /&gt;
:::[http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc (course project source file)] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py (corresponding python movie generator)]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Object oriented programming===&lt;br /&gt;
:::[[Media:Lecture5-2013.pdf|Slides]]&lt;br /&gt;
:::Recordings of this lecture are missing, but you could view the videos of SciNet's [[One-Day Scientific C++ Class]], in particular the parts on classes, polymorphism, and inheritance.&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' ODE, interpolation===&lt;br /&gt;
:::[[File:Lecture6-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Lecture5-ODE.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW3|Homework assignment 3]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Development tools: debugging and profiling===&lt;br /&gt;
:::[[File:Lecture7-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Debugging.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' Objects in Python, linking C++ and Python===&lt;br /&gt;
:::[[File:Lecture8-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.html]]&lt;br /&gt;
:::[[Media:Lecture8-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
==Homework assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
&lt;br /&gt;
'''''Multi-file C++ program to create a data file'''''&lt;br /&gt;
&lt;br /&gt;
We’ve learned programming in basic C++, use of make and Makefiles to build projects, and local use of git for version control. In this first assignment, you’ll use these to make a multi-file C++ program, built with make, which computes and outputs a data file.&lt;br /&gt;
&lt;br /&gt;
* Start a git repository, and begin writing a C++ program to&lt;br /&gt;
:# Get an array size and a standard deviation from user input,&lt;br /&gt;
:# Allocate a 2d array (use the code given in lecture 2),&lt;br /&gt;
:# Store a 2d Gaussian with a maximum at the centre of the array and given standard deviation (in units of grid points),&lt;br /&gt;
:# Output that array to a text file,&lt;br /&gt;
:# Free the array, and exit. &lt;br /&gt;
* The output text file should contain just the data in text format, with a row of the file corresponding to a row of the array and with whitespace between the numbers. &lt;br /&gt;
* The 2d array creation/freeing routines should be in one file (with an associated header file), the gaussian calculation in another (ditto), and the output routine in a third, with the main program calling each of these. &lt;br /&gt;
* Use a makefile to build your code (add it to the repository).&lt;br /&gt;
* You can start with everything in one file, with hardcoded values for the sizes and standard deviation and a static array, then refactor things into multiple files, adding the other features. (A minimal sketch of the Gaussian fill is given after this list.)&lt;br /&gt;
* As a test, use the ipython executable that came with your Enthought python distribution to read your data and plot it.&amp;lt;br&amp;gt;If your data file is named ‘data.txt’, running the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ipython --pylab&lt;br /&gt;
In [1]: data = numpy.genfromtxt('data.txt') &lt;br /&gt;
In [2]: contour(data) &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
should give a nice contour plot of a 2-dimensional gaussian.&lt;br /&gt;
* Email your source code, makefile and the &amp;quot;git log&amp;quot; output of all your commits by 9:00 am on Thursday, Jan 24th, 2013. Please zip or tar these files together as one attachment, with a file name that includes your name and &amp;quot;HW1&amp;quot;.&lt;br /&gt;
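&lt;br /&gt;
As a very rough illustration (not meant to be prescriptive), the core of the Gaussian calculation could look something like the following sketch; the names nx, ny and sigma stand in for whatever you read from user input:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// gaussian.cc (sketch): fill a 2d array with a Gaussian centred on the array&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
&lt;br /&gt;
void fill_gaussian(float **f, int nx, int ny, float sigma)&lt;br /&gt;
{&lt;br /&gt;
    float xc = 0.5*(nx-1);    // centre of the array, in grid points&lt;br /&gt;
    float yc = 0.5*(ny-1);&lt;br /&gt;
    for (int i = 0; i &amp;lt; nx; i++)&lt;br /&gt;
        for (int j = 0; j &amp;lt; ny; j++) {&lt;br /&gt;
            float r2 = (i-xc)*(i-xc) + (j-yc)*(j-yc);&lt;br /&gt;
            f[i][j] = std::exp(-r2/(2.0*sigma*sigma));&lt;br /&gt;
        }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;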
&lt;br /&gt;
===HW2===&lt;br /&gt;
'''''Refactor legacy code to a modular project with unit tests'''''&lt;br /&gt;
&lt;br /&gt;
In class, today, we talked about modular programming and testing, and the project we’ll be working on for the next three weeks. This homework will start advancing on that project by working on the “legacy” code given to us by our supervisor ([http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc]), with a corresponding python plotting script ([http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py]), and whipping it into shape before we start adding new physics.&lt;br /&gt;
* Start a git repository for this project, and add the two files.&lt;br /&gt;
* Create a Makefile and add it to the repository.&lt;br /&gt;
* Since we have no tests, run the program with console output redirected to a file:&lt;br /&gt;
:&amp;lt;pre&amp;gt;$ diffuse &amp;gt; original-output.txt&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;''It turns out the code has a bug that can make the output different when the same code is run again, which obviously would not be good for a baseline test. Replace 'float error;' by 'float error=0.0;' to fix this.''&lt;br /&gt;
* Also save the two .npy output files, e.g. to original-data.npy and original-theory.npy. The triplet of files (original-output.txt, original-data.npy and original-theory.npy) serves as a baseline integrated test (add these to the repository). &lt;br /&gt;
* Then write a 'test' target in your makefile that:&lt;br /&gt;
** Runs 'diffuse' with output to a new file.&lt;br /&gt;
** Compares that file with the baseline test file, and compares the .npy files.&lt;br /&gt;
:: (hint: the unix commands diff and cmp can compare files).&lt;br /&gt;
* First refactoring: Move the global variables into the main routine.&lt;br /&gt;
* ''Chorus: Test your modified code, and commit.''&lt;br /&gt;
* Second refactoring: Extract a diffusion operator routine, that gets called from main.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Create a .cc/.h module for the diffusion operator.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Add two tests for the diffusion operator: for a constant and for a linear input field (&amp;lt;tt&amp;gt;rho[i][j]=a*i+b*j&amp;lt;/tt&amp;gt;). Add these to the test target in the makefile.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* More refactoring: Extract three more .cc/.h modules:&lt;br /&gt;
** for output (should not contain hardcoded filenames)    &lt;br /&gt;
** computation of the theory&lt;br /&gt;
** and for the array allocation stuff.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Describe, but don't implement in the .h and .cc, what would be appropriate unit tests for these three modules.&lt;br /&gt;
&lt;br /&gt;
Email your source code and the git log file of all your commits as a .zip or .tar file to rzon@scinethpc.ca and ljdursi@scinethpc.ca by 9:00 am on Thursday January 31, 2013.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
This week, we learned about object oriented programming, which fits nicely within the modular programming idea.  In this homework, we are going to use some of it to restructure our code and get it ready to add the tracer particle, the goal of the course project. &lt;br /&gt;
&lt;br /&gt;
The goal will be to have an instance of a &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt; class,&lt;br /&gt;
as well as an instance of &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt;, which for now will be a&lt;br /&gt;
free particle moving as ('''x'''(t),'''y'''(t)) = ('''x'''(0) +&lt;br /&gt;
'''vx''' t, '''y'''(0) + '''vy''' t), without any coupling yet (we&lt;br /&gt;
will handle this next week).&lt;br /&gt;
&lt;br /&gt;
To be more specific:&lt;br /&gt;
* Clean up your code, using the feedback from your HW2 grading, such that the modules are as independent as possible. &lt;br /&gt;
* If you have not done so yet, add comments to the header files of your modules to explain exactly what each function does (without going into implementation details), what its arguments mean and what it returns (unless it's a void function, of course). &lt;br /&gt;
* Objectify the &amp;lt;tt&amp;gt;main&amp;lt;/tt&amp;gt; routine, by creating a class &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt;.&lt;br /&gt;
* Put this class in its own module (declaration in .h, implementation in .cc). For instance, the declaration could be&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.h&lt;br /&gt;
#ifndef DIFFUSIONH&lt;br /&gt;
#define DIFFUSIONH&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
class Diffusion {&lt;br /&gt;
  public:&lt;br /&gt;
    Diffusion(float x1, float x2, float D, int numPoints);&lt;br /&gt;
    void init(float a0, float sigma0); // set initial field&lt;br /&gt;
    void timeStep(float dt);           // solve diff. equation over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    float getRho(int i, int j);        // get a value of the field&lt;br /&gt;
    ~Diffusion();&lt;br /&gt;
  private:&lt;br /&gt;
    float*** rho;&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
#endif&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(this is not supposed to be prescriptive.)&lt;br /&gt;
* In the implementation file you'd have things like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.cc&lt;br /&gt;
#include &amp;quot;diffusion.h&amp;quot;&lt;br /&gt;
...&lt;br /&gt;
void Diffusion::timeStep(float dt) &lt;br /&gt;
{&lt;br /&gt;
   // code for the timeStep ...&lt;br /&gt;
}&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(note the inclusion of the module's header file at the top of the implementation, so the class is declared).&lt;br /&gt;
* Let &amp;lt;tt&amp;gt;int main()&amp;lt;/tt&amp;gt; have the same functionality as before, but now by defining the parameters of the run, creating an object of this class, setting up file streams, and taking time steps and writing out through calls to member functions of this object (a minimal sketch is given after this list). &lt;br /&gt;
* Additionally, write a class &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt; which for now implements a free particle in 2d. Something like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class Tracer {&lt;br /&gt;
  public:&lt;br /&gt;
    Tracer(float x1, float x2);&lt;br /&gt;
    void init(float x0, float y0, float vx, float vy);&lt;br /&gt;
    void timeStep(float dt);           // solve diff. equation over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    ~Tracer();&lt;br /&gt;
  private:&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
:The timeStep implementation can in this case use the infamous forward Euler integration scheme, because it happens to be exact here.&lt;br /&gt;
:When it comes to output to an npy file, let's view the data of the tracer particle at one point in time as a 2x2 matrix &amp;lt;tt&amp;gt;[[x,y],[vx,vy]]&amp;lt;/tt&amp;gt;, so we can use much of the npy output code that we used for the diffusion field, which was a (numPoints+2)x(numPoints+2) matrix.&lt;br /&gt;
* This class too should be its own module (Often, &amp;quot;one class, one module&amp;quot; is a good paradigm, though occasionally you'll have closely related classes).&lt;br /&gt;
* Add some code to int main to  have the Tracer particle evolve at the same time as the diffusion field (although the two are completely uncoupled).&lt;br /&gt;
* Keep using git and make, run the tests that you have regularly to make sure your program still works.&lt;br /&gt;
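&lt;br /&gt;
For orientation only, here is a minimal sketch of what such a main routine could look like; the constructor arguments, file names and the tracer.h header name are illustrative, not prescriptive:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// main.cc (sketch only)&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
#include &amp;quot;diffusion.h&amp;quot;&lt;br /&gt;
#include &amp;quot;tracer.h&amp;quot;&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    const int   numPoints = 100;&lt;br /&gt;
    const float dt = 0.001;&lt;br /&gt;
    Diffusion d(-5.0, 5.0, 1.0, numPoints);   // domain, diffusion constant, grid&lt;br /&gt;
    Tracer    p(-5.0, 5.0);&lt;br /&gt;
    d.init(1.0, 0.5);&lt;br /&gt;
    p.init(0.0, 0.0, 1.0, 0.5);               // x0, y0, vx, vy&lt;br /&gt;
    std::ofstream fieldfile(&amp;quot;field.dat&amp;quot;, std::ios::binary);&lt;br /&gt;
    std::ofstream tracerfile(&amp;quot;tracer.dat&amp;quot;, std::ios::binary);&lt;br /&gt;
    for (int step = 0; step &amp;lt; 1000; step++) {&lt;br /&gt;
        d.timeStep(dt);&lt;br /&gt;
        p.timeStep(dt);     // evolves independently; the coupling comes later&lt;br /&gt;
        if (step % 100 == 0) {&lt;br /&gt;
            d.toFile(fieldfile);&lt;br /&gt;
            p.toFile(tracerfile);&lt;br /&gt;
            d.toScreen();&lt;br /&gt;
            p.toScreen();&lt;br /&gt;
        }&lt;br /&gt;
    }&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;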
&lt;br /&gt;
Note that because we've now set up our program in a modular fashion, you can do&lt;br /&gt;
different parts of this assignment in any order you want.  For instance, to wrap your head around object oriented programming, you may prefer to implement the tracer particle first, so that your diffusion code stays intact.  Or you might want to postpone commenting until the end if you think you'll have to change a module for this assignment.&lt;br /&gt;
&lt;br /&gt;
Email your source code and the git log file of all your commits as a .zip or .tar file to rzon@scinethpc.ca and ljdursi@scinethpc.ca by &lt;br /&gt;
&amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;3:00 pm on Friday February 8, 2013&amp;lt;/span&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
In this homework, we are going to implement the class project of a tracer particle coupled to a diffusion equation. &lt;br /&gt;
The full specification of the physical problem is [[Media:ScClassProject.pdf|here]].  &lt;br /&gt;
* Augment the tracer particle to include a force in the x and in the y direction, and a friction coefficient alpha, which at first can be constant.&lt;br /&gt;
* Implement the so-called leapfrog integration algorithm for the tracer particle&lt;br /&gt;
:::v &amp;amp;larr; v + f(v) &amp;amp;Delta;t / m&lt;br /&gt;
:::r &amp;amp;larr; r + v &amp;amp;Delta;t&lt;br /&gt;
:where v, r, and f are 2d vectors and f(v) is the total, velocity-dependent force specified in the class project, i.e., the sum of the external force F=qE and the friction force -&amp;amp;alpha;v.&amp;lt;br/&amp;gt;(Note: the v dependence of f makes this strictly not a leapfrog integration, but we'll ignore that here. A minimal sketch of one such step is given after this list.)&lt;br /&gt;
* Further augment the tracer class with a member function 'couple' which takes a diffusion field as input, and adjusts the friction constant. &lt;br /&gt;
* Your implementation of the 'couple' member function will need to interpolate the diffusion field to the current position of the particle. Use [[Media:CppInterpolation.tgz|this interpolation module]].&lt;br /&gt;
* Rewrite your main routine so that the coupling is called before the tracer's time step. You may need to modify the Diffusion class a bit to get &amp;lt;tt&amp;gt;rho[active]&amp;lt;/tt&amp;gt; out.&lt;br /&gt;
* For simplicity, use the same time step for both the diffusion and the tracer particle.&lt;br /&gt;
* Keep using git and make.&lt;br /&gt;
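&lt;br /&gt;
A minimal sketch of one such step is below; the member names Fx, Fy, alpha, m and the velocity/position members are purely illustrative:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// tracer.cc (sketch): one leapfrog-style step with a velocity-dependent force&lt;br /&gt;
void Tracer::timeStep(float dt)&lt;br /&gt;
{&lt;br /&gt;
    float fx = Fx - alpha*vx;   // external force plus friction, x component&lt;br /&gt;
    float fy = Fy - alpha*vy;&lt;br /&gt;
    vx += fx*dt/m;              // kick: update the velocities first&lt;br /&gt;
    vy += fy*dt/m;&lt;br /&gt;
    x  += vx*dt;                // drift: then update the positions&lt;br /&gt;
    y  += vy*dt;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;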
&lt;br /&gt;
You will hand in your source code, makefiles and the git log file of all your commits by email by &amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;9:00 am on Thursday February 21, 2013&amp;lt;/span&amp;gt;.  Email the files, preferably zipped or tarred, to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
=Part 2: Numerical Tools for Physical Scientists=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or solid C++ programming skills, including make and unix/linux prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin), and Python (Enthought) installed on your laptop.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
February 12, 14, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
March 5, 7, 12, and 14, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' Numerics ===&lt;br /&gt;
:::[[File:Lecture9-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.html]]&lt;br /&gt;
:::[[Media:Lecture9-2013-Numerics.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Random numbers ===&lt;br /&gt;
:::[[File:Lecture10-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.html]]&lt;br /&gt;
:::[[Media:Lecture10-2013-PRNG.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW1_2 Homework assignment 1]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Numerical integration and ODEs ===&lt;br /&gt;
:::[[File:Lecture11-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.html]]&lt;br /&gt;
:::[[Media:Lecture11-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Molecular Dynamics ===&lt;br /&gt;
:::[[File:Lecture12-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.html]]&lt;br /&gt;
:::[[Media:Lecture12-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW2_2 Homework assignment 2]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Linear Algebra part I ===&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 6)]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' Linear Algebra part II and PDEs===&lt;br /&gt;
:::[[File:Lecture14-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.html]]&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 5)]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW3_2 Homework assignment 3]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Fast Fourier Transform===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.html]]&lt;br /&gt;
:::[[Media:Lecture15-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[[Media:Sincfftw.cc|example code]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' FFT for real and multidimensional data===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.html]]&lt;br /&gt;
:::[[Media:Lecture16-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW4_2 Homework assignment 4]&lt;br /&gt;
&lt;br /&gt;
==Homework Assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
This week's homework consists of two assignments.&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Consider the sequence of numbers: 1 followed by 10&amp;lt;sup&amp;gt;8&amp;lt;/sup&amp;gt; values of 10&amp;lt;sup&amp;gt;-8&amp;lt;/sup&amp;gt;&lt;br /&gt;
* These should sum to 2.&lt;br /&gt;
* Write code which sums up those values in order. What answer does it get?&lt;br /&gt;
* Add a routine to the program which sums up the values in reverse order. Does it get the correct answer?&lt;br /&gt;
* How would you get the correct answer? (A minimal sketch of the two summation orders is given after this list.)&lt;br /&gt;
* Submit code, Makefile, text file with answers.&lt;br /&gt;
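&lt;br /&gt;
A minimal sketch of the two summation orders, in single precision where the effect is easy to see (whether either order gives 2, and what to do about it, is the point of the assignment):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// sumorder.cc (sketch): 1 followed by 10^8 values of 10^-8, summed both ways&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    const int n = 100000000;       // 10^8 small terms&lt;br /&gt;
&lt;br /&gt;
    float forward = 1.0f;          // start with the large term&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++)&lt;br /&gt;
        forward += 1.0e-8f;&lt;br /&gt;
&lt;br /&gt;
    float backward = 0.0f;         // accumulate the small terms first&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++)&lt;br /&gt;
        backward += 1.0e-8f;&lt;br /&gt;
    backward += 1.0f;&lt;br /&gt;
&lt;br /&gt;
    std::cout &amp;lt;&amp;lt; &amp;quot;in order:      &amp;quot; &amp;lt;&amp;lt; forward  &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
    std::cout &amp;lt;&amp;lt; &amp;quot;reverse order: &amp;quot; &amp;lt;&amp;lt; backward &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;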
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Implement a linear congruential generator with a = 106, c = 1283, m = 6075 that generates random numbers from 0 to 1 (a minimal sketch is given after this list).&lt;br /&gt;
* Using that and the Mersenne Twister (MT): generate 10,000 pairs (dx, dy) with dx, dy each in -0.1 .. +0.1. Generate histograms of dx and dy (say 200 bins). Do they look okay? What would you expect the variation to be?&lt;br /&gt;
* For 10,000 points: take random walks from (x,y)=(0,0) until they exceed a radius of 2, then stop. Plot a histogram of the final angles for the two pseudo random number generators. What do you see?&lt;br /&gt;
* Submit makefile, code, plots, git log.&lt;br /&gt;
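&lt;br /&gt;
A minimal sketch of the LCG part, with the given constants; mapping its output onto the (dx, dy) pairs and comparing with the Mersenne Twister is left to you:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// lcg.cc (sketch): linear congruential generator x -&amp;gt; (a*x + c) mod m,&lt;br /&gt;
// returned as a floating point number in [0,1)&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
&lt;br /&gt;
class LCG {&lt;br /&gt;
  public:&lt;br /&gt;
    LCG(unsigned long seed = 1) : a(106), c(1283), m(6075), x(seed % m) {}&lt;br /&gt;
    double next() {&lt;br /&gt;
        x = (a*x + c) % m;&lt;br /&gt;
        return double(x)/m;&lt;br /&gt;
    }&lt;br /&gt;
  private:&lt;br /&gt;
    const unsigned long a, c, m;&lt;br /&gt;
    unsigned long x;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    LCG rng(42);&lt;br /&gt;
    for (int i = 0; i &amp;lt; 5; i++)&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; rng.next() &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;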
&lt;br /&gt;
Both assignments due on Thursday Feb 28th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW2===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Compute numerically (using the GSL):&lt;br /&gt;
&lt;br /&gt;
::&amp;amp;int;&amp;lt;sub&amp;gt;0&amp;lt;/sub&amp;gt;&amp;lt;sup&amp;gt;3&amp;lt;/sup&amp;gt; f(x) &amp;amp;nbsp;dx&lt;br /&gt;
&lt;br /&gt;
:(that is the integral of f(x) from x=0 to x=3)&lt;br /&gt;
&lt;br /&gt;
:with&lt;br /&gt;
&lt;br /&gt;
::f(x) = ln(x) sin(x) e&amp;lt;sup&amp;gt;-x&amp;lt;/sup&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:using three different methods:&lt;br /&gt;
# Extended Simpson's rule&lt;br /&gt;
# Gauss-Legendre quadrature&lt;br /&gt;
# Monte Carlo sampling &lt;br /&gt;
&lt;br /&gt;
*Hint: what is f(0)?&lt;br /&gt;
&lt;br /&gt;
*Compare the convergence of these methods by increasing the number of function evaluations. (A minimal Gauss-Legendre sketch using the GSL is given after this list.)&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log. &lt;br /&gt;
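&lt;br /&gt;
As one possible starting point (assuming the GSL is installed; link with -lgsl -lgslcblas -lm), here is a minimal Gauss-Legendre sketch; extended Simpson's rule and the Monte Carlo estimate are left to you:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// glquad.cc (sketch): fixed-order Gauss-Legendre quadrature with the GSL&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;gsl/gsl_integration.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double f(double x, void *params)&lt;br /&gt;
{&lt;br /&gt;
    // ln(x) sin(x) exp(-x); the integrand goes to 0 as x goes to 0 from above&lt;br /&gt;
    if (x &amp;lt;= 0.0) return 0.0;&lt;br /&gt;
    return log(x)*sin(x)*exp(-x);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    gsl_function F;&lt;br /&gt;
    F.function = &amp;amp;f;&lt;br /&gt;
    F.params   = 0;&lt;br /&gt;
&lt;br /&gt;
    // double the number of quadrature points to watch the convergence&lt;br /&gt;
    for (int n = 2; n &amp;lt;= 64; n *= 2) {&lt;br /&gt;
        gsl_integration_glfixed_table *t = gsl_integration_glfixed_table_alloc(n);&lt;br /&gt;
        double result = gsl_integration_glfixed(&amp;amp;F, 0.0, 3.0, t);&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; n &amp;lt;&amp;lt; &amp;quot; points: &amp;quot; &amp;lt;&amp;lt; result &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
        gsl_integration_glfixed_table_free(t);&lt;br /&gt;
    }&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;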
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Using an adaptive 4th order Runge-Kutta approach, with a relative accuracy of 1e-4, compute the solution for t = [0,100] of the following set of coupled ODEs (Lorenz oscillator)&lt;br /&gt;
&lt;br /&gt;
::dx/dt = &amp;amp;sigma;(y - x)&lt;br /&gt;
&lt;br /&gt;
::dy/dt = (&amp;amp;rho;-z)x-y&lt;br /&gt;
&lt;br /&gt;
::dz/dt = xy - &amp;amp;beta;z&lt;br /&gt;
&lt;br /&gt;
:with &amp;amp;sigma;=10; &amp;amp;beta;=8/3; &amp;amp;rho; = 28, and with initial conditions&lt;br /&gt;
&lt;br /&gt;
::x(0) = 10&lt;br /&gt;
&lt;br /&gt;
::y(0) = 20&lt;br /&gt;
&lt;br /&gt;
::z(0) = 30&lt;br /&gt;
&lt;br /&gt;
* Hint: study the GSL documentation. (A minimal driver sketch using the GSL ODE solver is given after this list.)&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log.&lt;br /&gt;
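&lt;br /&gt;
A minimal driver sketch using the GSL ODE solver (the rkf45 stepper is an embedded Runge-Kutta 4(5) pair; link with -lgsl -lgslcblas -lm); output handling and plotting are up to you:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// lorenz.cc (sketch): adaptive integration of the Lorenz system with the GSL&lt;br /&gt;
#include &amp;lt;cstdio&amp;gt;&lt;br /&gt;
#include &amp;lt;gsl/gsl_errno.h&amp;gt;&lt;br /&gt;
#include &amp;lt;gsl/gsl_odeiv2.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int lorenz(double t, const double y[], double dydt[], void *params)&lt;br /&gt;
{&lt;br /&gt;
    const double sigma = 10.0, rho = 28.0, beta = 8.0/3.0;&lt;br /&gt;
    dydt[0] = sigma*(y[1] - y[0]);&lt;br /&gt;
    dydt[1] = (rho - y[2])*y[0] - y[1];&lt;br /&gt;
    dydt[2] = y[0]*y[1] - beta*y[2];&lt;br /&gt;
    return GSL_SUCCESS;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    gsl_odeiv2_system sys = {lorenz, 0, 3, 0};      // no Jacobian, no parameters&lt;br /&gt;
    gsl_odeiv2_driver *d =&lt;br /&gt;
        gsl_odeiv2_driver_alloc_y_new(&amp;amp;sys, gsl_odeiv2_step_rkf45,&lt;br /&gt;
                                      1e-6, 0.0, 1e-4); // hstart, epsabs, epsrel&lt;br /&gt;
    double t = 0.0;&lt;br /&gt;
    double y[3] = {10.0, 20.0, 30.0};               // initial conditions&lt;br /&gt;
&lt;br /&gt;
    for (int i = 1; i &amp;lt;= 1000; i++) {&lt;br /&gt;
        double ti = 100.0*i/1000;                   // 1000 output points on [0,100]&lt;br /&gt;
        if (gsl_odeiv2_driver_apply(d, &amp;amp;t, ti, y) != GSL_SUCCESS) break;&lt;br /&gt;
        printf(&amp;quot;%g %g %g %g\n&amp;quot;, t, y[0], y[1], y[2]);&lt;br /&gt;
    }&lt;br /&gt;
    gsl_odeiv2_driver_free(d);&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;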
&lt;br /&gt;
Both assignments due on Thursday Mar 7th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
&lt;br /&gt;
Part 1:&lt;br /&gt;
&lt;br /&gt;
The time-explicit formulation of the 1d diffusion equation looks like this:&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{eqnarray*}&lt;br /&gt;
q^{n+1} &amp;amp; = &amp;amp; q^n + \frac{D \Delta t}{\Delta x^2} &lt;br /&gt;
\left (&lt;br /&gt;
\begin{matrix}&lt;br /&gt;
-2 &amp;amp; 1 \\&lt;br /&gt;
1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; \cdots &amp;amp; \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; &amp;amp; 1 &amp;amp; -2 \\&lt;br /&gt;
\end{matrix}&lt;br /&gt;
\right ) q^n \\&lt;br /&gt;
&amp;amp; = &amp;amp; \left ( 1 + \frac{D \Delta t}{\Delta x^2} A \right ) q^n&lt;br /&gt;
\end{eqnarray*}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What are the eigenvalues of the matrix A?   What modes would we expect to be amplified/damped by this operator?&lt;br /&gt;
&lt;br /&gt;
* Consider 100 points in the discretization (e.g., A is 100x100)&lt;br /&gt;
* Calculate the eigenvalues and eigenvectors (using D__EV; which sort of matrix are we using here?)&lt;br /&gt;
* Plot the modes with the largest and smallest absolute-value of eigenvalues, and explain their physical significance&lt;br /&gt;
* The numerical method will become unstable when one eigenmode $v$ begins to grow uncontrollably whenever it is present, i.e.&lt;br /&gt;
$ \frac{D \Delta t}{\Delta x^2} A v = \frac{D \Delta t}{\Delta x^2} \lambda v &amp;gt; v$.   In a timestepping solution, the only way to avoid this for a given physical set of parameters and grid size is to reduce the timestep, $\Delta t$.   Use the largest absolute value eigenvalue to place a constraint on $\Delta t$ for stability.&lt;br /&gt;
&lt;br /&gt;
Part 2:&lt;br /&gt;
&lt;br /&gt;
Using the above constraint on $\Delta t$, for a 1d grid of size 100 (e.g., a 100x100 matrix A), evolve this PDE using lapack. Plot and explain the results.&lt;br /&gt;
&lt;br /&gt;
* Have an initial condition of $q(x=0,t=0) = 1$, and $q(t=0)$ everywhere else zero (e.g., a hot plate just turned on at the left)&lt;br /&gt;
* Take ~100 timesteps and plot the evolution of $q(x,t)$ at 5 times over that period.&lt;br /&gt;
* You’ll want to use a BLAS call to compute the matrix-vector multiply ( http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html). Do the multiply in double precision (D__MV). Which one should you use?&lt;br /&gt;
* The GSL has a cblas interface, http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html ; an example of its use can be found here http://www.gnu.org/software/gsl/manual/html_node/GSL-CBLAS-Examples.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Important things to know about lapack:&lt;br /&gt;
* If you are using an nxn array, the “leading dimension” of the array is n. (This argument is so that you could work on sub-matrices if you wanted)&lt;br /&gt;
* You have to make sure the 2d array is a contiguous block of memory&lt;br /&gt;
* You'll (presumably) want to use the C bindings for LAPACK - [http://www.netlib.org/lapack/lapacke.html lapacke].  Note that the usual C arrays are row-major.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Here's a simple example of calling a LAPACKE routine; note that how the matrix is described (here with a pointer to the data, a leading dimension, and the number of rows and columns) will vary with different types of matrix:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;mkl_lapacke.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m);&lt;br /&gt;
void free_matrix(double **a);&lt;br /&gt;
&lt;br /&gt;
int main (int argc, const char * argv[])&lt;br /&gt;
{&lt;br /&gt;
&lt;br /&gt;
   const int n=5;             // number of rows, columns of the matrix&lt;br /&gt;
   const int m = n;           // nrows&lt;br /&gt;
   const int leading_dim_A=n; // leading dimension (# of cols for row major);&lt;br /&gt;
                              // lets us operate on sub-matrices in principle&lt;br /&gt;
   const int leading_dim_b=n; // similarly for b&lt;br /&gt;
   double **A;&lt;br /&gt;
   double *b;&lt;br /&gt;
&lt;br /&gt;
   b = new double[leading_dim_b];&lt;br /&gt;
   A = matrix(n,leading_dim_A);&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;n; i++)&lt;br /&gt;
       for (int j=0; j&amp;lt;leading_dim_A; j++)&lt;br /&gt;
            A[i][j] = 0.;&lt;br /&gt;
&lt;br /&gt;
   // let's do a trivial solve&lt;br /&gt;
   // It should be pretty clear that the solution to this system&lt;br /&gt;
   // is x = {0,1,2...n-1}&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_A; i++) {&lt;br /&gt;
        A[i][i] = 2.;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_b; i++) {&lt;br /&gt;
        b[i]    = 2*i;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   const char transpose='N';     //solve Ax=b, not A^T x = b&lt;br /&gt;
   const int  nrhs = 1;          //  we're only solving 1 right hand side&lt;br /&gt;
   int info;&lt;br /&gt;
&lt;br /&gt;
   // Call DGELS; b will be overwritten with the value of x.&lt;br /&gt;
   info = LAPACKE_dgels(LAPACK_COL_MAJOR,transpose,m,n,nrhs,&lt;br /&gt;
                          &amp;amp;(A[0][0]),leading_dim_A, &amp;amp;(b[0]),leading_dim_b);&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
   // print results&lt;br /&gt;
   for(int i=0;i&amp;lt;n;i++)&lt;br /&gt;
   {&lt;br /&gt;
      if (i != n/2)&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;    &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
      else&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;x = &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
   }&lt;br /&gt;
   free_matrix(A);    // release the storage allocated by matrix()&lt;br /&gt;
   delete[] b;&lt;br /&gt;
   return(info);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m) {&lt;br /&gt;
   double **a = new double * [n];&lt;br /&gt;
   a[0] = new double [n*m];&lt;br /&gt;
&lt;br /&gt;
   for (int i=1; i&amp;lt;n; i++)&lt;br /&gt;
         a[i] = &amp;amp;a[0][i*m];&lt;br /&gt;
&lt;br /&gt;
   return a;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
void free_matrix(double **a) {&lt;br /&gt;
   delete[] a[0];&lt;br /&gt;
   delete[] a;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
Trigonometric interpolation uses an n-point Fourier series to find values at intermediate points. It is one way of downscaling data, and was a motivation for Gauss, who applied it to planetary motion.&lt;br /&gt;
&lt;br /&gt;
The way it works is:&lt;br /&gt;
&lt;br /&gt;
# You Fourier-transform your data.&lt;br /&gt;
# You add frequencies above the Nyquist frequency (in absolute value), but set all the amplitudes of the new frequencies to zero.&lt;br /&gt;
# Note that the frequencies are stored such that, e.g., f&amp;lt;sub&amp;gt;n-1&amp;lt;/sub&amp;gt; is the low frequency -1.&lt;br /&gt;
# The resulting 2n array can be transformed back, and now gives an interpolated signal.&lt;br /&gt;
&lt;br /&gt;
For this assignment, write an application that reads in an image from a binary file into a 2d double precision array (this will require converting from bytes to doubles), and creates an image twice the size in all directions using trigonometric interpolation. Use a real-to-half-complex version of FFTW (note: in 2d, this version of the FFT mixes Fourier components with the same physical magnitude of their wave number k, so this will work).&lt;br /&gt;
You can process the red, green and blue values separately. A minimal 1D sketch of the zero-padding step is given below. &lt;br /&gt;
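&lt;br /&gt;
As a minimal, purely illustrative 1D sketch of the zero-padding step with FFTW's real-to-half-complex transforms (link with -lfftw3; the 2d image version is left to you):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// triginterp1d.cc (sketch): interpolate n real samples onto 2n points&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;vector&amp;gt;&lt;br /&gt;
#include &amp;lt;fftw3.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    const int n = 16, m = 2*n;&lt;br /&gt;
    std::vector&amp;lt;double&amp;gt; x(n), X(n), Y(m, 0.0), y(m);&lt;br /&gt;
&lt;br /&gt;
    for (int j = 0; j &amp;lt; n; j++)         // a smooth periodic test signal&lt;br /&gt;
        x[j] = sin(2*M_PI*j/n) + 0.5*cos(4*M_PI*j/n);&lt;br /&gt;
&lt;br /&gt;
    fftw_plan fwd = fftw_plan_r2r_1d(n, &amp;amp;x[0], &amp;amp;X[0], FFTW_R2HC, FFTW_ESTIMATE);&lt;br /&gt;
    fftw_execute(fwd);&lt;br /&gt;
&lt;br /&gt;
    // Half-complex layout for size n: X[k] is the real part of frequency k&lt;br /&gt;
    // (k = 0..n/2) and X[n-k] the imaginary part (k = 1..n/2-1).  Copy these&lt;br /&gt;
    // into the size-2n half-complex array Y; the new frequencies stay zero.&lt;br /&gt;
    for (int k = 0; k &amp;lt; n/2; k++)&lt;br /&gt;
        Y[k] = X[k];                     // real parts&lt;br /&gt;
    Y[n/2] = 0.5*X[n/2];                 // one convention: split the old Nyquist term&lt;br /&gt;
    for (int k = 1; k &amp;lt; n/2; k++)&lt;br /&gt;
        Y[m-k] = X[n-k];                 // imaginary parts&lt;br /&gt;
&lt;br /&gt;
    fftw_plan bwd = fftw_plan_r2r_1d(m, &amp;amp;Y[0], &amp;amp;y[0], FFTW_HC2R, FFTW_ESTIMATE);&lt;br /&gt;
    fftw_execute(bwd);&lt;br /&gt;
&lt;br /&gt;
    // FFTW transforms are unnormalized, so divide by the original length n;&lt;br /&gt;
    // even-indexed points then reproduce the original samples, odd-indexed&lt;br /&gt;
    // points are the interpolated values.&lt;br /&gt;
    for (int j = 0; j &amp;lt; m; j++)&lt;br /&gt;
        y[j] /= n;&lt;br /&gt;
&lt;br /&gt;
    for (int j = 0; j &amp;lt; n; j++)&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; x[j] &amp;lt;&amp;lt; &amp;quot;  &amp;quot; &amp;lt;&amp;lt; y[2*j] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
&lt;br /&gt;
    fftw_destroy_plan(fwd);&lt;br /&gt;
    fftw_destroy_plan(bwd);&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;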
&lt;br /&gt;
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
Write an application which reads an image and performs a low-pass filter on the image, i.e., any Fourier components with magnitudes k larger than n/8 are to be set to zero, after which the inverse Fourier transform is taken and the image is written out to disk again. Use the same FFT technique as in the first assignment.&lt;br /&gt;
&lt;br /&gt;
'''Input image'''&lt;br /&gt;
&lt;br /&gt;
Use [[Media:gauss256.tgz|this image of Gauss]].&lt;br /&gt;
&lt;br /&gt;
'''Image format:'''&lt;br /&gt;
&lt;br /&gt;
Use the following simple PPM format:&lt;br /&gt;
&lt;br /&gt;
First line (ascii): 'P6\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Second line, in ascii, 'width height\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Third line, in ascii, 'maxcolorvalue\n' (this is typically just 255)&amp;lt;br&amp;gt;&lt;br /&gt;
Following that, in binary, are byte-triplets with the red, green and blue values of each pixel.&amp;lt;br&amp;gt;&lt;br /&gt;
Note: in C, the 'unsigned char' data type matches the concept of a byte best (for most machines anyway).&lt;br /&gt;
&lt;br /&gt;
In fact, between the first and second lines there can be comment lines that start with '#'. A minimal reader sketch is given below.&lt;br /&gt;
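&lt;br /&gt;
A minimal sketch of reading this format into per-channel double precision arrays (it skips error handling and the optional '#' comment lines for brevity):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// readppm.cc (sketch): read a binary P6 PPM file&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;string&amp;gt;&lt;br /&gt;
#include &amp;lt;vector&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char *argv[])&lt;br /&gt;
{&lt;br /&gt;
    if (argc &amp;lt; 2) { std::cerr &amp;lt;&amp;lt; &amp;quot;usage: readppm file.ppm\n&amp;quot;; return 1; }&lt;br /&gt;
&lt;br /&gt;
    std::ifstream f(argv[1], std::ios::binary);&lt;br /&gt;
    std::string magic;&lt;br /&gt;
    int width, height, maxval;&lt;br /&gt;
    f &amp;gt;&amp;gt; magic &amp;gt;&amp;gt; width &amp;gt;&amp;gt; height &amp;gt;&amp;gt; maxval;  // &amp;quot;P6&amp;quot;, dimensions, max colour value&lt;br /&gt;
    f.get();                         // consume the single whitespace after maxval&lt;br /&gt;
    if (magic != &amp;quot;P6&amp;quot;) { std::cerr &amp;lt;&amp;lt; &amp;quot;not a P6 file\n&amp;quot;; return 1; }&lt;br /&gt;
&lt;br /&gt;
    // byte triplets: red, green, blue for each pixel, row by row&lt;br /&gt;
    std::vector&amp;lt;unsigned char&amp;gt; raw(3*width*height);&lt;br /&gt;
    f.read(reinterpret_cast&amp;lt;char*&amp;gt;(&amp;amp;raw[0]), raw.size());&lt;br /&gt;
&lt;br /&gt;
    // convert to double precision, one array per colour channel&lt;br /&gt;
    std::vector&amp;lt;double&amp;gt; red(width*height), green(width*height), blue(width*height);&lt;br /&gt;
    for (int i = 0; i &amp;lt; width*height; i++) {&lt;br /&gt;
        red[i]   = raw[3*i + 0];&lt;br /&gt;
        green[i] = raw[3*i + 1];&lt;br /&gt;
        blue[i]  = raw[3*i + 2];&lt;br /&gt;
    }&lt;br /&gt;
&lt;br /&gt;
    std::cout &amp;lt;&amp;lt; width &amp;lt;&amp;lt; &amp;quot; x &amp;quot; &amp;lt;&amp;lt; height &amp;lt;&amp;lt; &amp;quot; image, maxval &amp;quot; &amp;lt;&amp;lt; maxval &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
    return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;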
&lt;br /&gt;
=Part 3: High Performance Scientific Computing=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or good C++ programming skills, including make and unix/linux prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
You will need to bring a laptop with an ssh client. Hands-on parts will be done on SciNet's GPC cluster.&lt;br /&gt;
&lt;br /&gt;
For those who don't have a SciNet account yet, the instructions can be found at http://wiki.scinethpc.ca/wiki/index.php/Essentials#Accounts&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
March 19, 21, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
April 2, 4, 9, and 11, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
===''Lecture 1:'' Introduction to Parallel Programming ===&lt;br /&gt;
:::[[File:Lecture17-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.html]]&lt;br /&gt;
:::[[Media:Lecture17-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Parallel Computing Paradigms ===&lt;br /&gt;
&lt;br /&gt;
:::[[File:Lecture18-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.html]]&lt;br /&gt;
:::[[Media:Lecture18-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1_3|homework 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lectures 3,4:''  Shared Memory Programming with OpenMP, part 1,2===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture19-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Distributed Parallel Programming with MPI, part 1===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture21-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' Distributed Parallel Programming with MPI, part 2===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture22-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
''Lecture 7''&amp;amp;nbsp;&amp;amp;nbsp; Distributed Parallel Programming with MPI, part 3&amp;lt;br&amp;gt;&lt;br /&gt;
''Lecture 8''&amp;amp;nbsp;&amp;amp;nbsp; Hybrid OpenMP+MPI Programming&lt;br /&gt;
&lt;br /&gt;
== Homework assignments ==&lt;br /&gt;
&lt;br /&gt;
=== HW1 ===&lt;br /&gt;
&lt;br /&gt;
* Read the SciNet tutorial (as it pertains to the GPC)&lt;br /&gt;
* Read the GPC Quick Start.&lt;br /&gt;
* Get the first set of code:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
   $ cd $SCRATCH&lt;br /&gt;
   $ git clone /scinet/course/sc3/homework1&lt;br /&gt;
   $ cd homework1&lt;br /&gt;
   $ source setup&lt;br /&gt;
   $ make&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
*This contains the threaded program 'blurppm' and 266 ppm images to be blurred. Usage:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  blurppm INPUTPPM OUTPUTPPM BLURRADIUS NUMBEROFTHREADS&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
* Simple test:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  $ qsub -l nodes=1:ppn=8,walltime=2:00:00 -I -X -qdebug&lt;br /&gt;
  $ cd $SCRATCH/homework1&lt;br /&gt;
  $ time blurppm 001.ppm new001.ppm 30 1&lt;br /&gt;
  real  0m52.900s&lt;br /&gt;
  user  0m52.881s&lt;br /&gt;
  sys   0m0.008s&lt;br /&gt;
  $ display 001.ppm &amp;amp;&lt;br /&gt;
  $ display new001.ppm &amp;amp;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
* Time blurppm with a BLURRADIUS ranging from 1 to 41 in steps of 4, and for NUMBEROFTHREADS ranging from 1 to 16.  Record the (real) duration of each run.&lt;br /&gt;
* Plot the duration as a function of NUMBEROFTHREADS, as well as  the speed-up and efficiency.&lt;br /&gt;
* Submit the script and plots of the duration, speedup and efficiency as a function of NUMBEROFTHREADS.&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
* Use GNU parallel to run blurppm on all 266 images with a radius of 41.&lt;br /&gt;
* Investigate different scenarios:&lt;br /&gt;
:# Have GNU parallel run 16 at a time with just 1 thread.&lt;br /&gt;
:# Have GNU parallel run 8 at a time with 2 threads.&lt;br /&gt;
:# Have GNU parallel run 4 at a time with 4 threads.&lt;br /&gt;
:# Have GNU parallel run 2 at a time with 8 threads.&lt;br /&gt;
:# Have GNU parallel run 1 at a time with 16 threads.&lt;br /&gt;
:Record the total time it takes in each of these scenarios.&lt;br /&gt;
* Repeat this with a BLURRADIUS of 3.&lt;br /&gt;
* Submit scripts, timing data  and plots.&lt;br /&gt;
&lt;br /&gt;
=== HW2 ===&lt;br /&gt;
&lt;br /&gt;
In the course materials ( /scinet/course/ppp/nbodyc or nbodyf ) there is the source code for a serial N-body integrator.  This, like the molecular dynamics code you've seen earlier, calculates the long-range forces exerted by each particle on all of the other particles.&lt;br /&gt;
&lt;br /&gt;
Parallelize the force calculation with OpenMP, and present timing results for 1, 4, and 8 threads compared to the serial version.  Note that you can turn off graphical output by removing the &amp;quot;USEPGPLOT = -DPGPLOT&amp;quot; line in Makefile.inc in the top level directory.&lt;br /&gt;
&lt;br /&gt;
Begin by doubling the work by _not_ calculating two forces at once (e.g., not making use of f&amp;lt;sub&amp;gt;ji&amp;lt;/sub&amp;gt; = -f&amp;lt;sub&amp;gt;ij&amp;lt;/sub&amp;gt;), and simply parallelizing the outer force loop (a minimal sketch is given below).  Then find a way to implement the forces efficiently but also in parallel.  Is there any other part of the problem which could usefully be parallelized?&lt;br /&gt;
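&lt;br /&gt;
A minimal sketch of that first step, shown in 2d for brevity; the array names x, y, m, fx, fy and the constant G are illustrative placeholders, not the names in the course code. Compile with your compiler's OpenMP flag (e.g. -fopenmp for gcc).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// forces.cc (sketch): parallelize the outer loop, recomputing each pair twice&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
&lt;br /&gt;
void forces(int n, const double *x, const double *y, const double *m,&lt;br /&gt;
            double *fx, double *fy, double G)&lt;br /&gt;
{&lt;br /&gt;
    #pragma omp parallel for&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++) {&lt;br /&gt;
        double fxi = 0.0, fyi = 0.0;    // private accumulators: no race on fx[i]&lt;br /&gt;
        for (int j = 0; j &amp;lt; n; j++) {&lt;br /&gt;
            if (j == i) continue;&lt;br /&gt;
            double dx = x[j] - x[i];&lt;br /&gt;
            double dy = y[j] - y[i];&lt;br /&gt;
            double r  = sqrt(dx*dx + dy*dy);&lt;br /&gt;
            double f  = G*m[i]*m[j]/(r*r*r);   // pair force divided by r&lt;br /&gt;
            fxi += f*dx;&lt;br /&gt;
            fyi += f*dy;&lt;br /&gt;
        }&lt;br /&gt;
        fx[i] = fxi;&lt;br /&gt;
        fy[i] = fyi;&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;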
&lt;br /&gt;
=Links=&lt;br /&gt;
&lt;br /&gt;
==Unix==&lt;br /&gt;
* Cygwin: http://www.cygwin.com&lt;br /&gt;
* Linux Command Line: A Primer (June 2012) [[Media:SS_IntroToShell.pdf|Slides,]] [[Media:SS_IntroToShell.tgz|Files]]&lt;br /&gt;
* Intro to unix shell from software carpentry: http://software-carpentry.org/4_0/shell&lt;br /&gt;
&lt;br /&gt;
==C/C++==&lt;br /&gt;
* [[One-Day Scientific C++ Class]] at SciNet&lt;br /&gt;
* C++ library reference: http://www.cplusplus.com/reference&lt;br /&gt;
* C preprocessor: http://www.cprogramming.com/tutorial/cpreprocessor.html&lt;br /&gt;
* Boost: http://www.boost.org&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==Git==&lt;br /&gt;
* Git: http://git-scm.com&lt;br /&gt;
* Version Control: [http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Thursday_Morning_BP_Revision_Control/Thursday_Morning_BP_Revision_Control.mp4 Video]/ [[Media:Snug_techtalk_revcontrol.pdf | Slides]]&lt;br /&gt;
* Git cheat sheet from Git Tower: http://www.git-tower.com/files/cheatsheet/Git_Cheat_Sheet_grey.pdf&lt;br /&gt;
&lt;br /&gt;
==Python==&lt;br /&gt;
* Python: http://www.python.org&lt;br /&gt;
* IPython: http://ipython.org&lt;br /&gt;
* Matplotlib: http://www.matplotlib.org&lt;br /&gt;
* Enthought python distribution: http://www.enthought.com/products/edudownload.php&amp;lt;br/&amp;gt;&lt;br /&gt;
(this gives you numpy, matplotlib and ipython all installed in one fell swoop)&lt;br /&gt;
&lt;br /&gt;
* Intro to python from software carpentry: http://software-carpentry.org/4_0/python&lt;br /&gt;
* Tutorial on matplotlib: http://conference.scipy.org/scipy2011/tutorials.php#jonathan&lt;br /&gt;
* Npy file format: https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==ODEs==&lt;br /&gt;
* Integrators for particle based ODEs (i.e. molecular dynamics): http://www.chem.utoronto.ca/~rzon/simcourse/partmd.pdf. &amp;lt;br&amp;gt;'''Focus on 4.1.4 - 4.1.6 for practical aspects.'''&lt;br /&gt;
* Numerical algorithm to solve ODEs (General) in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Chapter 16&lt;br /&gt;
&lt;br /&gt;
==Interpolation (2D) ==&lt;br /&gt;
* Interpolation in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Pages 123-128&lt;br /&gt;
* Wikipedia pages on [http://en.wikipedia.org/wiki/Bilinear_interpolation Bilinear Interpolation] and [http://en.wikipedia.org/wiki/Bicubic_interpolation Bicubic Interpolation] are not bad either.&lt;br /&gt;
&lt;br /&gt;
==BLAS==&lt;br /&gt;
* [http://www.tacc.utexas.edu/tacc-projects/gotoblas2 gotoblas]&lt;br /&gt;
* [http://math-atlas.sourceforge.net/ ATLAS]&lt;br /&gt;
&lt;br /&gt;
==LAPACK==&lt;br /&gt;
* http://www.netlib.org/lapack&lt;br /&gt;
&lt;br /&gt;
==GSL==&lt;br /&gt;
* GNU Scientific Library: http://www.gnu.org/s/gsl&lt;br /&gt;
&lt;br /&gt;
==FFT==&lt;br /&gt;
* FFTW: http://www.fftw.org&lt;br /&gt;
&lt;br /&gt;
==Top500==&lt;br /&gt;
* TOP500 Supercomputing Sites: http://top500.org&lt;br /&gt;
&lt;br /&gt;
==OpenMP==&lt;br /&gt;
* OpenMP (open multi-processing) application programming interface for shared memory programming: http://openmp.org&lt;br /&gt;
&lt;br /&gt;
==GNU parallel==&lt;br /&gt;
* Official citation: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.&lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page contains options, such as how to make sure the output is not all scrambled, and examples.&lt;br /&gt;
&lt;br /&gt;
==SciNet==&lt;br /&gt;
&lt;br /&gt;
Anything on this wiki, really, but specifically:&lt;br /&gt;
* [[Essentials|SciNet Essentials]]&lt;br /&gt;
* [[GPC Quickstart]]&lt;br /&gt;
* [[Media:SciNet_Tutorial.pdf |SciNet User Tutorial]]&lt;br /&gt;
* [[Software and Libraries]]&lt;br /&gt;
&lt;br /&gt;
==Other Resources==&lt;br /&gt;
* [http://galileo.phys.virginia.edu/classes/551.jvn.fall01/goldberg.pdf What Every Computer Scientist Should Know About Floating-Point Arithmetic] - the classic (and extremely comprehensive) overview of the basics of floating point math.   The first few pages, in particular, are very useful.&lt;br /&gt;
* [http://arxiv.org/abs/1005.4117 Random Numbers In Scientific Computing: An Introduction] by Katzgraber.   A very lucid discussion of pseudo random number generators for science.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5916</id>
		<title>Scientific Computing Course</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5916"/>
		<updated>2013-04-02T15:10:11Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Homework assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;''This wiki page concerns the 2013 installment of SciNet's Scientific Computing course. Material from the previous installment can be found on [[Scientific Software Development Course]], [[Numerical Tools for Physical Scientists (course)]], and [[High Performance Scientific Computing]]''&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
=Syllabus=&lt;br /&gt;
&lt;br /&gt;
==About the course==&lt;br /&gt;
* Whole-term graduate course&lt;br /&gt;
* Prerequisite: basic C, C++ or Fortran experience.&lt;br /&gt;
* Will use `C++ light' and Python&lt;br /&gt;
* Topics include: Scientific computing and programming skills, Parallel programming, and Hybrid programming.  &lt;br /&gt;
&lt;br /&gt;
There are three parts to this course:&lt;br /&gt;
&lt;br /&gt;
# Scientific Software Development: Jan/Feb 2013&amp;lt;br&amp;gt;''python, C++, git, make, modular programming, debugging''&lt;br /&gt;
# Numerical Tools for Physical Scientists: Feb/Mar 2013&amp;lt;br&amp;gt;''modelling, floating point, Monte Carlo, ODE, linear algebra, fft''&lt;br /&gt;
# High Performance Scientific Computing: Mar/Apr 2013&amp;lt;br&amp;gt;''openmp, mpi and hybrid programming''&lt;br /&gt;
&lt;br /&gt;
Each part consists of eight one-hour lectures, two per week.&lt;br /&gt;
&lt;br /&gt;
These can be taken separately by astrophysics graduate students at the University of Toronto as mini-courses, and by physics graduate students at the University of Toronto as modular courses.&lt;br /&gt;
&lt;br /&gt;
The first two parts count towards the SciNet Certificate in Scientific Computing, while the third part can count towards the SciNet HPC Certificate. For more info about the SciNet Certificates, see http://www.scinethpc.ca/2012/12/scinet-hpc-certificate-program.&lt;br /&gt;
&lt;br /&gt;
==Location and Times==&lt;br /&gt;
[http://www.scinethpc.ca/2010/08/contact-us SciNet HeadQuarters]&amp;lt;br&amp;gt;&lt;br /&gt;
256 McCaul Street, Toronto, ON&amp;lt;br&amp;gt;&lt;br /&gt;
Room 229 (Conference Room)&amp;lt;br&amp;gt;&lt;br /&gt;
Tuesdays 11:00 am - 12:00 noon&amp;lt;br&amp;gt;&lt;br /&gt;
Thursdays 11:00 am - 12:00 noon&lt;br /&gt;
&lt;br /&gt;
==Instructors and office hours==&lt;br /&gt;
&lt;br /&gt;
* Ramses van Zon - 256 McCaul Street, Rm 228 - Mondays 3-4pm&lt;br /&gt;
* L. Jonathan Dursi - 256 McCaul Street, Rm 216 - Wednesdays 3-4pm&lt;br /&gt;
&lt;br /&gt;
==Grading scheme==&lt;br /&gt;
&lt;br /&gt;
Attendance at lectures.&lt;br /&gt;
&lt;br /&gt;
Four homework sets (one per week), to be returned by email by 9:00 am the following Thursday.&lt;br /&gt;
&lt;br /&gt;
==Sign up==&lt;br /&gt;
Sign up for this graduate course goes through SciNet's course website.&amp;lt;br&amp;gt;The direct link is https://support.scinet.utoronto.ca/courses/?q=node/99.&amp;lt;br&amp;gt;  If you do not have a SciNet account but wish to register for this course, please email support@scinet.utoronto.ca . &amp;lt;br&amp;gt;&lt;br /&gt;
Sign up is closed.&lt;br /&gt;
&lt;br /&gt;
=Part 1: Scientific Software Development=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Some programming experience. Some unix prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need:'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin), and Python 2, IPython, Numpy, SciPy and Matplotlib (all of which you get if you use the Enthought distribution) installed on your laptop. Links are given at the bottom of this page.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
January 15, 17, 22, 24, 29, and 31, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
February 5 and 7, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics (with lecture slides and recordings)==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' C++ introduction===&lt;br /&gt;
:::[[File:Lecture1-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.html]]&lt;br /&gt;
:::[[Media:Lecture1-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' More C++, build and version control&amp;lt;br&amp;gt;===&lt;br /&gt;
:::[[File:Lecture2-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.html]]&lt;br /&gt;
:::Guest lecturer: Michael Nolta (CITA) for the git portion of the lecture.&lt;br /&gt;
:::[[Media:Lecture2-2013.pdf|C++ and Make slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.mp4 C++ and Make video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[Media:Git-Nolta.pdf|Git slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1|Homework assignment 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Python and visualization===&lt;br /&gt;
:::[[File:Lecture3-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.html]]&lt;br /&gt;
:::[[Media:Lecture3-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Modular programming, refactoring, testing===&lt;br /&gt;
:::[[File:Lecture4-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.html]]&lt;br /&gt;
:::[[Media:Lecture4-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;  [[#HW2|Homework assignment 2]]&lt;br /&gt;
:::[http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc (course project source file)] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py (corresponding python movie generator)]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Object oriented programming===&lt;br /&gt;
:::[[Media:Lecture5-2013.pdf|Slides]]&lt;br /&gt;
:::Recordings of this lecture are missing, but you could view the videos of SciNet's [[One-Day Scientific C++ Class]], in particular the parts on classes, polymorphism, and inheritance.&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' ODE, interpolation===&lt;br /&gt;
:::[[File:Lecture6-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Lecture5-ODE.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW3|Homework assignment 3]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Development tools: debugging and profiling===&lt;br /&gt;
:::[[File:Lecture7-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Debugging.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' Objects in Python, linking C++ and Python===&lt;br /&gt;
:::[[File:Lecture8-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.html]]&lt;br /&gt;
:::[[Media:Lecture8-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
==Homework assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
&lt;br /&gt;
'''''Multi-file C++ program to create a data file'''''&lt;br /&gt;
&lt;br /&gt;
We’ve learned programming in basic C++, use of make and Makefiles to build projects, and local use of git for version control. In this first assignment, you’ll use these to make a multi-file C++ program, built with make, which computes and outputs a data file.&lt;br /&gt;
&lt;br /&gt;
* Start a git repository, and begin writing a C++ program to&lt;br /&gt;
:# Get an array size and a standard deviation from user input,&lt;br /&gt;
:# Allocate a 2d array (use the code given in lecture 2),&lt;br /&gt;
:# Store a 2d Gaussian with a maximum at the centre of the array and given standard deviation (in units of grid points),&lt;br /&gt;
:# Output that array to a text file,&lt;br /&gt;
:# Free the array, and exit. &lt;br /&gt;
* The output text file should contain just the data in text format, with a row of the file corresponding to a row of the array and with whitespace between the numbers. &lt;br /&gt;
* The 2d array creation/freeing routines should be in one file (with an associated header file), the gaussian calculation be in another (ditto), and the output routine be in a third, with the main program calling each of these. &lt;br /&gt;
* Use a makefile to build your code (add it to the repository).&lt;br /&gt;
* You can start with everything in one file, with hardcoded values for sizes and standard deviation and a static array, then refactor things into multiple files, adding the other features.&lt;br /&gt;
* As a test, use the ipython executable that came with your Enthought python distribution to read your data and plot it.&amp;lt;br&amp;gt;If your data file is named ‘data.txt’, running the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ipython --pylab&lt;br /&gt;
In [1]: data = numpy.genfromtxt('data.txt') &lt;br /&gt;
In [2]: contour(data) &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
should give a nice contour plot of a 2-dimensional gaussian.&lt;br /&gt;
* Email in your source code, makefile and the &amp;quot;git log&amp;quot; output of all your commits by 9:00 am Thursday Jan 24th, 2013. Please zip or tar these files together as one attachment, with a file name that includes your name and &amp;quot;HW1&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
===HW2===&lt;br /&gt;
'''''Refactor legacy code to a modular project with unit tests'''''&lt;br /&gt;
&lt;br /&gt;
In class, today, we talked about modular programming and testing, and the project we’ll be working on for the next three weeks. This homework will start advancing on that project by working on the “legacy” code given to us by our supervisor ([http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc]), with a corresponding python plotting script ([http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py]), and whipping it into shape before we start adding new physics.&lt;br /&gt;
* Start a git repository for this project, and add the two files.&lt;br /&gt;
* Create a Makefile and add it to the repository.&lt;br /&gt;
* Since we have no tests, run the program with console output redirected to a file:&lt;br /&gt;
:&amp;lt;pre&amp;gt;$ diffuse &amp;gt; original-output.txt&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;''It turns out the code has a bug that can make the output different when the same code is run again, which obviously would not be good for a baseline test. Replace 'float error;' by 'float error=0.0;' to fix this.''&lt;br /&gt;
* Also save the two .npy output files, e.g. to original-data.npy and original-theory.npy. The triplet of files (original-output.txt, original-data.npy and original-theory.npy) serve as a baseline integrated test (add these to repository). &lt;br /&gt;
* Then write a 'test' target in your makefile that:&lt;br /&gt;
** Runs 'diffuse' with output to a new file.&lt;br /&gt;
** Compares the file with the baseline test file, and compares the .npy files.&lt;br /&gt;
:: (hint: the unix command diff or cmp can compare files).&lt;br /&gt;
* First refactoring: Move the global variables into the main routine.&lt;br /&gt;
* ''Chorus: Test your modified code, and commit.''&lt;br /&gt;
* Second refactoring: Extract a diffusion operator routine, that gets called from main.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Create a .cc/.h module for the diffusion operator.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Add two tests for the diffusion operator: for a constant and for a linear input field (&amp;lt;tt&amp;gt;rho[i][j]=a*i+b*j&amp;lt;/tt&amp;gt;). Add these to the test target in the makefile.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* More refactoring: Extract three more .cc/.h modules:&lt;br /&gt;
** for output (should not contain hardcoded filenames)    &lt;br /&gt;
** computation of the theory&lt;br /&gt;
** and for the array allocation stuff.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Describe, but don't implement in the .h and .cc, what would be appropriate unit tests for these three modules.&lt;br /&gt;
&lt;br /&gt;
Email in your source code and the git log file of all your commits as a .zip or .tar file by email to rzon@scinethpc.ca and ljdursi@scinethpc.ca by 9:00 am on Thursday January 31, 2013.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
This week, we learned about object oriented programming, which fits nicely within the modular programming idea.  In this homework, we are going to use some of it to restructure our code and get it ready to add the tracer particle, the goal of the course project. &lt;br /&gt;
&lt;br /&gt;
The goal will be to have an instance of a &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt; class,&lt;br /&gt;
as well as an instance of &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt;, which for now will be a&lt;br /&gt;
free particle moving as ('''x'''(t),'''y'''(t)) = ('''x'''(0) +&lt;br /&gt;
'''vx''' t, '''y'''(0) + '''vy''' t), without any coupling yet (we&lt;br /&gt;
will handle this next week).&lt;br /&gt;
&lt;br /&gt;
To be more specific:&lt;br /&gt;
* Clean up your code, using the feedback from your HW2 grading, such that the modules are as independent as possible. &lt;br /&gt;
* If you have not done so yet, add comments to the header files of your modules to explain exactly what each function does (without going into implementation details), what its arguments mean and what it returns (unless it's a void function, of course). &lt;br /&gt;
* Objectify the &amp;lt;tt&amp;gt;main&amp;lt;/tt&amp;gt; routine, by creating a class &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt;.&lt;br /&gt;
* Put this class in its own module (declaration in .h, implementation in .cc). For instance, the declaration could be&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.h&lt;br /&gt;
#ifndef DIFFUSIONH&lt;br /&gt;
#define DIFFUSIONH&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
class Diffusion {&lt;br /&gt;
  public:&lt;br /&gt;
    Diffusion(float x1, float x2, float D, int numPoints);&lt;br /&gt;
    void init(float a0, float sigma0); // set initial field&lt;br /&gt;
    void timeStep(float dt);           // solve diff. equation over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    float getRho(int i, int j);        // get a value of the field&lt;br /&gt;
    ~Diffusion();&lt;br /&gt;
  private:&lt;br /&gt;
    float*** rho;&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
#endif&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(this is not supposed to be prescriptive.)&lt;br /&gt;
* In the implementation file you'd have things like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.cc&lt;br /&gt;
#include &amp;quot;diffusion.h&amp;quot;&lt;br /&gt;
...&lt;br /&gt;
void Diffusion::timeStep(float dt) &lt;br /&gt;
{&lt;br /&gt;
   // code for the timeStep ...&lt;br /&gt;
}&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(note the inclusion of the module's header file on the top of the implementation, so the class is declared).&lt;br /&gt;
* Let &amp;lt;tt&amp;gt;int main()&amp;lt;/tt&amp;gt; have the same functionality as before, but now by defining the parameters of the run, creating an object of this class, setting up file streams, and taking time steps and writing out by using calls to member functions of this object. &lt;br /&gt;
* Additionally, write a class &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt; which for now implements a free particle in 2d. Something like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class Tracer {&lt;br /&gt;
  public:&lt;br /&gt;
    Tracer(float x1, float x2);&lt;br /&gt;
    void init(float x0, float y0, float vx, float vy);&lt;br /&gt;
    void timeStep(float dt);           // solve diff. equation over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    ~Tracer();&lt;br /&gt;
  private:&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
:The timeStep implementation can in this case use the infamous forward Euler integration scheme, because it happens to be exact here.&lt;br /&gt;
:When it comes to output to an npy file, let's view the data of the tracer particle at one point in time as a 2x2 matrix &amp;lt;tt&amp;gt;[[x,y],[vx,vy]]&amp;lt;/tt&amp;gt;, so we can use much of the npy output code that we used for the diffusion field, which was a (numPoints+2)x(numPoints+2) matrix.&lt;br /&gt;
* This class too should be its own module (Often, &amp;quot;one class, one module&amp;quot; is a good paradigm, though occasionally you'll have closely related classes).&lt;br /&gt;
* Add some code to int main to  have the Tracer particle evolve at the same time as the diffusion field (although the two are completely uncoupled).&lt;br /&gt;
* Keep using git and make, run the tests that you have regularly to make sure your program still works.&lt;br /&gt;
&lt;br /&gt;
Note that because we've now set up our program in a modular fashion, you can do&lt;br /&gt;
different parts of this assignment in any order you want.  For instance, to wrap your head around object oriented programming, you may like implementing the tracer particle first, so that your diffusion code stays intact.  Or you might want to wait with commenting until the end if you think you'll have to change a module for this assignment.&lt;br /&gt;
&lt;br /&gt;
Email in your source code and the git log file of all your commits as a .zip or .tar file by email to rzon@scinethpc.ca and ljdursi@scinethpc.ca by &lt;br /&gt;
&amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;3:00 pm on Friday February 8, 2013&amp;lt;/span&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
In this homework, we are going to implement the class project of a tracer particle coupled to a diffusion equation. &lt;br /&gt;
The full specification of the physical problem is [[Media:ScClassProject.pdf|here]].  &lt;br /&gt;
* Augment the tracer particle to include a force in the x and in the y direction, and a friction coefficient alpha, which at first can be constant.&lt;br /&gt;
* Implement the so-called leapfrog integration algorithm for the tracer particle&lt;br /&gt;
:::v &amp;amp;larr; v + f(v) &amp;amp;Delta;t / m&lt;br /&gt;
:::r &amp;amp;larr; r + v &amp;amp;Delta;t&lt;br /&gt;
:where v, r, and f are 2d vectors and f(v) is the total, velocity-dependent force specified in the class project, i.e., the sum of the external force F=qE and the friction force -&amp;amp;alpha;v.&amp;lt;br/&amp;gt;(Note: the v dependence of f makes this strictly not a leapfrog integration, but we'll ignore that here. A minimal sketch of this update is given after this list.)&lt;br /&gt;
* Further augment the tracer class with a member function 'couple' which takes a diffusion field as input, and adjusts the friction constant. &lt;br /&gt;
* Your implementation of the 'couple' member function will need to interpolate the diffusion field to the current position of the particle. Use [[Media:CppInterpolation.tgz|this interpolation module]].&lt;br /&gt;
* Rewrite your main routine so that the coupling is called before the tracer's time step. You may need to modify the Diffusion class a bit to get &amp;lt;tt&amp;gt;rho[active]&amp;lt;/tt&amp;gt; out.&lt;br /&gt;
* For simplicity, use the same time step for both the diffusion and the tracer particle.&lt;br /&gt;
* Keep using git and make.&lt;br /&gt;
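&lt;br /&gt;
For concreteness, here is a minimal, non-prescriptive sketch of what the tracer update described above could look like. All member names (x, y, vx, vy, alpha, mass, q, Ex, Ey) are illustrative only and will differ depending on how you designed your &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt; class in HW3.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Hypothetical sketch of the leapfrog-style update for the Tracer class.&lt;br /&gt;
// Member names are placeholders; use whatever your own class actually stores.&lt;br /&gt;
void Tracer::timeStep(float dt)&lt;br /&gt;
{&lt;br /&gt;
   // total velocity-dependent force: external force F = qE plus friction -alpha*v&lt;br /&gt;
   float fx = q*Ex - alpha*vx;&lt;br /&gt;
   float fy = q*Ey - alpha*vy;&lt;br /&gt;
   // update the velocity first, then the position&lt;br /&gt;
   vx += fx*dt/mass;&lt;br /&gt;
   vy += fy*dt/mass;&lt;br /&gt;
   x  += vx*dt;&lt;br /&gt;
   y  += vy*dt;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;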
&lt;br /&gt;
You will hand in your source code, makefiles and the git log file of all your commits by email by &amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;9:00 am on Thursday February 21, 2013&amp;lt;/span&amp;gt;.  Email the files, preferably zipped or tarred, to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
=Part 2: Numerical Tools for Physical Scientists=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or solid C++ programming skills, including make and unix/linux prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin), and Python (Enthought) installed on your laptop.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
February 12, 14, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
March 5, 7, 12, and 14, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' Numerics ===&lt;br /&gt;
:::[[File:Lecture9-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.html]]&lt;br /&gt;
:::[[Media:Lecture9-2013-Numerics.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Random numbers ===&lt;br /&gt;
:::[[File:Lecture10-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.html]]&lt;br /&gt;
:::[[Media:Lecture10-2013-PRNG.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW1_2 Homework assignment 1]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Numerical integration and ODEs ===&lt;br /&gt;
:::[[File:Lecture11-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.html]]&lt;br /&gt;
:::[[Media:Lecture11-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Molecular Dynamics ===&lt;br /&gt;
:::[[File:Lecture12-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.html]]&lt;br /&gt;
:::[[Media:Lecture12-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW2_2 Homework assignment 2]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Linear Algebra part I ===&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 6)]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' Linear Algebra part II and PDEs===&lt;br /&gt;
:::[[File:Lecture14-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.html]]&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 5)]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW3_2 Homework assignment 3]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Fast Fourier Transform===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.html]]&lt;br /&gt;
:::[[Media:Lecture15-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[[Media:Sincfftw.cc|example code]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' FFT for real and multidimensional data===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.html]]&lt;br /&gt;
:::[[Media:Lecture16-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW4_2 Homework assignment 4]&lt;br /&gt;
&lt;br /&gt;
==Homework Assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
This week's homework consists of two assignments.&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Consider the sequence of numbers: 1 followed by 10&amp;lt;sup&amp;gt;8&amp;lt;/sup&amp;gt; values of 10&amp;lt;sup&amp;gt;-8&amp;lt;/sup&amp;gt;&lt;br /&gt;
* This should sum to 2.&lt;br /&gt;
* Write code which sums up those values in order. What answer does it get?&lt;br /&gt;
* Add a routine to the program which sums up the values in reverse order. Does it get the correct answer?&lt;br /&gt;
* How would you get the correct answer? (A minimal sketch of the two sums, in single precision, follows this list.)&lt;br /&gt;
* Submit code, Makefile, text file with answers.&lt;br /&gt;
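&lt;br /&gt;
A minimal sketch of the setup for the first part, assuming single-precision floats (which is the point of the exercise); the answers to the questions above are of course up to you.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Minimal sketch: sum 1 followed by 1e8 copies of 1e-8, forward and in reverse.&lt;br /&gt;
// Single precision is assumed here; that choice is the crux of the exercise.&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
   const int n = 100000000;      // 1e8 small terms&lt;br /&gt;
&lt;br /&gt;
   float forward = 1.0f;         // start with the large value...&lt;br /&gt;
   for (int i = 0; i &amp;lt; n; i++)&lt;br /&gt;
      forward += 1.0e-8f;        // ...then add the tiny ones&lt;br /&gt;
&lt;br /&gt;
   float backward = 0.0f;        // accumulate the tiny values first...&lt;br /&gt;
   for (int i = 0; i &amp;lt; n; i++)&lt;br /&gt;
      backward += 1.0e-8f;&lt;br /&gt;
   backward += 1.0f;             // ...and add the large value last&lt;br /&gt;
&lt;br /&gt;
   std::cout &amp;lt;&amp;lt; &amp;quot;forward  sum = &amp;quot; &amp;lt;&amp;lt; forward  &amp;lt;&amp;lt; &amp;quot;\n&amp;quot;&lt;br /&gt;
             &amp;lt;&amp;lt; &amp;quot;backward sum = &amp;quot; &amp;lt;&amp;lt; backward &amp;lt;&amp;lt; &amp;quot;\n&amp;quot;;&lt;br /&gt;
   return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;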
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Implement a linear congruential generator with a = 106, c = 1283, m = 6075 that generates random numbers from 0 to 1 (a minimal sketch follows this list).&lt;br /&gt;
* Using that and MT: generate 10,000 pairs (dx, dy) with dx, dy each in -0.1 .. +0.1. Generate histograms of dx and dy (say 200 bins). Do they look okay? What would you expect the variation to be?&lt;br /&gt;
* For 10,000 points: take random walks from (x,y)=(0,0) until they exceed a radius of 2, then stop. Plot a histogram of the final angles for the two pseudo random number generators. What do you see?&lt;br /&gt;
* Submit makefile, code, plots, git log.&lt;br /&gt;
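&lt;br /&gt;
A minimal sketch of the generator itself (state handling kept deliberately simple; you may well want to wrap it in a class or add a seed argument):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Minimal sketch of the linear congruential generator with the given constants:&lt;br /&gt;
// x_{k+1} = (a*x_k + c) mod m, returned as a uniform deviate in [0,1).&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double lcg()&lt;br /&gt;
{&lt;br /&gt;
   static long x = 1;                        // the seed; kept as simple as possible here&lt;br /&gt;
   const long a = 106, c = 1283, m = 6075;&lt;br /&gt;
   x = (a*x + c) % m;&lt;br /&gt;
   return double(x)/m;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
   for (int i = 0; i &amp;lt; 5; i++)&lt;br /&gt;
      std::cout &amp;lt;&amp;lt; lcg() &amp;lt;&amp;lt; &amp;quot;\n&amp;quot;;&lt;br /&gt;
   return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;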
&lt;br /&gt;
Both assignments due on Thursday Feb 28th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW2===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Compute numerically (using the GSL):&lt;br /&gt;
&lt;br /&gt;
::&amp;amp;int;&amp;lt;sub&amp;gt;0&amp;lt;/sub&amp;gt;&amp;lt;sup&amp;gt;3&amp;lt;/sup&amp;gt; f(x) &amp;amp;nbsp;dx&lt;br /&gt;
&lt;br /&gt;
:(that is the integral of f(x) from x=0 to x=3)&lt;br /&gt;
&lt;br /&gt;
:with&lt;br /&gt;
&lt;br /&gt;
::f(x) = ln(x) sin(x) e&amp;lt;sup&amp;gt;-x&amp;lt;/sup&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:using three different methods:&lt;br /&gt;
# Extended Simpson's rule&lt;br /&gt;
# Gauss-Legendre quadrature&lt;br /&gt;
# Monte Carlo sampling &lt;br /&gt;
&lt;br /&gt;
*Hint: what is f(0)?&lt;br /&gt;
&lt;br /&gt;
*Compare the convergence of these methods by increasing the number of function evaluations; a minimal sketch of the GSL calls for the Gauss-Legendre option is given after this list.&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log. &lt;br /&gt;
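&lt;br /&gt;
The sketch referred to above: a minimal example of the GSL calls for the Gauss-Legendre option, using the fixed-order &amp;lt;tt&amp;gt;glfixed&amp;lt;/tt&amp;gt; interface (GSL 1.14 or later). The function name and the order are placeholders, and the other two methods are left to you.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Minimal sketch: Gauss-Legendre quadrature of f(x) = ln(x) sin(x) exp(-x) on [0,3]&lt;br /&gt;
// with the GSL &amp;quot;glfixed&amp;quot; interface.  Gauss-Legendre nodes lie strictly inside&lt;br /&gt;
// the interval, so f is never evaluated at x = 0.&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;gsl/gsl_integration.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double myf(double x, void*)&lt;br /&gt;
{&lt;br /&gt;
   return log(x)*sin(x)*exp(-x);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
   gsl_function F;&lt;br /&gt;
   F.function = myf;&lt;br /&gt;
   F.params   = 0;&lt;br /&gt;
&lt;br /&gt;
   const size_t order = 32;   // number of function evaluations; vary this for the convergence study&lt;br /&gt;
   gsl_integration_glfixed_table* tbl = gsl_integration_glfixed_table_alloc(order);&lt;br /&gt;
   double result = gsl_integration_glfixed(&amp;amp;F, 0.0, 3.0, tbl);&lt;br /&gt;
   gsl_integration_glfixed_table_free(tbl);&lt;br /&gt;
&lt;br /&gt;
   std::cout &amp;lt;&amp;lt; &amp;quot;integral ~ &amp;quot; &amp;lt;&amp;lt; result &amp;lt;&amp;lt; &amp;quot;\n&amp;quot;;&lt;br /&gt;
   return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;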
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Using an adaptive 4th order Runge-Kutta approach, with a relative accuracy of 1e-4, compute the solution for t = [0,100] of the following set of coupled ODEs (Lorenz oscillator)&lt;br /&gt;
&lt;br /&gt;
::dx/dt = &amp;amp;sigma;(y - x)&lt;br /&gt;
&lt;br /&gt;
::dy/dt = (&amp;amp;rho;-z)x-y&lt;br /&gt;
&lt;br /&gt;
::dz/dt = xy - &amp;amp;beta;z&lt;br /&gt;
&lt;br /&gt;
:with &amp;amp;sigma;=10; &amp;amp;beta;=8/3; &amp;amp;rho; = 28, and with initial conditions&lt;br /&gt;
&lt;br /&gt;
::x(0) = 10&lt;br /&gt;
&lt;br /&gt;
::y(0) = 20&lt;br /&gt;
&lt;br /&gt;
::z(0) = 30&lt;br /&gt;
&lt;br /&gt;
* Hint: study the GSL documentation; a minimal sketch of its &amp;lt;tt&amp;gt;odeiv2&amp;lt;/tt&amp;gt; driver interface is given after this list.&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log.&lt;br /&gt;
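&lt;br /&gt;
The sketch referred to in the hint: a minimal example of GSL's adaptive ODE driver applied to the Lorenz system. The stepper choice (rkf45, an embedded Runge-Kutta method) and the output cadence are just one possibility, meant as a starting point rather than the required solution.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Minimal sketch of driving the Lorenz system with GSL's adaptive ODE driver.&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;gsl/gsl_errno.h&amp;gt;&lt;br /&gt;
#include &amp;lt;gsl/gsl_odeiv2.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int lorenz(double, const double y[], double dydt[], void*)&lt;br /&gt;
{&lt;br /&gt;
   const double sigma = 10.0, beta = 8.0/3.0, rho = 28.0;&lt;br /&gt;
   dydt[0] = sigma*(y[1] - y[0]);&lt;br /&gt;
   dydt[1] = (rho - y[2])*y[0] - y[1];&lt;br /&gt;
   dydt[2] = y[0]*y[1] - beta*y[2];&lt;br /&gt;
   return GSL_SUCCESS;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
   gsl_odeiv2_system sys = {lorenz, 0, 3, 0};   // no Jacobian needed for an explicit stepper&lt;br /&gt;
   gsl_odeiv2_driver* d =&lt;br /&gt;
      gsl_odeiv2_driver_alloc_y_new(&amp;amp;sys, gsl_odeiv2_step_rkf45, 1e-3, 0.0, 1e-4);&lt;br /&gt;
&lt;br /&gt;
   double t = 0.0;&lt;br /&gt;
   double y[3] = {10.0, 20.0, 30.0};            // initial conditions&lt;br /&gt;
   for (int i = 1; i &amp;lt;= 1000; i++) {            // write out 1000 points over t = [0,100]&lt;br /&gt;
      double ti = i*0.1;&lt;br /&gt;
      if (gsl_odeiv2_driver_apply(d, &amp;amp;t, ti, y) != GSL_SUCCESS) break;&lt;br /&gt;
      std::cout &amp;lt;&amp;lt; t &amp;lt;&amp;lt; &amp;quot; &amp;quot; &amp;lt;&amp;lt; y[0] &amp;lt;&amp;lt; &amp;quot; &amp;quot; &amp;lt;&amp;lt; y[1] &amp;lt;&amp;lt; &amp;quot; &amp;quot; &amp;lt;&amp;lt; y[2] &amp;lt;&amp;lt; &amp;quot;\n&amp;quot;;&lt;br /&gt;
   }&lt;br /&gt;
   gsl_odeiv2_driver_free(d);&lt;br /&gt;
   return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;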
&lt;br /&gt;
Both assignments due on Thursday Mar 7th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
&lt;br /&gt;
Part 1:&lt;br /&gt;
&lt;br /&gt;
The time-explicit formulation of the 1d diffusion equation looks like this:&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{eqnarray*}&lt;br /&gt;
q^{n+1} &amp;amp; = &amp;amp; q^n + \frac{D \Delta t}{\Delta x^2} &lt;br /&gt;
\left (&lt;br /&gt;
\begin{matrix}&lt;br /&gt;
-2 &amp;amp; 1 \\&lt;br /&gt;
1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; \cdots &amp;amp; \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; &amp;amp; 1 &amp;amp; -2 \\&lt;br /&gt;
\end{matrix}&lt;br /&gt;
\right ) q^n \\&lt;br /&gt;
&amp;amp; = &amp;amp; \left ( 1 + \frac{D \Delta t}{\Delta x^2} A \right ) q^n&lt;br /&gt;
\end{eqnarray*}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What are the eigenvalues of the matrix A? What modes would we expect to be amplified/damped by this operator?&lt;br /&gt;
&lt;br /&gt;
* Consider 100 points in the discretization (e.g., A is 100x100).&lt;br /&gt;
* Calculate the eigenvalues and eigenvectors (using D__EV ; which sort of matrix are we using here?)&lt;br /&gt;
* Plot the modes with the largest and smallest absolute-value of eigenvalues, and explain their physical significance&lt;br /&gt;
* The numerical method will become unstable when one eigenmode $v$ begins to grow uncontrollably whenever it is present, e.g.&lt;br /&gt;
$ \frac{D \Delta t}{\Delta x^2} A v = \frac{D \Delta t}{\Delta x^2} \lambda v &amp;gt; v$.   In a timestepping solution, the only way to avoid this for a given physical set of parameters and grid size is to reduce the timestep, $\Delta t$.   Use the largest absolute value eigenvalue to place a constraint on $\Delta t$ for stability.&lt;br /&gt;
&lt;br /&gt;
Part 2:&lt;br /&gt;
&lt;br /&gt;
Using the above constraint on $\Delta t$, for a 1d grid of size 100 (e.g., a 100x100 matrix A), evolve this PDE using BLAS/LAPACK. Plot and explain the results.&lt;br /&gt;
&lt;br /&gt;
* Have an initial condition of $q(x=0,t=0) = 1$, and $q(t=0)$ zero everywhere else (e.g., a hot plate just turned on at the left).&lt;br /&gt;
* Take ~100 timesteps and plot the evolution of $q(x,t)$ at 5 times over that period.&lt;br /&gt;
* You’ll want to use a BLAS routine to compute the matrix-vector multiply ( http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html). Do the multiply in double precision (D__MV). Which one should you use?&lt;br /&gt;
* The GSL has a cblas interface, http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html ; an example of its use can be found here http://www.gnu.org/software/gsl/manual/html_node/GSL-CBLAS-Examples.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Important things to know about lapack:&lt;br /&gt;
* If you are using an nxn array, the “leading dimension” of the array is n. (This argument is so that you could work on sub-matrices if you wanted)&lt;br /&gt;
* You have to make sure the 2d array is a contiguous block of memory.&lt;br /&gt;
* You'll (presumably) want to use the C bindings for LAPACK - [http://www.netlib.org/lapack/lapacke.html lapacke].  Note that the usual C arrays are row-major.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Here's a simple example of calling a LAPACKE routine; note that how the matrix is described (here with a pointer to the data, a leading dimension, and the number of rows and columns) will vary with different types of matrix:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;mkl_lapacke.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m);&lt;br /&gt;
void free_matrix(double **a);&lt;br /&gt;
&lt;br /&gt;
int main (int argc, const char * argv[])&lt;br /&gt;
{&lt;br /&gt;
&lt;br /&gt;
   const int n=5;             // number of rows, columns of the matrix&lt;br /&gt;
   const int m = n;           // nrows&lt;br /&gt;
   const int leading_dim_A=n; // leading dimension (# of cols for row major);&lt;br /&gt;
                              // lets us operate on sub-matrices in principle&lt;br /&gt;
   const int leading_dim_b=n; // similarly for b&lt;br /&gt;
   double **A;&lt;br /&gt;
   double *b;&lt;br /&gt;
&lt;br /&gt;
   b = new double[leading_dim_b];&lt;br /&gt;
   A = matrix(n,leading_dim_A);&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;n; i++)&lt;br /&gt;
       for (int j=0; j&amp;lt;leading_dim_A; j++)&lt;br /&gt;
            A[i][j] = 0.;&lt;br /&gt;
&lt;br /&gt;
   // let's do a trivial solve&lt;br /&gt;
   // It should be pretty clear that the solution to this system&lt;br /&gt;
   // is x = {0,1,2...n-1}&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_A; i++) {&lt;br /&gt;
        A[i][i] = 2.;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_b; i++) {&lt;br /&gt;
        b[i]    = 2*i;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   const char transpose='N';     //solve Ax=b, not A^T x = b&lt;br /&gt;
   const int  nrhs = 1;          //  we're only solving 1 right hand side&lt;br /&gt;
   int info;&lt;br /&gt;
&lt;br /&gt;
   // Call DGELS; b will be overwritten with the value of x.&lt;br /&gt;
   info = LAPACKE_dgels(LAPACK_COL_MAJOR,transpose,m,n,nrhs,&lt;br /&gt;
                          &amp;amp;(A[0][0]),leading_dim_A, &amp;amp;(b[0]),leading_dim_b);&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
   // print results&lt;br /&gt;
   for(int i=0;i&amp;lt;n;i++)&lt;br /&gt;
   {&lt;br /&gt;
      if (i != n/2)&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;    &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
      else&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;x = &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
   }&lt;br /&gt;
   free_matrix(A);            // release the matrix ...&lt;br /&gt;
   delete[] b;                // ... and the right-hand side&lt;br /&gt;
   return(info);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m) {&lt;br /&gt;
   double **a = new double * [n];&lt;br /&gt;
   a[0] = new double [n*m];&lt;br /&gt;
&lt;br /&gt;
   for (int i=1; i&amp;lt;n; i++)&lt;br /&gt;
         a[i] = &amp;amp;a[0][i*m];&lt;br /&gt;
&lt;br /&gt;
   return a;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
void free_matrix(double **a) {&lt;br /&gt;
   delete[] a[0];&lt;br /&gt;
   delete[] a;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
Trigonometric interpolation uses an n-point Fourier series to find values at intermediate points. It is one way of interpolating data onto a finer grid, and was a motivation for Gauss in his work on planetary motion.&lt;br /&gt;
&lt;br /&gt;
The way it works is:&lt;br /&gt;
&lt;br /&gt;
# You Fourier-transform your data.&lt;br /&gt;
# You add frequencies above the Nyquist frequency (in absolute values), but set all the amplitudes of the new frequencies to zero.&lt;br /&gt;
# Note that the frequencies are stored such that, e.g., f&amp;lt;sub&amp;gt;n-1&amp;lt;/sub&amp;gt; corresponds to the low frequency -1.&lt;br /&gt;
# The resulting 2n array can be back-transformed, and now gives an interpolated signal.&lt;br /&gt;
&lt;br /&gt;
For this assignment, write an application that will read in an image from a binary file into a 2d double precision array (this will require converting from bytes to doubles), and create an image twice the size in all directions using trigonometric interpolation. Use a real-to-half-complex version of the FFTW (note: in 2d, this version of the FFTW mixes Fourier components with the same physical magnitude of their wave number k, so this will work). A minimal 1d sketch of the doubling step is given below.&lt;br /&gt;
You can process the red, green and blue values separately. &lt;br /&gt;
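&lt;br /&gt;
The bookkeeping of FFTW's half-complex storage is the fiddly part, so here is the minimal 1d sketch of the doubling step mentioned above; the 2d transforms, the byte/double conversion and the error handling are left to you, and the function name is just a suggestion.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Minimal 1d sketch of trigonometric interpolation with FFTW's real-to-half-complex&lt;br /&gt;
// transforms: transform n samples, copy the spectrum into a length-2n half-complex&lt;br /&gt;
// array that is zero above the old Nyquist frequency, and back-transform.&lt;br /&gt;
#include &amp;lt;vector&amp;gt;&lt;br /&gt;
#include &amp;lt;fftw3.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
std::vector&amp;lt;double&amp;gt; double_resolution(std::vector&amp;lt;double&amp;gt; in)&lt;br /&gt;
{&lt;br /&gt;
   int n = in.size();                        // assumed even&lt;br /&gt;
   std::vector&amp;lt;double&amp;gt; hc(n), hc2(2*n, 0.0), out(2*n);&lt;br /&gt;
&lt;br /&gt;
   fftw_plan fwd = fftw_plan_r2r_1d(n,   &amp;amp;in[0],  &amp;amp;hc[0],  FFTW_R2HC, FFTW_ESTIMATE);&lt;br /&gt;
   fftw_plan bwd = fftw_plan_r2r_1d(2*n, &amp;amp;hc2[0], &amp;amp;out[0], FFTW_HC2R, FFTW_ESTIMATE);&lt;br /&gt;
   fftw_execute(fwd);&lt;br /&gt;
&lt;br /&gt;
   // half-complex layout: real part of bin k in hc[k], imaginary part of bin k in hc[n-k]&lt;br /&gt;
   for (int k = 0; k &amp;lt; n/2; k++)  hc2[k]     = hc[k];       // real parts&lt;br /&gt;
   hc2[n/2] = 0.5*hc[n/2];                                   // old Nyquist bin, now an interior bin&lt;br /&gt;
   for (int k = 1; k &amp;lt; n/2; k++)  hc2[2*n-k] = hc[n-k];     // imaginary parts&lt;br /&gt;
   fftw_execute(bwd);&lt;br /&gt;
&lt;br /&gt;
   for (int i = 0; i &amp;lt; 2*n; i++) out[i] /= n;   // FFTW transforms are unnormalized&lt;br /&gt;
   fftw_destroy_plan(fwd);&lt;br /&gt;
   fftw_destroy_plan(bwd);&lt;br /&gt;
   return out;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;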
&lt;br /&gt;
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
Write an application which reads an image and performs a low-pass filter on the image, i.e., any Fourier components with magnitude k larger than n/8 are to be set to zero, after which the inverse Fourier transform is taken and the image is written out to disk again. Use the same FFT technique as in the first assignment.&lt;br /&gt;
&lt;br /&gt;
'''Input image'''&lt;br /&gt;
&lt;br /&gt;
Use [[Media:gauss256.tgz|this image of Gauss]].&lt;br /&gt;
&lt;br /&gt;
'''Image format:'''&lt;br /&gt;
&lt;br /&gt;
Use the following simple PPM format:&lt;br /&gt;
&lt;br /&gt;
First line (ascii): 'P6\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Second line, in ascii, 'width height\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Third line, in ascii, 'maxcolorvalue\n' (this is typically just 255)&amp;lt;br&amp;gt;&lt;br /&gt;
Following that, in binary, are byte-triplets with the red, green and blue values of each pixel.&amp;lt;br&amp;gt;&lt;br /&gt;
Note: in C, the 'unsigned char' data type matches the concept of a byte best (for most machines anyway).&lt;br /&gt;
&lt;br /&gt;
In fact, between the first and second line, one can have comment lines that start with '#'.&lt;br /&gt;
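&lt;br /&gt;
A minimal sketch of reading this format in C++ is given below. It ignores the optional comment lines, does no error checking, and the file name is just an assumption about what the tarball contains.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Minimal sketch: read a binary (P6) PPM file into a buffer of red, green, blue bytes.&lt;br /&gt;
// Comment lines starting with '#' are not handled here.&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
#include &amp;lt;string&amp;gt;&lt;br /&gt;
#include &amp;lt;vector&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
   std::ifstream f(&amp;quot;gauss256.ppm&amp;quot;, std::ios::binary);&lt;br /&gt;
   std::string magic;&lt;br /&gt;
   int width, height, maxval;&lt;br /&gt;
   f &amp;gt;&amp;gt; magic &amp;gt;&amp;gt; width &amp;gt;&amp;gt; height &amp;gt;&amp;gt; maxval;   // &amp;quot;P6&amp;quot;, dimensions, maximum colour value&lt;br /&gt;
   f.get();                                    // consume the single whitespace after maxval&lt;br /&gt;
&lt;br /&gt;
   std::vector&amp;lt;unsigned char&amp;gt; pixel(3*width*height);   // r,g,b byte triplets&lt;br /&gt;
   f.read(reinterpret_cast&amp;lt;char*&amp;gt;(&amp;amp;pixel[0]), pixel.size());&lt;br /&gt;
&lt;br /&gt;
   // e.g. convert the red channel to doubles for the FFT:&lt;br /&gt;
   std::vector&amp;lt;double&amp;gt; red(width*height);&lt;br /&gt;
   for (int i = 0; i &amp;lt; width*height; i++)&lt;br /&gt;
      red[i] = pixel[3*i];&lt;br /&gt;
   return 0;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;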
&lt;br /&gt;
=Part 3: High Performance Scientific Computing=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or good C++ programming skills, including make and unix/linux prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
You will need to bring a laptop with an ssh facility. Hands-on parts will be done on SciNet's GPC cluster.&lt;br /&gt;
&lt;br /&gt;
For those who don't have a SciNet account yet, the instructions can be found at http://wiki.scinethpc.ca/wiki/index.php/Essentials#Accounts&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
March 19, 21, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
April 2, 4, 9, and 11, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
===''Lecture 1:'' Introduction to Parallel Programming ===&lt;br /&gt;
:::[[File:Lecture17-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.html]]&lt;br /&gt;
:::[[Media:Lecture17-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Parallel Computing Paradigms ===&lt;br /&gt;
&lt;br /&gt;
:::[[File:Lecture18-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.html]]&lt;br /&gt;
:::[[Media:Lecture18-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1_3|homework 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lectures 3,4:''  Shared Memory Programming with OpenMP, part 1,2===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture19-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Distributed Parallel Programming with MPI, part 1===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture21-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
''Lecture 6''&amp;amp;nbsp;&amp;amp;nbsp; Distributed Parallel Programming with MPI, part 2&amp;lt;br&amp;gt;&lt;br /&gt;
''Lecture 7''&amp;amp;nbsp;&amp;amp;nbsp; Distributed Parallel Programming with MPI, part 3&amp;lt;br&amp;gt;&lt;br /&gt;
''Lecture 8''&amp;amp;nbsp;&amp;amp;nbsp; Hybrid OpenMP+MPI Programming&lt;br /&gt;
&lt;br /&gt;
== Homework assignments ==&lt;br /&gt;
&lt;br /&gt;
=== HW1 ===&lt;br /&gt;
&lt;br /&gt;
* Read the SciNet tutorial (as it pertains to the GPC)&lt;br /&gt;
* Read the GPC Quick Start.&lt;br /&gt;
* Get the first set of code:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
   $ cd $SCRATCH&lt;br /&gt;
   $ git clone /scinet/course/sc3/homework1&lt;br /&gt;
   $ cd homework1&lt;br /&gt;
   $ source setup&lt;br /&gt;
   $ make&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
*This contains the threaded program 'blurppm' and 266 ppm images to be blurred. Usage:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  blurppm INPUTPPM OUTPUTPPM BLURRADIUS NUMBEROFTHREADS&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
* Simple test:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  $ qsub -l nodes=1:ppn=8,walltime=2:00:00 -I -X -qdebug&lt;br /&gt;
  $ cd $SCRATCH/homework1&lt;br /&gt;
  $ time blurppm 001.ppm new001.ppm 30 1&lt;br /&gt;
  real  0m52.900s&lt;br /&gt;
  user  0m52.881s&lt;br /&gt;
  sys   0m0.008s&lt;br /&gt;
  $ display 001.ppm &amp;amp;&lt;br /&gt;
  $ display new001.ppm &amp;amp;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
* Time blurppm with a BLURRADIUS ranging from 1 to 41 in steps of 4, and for NUMBEROFTHREADS ranging from 1 to 16.  Record the (real) duration of each run.&lt;br /&gt;
* Plot the duration as a function of NUMBEROFTHREADS, as well as  the speed-up and efficiency.&lt;br /&gt;
* Submit the script and plots of the duration, speedup and efficiency as a function of NUMBEROFTHREADS.&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
* Use GNU parallel to run blurppm on all 266 images with a radius of 41.&lt;br /&gt;
* Investigate different scenarios:&lt;br /&gt;
:# Have GNU parallel run 16 at a time with just 1 thread.&lt;br /&gt;
:# Have GNU parallel run 8 at a time with 2 threads.&lt;br /&gt;
:# Have GNU parallel run 4 at a time with 4 threads.&lt;br /&gt;
:# Have GNU parallel run 2 at a time with 8 threads.&lt;br /&gt;
:# Have GNU parallel run 1 at a time with 16 threads.&lt;br /&gt;
:Record the total time it takes in each of these scenarios.&lt;br /&gt;
* Repeat this with a BLURRADIUS of 3.&lt;br /&gt;
* Submit scripts, timing data  and plots.&lt;br /&gt;
&lt;br /&gt;
=== HW2 ===&lt;br /&gt;
&lt;br /&gt;
In the course materials ( /scinet/course/ppp/nbodyc or nbodyf ) there is the source code for a serial N-body integrator.  This, like the molecular dynamics code you've seen earlier, calculates the long-range forces exerted by each particle on all of the other particles.&lt;br /&gt;
&lt;br /&gt;
Parallelize the force calculation with OpenMP, and present timing results for 1, 4, and 8 threads compared to the serial version.  Note that you can turn off graphical output by removing the &amp;quot;USEPGPLOT = -DPGPLOT&amp;quot; line in Makefile.inc in the top level directory.&lt;br /&gt;
&lt;br /&gt;
Begin by doubling the work by ''not'' calculating two forces at once (e.g., not making use of f&amp;lt;sub&amp;gt;ji&amp;lt;/sub&amp;gt; = -f&amp;lt;sub&amp;gt;ij&amp;lt;/sub&amp;gt;), and simply parallelizing the outer force loop; a rough sketch of this first step is given below.  Then find a way to implement the forces efficiently but also in parallel.  Is there any other part of the problem which could usefully be parallelized?&lt;br /&gt;
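&lt;br /&gt;
The rough sketch referred to above, under the assumption of a plain gravitational force law with no softening; the array names are placeholders for whatever the nbody code actually uses, and it needs to be compiled with -fopenmp (the pragma is simply ignored otherwise).&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Rough sketch of the first step: parallelize the outer loop of an O(N^2) force&lt;br /&gt;
// calculation with OpenMP.  Each i accumulates only into its own force entries,&lt;br /&gt;
// so there are no write conflicts even though every pair is visited twice.&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
#include &amp;lt;vector&amp;gt;&lt;br /&gt;
&lt;br /&gt;
void forces(int n, const std::vector&amp;lt;double&amp;gt;&amp;amp; x, const std::vector&amp;lt;double&amp;gt;&amp;amp; y,&lt;br /&gt;
            const std::vector&amp;lt;double&amp;gt;&amp;amp; z, const std::vector&amp;lt;double&amp;gt;&amp;amp; m,&lt;br /&gt;
            std::vector&amp;lt;double&amp;gt;&amp;amp; fx, std::vector&amp;lt;double&amp;gt;&amp;amp; fy,&lt;br /&gt;
            std::vector&amp;lt;double&amp;gt;&amp;amp; fz, double G)&lt;br /&gt;
{&lt;br /&gt;
   #pragma omp parallel for&lt;br /&gt;
   for (int i = 0; i &amp;lt; n; i++) {&lt;br /&gt;
      fx[i] = fy[i] = fz[i] = 0.;&lt;br /&gt;
      for (int j = 0; j &amp;lt; n; j++) {&lt;br /&gt;
         if (j == i) continue;&lt;br /&gt;
         double dx = x[j]-x[i], dy = y[j]-y[i], dz = z[j]-z[i];&lt;br /&gt;
         double r  = std::sqrt(dx*dx + dy*dy + dz*dz);&lt;br /&gt;
         double f  = G*m[i]*m[j]/(r*r*r);    // pair force magnitude divided by r&lt;br /&gt;
         fx[i] += f*dx;  fy[i] += f*dy;  fz[i] += f*dz;&lt;br /&gt;
      }&lt;br /&gt;
   }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;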
&lt;br /&gt;
=Links=&lt;br /&gt;
&lt;br /&gt;
==Unix==&lt;br /&gt;
* Cygwin: http://www.cygwin.com&lt;br /&gt;
* Linux Command Line: A Primer (June 2012) [[Media:SS_IntroToShell.pdf|Slides,]] [[Media:SS_IntroToShell.tgz|Files]]&lt;br /&gt;
* Intro to unix shell from software carpentry: http://software-carpentry.org/4_0/shell&lt;br /&gt;
&lt;br /&gt;
==C/C++==&lt;br /&gt;
* [[One-Day Scientific C++ Class]] at SciNet&lt;br /&gt;
* C++ library reference: http://www.cplusplus.com/reference&lt;br /&gt;
* C preprocessor: http://www.cprogramming.com/tutorial/cpreprocessor.html&lt;br /&gt;
* Boost: http://www.boost.org&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==Git==&lt;br /&gt;
* Git: http://git-scm.com&lt;br /&gt;
* Version Control: [http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Thursday_Morning_BP_Revision_Control/Thursday_Morning_BP_Revision_Control.mp4 Video]/ [[Media:Snug_techtalk_revcontrol.pdf | Slides]]&lt;br /&gt;
* Git cheat sheet from Git Tower: http://www.git-tower.com/files/cheatsheet/Git_Cheat_Sheet_grey.pdf&lt;br /&gt;
&lt;br /&gt;
==Python==&lt;br /&gt;
* Python: http://www.python.org&lt;br /&gt;
* IPython: http://ipython.org&lt;br /&gt;
* Matplotlib: http://www.matplotlib.org&lt;br /&gt;
* Enthought python distribution: http://www.enthought.com/products/edudownload.php&amp;lt;br/&amp;gt;&lt;br /&gt;
(this gives you numpy, matplotlib and ipython all installed in one fell swoop)&lt;br /&gt;
&lt;br /&gt;
* Intro to python from software carpentry: http://software-carpentry.org/4_0/python&lt;br /&gt;
* Tutorial on matplotlib: http://conference.scipy.org/scipy2011/tutorials.php#jonathan&lt;br /&gt;
* Npy file format: https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==ODEs==&lt;br /&gt;
* Integrators for particle based ODEs (i.e. molecular dynamics): http://www.chem.utoronto.ca/~rzon/simcourse/partmd.pdf. &amp;lt;br&amp;gt;'''Focus on 4.1.4 - 4.1.6 for practical aspects.'''&lt;br /&gt;
* Numerical algorithm to solve ODEs (General) in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Chapter 16&lt;br /&gt;
&lt;br /&gt;
==Interpolation (2D) ==&lt;br /&gt;
* Interpolation in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Pages 123-128&lt;br /&gt;
* Wikipedia pages on [http://en.wikipedia.org/wiki/Bilinear_interpolation Bilinear Interpolation] and [http://en.wikipedia.org/wiki/Bicubic_interpolation Bicubic Interpolation] are not bad either.&lt;br /&gt;
&lt;br /&gt;
==BLAS==&lt;br /&gt;
* [http://www.tacc.utexas.edu/tacc-projects/gotoblas2 gotoblas]&lt;br /&gt;
* [http://math-atlas.sourceforge.net/ ATLAS]&lt;br /&gt;
&lt;br /&gt;
==LAPACK==&lt;br /&gt;
* http://www.netlib.org/lapack&lt;br /&gt;
&lt;br /&gt;
==GSL==&lt;br /&gt;
* GNU Scientific Library: http://www.gnu.org/s/gsl&lt;br /&gt;
&lt;br /&gt;
==FFT==&lt;br /&gt;
* FFTW: http://www.fftw.org&lt;br /&gt;
&lt;br /&gt;
==Top500==&lt;br /&gt;
* TOP500 Supercomputing Sites: http://top500.org&lt;br /&gt;
&lt;br /&gt;
==OpenMP==&lt;br /&gt;
* OpenMP (open multi-processing) application programming interface for shared memory programming: http://openmp.org&lt;br /&gt;
&lt;br /&gt;
==GNU parallel==&lt;br /&gt;
* Official citation: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.&lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page contains options, such as how to make sure the output is not all scrambled, and examples.&lt;br /&gt;
&lt;br /&gt;
==SciNet==&lt;br /&gt;
&lt;br /&gt;
Anything on this wiki, really, but specifically:&lt;br /&gt;
* [[Essentials|SciNet Essentials]]&lt;br /&gt;
* [[GPC Quickstart]]&lt;br /&gt;
* [[Media:SciNet_Tutorial.pdf |SciNet User Tutorial]]&lt;br /&gt;
* [[Software and Libraries]]&lt;br /&gt;
&lt;br /&gt;
==Other Resources==&lt;br /&gt;
* [http://galileo.phys.virginia.edu/classes/551.jvn.fall01/goldberg.pdf What Every Computer Scientist Should Know About Floating-Point Arithmetic] - the classic (and extremely comprehensive) overview of the basics of floating point math.   The first few pages, in particular, are very useful.&lt;br /&gt;
* [http://arxiv.org/abs/1005.4117 Random Numbers In Scientific Computing: An Introduction] by Katzgraber.   A very lucid discussion of pseudo random number generators for science.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=File:Lecture21-2013.pdf&amp;diff=5915</id>
		<title>File:Lecture21-2013.pdf</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=File:Lecture21-2013.pdf&amp;diff=5915"/>
		<updated>2013-04-02T13:59:17Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
	<entry>
		<id>https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5914</id>
		<title>Scientific Computing Course</title>
		<link rel="alternate" type="text/html" href="https://oldwiki.scinet.utoronto.ca/index.php?title=Scientific_Computing_Course&amp;diff=5914"/>
		<updated>2013-04-02T13:58:49Z</updated>

		<summary type="html">&lt;p&gt;Ljdursi: /* Topics */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;''This wiki page concerns the 2013 installment of SciNet's Scientific Computing course. Material from the previous installment can be found on [[Scientific Software Development Course]], [[Numerical Tools for Physical Scientists (course)]], and [[High Performance Scientific Computing]]''&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
=Syllabus=&lt;br /&gt;
&lt;br /&gt;
==About the course==&lt;br /&gt;
* Whole-term graduate course&lt;br /&gt;
* Prerequisite: basic C, C++ or Fortran experience.&lt;br /&gt;
* Will use `C++ light' and Python&lt;br /&gt;
* Topics include: Scientific computing and programming skills, Parallel programming, and Hybrid programming.  &lt;br /&gt;
&lt;br /&gt;
There are three parts to this course:&lt;br /&gt;
&lt;br /&gt;
# Scientific Software Development: Jan/Feb 2013&amp;lt;br&amp;gt;''python, C++, git, make, modular programming, debugging''&lt;br /&gt;
# Numerical Tools for Physical Scientists: Feb/Mar 2013&amp;lt;br&amp;gt;''modelling, floating point, Monte Carlo, ODE, linear algebra,fft''&lt;br /&gt;
# High Performance Scientific Computing: Mar/Apr 2013&amp;lt;br&amp;gt;''openmp, mpi and hybrid programming''&lt;br /&gt;
&lt;br /&gt;
Each part consists of eight one-hour lectures, two per week.&lt;br /&gt;
&lt;br /&gt;
These can be taken separately by astrophysics graduate students at the University of Toronto as mini-courses, and by physics graduate students at the University of Toronto as modular courses.&lt;br /&gt;
&lt;br /&gt;
The first two parts count towards the SciNet Certificate in Scientific Computing, while the third part can count towards the SciNet HPC Certificate. For more info about the SciNet Certificates, see http://www.scinethpc.ca/2012/12/scinet-hpc-certificate-program.&lt;br /&gt;
&lt;br /&gt;
==Location and Times==&lt;br /&gt;
[http://www.scinethpc.ca/2010/08/contact-us SciNet HeadQuarters]&amp;lt;br&amp;gt;&lt;br /&gt;
256 McCaul Street, Toronto, ON&amp;lt;br&amp;gt;&lt;br /&gt;
Room 229 (Conference Room)&amp;lt;br&amp;gt;&lt;br /&gt;
Tuesdays 11:00 am - 12:00 noon&amp;lt;br&amp;gt;&lt;br /&gt;
Thursdays 11:00 am - 12:00 noon&lt;br /&gt;
&lt;br /&gt;
==Instructors and office hours==&lt;br /&gt;
&lt;br /&gt;
* Ramses van Zon - 256 McCaul Street, Rm 228 - Mondays 3-4pm&lt;br /&gt;
* L. Jonathan Dursi - 256 McCaul Street, Rm 216 - Wednesdays 3-4pm&lt;br /&gt;
&lt;br /&gt;
==Grading scheme==&lt;br /&gt;
&lt;br /&gt;
Attendance at lectures.&lt;br /&gt;
&lt;br /&gt;
Four homework sets (i.e., one per week), to be returned by email by 9:00 am the next Thursday.&lt;br /&gt;
&lt;br /&gt;
==Sign up==&lt;br /&gt;
Sign up for this graduate course goes through SciNet's course website.&amp;lt;br&amp;gt;The direct link is https://support.scinet.utoronto.ca/courses/?q=node/99.&amp;lt;br&amp;gt;  If you do not have a SciNet account but wish to register for this course, please email support@scinet.utoronto.ca . &amp;lt;br&amp;gt;&lt;br /&gt;
Sign up is closed.&lt;br /&gt;
&lt;br /&gt;
=Part 1: Scientific Software Development=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Some programming experience. Some unix prompt experience.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need:'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin), and Python 2, IPython, Numpy, SciPy and Matplotlib (which you all get if you use the Enthought distribution) installed on your laptop. Links are given at the bottom of this page.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
January 15, 17, 22, 24, 29, and 31, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
February 5 and 7, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics (with lecture slides and recordings)==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' C++ introduction===&lt;br /&gt;
:::[[File:Lecture1-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.html]]&lt;br /&gt;
:::[[Media:Lecture1-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture1-2013/lecture1-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' More C++, build and version control&amp;lt;br&amp;gt;===&lt;br /&gt;
:::[[File:Lecture2-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.html]]&lt;br /&gt;
:::Guest lecturer: Michael Nolta (CITA) for the git portion of the lecture.&lt;br /&gt;
:::[[Media:Lecture2-2013.pdf|C++ and Make slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture2-2013/lecture2-2013.mp4 C++ and Make video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[Media:Git-Nolta.pdf|Git slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1|Homework assignment 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Python and visualization===&lt;br /&gt;
:::[[File:Lecture3-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.html]]&lt;br /&gt;
:::[[Media:Lecture3-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture3-2013/lecture3-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Modular programming, refactoring, testing===&lt;br /&gt;
:::[[File:Lecture4-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.html]]&lt;br /&gt;
:::[[Media:Lecture4-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture4-2013/lecture4-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;  [[#HW2|Homework assignment 2]]&lt;br /&gt;
:::[http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc (course project source file)] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py (corresponding python movie generator)]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Object oriented programming===&lt;br /&gt;
:::[[Media:Lecture5-2013.pdf|Slides]]&lt;br /&gt;
:::Recordings of this lecture are missing, but you could view the videos of SciNet's [[One-Day Scientific C++ Class]], in particular the parts on classes, polymorphism, and inheritance.&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' ODE, interpolation===&lt;br /&gt;
:::[[File:Lecture6-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Lecture5-ODE.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture6-2013/lecture6-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW3|Homework assignment 3]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Development tools: debugging and profiling===&lt;br /&gt;
:::[[File:Lecture7-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.html]]&lt;br /&gt;
:::[[Media:ScientificComputing2013-Debugging.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture7-2013/lecture7-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' Objects in Python, linking C++ and Python===&lt;br /&gt;
:::[[File:Lecture8-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.html]]&lt;br /&gt;
:::[[Media:Lecture8-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture8-2013/lecture8-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
==Homework assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
&lt;br /&gt;
'''''Multi-file C++ program to create a data file'''''&lt;br /&gt;
&lt;br /&gt;
We’ve learned programming in basic C++, use of make and Makefiles to build projects, and local use of git for version control. In this first assignment, you’ll use these to make a multi-file C++ program, built with make, which computes and outputs a data file.&lt;br /&gt;
&lt;br /&gt;
* Start a git repository, and begin writing a C++ program to&lt;br /&gt;
:# Get an array size and a standard deviation from user input,&lt;br /&gt;
:# Allocate a 2d array (use the code given in lecture 2),&lt;br /&gt;
:# Store a 2d Gaussian with a maximum at the centre of the array and given standard deviation (in units of grid points),&lt;br /&gt;
:# Output that array to a text file,&lt;br /&gt;
:# Free the array, and exit. &lt;br /&gt;
* The output text file should contain just the data in text format, with a row of the file corresponding to a row of the array and with whitespace between the numbers. &lt;br /&gt;
* The 2d array creation/freeing routines should be in one file (with an associated header file), the gaussian calculation be in another (ditto), and the output routine be in a third, with the main program calling each of these. &lt;br /&gt;
* Use a makefile to build your code (add it to the repository).&lt;br /&gt;
* You can start with everything in one file, with hardcoded values for sizes and standard deviation and a static array, then refactor things into multiple files, adding the other features.&lt;br /&gt;
* As a test, use the ipython executable that came with your Enthought python distribution to read your data and plot it.&amp;lt;br&amp;gt;If your data file is named ‘data.txt’, running the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ipython --pylab&lt;br /&gt;
In [1]: data = numpy.genfromtxt('data.txt') &lt;br /&gt;
In [2]: contour(data) &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
should give a nice contour plot of a 2-dimensional gaussian.&lt;br /&gt;
* Email in your source code, makefile and the &amp;quot;git log&amp;quot; output of all your commits by 9:00 am Thursday Jan 24th, 2013. Please zip or tar these files together as one attachment, with a file name that includes your name and &amp;quot;HW1&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
===HW2===&lt;br /&gt;
'''''Refactor legacy code to a modular project with unit tests'''''&lt;br /&gt;
&lt;br /&gt;
In class, today, we talked about modular programming and testing, and the project we’ll be working on for the next three weeks. This homework will start advancing on that project by working on the “legacy” code given to us by our supervisor ([http://wiki.scinethpc.ca/wiki/images/f/f0/diffuse.cc diffuse.cc]), with a corresponding python plotting script ([http://wiki.scinethpc.ca/wiki/images/f/f0/plotdata.py plotdata.py]), and whipping it into shape before we start adding new physics.&lt;br /&gt;
* Start a git repository for this project, and add the two files.&lt;br /&gt;
* Create a Makefile and add it to the repository.&lt;br /&gt;
* Since we have no tests, run the program with console output redirected to a file:&lt;br /&gt;
:&amp;lt;pre&amp;gt;$ diffuse &amp;gt; original-output.txt&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;''It turns out the code has a bug that can make the output different when the same code is run again, which obviously would not be good for a baseline test. Replace 'float error;' by 'float error=0.0;' to fix this.''&lt;br /&gt;
* Also save the two .npy output files, e.g. to original-data.npy and original-theory.npy. The triplet of files (original-output.txt, original-data.npy and original-theory.npy) serve as a baseline integrated test (add these to repository). &lt;br /&gt;
* Then write a 'test' target in your makefile that:&lt;br /&gt;
** Runs 'diffuse' with output to a new file.&lt;br /&gt;
** Compares the file with the baseline test file, and compares the .npy files.&lt;br /&gt;
:: (hint: the unix command diff or cmp can compare files).&lt;br /&gt;
* First refactoring: Move the global variables into the main routine.&lt;br /&gt;
* ''Chorus: Test your modified code, and commit.''&lt;br /&gt;
* Second refactoring: Extract a diffusion operator routine, that gets called from main.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Create a .cc/.h module for the diffusion operator.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Add two tests for the diffusion operator: for a constant and for a linear input field (&amp;lt;tt&amp;gt;rho[i][j]=a*i+b*j&amp;lt;/tt&amp;gt;). Add these to the test target in the makefile.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* More refactoring: Extract three more .cc/.h modules:&lt;br /&gt;
** for output (should not contain hardcoded filenames)    &lt;br /&gt;
** computation of the theory&lt;br /&gt;
** and for the array allocation stuff.&lt;br /&gt;
* ''Chorus''&lt;br /&gt;
* Describe, but don't implement in the .h and .cc, what would be appropriate unit tests for these three modules.&lt;br /&gt;
&lt;br /&gt;
Email in your source code and the git log file of all your commits as a .zip or .tar file by email to rzon@scinethpc.ca and ljdursi@scinethpc.ca by 9:00 am on Thursday January 31, 2013.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
This week, we learned about object oriented programming, which fits nicely within the modular programming idea.  In this homework, we are going to use some of it to restructure our code and get it ready to add the tracer particle, the goal of the course project. &lt;br /&gt;
&lt;br /&gt;
The goal will be to have an instance of a &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt; class,&lt;br /&gt;
as well as an instance of &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt;, which for now will be a&lt;br /&gt;
free particle moving as ('''x'''(t),'''y'''(t)) = ('''x'''(0) +&lt;br /&gt;
'''vx''' t, '''y'''(0) + '''vy''' t), without any coupling yet (we&lt;br /&gt;
will handle this next week).&lt;br /&gt;
&lt;br /&gt;
To be more specific:&lt;br /&gt;
* Clean up your code, using the feedback from your HW2 grading, such that the modules are as independent as possible. &lt;br /&gt;
* If you have not done so yet, add comments to the header files of your modules to explain exactly what each function does (without going into implementation details), what its arguments mean and what it returns (unless it's a void function, of course). &lt;br /&gt;
* Objectify the &amp;lt;tt&amp;gt;main&amp;lt;/tt&amp;gt; routine, by creating a class &amp;lt;tt&amp;gt;Diffusion&amp;lt;/tt&amp;gt;.&lt;br /&gt;
* Put this class in its own module (declaration in .h, implementation in .cc). For instance, the declaration could be&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.h&lt;br /&gt;
#ifndef DIFFUSIONH&lt;br /&gt;
#define DIFFUSIONH&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
class Diffusion {&lt;br /&gt;
  public:&lt;br /&gt;
    Diffusion(float x1, float x2, float D, int numPoints);&lt;br /&gt;
    void init(float a0, float sigma0); // set initial field&lt;br /&gt;
    void timeStep(float dt);           // solve diff. equation over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    float getRho(int i, int j);        // get a value of the field&lt;br /&gt;
    ~Diffusion();&lt;br /&gt;
  private:&lt;br /&gt;
    float*** rho;&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
#endif&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(This is not meant to be prescriptive.)&lt;br /&gt;
* In the implementation file you'd have things like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
// diffusion.cc&lt;br /&gt;
#include &amp;quot;diffusion.h&amp;quot;&lt;br /&gt;
...&lt;br /&gt;
void Diffusion::timeStep(float dt) &lt;br /&gt;
{&lt;br /&gt;
   // code for the timeStep ...&lt;br /&gt;
}&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(Note the inclusion of the module's header file at the top of the implementation, so that the class is declared.)&lt;br /&gt;
* Let &amp;lt;tt&amp;gt;int main()&amp;lt;/tt&amp;gt; have the same functionality as before, but now by defining the parameters of the run, creating an object of this class, setting up file streams, taking time steps, and writing output through calls to member functions of this object. &lt;br /&gt;
* Additionally, write a class &amp;lt;tt&amp;gt;Tracer&amp;lt;/tt&amp;gt; which for now implements a free particle in 2d. Something like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class Tracer {&lt;br /&gt;
  public:&lt;br /&gt;
    Tracer(float x1, float x2);&lt;br /&gt;
    void init(float x0, float y0, float vx, float vy);&lt;br /&gt;
    void timeStep(float dt);           // advance the particle over dt&lt;br /&gt;
    void toFile(std::ofstream&amp;amp; f);     // write to file (binary,no npyheader)&lt;br /&gt;
    void toScreen();                   // report a line to screen&lt;br /&gt;
    ~Tracer();&lt;br /&gt;
  private:&lt;br /&gt;
    ...&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
:The timeStep implementation can in this case use the infamous forward Euler integration scheme, because it happens to be exact here.&lt;br /&gt;
:When it comes to output to a npy file, let's view the data of the tracer particle at one point in time as a 2x2 matrix &amp;lt;tt&amp;gt;[[x,y],[vx,vy]]&amp;lt;/tt&amp;gt;, so we can use much of the npy output code that we used for the diffusion field, which was a (numPoints+2)x(numPoints+2) matrix.&lt;br /&gt;
* This class too should be its own module (Often, &amp;quot;one class, one module&amp;quot; is a good paradigm, though occasionally you'll have closely related classes).&lt;br /&gt;
* Add some code to int main to have the Tracer particle evolve at the same time as the diffusion field (although the two are completely uncoupled); a sketch of such a main routine follows this list.&lt;br /&gt;
* Keep using git and make, and run the tests that you have regularly to make sure your program still works.&lt;br /&gt;
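&lt;br /&gt;
To make the structure concrete, here is a minimal, non-prescriptive sketch of what such a main routine could look like; it assumes the member functions suggested above, and the parameter values and file names are made up.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// main.cc -- sketch only: assumes the Diffusion and Tracer interfaces suggested above&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
#include &amp;quot;diffusion.h&amp;quot;&lt;br /&gt;
#include &amp;quot;tracer.h&amp;quot;&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    const int   numPoints = 100;&lt;br /&gt;
    const float x1 = -5.0, x2 = 5.0, D = 1.0;   // made-up run parameters&lt;br /&gt;
    const float dt = 0.001;&lt;br /&gt;
    const int   numSteps = 1000, outputEvery = 100;&lt;br /&gt;
&lt;br /&gt;
    Diffusion rho(x1, x2, D, numPoints);&lt;br /&gt;
    rho.init(1.0, 1.0);                  // a0, sigma0&lt;br /&gt;
&lt;br /&gt;
    Tracer particle(x1, x2);&lt;br /&gt;
    particle.init(0.0, 0.0, 0.1, 0.2);   // x0, y0, vx, vy&lt;br /&gt;
&lt;br /&gt;
    std::ofstream rhofile(&amp;quot;rho.npy&amp;quot;, std::ios::binary);&lt;br /&gt;
    std::ofstream trcfile(&amp;quot;tracer.npy&amp;quot;, std::ios::binary);&lt;br /&gt;
&lt;br /&gt;
    for (int step = 1; step &amp;lt;= numSteps; step++) {&lt;br /&gt;
        rho.timeStep(dt);                // the two evolve side by side,&lt;br /&gt;
        particle.timeStep(dt);           // but are not coupled yet&lt;br /&gt;
        if (step % outputEvery == 0) {&lt;br /&gt;
            rho.toFile(rhofile);&lt;br /&gt;
            particle.toFile(trcfile);&lt;br /&gt;
            rho.toScreen();&lt;br /&gt;
            particle.toScreen();&lt;br /&gt;
        }&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;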
&lt;br /&gt;
Note that because we've now set up our program in a modular fashion, you can do&lt;br /&gt;
different parts of this assignment in any order you want.  For instance, to wrap your head around object-oriented programming, you may want to implement the tracer particle first, so that your diffusion code stays intact.  Or you might want to postpone commenting until the end if you think you'll have to change a module for this assignment.&lt;br /&gt;
&lt;br /&gt;
Email your source code and the git log file of all your commits as a .zip or .tar file to rzon@scinethpc.ca and ljdursi@scinethpc.ca by &lt;br /&gt;
&amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;3:00 pm on Friday February 8, 2013&amp;lt;/span&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
In this homework, we are going to implement the class project of a tracer particle coupled to a diffusion equation. &lt;br /&gt;
The full specification of the physical problem is [[Media:ScClassProject.pdf|here]].  &lt;br /&gt;
* Augment the tracer particle to include a force in the x and in the y direction, and a friction coefficient alpha, which at first can be constant.&lt;br /&gt;
* Implement the so-called leapfrog integration algorithm for the tracer particle&lt;br /&gt;
:::v &amp;amp;larr; v + f(v) &amp;amp;Delta;t / m&lt;br /&gt;
:::r &amp;amp;larr; r + v &amp;amp;Delta;t&lt;br /&gt;
:where v, r, and f are 2d vectors and f(v) is the total, velocity-dependent force specified in the class project, i.e., the sum of the external force F=qE and the friction force -&amp;amp;alpha;v.&amp;lt;br/&amp;gt;(Note: the v dependence of f makes this, strictly speaking, not a leapfrog integration, but we'll ignore that here. A minimal sketch of this update follows this list.)&lt;br /&gt;
* Further augment the tracer class with a member function 'couple' which takes a diffusion field as input, and adjusts the friction constant. &lt;br /&gt;
* Your implementation of the 'couple' member function will need to interpolate the diffusion field to the current position of the particle. Use [[Media:CppInterpolation.tgz|this interpolation module]].&lt;br /&gt;
* Rewrite your main routine so that the coupling is called before the tracer's time step. You may need to modify the Diffusion class a bit to get &amp;lt;tt&amp;gt;rho[active]&amp;lt;/tt&amp;gt; out.&lt;br /&gt;
* For simplicity, use the same time step for both the diffusion and the tracer particle.&lt;br /&gt;
* Keep using git and make.&lt;br /&gt;
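&lt;br /&gt;
For reference, here is a minimal sketch of the kick-drift update inside the tracer's timeStep; the member names (x, y, vx, vy, fx, fy, alpha, m) are illustrative only and will depend on your own Tracer class.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// Sketch of Tracer::timeStep with the (not-quite-)leapfrog update&lt;br /&gt;
//     v &amp;lt;- v + f(v) dt / m,    r &amp;lt;- r + v dt&lt;br /&gt;
// fx, fy are the external force components (F = qE), alpha is the friction&lt;br /&gt;
// coefficient and m the mass; all member names here are illustrative.&lt;br /&gt;
void Tracer::timeStep(float dt)&lt;br /&gt;
{&lt;br /&gt;
    // total force, including the velocity-dependent friction -alpha*v&lt;br /&gt;
    float ftotx = fx - alpha*vx;&lt;br /&gt;
    float ftoty = fy - alpha*vy;&lt;br /&gt;
&lt;br /&gt;
    // kick: update the velocities&lt;br /&gt;
    vx += ftotx*dt/m;&lt;br /&gt;
    vy += ftoty*dt/m;&lt;br /&gt;
&lt;br /&gt;
    // drift: update the positions with the new velocities&lt;br /&gt;
    x += vx*dt;&lt;br /&gt;
    y += vy*dt;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;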
&lt;br /&gt;
You will hand in your source code, makefiles and the git log file of all your commits by email by &amp;lt;span style=&amp;quot;color:#ee3300&amp;quot;&amp;gt;9:00 am on Thursday February 21, 2013&amp;lt;/span&amp;gt;.  Email the files, preferably zipped or tarred, to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
=Part 2: Numerical Tools for Physical Scientists=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or solid C++ programming skills, including experience with make and the unix/linux command prompt.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
A unix-like environment with the GNU compiler suite (e.g. Cygwin) and Python (Enthought) installed on your laptop.&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
&lt;br /&gt;
February 12, 14, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
March 5, 7, 12, and 14, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
&lt;br /&gt;
===''Lecture 1:'' Numerics ===&lt;br /&gt;
:::[[File:Lecture9-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.html]]&lt;br /&gt;
:::[[Media:Lecture9-2013-Numerics.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture9-2013/lecture9-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Random numbers ===&lt;br /&gt;
:::[[File:Lecture10-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.html]]&lt;br /&gt;
:::[[Media:Lecture10-2013-PRNG.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture10-2013/lecture10-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW1_2 Homework assignment 1]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 3:'' Numerical integration and ODEs ===&lt;br /&gt;
:::[[File:Lecture11-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.html]]&lt;br /&gt;
:::[[Media:Lecture11-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture11-2013/lecture11-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 4:'' Molecular Dynamics ===&lt;br /&gt;
:::[[File:Lecture12-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.html]]&lt;br /&gt;
:::[[Media:Lecture12-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture12-2013/lecture12-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW2_2 Homework assignment 2]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Linear Algebra part I ===&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 6)]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 6:'' Linear Algebra part II and PDEs===&lt;br /&gt;
:::[[File:Lecture14-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.html]]&lt;br /&gt;
:::[[Media:Lecture13-2013.pdf|Slides (combined with lecture 5)]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture14-2013/lecture14-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW3_2 Homework assignment 3]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 7:'' Fast Fourier Transform===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.html]]&lt;br /&gt;
:::[[Media:Lecture15-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture15-2013/lecture15-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp;[[Media:Sincfftw.cc|example code]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 8:'' FFT for real and multidimensional data===&lt;br /&gt;
:::[[File:Lecture15-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.html]]&lt;br /&gt;
:::[[Media:Lecture16-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture16-2013/lecture16-2013.mp4 Video recording]  &amp;amp;nbsp;/ &amp;amp;nbsp; [http://wiki.scinethpc.ca/wiki/index.php/Scientific_Computing_Course#HW4_2 Homework assignment 4]&lt;br /&gt;
&lt;br /&gt;
==Homework Assignments==&lt;br /&gt;
&lt;br /&gt;
===HW1===&lt;br /&gt;
This week's homework consists of two assignments.&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Consider the sequence of numbers: 1 followed by 10&amp;lt;sup&amp;gt;8&amp;lt;/sup&amp;gt; values of 10&amp;lt;sup&amp;gt;-8&amp;lt;/sup&amp;gt;&lt;br /&gt;
* These should sum to 2.&lt;br /&gt;
* Write code which sums up those values in order. What answer does it get? (A minimal sketch follows this list.)&lt;br /&gt;
* Add to the program a routine which sums up the values in reverse order. Does it get the correct answer?&lt;br /&gt;
* How would you get the correct answer?&lt;br /&gt;
* Submit your code, Makefile, and a text file with your answers.&lt;br /&gt;
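&lt;br /&gt;
A minimal sketch of the summation experiment, in single precision (the effect is easiest to see with float; it is instructive to try double as well):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// sumorder.cc: sum 1 followed by 10^8 values of 10^-8, in single precision,&lt;br /&gt;
// forward and in reverse order, and compare with the exact answer 2.&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    const int n = 100000000;            // 10^8 small terms&lt;br /&gt;
&lt;br /&gt;
    float forward = 1.0f;               // start with the large term&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++)&lt;br /&gt;
        forward += 1.0e-8f;&lt;br /&gt;
&lt;br /&gt;
    float reverse = 0.0f;               // accumulate the small terms first&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++)&lt;br /&gt;
        reverse += 1.0e-8f;&lt;br /&gt;
    reverse += 1.0f;&lt;br /&gt;
&lt;br /&gt;
    std::cout &amp;lt;&amp;lt; &amp;quot;forward sum = &amp;quot; &amp;lt;&amp;lt; forward &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
    std::cout &amp;lt;&amp;lt; &amp;quot;reverse sum = &amp;quot; &amp;lt;&amp;lt; reverse &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
    std::cout &amp;lt;&amp;lt; &amp;quot;exact sum   = 2&amp;quot; &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;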
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Implement a linear congruential generator with a = 106, c = 1283, m = 6075 that generates random numbers in 0..1 (a minimal sketch follows this list).&lt;br /&gt;
* Using that and the Mersenne Twister (MT): generate 10,000 pairs (dx, dy) with dx and dy each in -0.1 .. +0.1. Generate histograms of dx and dy (say 200 bins). Do they look okay? What would you expect the variation to be?&lt;br /&gt;
* For 10,000 points: take random walks from (x,y)=(0,0) until they exceed a radius of 2, then stop. Plot a histogram of the final angles for the two pseudo-random number generators. What do you see?&lt;br /&gt;
* Submit makefile, code, plots, git log.&lt;br /&gt;
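&lt;br /&gt;
A minimal sketch of the linear congruential generator (the seed is arbitrary, and its very short period, at most m, is part of what the exercise is meant to expose; for the MT generator you could use, e.g., the GSL or std::mt19937 from C++11's &amp;lt;random&amp;gt;):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// lcg.cc: linear congruential generator x_{k+1} = (a x_k + c) mod m&lt;br /&gt;
// with a = 106, c = 1283, m = 6075, scaled to uniform numbers in [0,1).&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
&lt;br /&gt;
unsigned int lcg_state = 1234;          // seed; the value is arbitrary&lt;br /&gt;
&lt;br /&gt;
double lcg_uniform()&lt;br /&gt;
{&lt;br /&gt;
    const unsigned int a = 106, c = 1283, m = 6075;&lt;br /&gt;
    lcg_state = (a*lcg_state + c) % m;&lt;br /&gt;
    return double(lcg_state)/m;         // uniform in [0,1)&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    // print a few samples; map u in [0,1) to a step dx in [-0.1,0.1) as 0.2*u - 0.1&lt;br /&gt;
    for (int i = 0; i &amp;lt; 10; i++) {&lt;br /&gt;
        double u = lcg_uniform();&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; u &amp;lt;&amp;lt; &amp;quot;  &amp;quot; &amp;lt;&amp;lt; 0.2*u - 0.1 &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;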
&lt;br /&gt;
Both assignments due on Thursday Feb 28th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW2===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
* Compute numerically (using the GSL):&lt;br /&gt;
&lt;br /&gt;
::&amp;amp;int;&amp;lt;sub&amp;gt;0&amp;lt;/sub&amp;gt;&amp;lt;sup&amp;gt;3&amp;lt;/sup&amp;gt; f(x) &amp;amp;nbsp;dx&lt;br /&gt;
&lt;br /&gt;
:(that is the integral of f(x) from x=0 to x=3)&lt;br /&gt;
&lt;br /&gt;
:with&lt;br /&gt;
&lt;br /&gt;
::f(x) = ln(x) sin(x) e&amp;lt;sup&amp;gt;-x&amp;lt;/sup&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:using three different methods:&lt;br /&gt;
# Extended Simpson's rule&lt;br /&gt;
# Gauss-Legendre quadrature&lt;br /&gt;
# Monte Carlo sampling &lt;br /&gt;
&lt;br /&gt;
*Hint: what is f(0)?&lt;br /&gt;
&lt;br /&gt;
*Compare the convergence of these methods by increasing the number of function evaluations (a minimal GSL sketch for the Gauss-Legendre method follows this list).&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log. &lt;br /&gt;
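&lt;br /&gt;
A minimal sketch for the Gauss-Legendre method, using the GSL's fixed-order routines (link with -lgsl -lgslcblas); the extended Simpson's rule and the Monte Carlo sampling are left for you to set up.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// glfixed.cc: integrate f(x) = ln(x) sin(x) exp(-x) on [0,3] with&lt;br /&gt;
// fixed-order Gauss-Legendre quadrature from the GSL.&lt;br /&gt;
// Note: f(0) is singular, but the Gauss-Legendre nodes avoid the endpoints.&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
#include &amp;lt;gsl/gsl_integration.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double f(double x, void *params)&lt;br /&gt;
{&lt;br /&gt;
    return std::log(x)*std::sin(x)*std::exp(-x);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    gsl_function F;&lt;br /&gt;
    F.function = &amp;amp;f;&lt;br /&gt;
    F.params   = 0;&lt;br /&gt;
&lt;br /&gt;
    // compare the result for increasing numbers of quadrature points&lt;br /&gt;
    for (size_t n = 2; n &amp;lt;= 128; n *= 2) {&lt;br /&gt;
        gsl_integration_glfixed_table *t = gsl_integration_glfixed_table_alloc(n);&lt;br /&gt;
        double result = gsl_integration_glfixed(&amp;amp;F, 0.0, 3.0, t);&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; n &amp;lt;&amp;lt; &amp;quot; points: &amp;quot; &amp;lt;&amp;lt; result &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
        gsl_integration_glfixed_table_free(t);&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;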
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
* Using an adaptive 4th order Runge-Kutta approach, with a relative accuracy of 1e-4, compute the solution for t in [0,100] of the following set of coupled ODEs (the Lorenz oscillator):&lt;br /&gt;
&lt;br /&gt;
::dx/dt = &amp;amp;sigma;(y - x)&lt;br /&gt;
&lt;br /&gt;
::dy/dt = (&amp;amp;rho;-z)x-y&lt;br /&gt;
&lt;br /&gt;
::dz/dt = xy - &amp;amp;beta;z&lt;br /&gt;
&lt;br /&gt;
:with &amp;amp;sigma;=10; &amp;amp;beta;=8/3; &amp;amp;rho; = 28, and with initial conditions&lt;br /&gt;
&lt;br /&gt;
::x(0) = 10&lt;br /&gt;
&lt;br /&gt;
::y(0) = 20&lt;br /&gt;
&lt;br /&gt;
::z(0) = 30&lt;br /&gt;
&lt;br /&gt;
* Hint: study the GSL documentation (a minimal sketch using the GSL ODE solver follows this list).&lt;br /&gt;
&lt;br /&gt;
*Submit makefile, code, plots, version control log.&lt;br /&gt;
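&lt;br /&gt;
A minimal sketch using the GSL's ODE solver with the adaptive embedded Runge-Kutta-Fehlberg (4,5) stepper (link with -lgsl -lgslcblas; the number of output points is arbitrary):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// lorenz.cc: integrate the Lorenz oscillator with an adaptive&lt;br /&gt;
// Runge-Kutta-Fehlberg (4,5) stepper from the GSL, rel. accuracy 1e-4.&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;gsl/gsl_errno.h&amp;gt;&lt;br /&gt;
#include &amp;lt;gsl/gsl_odeiv2.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int lorenz(double t, const double y[], double dydt[], void *params)&lt;br /&gt;
{&lt;br /&gt;
    const double sigma = 10.0, beta = 8.0/3.0, rho = 28.0;&lt;br /&gt;
    dydt[0] = sigma*(y[1] - y[0]);&lt;br /&gt;
    dydt[1] = (rho - y[2])*y[0] - y[1];&lt;br /&gt;
    dydt[2] = y[0]*y[1] - beta*y[2];&lt;br /&gt;
    return GSL_SUCCESS;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    gsl_odeiv2_system sys = {lorenz, 0, 3, 0};   // no jacobian needed, 3 equations&lt;br /&gt;
&lt;br /&gt;
    // driver: initial step 1e-3, absolute accuracy 0, relative accuracy 1e-4&lt;br /&gt;
    gsl_odeiv2_driver *d =&lt;br /&gt;
        gsl_odeiv2_driver_alloc_y_new(&amp;amp;sys, gsl_odeiv2_step_rkf45, 1e-3, 0.0, 1e-4);&lt;br /&gt;
&lt;br /&gt;
    double t = 0.0;&lt;br /&gt;
    double y[3] = {10.0, 20.0, 30.0};            // initial conditions&lt;br /&gt;
&lt;br /&gt;
    for (int i = 1; i &amp;lt;= 1000; i++) {            // 1000 output points up to t = 100&lt;br /&gt;
        double ti = i*0.1;&lt;br /&gt;
        if (gsl_odeiv2_driver_apply(d, &amp;amp;t, ti, y) != GSL_SUCCESS) break;&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; t &amp;lt;&amp;lt; &amp;quot; &amp;quot; &amp;lt;&amp;lt; y[0] &amp;lt;&amp;lt; &amp;quot; &amp;quot; &amp;lt;&amp;lt; y[1] &amp;lt;&amp;lt; &amp;quot; &amp;quot; &amp;lt;&amp;lt; y[2] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
    }&lt;br /&gt;
    gsl_odeiv2_driver_free(d);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;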
&lt;br /&gt;
Both assignments due on Thursday Mar 7th, 2013, at 9:00 am. Email the files to rzon@scinethpc.ca and ljdursi@scinethpc.ca.&lt;br /&gt;
&lt;br /&gt;
===HW3===&lt;br /&gt;
&lt;br /&gt;
Part 1:&lt;br /&gt;
&lt;br /&gt;
The time-explicit formulation of the 1d diffusion equation looks like this:&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{eqnarray*}&lt;br /&gt;
q^{n+1} &amp;amp; = &amp;amp; q^n + \frac{D \Delta t}{\Delta x^2} &lt;br /&gt;
\left (&lt;br /&gt;
\begin{matrix}&lt;br /&gt;
-2 &amp;amp; 1 \\&lt;br /&gt;
1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; \cdots &amp;amp; \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; 1 &amp;amp; -2 &amp;amp; 1 \\&lt;br /&gt;
&amp;amp;  &amp;amp;  &amp;amp; &amp;amp; 1 &amp;amp; -2 \\&lt;br /&gt;
\end{matrix}&lt;br /&gt;
\right ) q^n \\&lt;br /&gt;
&amp;amp; = &amp;amp; \left ( 1 + \frac{D \Delta t}{\Delta x^2} A \right ) q^n&lt;br /&gt;
\end{eqnarray*}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What are the eigenvalues of the matrix A?   What modes would we expect to be amplified or damped by this operator?&lt;br /&gt;
&lt;br /&gt;
* Consider 100 points in the discretization (e.g., A is 100x100)&lt;br /&gt;
* Calculate the eigenvalues and eigenvectors (using D__EV; which sort of matrix are we using here?)&lt;br /&gt;
* Plot the modes with the largest and smallest absolute values of the eigenvalues, and explain their physical significance&lt;br /&gt;
* The numerical method becomes unstable when one eigenmode $v$ begins to grow uncontrollably whenever it is present, e.g.&lt;br /&gt;
$ \frac{D \Delta t}{\Delta x^2} A v = \frac{D \Delta t}{\Delta x^2} \lambda v &amp;gt; v$.   In a timestepping solution, the only way to avoid this for a given set of physical parameters and grid size is to reduce the timestep, $\Delta t$.   Use the eigenvalue with the largest absolute value to place a constraint on $\Delta t$ for stability.&lt;br /&gt;
&lt;br /&gt;
Part 2:&lt;br /&gt;
&lt;br /&gt;
Using the above constraint on $\Delta t$, for a 1d grid of size 100 (e.g., a 100x100 matrix A), evolve this PDE using LAPACK/BLAS. Plot and explain the results.&lt;br /&gt;
&lt;br /&gt;
* Have an initial condition of $q(x=0,t=0) = 1$, and $q(t=0)$ everywhere else being zero (e.g., a hot plate just turned on at the left)&lt;br /&gt;
* Take ~100 timesteps and plot the evolution of $q(x,t)$ at 5 times over that period.&lt;br /&gt;
* You’ll want to use a BLAS call to compute the matrix-vector multiply ( http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html). Do the multiply in double precision (D__MV). Which one should you use?&lt;br /&gt;
* The GSL has a cblas interface, http://www.gnu.org/software/gsl/manual/html_node/Level-2-GSL-BLAS-Interface.html ; an example of its use can be found here http://www.gnu.org/software/gsl/manual/html_node/GSL-CBLAS-Examples.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Important things to know about LAPACK:&lt;br /&gt;
* If you are using an nxn array, the “leading dimension” of the array is n. (This argument is there so that you could work on sub-matrices if you wanted.)&lt;br /&gt;
* You have to make sure the 2d array is a contiguous block of memory.&lt;br /&gt;
* You'll (presumably) want to use the C bindings for LAPACK - [http://www.netlib.org/lapack/lapacke.html lapacke].  Note that the usual C arrays are row-major.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Here's a simple example of calling a LAPACKE routine; note that how the matrix is described (here with a pointer to the data, a leading dimension, and the number of rows and columns) will vary with different types of matrix:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;mkl_lapacke.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m);&lt;br /&gt;
void free_matrix(double **a);&lt;br /&gt;
&lt;br /&gt;
int main (int argc, const char * argv[])&lt;br /&gt;
{&lt;br /&gt;
&lt;br /&gt;
   const int n=5;             // number of rows, columns of the matrix&lt;br /&gt;
   const int m = n;           // nrows&lt;br /&gt;
   const int leading_dim_A=n; // leading dimension (# of cols for row major);&lt;br /&gt;
                              // lets us operate on sub-matrices in principle&lt;br /&gt;
   const int leading_dim_b=n; // similarly for b&lt;br /&gt;
   double **A;&lt;br /&gt;
   double *b;&lt;br /&gt;
&lt;br /&gt;
   b = new double[leading_dim_b];&lt;br /&gt;
   A = matrix(n,leading_dim_A);&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;n; i++)&lt;br /&gt;
       for (int j=0; j&amp;lt;leading_dim_A; j++)&lt;br /&gt;
            A[i][j] = 0.;&lt;br /&gt;
&lt;br /&gt;
   // let's do a trivial solve&lt;br /&gt;
   // It should be pretty clear that the solution to this system&lt;br /&gt;
   // is x = {0,1,2...n-1}&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_A; i++) {&lt;br /&gt;
        A[i][i] = 2.;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   for (int i=0; i&amp;lt;leading_dim_b; i++) {&lt;br /&gt;
        b[i]    = 2*i;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
   const char transpose='N';     //solve Ax=b, not A^T x = b&lt;br /&gt;
   const int  nrhs = 1;          //  we're only solving 1 right hand side&lt;br /&gt;
   int info;&lt;br /&gt;
&lt;br /&gt;
   // Call DGELS; b will be overwritten with the value of x.&lt;br /&gt;
   info = LAPACKE_dgels(LAPACK_COL_MAJOR,transpose,m,n,nrhs,&lt;br /&gt;
                          &amp;amp;(A[0][0]),leading_dim_A, &amp;amp;(b[0]),leading_dim_b);&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
   // print results&lt;br /&gt;
   for(int i=0;i&amp;lt;n;i++)&lt;br /&gt;
   {&lt;br /&gt;
      if (i != n/2)&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;    &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
      else&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; &amp;quot;x = &amp;quot; &amp;lt;&amp;lt; b[i] &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
   }&lt;br /&gt;
   free_matrix(A);     // release the matrix storage&lt;br /&gt;
   delete[] b;         // and the right-hand side vector&lt;br /&gt;
&lt;br /&gt;
   return(info);&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
double **matrix(int n,int m) {&lt;br /&gt;
   double **a = new double * [n];&lt;br /&gt;
   a[0] = new double [n*m];&lt;br /&gt;
&lt;br /&gt;
   for (int i=1; i&amp;lt;n; i++)&lt;br /&gt;
         a[i] = &amp;amp;a[0][i*m];&lt;br /&gt;
&lt;br /&gt;
   return a;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
void free_matrix(double **a) {&lt;br /&gt;
   delete[] a[0];&lt;br /&gt;
   delete[] a;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===HW4===&lt;br /&gt;
&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
&lt;br /&gt;
Trigonometric interpolation uses an n-point Fourier series to find values at intermediate points. It is one way of downscaling data onto a finer grid, and was a motivation for Gauss's early work on the fast Fourier transform, applied to planetary motion.&lt;br /&gt;
&lt;br /&gt;
The way it works is:&lt;br /&gt;
&lt;br /&gt;
# Fourier-transform your data.&lt;br /&gt;
# Add frequencies above the Nyquist frequency (in absolute value), but set the amplitudes of all the new frequencies to zero.&lt;br /&gt;
# Note that the frequencies are stored such that, e.g., f&amp;lt;sub&amp;gt;n-1&amp;lt;/sub&amp;gt; corresponds to the low frequency -1.&lt;br /&gt;
# The resulting 2n array can be transformed back, and now gives an interpolated signal.&lt;br /&gt;
&lt;br /&gt;
For this assignment, write an application that reads in an image from a binary file into a 2d double-precision array (this will require converting from bytes to doubles), and creates an image twice the size in all directions using trigonometric interpolation. Use a real-to-half-complex version of the FFTW (note: in 2d, this version of the FFTW mixes Fourier components with the same physical magnitude of their wave number k, so this will work). A one-dimensional sketch of the interpolation step is given below.&lt;br /&gt;
You can process the red, green and blue values separately. &lt;br /&gt;
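&lt;br /&gt;
Here is a minimal one-dimensional sketch of the interpolation step with FFTW 3's real-to-half-complex transforms (link with -lfftw3); the test signal and sizes are made up, and extending the bookkeeping to 2d is part of the assignment.&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// triginterp1d.cc: trigonometric interpolation of a real signal of length n&lt;br /&gt;
// onto 2n points, using FFTW's real-to-half-complex (r2r) transforms.&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;cmath&amp;gt;&lt;br /&gt;
#include &amp;lt;fftw3.h&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int main()&lt;br /&gt;
{&lt;br /&gt;
    const int n = 8, m = 2*n;&lt;br /&gt;
    const double pi = 4*std::atan(1.0);&lt;br /&gt;
    double in[n], hc[n];                // signal and its half-complex transform&lt;br /&gt;
    double HC[m], out[m];               // zero-padded transform, interpolated signal&lt;br /&gt;
&lt;br /&gt;
    for (int i = 0; i &amp;lt; n; i++)         // a made-up test signal&lt;br /&gt;
        in[i] = std::sin(2*pi*i/n) + 0.5*std::cos(2*pi*2*i/n);&lt;br /&gt;
&lt;br /&gt;
    fftw_plan fwd = fftw_plan_r2r_1d(n, in, hc, FFTW_R2HC, FFTW_ESTIMATE);&lt;br /&gt;
    fftw_plan bwd = fftw_plan_r2r_1d(m, HC, out, FFTW_HC2R, FFTW_ESTIMATE);&lt;br /&gt;
&lt;br /&gt;
    fftw_execute(fwd);&lt;br /&gt;
&lt;br /&gt;
    // half-complex layout: hc[k] = Re X_k for k=0..n/2, hc[n-k] = Im X_k for k=1..n/2-1.&lt;br /&gt;
    // Copy into the length-2n array, leaving the new high frequencies at zero.&lt;br /&gt;
    for (int i = 0; i &amp;lt; m; i++) HC[i] = 0.0;&lt;br /&gt;
    for (int k = 0; k &amp;lt;= n/2; k++) HC[k]   = hc[k];      // real parts&lt;br /&gt;
    for (int k = 1; k &amp;lt;  n/2; k++) HC[m-k] = hc[n-k];    // imaginary parts&lt;br /&gt;
    HC[n/2] *= 0.5;   // the old Nyquist bin becomes an interior frequency;&lt;br /&gt;
                      // splitting it keeps the interpolation exact&lt;br /&gt;
&lt;br /&gt;
    fftw_execute(bwd);&lt;br /&gt;
&lt;br /&gt;
    // FFTW transforms are unnormalized; dividing by n gives out[2*i] = in[i]&lt;br /&gt;
    for (int j = 0; j &amp;lt; m; j++)&lt;br /&gt;
        std::cout &amp;lt;&amp;lt; j/2.0 &amp;lt;&amp;lt; &amp;quot; &amp;quot; &amp;lt;&amp;lt; out[j]/n &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
&lt;br /&gt;
    fftw_destroy_plan(fwd);&lt;br /&gt;
    fftw_destroy_plan(bwd);&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;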
&lt;br /&gt;
&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
&lt;br /&gt;
Write an application which reads an image and performs a low-pass filter on the image, i.e., any Fourier components with magnitude k larger than n/8 are set to zero, after which the inverse Fourier transform is taken and the image is written out to disk again. Use the same FFT technique as in the first assignment.&lt;br /&gt;
&lt;br /&gt;
'''Input image'''&lt;br /&gt;
&lt;br /&gt;
Use [[Media:gauss256.tgz|this image of Gauss]].&lt;br /&gt;
&lt;br /&gt;
'''Image format:'''&lt;br /&gt;
&lt;br /&gt;
Use the following simple PPM format:&lt;br /&gt;
&lt;br /&gt;
First line (ascii): 'P6\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Second line, in ascii, 'width height\n'&amp;lt;br&amp;gt;&lt;br /&gt;
Third line, in ascii, 'maxcolorvalue\n' (this is typically just 255)&amp;lt;br&amp;gt;&lt;br /&gt;
Following that, in binary, are byte-triplets with the red, green and blue values of each pixel.&amp;lt;br&amp;gt;&lt;br /&gt;
Note: in C/C++, the 'unsigned char' data type matches the concept of a byte best (for most machines anyway). A minimal sketch of reading this format appears below.&lt;br /&gt;
&lt;br /&gt;
In fact, between the first and second line, one can have comment lines that start with '#'.&lt;br /&gt;
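&lt;br /&gt;
A minimal sketch of reading this format into memory (error handling omitted; writing an image back out follows the same steps in reverse):&lt;br /&gt;
&amp;lt;source lang=&amp;quot;cpp&amp;quot;&amp;gt;&lt;br /&gt;
// readppm.cc: minimal reader for the simple binary (P6) PPM format above.&lt;br /&gt;
#include &amp;lt;iostream&amp;gt;&lt;br /&gt;
#include &amp;lt;fstream&amp;gt;&lt;br /&gt;
#include &amp;lt;string&amp;gt;&lt;br /&gt;
#include &amp;lt;vector&amp;gt;&lt;br /&gt;
#include &amp;lt;cstdlib&amp;gt;&lt;br /&gt;
&lt;br /&gt;
int main(int argc, char *argv[])&lt;br /&gt;
{&lt;br /&gt;
    if (argc &amp;lt; 2) { std::cerr &amp;lt;&amp;lt; &amp;quot;usage: readppm FILE.ppm&amp;quot; &amp;lt;&amp;lt; std::endl; return 1; }&lt;br /&gt;
&lt;br /&gt;
    std::ifstream f(argv[1], std::ios::binary);&lt;br /&gt;
    std::string magic;&lt;br /&gt;
    f &amp;gt;&amp;gt; magic;                         // should be &amp;quot;P6&amp;quot;&lt;br /&gt;
&lt;br /&gt;
    std::string word;&lt;br /&gt;
    std::vector&amp;lt;int&amp;gt; fields;            // width, height, maxcolorvalue&lt;br /&gt;
    while (fields.size() &amp;lt; 3 &amp;amp;&amp;amp; f &amp;gt;&amp;gt; word) {&lt;br /&gt;
        if (word[0] == '#') { std::getline(f, word); continue; }   // skip comment lines&lt;br /&gt;
        fields.push_back(std::atoi(word.c_str()));&lt;br /&gt;
    }&lt;br /&gt;
    int width = fields[0], height = fields[1], maxval = fields[2];&lt;br /&gt;
    f.get();                            // consume the single whitespace after maxval&lt;br /&gt;
&lt;br /&gt;
    std::vector&amp;lt;unsigned char&amp;gt; pixels(3*width*height);   // r,g,b byte triplets&lt;br /&gt;
    f.read(reinterpret_cast&amp;lt;char*&amp;gt;(&amp;amp;pixels[0]), pixels.size());&lt;br /&gt;
&lt;br /&gt;
    // convert, e.g., the red channel to doubles for further processing&lt;br /&gt;
    std::vector&amp;lt;double&amp;gt; red(width*height);&lt;br /&gt;
    for (int i = 0; i &amp;lt; width*height; i++)&lt;br /&gt;
        red[i] = double(pixels[3*i]);&lt;br /&gt;
&lt;br /&gt;
    std::cout &amp;lt;&amp;lt; magic &amp;lt;&amp;lt; &amp;quot; &amp;quot; &amp;lt;&amp;lt; width &amp;lt;&amp;lt; &amp;quot;x&amp;quot; &amp;lt;&amp;lt; height&lt;br /&gt;
              &amp;lt;&amp;lt; &amp;quot; maxval &amp;quot; &amp;lt;&amp;lt; maxval &amp;lt;&amp;lt; std::endl;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;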
&lt;br /&gt;
=Part 3: High Performance Scientific Computing=&lt;br /&gt;
&lt;br /&gt;
==Prerequisites==&lt;br /&gt;
&lt;br /&gt;
Part 1 or good C++ programming skills, including experience with make and the unix/linux command prompt.&lt;br /&gt;
&lt;br /&gt;
'''Software that you'll need'''&lt;br /&gt;
&lt;br /&gt;
You will need to bring a laptop with an ssh client. Hands-on parts will be done on SciNet's GPC cluster.&lt;br /&gt;
&lt;br /&gt;
For those who don't have a SciNet account yet, the instructions can be found at http://wiki.scinethpc.ca/wiki/index.php/Essentials#Accounts&lt;br /&gt;
&lt;br /&gt;
==Dates==&lt;br /&gt;
March 19, 21, 26, and 28, 2013&amp;lt;br&amp;gt;&lt;br /&gt;
April 2, 4, 9, and 11, 2013&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
===''Lecture 1:'' Introduction to Parallel Programming ===&lt;br /&gt;
:::[[File:Lecture17-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.html]]&lt;br /&gt;
:::[[Media:Lecture17-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture17-2013/lecture17-2013.mp4 Video recording]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 2:'' Parallel Computing Paradigms ===&lt;br /&gt;
&lt;br /&gt;
:::[[File:Lecture18-2013-FirstFrame.png|180px|link=http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.html]]&lt;br /&gt;
:::[[Media:Lecture18-2013.pdf|Slides]] &amp;amp;nbsp;/ &amp;amp;nbsp; [http://support.scinet.utoronto.ca/CourseVideo/SCcourse/lecture18-2013/lecture18-2013.mp4 Video recording] &amp;amp;nbsp;/ &amp;amp;nbsp; [[#HW1_3|homework 1]]&lt;br /&gt;
&lt;br /&gt;
===''Lectures 3,4:''  Shared Memory Programming with OpenMP, part 1,2===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture19-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
===''Lecture 5:'' Distributed Parallel Programming with MPI, part 1===&lt;br /&gt;
&lt;br /&gt;
:::[[Media:Lecture21-2013.pdf|Slides]]&lt;br /&gt;
&lt;br /&gt;
''Lecture 6''&amp;amp;nbsp;&amp;amp;nbsp; Distributed Parallel Programming with MPI, part 2&amp;lt;br&amp;gt;&lt;br /&gt;
''Lecture 7''&amp;amp;nbsp;&amp;amp;nbsp; Distributed Parallel Programming with MPI, part 3&amp;lt;br&amp;gt;&lt;br /&gt;
''Lecture 8''&amp;amp;nbsp;&amp;amp;nbsp; Hybrid OpenMP+MPI Programming&lt;br /&gt;
&lt;br /&gt;
== Homework assignments ==&lt;br /&gt;
&lt;br /&gt;
=== HW1 ===&lt;br /&gt;
&lt;br /&gt;
* Read the SciNet tutorial (as it pertains to the GPC)&lt;br /&gt;
* Read the GPC Quick Start.&lt;br /&gt;
* Get the first set of code:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
   $ cd $SCRATCH&lt;br /&gt;
   $ git clone /scinet/course/sc3/homework1&lt;br /&gt;
   $ cd homework1&lt;br /&gt;
   $ source setup&lt;br /&gt;
   $ make&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
*This contains the threaded program 'blurppm' and 266 ppm images to be blurred. Usage:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  blurppm INPUTPPM OUTPUTPPM BLURRADIUS NUMBEROFTHREADS&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
* Simple test:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
  $ qsub -l nodes=1:ppn=8,walltime=2:00:00 -I -X -qdebug&lt;br /&gt;
  $ cd $SCRATCH/homework1&lt;br /&gt;
  $ time blurppm 001.ppm new001.ppm 30 1&lt;br /&gt;
  real  0m52.900s&lt;br /&gt;
  user  0m52.881s&lt;br /&gt;
  sys   0m0.008s&lt;br /&gt;
  $ display 001.ppm &amp;amp;&lt;br /&gt;
  $ display new001.ppm &amp;amp;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
''Assignment 1''&lt;br /&gt;
* Time blurppm with a BLURRADIUS ranging from 1 to 41 in steps of 4, and for NUMBEROFTHREADS ranging from 1 to 16.  Record the (real) duration of each run.&lt;br /&gt;
* Plot the duration as a function of NUMBEROFTHREADS, as well as  the speed-up and efficiency.&lt;br /&gt;
* Submit the script and plots of the duration, speedup and efficiency as a function of NUMBEROFTHREADS.&lt;br /&gt;
''Assignment 2''&lt;br /&gt;
* Use GNU parallel to run blurppm on all 266 images with a radius of 41.&lt;br /&gt;
* Investigate different scenarios:&lt;br /&gt;
:# Have GNU parallel run 16 at a time with just 1 thread.&lt;br /&gt;
:# Have GNU parallel run 8 at a time with 2 threads.&lt;br /&gt;
:# Have GNU parallel run 4 at a time with 4 threads.&lt;br /&gt;
:# Have GNU parallel run 2 at a time with 8 threads.&lt;br /&gt;
:# Have GNU parallel run 1 at a time with 16 threads.&lt;br /&gt;
:Record the total time it takes in each of these scenarios.&lt;br /&gt;
* Repeat this with a BLURRADIUS of 3.&lt;br /&gt;
* Submit scripts, timing data  and plots.&lt;br /&gt;
&lt;br /&gt;
=Links=&lt;br /&gt;
&lt;br /&gt;
==Unix==&lt;br /&gt;
* Cygwin: http://www.cygwin.com&lt;br /&gt;
* Linux Command Line: A Primer (June 2012) [[Media:SS_IntroToShell.pdf|Slides,]] [[Media:SS_IntroToShell.tgz|Files]]&lt;br /&gt;
* Intro to unix shell from software carpentry: http://software-carpentry.org/4_0/shell&lt;br /&gt;
&lt;br /&gt;
==C/C++==&lt;br /&gt;
* [[One-Day Scientific C++ Class]] at SciNet&lt;br /&gt;
* C++ library reference: http://www.cplusplus.com/reference&lt;br /&gt;
* C preprocessor: http://www.cprogramming.com/tutorial/cpreprocessor.html&lt;br /&gt;
* Boost: http://www.boost.org&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==Git==&lt;br /&gt;
* Git: http://git-scm.com&lt;br /&gt;
* Version Control: [http://support.scinet.utoronto.ca/CourseVideo/PPPcourse/Thursday_Morning_BP_Revision_Control/Thursday_Morning_BP_Revision_Control.mp4 Video]/ [[Media:Snug_techtalk_revcontrol.pdf | Slides]]&lt;br /&gt;
* Git cheat sheet from Git Tower: http://www.git-tower.com/files/cheatsheet/Git_Cheat_Sheet_grey.pdf&lt;br /&gt;
&lt;br /&gt;
==Python==&lt;br /&gt;
* Python: http://www.python.org&lt;br /&gt;
* IPython: http://ipython.org&lt;br /&gt;
* Matplotlib: http://www.matplotlib.org&lt;br /&gt;
* Enthought python distribution: http://www.enthought.com/products/edudownload.php&amp;lt;br/&amp;gt;&lt;br /&gt;
(this gives you numpy, matplotlib and ipython all installed in one fell swoop)&lt;br /&gt;
&lt;br /&gt;
* Intro to python from software carpentry: http://software-carpentry.org/4_0/python&lt;br /&gt;
* Tutorial on matplotlib: http://conference.scipy.org/scipy2011/tutorials.php#jonathan&lt;br /&gt;
* Npy file format: https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt&lt;br /&gt;
* Boost Python tutorial: http://www.boost.org/doc/libs/1_53_0/libs/python/doc/tutorial/doc/html/index.html&lt;br /&gt;
&lt;br /&gt;
==ODEs==&lt;br /&gt;
* Integrators for particle based ODEs (i.e. molecular dynamics): http://www.chem.utoronto.ca/~rzon/simcourse/partmd.pdf. &amp;lt;br&amp;gt;'''Focus on 4.1.4 - 4.1.6 for practical aspects.'''&lt;br /&gt;
* Numerical algorithm to solve ODEs (General) in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Chapter 16&lt;br /&gt;
&lt;br /&gt;
==Interpolation (2D) ==&lt;br /&gt;
* Interpolation in ''Numerical Recipes for C'': http://apps.nrbook.com/c/index.html Pages 123-128&lt;br /&gt;
* Wikipedia pages on [http://en.wikipedia.org/wiki/Bilinear_interpolation Bilinear Interpolation] and [http://en.wikipedia.org/wiki/Bicubic_interpolation Bicubic Interpolation] are not bad either.&lt;br /&gt;
&lt;br /&gt;
==BLAS==&lt;br /&gt;
* [http://www.tacc.utexas.edu/tacc-projects/gotoblas2 gotoblas]&lt;br /&gt;
* [http://math-atlas.sourceforge.net/ ATLAS]&lt;br /&gt;
&lt;br /&gt;
==LAPACK==&lt;br /&gt;
* http://www.netlib.org/lapack&lt;br /&gt;
&lt;br /&gt;
==GSL==&lt;br /&gt;
* GNU Scientific Library: http://www.gnu.org/s/gsl&lt;br /&gt;
&lt;br /&gt;
==FFT==&lt;br /&gt;
* FFTW: http://www.fftw.org&lt;br /&gt;
&lt;br /&gt;
==Top500==&lt;br /&gt;
* TOP500 Supercomputing Sites: http://top500.org&lt;br /&gt;
&lt;br /&gt;
==OpenMP==&lt;br /&gt;
* OpenMP (open multi-processing) application programming interface for shared memory programming: http://openmp.org&lt;br /&gt;
&lt;br /&gt;
==GNU parallel==&lt;br /&gt;
* Official citation: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.&lt;br /&gt;
* [[Media:Tech-talk-gnu-parallel.pdf|Slides of the SciNet TechTalk on Gnu Parallel (14 Nov 2012)]]&lt;br /&gt;
* The documentation for GNU parallel can be found at http://www.gnu.org/software/parallel/&lt;br /&gt;
* Its man page can be found here http://www.gnu.org/software/parallel/man.html&lt;br /&gt;
* The man page is also available on the GPC when the gnu-parallel module is loaded, with the command &amp;lt;code&amp;gt;$ man parallel&amp;lt;/code&amp;gt;. The man page contains options, such as how to make sure the output is not all scrambled, and examples.&lt;br /&gt;
&lt;br /&gt;
==SciNet==&lt;br /&gt;
&lt;br /&gt;
Anything on this wiki, really, but specifically:&lt;br /&gt;
* [[Essentials|SciNet Essentials]]&lt;br /&gt;
* [[GPC Quickstart]]&lt;br /&gt;
* [[Media:SciNet_Tutorial.pdf |SciNet User Tutorial]]&lt;br /&gt;
* [[Software and Libraries]]&lt;br /&gt;
&lt;br /&gt;
==Other Resources==&lt;br /&gt;
* [http://galileo.phys.virginia.edu/classes/551.jvn.fall01/goldberg.pdf What Every Computer Scientist Should Know About Floating-Point Arithmetic] - the classic (and extremely comprehensive) overview of the basics of floating point math.   The first few pages, in particular, are very useful.&lt;br /&gt;
* [http://arxiv.org/abs/1005.4117 Random Numbers In Scientific Computing: An Introduction] by Katzgraber.   A very lucid discussion of pseudo random number generators for science.&lt;/div&gt;</summary>
		<author><name>Ljdursi</name></author>
	</entry>
</feed>