GPU Devel Nodes


GPU Development Cluster
  Installed:         April 2011
  Operating System:  Linux
  Interconnect:      Infiniband
  RAM/Node:          48 GB
  Cores/Node:        8
  Login/Devel Node:  arc01 (from login.scinet)
  Vendor Compilers:  gcc, nvcc

Each Intel node has two 4-core Xeon X5550 2.67GHz CPUs with 48 GB of RAM, along with two NVIDIA M2070 (Fermi) GPUs, each with 6 GB of RAM.

Login

First log in via ssh with your SciNet account at login.scinet.utoronto.ca, and from there you can proceed to arc01, which is the GPU development node.

Access to these machines is currently controlled. Please email support@scinet.utoronto.ca for access.

Compile/Devel/Compute Nodes

Nehalem (x86_64)

Software

The same software installed on the GPC is available on ARC using the same modules framework. See the GPC documentation for full details.

Driver Version

The current NVIDIA driver version installed is 270.40.

Programming Frameworks

Currently there are two programming frameworks to use: NVIDIA's CUDA framework or OpenCL.

CUDA

The current CUDA Toolkits in use are 3.0, 3.1, 3.2 (default) and 4.0. To use 3.2, just load the following module:

module load cuda/3.2

Note that to use the full 6 GB of memory per GPU, at least CUDA 3.2 must be used.
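A quick way to check which GPUs the loaded toolkit sees, and whether the full 6 GB per card is visible, is a small device-query program such as the sketch below. It is illustrative only (the file name and output format are arbitrary); it uses only standard CUDA runtime calls and can be compiled with nvcc.

// devquery.cu -- list the GPUs visible on this node (illustrative sketch)
// possible compile line (adjust as needed): nvcc -o devquery devquery.cu
#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("Device %d: %s, %.1f GB global memory, compute capability %d.%d\n",
               dev, prop.name,
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0),
               prop.major, prop.minor);
    }
    return 0;
}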

The CUDA driver is installed locally; however, the CUDA Toolkits are installed in:

/project/scinet/arc/cuda-$VERSION/

The variable $SCINET_CUDA_INSTALL is set when a cuda module is loaded and points to the install location. This is useful when setting up your makefile; if you use the NVIDIA_SDK makefiles, modify the NVIDIA_SDK/C/common/common.mk file accordingly:

CUDA_INSTALL_PATH ?= $SCINET_CUDA_INSTALL 
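For a standalone build, the same variable can also be passed to nvcc on the command line. The following minimal vector-add program is an illustrative sketch only; the compile line in the comments is an assumption and should be adapted to your own setup.

// vecadd.cu -- minimal CUDA vector addition (illustrative sketch)
// possible compile line (assumption, adjust to your setup):
//   nvcc -I$SCINET_CUDA_INSTALL/include -o vecadd vecadd.cu
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // host buffers
    float *h_a = (float *) malloc(bytes);
    float *h_b = (float *) malloc(bytes);
    float *h_c = (float *) malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // device buffers
    float *d_a, *d_b, *d_c;
    cudaMalloc((void **) &d_a, bytes);
    cudaMalloc((void **) &d_b, bytes);
    cudaMalloc((void **) &d_c, bytes);

    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // one thread per element, 256 threads per block
    vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);

    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f (expect 3.0)\n", h_c[0]);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}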



OpenCL

As of 3.0, OpenCL is included in the CUDA Toolkit, so loading the CUDA module is all that is required.
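For example, the host-side device query below is a minimal sketch that builds against the OpenCL headers shipped with the toolkit; the compile line in the comments is an assumption (library names and paths may differ on this system).

// cldevices.c -- list OpenCL GPU devices (illustrative sketch)
// possible compile line (assumption):
//   gcc -I$SCINET_CUDA_INSTALL/include -o cldevices cldevices.c -lOpenCL
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platform;
    cl_uint num_platforms = 0;
    clGetPlatformIDs(1, &platform, &num_platforms);

    cl_device_id devices[8];
    cl_uint num_devices = 0;
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 8, devices, &num_devices);

    for (cl_uint i = 0; i < num_devices; ++i) {
        char name[256];
        cl_ulong mem = 0;
        clGetDeviceInfo(devices[i], CL_DEVICE_NAME, sizeof(name), name, NULL);
        clGetDeviceInfo(devices[i], CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(mem), &mem, NULL);
        printf("GPU %u: %s, %lu MB global memory\n",
               (unsigned) i, name, (unsigned long) (mem / (1024 * 1024)));
    }
    return 0;
}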

Compilers

  • nvcc -- NVIDIA compiler

MPI

The GPC MPI packages can be used on this system. See the GPC section on MPI for more details.
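A common pattern is to give each MPI rank on a node its own GPU. The sketch below is illustrative only; it assumes a GPC MPI module is loaded, and the build line in the comments is an assumption (the exact compiler wrappers may differ).

// mpi_gpu.cu -- bind one MPI rank to one GPU (illustrative sketch)
// possible build line (assumption): nvcc -ccbin mpicxx -o mpi_gpu mpi_gpu.cu
#include <cstdio>
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int ngpus = 0;
    cudaGetDeviceCount(&ngpus);

    // Round-robin ranks over the GPUs on this node (two M2070s here).
    // Multi-node runs usually use the node-local rank instead of the global rank.
    cudaSetDevice(rank % ngpus);

    int dev = -1;
    cudaGetDevice(&dev);
    printf("MPI rank %d is using GPU %d of %d\n", rank, dev, ngpus);

    MPI_Finalize();
    return 0;
}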

Documentation

  • CUDA
    • see NVIDIA's CUDA documentation
  • OpenCL
    • see the OpenCL section above

Further Info

User Codes

Please post any relevant information, problems, or best practices you have encountered when using or developing for CUDA and/or OpenCL.