P8
[Image: P8 s822.jpg]
Installed: June 2016
Operating System: Linux RHEL 7.2 LE / Ubuntu 16.04 LE
Number of Nodes: 2x Power8 with 2x NVIDIA K80, 2x Power8 with 4x NVIDIA P100
Interconnect: InfiniBand EDR
RAM/Node: 512 GB
Cores/Node: 2 x 8-core (16 physical, 128 SMT threads)
Login/Devel Nodes: p8t0[1-2] / p8t0[3-4]
Vendor Compilers: xlc/xlf, nvcc

Specifications

The P8 Test System consists of 4 IBM Power 822LC servers, each with 2x 8-core 3.25GHz Power8 CPUs and 512GB of RAM. Like the Power7, the Power8 uses Simultaneous Multithreading (SMT), but extends the design to 8 threads per core, allowing the 16 physical cores to support up to 128 threads. Two nodes have two NVIDIA Tesla K80 GPUs with CUDA Capability 3.7 (Kepler), each consisting of 2x GK210 GPUs with 12 GB of RAM, connected via PCI-E; the other two nodes have 4x NVIDIA Tesla P100 GPUs, each with 16 GB of RAM, with CUDA Capability 6.0 (Pascal), connected via NVLink.
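
As a quick check that a node matches this description, the following commands (a minimal sketch; ppc64_cpu comes from the standard powerpc-utils package) report the SMT mode, logical CPU count, and visible GPUs:

$ ppc64_cpu --smt        # shows the current SMT mode, e.g. SMT=8
$ nproc                  # 128 logical CPUs when all 16 cores run SMT-8
$ nvidia-smi -L          # lists the GPUs (K80s on p8t0[1-2], P100s on p8t0[3-4])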

Compile/Devel/Test

First log in via ssh with your SciNet account at login.scinet.utoronto.ca, and from there you can proceed to p8t0[1-2] for the K80 GPUs or p8t0[3-4] for the Pascal (P100) GPUs.
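
For example (a sketch; USERNAME stands for your own SciNet account):

$ ssh USERNAME@login.scinet.utoronto.ca
$ ssh p8t03              # or p8t01/p8t02 for the K80 nodes, p8t04 for the other P100 node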

Software

GNU Compilers

To load the newer Advance Toolchain version of GCC, use:

For p8t0[1-2]

module load gcc/5.3.1

For p8t0[3-4]

module load gcc/6.2.1
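
Once the appropriate module is loaded, compilation works as usual; for example (a sketch, where hello.c is a placeholder source file and -mcpu=power8 is an optional tuning flag):

$ gcc -O2 -mcpu=power8 -o hello hello.c
$ ./hello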

IBM Compilers

To load the native IBM xlc/xlc++ and xlf compilers, use:

For p8t0[1-2]

module load xlc/13.1.4
module load xlf/13.1.4

For p8t0[3-4]

module load xlc/13.1.5
module load xlf/13.1.5
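
A minimal compile with the XL C compiler might look like the following (a sketch; hello.c is a placeholder source, and -qarch=pwr8/-qtune=pwr8 are typical Power8 tuning flags):

$ xlc -O3 -qarch=pwr8 -qtune=pwr8 -o hello hello.c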


Driver Version

The current NVIDIA driver version is 361.93.
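
The installed driver version can be confirmed directly on a node, for example:

$ nvidia-smi --query-gpu=driver_version --format=csv,noheader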

CUDA

The currently installed CUDA Toolkit is 8.0. To load it, use:

module load cuda/8.0

The CUDA driver is installed locally, while the CUDA Toolkit is installed in:

/usr/local/cuda-8.0
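
With the cuda/8.0 module loaded, nvcc can target the compute capabilities listed in the Specifications; for example (a sketch, with saxpy.cu as a placeholder CUDA source):

$ nvcc -arch=sm_37 -o saxpy saxpy.cu     # K80 nodes, p8t0[1-2]
$ nvcc -arch=sm_60 -o saxpy saxpy.cu     # P100 nodes, p8t0[3-4]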

OpenMPI

Currently, OpenMPI has been set up on the four nodes, which are connected over QDR InfiniBand.

For p8t0[1-2]

$ module load openmpi/1.10.3-gcc-5.3.1
$ module load openmpi/1.10.3-XL-13_15.1.4

For p8t0[3-4]

$ module load openmpi/1.10.3-gcc-6.2.1
$ module load openmpi/1.10.3-XL-13_15.1.5
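
After loading one of these modules, the standard OpenMPI compiler wrapper and launcher can be used; for example (a sketch, with mpi_hello.c as a placeholder MPI source):

$ mpicc -O2 -o mpi_hello mpi_hello.c
$ mpirun -np 16 ./mpi_hello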

PE

IBM's Parallel Environment (PE) is available for use with the XL compilers, using the following:

$ module load pe/xl.perf
$ mpiexec -n 4 ./a.out

Documentation is here.