WARNING: SciNet is in the process of replacing this wiki with a new documentation site. For current information, please go to https://docs.scinet.utoronto.ca

Intel Xeon Phi / NVIDIA Tesla K20

[Images: Xeon phi.jpg, NVIDIA-Tesla-K20X.jpg]

Installed: April 2013
Operating System: Linux Centos 6.4
Number of Nodes: 1
Interconnect: DDR Infiniband
RAM/Node: 32 GB
Cores/Node: 8 with Xeon Phi & K20
Login/Devel Node: gravity01
Vendor Compilers: nvcc, pgcc, icc, gcc
Queue Submission: none

This is a single test node for investigating new accelerator technologies. It consists of a single x86_64 node with one 8-core Intel Sandy Bridge Xeon E5-2650 2.0 GHz CPU and 32 GB of RAM. It has a single NVIDIA Tesla K20 GPU (Kepler, CUDA Compute Capability 3.5) with 2496 CUDA cores and 5 GB of RAM, as well as a single Intel Xeon Phi 3120A with 57 cores at 1.1 GHz and 6 GB of RAM. The node is connected to the rest of the clusters with DDR Infiniband and mounts the regular SciNet GPFS filesystems.

Login

First, log in via ssh with your SciNet account to login.scinet.utoronto.ca; from there you can proceed to gravity01.
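
For example, assuming your SciNet username is USERNAME (a placeholder):

ssh USERNAME@login.scinet.utoronto.ca
ssh gravity01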

Queue

As this is a single node, users are expected to use it in a "friendly" manner: the system is not set up for production use but primarily for investigating new technologies, so run times are limited to under 4 hours. To access the node you need to submit through the queue, as on the standard ARC and GPC compute nodes, but with a maximum walltime of 4 hours.

For an interactive job use

qsub -l nodes=1:ppn=8,walltime=1:00:00 -q arcX -I
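
For a non-interactive run, a minimal batch-script sketch under the same limits might look as follows; the job name, script file name, and ./my_program are placeholders, and the requested walltime must stay within the 4-hour limit:

#!/bin/bash
#PBS -l nodes=1:ppn=8,walltime=3:00:00
#PBS -q arcX
#PBS -N phi_test
cd $PBS_O_WORKDIR
./my_program

Submit it with qsub myjob.pbs.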

Software

The same software installed on the GPC is available on arcX using the modules framework. See the GPC Quickstart page (Modules and Environment Variables section) for full details.
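
For instance, the standard commands of the modules framework let you browse available packages and inspect what is currently loaded:

module avail
module list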

NVIDIA Tesla K20

See the Gravity wiki page for full details of the available CUDA and OpenCL compilers and modules. To use all the K20 (Kepler) features, a minimum of CUDA 5.0 is required; the cuda/6.5 module is recommended for the K20.

CUDA

module load gcc/4.8.1 cuda/6.5

Here, gcc is loaded because it is a prerequisite of the cuda module.

You will have to let the CUDA compiler know about the capabilities of the Kepler card by supplying the flag -arch=sm_30 or -arch=sm_35.
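
For example, a K20 build of a placeholder source file saxpy.cu could look like:

nvcc -arch=sm_35 -O2 -o saxpy saxpy.cu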

Driver Version

The current NVIDIA driver version for the K20 is 340.32.
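
The installed driver version and the state of the GPU can be checked on the node with:

nvidia-smi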

Xeon Phi

Compilers

The Xeon Phi uses the standard Intel compilers; however, it requires at least version 13.1:

module load intel/14.0.0 
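
As an illustration, with hello.c as a placeholder source file, code can be built natively for the coprocessor with the -mmic flag, or for the host (where any offload pragmas are handled automatically):

icc -mmic -O2 -o hello.mic hello.c      # native MIC binary, runs on the coprocessor
icc -O2 -o hello.host hello.c           # host binary; offload regions target the Phi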

MPI

Intel MPI also has Xeon Phi support:

module load intelmpi/4.1.1.036

NOTE: Be sure to use mpiifort for compiling native MIC Fortran code, as the mpif77/mpif90 wrapper scripts ignore the -mmic flag and will produce host-only code.
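
For example, a native MIC build of a placeholder Fortran source solver.f90:

mpiifort -mmic -O2 -o solver.mic solver.f90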

Tools

The Intel Cluster Tools, such as VTune Amplifier and Inspector, are available for the Xeon Phi by loading the following module:

module load inteltools
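
As a sketch, assuming the module provides the VTune command-line driver amplxe-cl, a basic hotspots collection on a placeholder host binary ./my_app would be:

amplxe-cl -collect hotspots -- ./my_app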

OpenCL

OpenCL version 1.2 is available for the Xeon Phi on arcX, installed under:

/opt/intel/opencl
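
A compile-and-link sketch against this installation, assuming the usual include/ and lib64/ subdirectories under that path (check the actual layout on the node) and a placeholder source file cl_test.c:

gcc -I/opt/intel/opencl/include -L/opt/intel/opencl/lib64 -o cl_test cl_test.c -lOpenCL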

Direct Access

The Xeon Phi can be accessed directly from the host node by

ssh mic0

Shared Filesystem

The host node arc09 mounts the standard SciNet filesystems, i.e. $HOME and $SCRATCH; however, to share files between the host and the Xeon Phi, use /localscratch/$HOME, which shows up as $HOME on mic0.
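
Putting the pieces together, a typical workflow for a natively-built binary (hello.mic is a placeholder) might be:

cp hello.mic /localscratch/$HOME/      # on the host: stage the binary
ssh mic0                               # log in to the coprocessor
./hello.mic                            # the file appears in $HOME on mic0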

Useful Links

Building Native for MIC: http://software.intel.com/en-us/articles/building-a-native-application-for-intel-xeon-phi-coprocessors

TACC Stampede MIC Info: http://www.tacc.utexas.edu/user-services/user-guides/stampede-user-guide#mic