P7 Linux Cluster

From oldwiki.scinet.utoronto.ca
Revision as of 09:21, 15 June 2011 by Northrup (talk | contribs)
P7 Cluster (P7)
Installed:         May 2011
Operating System:  Linux (RHEL 6.0)
Interconnect:      InfiniBand
RAM/Node:          128 GB
Cores/Node:        32 (128 threads)
Login/Devel Node:  p701 (from login.scinet)
Vendor Compilers:  xlc/xlf
Queue Submission:  LoadLeveler

Specifications

The P7 Cluster consists of five IBM Power 755 servers, each with four 8-core 3.3 GHz POWER7 CPUs and 128 GB of RAM. Like the POWER6, the POWER7 uses Simultaneous Multithreading (SMT), but extends the design from 2 threads per core to 4. The 32 physical cores in each node can therefore run up to 128 hardware threads, which in many cases can lead to significant speedups.
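The per-node core and thread counts follow directly from the figures above. A small sketch of the arithmetic, where the variable names are illustrative only:

```shell
# Hardware figures taken from the specification paragraph above.
sockets_per_node=4        # four POWER7 CPUs per server
cores_per_socket=8        # 8 cores per CPU
smt_threads_per_core=4    # SMT4 on POWER7

cores_per_node=$((sockets_per_node * cores_per_socket))
threads_per_node=$((cores_per_node * smt_threads_per_core))

echo "cores per node:   ${cores_per_node}"
echo "threads per node: ${threads_per_node}"
```

This is why the sample job script further down requests up to 128 tasks on a single node.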

Login

First log in via ssh with your SciNet account at login.scinet.utoronto.ca. From there, proceed to p701, which is currently the gateway/devel node for this cluster.
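The two hops can also be chained into one command. The sketch below only builds and prints that command so it can be reviewed before use; "myuser" is a placeholder account name, and it assumes p701 resolves by that short name from the login node, as the text above implies.

```shell
# "myuser" is a placeholder; replace it with your own SciNet account name.
scinet_user="myuser"
gateway="login.scinet.utoronto.ca"
devel_node="p701"

# -t allocates a terminal on the gateway so the second ssh
# gets an interactive shell on the devel node.
login_cmd="ssh -t ${scinet_user}@${gateway} ssh ${devel_node}"

# Print the command rather than running it.
echo "${login_cmd}"
```

Typing the two ssh commands one after the other, as described above, is equivalent.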

Compiler/Devel Node

From p701 you can compile, do short tests, and submit your jobs to the queue.

GNU

gcc/g++/gfortran version 4.4.4 ships with RHEL 6.0 and is available by default.

IBM

To use the IBM POWER-specific compilers xlc/xlc++/xlf, load the following modules:

$ module load vacpp xlf

NOTE: Be sure to pass "-q64" (64-bit mode) when compiling with the IBM compilers.
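As an illustration of the note above, a minimal sketch of a 64-bit build with xlc. The file name hello.c and the -O2 optimization level are arbitrary choices for the example, not site recommendations; the guard only reports a hint when xlc is not on the PATH.

```shell
# hello.c is a hypothetical source file used only for illustration.
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { printf("hello from P7\n"); return 0; }
EOF

if command -v xlc >/dev/null 2>&1; then
    # -q64 selects 64-bit mode, as the note above requires.
    xlc -q64 -O2 -o hello hello.c && ./hello
else
    echo "xlc not found: run 'module load vacpp xlf' on p701 first"
fi
```

The same -q64 flag applies to xlc++ and xlf.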

MPI

OpenMPI is available for both compiler suites. Load the module that matches your compiler:

$ module load openmpi/1.5.3-gcc-v4.4.4
$ module load openmpi/1.5.3-ibm-11.1+13.1


IBM's POE is installed, but due to current problems with LoadLeveler/LAPI/POE it is not recommended for use.
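A short test on the devel node might look like the sketch below, which builds and runs a tiny MPI program through the OpenMPI wrappers. The file name mpi_hello.c and the rank count of 4 are illustrative assumptions. With the IBM-built module, the -q64 note above presumably applies to the wrapper as well, but that is not confirmed here.

```shell
# mpi_hello.c is a hypothetical test program used only for illustration.
cat > mpi_hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

if command -v mpicc >/dev/null 2>&1; then
    # Compile with the wrapper, then start 4 ranks for a quick check.
    mpicc -o mpi_hello mpi_hello.c && mpirun -np 4 ./mpi_hello \
        || echo "build or run failed"
else
    echo "mpicc not found: load one of the openmpi modules first"
fi
```

Keep runs like this short on p701; longer runs belong in the queue.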

Submit a Job

#!/bin/bash
#===============================================================================
# P7 LoadLeveler Submission Script
#===================================
#
## FOR P7
#@ environment = MP_INFOLEVEL=1; MP_USE_BULK_XFER=yes; MP_BULK_MIN_MSG_SIZE=64K; \
#                MP_EAGER_LIMIT=64K; LAPI_DEBUG_ENABLE_AFFINITY=no
#
# @ notification = never
#
#===================================
# ulimits
# @ core_limit   = 0
#===================================
# Job specific
#===================================
#
# @ job_name     = sample
#
# @ job_type = parallel
# @ class = verylong
# @ output = $(jobid).out
# @ error = $(jobid).err
# @ wall_clock_limit = 2:00:00
#
## Use either node,tasks_per_node
## or blocking,total_tasks
##
# @ node = 1
# @ tasks_per_node = 128
#
#@ queue
#
#===================================
export OMP_NUM_THREADS=1
export THRDS_PER_TASK=1

if [ -n "${LOADL_ACTIVE+x}" ]; then
    cd "$LOADL_STEP_INITDIR"
fi

./osu_mbw_mr
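
Once the script above is saved, it can be handed to LoadLeveler from p701. The sketch below first checks that the requested tasks times threads per task fit within the 128 hardware threads of a node, then submits and lists the job. The file name p7_job.ll is hypothetical, and the values simply mirror the sample script.

```shell
# p7_job.ll is a hypothetical file name for the script above.
job_script="p7_job.ll"

# Values mirrored from the sample script; adjust both together.
hw_threads_per_node=128
tasks_per_node=128
OMP_NUM_THREADS=1

requested=$((tasks_per_node * OMP_NUM_THREADS))
if [ "${requested}" -gt "${hw_threads_per_node}" ]; then
    echo "warning: ${requested} threads requested, node has ${hw_threads_per_node}"
fi

if command -v llsubmit >/dev/null 2>&1; then
    llsubmit "${job_script}"   # submit the job to the queue
    llq -u "${USER}"           # list your queued and running jobs
    # A job can be removed with: llcancel <job id shown by llq>
else
    echo "llsubmit not found: run this on p701"
fi
```

For a hybrid MPI/OpenMP run, lowering tasks_per_node and raising OMP_NUM_THREADS so that their product stays at or below 128 is one reasonable starting point.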