Systems Overview

WARNING: SciNet is in the process of replacing this wiki with a new documentation site. For current information, please go to https://docs.scinet.utoronto.ca

SciNet Systems

SciNet is the high-performance computing consortium of the University of Toronto and its associated hospitals. It is one of seven such consortia in Canada, which together form Compute/Calcul Canada.

SciNet has two main clusters (the GPC and the TCS), along with two smaller specialized systems:

General Purpose Cluster (GPC)

The General Purpose Cluster is an extremely large cluster (ranked 16th in the world at its inception in June 2009, and the fastest in Canada) and is where most computations at SciNet are done. It is an IBM iDataPlex cluster based on Intel's Nehalem architecture (one of the first in the world to make use of these chips). The GPC consists of 3,780 nodes with a total of 30,240 cores at 2.5 GHz (8 cores per node), with 16 GB of RAM per node (2 GB per core). Approximately one quarter of the cluster is interconnected with non-blocking DDR InfiniBand, while the rest of the nodes are connected with 5:1 blocked QDR InfiniBand. The compute nodes are accessed through a queuing system that allows jobs with a maximum wall time of 48 hours.
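
As a rough illustration of how work is typically laid out on the GPC's 8-core nodes, the following hybrid MPI/OpenMP "hello world" in C is a minimal hypothetical sketch (not taken from SciNet's documentation): one MPI process per node, with 8 OpenMP threads filling the cores of that node.

 /* Hypothetical sketch: one MPI rank per GPC node, 8 OpenMP threads per rank
  * to match the 8 cores per node described above. */
 #include <stdio.h>
 #include <mpi.h>
 #include <omp.h>
 
 int main(int argc, char **argv)
 {
     int rank, nranks;
     MPI_Init(&argc, &argv);
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     MPI_Comm_size(MPI_COMM_WORLD, &nranks);
 
     #pragma omp parallel
     {
         /* Each of the 8 threads on a node reports where it is running. */
         printf("rank %d of %d, thread %d of %d\n",
                rank, nranks, omp_get_thread_num(), omp_get_num_threads());
     }
 
     MPI_Finalize();
     return 0;
 }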

Tightly Coupled System (TCS)

The TCS is a cluster of "fat" (high-memory, many-core) POWER6 nodes on a very fast InfiniBand interconnect, installed in January 2009. It has a relatively small number of cores (~3,000), arranged in nodes of 32 cores with 128 GB of RAM each. It is dedicated to jobs that require such a large-memory, low-latency configuration. Jobs must use multiples of 32 cores (i.e., whole nodes) and are submitted to the LoadLeveler queuing system, which allows jobs with a maximum wall time of 48 hours.
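
To show the kind of workload the TCS is intended for, the following C/OpenMP sketch is a hypothetical example (not taken from SciNet's documentation): it allocates a 16 GB shared array, which fits comfortably within a single 128 GB TCS node but would not fit on a 16 GB GPC node, and processes it with all of the node's 32 cores.

 /* Hypothetical sketch: a shared-memory job sized for one 32-core TCS node. */
 #include <stdio.h>
 #include <stdlib.h>
 #include <omp.h>
 
 int main(void)
 {
     /* 2^31 doubles = 16 GB of data in a single shared array. */
     long long n = 2147483648LL;
     double *a = malloc((size_t)n * sizeof(double));
     if (a == NULL) { fprintf(stderr, "allocation failed\n"); return 1; }
 
     double sum = 0.0;
     /* Spread the work over the node's 32 cores. */
     #pragma omp parallel for reduction(+:sum)
     for (long long i = 0; i < n; i++) {
         a[i] = 1.0 / (double)(i + 1);
         sum += a[i];
     }
 
     printf("threads used: %d, sum = %f\n", omp_get_max_threads(), sum);
     free(a);
     return 0;
 }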

Power 7 Linux Cluster (P7)

The P7 is a cluster of 5 (soon to be at least 8) "fat" (high-memory, many-core) POWER7 nodes on a very fast InfiniBand interconnect, installed in May 2011. It currently has a total of 160 cores, arranged in nodes of 32 cores with 128 GB of RAM each. It is dedicated to jobs that require such a large-memory, low-latency configuration. Jobs must use multiples of 32 cores (i.e., whole nodes) and are submitted to the LoadLeveler queuing system, which allows jobs with a maximum wall time of 48 hours.

Accelerator Research Cluster (ARC)

The ARC is a technology evaluation cluster combining 14 IBM PowerXCell 8i "Cell" nodes and 8 Intel x86_64 "Nehalem" nodes that together contain 16 NVIDIA M2070 GPUs. The QS22 nodes each have two 3.2 GHz IBM PowerXCell 8i CPUs, where each CPU has 1 Power Processing Unit (PPU) and 8 Synergistic Processing Units (SPUs), and 32 GB of RAM per node. The Intel nodes each have two 2.53 GHz 4-core Xeon X5550 CPUs with 48 GB of RAM, and each contains two NVIDIA M2070 (Fermi) GPUs with 6 GB of RAM per GPU.
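
To make the GPU layout concrete, the following C sketch using the CUDA runtime API is a hypothetical example (not taken from SciNet's documentation): run on one of the Intel nodes, it should report two M2070 devices with roughly 6 GB of global memory each.

 /* Hypothetical sketch: enumerate the GPUs visible on an ARC Intel node. */
 #include <stdio.h>
 #include <cuda_runtime.h>
 
 int main(void)
 {
     int ndev = 0;
     cudaGetDeviceCount(&ndev);
     printf("CUDA devices found: %d\n", ndev);   /* expect 2 per Intel node */
 
     for (int i = 0; i < ndev; i++) {
         struct cudaDeviceProp prop;
         cudaGetDeviceProperties(&prop, i);
         printf("device %d: %s, %.1f GB global memory, compute capability %d.%d\n",
                i, prop.name,
                prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0),
                prop.major, prop.minor);
     }
     return 0;
 }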

Please note that this cluster is not a production cluster and is only accessible to selected users.