BGQ

From oldwiki.scinet.utoronto.ca
Revision as of 15:35, 21 August 2012 by Northrup (talk | contribs)
Blue Gene/Q (BGQ)
Blue Gene Cabinet.jpeg
Installed: August 2012
Operating System: RH6.3, CNK (Linux)
Number of Nodes: 2048 (32,768 cores), 512 (8,192 cores)
Interconnect: 5D Torus (jobs), QDR InfiniBand (I/O)
RAM/Node: 16 GB
Cores/Node: 16 (64 threads)
Login/Devel Node: bgq01, bgq02
Vendor Compilers: bgxlc, bgxlf
Queue Submission: LoadLeveler

Specifications

BGQ is an extremely powerful and energy-efficient third-generation IBM supercomputer built around a system-on-a-chip compute node. Each node has a 16-core 1.6 GHz Power-based CPU and 16 GB of RAM, and runs a very lightweight Linux-like OS called CNK. The nodes are bundled in groups of 32, and 16 of these groups make up a midplane (512 nodes), with 2 midplanes per rack. The compute nodes are all connected together by a custom 5D torus interconnect. Each midplane has 8 Power7 I/O nodes that run a full Red Hat Linux OS that manages the compute nodes and mounts the GPFS filesystem.
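The node counts above follow from simple multiplication. The sketch below just reproduces that arithmetic from the figures quoted on this page; the constant and function names are illustrative and not part of any BGQ tool.

```python
# Figures taken from the specification text and summary table on this page.
NODES_PER_GROUP = 32        # nodes bundled together (one node board)
GROUPS_PER_MIDPLANE = 16    # groups of 32 per midplane
MIDPLANES_PER_RACK = 2
CORES_PER_NODE = 16

nodes_per_midplane = NODES_PER_GROUP * GROUPS_PER_MIDPLANE
nodes_per_rack = nodes_per_midplane * MIDPLANES_PER_RACK
cores_per_midplane = nodes_per_midplane * CORES_PER_NODE


def describe(total_nodes):
    """Return (midplanes, racks, cores) for a system of total_nodes nodes."""
    midplanes = total_nodes // nodes_per_midplane
    racks = midplanes / MIDPLANES_PER_RACK
    cores = total_nodes * CORES_PER_NODE
    return midplanes, racks, cores


print(nodes_per_midplane, nodes_per_rack, cores_per_midplane)
print(describe(2048))  # the larger system listed in the summary table
print(describe(512))   # the smaller system listed in the summary table
```

With these figures, the 2048-node system corresponds to 4 midplanes (2 racks) and the 512-node system to a single midplane.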

Jobs

BGQ job size is typically determined by midplanes (512 nodes or 8192 cores); however, sub-blocks can be used to further subdivide midplanes, with a minimum of one I/O node per block. In SciNet's configuration (with 8 I/O nodes per midplane) this makes 64 nodes (1024 cores) the smallest job size.
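The 64-node minimum is the midplane size divided by the number of I/O nodes, since each block needs at least one I/O node. A minimal sketch of that calculation, using only the numbers stated above (the function name is hypothetical):

```python
CORES_PER_NODE = 16


def smallest_block_nodes(nodes_per_midplane, io_nodes_per_midplane):
    """Smallest block size when every block needs at least one I/O node."""
    if io_nodes_per_midplane <= 0:
        raise ValueError("a midplane needs at least one I/O node")
    return nodes_per_midplane // io_nodes_per_midplane


# SciNet's configuration as described above: 512 nodes, 8 I/O nodes.
min_nodes = smallest_block_nodes(512, 8)
min_cores = min_nodes * CORES_PER_NODE
print(min_nodes, min_cores)
```

A midplane with fewer I/O nodes would give a larger minimum under the same rule, for example 128 nodes with 4 I/O nodes.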


*** WAT2Q SPECIFIC ****

Compile

/bgsys/drivers/V1R1M1/ppc64/comm/xl/bin/mpich2version
/bgsys/drivers/V1R1M1/ppc64/comm/xl/bin/mpixlc
/bgsys/drivers/V1R1M1/ppc64/comm/xl/bin/mpixf90

Run a Job

When not using LoadLeveler, there is a direct launch program on BGQ called runjob that acts a lot like mpirun/mpiexec. The "block" argument is the predefined group of nodes that are already booted. See the next section on how to create these blocks manually.

runjob --block R00-M0-N03-32 --ranks-per-node=16 --np 512 --cwd=/gpfs/DDNgpfs3/xsnorthrup/osu_bgq --exe=/gpfs/DDNgpfs3/xsnorthrup/osu_bgq/osu_mbw_mr
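In the example above, --np 512 matches 32 nodes times 16 ranks per node. The sketch below assembles the same command line and checks that relationship first. It is only an illustration: the helper name and the nodes_in_block parameter are assumptions, the node count is supplied by the caller rather than queried from the system, and only the runjob options shown above are used.

```python
import shlex


def runjob_command(block, nodes_in_block, ranks_per_node, np, cwd, exe):
    """Build a runjob argument list, refusing more ranks than the block holds."""
    capacity = nodes_in_block * ranks_per_node
    if np > capacity:
        raise ValueError(f"--np {np} exceeds block capacity of {capacity} ranks")
    return [
        "runjob",
        "--block", block,
        f"--ranks-per-node={ranks_per_node}",
        "--np", str(np),
        f"--cwd={cwd}",
        f"--exe={exe}",
    ]


workdir = "/gpfs/DDNgpfs3/xsnorthrup/osu_bgq"
args = runjob_command("R00-M0-N03-32", 32, 16, 512, workdir, workdir + "/osu_mbw_mr")
command_line = shlex.join(args)  # requires Python 3.8 or later
print(command_line)
```

The argument list could be passed to subprocess.run on a login node where runjob is available; here it is only printed.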

To see running jobs and the status of available blocks use:

list_jobs
list_blocks


Setup blocks

To reconfigure the BGQ nodes, use bg_console:

bg_console

There are various options to create block types (section 3.2 in the BGQ admin manual), but the smallest is created using the following command:

gen_small_block <blockid> <midplane> <cnodes> <nodeboard>
gen_small_block R00-M0-N03-32 R00-M0 32 N03

The block then needs to be booted using:

allocate R00-M0-N03-32
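The block ID in the example is the midplane, node board, and node count joined with hyphens. The sketch below generates the two console lines from those parts. The naming pattern is inferred from this single example and may not be required by bg_console, and the helper name is hypothetical; the output is meant to be typed or pasted into bg_console, not run from a shell.

```python
def small_block_commands(midplane, nodeboard, cnodes):
    """Return the bg_console lines that create and boot a small block."""
    # Assumed naming convention, mirroring the example: <midplane>-<nodeboard>-<cnodes>
    block_id = f"{midplane}-{nodeboard}-{cnodes}"
    return [
        f"gen_small_block {block_id} {midplane} {cnodes} {nodeboard}",
        f"allocate {block_id}",
    ]


lines = small_block_commands("R00-M0", "N03", 32)
for line in lines:
    print(line)
```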

If those resources are already booted into another block, that block must be freed before the new block can be allocated.

select_block R00-M0-N03
free_block

There are many other functions in bg_console:

help all

I/O

GPFS



Documentation

BGQ System Administration Guide

BGQ Application Development