Ccsm launch


WARNING: The last edit of this page is over two years old. The information on this page may be out-of-date.

hybrid_launch - a.k.a. ccsm_launch

Please note that this package is not an official IBM product: you use it entirely at your own risk, and neither IBM nor the author assumes any liability or responsibility for maintenance.

This package should *not* be used with applications that have been linked with

       librebind.a

This hybrid (MPI+OpenMP) application prebinding tool binds threads to logical CPUs through the thread-binding facilities of the XL compiler suite's SMP runtime library, rather than by calling the AIX routine

       bindprocessor ()

directly. This approach has (at least) two limitations:

 1. The bookkeeping done by the tool requires that the number of OpenMP
    threads be known in advance. If the number of threads per MPI task is
    not specified explicitly, using either the  parthds  option of the
    XLSMPOPTS  environment variable or the OpenMP environment variable
    OMP_NUM_THREADS , the tool assumes that the number of threads equals
    the number of logical CPUs; this is consistent with the behavior of
    the XL compiler SMP runtime library.  This limitation gives rise to
    (at least) two restrictions: the number of OpenMP threads cannot be
    set at run time with the OpenMP call
       omp_set_num_threads ()
    and nested parallelism should not be used unless it is acceptable for
    child threads to inherit the binding of their parent threads.
 2. The target logical CPUs must be specified by range.  If the environment
    variable  TARGET_CPU_RANGE  is set to "-1", the tool will make a best-
    effort selection based on the assumption that the hosts used by the job
    are dedicated to it.  If  TARGET_CPU_RANGE  is set to a pair of numbers
    separated by a dash, only logical processors in the specified range on
    each host are used: for instance, if
       TARGET_CPU_RANGE="0-3",
    only the first 4 logical processors on each host are used.
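
The two rules in the list above can be sketched as small shell helpers. This is only an illustration, not code taken from hybrid_launch: the function names effective_threads and cpus_in_range are hypothetical, the caller supplies the logical CPU count, and the order in which OMP_NUM_THREADS and parthds are consulted is an assumption (the text above does not say which wins when both are set).

```shell
#!/bin/sh
# Hypothetical helpers illustrating the two rules; not part of hybrid_launch.

# Print the thread count per MPI task implied by the environment.
# $1 = number of logical CPUs on the host (used when nothing is set).
# Assumption: OMP_NUM_THREADS is consulted before XLSMPOPTS parthds.
# The body runs in a subshell, so the IFS change stays local.
effective_threads() (
    ncpus=$1
    if [ -n "${OMP_NUM_THREADS:-}" ]; then
        echo "$OMP_NUM_THREADS"
        exit 0
    fi
    # XLSMPOPTS holds colon-separated option=value pairs.
    IFS=:
    for opt in ${XLSMPOPTS:-}; do
        case $opt in
            parthds=*)
                echo "${opt#parthds=}"
                exit 0
                ;;
        esac
    done
    # Neither variable gives a count: fall back to the logical CPU count.
    echo "$ncpus"
)

# Print how many logical CPUs a TARGET_CPU_RANGE value selects per host,
# or "auto" for -1 (best-effort selection by the tool).
cpus_in_range() (
    range=$1
    case $range in
        -1)
            echo "auto"
            ;;
        *-*)
            first=${range%-*}
            last=${range#*-}
            echo $((last - first + 1))
            ;;
        *)
            echo "invalid TARGET_CPU_RANGE: $range" >&2
            exit 1
            ;;
    esac
)
```

With neither thread variable set, effective_threads 8 prints 8, and cpus_in_range 0-3 prints 4, matching the "first 4 logical processors" example above.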

The application must be executed under the

       poe

command; for instance,

       export OMP_NUM_THREADS=4
       export TARGET_CPU_RANGE=-1
       /usr/bin/poe hybrid_launch ./myexe < ./myinp > ./myout

To test that you will actually obtain the binding you require with the environment under which you plan to execute your application, you can execute

       /usr/bin/poe hybrid_launch reporter

and carefully examine the output that is generated.
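
A pre-flight check along these lines might look as follows. It is a sketch under assumptions: the thread count and CPU range are example values, and the POE variable and the file name binding_report.txt are hypothetical conveniences. The format of the reporter output is not described here, so the script only saves it for inspection.

```shell
#!/bin/sh
# Sketch of a binding pre-flight check; example values throughout.

# Hypothetical override so the script can be tried where poe lives elsewhere.
POE=${POE:-/usr/bin/poe}

# Example settings: 4 threads per MPI task, first 4 logical CPUs per host.
export XLSMPOPTS=parthds=4
export TARGET_CPU_RANGE=0-3

if [ -x "$POE" ]; then
    # Save the reporter output so the binding can be examined before
    # the real application is launched with the same environment.
    "$POE" hybrid_launch reporter > binding_report.txt
else
    echo "poe not found at $POE; binding check skipped" >&2
fi
```

Running the real application afterwards with the same exported variables keeps the environment identical to the one that was checked.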

The idea of using the thread-binding facilities of the XL compiler SMP runtime library has also been suggested by others at IBM; although I take credit for independent "discovery", I do not claim precedence.

Please report comments and corrections to parpia@us.ibm.com. Last revision: 30 Aug 2007.