Installing CCSM4

From oldwiki.scinet.utoronto.ca
Revision as of 13:25, 2 December 2010 by Guido (talk | contribs)

Under construction

As a group, we keep a set of model source code in /project/ccsm on the SciNet GPFS file system.

The latest version of each model has "_current" at the end of the model version, which designates a link to the current subversion being used. These codes have been modified to run on SciNet.

Status:

Model    TCS    GPC
CCSM3    Yes    No
CCSM4    Yes    Yes
CESM1    Yes    Yes

If you prefer to install in your own user space directory (e.g. you don't have access to /project/ccsm) you can use the following instructions:

  1. DOWNLOAD SOURCE CODE AND DATA

Download the source code from http://www.ccsm.ucar.edu/models/ccsm4.0/. The CCSM4.0 User's Guide, available from the CCSM4 web site, gives instructions on getting the input data sets.

There is a new script, check_input_data, which checks whether the correct input data sets are available. The build script now calls check_input_data and downloads any missing data sets. This means that the first time the build script is run, it must be run interactively rather than as a batch job, because the compute nodes do not have access to an external network.

The directory that holds the input data is set by the variable DIN_LOC_ROOT_CSMDATA in config_machines.xml (see below).
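
Because the first build downloads any missing data sets into the input data directory, it can help to confirm beforehand that the directory exists and is writable. The following is a minimal sketch for a POSIX shell (sh or bash). The function name check_inputdata_dir is made up for illustration and is not part of CCSM4.

```shell
# Hypothetical helper, not part of CCSM4: check that an input-data directory
# is usable before the first (interactive) build.
check_inputdata_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "ok: $dir"
}
```

Call it with the directory you intend to use for DIN_LOC_ROOT_CSMDATA. A non-zero return status means the directory should be fixed before running the build script.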

  1. CREATE SPECIFIC FILES

In $CCSM4_DIR/scripts/ccsm_utils/Machines, create the files Macros.tcs, env_machopts.tcs and mkbatch.tcs by copying the corresponding generic_linux_intel or bluefire files (bluefire is the equivalent of TCS).
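
The copy step can be sketched as a small POSIX shell function. This is only an illustration: the function name copy_machine_files is hypothetical, and it assumes all three template files exist in the Machines directory.

```shell
# Hypothetical helper: copy the three machine files of a template machine
# (for example bluefire) to new names for the target machine (for example tcs).
copy_machine_files() {
    machines_dir=$1   # e.g. $CCSM4_DIR/scripts/ccsm_utils/Machines
    template=$2       # existing machine to copy from
    target=$3         # new machine name
    for prefix in Macros env_machopts mkbatch; do
        src="$machines_dir/$prefix.$template"
        dst="$machines_dir/$prefix.$target"
        if [ -e "$dst" ]; then
            # Do not overwrite files that were already created and edited.
            echo "skipping $dst (already exists)"
            continue
        fi
        cp -p "$src" "$dst" || return 1
    done
}
```

For TCS this would be called with the Machines directory, bluefire and tcs as arguments. Existing target files are left untouched so that earlier edits are not lost.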

EDIT SPECIFIC FILES (see the diff files below for details)

  1. EDIT Macros.tcs


  1. EDIT env_machopts.tcs

Set these variables depending on which modules are loaded on GPC (TCS doesn't need this):

# --- set env variables for Macros if needed
#setenv NETCDF_PATH something
setenv NETCDF_PATH something
setenv NETCDF_MODS something
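
To find candidate values for "something", one option is to list the environment variables that the loaded modules define. The sketch below assumes a POSIX shell. The function name list_env_prefix is hypothetical, and the SCINET_NETCDF prefix mentioned afterwards is taken from the Macros.tcs diff further down this page.

```shell
# Hypothetical helper: list environment variables whose names start with
# the given prefix, one NAME=value per line, sorted.
list_env_prefix() {
    env | grep "^$1" | sort
}
```

Calling list_env_prefix SCINET_NETCDF after loading the netcdf module would show any matching variables. If nothing is printed, no such variables are set in the current shell.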


  1. EDIT mkbatch.tcs

You will need the following:

set mach = tcs

(copy and modify from bluefire or appropriate machine type)


setenv OMP_NUM_THREADS ${maxthrds}
#mpiexec -n ${maxtasks} ./ccsm.exe >&! ccsm.log.\$LID
mpirun -np ${maxtasks} ./ccsm.exe >&! ccsm.log.\$LID


The value of vmem will have to be adjusted for each job.

  1. EDIT config_machines.xml

Replace the paths with the paths to your own copies of the data and executables. Some of the above changes may not be necessary for some resolutions and compsets.
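
When reviewing which paths a machine entry currently uses, a small line-oriented lookup can be handy. The following is a sketch for a POSIX shell with awk, not a real XML parser: it assumes each attribute sits on its own line, preceded by whitespace, as in the entries shown in the diff below. The function name machine_attr is hypothetical.

```shell
# Hypothetical helper: print the value of one attribute of one <machine>
# entry in config_machines.xml (line-oriented, one attribute per line).
machine_attr() {
    xml_file=$1   # path to config_machines.xml
    mach=$2       # value of MACH, e.g. tcs
    attr=$3       # attribute name, e.g. DIN_LOC_ROOT_CSMDATA
    awk -v mach="$mach" -v attr="$attr" '
        # A new <machine> entry starts: remember whether it is the one we want.
        /<machine / { in_mach = ($0 ~ ("MACH=\"" mach "\"")) }
        # Inside the wanted entry, look for  attr="value"  after whitespace.
        in_mach && match($0, "[ \t]" attr "=\"[^\"]*\"") {
            print substr($0, RSTART + length(attr) + 3, RLENGTH - length(attr) - 4)
            exit
        }
    ' "$xml_file"
}
```

For example, machine_attr config_machines.xml tcs DIN_LOC_ROOT_CSMDATA prints the input data directory configured for the tcs entry. Values such as $CCSMUSER are printed literally, without expansion.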


Here are the diff files for TCS:


diff --git a/Macros.bluefire b/Macros.tcs
index 9643b77..1502c9b 100644
--- a/Macros.bluefire
+++ b/Macros.tcs
@@ -74,15 +74,15 @@ else
 endif
 LD            := $(FC)
 
-NETCDF_PATH   := /usr/local
-INC_NETCDF    := $(NETCDF_PATH)/include
-LIB_NETCDF    := $(NETCDF_PATH)/lib
-MOD_NETCDF    := $(NETCDF_PATH)/include
+NETCDF_PATH   := $(SCINET_NETCDF_BASE)
+INC_NETCDF    := $(SCINET_NETCDF_PATH)/include
+LIB_NETCDF    := $(SCINET_NETCDF_PATH)/lib
+MOD_NETCDF    := $(SCINET_NETCDF_PATH)/include
 
 INC_MPI       := 
 LIB_MPI       := 
-PNETCDF_PATH  := /contrib/parallel-netcdf-1.1.1svn
-LIB_PNETCDF   := $(PNETCDF_PATH)/lib
+PNETCDF_PATH  := $(SCINET_PNETCDF_BASE)
+LIB_PNETCDF   := $(SCINET_PNETCDF_LIB)
 LAPACK_LIBDIR := /usr/local/lib
 
 CFLAGS        := $(CPPDEFS) -q64 -O2 
@@ -94,7 +94,8 @@ FLAGS_OPT     := -O2 -qstrict -Q
 LDFLAGS       := -q64 -bdatapsize:64K -bstackpsize:64K -btextpsize:64K 
 AR            := ar
 MOD_SUFFIX    := mod
-CONFIG_SHELL  := /usr/local/bin/bash
+CONFIG_SHELL  := /usr/bin/bash
+
 
 #===============================================================================
 # Override with user settings



diff --git a/mkbatch.bluefire b/mkbatch.tcs
index c4a8e20..0e71df1 100755
--- a/mkbatch.bluefire
+++ b/mkbatch.tcs
@@ -1,6 +1,6 @@
 #! /bin/tcsh -f
 
-set mach = bluefire
+set mach = tcs
 
 #################################################################################
 if ($PHASE == set_batch) then
@@ -42,14 +42,7 @@ endif
 @ batchpes = ${nodes} * ${PES_PER_NODE}
 ./xmlchange -file env_mach_pes.xml -id BATCH_PES -val ${batchpes}
 
-if ($?ACCOUNT) then
-  set account_name = $ACCOUNT
-else
-  set account_name = `grep -i "^${CCSMUSER}:" /etc/project.ncar | cut -f 1 -d "," | cut -f 2 -d ":" `
-  if (-e ~/.ccsm_proj) then
-     set account_name = `head -1 ~/.ccsm_proj`
-  endif
-endif
+set account_name = $USER
 
 if ($?QUEUE) then
   set queue_name = $QUEUE
@@ -57,7 +50,7 @@ else
   set queue_name = regular
 endif
 
-set time_limit = "0:50"
+set time_limit = "24:00"
 if ($CCSM_ESTCOST > 0) set time_limit = "1:50"
 if ($CCSM_ESTCOST > 1) set time_limit = "4:00"
 
@@ -66,17 +59,31 @@ cat >! $CASEROOT/${CASE}.${mach}.run << EOF1
 #==============================================================================
 #  This is a CCSM coupled model Load Leveler batch job script for $mach
 #==============================================================================
-#BSUB -n $ntasks_tot
-#BSUB -R "span[ptile=${ptile}]"
-#BSUB -q ${queue_name}
-#BSUB -N
-#BSUB -x
-#BSUB -a poe
-#BSUB -o poe.stdout.%J
-#BSUB -e poe.stderr.%J
-#BSUB -J $CASE
-#BSUB -W ${time_limit}
-#BSUB -P ${account_name}
+# @ shell = /usr/bin/tcsh
+# @ output = poe.stdout.\$(jobid).\$(stepid)
+# @ error  = poe.stderr.\$(jobid).\$(stepid)
+# @ notification = never
+# @ bulkxfer = yes
+# @ environment = COPY_ALL
+# @ node_usage = not_shared
+# @ checkpoint = no
+# @ class = verylong
+# @ job_type = parallel
+# @ job_name = $CASE
+# @ wall_clock_limit = ${time_limit}
+## @ node = 10
+## @ tasks_per_node = 1
+# @ task_geometry = {$task_geo}
+#
+## this is necessary in order to avoid core dumps for batch files
+## which can cause the system to be overloaded
+# ulimits
+# @ core_limit = 0
+#=====================================
+## necessary to force use of infiniband network for MPI traffic
+# @ network.MPI = sn_all,not_shared,US,HIGH
+#=====================================
+# @ queue
 
 setenv LSB_PJL_TASK_GEOMETRY "{$task_geo}"
 setenv    BIND_THRD_GEOMETRY "$thrd_geo"
@@ -99,11 +106,8 @@ echo "\`date\` -- CSM EXECUTION BEGINS HERE"
  
 setenv NTHRDS \$BIND_THRD_GEOMETRY
 setenv MP_LABELIO yes
-if (\$USE_MPISERIAL == "FALSE") then
-   mpirun.lsf /contrib/bin/ccsm_launch /contrib/bin/job_memusage.exe ./ccsm.exe >&! ccsm.log.\$LID
-else
-                                       /contrib/bin/job_memusage.exe ./ccsm.exe >&! ccsm.log.\$LID
-endif
+
+/usr/bin/poe /project/ccsm/bin/ccsm_launch ./ccsm.exe >&! ccsm.log.\$LID
 
 wait
 echo "\`date\` -- CSM EXECUTION HAS FINISHED" 
@@ -130,7 +134,7 @@ endif
 touch ${CASEROOT}/${CASE}.${mach}.l_archive
 chmod 775 ${CASEROOT}/${CASE}.${mach}.l_archive
 
-set account_name = `grep -i "^${CCSMUSER}:" /etc/project.ncar | cut -f 1 -d "," | cut -f 2 -d ":" `
+set account_name = $USER
 if (-e ~/.ccsm_proj) then
    set account_name = `head -1 ~/.ccsm_proj`
 endif


diff --git a/config_machines.xml~ b/config_machines.xml
index 97f2829..3b2c70a 100644
--- a/config_machines.xml~
+++ b/config_machines.xml
@@ -2,6 +2,68 @@
 
 <config_machines>
 
+<machine MACH="cryo"
+         DESC="Guido's i7 desktop (intel), 8 pes"
+         EXEROOT="/home/$CCSMUSER/cesm/exe/$CASE"
+         OBJROOT="$EXEROOT"
+         INCROOT="$EXEROOT/lib/include" 
+         DIN_LOC_ROOT_CSMDATA="/home/guido/cesm/inputdata"
+         DIN_LOC_ROOT_CLMQIAN="/project/tss/atm_forcing.datm7.Qian.T62.c080727"
+         DOUT_S_ROOT="/home/$CCSMUSER/cesm/archive/$CASE"
+         DOUT_L_HTAR="FALSE"
+         DOUT_L_MSROOT="csm/$CASE"
+         CCSM_BASELINE="/fs/cgd/csm/ccsm_baselines"
+         CCSM_CPRNC="/fs/cgd/csm/tools/cprnc_64/cprnc"
+         OS="Linux"
+         BATCHQUERY="/usr/local/torque/bin/qstat"
+         BATCHSUBMIT="/usr/local/torque/bin/qsub" 
+         GMAKE_J="1" 
+         MAX_TASKS_PER_NODE="8"
+         MPISERIAL_SUPPORT="FALSE" />
+
+<machine MACH="tcs"
+         DESC="U of T IBM p6, os is AIX, 32 pes/node, batch system is Moab/LoadLeveler, testing" 
+         EXEROOT="/scratch/$CCSMUSER/$CASE"
+         OBJROOT="$EXEROOT"
+         LIBROOT="$EXEROOT/lib"
+         INCROOT="$EXEROOT/lib/include" 
+         DIN_LOC_ROOT_CSMDATA="/project/ccsm/inputdata"
+         DIN_LOC_ROOT_CLMQIAN="/cgd/tss/atm_forcing.datm7.Qian.T62.c080727"
+         DOUT_S_ROOT="/scratch/$CCSMUSER/archive/$CASE"
+         DOUT_L_HTAR="TRUE"
+         DOUT_L_MSROOT="/project/peltier/$CCSMUSER/archive/$CASE"
+         CCSM_BASELINE="/project/ccsm"
+         CCSM_CPRNC="/home/guido/bin/cprnc"
+         OS="AIX" 
+         BATCHQUERY="llq"
+         BATCHSUBMIT="llsubmit" 
+         GMAKE_J="32" 
+         MAX_TASKS_PER_NODE="64"
+         MPISERIAL_SUPPORT="TRUE"
+         PES_PER_NODE="32" />
+
+<machine MACH="gpc"
+         DESC="U of T iDataPlex intel cluster, os is linux, 8 pes/node, batch system is Moab/Torque, testing" 
+         EXEROOT="/scratch/$CCSMUSER/$CASE"
+         OBJROOT="$EXEROOT"
+         LIBROOT="$EXEROOT/lib"
+         INCROOT="$EXEROOT/lib/include" 
+         DIN_LOC_ROOT_CSMDATA="/project/ccsm/inputdata"
+         DIN_LOC_ROOT_CLMQIAN="/cgd/tss/atm_forcing.datm7.Qian.T62.c080727"
+         DOUT_S_ROOT="/scratch/$CCSMUSER/archive/$CASE"
+         DOUT_L_HTAR="TRUE"
+         DOUT_L_MSROOT="/project/peltier/$CCSMUSER/archive/$CASE"
+         CCSM_BASELINE="/project/ccsm"
+         CCSM_CPRNC="/home/guido/bin/cprnc"
+         OS="AIX" 
+         BATCHQUERY="/opt/torque/bin/qstat"
+         BATCHSUBMIT="/opt/torque/bin/qsub" 
+         GMAKE_J="1" 
+         MAX_TASKS_PER_NODE="8"
+         MPISERIAL_SUPPORT="TRUE"
+         PES_PER_NODE="8" />
+
+
 <machine MACH="bluefire"
          DESC="NCAR IBM p6, os is AIX, 32 pes/node, batch system is LSF" 
          EXEROOT="/ptmp/$CCSMUSER/$CASE"


Other changes: the AIX hostname command does not have the -f option, so change "hostname -f" to "hostname" in:

/project/ccsm/cesm1_current/models/utils/mct/configure
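
This substitution can be scripted. The sketch below assumes a POSIX shell and uses plain sed with a backup copy, because "sed -i" is not available in every sed. The function name fix_hostname_flag is hypothetical.

```shell
# Hypothetical helper: replace "hostname -f" with "hostname" in a configure
# script, keeping the original as <file>.orig.
fix_hostname_flag() {
    configure=$1
    cp -p "$configure" "$configure.orig" || return 1
    # Writing back through a redirect keeps the permissions of the existing file.
    sed 's/hostname -f/hostname/g' "$configure.orig" > "$configure"
}
```

Run it on your own copy of the mct configure script. Comparing the file with its .orig backup afterwards shows exactly which lines were changed.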