Data Management

From oldwiki.scinet.utoronto.ca
Revision as of 10:51, 7 August 2009 by Ljdursi (talk | contribs) (Quick overview of /home and /scratch)

Storage at SciNet

SciNet's storage system is based on IBM's GPFS (General Parallel File System). There are two main systems for user data: /home, a small, backed-up space where user home directories are located, and /scratch, a large system for the input and output data of jobs. Data on /scratch is not backed up, and data placed there is deleted after a month. SciNet does not provide long-term storage for large data sets.

Home Disk Space

Every SciNet user gets a 10 GB directory on /home. /home is visible from the login nodes and from the development nodes on GPC and the TCS. On the compute nodes of the clusters, where jobs run, /home is mounted read-only: jobs can read files in /home but cannot write to files there. /home is a good place to put code, input files for runs, and anything else that needs to be kept to reproduce runs.
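Because /home is limited to 10 GB, it can help to check how much of it a directory tree uses before adding more files. The following is a minimal sketch, not a SciNet tool: the function names are hypothetical, the 10 GB limit is assumed to mean 10 GiB, and the total is the sum of file sizes, which may differ from what GPFS itself counts against the limit.

```python
import os

# Assumption: the 10 GB limit is interpreted as 10 GiB.
HOME_QUOTA_BYTES = 10 * 1024**3


def directory_usage(path):
    """Return the total size in bytes of the regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symbolic links so that targets outside the tree are not counted.
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable; leave it out of the total.
                pass
    return total


def fraction_of_quota(path, quota=HOME_QUOTA_BYTES):
    """Return the usage of path as a fraction of quota (1.0 means full)."""
    return directory_usage(path) / quota
```

For example, calling fraction_of_quota(os.path.expanduser("~")) on a login node would give a rough idea of how close the home directory is to the limit.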

Scratch Disk Space

Every SciNet user also gets a directory in /scratch. /scratch is visible from the login nodes, from the development nodes on GPC and the TCS, and from the compute nodes of the clusters, where it is mounted read-write. Because /home is read-only on the compute nodes, jobs must write their output somewhere in /scratch.
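One way to make sure a job writes to /scratch is to build the output path explicitly and confirm that it is writable before the run starts. This is a sketch under stated assumptions: the /scratch/<user>/<run name> layout and both function names are hypothetical, so substitute the scratch directory you were actually assigned.

```python
import os


def job_output_dir(run_name, scratch_root="/scratch", user=None):
    """Build (without creating) an output path for a run under scratch_root.

    Assumption: each user's scratch directory is scratch_root/<user>.
    """
    user = user or os.environ.get("USER", "unknown")
    return os.path.join(scratch_root, user, run_name)


def ensure_writable(path):
    """Create path if needed and raise PermissionError if it is not writable."""
    os.makedirs(path, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"cannot write to {path}")
    return path
```

A job could call ensure_writable(job_output_dir("run01")) at startup, so that a wrong path fails immediately instead of after hours of computation.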

There is a large amount of space available on /scratch, but it is purged monthly so that other users running jobs and generating large outputs also have room to store their data temporarily. Computational results that you want to keep longer than this must be copied off SciNet entirely (using scp) to your local system. SciNet does not provide long-term storage for large data sets.
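To see which results should be copied off before a purge, one option is to list files by age and prepare the matching scp command. This is only a sketch: the function names are hypothetical, and it assumes that file age is measured by modification time, which may not be the criterion the purge actually uses. The command is built as an argument list but not executed.

```python
import os
import time


def files_older_than(root, days, now=None):
    """Return a sorted list of files under root last modified over `days` days ago."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    old = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(full) < cutoff:
                    old.append(full)
            except OSError:
                # The file vanished or is unreadable; skip it.
                pass
    return sorted(old)


def scp_command(paths, destination):
    """Return the argument list for an scp call; the command is not run here.

    The -p option asks scp to preserve modification times and modes.
    """
    return ["scp", "-p", *paths, destination]
```

The list returned by scp_command can be passed to subprocess.run once the destination, written as user@host:directory, has been filled in with your own system.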