HPSS
High Performance Storage System
(Pilot usage phase to start in Jun/2011 with a select group of users. Deployment and configuration are still a work in progress)
The High Performance Storage System (HPSS) is a tape-backed hierarchical storage system that will provide a significant portion of the allocated storage space at SciNet. It is a repository for archiving data that is not being actively used. Data can be returned to the active GPFS filesystem when it is needed.
Access and transfer of data into and out of HPSS is done under the control of the user, whose interaction is expected to be scripted and submitted as a batch job, using one or more of the following utilities:
- HSI is a client with an ftp-like functionality which can be used to archive and retrieve large files. It is also useful for browsing the contents of HPSS.
- HTAR is a utility that creates tar formatted archives directly into HPSS. It also creates a separate index file (.idx) that can be accessed and browsed quickly.
- ISH is a TUI utility that can perform an inventory of the files and directories in your tarballs.
Why should I use and trust HPSS?
- 10+ years history, used by 50+ facilities in the “Top 500” HPC list
- very reliable, data redundancy and data insurance built-in.
- highly scalable, reasonable performance at SciNet - Ingest: ~12 TB/day, Recall: ~24 TB/day (aggregated)
- HSI/HTAR clients also very reliable and used on several HPSS sites. ISH was written at SciNet.
- HPSS fits well with the Storage Capacity Expansion Plan at SciNet (pdf presentation)
Guidelines
- Expanded storage capacity is provided on tape, a medium that is not suited to storing small files. Files smaller than ~100MB should be grouped into tarballs with tar or htar.
- The maximum size of a file that can be transferred into the repository is 1TB. However, optimal performance is obtained with file sizes <= 100 GB.
- Make sure to check the application's exit code and the returned log file for errors after all data transfers and any tarball creation process.
- Pilot users: DURING THE TESTING PHASE DO NOT DELETE THE ORIGINAL FILES FROM /scratch OR /project
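The size guidelines above can be checked before an offload. The following is a minimal sketch, not a SciNet-provided tool: the function names are hypothetical, and reading "~100MB" as 100 MiB and "1TB" as 1 TiB is an assumption.

```shell
#!/bin/bash
# Thresholds in bytes. Both are assumptions about how the guideline sizes are
# counted: 100 MiB for "small" files and 1 TiB for the maximum file size.
SMALL_LIMIT=$((100 * 1024 * 1024))
MAX_LIMIT=$((1024 * 1024 * 1024 * 1024))

# list_small_files DIR [LIMIT_BYTES]
# Prints regular files under DIR that are smaller than the limit; these are
# candidates for grouping into a tarball before archiving.
list_small_files() {
    local dir=$1 limit=${2:-$SMALL_LIMIT}
    find "$dir" -type f -size -"${limit}"c
}

# list_oversized_files DIR [LIMIT_BYTES]
# Prints regular files under DIR that are larger than the limit; these would
# have to be split before they can be transferred.
list_oversized_files() {
    local dir=$1 limit=${2:-$MAX_LIMIT}
    find "$dir" -type f -size +"${limit}"c
}
```

For example, `list_small_files /scratch/$USER/workarea` would print the files that are better bundled with tar or htar than stored individually.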
Access Through the Queue System
All access to the archive system is done through the GPC queue system.
Scripted File Transfers
File transfers in and out of the HPSS should be scripted into jobs and submitted to the archive queue. See HSI example below.
#!/bin/env bash
#PBS -q archive
#PBS -N hsi_put_file_in_hpss
#PBS -j oe
#PBS -me

/usr/local/bin/hsi -v <<EOF
cput -p /scratch/$USER/workarea/finished-job1.tar.gz : finished-job1.tar.gz
EOF
status=$?
if [ ! $status == 0 ];then
    echo '!!! TRANSFER FAILED !!!'
fi
exit $status
The status of pending jobs can be monitored with showq specifying the archive queue:
showq -w class=archive
Recalling Data for Analysis
Typically, data will be recalled to the /scratch file system when it is needed for analysis. Job dependencies can be used to have analysis jobs wait in the queue for data recalls before starting. The qsub flag is
-W depend=afterok:<JOBID>
where JOBID is the job number of the staging job that must finish successfully before the analysis job can start.
Here is a shortcut that submits the recall job and generates the dependency in one step:
gpc04 $ qsub $(qsub data-recall.sh | awk -F '.' '{print "-W depend=afterok:"$1}') job-to-work-on-recalled-data.sh
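The one-liner above can also be written in two steps, which makes the recall job's identifier available for logging. This is a sketch only: `depend_flag` is a hypothetical helper, and it assumes qsub prints an identifier of the form `<number>.<server>`, as the awk command above implies.

```shell
#!/bin/bash
# depend_flag JOB_IDENTIFIER
# Turns the identifier printed by qsub (assumed to look like
# "<number>.<server>") into the matching dependency option.
depend_flag() {
    local jobid=${1%%.*}    # keep only the text before the first "."
    printf '%s\n' "-W depend=afterok:${jobid}"
}

# Intended use on a login node (not run here):
#   recall_id=$(qsub data-recall.sh)
#   qsub $(depend_flag "$recall_id") job-to-work-on-recalled-data.sh
```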
Using HSI
HSI is the primary client with which a user will interact with HPSS. It provides an ftp-like interface for archiving and retrieving files. In addition it provides a number of shell-like commands that are useful for examining and manipulating the contents in HPSS. The most commonly used commands will be:
cput | Conditionally stores a file only if the file does not already exist in HPSS |
cget | Conditionally retrieves a copy of a file from HPSS to your local storage only if a local copy does not already exist. |
cd,mkdir,ls,rm,mv | Operate as one would expect on the contents of HPSS. |
lcd,lls | Local commands. |
- Simple commands can be executed on a single line.
hsi "mkdir examples; cd examples; cput example_data.tgz"
- More complex sequences can be performed using a script such as this:
hsi <<EOF
mkdir -p examples/201106
cd examples
mv example_data.tgz 201106/
lcd /scratch/$USER/examples/
cput -R -u *
EOF
status=$?
if [ ! $status == 0 ];then
    echo '!!! TRANSFER FAILED !!!'
fi
exit $status
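The guidelines ask for both the exit code and the log to be checked. A common pitfall is that piping a command through tee to keep a log hides the command's exit status. The sketch below shows one way around that in bash using `PIPESTATUS`; `run_logged` is a hypothetical helper name, not part of HSI.

```shell
#!/bin/bash
# run_logged LOGFILE CMD [ARGS...]
# Runs CMD, copies its stdout and stderr to LOGFILE as well as the terminal,
# and returns CMD's own exit status rather than tee's.
run_logged() {
    local log=$1; shift
    "$@" 2>&1 | tee "$log"
    local status=${PIPESTATUS[0]}
    if [ "$status" -ne 0 ]; then
        echo "!!! TRANSFER FAILED (exit status $status), see $log !!!" >&2
    fi
    return "$status"
}

# Intended use inside an archive-queue job (not run here); the here-document
# is passed through to the wrapped command's standard input:
#   run_logged hsi.log /usr/local/bin/hsi -v <<EOF
#   cput -R -u LargeFiles
#   EOF
#   exit $?
```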
HSI Documentation
Complete documentation of HSI is available on the Gleicher Enterprises web site.
Your First Time Using HSI
Once you are comfortable with HSI, you will script the process and submit it non-interactively to the queue. On your first tests, however, we suggest that you use an interactive queue to get familiar with the system. In the remainder of this subsection, you will find an example of an interactive test that you can run to become familiar with using HSI.
Setup
To begin the process, we create a directory structure to use for the tests:
mkdir BASE
echo 1 > BASE/file.1
echo 2 > BASE/file.2
Accessing HSI
Now we obtain an interactive job on the archive queue:
qsub -l nodes=1:ppn=8,walltime=2:00:00 -q archive -I
Now we start HSI:
/usr/local/bin/hsi
And that provides us with the following prompt, where your initial directory will be: /archive/$(groups)/$(whoami)/
[HSI]/archive/group/user->
You can confirm that the current directory on HSI (/archive/group/user) is empty by typing:
ls
Also note that the current directory on the disk contains the directory structure that we created at the beginning:
lls
Where the output should contain the "BASE" directory.
Offload
Now we offload the BASE directory and all of its contents to the HPSS system. Running interactively, it is easy to learn that the -R flag is necessary to offload a directory. We thus use:
cput -R BASE
The output appears as:
cput 'BASE/file.1' : 'file.1' ( 2 bytes, 0.2 KBS (cos=1300))
cput 'BASE/file.2' : 'file.2' ( 2 bytes, 0.5 KBS (cos=1300))
And now when we look at the files on the HPSS
ls BASE
The output appears as:
/archive/group/user/BASE:
file.1 file.2
Recall
We now recall the data from HPSS back to the GPFS disk. Once this entire simple test is complete, we suggest that you run through this test again using a real directory of your data. Thus, we do not delete the original directory on disk at this point. Instead, we create a new directory and recall the data from HPSS to this new directory where it can be checked for congruency to the original data if desired.
First, we need to create a new directory on the disk. We will do this from within HSI, but you could also exit HSI (using the exit command or control-c) to make the directory changes and then run HSI again. Thus, continuing from above:
lmkdir RECALL
lls
The lls output now lists both the BASE and RECALL directories on disk.
We now recall the data from HPSS to GPFS:
lcd RECALL
cget -R BASE
And the output of the cget command is:
cget '/project/.../TEST/RECALL/BASE/file.1' : '/archive/.../BASE/file.1' (2011/06/29 12:14:21 2 bytes, 4.0 KBS )
cget '/project/.../TEST/RECALL/BASE/file.2' : '/archive/.../file.2' (2011/06/29 12:14:22 2 bytes, 3.7 KBS )
We now exit HSI (using the exit command or control-c) and verify the existence of the directory that was brought back to GPFS.
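To check the recalled copy against the original, as suggested at the start of this test, a recursive diff is usually enough. The following is a minimal sketch with a hypothetical helper name; it assumes both trees are on disk, as BASE and RECALL/BASE are in this walkthrough.

```shell
#!/bin/bash
# verify_recall ORIGINAL_DIR RECALLED_DIR
# Recursively compares the two trees; returns 0 only when every file exists
# on both sides with identical contents.
verify_recall() {
    local original=$1 recalled=$2
    if diff -r -q "$original" "$recalled"; then
        echo "OK: $recalled matches $original"
    else
        echo "MISMATCH between $original and $recalled" >&2
        return 1
    fi
}

# For the walkthrough above this would be:
#   verify_recall BASE RECALL/BASE
```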
Typical Usage
The most common interactions will be putting data into HPSS, examining the contents (ls), and getting data back onto one of the active filesystems for inspection or analysis.
- sample data offload
#!/bin/bash
# This script is named: data-offload.sh
#PBS -q archive
#PBS -N offload
#PBS -j oe
#PBS -me

date

# individual tarballs already exist
# Note that upon executing hsi, your initial directory will be: /archive/$(groups)/$(whoami)/
/usr/local/bin/hsi -v <<EOF
mkdir put-away
cd put-away
cput /scratch/$USER/workarea/finished-job1.tar.gz : finished-job1.tar.gz
cput /scratch/$USER/workarea/finished-job2.tar.gz : finished-job2.tar.gz
EOF
status=$?
if [ ! $status == 0 ];then
    echo '!!! TRANSFER FAILED !!!'
fi
exit $status
- sample data list
A convenient way to explore the contents of HPSS is with the inventory shell ISH. This example creates an index of all the files in a user's portion of the namespace. The list is placed in the file /home/$USER/HPSSdm/hsi.igz that can be inspected from the gpc-devel nodes.
#!/bin/bash
# This script is named: data-list.sh
#PBS -q archive
#PBS -N hpss_index
#PBS -j oe
#PBS -me

TODAY=$(date +%Y%m%d)
INDEX_DIR=/home/$USER/HPSSdm
# Create the index directory if it does not exist yet
if [[ ! -e $INDEX_DIR ]];then
    mkdir $INDEX_DIR
fi

export ISHREGISTER=$HOME/HPSSdm
ish hindex
This index can be browsed or searched with ISH on the development nodes.
gpc-f104n084-$ ish ~/HPSSdm/hsi.igz
[ish]hsi.igz> help
ISH is a powerful tool that is also useful for creating and browsing indices of tar and htar archives, so please look at the documentation or built in help.
- sample data recall
- Note that retrieving multiple files with a single cget (rather than several separate gets) allows HSI to optimize the recall.
#!/bin/bash
# This script is named: data-recall.sh
#PBS -q archive
#PBS -N recall
#PBS -j oe
#PBS -me

mkdir -p /scratch/$USER/recalled-from-hpss

hsi -v << EOF
cget /scratch/$USER/recalled-from-hpss/Jan-2010-jobs.tar.gz : put-away-on-2010/Jan-2010-jobs.tar.gz
cget /scratch/$USER/recalled-from-hpss/Feb-2010-jobs.tar.gz : put-away-on-2010/Feb-2010-jobs.tar.gz
EOF
status=$?
exit $status
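Since a single cget of several files gives HSI room to optimize the recall, the two cget lines above could be merged into one command that holds several "local : hpss" pairs. The sketch below only builds that command line; `build_cget` is a hypothetical helper, and piping its output into hsi is an assumption based on the scripts on this page feeding hsi its commands on standard input.

```shell
#!/bin/bash
# build_cget LOCAL_DIR HPSS_DIR FILE [FILE...]
# Prints one cget command holding a "local : hpss" pair for every file.
build_cget() {
    local local_dir=$1 hpss_dir=$2
    shift 2
    local cmd="cget" f
    for f in "$@"; do
        cmd+=" ${local_dir}/${f} : ${hpss_dir}/${f}"
    done
    printf '%s\n' "$cmd"
}

# Intended use inside a job on the archive queue (not run here):
#   build_cget /scratch/$USER/recalled-from-hpss put-away-on-2010 \
#       Jan-2010-jobs.tar.gz Feb-2010-jobs.tar.gz | hsi -v
```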
HSI vs. FTP
HSI syntax and usage is very similar to that of FTP. Please note the following information adapted from the HSI man page:
HSI supports several of the commonly used FTP commands, including "dir","get","ls","mdelete","mget","put","mput" and "prompt", with the following differences:
- The "dir" command is an alias for "ls" in HSI. The "ls" command supports an extensive set of options for displaying files, including wildcard pattern-matching, and the ability to recursively list a directory tree
- The "put" and "get" family of commands support recursion
- There are "conditional put" and "conditional get" commands (cput, cget)
- The syntax for renaming files during transfers with HSI is different from FTP. With HSI, the general format is always
"local_file : hpss_file"
and multiple such pairs may be specified on a single command line.
For example, when using HSI to store the local file "file1" as "hpss_file1" into HPSS, then retrieve it back to the local filesystem as "file1.bak", the following commands could be used:
put file1 : hpss_file1
get file1.bak : hpss_file1
- unlike with FTP, where the following syntax would be used:
put file1 hpss_file1
get hpss_file1 file1.bak
- The "m" prefix is not needed for HSI commands; all commands that work with files accept multiple files on the command line. The "m" series of commands are intended to provide a measure of compatibility for FTP users.
Other HSI Examples
- Creating tar archive of C source programs and header files on the fly by piping stdout:
tar cf - *.[ch] | hsi put - : source.tar
Note: the ":" operator which separates the local and HSI pathnames must be surrounded by whitespace (one or more space characters)
- Retrieve the tar file source kept above and extract all files:
hsi get - : source.tar | tar xf -
- The commands below are equivalent (the default HSI directory placement is /archive/<group>/<user>/):
hsi put source.tar
hsi put source.tar : /archive/<group>/<user>/source.tar
- Put a subdirectory LargeFiles and all its contents recursively. You may use '-u' option to resume a previously disrupted session.
cput -R -u LargeFiles
- For more details please check the HSI Introduction, the HSI Man Page or the hsi help
- verify checksum
#!/bin/env bash
#PBS -q archive
#PBS -N checksum_verified_transfer
#PBS -j oe
#PBS -me

thefile=<localpath>
storedfile=<hpsspath>
# Base name of the local file, used to name the checksum file
fname=$(basename $thefile)

# Generate checksum on fly using a named pipe so that file is only read from GPFS once
mkfifo /tmp/NPIPE
cat $thefile | tee /tmp/NPIPE | hsi -q put - : $storedfile &
pid=$!
md5sum /tmp/NPIPE |tee /tmp/$fname.md5
rm /tmp/NPIPE

# Check the exit code of the HSI process
wait $pid
sc=$?
if [ $sc != 0 ];then
    echo "File transfer failed"
    exit $sc
fi

# change filename to stdin in checksum file
sed -i.1 "s+/tmp/NPIPE+-+" /tmp/$fname.md5

# verify checksum
hsi -q get - : $storedfile | md5sum -c /tmp/$fname.md5
sc=$?
if [ $sc != 0 ]; then
    echo '!!! Job Failed !!!'
    echo 'error=' $sc
fi
exit $sc
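The named-pipe technique in the script above can be tried locally, without HPSS, before relying on it in a job. The sketch below wraps it in a hypothetical function, `stream_with_md5`, that accepts any consumer command in place of hsi. It uses a private temporary directory for the pipe, which is a design choice of this sketch (so that concurrent jobs cannot collide on /tmp/NPIPE), not something the original script does.

```shell
#!/bin/bash
# stream_with_md5 SRC CMD [ARGS...]
# Streams SRC into CMD's standard input while computing its MD5 in the same
# pass, so SRC is read only once. Prints the MD5 hex digest and returns
# CMD's exit status.
stream_with_md5() {
    local src=$1; shift
    local tmpdir pipe sum status=0
    tmpdir=$(mktemp -d) || return 1
    pipe=$tmpdir/npipe
    mkfifo "$pipe" || { rm -rf "$tmpdir"; return 1; }

    # Writer: one read of SRC feeds both the FIFO and the consumer command.
    tee "$pipe" < "$src" | "$@" &
    local pid=$!    # PID of the last command in the pipeline (the consumer)

    # Reader: checksum the copy arriving through the FIFO.
    sum=$(md5sum < "$pipe" | awk '{print $1}')

    wait "$pid" || status=$?
    rm -rf "$tmpdir"
    printf '%s\n' "$sum"
    return "$status"
}

# Intended use in an archive-queue job (not run here):
#   stream_with_md5 "$thefile" hsi -q put - : "$storedfile" > "$thefile.md5"
```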
Strange HSI nuances
- During interactive use, even though it appears that the keyboard up arrow will retrieve previous HSI commands, this does not work as expected and should be avoided.
- Tab completion does not work. Be careful with combinations of tab completion and the "*" character!
- echo $? does not work as expected from within HSI. Here is what happens:
[HSI]/archive/group/user->echo $?
echo turned on
[HSI]/archive/group/user->echo $?
echo turned off
Inside HSI, echo toggles command echoing instead of printing a value, so avoid using echo to check the result of HSI commands.
HTAR
Please aggregate small files (<~100MB) into tarballs or htar files.
HTAR is a utility for aggregating a set of files and directories into a single archive. It writes files from the local filesystem directly into HPSS, creating an archive file that conforms to the POSIX TAR specification. A sophisticated multithreaded buffering scheme lets it achieve a high rate of performance.
CAUTION
- Files larger than 68 GB cannot be stored in an htar archive (you'll get an error message for the whole operation)
- HTAR archives cannot contain more than 1 million files.
- Check the HTAR exit code and log file before removing any files from the active filesystems.
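Because one oversized file makes the whole htar operation fail, it can be worth checking a directory against these limits before submitting the job. The following is a rough sketch, not part of HTAR: `htar_preflight` is a hypothetical name, "68 GB" is read as 68,000,000,000 bytes (an assumption), and only regular files are counted, which may differ from how HTAR counts archive members.

```shell
#!/bin/bash
# htar_preflight DIR [MAX_BYTES] [MAX_FILES]
# Returns 0 if DIR appears to fit within the limits, 1 otherwise.
htar_preflight() {
    local dir=$1
    local max_bytes=${2:-68000000000}   # assumed reading of "68 GB"
    local max_files=${3:-1000000}
    local too_big count

    # Any single file above the size limit would abort the whole archive.
    too_big=$(find "$dir" -type f -size +"${max_bytes}"c)
    if [ -n "$too_big" ]; then
        echo "Files over the size limit:" >&2
        printf '%s\n' "$too_big" >&2
        return 1
    fi

    # Count regular files (file names containing newlines would be miscounted).
    count=$(( $(find "$dir" -type f | wc -l) ))
    if [ "$count" -gt "$max_files" ]; then
        echo "$count files exceeds the limit of $max_files" >&2
        return 1
    fi
    echo "OK: $count files, none over $max_bytes bytes"
}
```

In a job script this could guard the archive step, for example `htar_preflight subdirA && htar -cf subdirA.tar subdirA`.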
HTAR Usage
- To write the file1 and file2 files to a new archive called files.tar in the default HPSS home directory, enter:
htar -cf files.tar file1 file2
OR
htar -cf /archive/<group>/<user>/files.tar file1 file2
- To write a subdirA to a new archive called subdirA.tar in the default HPSS home directory, enter:
htar -cf subdirA.tar subdirA
- To extract all files from the project1/src directory in the archive file called proj1.tar, and use the time of extraction as the modification time, enter:
htar -xm -f proj1.tar project1/src
- To display the names of the files in the out.tar archive file within the HPSS home directory, enter (the out.tar.idx file will be queried):
htar -vtf out.tar
For more details please check the HTAR - Introduction or the HTAR Man Page online