Difference between revisions of "MARS"

From oldwiki.scinet.utoronto.ca

Revision as of 00:47, 28 April 2011

Massive Archive and Restore System

(The pilot usage phase starts in May 2011 with a select group of users. Deployment and configuration are still a work in progress.)

The MARS deployment at SciNet is an effort to offer a more efficient way to offload/archive data from the most active file systems (scratch and project) than our current TSM-HSM solution, still without users having to deal directly with the tape library or "tape commands".

The system is a combination of the underlying hardware infrastructure and 3 software components (HPSS, HSI and HTAR), plus some environment customization.

HPSS: the main component, best described as a very scalable "blackbox" engine running in the background to support the Archive and Restore operations. HPSS (High Performance Storage System) is the result of over a decade of collaboration among five Department of Energy laboratories and IBM, with significant contributions by universities and other laboratories worldwide. For now, the best way for SciNet users to understand HPSS may be to compare it with our existing HSM-TSM implementation.

HSI: best understood as a supercharged ftp interface, specially designed by Gleicher Enterprises to act as a front-end for HPSS, combining some of the best features of a shell, rsync and GridFTP (and a few more). It enables users to transfer whole directory trees from /project and /scratch, freeing up space on those file systems. HSI is most suitable when the directory trees do not contain too many small files to start with, or when you already have a series of tarballs.

HTAR: similarly, htar is a sort of "super-tar" application, also specially designed by Gleicher Enterprises to interact with HPSS, allowing users to build tarballs and transfer them to HPSS automatically, on the fly. HTAR is most suitable for aggregating whole directory trees.

Quick Reference

Files are organized inside HPSS in the same fashion as in /project. Users in the same group have read access to each other's archives.

/archive/<group>/<user>
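
As a minimal sketch of the layout above, a script that prepares transfers could build the archive location from a group and user name. The helper name `archive_dir` and the validation rules are illustrative assumptions, not part of the MARS environment; only the `/archive/<group>/<user>` pattern comes from this page.

```python
import posixpath

# Root of the archive namespace, as given in the layout above.
ARCHIVE_ROOT = "/archive"


def archive_dir(group, user):
    """Return the /archive/<group>/<user> path for a group and user name.

    Hypothetical helper: it only assembles the string and does not
    contact HPSS or check that the directory exists.
    """
    for part in (group, user):
        # Reject empty names and anything that would change the layout.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError("invalid path component: %r" % (part,))
    return posixpath.join(ARCHIVE_ROOT, group, user)
```

For example, `archive_dir("mygroup", "alice")` gives `/archive/mygroup/alice`, where both names are placeholders.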

Using HSI


Using HTAR


Performance/Limits considerations

  • Transfers in and out of HPSS using HSI are limited to a maximum of about 4 files/second. Therefore, do not attempt to transfer directories with too many (small) files inside. Use HTAR instead, so that the files are aggregated while being sent to HPSS.
  • The maximum size that an individual file can have inside an HTAR is 68GB. Please be sure to pick out any larger files from the directories and transfer them with HSI.
  • The maximum size of a tar/htar file that HPSS will take is 1TB. Please do not generate tarballs that large.
  • Average transfer rates with HSI (no small files, average > 1MB/file):
 * write: 100-130MB/s
 * read:  450-600MB/s (no staging from tapes required)
  • Average transfer rates with HTAR (not too many small files, average > 100KB/file, aggregation included):
 * write: 30-40MB/s
 * read:  100-110MB/s (no staging from tapes required)
  • Average transfer rates from tapes, if staging is required (add to the above estimates):
 * read: 80MB/s per tape drive
 * maximum of 4 drives may be used per hsi/htar session
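
The limits above can be turned into a rough pre-flight check. The following is only a sketch under stated assumptions: GB and TB are taken as decimal units (10^9 and 10^12 bytes), tar header overhead is ignored, and the time model (bandwidth time, floored by the files/second limit, plus tape staging time) is a simplification of the averages quoted in the list. The function names are hypothetical and nothing here calls hsi or htar.

```python
import os

GB = 10**9   # assumption: decimal units; the page does not say which
TB = 10**12

HTAR_MAX_MEMBER_BYTES = 68 * GB   # largest file allowed inside an HTAR
HPSS_MAX_TAR_BYTES = 1 * TB       # largest tar/htar file HPSS will take
HSI_MAX_FILES_PER_S = 4           # approximate HSI file-rate limit
TAPE_READ_MB_PER_S = 80           # per tape drive, when staging is needed
MAX_TAPE_DRIVES = 4               # per hsi/htar session


def scan_tree(root):
    """Return a list of (path, size_in_bytes) for regular files under root."""
    files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                files.append((path, os.path.getsize(path)))
    return files


def plan_transfer(files):
    """Split files between HSI and HTAR according to the size limits."""
    oversized = [p for p, size in files if size > HTAR_MAX_MEMBER_BYTES]
    rest = [(p, size) for p, size in files if size <= HTAR_MAX_MEMBER_BYTES]
    htar_bytes = sum(size for _p, size in rest)
    return {
        "hsi_files": oversized,            # too large for HTAR, send with HSI
        "htar_file_count": len(rest),
        "htar_bytes": htar_bytes,
        # One tarball of this payload would reach the 1TB limit.
        "htar_needs_split": htar_bytes >= HPSS_MAX_TAR_BYTES,
    }


def estimate_seconds(total_bytes, rate_mb_per_s, file_count=0,
                     files_per_s=None, stage_drives=0):
    """Rough transfer time from an average rate in MB/s.

    files_per_s, if given, puts a floor on the time (HSI file-rate limit).
    stage_drives > 0 adds tape staging time, capped at MAX_TAPE_DRIVES.
    """
    seconds = total_bytes / (rate_mb_per_s * 10**6)
    if files_per_s:
        seconds = max(seconds, file_count / files_per_s)
    if stage_drives:
        drives = min(stage_drives, MAX_TAPE_DRIVES)
        seconds += total_bytes / (TAPE_READ_MB_PER_S * 10**6 * drives)
    return seconds
```

As a usage example, `plan_transfer(scan_tree("/scratch/mygroup/alice/run01"))` (a placeholder path) lists the files that must go through HSI and reports whether the remainder should be split into several HTAR archives. For a tree of 4000 files totalling 1GB, `estimate_seconds(10**9, 100, file_count=4000, files_per_s=HSI_MAX_FILES_PER_S)` is dominated by the file-rate limit rather than by bandwidth, which is the situation where HTAR is the better choice.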