Data Transfer

Revision as of 23:07, 8 February 2010

Data Mover Node

Large data transfers to or from SciNet should be done from the datamover1 node. From any of the interactive SciNet nodes, you should be able to log in with ssh datamover1. This is the machine with the fastest network connection to the outside world: a 10 Gb/s link versus 1 Gb/s, a factor of 10 faster.

Transfers must be originated from datamover1. That is, you cannot connect to the data mover node from the outside world and copy files to or from it directly; you have to log in to the data mover node and, from there, copy the data to or from the outside network.
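As a rough illustration of this rule, the sketch below wraps a transfer in a check that it is being started on the data mover node. The function names and the remote source argument are hypothetical, and the check assumes the node's short hostname is exactly datamover1.

```shell
#!/bin/bash
# Sketch only: refuse to start a transfer unless we are on the data mover node.

# is_datamover [HOSTNAME]
# Succeeds if HOSTNAME (default: this machine's short hostname) is datamover1.
# Assumption: the short hostname of the data mover node is "datamover1".
is_datamover() {
    local host="${1:-$(hostname -s 2>/dev/null)}"
    [ "$host" = "datamover1" ]
}

# pull_from_remote REMOTE_SRC LOCAL_DEST
# Copies REMOTE_SRC (e.g. user@host:/path, a hypothetical remote) to
# LOCAL_DEST with rsync over ssh, but only when run on datamover1.
pull_from_remote() {
    if ! is_datamover; then
        echo "not on datamover1; run 'ssh datamover1' first" >&2
        return 1
    fi
    rsync -av -e ssh "$1" "$2"
}
```

Because the transfer is started from datamover1, pulling data in and pushing data out are both just ordinary rsync or scp commands run there, with the remote machine named as source or destination.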

ssh

All traffic to and from the data centre has to go via SSH, or secure shell. This is a protocol which sets up a secure connection between two sites. On top of this protocol, there are many ways to copy files.

The usual ssh protocols were not designed for speed. On the datamover1 node, we have installed hpn-ssh, or High-Performance-enabled ssh (http://www.psc.edu/networking/projects/hpn-ssh/). This is backwards compatible with the "usual" ssh, but is capable of significantly higher speeds. If you routinely have large data transfers to do, we recommend having your system administrator look into installing hpn-ssh on your system.

Everything we discuss below, unless otherwise stated, will work regardless of whether you have hpn-ssh installed on your remote system.

scp

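scp, or secure copy, is the easiest way to copy files over ssh. When both ends run hpn-ssh, the following options switch the connection to the unencrypted "none" cipher once authentication has completed, which speeds up bulk transfers at the cost of sending the file data in the clear:

scp -oNoneEnabled=yes -oNoneSwitch=yes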
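The command above is shown without a source or destination. A minimal sketch of a complete command line is given below; the helper name, file name, and remote host are all hypothetical, and the helper only prints the command so that it can be reviewed before being run.

```shell
#!/bin/bash
# Sketch only: print the scp command line for a transfer, for review.

# build_scp_cmd MODE SRC DEST
# MODE is "hpn" (both ends run hpn-ssh) or "plain" (ordinary ssh).
build_scp_cmd() {
    local mode="$1" src="$2" dest="$3"
    local opts=""
    if [ "$mode" = "hpn" ]; then
        # hpn-ssh only: use the "none" cipher after authentication.
        # The file data is then NOT encrypted in transit.
        opts="-oNoneEnabled=yes -oNoneSwitch=yes "
    fi
    echo "scp ${opts}${src} ${dest}"
}

# Hypothetical file and remote host, for illustration.
build_scp_cmd hpn results.tar user@remote.example.org:/data/
build_scp_cmd plain results.tar user@remote.example.org:/data/
```

The "plain" form is the one to use when the remote system does not have hpn-ssh installed.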

rsync

rsync -av -e ssh 45-tangle2/*0025.bin ljdursi@nuexport00.cita.utoronto.ca:/mnt/raid-cita/ljdursi/athDrape/45-tangle2

If the files compress well, compressing them first, or transmitting with compression turned on (scp -C, rsync -z), can significantly improve effective data transfer rates.
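Whether a file "compresses well" can be checked before the transfer. The sketch below, with a hypothetical helper name, reports the gzip-compressed size of a file as a percentage of its original size. gzip is used here only as a convenient stand-in, and the ratio achieved by scp -C or rsync -z may differ somewhat.

```shell
#!/bin/bash
# Sketch only: estimate how well a file compresses before transferring it.

# compressed_percent FILE
# Prints the gzip-compressed size of FILE as a whole-number percentage
# of its original size (smaller means it compresses better).
compressed_percent() {
    local file="$1" orig comp
    # Arithmetic expansion strips any padding that wc adds on some systems.
    orig=$(( $(wc -c < "$file") ))
    comp=$(( $(gzip -c "$file" | wc -c) ))
    if [ "$orig" -eq 0 ]; then
        echo 100    # empty file: nothing to gain
        return 0
    fi
    echo $(( comp * 100 / orig ))
}
```

A result well below 100 suggests that -C or -z is worth trying; a result near 100 (typical of already-compressed or binary floating-point data) suggests that compression would mostly cost CPU time.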


What transfer speeds should I expect?

Mode    With hpn-ssh    Without hpn-ssh
rsync   60-80 MB/s      30-40 MB/s
scp     50 MB/s         25 MB/s
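These rates translate directly into how long a transfer will take. The sketch below does that arithmetic; the helper name is hypothetical, and it assumes a sustained rate and 1 GB = 1000 MB.

```shell
#!/bin/bash
# Sketch only: estimate transfer time from data size and sustained rate.

# transfer_seconds SIZE_GB RATE_MB_S
# Prints the number of seconds (rounded up) needed to move SIZE_GB
# gigabytes at RATE_MB_S megabytes per second. Whole numbers only.
transfer_seconds() {
    local size_gb="$1" rate="$2"
    local size_mb=$(( size_gb * 1000 ))
    echo $(( (size_mb + rate - 1) / rate ))
}

transfer_seconds 100 70    # 100 GB with rsync over hpn-ssh
transfer_seconds 100 25    # 100 GB with scp without hpn-ssh
```

For 100 GB this gives 1429 seconds (about 24 minutes) at 70 MB/s and 4000 seconds (about 67 minutes) at 25 MB/s.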

Why are my transfers so much slower?

If your transfer rates are much lower than those above, it could be any of a number of things:

 - the network connection between SciNet and your machine
 - how busy the server is
 - how busy the disk is
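Before looking for a cause, it helps to put a number on the rate you are actually getting so that it can be compared with the table above. A minimal sketch, with a hypothetical helper name and assuming 1 MB = 1,000,000 bytes:

```shell
#!/bin/bash
# Sketch only: convert a measured transfer into an effective rate in MB/s.

# rate_mb_s BYTES SECONDS
# Prints the effective rate in MB/s to one decimal place.
# Returns non-zero if SECONDS is not positive.
rate_mb_s() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        if (s <= 0) exit 1
        printf "%.1f\n", b / s / 1000000
    }'
}

rate_mb_s 7000000000 100    # 7 GB moved in 100 seconds
```

The byte count and elapsed time can be taken from the summary that rsync -av prints at the end of a run, or from timing the scp command with time.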

For advice on tuning TCP settings for high-speed transfers, see http://www.psc.edu/networking/projects/tcptune/