
File Transfer

Gautschi supports several methods for file transfer. The sections below describe each method.

SCP

SCP (Secure CoPy) is a simple way of transferring files between two machines that use the SSH protocol. SCP is available as a protocol choice in some graphical file transfer programs and also as a command line program on most Linux, Unix, and Mac OS X systems. SCP can copy single files, but will also recursively copy directory contents if given a directory name.

As of Aug 17, 2020, the community clusters no longer support password-based authentication for login. Supported methods include two-factor authentication (Purdue Login) and SSH keys. If you do not have SSH keys installed, type your Purdue Login response into SCP's password prompt.
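
If you would like to set up SSH keys instead, a minimal sketch using the standard OpenSSH tools follows (the key type is a matter of preference):

          (generate a key pair; accept the default location)
    $ ssh-keygen -t ed25519
          (install the public key on Gautschi; you will confirm with Purdue Login once)
    $ ssh-copy-id myusername@gautschi.rcac.purdue.edu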

Command-line usage:

You can transfer files both to and from Gautschi while initiating an SCP session on either some other computer or on Gautschi (in other words, directionality of connection and directionality of data flow are independent of each other). The scp command is similar to the familiar cp command, with an extra user@host:file syntax to denote files and directories on a remote host. Either Gautschi or the other computer can be the remote host.

  • Example: Initiating SCP session on some other computer (i.e. you are on some other computer, connecting to Gautschi):

          (transfer TO Gautschi)
          (Individual files) 
    $ scp  sourcefile  myusername@gautschi.rcac.purdue.edu:somedir/destinationfile
    $ scp  sourcefile  myusername@gautschi.rcac.purdue.edu:somedir/
          (Recursive directory copy)
    $ scp -pr sourcedirectory/  myusername@gautschi.rcac.purdue.edu:somedir/
    
          (transfer FROM Gautschi)
          (Individual files)
    $ scp  myusername@gautschi.rcac.purdue.edu:somedir/sourcefile  destinationfile
    $ scp  myusername@gautschi.rcac.purdue.edu:somedir/sourcefile  somedir/
          (Recursive directory copy)
    $ scp -pr myusername@gautschi.rcac.purdue.edu:sourcedirectory  somedir/
    

    The -p flag is optional. When used, it will cause the transfer to preserve file attributes and permissions. The -r flag is required for recursive transfers of entire directories.

  • Example: Initiating SCP session on Gautschi (i.e. you are on Gautschi, connecting to some other computer):

          (transfer TO Gautschi)
          (Individual files) 
    $ scp  myusername@another.computer.example.com:sourcefile  somedir/destinationfile
    $ scp  myusername@another.computer.example.com:sourcefile  somedir/
          (Recursive directory copy)
    $ scp -pr myusername@another.computer.example.com:sourcedirectory/  somedir/
    
          (transfer FROM Gautschi)
          (Individual files)
    $ scp  somedir/sourcefile  myusername@another.computer.example.com:destinationfile
    $ scp  somedir/sourcefile  myusername@another.computer.example.com:somedir/
          (Recursive directory copy)
    $ scp -pr sourcedirectory  myusername@another.computer.example.com:somedir/
    

    The -p flag is optional. When used, it will cause the transfer to preserve file attributes and permissions. The -r flag is required for recursive transfers of entire directories.

Software (SCP clients)

Linux and other Unix-like systems:

  • The scp command-line program should already be installed.

Microsoft Windows:

  • MobaXterm
    Free, full-featured, graphical Windows SSH, SCP, and SFTP client.
  • The command-line scp program can be installed as part of the Windows Subsystem for Linux (WSL) or Git-Bash.

Mac OS X:

  • The scp command-line program should already be installed. You may start a local terminal window from "Applications->Utilities".
  • Cyberduck is a full-featured and free graphical SFTP and SCP client.

FTP / SFTP

FTP is not supported on any research systems because it does not allow for secure transmission of data. Use SFTP instead, as described below.

SFTP (Secure File Transfer Protocol) is a reliable way of transferring files between two machines. SFTP is available as a protocol choice in some graphical file transfer programs and also as a command-line program on most Linux, Unix, and Mac OS X systems. SFTP has more features than SCP, allowing other operations on remote files, remote directory listing, and resuming interrupted transfers (see the short sketch after the examples below). Command-line SFTP cannot recursively copy directory contents; to do so, try using SCP or a graphical SFTP client.

As of Aug 17, 2020, the community clusters no longer support password-based authentication for login. Supported methods include two-factor authentication (Purdue Login) and SSH keys. If you do not have SSH keys installed, type your Purdue Login response into SFTP's password prompt.

Command-line usage

You can transfer files both to and from Gautschi while initiating an SFTP session on either some other computer or on Gautschi (in other words, directionality of connection and directionality of data flow are independent of each other). Once the connection is established, use the put and get subcommands to move files between the "local" and "remote" computers. Either Gautschi or the other computer can be the remote host.

  • Example: Initiating SFTP session on some other computer (i.e. you are on another computer, connecting to Gautschi):

    $ sftp myusername@gautschi.rcac.purdue.edu
    
          (transfer TO Gautschi)
    sftp> put sourcefile somedir/destinationfile
    sftp> put -P sourcefile somedir/
    
          (transfer FROM Gautschi)
    sftp> get sourcefile somedir/destinationfile
    sftp> get -P sourcefile somedir/
    
    sftp> exit
    

    The -P flag is optional. When used, it will cause the transfer to preserve file attributes and permissions.

  • Example: Initiating SFTP session on Gautschi (i.e. you are on Gautschi, connecting to some other computer):

    $ sftp myusername@another.computer.example.com
    
          (transfer TO Gautschi)
    sftp> get sourcefile somedir/destinationfile
    sftp> get -P sourcefile somedir/
    
          (transfer FROM Gautschi)
    sftp> put sourcefile somedir/destinationfile
    sftp> put -P sourcefile somedir/
    
    sftp> exit
    

    The -P flag is optional. When used, it will cause the transfer to preserve file attributes and permissions.
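
As mentioned above, SFTP also supports remote directory listings and resuming interrupted transfers. A brief sketch of these subcommands, using OpenSSH's sftp (the paths are illustrative):

    $ sftp myusername@gautschi.rcac.purdue.edu
          (list a remote directory)
    sftp> ls somedir
          (resume a previously interrupted download)
    sftp> reget somedir/largefile
    sftp> exit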

Software (SFTP clients)

Linux and other Unix-like systems:

  • The sftp command-line program should already be installed.

Microsoft Windows:

  • MobaXterm
    Free, full-featured, graphical Windows SSH, SCP, and SFTP client.
  • The command-line sftp program can be installed as part of the Windows Subsystem for Linux (WSL) or Git-Bash.

Mac OS X:

  • The sftp command-line program should already be installed. You may start a local terminal window from "Applications->Utilities".
  • Cyberduck is a full-featured and free graphical SFTP and SCP client.

Globus


Globus, previously known as Globus Online, is a powerful and easy-to-use file transfer service for transferring files virtually anywhere. It works within RCAC's various research storage systems; it connects between RCAC and remote research sites running Globus; and it connects research systems to personal systems. You may use Globus to connect to your home, scratch, and Fortress storage directories. Since Globus is web-based, it works on any operating system that is connected to the internet. The Globus Personal client is available on Windows, Linux, and Mac OS X. It is primarily used as a graphical means of transfer but it can also be used over the command line.

Globus Web:

  • Navigate to http://transfer.rcac.purdue.edu
  • Click "Proceed" to log in with your Purdue Career Account.
  • On your first login, it will ask you to connect to a Globus account. Accept the conditions.
  • Now you are at the main screen. Click "File Transfer", which will bring you to a two-panel interface (if you only see one panel, use the selector in the top-right corner to switch the view).
  • You will need to select one collection and file path on one side as the source, and another collection on the other side as the destination. This can be one of several Purdue endpoints, another university, or even your personal computer (see the Personal Client section below).

The RCAC collections are listed below. A search for "Purdue" will suggest several results to choose from, or you can search for something more specific.

  • Home directory and scratch storage: "Gautschi Cluster Collection". You can also start typing "gautschi" and it will suggest appropriate matches.
  • Research Data Depot: "Purdue Research Computing - Data Depot", a search for "Depot" should provide appropriate matches to choose from.
  • Fortress: "Purdue Fortress HPSS Archive", a search for "Fortress" should provide appropriate matches to choose from.

From here, select a file or folder on either side of the two-pane window, and then use the arrows in the top-middle of the interface to instruct Globus to move files from one side to the other. You can transfer files in either direction. You will receive an email once the transfer is completed.

Globus Personal Client setup:

Globus Connect Personal is a small software tool you can install to turn your own computer into a Globus endpoint. It is useful if you need to transfer files via Globus to and from your computer directly.

  • On the "Collections" page from earlier, click "Get Globus Connect Personal" or download a version for your operating system it from here: Globus Connect Personal
  • Name this particular personal system and follow the setup prompts to create your Globus Connect Personal endpoint.
  • Your personal system is now available as a collection within the Globus transfer interface.
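
On Linux, the personal endpoint can also be installed and run from a terminal. A minimal sketch, assuming the stable download link published by Globus (the extracted directory name varies by version):

    $ wget https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
    $ tar xzf globusconnectpersonal-latest.tgz
    $ cd globusconnectpersonal-*
          (one-time setup; follow the login prompts)
    $ ./globusconnectpersonal -setup
          (run the endpoint in the background)
    $ ./globusconnectpersonal -start &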

Globus Command Line:

Globus also supports a command-line interface, allowing advanced automation of your transfers.

The recommended way is the standalone Globus CLI application (the globus command). A minimal sketch of installing it and submitting a transfer follows; the endpoint IDs and paths are placeholders:
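
    $ pip install globus-cli
          (log in; this opens a browser window for authentication)
    $ globus login
          (find the endpoint ID of a collection)
    $ globus endpoint search "gautschi"
          (submit an asynchronous transfer between two endpoints)
    $ globus transfer SRC_ENDPOINT_ID:/path/to/source DST_ENDPOINT_ID:/path/to/destination --label "my transfer"

Once submitted, the transfer runs on the Globus service, and you will receive an email when it completes, just as with the web interface.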

Sharing Data with Outside Collaborators

Globus allows convenient sharing of data with outside collaborators. Data can be shared with collaborators' personal computers or directly with many other computing resources at other institutions. See the Globus documentation on how to share data.

For links to more information, please see the Globus Support page and the RCAC Globus presentation.

Windows Network Drive / SMB

SMB (Server Message Block), also known as CIFS, is an easy-to-use file transfer protocol that is useful for transferring files between RCAC systems and a desktop or laptop. You may use SMB to connect to your home, scratch, and Fortress storage directories. The SMB protocol is available on Windows, Linux, and Mac OS X. It is primarily used as a graphical means of transfer but it can also be used over the command line.

Note: to access Gautschi through SMB file sharing, you must be on a Purdue campus network or connected through VPN.

Windows:

  • Windows 7: Click Windows menu > Computer, then click Map Network Drive in the top bar
  • Windows 8 & 10: Tap the Windows key, type computer, select This PC, click Computer > Map Network Drive in the top bar
  • Windows 11: Tap the Windows key, type File Explorer, select This PC, click Computer > Map Network Drive in the top bar
  • In the folder location enter the following information and click Finish:
    • To access your Gautschi home directory, enter \\home.gautschi.rcac.purdue.edu\gautschi-home.
    • To access your scratch space on Gautschi, enter \\scratch.gautschi.rcac.purdue.edu\gautschi-scratch. Once mapped, you will be able to navigate to your scratch directory.
  • Note: Use your career account login name and password when prompted. (You will not need to add ",push" nor use your Purdue Duo client.)
  • Your home or scratch directory should now be mounted as a drive in the Computer window.

Mac OS X:

  • In the Finder, click Go > Connect to Server
  • In the Server Address enter the following information and click Connect:
    • To access your Gautschi home directory, enter smb://home.gautschi.rcac.purdue.edu/gautschi-home.
    • To access your scratch space on Gautschi, enter smb://scratch.gautschi.rcac.purdue.edu/gautschi-scratch. Once mapped, you will be able to navigate to your scratch directory.
  • Note: Use your career account login name and password when prompted. (You will not need to add ",push" nor use your Purdue Duo client.)
  • Your home or scratch directory should now be mounted and should appear in the Finder.

Linux:

  • There are several graphical methods to connect in Linux depending on your desktop environment. Once you find out how to connect to a network server on your desktop environment, choose the Samba/SMB protocol and adapt the information from the Mac OS X section to connect.
  • If you would like command-line access via Samba, you may install smbclient, which provides FTP-like access and can be used as shown below. For the share addresses, see the Mac OS X instructions above.
    $ smbclient //home.gautschi.rcac.purdue.edu/gautschi-home -U myusername
    
    $ smbclient //scratch.gautschi.rcac.purdue.edu/gautschi-scratch -U myusername
  • Note: Use your career account login name and password when prompted. (You will not need to add ",push" nor use your Purdue Duo client.)
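
Alternatively, the share can be mounted directly into the filesystem with the cifs-utils package. A sketch, assuming cifs-utils is installed (mount options vary by distribution):

          (create a mount point and mount your home directory share)
    $ sudo mkdir -p /mnt/gautschi-home
    $ sudo mount -t cifs //home.gautschi.rcac.purdue.edu/gautschi-home /mnt/gautschi-home -o username=myusername,uid=$(id -u),gid=$(id -g)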

HSI

HSI, the Hierarchical Storage Interface, is the preferred method of transferring files to and from the Fortress archive. HSI is designed to be a friendly interface for users of the High Performance Storage System (HPSS). It provides a familiar Unix-style environment for working within HPSS while automatically taking advantage of high-speed, parallel file transfers without requiring any special user knowledge.

HSI is provided on all research systems as the command hsi. HSI is also available for download for many operating systems.

Interactive usage:

$ hsi

*************************************************************************
*                    Purdue University
*                  High Performance Storage System (HPSS)
*************************************************************************
* This is the Purdue Data Archive, Fortress.  For further information
* see http://www.rcac.purdue.edu/storage/fortress/
*
*   If you are having problems with HPSS, please call IT/Operational
*   Services at 49-44000 or send E-mail to rcac-help@purdue.edu.
*
*************************************************************************
Username: myusername  UID: 12345  Acct: 12345(12345) Copies: 1 Firewall: off [hsi.3.5.8 Wed Sep 21 17:31:14 EDT 2011]

[Fortress HSI]/home/myusername->put data1.fits
put  'data1.fits' : '/home/myusername/data1.fits' ( 1024000000 bytes, 250138.1 KBS (cos=11))

[Fortress HSI]/home/myusername->lcd /tmp

[Fortress HSI]/home/myusername->get data1.fits
get  '/tmp/data1.fits' : '/home/myusername/data1.fits' (2011/10/04 16:28:50 1024000000 bytes, 325844.9 KBS )

[Fortress HSI]/home/myusername->quit

Batch transfer file:

put data1.fits
put data2.fits
put data3.fits
put data4.fits
put data5.fits
put data6.fits
put data7.fits
put data8.fits
put data9.fits

Batch usage:

$ hsi < my_batch_transfer_file
*************************************************************************
*                    Purdue University
*                  High Performance Storage System (HPSS)
*************************************************************************
* This is the Purdue Data Archive, Fortress.  For further information
* see http://www.rcac.purdue.edu/storage/fortress/
*
*   If you are having problems with HPSS, please call IT/Operational
*   Services at 49-44000 or send E-mail to rcac-help@purdue.edu.
*
*************************************************************************
Username: myusername  UID: 12345  Acct: 12345(12345) Copies: 1 Firewall: off [hsi.3.5.8 Wed Sep 21 17:31:14 EDT 2011]
put  'data1.fits' : '/home/myusername/data1.fits' ( 1024000000 bytes, 250200.7 KBS (cos=11))
put  'data2.fits' : '/home/myusername/data2.fits' ( 1024000000 bytes, 258893.4 KBS (cos=11))
put  'data3.fits' : '/home/myusername/data3.fits' ( 1024000000 bytes, 222819.7 KBS (cos=11))
put  'data4.fits' : '/home/myusername/data4.fits' ( 1024000000 bytes, 224311.9 KBS (cos=11))
put  'data5.fits' : '/home/myusername/data5.fits' ( 1024000000 bytes, 323707.3 KBS (cos=11))
put  'data6.fits' : '/home/myusername/data6.fits' ( 1024000000 bytes, 320322.9 KBS (cos=11))
put  'data7.fits' : '/home/myusername/data7.fits' ( 1024000000 bytes, 253192.6 KBS (cos=11))
put  'data8.fits' : '/home/myusername/data8.fits' ( 1024000000 bytes, 253056.2 KBS (cos=11))
put  'data9.fits' : '/home/myusername/data9.fits' ( 1024000000 bytes, 323218.9 KBS (cos=11))
EOF detected on TTY - ending HSI session
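
HSI can also run a single command non-interactively, which is convenient inside job scripts. A brief sketch (the filenames are illustrative):

          (copy one file into Fortress and exit)
    $ hsi put data1.fits
          (list your Fortress home directory)
    $ hsi ls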

For more information about HSI, see the HSI documentation or type help at the hsi prompt.

HTAR

HTAR (short for "HPSS TAR") is a utility program that writes TAR-compatible archive files directly into Fortress, without having to first create a local file. Its command line was originally based on tar, with a number of extensions added to provide extra features.

HTAR is provided on all research systems as the command htar. HTAR is also available for download for many operating systems.

Usage:

Create a tar archive on Fortress named data.tar including all files with the extension ".fits":

$ htar -cvf data.tar *.fits
HTAR: a   data1.fits
HTAR: a   data2.fits
HTAR: a   data3.fits
HTAR: a   data4.fits
HTAR: a   data5.fits
HTAR: a   /tmp/HTAR_CF_CHK_17953_1317760775
HTAR Create complete for data.tar. 5,120,006,144 bytes written for 5 member files, max threads: 3 Transfer time: 16.457 seconds (311.121 MB/s)
HTAR: HTAR SUCCESSFUL

Unpack a tar archive named data.tar from Fortress into a scratch directory for use in a batch job:

$ cd $RCAC_SCRATCH/job_dir
$ htar -xvf data.tar
HTAR: x data1.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data2.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data3.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data4.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data5.fits, 1024000000 bytes, 2000001 media blocks
HTAR: Extract complete for data.tar, 5 files. total bytes read: 5,120,004,608 in 18.841 seconds (271.749 MB/s )
HTAR: HTAR SUCCESSFUL

Look at the contents of the data.tar HTAR archive on Fortress:

$ htar -tvf data.tar
HTAR: -rw-r--r--  myusername/pucc 1024000000 2011-10-04 16:30  data1.fits
HTAR: -rw-r--r--  myusername/pucc 1024000000 2011-10-04 16:35  data2.fits
HTAR: -rw-r--r--  myusername/pucc 1024000000 2011-10-04 16:35  data3.fits
HTAR: -rw-r--r--  myusername/pucc 1024000000 2011-10-04 16:35  data4.fits
HTAR: -rw-r--r--  myusername/pucc 1024000000 2011-10-04 16:35  data5.fits
HTAR: -rw-------  myusername/pucc        256 2011-10-04 16:39  /tmp/HTAR_CF_CHK_17953_1317760775
HTAR: Listing complete for data.tar, 6 files 6 total objects
HTAR: HTAR SUCCESSFUL

Unpack a single file, "data5.fits", from the tar archive named data.tar on Fortress into a scratch directory:

$ htar -xvf data.tar data5.fits
HTAR: x data5.fits, 1024000000 bytes, 2000001 media blocks
HTAR: Extract complete for data.tar, 1 files. total bytes read: 1,024,000,512 in 3.642 seconds (281.166 MB/s )
HTAR: HTAR SUCCESSFUL

HTAR Archive Verification

HTAR allows different types of content verification while creating archives. Users can ask HTAR to verify the contents of an archive during (or after) creation using the '-Hverify' switch. The syntax of this option is:

$ htar -Hverify=option[,option...] ... other arguments ... 
where option can be any of the following:
  • info: Compares tar header info with the corresponding values in the index.
  • crc: Enables CRC checking of archive files for which a CRC was generated when the file was added to the archive.
  • compare: Enables a byte-by-byte comparison of archive member files and their local file counterparts.
  • nocrc: Disables CRC checking of archive files.
  • nocompare: Disables a byte-by-byte comparison of archive member files and their local file counterparts.

Users can use a comma-separated list of options shown above, or a numeric value, or the wildcard all to specify the degree of verification. The numeric values for Hverify can be interpreted as follows:

  • 0: Enables "info" verification.
  • 1: Enables level 0 + "crc" verification.
  • 2: Enables level 1 + "compare" verification.
  • all: Enables all comparison options.

An example to verify an archive during creation using checksums (crc):

$ htar -Hverify=1 -cvf abc.tar ./abc

An example to verify a previously created archive using checksums (crc):

$ htar -Hverify=1 -Kvf abc.tar

Please note that the time for verifying an archive increases as you increase the verification level. Carefully choose the option that suits your dataset best.

For details please see the HTAR Man Page.

HTAR has an individual file size limit of 64GB. If any file you are trying to archive with HTAR is larger than 64GB, HTAR will immediately fail; this does not limit the number of files in the archive or the total overall size of the archive. To work around this limitation, try the htar_large command. Its usage is almost the same as htar, but it is slower and does not generate the tar index file, so the index-based -Hverify=1 option cannot be used.
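
For illustration, archiving files larger than 64GB might look like the following (the names are hypothetical):

    $ htar_large -cvf data_large.tar big_simulation_output/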
