Fortress - Complete User Guide

Overview of Fortress

The Fortress DXUL system is a large, long-term, multi-tiered file caching and storage system utilizing both online disk and robotic tape drives. Fortress was upgraded in the fall of 2006. It currently consists of an IBM p570 with four 1.65 GHz Power5 processors, 8 GB of RAM, and a 2.5 TB RAID disk cache with an effective capacity of 1.7 TB. Fortress also uses an ADIC Scalar 10K robotic tape library with a capacity of 1.2 PB (36 LTO-II drives and 2,000 LTO-II tape cartridges).

Detailed Hardware Specification

Storage Subsystem                      Drives      Media          Total Capacity   Effective Capacity
RAID Disk Cache                        -           -              2.5 TB           1.7 TB
ADIC Scalar 10K Robotic Tape Library   36 LTO-II   2,000 LTO-II   1.2 PB           600 TB

Files stored on Fortress are written to two separate storage devices. Recently used files smaller than 0.5 MB have their primary copy stored on 4 TB of low-cost disks (disk cache), but the second copy (backup of disk cache) is on tape or optical disks. This provides a rapid restore time to the disk cache. However, the long latency to access a larger file (usually involving a copy from a tape cartridge) makes Fortress unsuitable for use as active storage. The primary and secondary copies of larger files are stored on separate tape cartridges in the ADIC tape library.

Two uses in particular perform poorly and can cause severe problems with the system itself:

  • DO NOT store any actively used files on Fortress.
  • DO NOT store large collections of small files on Fortress.

Do not use Fortress as a second home directory. Instead, use tar or some similar archive tool to combine all the smaller files you wish to store into a single large file first.

For active data storage you should use either local storage or a scratch file system. You may then copy any results you wish to archive to Fortress when computation is complete.
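
The workflow described above might look like the following on a compute system. This is a minimal sketch: the temporary directory stands in for a scratch directory, the file and archive names are placeholders, and the final scp command is only printed, not run.

```shell
#!/bin/sh
# Sketch: bundle a directory of small files into one archive before
# copying it to Fortress. All names here are placeholders.
set -eu

workdir=$(mktemp -d)            # stand-in for a scratch directory
mkdir "$workdir/results"
for i in 1 2 3; do
    echo "sample output $i" > "$workdir/results/run$i.dat"
done

# Combine the small files into a single compressed archive.
archive="$workdir/results.tar.gz"
tar czf "$archive" -C "$workdir" results

# List the archive contents to confirm everything was captured.
tar tzf "$archive"

# Command you would then run to copy the single archive to Fortress
# (printed only; replace myusername with your own login).
echo "scp $archive myusername@fortress.rcac.purdue.edu:"
```

Listing the archive before copying it is a cheap check that nothing was missed while the original files are still at hand.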

Fortress runs AIX 5.2 and uses DXUL 2.9 from EMC.

Obtaining an Account

Purdue faculty and staff, and students with the approval of their advisor, may request access to Fortress using the online Research Computing Account Request Form.

Login / SSH

To log in to Fortress, you may use fortress.rcac.purdue.edu via SSH.

You will not normally need to log in directly to Fortress. You may access your files there via SFTP, SCP, Windows Network Share/Drives (CIFS/SMB), or NFS. However, you may log in directly if the need arises.

SSH Client Software

All access to the RCAC Systems must be through secure (encrypted) connections. Standard telnet and FTP are not supported. SSH, SCP, and SFTP may be used instead.

Secure Shell or SSH is a way of establishing a secure channel between a local and a remote computer. It uses public-key cryptography to authenticate the remote computer and (optionally) to allow the remote computer to authenticate the user. It is usually used to log in to a remote machine and execute commands similar to telnet, but it also supports tunneling and forwarding of X11 or arbitrary TCP connections. The associated SFTP and SCP protocols may be used to transfer files. There are many SSH clients available, depending on the operating system you use.

Linux / Solaris / AIX / HP-UX / Unix:

  • "ssh", "sftp", and "scp" are pre-installed. Log in using "ssh myusername@servername".

Microsoft Windows:

Mac OS X:

  • "ssh", "sftp", and "scp" are pre-installed. You may start a local terminal window from "Applications->Utilities". Log in using "ssh myusername@servername".
  • MacSSH and MacSFTP
  • NiftyTelnet 1.1 SSH

Passwords

If you have received a default password as part of the process of obtaining your account, you should change it immediately upon login. This can be done from any terminal/SSH session with the command "passwd". You will have the same password on all RCAC systems, and if you change it on any one of them, it will change on all of them.

If you already have a Purdue career account login, then you will initially be given the same login and password as your career account. There is no need to change your career account password just because you have received an account on RCAC systems.

There is not currently any requirement regarding how often you must change your password within RCAC, but for security reasons it would be a good idea to change it at least once every 6 months, preferably every 3 months.

All passwords should:

  • Be something you have never used as a password before, on this or any other system.
  • Be easy for you to remember and difficult for others to guess.
  • Be at least eight characters long.
  • Be a combination of upper and lowercase letters, numbers, and symbols.
  • TIP: Choose a sentence or song lyric and abbreviate it: "The dog Samson ate 4 new slippers!" = "TdSa4ns!"

Never share your password with another user or make your password known to anyone else. Systems staff will NEVER ask for your password, by email or otherwise.

Email

There is no local mail delivery available on Fortress. All email sent to Fortress will be forwarded to mail.rcac.purdue.edu for delivery.

Login Shell

Fortress provides only a restricted shell to users. This means that only a limited set of commands is allowed. Here is a list of the more useful commands available on fortress.rcac.purdue.edu:

bunzip2   Uncompress bzip2-generated .bz2 files.
bzip2     Compress files into .bz2 files.
cat       Concatenate files together and/or display them on your terminal.
cd        Change the current working directory.
chmod     Change the permissions (modes) on files or directories.
cp        Copy one or more files or directories.
du        Summarize your disk usage.
echo      Echo arguments to your terminal.
exit      Log out.
grep      Search for a pattern within files.
gunzip    Uncompress gzip-generated .gz files.
gzip      Compress files into .gz files.
head      Display the first N lines of a file.
help      Display a shell help message.
logout    Log out.
ls        List the contents of directories.
mkdir     Create new directories.
mv        Move and/or rename files and directories.
passwd    Change your password (for all Purdue systems).
pwd       Display the current working directory path.
rm        Delete files or directories.
rmdir     Delete empty directories.
scp       Secure network copy using SSH.
source    Run files of commands.
tail      Display the last N lines of a file.
tar       Create/Extract/View tar-generated .tar file archives.
unzip     Extract a ZIP format file archive.
utstage   Request files be restaged into the disk cache from tape.
wc        Count the number of characters, words, and/or lines in a file.
zip       Create a ZIP format file archive.
zipgrep   Search a ZIP format file archive with grep.
zipinfo   Display information about a ZIP format file archive.

Storage Options

File storage on Fortress consists solely of long-term or permanent storage. Home directories on Fortress are the long-term or permanent storage file systems. The sections below describe them in more detail.

Home Directories

Your home directory is the default directory you are placed in when you log in.

On Fortress, your home directory will be in the /archive/fortress/home/ file system, and will not be the same as your home directory on any other RCAC system. Your home directory on Fortress is your long-term storage directory for all RCAC systems. You can find the path to this by logging in to fortress.rcac.purdue.edu, and typing "pwd":

$ pwd
/archive/fortress/home/myusername

Long-Term Storage

Long-term Storage or Permanent Storage is available to RCAC users on the DXUL/UniTree archival storage system, commonly referred to as "Fortress". DXUL (DiskXtender for Unix and Linux) and UniTree are software packages that manage a hierarchical storage system. Program files, data files, and any other files which are not used often, but which must be saved, can be put in permanent storage. Fortress currently has a 1.2 PB capacity. However, since two copies are retained of every file, the usable capacity is only 600 TB.

Recently used files smaller than 0.5 MB have their primary copy stored on low-cost disks, but the second copy is on tape or optical disks. This provides a rapid restore time to the disk cache. However, the long latency to access a larger file (usually involving a copy from a tape cartridge) makes Fortress unsuitable for use as active storage.

Two uses in particular perform poorly and can cause severe problems with the system itself:

  • DO NOT store any actively used files on Fortress.
  • DO NOT store large collections of small files on Fortress.

Do not use Fortress as a second home directory. Instead, use tar or some similar archive tool to combine all the smaller files you wish to store into a single large file first.

For active data storage you should use either local storage or a scratch file system. You may then copy any results you wish to archive to Fortress when computation is complete.
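
Before copying a directory to Fortress, it can help to see how many small files it holds. The sketch below counts files under the 0.5 MB threshold mentioned above, assuming 0.5 MB means 524288 bytes. The function name is a placeholder, and the script is meant for the system where the data currently lives, not for the Fortress restricted shell.

```shell
#!/bin/sh
# Sketch: count files smaller than 0.5 MB under a directory, to decide
# whether it should be bundled with tar before archiving.
# Assumes 0.5 MB means 524288 bytes.

count_small_files() {
    # $1: directory to scan
    # "-size -524288c" selects files smaller than 524288 bytes.
    find "$1" -type f -size -524288c | wc -l | tr -d ' '
}

# Example: report on the current directory.
echo "small files here: $(count_small_files .)"
```

If the count is large, combine the directory into a single tar archive first rather than copying the files individually.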

Fortress writes two copies of every file either to two tapes, or to disk and a tape, to protect against media errors. Unfortunately, Fortress does not automatically switch to the alternate copy when it has trouble accessing the primary. If it seems to be taking an extraordinary amount of time to retrieve a file (hours), please either email dxul-help@purdue.edu or call ITaP Computing Services 765-49-68238. We can then investigate why it is taking so long. If it is an error on the primary copy, we will instruct Fortress to switch to the alternate copy as the primary and recreate a new alternate copy.

On Fortress, the Unix "sticky bit" flag ("t" permission in "ls -l" output) is used to indicate that a file is not currently in the disk cache and would need to be retrieved from tape if accessed. This flag is provided for your convenience, so that before you read a file you can see whether doing so may mean waiting for a tape to be loaded. The sticky bit does not have its normal Unix meaning on Fortress, and attempting to alter it with "chmod" will have no effect.
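
As a rough illustration, the check described above can be scripted by looking at the last character of the permission string that "ls -l" prints. This is a sketch only: the function name and the file name are placeholders, and it assumes a shell where functions can be defined, such as on a system with Fortress mounted over NFS.

```shell
#!/bin/sh
# Sketch: report whether a path shows the "t" flag in its "ls -l" mode
# string, which on Fortress marks a file that is not in the disk cache.

needs_staging() {
    # $1: path to check. Returns 0 if the tenth character of the
    # "ls -ld" output (the last permission position) is t or T.
    case "$(ls -ld -- "$1")" in
        ?????????[tT]*) return 0 ;;
        *)              return 1 ;;
    esac
}

# Example use with a placeholder file name.
if needs_staging "results.tar.gz" 2>/dev/null; then
    echo "results.tar.gz is on tape; expect a delay when reading it"
else
    echo "results.tar.gz is in the disk cache (or does not exist here)"
fi
```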

Lost Long-Term Storage File Recovery

Data on Fortress is not backed up elsewhere in a traditional sense. New and modified files in the disk cache are migrated to tape within 30 minutes, and Fortress maintains two copies of every file on different media to protect against media failures, but there is no backup protecting against user changes.

If you remove or overwrite a file on Fortress, it is gone. You cannot request to have it retrieved.

However, the DXUL software provides a "trashcan" facility on Fortress. When you remove a file, the file is placed into a ".trash" directory in your Fortress home directory. The filename has a date and time stamp appended. It will remain in this ".trash" directory for roughly 4 days, after which it is permanently removed. To recover a file you accidentally deleted, simply locate it in your ".trash" directory and move it back to where it belongs. If you remove a file from the ".trash" directory, or if you wait 4 days or longer and the system removes it automatically, the file is permanently lost.
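
A recovery along these lines could be scripted as below. This is a sketch under assumptions: the exact form of the date and time stamp is not specified here, so the function simply matches any name that begins with the original file name, and the function and file names are placeholders. On Fortress itself, the same result can be had by hand with "ls" and "mv".

```shell
#!/bin/sh
# Sketch: move the most recently deleted copy of a file out of a trash
# directory. The stamp format after the name is an assumption; check
# the actual names in your own ".trash" directory with "ls -a".

restore_from_trash() {
    # $1: trash directory, $2: original file name, $3: destination directory
    newest=$(ls -dt "$1"/"$2"* 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "no trashed copy of $2 found" >&2
        return 1
    fi
    # Move it back under its original name.
    mv "$newest" "$3/$2"
}

# Example (placeholder names):
#   restore_from_trash "$HOME/.trash" results.tar.gz "$HOME/myproject"
```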

Environment Variables

There are many environment variables related to storage locations and paths which are automatically set for you upon log in. Use environment variables instead of actual paths whenever possible to avoid problems if the specific paths to any of these change. Some of the environment variables you should have are:

  • $USER: your username
  • $HOME: path to your Fortress home directory
  • $SSH_CLIENT: your local client's IP address
  • $TERM: type of terminal or terminal emulator being used

To use the value of an environment variable, put a dollar sign ($) in front of its name; the names listed above are all uppercase. Variables may be used on the command line or in any scripts in place of and in combination with hard-coded values:

$ ls $HOME
...

$ ls $HOME/myproject
...

$ ls $HOME/myproject/${USER}_data
...

You may find the value of any environment variable by using the "echo" command:

$ echo $USER
myusername

$ echo $HOME
/archive/fortress/home/myusername

The restricted shell on Fortress does not allow you to create or overwrite environment variables.
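
One detail worth knowing when combining variables with hard-coded text: if a variable is followed directly by letters, digits, or an underscore, the shell treats those characters as part of the variable name unless the name is wrapped in braces. The sketch below shows the difference. Because the restricted shell on Fortress does not allow creating variables, it is meant for a regular shell on another system, and "owner" is a placeholder variable used only for illustration.

```shell
#!/bin/sh
# Sketch: why braces matter when a variable is followed by text.
# "owner" is a placeholder variable used only for this illustration.
owner=myusername
unset owner_data

# Without braces the shell looks up a variable named "owner_data",
# which is unset here, so it expands to nothing.
without_braces="myproject/$owner_data"

# With braces the variable name ends at the closing brace.
with_braces="myproject/${owner}_data"

echo "$without_braces"
echo "$with_braces"
```

The first line printed is "myproject/" and the second is "myproject/myusername_data".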

Storage Quotas / Limits

There is currently no quota on Fortress disk use, but there will likely be quotas set in the near future. Although it may seem an infinite amount of space, we expect Fortress to fill up just like any other storage device. In addition to the option of adding more capacity (a larger robot with more tapes), we may take files that are rarely accessed and move them out of the robot to offline storage. Fortress would still know about these files, but they would require a human operator to reload them into the cache.

You will receive a monthly email report showing your current Fortress usage. If we need to move old data offline to make room for other users, this will be indicated in your report. Files belonging to active accounts will be retained for as long as possible, but no longer than ten years. Owners of files more than ten years old will be contacted to see if the files may be removed. Files belonging to deleted accounts will also be retained, but after the accounts have been terminated they will be inaccessible except by special request. These files will be kept for ten years or for as long as the media on which they are stored remains usable, whichever comes first.

Archive and Compression

There are several options for archiving and compressing groups of files or directories on RCAC systems. All of the following tools are provided:

  • zip
    Simple compression and file packaging utility.
    Examples:
      (compress file somefile.c)
    $ zip somefile.zip somefile.c
    
      (extract contents of somefile.zip)
    $ unzip somefile.zip
    
      (compress all files in a directory into one archive file)
    $ zip -r somefile.zip somedirectory/
    
      (compress all ".c" files in current directory and its subdirectories into one archive file)
    $ zip -r somefile.zip . -i \*.c
    
  • tar
    Saves many files together into a single archive file, and can restore individual files from the archive. Includes automatic archive compression/decompression options, and special features that allow tar to be used for incremental and full backups.
    Examples:
      (archive file somefile.c)
    $ tar cvf somefile.tar somefile.c
    
      (archive and compress file somefile.c)
    $ tar czvf somefile.tar.gz somefile.c
    
      (list contents of archive somefile.tar)
    $ tar tvf somefile.tar
    
      (extract contents of somefile.tar)
    $ tar xvf somefile.tar
    
      (extract contents of gzipped archive somefile.tar.gz)
    $ tar xzvf somefile.tar.gz
    
      (archive and compress all files in a directory into one archive file)
    $ tar czvf somefile.tar.gz somedirectory/
    
      (archive and compress all ".c" files in current directory into one archive file)
    $ tar czvf somefile.tar.gz *.c 
    
  • gzip
    Compression utility designed as a replacement for compress, with much better compression and no patented algorithms. The standard compression system for all GNU software.
    Examples:
      (compress file somefile - also removes uncompressed file)
    $ gzip somefile
    
      (uncompress file somefile.gz - also removes compressed file)
    $ gunzip somefile.gz
    
  • bzip2
    Strong, lossless data compressor based on the Burrows-Wheeler transform. Also available as a library.
    Examples:
      (compress file somefile - also removes uncompressed file)
    $ bzip2 somefile
    
      (uncompress file somefile.bz2 - also removes compressed file)
    $ bunzip2 somefile.bz2
    
  • compress
    Adaptive Lempel-Ziv compressor. Not often used today.

Windows users can work with these same formats using some of the following software:

  • 7-Zip
    Free Windows software package that can handle all the above formats.
  • WinZip
    Commercial Windows software package that can handle all the above formats.
  • WinRAR
    Commercial Windows software package that can handle all the above formats.

File Transfer

There are a variety of ways to transfer data to and from Fortress. Which you should use depends on several factors, including the ease of use for you personally, connection speed and bandwidth, and the size and number of files to be transferred.

FTP

FTP (File Transfer Protocol) is a simple data transfer mechanism. However, FTP was not designed to provide secure communications, and so FTP is no longer supported on any RCAC systems. Most modern FTP clients, however, support either SFTP or SCP, which are similar, secure protocols for file transfer. Try using one of the other methods described here instead of FTP.

SCP

SCP (Secure CoPy) is a simple way of transferring files between two machines that uses the SSH (Secure SHell) protocol. You may use SCP to connect to any system where you have SSH (login) access. SCP is available as a protocol choice in some graphical file transfer programs and also as a command line program on most Linux, Unix, and Mac OS X systems. SCP can copy single files, and will also recursively copy directory contents if given the "-r" option and a directory name.

Command-line usage:

  (to a remote system from local)
$ scp sourcefilename myusername@hostname:somedirectory/destinationfilename

  (from a remote system to local)
$ scp myusername@hostname:somedirectory/sourcefilename destinationfilename

  (recursive directory copy to a remote system from local)
$ scp -r sourcedirectory/ myusername@hostname:somedirectory/

Linux / Solaris / AIX / HP-UX / Unix:

  • The "scp" command line program should already be installed.

Microsoft Windows:

  • WinSCP is a full-featured and free graphical SCP and SFTP client.
  • PuTTY also offers "pscp.exe", which is an extremely small program and a basic SCP client.
  • Secure FX is a commercial SCP and SFTP client which is available free to Purdue students, faculty, and staff with a Purdue career account.

Mac OS X:

  • The "scp" command line program should already be installed. You may start a local terminal window from "Applications->Utilities".

SFTP

SFTP (Secure File Transfer Protocol) is a reliable way of transferring files between two machines. You may use SFTP to connect to most RCAC systems. SFTP is available as a protocol choice in some graphical file transfer programs and also as a command line program on most Linux, Unix, and Mac OS X systems. SFTP has more features than SCP and allows for other operations on remote files, remote directory listing, and resuming interrupted transfers. Command-line SFTP cannot recursively copy directory contents; to do so, try using SCP or a graphical SFTP client.

Command-line usage:

$ sftp -B buffersize myusername@hostname

      (to a remote system from local)
sftp> put sourcefile somedir/destinationfile
sftp> put -P sourcefile somedir/

      (from a remote system to local)
sftp> get sourcefile somedir/destinationfile
sftp> get -P sourcefile somedir/

sftp> exit
  • -B: optional, specify buffersize for transfer; larger may increase speed, but costs memory
  • -P: optional, preserve file attributes and permissions

Linux / Solaris / AIX / HP-UX / Unix:

  • The "sftp" command line program should already be installed.

Microsoft Windows:

  • WinSCP is a full-featured and free graphical SFTP and SCP client.
  • PuTTY also offers "psftp.exe", which is an extremely small program and a basic SFTP client.
  • Secure FX is a commercial SFTP and SCP client which is available free to Purdue students, faculty, and staff with a Purdue career account.

Mac OS X:

  • The "sftp" command line program should already be installed. You may start a local terminal window from "Applications->Utilities".
  • MacSFTP

LFTP

LFTP is a command line file transfer program for Linux and Unix systems. It supports SFTP, HTTP, and HTTPS file transfers. LFTP has additional features not provided by SFTP such as bandwidth throttling, transfer queues, and parallel transfers. It may be used interactively or scripted.

LFTP with parallel transfers can be much faster than SCP or SFTP, so its use is encouraged when possible.

LFTP is provided only on some RCAC systems. However, it is simply a client, so it is not needed on the remote machine involved in a transfer (the remote system need only support SFTP).

Interactive usage:

$ lftp myusername@hostname

         (transfer all ".dat" files from remote system to local)
lftp :~> mget *.dat

         (transfer "filename.dat" file from local system to remote)
lftp :~> put filename.dat

         (transfer a directory and all contents from remote
          system to local, using 5 connections in parallel)
lftp :~> mirror --parallel=5 remotedirectory localdirectory/

         (transfer a directory and all contents from local
          system to remote, using 8 connections in parallel)
lftp :~> mirror -R --parallel=8 localdirectory remotedirectory/

Batch usage:

  (specify all actions on command line)
$ lftp myusername@hostname -e "mget *.dat; quit"

  (specify all actions in the script file "mytransfer.lftp")
$ lftp myusername@hostname -f mytransfer.lftp
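
A script file for the second form might be created as follows. This is a sketch under assumptions: the "sftp://" URL form on the "open" line and the archive name "results.tar.gz" are illustrative choices, not something prescribed by RCAC. Because this script opens the connection itself, it would be run as "lftp -f mytransfer.lftp" with no hostname on the command line.

```shell
#!/bin/sh
# Sketch: write an LFTP script file that uploads one archive.
# The sftp:// URL form and the file names are assumptions.
script=mytransfer.lftp

cat > "$script" <<'EOF'
open sftp://myusername@fortress.rcac.purdue.edu
put results.tar.gz
EOF

# Show what was written. Run it later with: lftp -f mytransfer.lftp
cat "$script"
```

Keeping the transfer commands in a file makes a repeated archive job easy to rerun and review.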

Windows Network Drive / Share

If you run Windows (any version) on your personal computer, you may access your files on Fortress as a standard Windows network drive (the CIFS or SMB protocol). You may then drag and drop files and folders to and from Fortress as you would any local hard drive or USB drive.

However, many Windows programs cope poorly with long delays when opening a file, so you may get errors when accessing files on Fortress that are not in the disk cache. If so, wait a little while and try again, or use a different access mechanism to check the status of the file and wait for it to be reloaded from tape.

Do not use Fortress as a second home directory. Instead, use tar or some similar archive tool to combine all the smaller files you wish to store into a single large file first. For active data storage you should use either local storage or a scratch file system. You may then copy any results you wish to archive to Fortress when computation is complete.

Two uses in particular perform poorly and can cause severe problems with the system itself:

  • DO NOT store any actively used files on Fortress.
  • DO NOT store large collections of small files on Fortress.

To mount Fortress as a Windows network drive:

  1. Right-click on "My Network Places" to bring up the menu.
  2. Select "Map a network drive".
  3. Select an unused drive letter (perhaps "F" for Fortress or "X" or "U" for DXUL).
  4. Enter the Windows network path to your Fortress home directory in the "Folder" text box. This will be "\\fortress.rcac.purdue.edu\myusername".
  5. If your Fortress username is not the same as your Windows login username, click the link on this form for "Connect using a different user name." Then specify your Fortress username.
  6. We strongly recommend that you uncheck the box to "Reconnect at logon." Keeping Fortress mounted while not in use can hurt the overall performance of both your own system and Fortress.
  7. Click the "Finish" button.
You should now see a new drive under "My Computer" for your Fortress home directory.

Fortress Frequently Asked Questions (FAQ)

There are currently no FAQs for Fortress.