Effective Use of Research Storage

October 16, 2013
Carter, Coates, Conte, Hansen, Peregrine1, Radon, Rossmann

On Thursday, Oct 10, a BlueArc scratch fileserver suffered a filesystem failure that resulted in data loss on several scratch filesystems.

In light of this event, we'd like to take this opportunity to remind all of our cluster users of the most effective ways to use research storage.

Cluster Scratch Storage

Cluster scratch storage is for limited-duration, high-performance storage of data for running jobs or workflows. It is globally accessible across the cluster. Old data in scratch filesystems is occasionally purged to keep the filesystems from becoming fragmented or filling up. Scratch is intended as a space in which to run your jobs, not as long-term storage for data, applications, or other files.

Please keep in mind that every scratch filesystem (scratch95, scratch96, lustreA, lustreC, and lustreD) is engineered for capacity and high performance, and is not protected from data loss by any backup technology. While research computing scratch filesystems are engineered to be fault-tolerant and reliable, some types of failures can result in data loss.

Any sort of data loss on scratch, whether from an accidentally deleted file or a full filesystem failure, is not recoverable.

Other Types of Research Storage

Home directories and shared group storage, on the other hand, are optimized for medium-performance work such as editing files, developing and compiling source code, and installing applications, rather than for running jobs. They are protected from accidental deletion by filesystem snapshots, and lost files can be recovered with the "flost" command.

Fortress Archive

ITaP recommends that important data, research results, etc. be permanently stored in the Fortress HPSS archive, and copied to scratch spaces while being actively worked on. The "hsi" and "htar" commands provide easy-to-use interfaces into the archive.
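
As a rough illustration of that workflow, the sketch below wraps a few common "htar" and "hsi" invocations in small shell functions. The function names, the DRY_RUN switch, and the example names "myresults" and "myresults.tar" are hypothetical and chosen only for this example; by default the commands are printed rather than executed, so you can review them before running anything against the archive.

```shell
#!/bin/sh
# Sketch only: function names, DRY_RUN, and the example paths are
# assumptions made for illustration, not an official ITaP script.

# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Bundle a directory into a tar archive stored in Fortress.
# usage: archive_dir <local-directory> <archive-name>
archive_dir() {
    run htar -cvf "$2" "$1"
}

# List the contents of an archive stored in Fortress.
# usage: list_archive <archive-name>
list_archive() {
    run htar -tvf "$1"
}

# Extract an archive from Fortress into the current directory,
# for example after changing into your scratch space.
# usage: restore_archive <archive-name>
restore_archive() {
    run htar -xvf "$1"
}

# Store or retrieve a single file with hsi.
# usage: store_file <file>   /   fetch_file <file>
store_file() {
    run hsi put "$1"
}

fetch_file() {
    run hsi get "$1"
}
```

For example, "archive_dir myresults myresults.tar" prints the htar command that would archive the directory; setting DRY_RUN=0 on a system where htar and hsi are available runs the command instead of printing it.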

For more information on using Fortress, please visit the web site at http://www.rcac.purdue.edu/userinfo/resources/fortress/

If you have questions about the various types of research storage, please contact us at rcac-help@purdue.edu.

