Conte Cluster LustreD Scratch Filesystem Purging

March 31, 2014
Conte

Update:

Due to issues with the automated processes that index the Lustre filesystems, resumption of scratch purging has been postponed for one week. The first automated mailing of purge warnings will be sent on April 1, and the purge will occur on April 8.

Original Notice:

During the week following Spring Break, 2014 (March 24-28), scratch filesystem purging will begin on the Conte cluster's LustreD filesystem.

On March 25, the first automated mailing of purge warnings will be sent, with purges actually occurring on April 1, 2014.

Periodic purging of scratch filesystems is necessary to prevent the storage from filling up, reduce fragmentation, and ensure that I/O continues to perform as well as expected.

LustreD provides very large file and space quotas for Conte users and, when purging begins, will be purged under different parameters than the scratch filesystems on older ITaP clusters.

Rather than purging based on an individual file's creation time, LustreD will be purged based on the last access time of an individual file.

Any file not accessed in 90 days will be subject to purge.

As before, you can use the "purgelist" command to report which files are eligible for purging.
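The 90-day access-time rule can be illustrated with a short Python sketch. This is not the "purgelist" command and does not reproduce how the purge process indexes files; it only assumes that the access time reported by the operating system is a reasonable stand-in for the time the purge policy looks at. The names stale_files and PURGE_AGE_DAYS are hypothetical and chosen for this example. The "purgelist" report remains the authoritative list.

```python
import os
import time

PURGE_AGE_DAYS = 90  # threshold stated in this notice


def stale_files(root, max_age_days=PURGE_AGE_DAYS, now=None):
    """Yield (path, days_since_last_access) for files under root
    whose last access time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat reads metadata only, so it does not change atime
                atime = os.lstat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime < cutoff:
                yield path, (now - atime) / 86400
```

Calling list(stale_files(directory)) on one of your own scratch directories returns the files that this approximation considers older than the threshold, along with their age in days.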

It is our intention that this purge policy will make it easier for actively used datasets to remain in place and make your research computing more productive.

Notes on Cluster Scratch and Archival Storage

However, cluster scratch storage is intended for limited-duration, high-performance storage of data for running jobs or workflows. Old data in scratch filesystems must occasionally be purged to keep the filesystem from becoming fragmented or filling up. Scratch is a space in which to run your jobs; it is not meant for long-term storage of data, applications, or other files.

Please keep in mind that any scratch filesystem is engineered for capacity and high performance, and is not protected from data loss by any backup technology. While research computing scratch filesystems are engineered to be fault-tolerant and reliable, some types of failures can result in data loss.

ITaP recommends that important data, research results, etc. be permanently stored in the Fortress HPSS archive, and copied to scratch spaces while being actively worked on. The "hsi" and "htar" commands provide easy-to-use interfaces into the archive.
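As a rough illustration of archiving with "htar", the following Python sketch builds the command that bundles a directory into a tar archive stored on Fortress. The function name archive_to_fortress and the example file names are hypothetical. The sketch assumes "htar" is on your PATH and that you are already authenticated to the archive; by default it only returns the command without running it.

```python
import subprocess


def archive_to_fortress(source_dir, archive_name, dry_run=True):
    """Build (and optionally run) an htar command that stores
    source_dir as archive_name in the HPSS archive."""
    # -c create an archive, -v verbose listing, -f archive file name
    cmd = ["htar", "-cvf", archive_name, source_dir]
    if not dry_run:
        # Raises CalledProcessError if htar exits with a non-zero status
        subprocess.run(cmd, check=True)
    return cmd
```

For example, archive_to_fortress("results", "results.tar") returns the argument list for "htar -cvf results.tar results"; passing dry_run=False would execute it. The matching retrieval command is "htar -xvf results.tar".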

For more information on using Fortress, please visit the web site at http://www.rcac.purdue.edu/userinfo/resources/fortress/

Persistent Group Storage

If non-purged, disk-based storage is a requirement for your group's work, please consider ITaP's persistent group storage service. This service is well-suited for storing a research group’s data, results, applications, source code and anything else members may need to share with each other.

For more information on the Persistent Group Storage service, see http://www.rcac.purdue.edu/userinfo/resources/group/ and the article "New research group digital storage option offered to Purdue researchers".

Originally posted: January 22, 2014