Lustre scratch storage system unavailable

The Lustre storage system that provides scratch storage on the Rossmann and Coates Linux clusters (via /scratch/lustreA) failed at approximately 1:30pm Thursday, February 3. ITaP storage engineers are in MATH working on the problem, but we cannot yet say when the storage system will return to service. As a result, PBS job scheduling on Rossmann and Coates has been paused until the storage system is back in operation.

9pm Thursday, 2/3: Diagnostics run by ITaP and vendor storage engineers indicate that part of the disk system in one of the servers that make up LustreA suffered a hardware failure. Replacement parts are being express-shipped to Purdue and are expected to arrive Friday morning, 2/4.

11:30am Friday, 2/4: We have been notified that the replacement parts needed to repair the Lustre storage system are expected to arrive at Purdue early Friday afternoon, 2/4.

8pm Friday, 2/4: ITaP and vendor storage engineers are running diagnostics and checking the integrity of the Lustre storage system but are not yet ready to release it for production.

9pm Friday, 2/4: This outage has been resolved. The Rossmann and Coates Linux clusters' Lustre file system was returned to service and PBS job scheduling resumed at 9pm Friday, 2/4.
