Outages

  • Chilled water outage in MATH

    Update: As of about 11:00 am, the problem with the chilled water has been corrected, and scheduling has resumed on all RCAC clusters. Thank you for your patience. If you encounter any issues or have questions, please contact us at rcac-help@purdue.ed...

  • Chilled water outage in MATH

    Campus chilled water serving the MATH data center is experiencing above-normal temperatures, and as a precaution, scheduling on the Coates, Rossmann, Hansen, Carter, and Radon clusters has been stopped. Steele is not affected. There should be no impa...

  • Unexpected Power Outage in MATH

    Update: Noon, 1/8/13 The power issue in MATH has been resolved. Power has been restored to the nodes in the Coates-A subcluster affected by the outage. ITaP engineers have verified that the Coates-A subcluster is operating correctly, and have restart...

  • Scheduling paused on ITaP research clusters

    During scheduled network maintenance on network equipment connecting storage to ITaP clusters, all scheduling will be paused from 4-6pm. Running jobs will continue to execute, and new jobs may be submitted to PBS queues, but no new jobs will start u...

  • Unscheduled Power outage in Math Datacenter

    Update: 10:00pm Tuesday As of 8:30pm Tuesday 21 August 2012, the LustreB filesystem has been returned to full service. Our storage engineers, with the assistance of the vendor, have verified that the system is stable. If you encounter any issues, please co...

  • Unscheduled HPSS outage

    Update - April 11, 2012, 2:40pm: At around 2:40pm, ITaP engineers restored communications between the HPSS system and the tape library. Access to Fortress from Samba, HSI/HTAR and other methods has been restored. I apologize for the inconvenience th...

  • Unscheduled Samba Outage

    Update: 1:45pm As of 1:45pm this afternoon, systems staff have completed patching the samba servers used to access storage systems. You should now be able to connect to samba.rcac.purdue.edu for samba access to home and scratch directories and...

  • Partial outage affecting some Coates queues

    Update - 6:45 pm Tuesday, 10 April 2012 ITaP engineers have found and repaired the network issue that was affecting Coates nodes type B, C and E. Job scheduling has been resumed for all queues. If you encounter any problems, please report them to rc...

  • Unscheduled outage to MATH datacenter

    Update - 9:30pm, 4/1/2012: As of about 9:30pm, Sunday, 1 April, ITaP systems staff have returned Hansen to production status, and job scheduling is re-enabled. The scratch filesystem on Hansen has been restored with no apparent loss of files; if you...

  • PBS unavailable on Rossmann cluster

    Due to a network issue, the server running the PBS software for Rossmann is unavailable. While the server is unavailable, attempts to use PBS commands ("qsub", "qstat", "pbsnodes") will fail with error messages like: qst...

  • Unscheduled outage to Rossmann cluster

    At approximately 10:50pm, Thursday, March 15, the power distribution to large portions of the Rossmann cluster failed. These feeds also power the cluster's login nodes, and while those are down, Rossmann is unavailable for use. Power was r...

  • Lustre unavailable on Hansen cluster

    Update: As of 9:45pm, Lustre is back in production and scheduling has resumed on Hansen. Original Notice: As of approximately 8:00pm February 7, an issue was found with the Lustre filesystem on Hansen, making the filesystem unavailable for use. ITaP engine...

  • Coates Scheduling unavailable

    This morning, the PBS system on Coates developed an issue with the storage holding its internal state. While systems engineers are working on recovering it from backup, any new job submissions will not be possible, nor will you be able to query job st...

  • Fortress: ADIC Scalar 10k tape robot unavailable (1/4/2012)

    Update - 1/9/2012 The repairs to the ADIC tape library have been completed and Fortress' tape functionality is back in operation. Update - 1/6/2012 Following work today by vendor engineers, the latest estimate for the ADIC tape robot's return to serv...

  • Hansen: unscheduled outage to Lustre scratch

    Update The error condition on the Lustre filesystem has been cleared, and Hansen is back in production and accepting new jobs. Jobs already running should have resumed at the point where they were blocked waiting when the Lustre error occurred. This...

  • Fortress: ADIC Scalar 10k tape robot unavailable

    Update 12/2/11 (4:15pm) The tape robot has been returned to service and Fortress is back in production. Please contact us at rcac-help@purdue.edu if you encounter further issues. Update 12/2/11: The ADIC Scalar 10K robot is temporarily down again wit...

  • Unscheduled LustreA Outage

    The LustreA scratch filesystem, used by Rossmann and Coates, suffered an unknown failure sometime in the early morning of November 15, 2011. LustreA was returned to normal operation at about 10:30am. Any jobs on those systems run overnight before t...

  • Coates PBS scheduler issues

    This week, ITaP engineers have been troubleshooting issues with the Coates cluster, with the most common symptom being PBS jobs that abort or restart after some period of run time. Late yesterday afternoon, a change was made to the cluster's networki...

  • Aug. 5-17 research computing system outage FAQ

    What’s happening? ITaP’s research computing systems will be shut down beginning at 5 p.m. Friday, Aug 5, including the Rossmann, Coates, Moffett and Radon clusters. The supercomputers are scheduled to be off until Wednesday, Aug. 17. Why? An outage r...

  • Lustre scratch storage system unavailable

    The Lustre storage system that provides scratch storage on the Rossmann and Coates Linux clusters (via /scratch/lustreA) failed at approximately 1:30pm Thursday, February 3. ITaP Storage Engineers are in MATH working on the problem, but we are curre...