Unscheduled outage on Conte
May 11, 2017 8:00am – 2:40pm
As of 2:35 pm, the Conte cluster has been returned to service. Scheduling has resumed in all queues.
The source of the problem has been identified and a fix is underway. We anticipate returning Conte to service by 3pm today.
The Conte cluster is currently experiencing issues related to an InfiniBand library. Our system administrators are working to track down and fix the problem.
Job scheduling on Conte has been paused while engineers address the issue.