Unscheduled outage on Conte

May 11, 2017  8:00am – 2:40pm

As of 2:35pm, the Conte cluster has been returned to service. Scheduling has resumed in all queues.


The source of the problem has been identified and a fix is underway. We anticipate returning Conte to service by 3pm today.

Original message

The Conte cluster is currently experiencing issues related to an InfiniBand library. Our system administrators are working to track down and fix the problem.

Job scheduling on Conte has been paused while engineers address the issue.

Originally posted: May 11, 2017  11:10am