Unscheduled Bell outage

UPDATE (January 21, 2022, 3:46pm EST):

As of 3:46pm EST, Bell's Slurm database problems were resolved and the Bell cluster has been returned to normal service. We apologize for the disruption of service. Please report any issues to rcac-help@purdue.edu.

ORIGINAL:

The Bell cluster began experiencing issues with the scheduler database around 11:35am EST. The problem manifests as freezing and/or "socket timed out" and "Unable to contact slurm controller" error messages when running the usual Slurm commands (squeue, sbatch, etc.).

Engineers are currently diagnosing the issue and are working to identify a fix.

We will provide an update by 5pm.