Bell will be unavailable Tuesday night through Thursday

October 20, 2020  8:00pm – October 23, 2020  8:00pm
Bell

UPDATE: October 23, 2020  7:46pm

As of 7:45 pm, approximately half the cluster, including the highmem nodes, has been returned to early access testing status. Benchmarking and configuration testing will continue on the other half.

We will close this outage notice but will continue to update Bell early users on its status.


UPDATE: October 23, 2020  6:58pm

As of 7:00 pm, reconfiguration of the cluster for partial return to service is ongoing. We will post an update by 10:00 pm.


UPDATE: October 23, 2020  4:07pm

Some benchmarking and configuration testing is still ongoing, but we will return approximately half the cluster for early access user testing. Reconfiguration for this use is under way, and we will update the status by 7:00 pm tonight.


UPDATE: October 23, 2020  11:49am

Benchmarking and testing of inter-node communication is ongoing. We will post an update by 4:00 pm today.


UPDATE: October 23, 2020  10:00am

As of 10:00 am, benchmarking is still ongoing. We will provide another update by noon today.


UPDATE: October 22, 2020  8:22pm

As of 8:00 pm Thursday, benchmarking on the Bell cluster is ongoing. We will extend the outage so that, once benchmarking is done, our Engineering team has sufficient time to reconfigure the cluster for production use and test it for usability.

We will update this article by 10:00 am Friday, 23 October.


ORIGINAL: October 19, 2020  2:54pm

The Bell Cluster will be unavailable from October 20, 2020 at 8:00pm until October 23, 2020 at 8:00pm. During this time, our Engineering team will be working with vendor representatives to complete benchmarking steps and finalize the cluster's internal configuration.

During this time, users will still be able to log in for limited code testing, but jobs will not run on the compute nodes.

Originally posted: October 19, 2020  2:54pm