Scratch Issues on Carter

October 6, 2015 10:00am – October 30, 2015 10:00am
Carter

October 30, 2015 11:00am

ITaP engineers have made additional timeout changes to the scratch filesystem, which have increased stability. Further work is being scheduled for Tuesday, December 1, 2015, from 7:00am to 7:00pm.

October 8, 2015 5:00pm

An emergency reboot of the Carter scratch servers needs to be performed on October 9, 2015 at 2:00pm. Any job attempting to use scratch storage during this time will block for I/O, and some jobs may fail.

Original

Over the last two weeks, we have seen sporadic issues communicating with Carter's new scratch storage hardware. Any jobs on Carter attempting to use scratch storage may periodically block for I/O or, in some cases, fail.

Our engineers and our storage vendor are continuing work to address these issues, but we are not yet certain of the root causes or when we will fully return to normal operations.

We will keep this news article updated as we learn more.

Originally posted: October 5, 2015 10:55am