Outages and Maintenance
- Multiple clusters have been powered off in the MATH G109 datacenter due to a water issue in the building. Affected systems are Bell, Brown, Geddes, Gilbreth, and Negishi. We will provide an update by 5:00 PM today.
- Fortress Tape Library Maintenance: The Fortress Archive Tape Library will be unavailable Friday, October 13, 2023 from 8:00am - 5:00pm EDT for preventative maintenance. During this time, the Tape Library will have 2 new drives installed. Also, maintenance and software updates will...
- Fortress Archive Monthly Maintenance: The Fortress Archive will be unavailable Wednesday, October 4, 2023 from 8:00am - 12:00pm EDT for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transf...
- The Weber cluster will be unavailable beginning Wednesday, September 27th, 2023 at 8:00am for scheduled maintenance. The cluster will return to full production by Wednesday, September 27th, 2023 at 5:00pm. During this time, Weber will have additio...
- Scheduling Paused on Negishi Cluster: At about noon today (Tuesday 12 September), we discovered an issue with the scheduler database related to the power outage last Sunday. Scheduling on Negishi has been paused to allow for work on correcting the database problem. We will have an upda...
- Anvil is experiencing more issues related to the power outage yesterday in the Purdue Data Center. Users are currently unable to log in via any method (SSH, Open OnDemand, etc.). Engineers have been dispatched to resolve the issue. This post will be up...
- Update: As of 3:45pm, the Bell cluster has returned to production status. Scheduling is still paused on the Negishi cluster, and we will have an update by 5:00pm EDT. The Bell and Negishi clusters began experiencing issues with power around 1:00pm EDT...
- Fortress Archive Monthly Maintenance: The Fortress Archive will be unavailable Wednesday, September 6, 2023 from 8:00am - 12:00pm EDT for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any tran...
- The Scholar cluster will be unavailable Wednesday, August 16, 2023 from 8:00am - 5:00pm EDT for scheduled maintenance. The cluster will return to full production by Wednesday, August 16th, 2023 at 5:00pm EDT. During this time, the following changes w...
- Purdue IT network engineers will be troubleshooting the network connection on Negishi. During this time, the network will be failed over to a backup path, and then moved back to the primary upon conclusion of the work. Impact should be minimal. In t...
- Fortress Archive Monthly Maintenance: The Fortress Archive will be unavailable Wednesday, August 2, 2023 from 8:00am - 12:00pm EDT for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transfe...
- Data Depot degraded performance on RCAC clusters: Users of Data Depot on RCAC clusters are currently experiencing significant performance degradation. The symptoms manifest as delays in listing or accessing files in /depot, significant lags in terminal sessions (especially if you have Data Depot in...
- Unscheduled Hammer OnDemand Outage: Open OnDemand services for the Hammer cluster are currently offline. Engineers are investigating a boot disk failure on the server that hosts the gateway.hammer.rcac.purdue.edu virtual machine.
- Hammer Cluster Maintenance 7/19: The Hammer cluster will be unavailable Wednesday, July 19th at 8:00am for scheduled maintenance. The cluster will return to full production by 5:00pm on Wednesday, July 19th. Any Slurm jobs which request a walltime which would take them past 8:00am 7...
- Fortress Archive Monthly Maintenance: The Fortress Archive will be unavailable Wednesday, July 12, 2023 from 8:00am - 2:00pm EDT for scheduled monthly maintenance (first Wednesday of every month). NOTE: This month the maintenance is delayed one week because of the Fourth of July holiday...
- Unscheduled Hammer Slurm outage: The Hammer cluster began experiencing issues with the Slurm scheduler around 5:00am, Thursday, July 6th. The Slurm scheduler is non-responsive; as a result, jobs will fail to schedule. Desktop and SSH access to Hammer login nodes is still available,...
- The Anvil system will be unavailable from Wednesday, June 28th, 2023 at 8:00am until Thursday, June 29th, 2023 at 8:00am EDT for scheduled maintenance. Any Slurm jobs which request a walltime which would take them past Wednesday, June 28th, 2023 at 8:00am E...
- BoilerKey Transition to Purdue Login: Overnight on June 26-27th (Monday-Tuesday), all Purdue systems which use BoilerKey, including RCAC clusters and other systems, will switch to the new Purdue Login. For more information about this change, please see the following documentation: https:...
- The Negishi cluster will be unavailable Wednesday, June 21, 2023 from 8:00am - 5:00pm EDT for scheduled maintenance. The cluster will return to full production by Wednesday, June 21st, 2023 at 5:00pm EDT. During this time, Negishi will have the opera...
- The Geddes cluster began experiencing issues overnight. Engineers are currently diagnosing the issue and are working to identify a fix. Workloads will be unavailable while this issue is being addressed. We will provide an update by 12 PM.