Brown
- 
The following are highlights and answers to frequently asked questions about the new Spack-based software stacks on ITaP Community Clusters. How do I use the new modules? The new software stack is now the default (i.e. this is what you see when you...
 - 
Community Cluster Life Cycles Extended to 6 Years
For the last several years, we've heard your input around longer lifespans for community clusters, and now, thanks to the end of Moore's Law, it makes sense to begin adjusting to a longer lifespan. Following an announcement July 31 at the annual Comm...
 - 
BoilerKey and SSH Key Login to Clusters FAQ
As explained in the news article on Requiring BoilerKey or SSH key authentication on Community Clusters, all clusters will now be requiring BoilerKey or SSH key authentication in order to log in to them, effective mid-August 2020. Here are some comm...
 - 
As of 12:30pm EDT all the clusters are back in production. If your job crashed during the outage, please resubmit it. We are currently experiencing an outage across the community clusters (Brown, Gilbreth, Halstead, Hammer, Rice, Scholar, Snyder, WC...
 - 
Requiring BoilerKey or SSH key authentication on Community Clusters
During Aug 17-20th, 2020, due to immediate security concerns, we will be changing community cluster access to require BoilerKey two-factor authentication (2FA) for all direct SSH or ThinLinc desktop access to each cluster and will no longer support p...
 - 
The Brown cluster will be unavailable Wednesday, August 19, 2020 at 8:00am EDT for scheduled maintenance. The cluster will return to full production by %enddatetime%. During this time, Brown will have the operating system patched and a maintenance up...
 - 
Home and Applications Filesystem Maintenance - All Clusters
Most of the research computing clusters (Brown, Gilbreth, Halstead, Hammer, Rice, Scholar, Snyder, WCERES, Workbench, and WSC Hadoop) as well as some other minor systems will be unavailable beginning Tuesday, November 3rd, 2020 at 9:00am EST, for...
 - 
The Brown cluster began experiencing issues with its job scheduler around 4:00pm EST. The problem manifests itself as Slurm-related commands (slist, squeue, sinteractive, sbatch, etc.) being slow, unresponsive, or timing out. Queue selection dialogs in...
 - 
Research Computing Holiday Break
Research Computing personnel will observe the university winter break from 5:00pm EST on Friday, December 18th, 2020, and will resume normal business hours on Monday, January 4th, 2021. During this time, Research Computing services will continue...
 - 
Access to RCAC Resources During ITaP Central Authentication Outage
On Sunday, December 27th, 2020, ITaP staff will perform major upgrades to the central authentication infrastructure. All applications that require logging in with BoilerKey or Career Account credentials will be unavailable Sunday, December 27, 2020 f...
 - 
The Bell, Brown, Gilbreth, Halstead, Rice, Scholar, and Snyder clusters began experiencing issues with their Data Depot mounts around 10:00pm EST. Engineers are currently diagnosing the issue and are working to identify a fix. To avoid job losses for...
 - 
The Data Depot storage server began experiencing issues around 3:00pm EST on Thursday, February 4th, 2021. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused on all clusters while this issue...
 - 
Unscheduled outage on multiple clusters
Due to problems with the cooling system in the MATH datacenter, the CMS, Bell, Brown, Gilbreth, Halstead, WCERES, and WSC Hadoop clusters began experiencing issues around 4:00pm EDT. Multiple front-end, compute and storage services are affected. Engineer...
 - 
Whole-Floor Cluster Maintenance
The majority of Research Computing computational resources (Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, WCERES, Workbench, and WSC Hadoop clusters) will be unavailable Tuesday, May 11, 2021 at 5:00pm EDT for Data Depot migration work. The clust...
 - 
Intermittent Access Failures on Data Depot
As of Thursday, June 17th, 2021 at 11:00am EDT, users of community clusters may experience intermittent "permission denied" errors while trying to access their files on Data Depot. Errors may come and go, and may appear on both login and c...
 - 
Scheduling Paused on Multiple Clusters
At about 4:00 pm today (Wednesday, 21 July, 2021) System Engineers found an issue with the schedulers on the Bell, Brown, Gilbreth, Halstead, and Scholar clusters. Job scheduling has been paused while this is being investigated. Symptoms of this pro...
 - 
The Brown cluster began experiencing issues with cooling around 9:00pm EDT. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused while this issue is being addressed. We will provide an update...
 - 
RCAC Whole-Floor Downtime and Power Work
The majority of the Research Computing computational resources will be unavailable July 30, 2021 7:00am - August 1, 2021 12:00pm EDT for a whole-floor downtime due to electrical power work in MATH and POD data centers. Along with a required preven...
 - 
Unscheduled Brown, Hammer and Weber outage
The Brown, Hammer, and Weber clusters began experiencing issues with cooling in the POD data center around 11:00am EDT. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused while this issue is...
 - 
Unscheduled Brown and Hammer outage
The Brown and Hammer clusters began experiencing issues with cooling in the POD data center around 5:40pm EDT. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused while this issue is being add...