Outages
-
ECN services outage - ITaP Research Computing systems impacted
Engineering Computing Network (ECN), in coordination with Physical Facilities, will conduct a planned power outage in the MSEE building from 8am until noon on Saturday, January 10th, 2015. No one will be allowed to enter the building during this...
-
Power has been restored, and the Peregrine1 cluster has been restarted and is in full production mode. The Calumet campus is experiencing a power issue that has rendered Peregrine1 unavailable. Contractors are on site working to restore service, but t...
-
Diminished network connectivity to research computing resources
Due to a network issue at the Indiana GigaPOP, connectivity to RCAC resources from off campus is intermittent. Access to the research computing web site, Globus, Thinlinc, or other research computing resources may be impacted. To work around this iss...
-
LustreD (/scratch/conte) unavailable
UPDATE: LustreD has been returned to service and scheduling has been resumed as of about 4 pm Saturday, October 4th, 2014. The LustreD filesystem, serving the Conte cluster, has become unavailable as of about 10:55 am Saturday, October 4th, 2014. Sys...
-
UPDATE: Fortress was successfully returned to service as of 7:35 pm Wednesday, 15 July. As of 8:30am on July 15, 2014, the Fortress HPSS Archive is unavailable due to a hardware issue. Access to Fortress via HSI, HTAR, Globus, or CIFS is not availabl...
-
Scheduling Paused on Hansen and Carter
The scratch filesystem on Hansen and Carter is currently unavailable due to a hardware issue. Attempts to access scratch will block until the filesystem is back online. Job scheduling on Hansen and Carter has been paused while storage engineers addre...
-
The Lustre D filesystem, serving the Conte cluster, has become unavailable as of about 8:00 pm Thursday 13 Feb, 2014. System engineers are working to bring the system back to 100% operation. Currently running jobs should be able to continue, but sch...
-
Lustre D filesystem unavailable
Update - 2:25pm, 12/16/2013 The LustreD scratch filesystem has been returned to service and both the filesystem and scheduler appear to be working properly. Conte has been returned to normal production service as of 2:20pm. Update - 10:30am, 12/16/2...
-
All ITaP Research Computing systems are currently experiencing an issue with accessing network filesystems. A case has been opened with our vendor as ITaP engineers troubleshoot the issue. Cluster users may experience issues accessing files in /home,...
-
The Fortress HPSS Archive is offline due to issues with its storage systems relating to the power loss on the West Lafayette campus in the wake of the severe weather Sunday night. Engineers are investigating the problem now, but until this is reso...
-
Nearly all major clusters operated by ITaP Research Computing are stopped due to issues with their storage systems relating to the power loss on the West Lafayette campus in the wake of the severe weather Sunday night. This includes: Conte, Carter,...
-
Update: 11:00pm, Nov. 12, 2013 ITaP storage engineers have returned the offline hardware to production and LustreC is back in production. Queues on Hansen and Carter have been restarted as of 11:45pm. Update: 5:00pm Following consultation with vendor...
-
Partial scratch96 filesystem outage
In the evening of 10/10/2013, the fileserver providing the "scratch96" filesystem serving some users of the Steele and Radon clusters suffered a permanent failure to its 2nd tier storage. This means that files on scratch96 that are older th...
-
Fortress HPSS Archive Unavailable
Update - 10:15 am Fortress is back in full production. Original Message: As of 8:00am, Thursday, September 19, the Fortress HPSS is temporarily unavailable due to issues with communicating with its tape drives. Storage engineers are working to return...
-
LustreC filesystem unavailable
Update: May 13, 2013 11:00pm: LustreC has been returned to service. Carter, Hansen, and Peregrine1 are back in production with queues enabled. Update: May 13, 2013 3:00pm: storage engineers are continuing to work with vendor support to return Lustre...
-
Resolved: As of about 4:45pm ET, the connectivity issue affecting the Fortress archive has been resolved. The HPSS archive is back in full production. If you encounter any issues, please contact us at rcac-help@purdue.edu Update: ITaP Storage Enginee...
-
Network outage affecting Peregrine1 cluster
On April 24, 2013, network engineers will be relocating fiber optics that connect the Peregrine1 cluster to infrastructure in West Lafayette. This outage is scheduled for 12:00am through 5:00am. This will leave Peregrine1 unable to run jobs. Any PBS j...
-
Scheduling paused on Carter cluster
Update: 8:12pm Scheduling on Carter has been resumed, and Carter is back in full production. Original Message: Beginning the morning of April 16, a number of compute nodes on the Carter cluster are experiencing a connectivity issue. While ITaP engine...
-
Update: ITaP engineers have corrected the issue affecting the LustreC filesystem. The system is back in production. Job scheduling on Carter, Hansen and Peregrine1 has been restarted. As always, thank you for your patience. If you encounter any issue...
-
As of 9:00am, we are seeing a problem with the LustreC scratch filesystem that serves Carter, Hansen, and Peregrine1. To prevent any more jobs from running into this, we have temporarily suspended scheduling of new jobs, though you may still submit to...