Fortress Hardware Move and Software Upgrade

March 23, 2023 7:28pm EDT UPDATE:

Access to Fortress via Globus, HSI, HTAR, SFTP and Samba is back. The web page to generate keytabs is now working as well.

There may be a few more glitches, but we will handle those as needed. This upgrade is closed.

Let us know of any issues at rcac-help@purdue.edu. Thank you for your patience.

March 21, 2023 5:29pm EDT UPDATE:

Users continue to have intermittent problems with some access methods. Engineers continue to work on finding and fixing those problems.

We will update again by Thursday morning as things progress.

March 21, 2023 10:10am EDT UPDATE:

Fortress is back in service. We are experiencing some unexpected problems with Globus file sharing, for which we have opened a ticket. There are also some issues with the web page to generate keytabs for HSI, but we decided to return to service and address those afterward.

Please email rcac-help@purdue.edu with any problems.

March 20, 2023 5:35pm EDT UPDATE:

We are still waiting for a very large database operation to complete before bringing the system up. We expect it to complete overnight. If it does, we will start bringing Fortress back into service.

Further news by noon tomorrow.

March 19, 2023 6:41pm EDT UPDATE:

We are currently waiting for a very large database operation to complete before bringing the system up. About a third of the 65 million transactions have completed, but the rest may take another day or more to finish. Entries still remaining could cause errors in staging from tape, so we are holding off on bringing the system back.

Further news around 6PM tomorrow.

March 18, 2023 5:33pm EDT UPDATE:

We are currently waiting for a very large database operation to complete before bringing the system up. Unfortunately, this may take a very long time. The operation is required because of a disk cache array failure. No data was lost, but the database currently believes that about 65 million files reside on a disk cache that no longer exists.

Further update by 6PM tomorrow.

March 18, 2023 9:01am EDT UPDATE:

HPSS Engineers have found some of the problems with the archive and work continues today to bring the archive back to service.

Update by 8PM tonight.

March 17, 2023 4:27pm EDT UPDATE:

Numerous problems have delayed the return to service. RCAC engineers continue to work with HPSS engineers to resolve them.

We will update by 10AM tomorrow. Sorry for any inconvenience caused by this extension.

ORIGINAL:

Fortress Downtime and Upgrade: March 15, 2023 6:00AM – March 17, 2023 5:00PM

Fortress will be down Wednesday, March 15, 2023 through Friday, March 17, 2023. During that time, some hardware will be moved to another data center and most servers will be replaced with new ones in the new data center. This move will place the servers for Fortress in close proximity to the new tape library that was put into production last year.

Changes during this downtime include:

  • Upgrade to a newer OS (RHEL6 -> RHEL8)
  • HPSS major release upgrade (v7.53 -> v9.3)
  • HSI/HTAR upgrade (v6.03 -> v9.3)
  • Change from Kerberos authentication to Unix authentication for HSI/HTAR
  • New core servers and movers as well as new client servers for SFTP and Globus
  • 40Gb networking

How does this impact you?

  • With the authentication changes, the Kerberos keytabs used in the past will be invalid. The first time you run HSI or HTAR on the new system, it will generate a new Unix keytab. Users outside the RCAC clusters can use the usual website to download a new key, or copy a cluster-generated keytab.
  • The new Fortress cluster will be on a new subnet, so adjustments will have to be made to departmental firewalls in order to use HSI/HTAR external to RCAC. See here for more information about the firewall settings.
  • Any HSI/HTAR or Globus transfers in progress when the maintenance window opens will be terminated and will not be recoverable.
  • New versions of HSI and HTAR will be released. The old versions will not work. New RPMs for RHEL will be made available here prior to the move. Note that the new binaries won’t work until after the move.

To help this upgrade go smoothly, we ask users to refrain from initiating large data transfers into or out of the archive starting Monday, March 13th. Before some of the hardware can be moved, the disk caches must finish migrating their files to tape. Large transfers into the system in particular will slow down the hardware move, since that data will need to be migrated before the hardware can be turned off.

Please let us know of problems or concerns by emailing rcac-help@purdue.edu. See this RCAC news post for updates and more information.
