Data Security Standards
The Protected Data Filesystem (PDFS) has been reviewed by the Purdue System Security Information Assurance team and found to meet or exceed the requirements for controlled access to dbGaP data, human genomics data, and data requiring similar levels of protection.
The PDFS is a shared Linux computing system physically located in the Purdue Research Data Center. Purdue IT facilities are configured with a high level of physical security. All building access is controlled by the Purdue card office and is badged and logged. All Purdue IT facilities and processes are certified to ISO 9000, 27001, and 20000-1.
All data is stored on RAID arrays attached to file servers on a private, non-routable network. Protected data sets are only accessible from within Purdue HPC systems. None of those servers are directly accessible from the Internet. All of the community cluster internal networks are isolated from the Internet on private networks, with login nodes protected by firewall rules. The research network entry points are further protected with intrusion detection systems. All servers within RCAC are additionally protected by local firewalls.
All data is stored in directories (folders) with Linux file access controls restricting access to owner and group. Group membership is set by the owner. The top-level permissions on these directories are set by the system and cannot be changed by individual users. Groups and accounts are reviewed annually by the principal investigator.
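As an illustration of the owner-and-group restriction described above, the following is a minimal Python sketch that checks whether a directory grants any permissions to users outside its owner and group. It is not part of any RCAC tooling; the function names `is_owner_group_only` and `describe` are hypothetical, and the check only inspects the standard Linux permission bits (it does not look at ACLs or at how the PDFS itself enforces its top-level permissions).

```python
import os
import stat


def is_owner_group_only(path):
    """Return True if `path` grants no permissions to 'other' users.

    Hypothetical helper for illustration; inspects only the classic
    Linux mode bits, not ACLs or extended attributes.
    """
    mode = os.stat(path).st_mode
    # stat.S_IRWXO masks the read, write, and execute bits for "other".
    return (mode & stat.S_IRWXO) == 0


def describe(path):
    """Return the ls-style permission string for `path`, e.g. 'drwxrwx---'."""
    return stat.filemode(os.stat(path).st_mode)
```

A directory with mode `770` (`drwxrwx---`) would pass this check, while one with mode `775` (`drwxrwxr-x`) would not, because it lets users outside the group read and traverse it.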
All user access to the system is password controlled. All users of the system are bound by Purdue IT Policies and Standards. Remote access to the servers is via encrypted transport (i.e., SSH). No data is exported to non-compliant systems.
Privileged access accounts are approved by RCAC staff, documented, and restricted to the specific staff members responsible for maintaining the cluster. All privileged access is logged. All system components are kept up to date with security patches.
Specific Example Approved Datasets (as of Summer 2024)
- NIH dbGaP
- UK Biobank
- Human Genomic Data
Cybersecurity Standards and Dataset Approval Process
Purdue IT high-performance computing resources are built and operated in line with the NIST SP 800-223 "High-Performance Computing Security: Architecture, Threat Analysis, and Security Posture" best practices, are approved for data subject to the NIH Genomic Data Sharing Policy, and are certified to ISO 27001.
Data security requirements are driven by the contract and by the data use, material transfer, or data transfer agreement. New data use agreements are reviewed by contract analysts and Purdue System Security (PSS) Information Assurance analysts, and matched to IT resources.
Sponsor-specific data security requirements must be reviewed by PSS analysts before any data is uploaded to the PDFS.