- Installed: 09/05/2014
- Retired: 08/31/2018
Overview of Hathi
Hathi was a shared Hadoop cluster operated by ITaP and available to partners in Purdue's Community Cluster Program. It went into production on September 8, 2014.
Hathi consisted of two components: the Hadoop Distributed File System (HDFS), and a MapReduce framework for job and task tracking.
The Hadoop Distributed File System (HDFS) was a distributed file system designed to run on commodity hardware. While it shared many similarities with existing distributed file systems, the differences were significant: HDFS was highly fault-tolerant, was designed for deployment on low-cost hardware, and provided high-throughput access to application data, making it well suited to applications with large data sets. To enable streaming access to file system data, HDFS relaxed a few POSIX requirements.
A Hadoop cluster had several components:
- Name Node
- Resource Manager
- Data Node
- Node Manager
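The division of labor among these components follows the MapReduce model: map tasks emit intermediate key-value pairs, the framework shuffles them by key, and reduce tasks aggregate each group. The sketch below is a hypothetical, single-process word-count example illustrating that model in plain Python; it is not code that ran on Hathi, and the function names are illustrative only.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every input document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle: group intermediate values by key, as the framework
    does between the map and reduce stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["hadoop stores data", "hadoop processes data"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts)  # {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```

On a real Hadoop cluster the map and reduce tasks run in parallel across Data Nodes, with the Resource Manager and Node Managers scheduling them; the shuffle moves data over the network rather than through an in-memory dictionary.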
Hathi Detailed Hardware Specification
|Sub-Cluster|Number of Nodes|Processors per Node|Cores per Node|Memory per Node|Storage per Node|Retired in|
|---|---|---|---|---|---|---|
|A|4|Two 8-core Intel E5-2650v2|16|256 GB|48 TB|2018|