
Hathi

Overview of Hathi

Hathi is a shared Hadoop cluster operated by ITaP and available to partners in Purdue's Community Cluster Program. Hathi went into production on September 8, 2014. The cluster consists of 6 Dell compute nodes, each with two 8-core Intel E5-2650v2 CPUs, 32 GB of memory, and 48 TB of local storage, for a total cluster capacity of 288 TB. All nodes have 40 Gigabit Ethernet interconnects and a 5-year warranty.

Hathi consists of two components: the Hadoop Distributed File System (HDFS), and a MapReduce framework for job and task tracking.

The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems, but the differences are significant: HDFS is highly fault-tolerant, is designed to be deployed on low-cost hardware, and provides high-throughput access to application data, making it suitable for applications with large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data.
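
HDFS achieves its fault tolerance by splitting each file into large fixed-size blocks and replicating every block across several Data Nodes. The arithmetic can be sketched as follows; note that the 128 MB block size and replication factor of 3 are common Hadoop defaults assumed here for illustration, not documented Hathi settings:

```python
# Sketch of how HDFS stores a file: the file is split into fixed-size
# blocks, and each block is replicated across several Data Nodes.
# Block size (128 MB) and replication factor (3) are common Hadoop
# defaults, assumed here for illustration only.
import math

BLOCK_SIZE = 128 * 1024 * 1024   # 128 MB per block (assumed default)
REPLICATION = 3                  # copies of each block (assumed default)

def hdfs_footprint(file_size_bytes):
    """Return (number of blocks, raw bytes consumed across the cluster)."""
    blocks = math.ceil(file_size_bytes / BLOCK_SIZE)
    raw_bytes = file_size_bytes * REPLICATION
    return blocks, raw_bytes

# Under these assumptions, a 1 GB file is split into 8 blocks and
# consumes 3 GB of raw storage across the cluster.
blocks, raw = hdfs_footprint(1024 * 1024 * 1024)
```

Replication is why usable HDFS capacity is a fraction of the raw 288 TB: with 3 copies of every block, roughly a third of the raw space is available for unique data.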

A Hadoop cluster has several components:

  • Name Node
  • Resource Manager
  • Data Node
  • Node Manager
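
In the MapReduce model these components divide the work: the Name Node tracks where file blocks live, the Resource Manager schedules map and reduce tasks onto the Data Nodes, and each node's manager runs those tasks locally. The map and reduce steps themselves can be sketched as plain Python functions (illustration only; an actual job on Hathi would be submitted through the Hadoop framework, for example via Hadoop Streaming or the Java API):

```python
# Minimal word-count sketch of the MapReduce programming model.
# Illustration only: a real job on the cluster would be packaged and
# submitted through Hadoop rather than run as a local script.
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    """Map step: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    """Reduce step: sum the counts for each word. Hadoop delivers pairs
    to a reducer grouped by key; sorting here has the same effect."""
    ordered = sorted(pairs, key=itemgetter(0))
    for word, group in groupby(ordered, key=itemgetter(0)):
        yield (word, sum(count for _, count in group))

counts = dict(reduce_phase(map_phase(["hadoop cluster", "hadoop jobs"])))
# counts == {"cluster": 1, "hadoop": 2, "jobs": 1}
```

The value of the framework is that the map calls run in parallel on the nodes holding the input blocks, and the shuffle/sort between the two phases is handled by Hadoop rather than by user code.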

To request access to Hathi, please email rcac-help@purdue.edu. Subscribe to our Community Cluster Program Mailing List to stay informed on the latest purchasing developments, or email us with any questions.

Detailed Hardware Specification

Hathi consists of Dell r720xd servers, each with 16 Intel E5-2650v2 cores, 32 GB of memory, 48TB of local storage, and a 40 Gigabit Ethernet interconnect.

  Number of Nodes          6
  Processors per Node      2 Intel E5-2650v2
  Cores per Node           16
  Memory per Node          32 GB
  HDFS Storage per Node    48 TB
  Interconnect             40 GigE

Hathi nodes run Red Hat Enterprise Linux 6 and use the PivotalHD Hadoop distribution for resource and job management. Operating system patches are applied as security needs dictate. All nodes allow unlimited stack usage and unlimited core dump size (though disk space and server quotas may still be a limiting factor).

Obtaining an Account on Hathi

All Purdue faculty and staff, as well as students with the approval of their advisor, may request access to Hathi. Refer to the Accounts / Access page for more details on how to request access.