BoilerGrid

Overview of BoilerGrid

BoilerGrid is a large, high-throughput distributed computing system operated by ITaP. It uses the HTCondor system developed by the HTCondor Project at the University of Wisconsin. BoilerGrid provides a way for you to run programs on large numbers of otherwise idle computers in various locations, including any temporarily under-utilized high-performance cluster resources as well as any computer lab desktop machines not currently in use. Whenever a local user or scheduled job needs a machine back, HTCondor stops the job it is running on that machine and tries to restart it on another HTCondor node as soon as possible. Because a job can be interrupted and moved in this way, parallel processing and communication between jobs are limited, so BoilerGrid is appropriate only for relatively quick serial jobs.

Detailed Hardware Specification

BoilerGrid scavenges cycles from nearly all ITaP research systems, including all the ITaP-maintained research clusters and specialized systems. BoilerGrid also uses the idle time of machines in student labs on the Purdue West Lafayette campus. Through the larger DiaGrid consortium, BoilerGrid may also send jobs to machines at other institutions, including the University of Wisconsin, the University of Louisville, Indiana University, the University of Notre Dame, Indiana State University, the Purdue Calumet and North Central campuses, and the Indiana University – Purdue University Fort Wayne campus. Whenever the primary scheduling system on any of these machines needs a compute node back, or a user sits down and starts to use a desktop computer, HTCondor stops the job it is running there and, if possible, checkpoints its work. HTCondor then immediately tries to restart the job on some other available compute node in BoilerGrid.

A recent snapshot of BoilerGrid found 36,524 total processor cores. Of these, 29,111 were Linux/x86_64, 98 were Linux/Intel (ia32), 385 were WinNT51/Intel, and 6,925 were WinNT61/Intel. The remainder were small numbers of Itanium Linux, Solaris, and Intel OSX nodes. Memory on compute nodes ranges from 512 MB to 192 GB, and most processors run at 2 GHz or faster. With a total of over 60 TFLOPS available, BoilerGrid can provide large numbers of cycles in a short amount of time. HTCondor offers high-throughput computing and is excellent for parameter sweeps, Monte Carlo simulations, or nearly any serial application.

Owner | Arch/OS | Processor Cores
ITaP - Research Computing | x86_64/Linux | 30,717
ITaP - Research Computing | Intel/Linux | 29
ITaP - Envision Center | Intel/Linux | 48
ITaP - Teaching & Learning | Intel/WinNTXX | ~9,300
Purdue Calumet | x86_64/Linux | 998
Notre Dame CSE | Intel/Linux, Intel/OSX, Sun4u/Solaris210, x86_64/Linux | 1,213
Purdue Biology, Libraries & some ITaP | Intel/Linux, Intel/WinNT51 | 187
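
To illustrate the kind of parameter sweep mentioned above, here is a minimal Python sketch that writes an HTCondor submit description for a batch of independent serial jobs. It is an illustration under stated assumptions, not a BoilerGrid-specific recipe: the executable name sweep.sh, the sweep.* file names, and the helper function make_submit_description are hypothetical, and the Arch and OpSys values ("X86_64", "LINUX") are assumed to match how the Linux/x86_64 nodes advertise themselves in this pool. The file-transfer settings are likewise an assumption about how you want input and output handled.

```python
def make_submit_description(executable, n_jobs, arch="X86_64", opsys="LINUX"):
    """Return the text of an HTCondor submit description for n_jobs serial jobs.

    Each job receives its own index through the $(Process) macro, which
    HTCondor expands to 0, 1, ..., n_jobs - 1. The executable is expected
    to map that index to one point in the parameter sweep.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")

    lines = [
        # The vanilla universe runs ordinary, unmodified serial programs.
        "universe = vanilla",
        f"executable = {executable}",
        "arguments = $(Process)",
        # Restrict matching to one platform (assumed attribute values).
        f'requirements = (Arch == "{arch}") && (OpSys == "{opsys}")',
        # Assumption: no shared file system, so let HTCondor move files.
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        # One output and error file per job, one shared event log.
        "output = sweep.$(Process).out",
        "error = sweep.$(Process).err",
        "log = sweep.log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a description for a hypothetical 100-point sweep.
    print(make_submit_description("sweep.sh", 100), end="")
```

Saving the printed text to a file and passing that file to condor_submit would queue the jobs. Because each job is independent and identified only by its index, HTCondor can stop any one of them when a machine is reclaimed and restart it elsewhere without affecting the rest of the sweep.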

BoilerGrid currently uses HTCondor 7.6.10. You can check on the overall status of BoilerGrid using CondorView.