BoilerGrid is a large, high-throughput distributed computing system operated by RCAC. It is built on the Condor system developed by the Condor Project at the University of Wisconsin. BoilerGrid lets users run programs on large numbers of otherwise idle computers in various locations, including high-performance resources that are momentarily under-utilized and desktop lab machines that are not currently in use. Whenever a local user or scheduled job needs a given machine, the Condor job is stopped and sent to another Condor node as soon as possible. Because jobs can be interrupted in this way, the model is poorly suited to parallel processing and inter-process communication, so RCAC limits BoilerGrid to smaller, serial jobs. Condor jobs can be submitted from most of the RCAC systems (Gray, Pete, Prospero, Radon, Rossmann, Steele, Venice). You may also install Condor on your own desktop machine and submit from there.
BoilerGrid scavenges cycles from nearly all RCAC systems, including community clusters, specialized systems, and the recycled cluster. BoilerGrid also uses the idle time of machines in student labs on the Purdue West Lafayette campus, the Purdue Calumet campus, and the University of Notre Dame. Whenever the normal scheduling system on these machines sends a job to a node, Condor preempts its own job on that node, checkpointing the work first if possible, then immediately surrenders the node to the scheduled job.
BoilerGrid currently consists of over 20,000 processors. Of these, about 10,500 are Linux/x86_64, approximately 600 are Linux/Intel (ia32), and approximately 11,000 are WinNT51/Intel. There are also small numbers of Itanium Linux, Solaris, and Mac OS X nodes. Memory on compute nodes ranges from 512 MB to 32 GB, and most processors run at 3 GHz or faster. With a total of over 60 TFLOPS available, BoilerGrid can provide large numbers of cycles in a short amount of time. All shared areas and software packages available on the RCAC systems are also available to Condor jobs. Condor is designed for high-throughput computing and is excellent for parameter sweeps, Monte Carlo simulations, or nearly any serial application.
| Owner | Arch/OS | Processors |
|---|---|---|
| ITaP - RCAC | x86_64/Linux | ~10500 |
| ITaP - RCAC | Intel/Linux | ~660 |
| ITaP - Envision Center | Intel/Linux | 48 |
| ITaP - Teaching & Learning | Intel/WinNTXX | ~9300 |
| Purdue Calumet | Intel/WinNT51 | ~250 |
| Notre Dame CSE | Intel/Linux, Sun4u/Solaris28, PPC/OSX, x86_64/Linux | ~230 |
| Purdue Biology, Libraries, & other ITaP | Intel/Linux, Intel/WinNT51 | 187 |
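As a minimal sketch of how a parameter sweep might be described to Condor, the Python snippet below builds the text of a submit description file that queues 100 independent serial jobs and restricts them to the x86_64/Linux processors listed in the table above. The executable name `simulate`, the `sweep.*` file names, and the helper `make_submit_description` are hypothetical. The vanilla universe is an assumption (it does not checkpoint), and the `Arch`/`OpSys` values are the ClassAd spellings Condor typically reports, which may differ on BoilerGrid.

```python
def make_submit_description(executable, n_jobs, arch="X86_64", opsys="LINUX"):
    """Return the text of a Condor submit description file for n_jobs serial jobs.

    Hypothetical helper for illustration; not part of Condor or BoilerGrid.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    lines = [
        # Assumption: plain serial jobs in the vanilla universe (no checkpointing).
        "universe     = vanilla",
        f"executable   = {executable}",
        # $(Process) expands to 0, 1, ..., n_jobs - 1, one value per queued job.
        "arguments    = $(Process)",
        # Only match machines with the requested architecture and operating system.
        f'requirements = (Arch == "{arch}") && (OpSys == "{opsys}")',
        # Separate stdout/stderr per job; one shared event log for the whole sweep.
        "output       = sweep.$(Process).out",
        "error        = sweep.$(Process).err",
        "log          = sweep.log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


# Build a 100-job sweep for a hypothetical executable named "simulate".
submit_text = make_submit_description("simulate", 100)
print(submit_text)
```

Saved to a file such as `sweep.sub` (a hypothetical name), this text would be passed to `condor_submit`. Each job receives its own index through `$(Process)`, which the program can map to a parameter value or random seed, so no job depends on any other and any of them can be preempted and rerun independently.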
BoilerGrid currently runs the latest stable release of Condor: 7.0.1. BoilerGrid status may be monitored using CondorView.