Steele User Guide


    Overview of Steele
        Overview of Steele

    Common Error Messages
        cannot connect to X server
        E233: cannot open display
        How do I check my job output while it is running
        bash: command not found
        qdel: Server could not connect to MOM 12345.rice-adm.rcac.purdue.edu
        bash: module command not found
        /usr/bin/xauth: error in locking authority file
        I worked on Steele after I graduated/left Purdue, but can not access it anymore
        My SSH connection hangs

    Common Questions
        How can my collaborators outside Purdue get access to Steele?
        How can I get email alerts about my PBS job status?
        How can I get access to Sentaurus software?
        Can I share data with outside collaborators?
        Can I get a private server from RCAC?

    Biography of John M. Steele
        Overview of John M. Steele



Overview of Steele

Steele was a compute cluster operated by ITaP and the first system built under Purdue's Community Cluster Program. ITaP installed Steele in May 2008 in an unprecedented single-day installation. It replaced and expanded upon ITaP research resources retired at the same time, including the Hamlet, Lear, and Macbeth clusters. Steele consisted of 852 64-bit, 8-core Dell 1950 and 9 64-bit, 8-core Dell 2950 systems with various combinations of 16-32 GB RAM, 160 GB to 2 TB of disk, and 1 Gigabit Ethernet (1GigE) and InfiniBand local to each node.

Detailed Hardware Specification

Sub-Cluster  Number of Nodes  Processors per Node                 Cores per Node  Memory per Node  Interconnect                       Disk
-B           180              Two 2.33 GHz Quad-Core Intel E5410  8               16 GB            10 Gbps SDR InfiniBand and 1 GigE  160 GB
-C           48               Two 2.33 GHz Quad-Core Intel E5410  8               32 GB            1 GigE                             160 GB
-D           41               Two 2.33 GHz Quad-Core Intel E5410  8               32 GB            10 Gbps SDR InfiniBand and 1 GigE  160 GB
-E           9                Two 3.00 GHz Quad-Core Intel E5450  8               32 GB            1 GigE                             2 TB
-Z           48               Two 2.33 GHz Quad-Core Intel E5410  8               16 GB            1 GigE                             160 GB

At the time of retirement, Steele nodes ran Red Hat Enterprise Linux 5 (RHEL5) and used Moab Workload Manager 7 and TORQUE Resource Manager 4 as the portable batch system (PBS) for resource and job management. Steele also ran jobs for BoilerGrid whenever its processor cores would otherwise be idle.

cannot connect to X server

Problem

You receive the following message after entering a command to bring up a graphical window:

cannot connect to X server

Solution

This can happen due to multiple reasons:

  • Reason 1: Your SSH client software does not support graphical display by itself (e.g. SecureCRT or PuTTY).
    • Solution: Try client software such as ThinLinc or MobaXterm, as described here.
  • Reason 2: You did not enable X11 forwarding in your SSH connection.
    • Solution: If you are in a Windows environment, make sure that X11 forwarding is enabled in your connection settings (e.g. in MobaXterm or PuTTY). If you are in a Linux environment, try

      ssh -Y -l username hostname

  • Reason 3: If you are trying to open a graphical window within an interactive job, make sure you are using the -X option with qsub after following the previous step(s) for connecting to the front-end. Please see the example here.
  • Reason 4: If none of the above apply, make sure that you are within quota of your home directory as described here.
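
To narrow down which of these reasons applies, it can help to check the DISPLAY environment variable after logging in, since SSH sets it when X11 forwarding is active. The following is a minimal sketch; the function name check_display is illustrative and not part of any Steele tooling.

```shell
# Report whether an X11 display is available in the current session.
# check_display is a hypothetical helper name used only for this sketch.
check_display() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY is set to ${DISPLAY}"
    else
        echo "DISPLAY is not set; reconnect with: ssh -Y -l username hostname"
    fi
}

check_display
```

If DISPLAY is not set, X11 forwarding was most likely not enabled for the connection (Reason 2). If it is set and the error persists, the other reasons are more likely.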

E233: cannot open display

Problem

You receive the following message after entering a command to bring up a graphical window:

E233: cannot open display

Solution

This means you did not enable X11 forwarding, which supports remote graphical access to applications. Try:

ssh -Y -l username hostname

How do I check my job output while it is running

Problem

After submitting your job to the cluster, you want to see the output that it generates.

Solution

There are two simple ways to do this:

  • qpeek: Use the tool qpeek to check the job's output. Syntax of the command is:
    qpeek <jobid>
  • Redirect your output to a file: To do this you need to edit the main command in your jobscript as shown below. Please note the redirection command starting with the greater than (>) sign.
    myapplication ...other arguments... > "${PBS_JOBID}.output"
    On any front-end, go to the working directory of the job and scan the output file.
    tail "<jobid>.output"
    Make sure to replace <jobid> with an appropriate jobid.
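
The redirection approach can be tried outside the batch system with the sketch below. It assumes a placeholder job ID when PBS_JOBID is not set, writes into a temporary directory, and uses a stand-in function (run_step, hypothetical) in place of myapplication.

```shell
# PBS sets PBS_JOBID inside a running job; fall back to a placeholder
# value so this sketch also runs outside the batch system.
job_id="${PBS_JOBID:-12345.example}"
work_dir="$(mktemp -d)"
output_file="${work_dir}/${job_id}.output"

# Stand-in for "myapplication ...other arguments..." (hypothetical workload).
run_step() { echo "step $1 done"; }

# Redirect everything the workload prints into the per-job output file.
for i in 1 2 3; do
    run_step "$i"
done > "$output_file"

# From a front-end, show the most recent lines of the output file.
tail -n 2 "$output_file"
```

While a real job is still running, tail -f "<jobid>.output" keeps printing new lines as they are written.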

bash: command not found

Problem

You receive the following message after typing a command:

bash: command not found

Solution

This means the system cannot find your command in any directory on your search path. Typically, you need to load the module that provides it.
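
A minimal sketch of how to check this before loading anything: the standard command -v builtin reports whether a command is on the search path. The function name find_command is illustrative, and <modulename> is a placeholder for whichever module provides the software.

```shell
# Print where a command lives, or suggest loading a module if it is
# not on the search path. find_command is a hypothetical helper name.
find_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found; try: module avail, then module load <modulename>"
    fi
}

find_command sh
find_command no_such_command_xyz
```

Running module avail lists the modules installed on the system, which helps identify the one to load.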

qdel: Server could not connect to MOM 12345.rice-adm.rcac.purdue.edu

Problem

You receive the following message after attempting to delete a job with the 'qdel' command:

qdel: Server could not connect to MOM 12345.rice-adm.rcac.purdue.edu

Solution

This error usually indicates that at least one node running your job has stopped responding or crashed. Please forward the job ID to rcac-help@purdue.edu, and ITaP Research Computing staff can help remove the job from the queue.

bash: module command not found

Problem

You receive the following message after typing a command, e.g. module load intel:

bash: module command not found

Solution

The system cannot find the module command. Source the modules.sh file in your script or session:

source /etc/profile.d/modules.sh

or make your script run in an interactive shell by using this as its first line:

#!/bin/bash -i
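
In a job script, the first approach can be guarded so that the file is sourced only when needed. This is a sketch, assuming the modules.sh path given above; the variable names are illustrative.

```shell
#!/bin/bash
# Path to the modules initialization file named above.
modules_init=/etc/profile.d/modules.sh

# Source the file only if the module command is missing and the file exists.
if ! command -v module >/dev/null 2>&1 && [ -r "$modules_init" ]; then
    . "$modules_init"
fi

# Report the result so a failed setup is visible in the job output.
if command -v module >/dev/null 2>&1; then
    module_status="available"
else
    module_status="unavailable"
fi
echo "module command: ${module_status}"
```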

/usr/bin/xauth: error in locking authority file

Problem

You receive the following message when logging in:

/usr/bin/xauth: error in locking authority file

Solution

Your home directory disk quota is full. You may check your quota with myquota.

You will need to free up space in your home directory.
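
To find out what is using the space, a sketch like the following lists the largest items in a directory using the standard du, sort, and tail tools. The function name largest_items is illustrative and not an RCAC command.

```shell
# List the largest items in a directory, smallest first, sizes in KB.
# $1: directory to inspect (defaults to the home directory)
# $2: number of items to show (defaults to 5)
# Note: the * glob skips hidden files and directories such as .cache.
largest_items() {
    du -sk "${1:-$HOME}"/* 2>/dev/null | sort -n | tail -n "${2:-5}"
}
```

Calling largest_items with no arguments inspects your home directory; the last line printed is the largest item.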

I worked on Steele after I graduated/left Purdue, but can not access it anymore

Problem

You have graduated or left Purdue but continue collaboration with your Purdue colleagues. You find that your access to Purdue resources has suddenly stopped and your password is no longer accepted.

Solution

Access to all Research Computing resources depends on having a valid Purdue Career Account. Expired Career Accounts are removed twice a year, during Spring and October breaks (more details at the official page). If your Career Account was purged due to expiration, you will not be able to access the resources.

To provide remote collaborators with valid Purdue credentials, the University provides a special procedure called R4P ("request for privileges"). If you need to continue your collaboration with your Purdue PI, the PI will have to work with their departmental Business Office to submit or renew an R4P request on your behalf.

After your R4P is completed and Career Account is restored, please note two additional necessary steps:

  • Access: Restored Career Accounts by default do not have any Research Computing resources enabled. Your PI will have to log in to the Manage Users tool and explicitly re-enable your access by un-checking and then re-checking the checkboxes for the desired queues and Unix groups.

  • Email: Restored Career Accounts by default do not have their @purdue.edu email service enabled. While this does not prevent you from using Research Computing resources, email messages (whether generated on the clusters or sent as service announcements) would not be delivered, which may cause inconvenience or loss of compute jobs. To avoid this, we recommend setting your restored @purdue.edu email service to "Forward" (to an actual address you read). The easiest way to do this is to go through the Account Setup process.

My SSH connection hangs

Problem

Your console hangs while trying to connect to an RCAC server.

Solution

This can happen for various reasons. The most common reasons for hanging SSH terminals are:

  • Network: If you are connected over wifi, make sure that your Internet connection is fine.
  • Busy front-end server: When you connect to a cluster, you SSH to one of the front-ends. Due to transient user loads, one or more of the front-ends may become unresponsive for a short while. To avoid this, try reconnecting to the cluster or wait until the server you have connected to has reduced load.
  • File system issue: If a server has issues with one or more of the file systems (home, scratch, or depot) it may freeze your terminal. To avoid this you can connect to another front-end.

If none of the suggestions above works, please contact rcac-help@purdue.edu, specifying the name of the server where your console is hung.


How can my collaborators outside Purdue get access to Steele?

Your Departmental Business Office can submit a Request for Privileges (R4P) to provide access to collaborators outside Purdue, including recent graduates. Once the R4P process is complete, you will need to add your outside collaborators to Steele as you would for any Purdue collaborator.

How can I get email alerts about my PBS job status?

Question

How can I be notified when my PBS job starts and whether it completed successfully?

Answer

Submit your job with the following command line arguments

qsub -M email_address -m bea myjobsubmissionfile

Or, include the following in your job submission file.

#PBS -M email_address
#PBS -m bea

The -m option accepts any combination of the letters "a", "b", and "e":

a - mail is sent when the job is aborted by the batch system.
b - mail is sent when the job begins execution.
e - mail is sent when the job terminates.

How can I get access to Sentaurus software?

Question

How can I get access to Sentaurus tools for micro- and nano-electronics design?

Answer

The Sentaurus software license requires a signed NDA. Please contact Dr. Mark Johnson, Director of ECE Instructional Laboratories, to complete the process.

Once the licensing process is complete and you have been added to the cae2 Unix group, you can use Sentaurus on RCAC community clusters by loading the corresponding environment module:

module load sentaurus

Can I share data with outside collaborators?

Yes! Globus allows convenient sharing of data with outside collaborators. Data can be shared with collaborators' personal computers or directly with many other computing resources at other institutions. See the Globus documentation on how to share data.

Can I get a private server from RCAC?

Question

Can I get a private (virtual or physical) server from RCAC?

Answer

Often, researchers may want a private server to run databases, web servers, or other software. RCAC currently does not offer private servers (formerly known as "Firebox").

For use cases like this, we recommend the Jetstream Cloud (http://jetstream-cloud.org/), an NSF-funded science cloud allocated through the XSEDE project. RCAC staff can help you get access to Jetstream for testing, or help write an allocation proposal for larger projects.

Alternatively, you may consider commercial cloud providers such as Amazon Web Services, Azure, or Digital Ocean. These services are very flexible, but do come with a monetary cost.

Biography of John M. Steele


John Campbell, Purdue associate vice president for information technology and head of ITaP Research Computing, stands with John Steele (L) at the dedication of the new Steele cluster in May 2008.

John M. Steele

John M. Steele, associate professor emeritus of computer sciences, was involved with research computing at Purdue almost from its inception. He joined the Purdue staff in 1963 at the Computer Sciences Center associated with the then-new Computer Science Department. He served as the director of the Purdue University Computing Center, the high performance computing unit at Purdue prior to ITaP, from 1988 to 2001 before retiring in 2003.


John Steele signs the end panel of the Steele cluster.

His research interests have been in the areas of computer data communications and computer circuits and systems, including research on an early mobile wireless Internet system. He still does computing consulting. Steele earned his bachelor's in math and physics and master's in electrical engineering at Purdue.

Purdue University, 610 Purdue Mall, West Lafayette, IN 47907, (765) 494-4600

© 2017 Purdue University | An equal access/equal opportunity university | Copyright Complaints | Maintained by ITaP Research Computing
