What is it
Stars is the High Performance Computing (HPC) cluster of the College of Arts and Sciences (CoAS) at Howard University. It was created thanks to a grant awarded to the P.I.s Roberto De Leo (Dept. of Mathematics), J. Harkless (Dept. of Chemistry) and P. Bezandry (Dept. of Mathematics) under Howard University's 2014 Research Resources and Small Equipment Program. All CoAS students and researchers are eligible for access to Stars to perform their own numerical studies. This page contains all the instructions on how to get access to the cluster and how to use it.
Stars currently consists of 6 nodes, each with 2 Xeon E5-2640 v3 CPUs @ 2.60GHz with 8 cores each (Hyper-Threading is enabled on all machines, so every node has 32 logical cores in total). Each machine has 16GB of RAM; 3.6TB of HDD storage (in a RAID5 configuration) is concentrated on one machine and shared with the remaining five.
The Operating System installed on Stars is Linux, kernel version 4.1.6 (distribution: Slackware 14.1).
The following is a list of the major packages installed on the Stars cluster. You are always welcome to install other packages in your own home directory, but if you think that some package is general enough to be useful to many other people, just send an email to the current cluster administrator Roberto De Leo ⟨roberto.deleo AT howard.edu⟩ to have it installed on the cluster.
- Languages, Schedulers and Parallel Computing
- 3D Graphics
- 2D Graphics
How to get an account on the cluster
Send an email to the current cluster administrator Roberto De Leo ⟨roberto.deleo AT howard.edu⟩ with your name, affiliation and your preferred login name (max 8 characters).
How to connect to the cluster
You can connect to the cluster via ssh to the cluster's server.
If you connect from outside the HU network, remember to use port 443 (due to HU's current firewall policy, it is not possible to use the default port 22 from outside).
If you use the command line and your login name is "doe", then you can connect with
ssh -p 443 doe@example.org
If, moreover, you want to be able to use graphical applications, then you should connect with
ssh -p 443 -YC doe@example.com
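If you connect often, you can store the port and options in your ssh client configuration so that a plain `ssh stars` works. A minimal sketch, assuming your login is "doe"; the alias "stars" and the host are placeholders to replace with the actual cluster address:

```
# ~/.ssh/config -- "stars" is a hypothetical alias; replace the
# placeholder HostName and User with the actual address and login.
Host stars
    HostName example.org
    Port 443
    User doe
    ForwardX11 yes
    ForwardX11Trusted yes   # these two together correspond to ssh -Y
    Compression yes         # corresponds to ssh -C
```

With this in place, `ssh stars` and `scp myfile stars:` pick up the port automatically.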
Look here for more ssh usage examples and how to use ssh/scp from Windows or Mac: ssh command Tutorial
What to do after you log in
Once you are logged into the system, all you get is a prompt waiting for your commands. In order to use the cluster, therefore, it is more or less mandatory that you know at least the basics of the Unix Command Line Interface (CLI), e.g. how to list your files, change directories, create or erase files and directories, and so on. If you do not have any experience with the Unix CLI, here are a few good starting points:
- Ryan's Linux Tutorial
- The Command Line Crash Course
- A Command Line Primer for Beginners
How to transfer files from/to the cluster
You can transfer files to and from the cluster using scp through port 443.
If you use the command line and your login name is "doe", then you can copy the file myfile to the cluster with
scp -P 443 myfile doe@example.org:
To recursively copy the entire directory mydir to the cluster, use
scp -P 443 -r mydir doe@example.com:
Vice versa, you can copy the remote file myremotefile from the cluster to your machine with
scp -P 443 doe@example.org:myremotefile .
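Copying a whole directory back from the cluster works the same way, by combining the two forms above. A sketch, again with "doe" and a placeholder host standing in for your actual login name and the cluster's address:

```shell
# Recursively copy the remote directory myremotedir from the cluster
# into the current local directory ("." at the end).
# "doe" and example.org are placeholders, not the real login/address.
scp -P 443 -r doe@example.org:myremotedir .
```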
Look here for more scp usage examples: scp command Tutorial
How to submit jobs
All jobs on the cluster must be submitted through the Slurm scheduler.
Jobs running on the cluster without passing through slurm will be stopped by the System Administrator without notice.
There are 4 queues defined on the cluster:
- debug -- priority 1000, 1 hour maximum running time
- fast -- priority 100, 6 hours maximum running time
- medium -- priority 10, 1 day maximum running time
- slow -- priority 1, default, no running time restriction
E.g., to run your program myjob on the cluster's medium queue from the command line, use
srun -p medium myjob
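For anything longer than a one-off command, it is usually more convenient to wrap the job in a batch script and submit it with sbatch. A minimal sketch; the script name myjob.sh and the program ./myjob are placeholders, and the directives shown are standard Slurm options:

```shell
#!/bin/bash
# myjob.sh -- minimal Slurm batch script (all names are placeholders).
#SBATCH --partition=medium      # one of: debug, fast, medium, slow
#SBATCH --job-name=myjob        # name shown by squeue
#SBATCH --output=myjob.%j.out   # %j expands to the job ID
#SBATCH --ntasks=1              # number of tasks (cores) requested

srun ./myjob                    # run the program inside the allocation
```

Submit it with `sbatch myjob.sh`; squeue will then show the job in the medium partition.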
The scheduler automatically takes care of choosing the node on which to run the job. When a higher priority job is launched and there are no free cores left, a lower priority job will be suspended until the higher priority job ends. Very good tutorials on Slurm can be found here:
- Slurm Tutorials on the Slurm web site
- Slurm Quick Start Tutorial from the Consortium des Équipements de Calcul Intensif (Belgium)
- Slurm Quickstart from the Lawrence Livermore National Laboratory
- Slurm User Tutorial (in pdf) from the Lawrence Livermore National Laboratory
- Slurm Job Submission from the University of Colorado
- Slurm Batch Queueing from the University of Colorado
How to see which jobs are running
There are three commands that, after you log in to the cluster, allow you to see which jobs are currently running on it and whose they are:
- squeue -- view information about jobs located in the Slurm scheduling queue.
deleo@helios:~$ squeue
  JOBID PARTITION     NAME  USER ST        TIME NODES NODELIST(REASON)
   2434      slow apolloni deleo  R 18-21:57:51     1 rigel
   2525      slow apolloni deleo  R  6-02:46:02     1 aldebaran
   2527      slow apolloni deleo  R  6-02:46:02     1 aldebaran
   2529      slow apolloni deleo  R  6-02:46:02     1 aldebaran
- smap -- a colored version of the same information.
- sview -- graphical user interface to view and modify Slurm state.
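squeue also accepts filters that make its output easier to read on a busy cluster, and scancel removes a job from the queue. A few examples; the login "doe" and the job ID 2434 are placeholders taken from the sample output above:

```shell
# Show only your own jobs (replace doe with your login name).
squeue -u doe

# Show the long output format, including time limits.
squeue -l

# Cancel one of your jobs by its JOBID.
scancel 2434
```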