Getting an account
- Accounts can be requested here.
- All accounts are active for a period of 1 year.
- After 1 year, accounts will be locked and terminated.
- A new account must be requested annually.
- Storage on Galileo can be wiped at any time without notice.
Node configuration
Galileo is the head node:
- Dell R720xd with 384 GB of RAM, two Xeon E5-2570 processors, and 8 TB of disk space.
8 compute nodes:
- 1 Dell R730 with 512 GB of RAM, two Xeon E5-2698 processors (80 threads).
- 1 Dell R730 with 512 GB of RAM, two Xeon E5-2698 processors (80 threads), and 2 Nvidia Tesla K80 GPUs.
- 2 Dell R930s with 512 GB of RAM, four Xeon E7-8880 processors (176 threads each).
- 2 Dell R730xds with 384 GB of RAM, two Xeon E5-2670 processors (48 threads each).
- 2 Dell R720s with 128 GB of RAM, two Xeon E5-2670 processors (32 threads each).
Storage:
- HOME directory with a 500 GB quota and a 10^4 inode quota (see below for checking your usage).
- No permanent storage service is available on Galileo. Please use departmental resources to back up your data.
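To see how close you are to these limits, the standard Linux tools should work from the head node (a quick sketch, assuming user quotas are enabled on the home filesystem):
[username@galileo]$ quota -s              # your quota limits and current usage
[username@galileo]$ du -sh $HOME          # total size of your home directory
[username@galileo]$ find $HOME | wc -l    # rough count of files (inodes) you are using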
Software and Environment
This system is running CentOS 7.7 with OpenHPC/Warewulf/Slurm.
Installed software includes:
- IDL (5.2, 5.5, 6.1, 7.0, 8.3)
- Matlab (2013b, 2017a, 2017b, 2019b, 2021a)
- Python 2.7.5, 3.6
Installed compilers:
- gcc 4.8.5 (C, C++, Objective-C, Objective-C++, Java, Fortran, Ada, and Go)
Notes on software and compilers:
- We keep the CentOS 7.x operating system as close to stock as possible, which means we will generally not install additional software or libraries beyond what is available through the yum package manager. Users must install any additional software locally in their home directories.
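For example, Python packages can be installed into your home directory with pip's --user option (numpy here is only an illustration, and this assumes pip is available for the system Python 3.6 install):
python3 -m pip install --user numpy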
Connecting to Galileo
The Galileo cluster is on GSU's private IP space, so it cannot be reached directly from off campus. If you are off campus, connect through one of our ssh gateways or through the GSU VPN. The hostname and IP are galileo.phy-astr.gsu.edu (10.252.12.200).
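For example, from on campus or over the GSU VPN you can ssh straight to the head node, and from off campus you can jump through an ssh gateway instead (gateway.example.gsu.edu below is only a placeholder for whichever gateway you have access to):
ssh username@galileo.phy-astr.gsu.edu
ssh -J username@gateway.example.gsu.edu username@galileo.phy-astr.gsu.edu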
Job Scripts
Standard batch system jobs are run using the following steps:
- Write a batch job script (see the examples below).
- Submit the job script with the command sbatch.
- Monitor and control the job execution, e.g. with the commands squeue and scancel (to cancel a job), as shown below.
- Documentation on Slurm job submission can be found at https://slurm.schedmd.com/.
A job script is a shell script (written in bash, ksh, or csh syntax) containing #SBATCH directives that are passed as options to the sbatch command.
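For example, to monitor and cancel jobs (the job ID 12345 is only a placeholder):
squeue -u $USER    # list your pending and running jobs
squeue -j 12345    # show the state of a specific job
scancel 12345      # cancel that job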
Example Job Scripts
Here is a job script, saved as submit.sh, that prints the hostname of the machine it runs on and writes the output to a file called res.txt.
For more sbatch options see https://slurm.schedmd.com/sbatch.html.
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --output=res.txt
#
#SBATCH --ntasks=1
#SBATCH --time=10:00
srun hostname
srun sleep 60
To execute the script:
sbatch submit.sh
Example Interactive Job Submission
Here is a job submission that opens an interactive session for running GUI-based applications (e.g., Matlab).
For more srun options see https://slurm.schedmd.com/srun.html.
[username@galileo]$ srun --x11 --pty /bin/bash
[username@node]$ /usr/local/MATLAB/R2019b/bin/matlab
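Note that --x11 forwards the application's display back over your ssh connection, so this only works if you logged in with X11 forwarding enabled and have an X server running on your local machine:
ssh -Y username@galileo.phy-astr.gsu.edu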
Matlab code example
To run Matlab calculations, you first need to put your Matlab commands into a file with a .m extension.
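For example, a minimal, purely illustrative filename.m could be created like this:
cat > filename.m <<'EOF'
% hypothetical example: compute and print a value
disp(sum(1:100))
EOF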
You can then submit the job to sbatch with a script:
#!/bin/bash
#SBATCH --job-name matlab-example
#SBATCH --time=02:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
srun matlab -nodisplay -nojvm -nosplash -r 'filename'
Note
- Save your file with a .m extension, but call it without the .m when you run Matlab.
- Matlab can use a lot of memory, so if your job stalls or fails, try again with more memory, e.g. by adding a --mem line as shown below.
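A memory request is added with an extra #SBATCH directive in the job script (the value here is only illustrative; pick one that fits your calculation):
#SBATCH --mem=16G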
Code with parameters
If your calculation requires input values, you can code it as a function and supply the values in the call to Matlab:
srun matlab -nodisplay -nojvm -nosplash -r 'filename(param1,param2,...)'
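For instance, a hypothetical function file taking two numeric inputs, and a matching call (appending exit makes Matlab quit as soon as the function returns instead of idling until the time limit):
cat > filename.m <<'EOF'
% hypothetical function taking two numeric parameters
function filename(param1, param2)
    disp(param1 + param2)
end
EOF
srun matlab -nodisplay -nojvm -nosplash -r 'filename(2,3); exit'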
Multithreaded computations
To run a Matlab job using multiple threads, submit it to sbatch with a script like:
#!/bin/bash
#SBATCH --job-name matlab-example
#SBATCH --time=02:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
srun matlab -nodisplay -nojvm -nosplash -r 'filename'
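When --cpus-per-task is set, Slurm exports SLURM_CPUS_PER_TASK to the job. One optional way (an assumption about your workflow, not a scheduler requirement) to keep Matlab's thread count in line with the allocation is to set it in the -r string before calling your code:
srun matlab -nodisplay -nojvm -nosplash -r "maxNumCompThreads(str2double(getenv('SLURM_CPUS_PER_TASK'))); filename"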
Batch partitions
We currently have 2 partitions (queues) on the Galileo Cluster.
The default partition (normal) is limited to 12 hours per job.
The longjobs partition can run jobs up to 72 hours.
Partition | Max. walltime | Nodes       | Remark
normal    | 12:00:00      | 1 on demand | This is the default queue.
longjobs  | 72:00:00      | 1 on demand | This is for jobs up to 72 hours.
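To target the longjobs partition, request it explicitly when submitting; sinfo lists the partitions with their time limits and node states:
sbatch --partition=longjobs submit.sh    # or add "#SBATCH --partition=longjobs" to the script itself
sinfo                                    # show partitions, time limits, and node states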