Merge pull request #421 from baskerville-hpc/baskerville
Config options for Birmingham Baskerville slurm
reid-a authored Oct 6, 2022
2 parents d4cd862 + 641cfa2 commit f778bf9
Showing 32 changed files with 565 additions and 0 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
#------------------------------------------------------------
# Birmingham Baskerville Slurm: Jenny Wong
#------------------------------------------------------------

# Cluster host and scheduler options: the defaults come from
# Graham at Compute Canada, running Slurm. Other options can
# be found in the library of snippets,
# `_includes/snippets_library`. To use one, replace options
# below with those in `_config_options.yml` from the
# library. E.g., to customise for Cirrus at EPCC, running
# PBS, we could replace the options below with those from
#
# _includes/snippets_library/EPCC_Cirrus_pbs/_config_options.yml
#
# If your cluster is not represented in the library, please
# copy an existing folder, rename it, and customize for your
# installation. Remember to keep the leading slash on the
# `snippets` variable below!

snippets: "/snippets_library/Birmingham_Baskerville_slurm"

local:
  prompt: "[user@laptop ~]$"
  bash_shebang: "#!/usr/bin/env bash"

remote:
  name: "Baskerville"
  login: "login.baskerville.ac.uk"
  host: "bask-pg0310u18a.cluster.baskerville.ac.uk"
  node: "bask-pg"
  location: "University of Birmingham, UK"
  homedir: "/bask/homes/y/yourUsername"
  user: "yourUsername"
  prompt: "[yourUsername@bask-pg0310u18a ~]$"
  bash_shebang: "#!/bin/bash"

sched:
  name: "Slurm"
  submit:
    name: "sbatch"
    options: ""
  queue:
    debug: "devel"
    testing: "normal"
  status: "squeue"
  flag:
    user: "-u yourUsername"
    interactive: ""
    histdetail: "-l -j"
    name: "-J"
    time: "-t"
    queue: "-p"
  del: "scancel"
  interactive: "srun"
  info: "sinfo"
  comment: "#SBATCH"
  hist: "sacct -u $USER"

episode_order:
  - 10-hpc-intro
  - 11-connecting
  - 12-cluster
  - 13-scheduler
  - 14-modules
  - 15-transferring-files
  - 16-parallel
  - 17-resources
  - 18-responsibility
@@ -0,0 +1,16 @@
> ## The authenticity of host
>
> When logging in for the first time, you may be asked whether you trust the
> server you are trying to connect to. If you typed the address correctly (i.e.
> {{ site.remote.login }}), it is safe to answer "yes" to the question at the
> end of this message, which permanently adds the server to your list of
> trusted hosts.
>
> ~~~
> $ ssh lola@{{ site.remote.login }}
> The authenticity of host '{{ site.remote.login }}' can't be established.
> RSA key fingerprint is SHA256:NwV2/9HMlLfj6hFmXTuA4UVievE/uq36K9EYa20CteI.
> Are you sure you want to continue connecting (yes/no)? yes
> Warning: Permanently added '{{ site.remote.login }}' to the list of known hosts
> ~~~
> ~~~
> {: .language-bash}
{: .callout}
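
If the cluster's administrators publish the expected host key fingerprint, it can be compared with the one `ssh` displays before answering "yes". The following is a minimal sketch of that comparison; the function name `fingerprints_match` is hypothetical, and both values are simply copied from the example above rather than taken from any official source.

```bash
# Compare a fingerprint from a trusted source with the one ssh displays.
# fingerprints_match is a hypothetical helper, not part of ssh.
fingerprints_match() {
  # $1: fingerprint from a trusted source, $2: fingerprint shown by ssh
  [ "$1" = "$2" ]
}

# Both values are copied from the example message above for illustration;
# in practice the trusted value must come from the cluster's support team.
trusted='SHA256:NwV2/9HMlLfj6hFmXTuA4UVievE/uq36K9EYa20CteI'
shown='SHA256:NwV2/9HMlLfj6hFmXTuA4UVievE/uq36K9EYa20CteI'

if fingerprints_match "$trusted" "$shown"; then
  echo "Fingerprints match"
else
  echo "Fingerprints differ: answer no and contact support"
fi
```

Once a host has been accepted, `ssh-keygen -l -f ~/.ssh/known_hosts` lists the fingerprints already stored on your own machine.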
@@ -0,0 +1,19 @@
```
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
normal*      up 7-00:00:00      1  down* c5-12
normal*      up 7-00:00:00     87    mix c1-[7,13-15,19,22-24,28,31,33,3.......
normal*      up 7-00:00:00    110  alloc c1-[1-6,8-9,11-12,16-18,20-21,........
normal*      up 7-00:00:00      1   idle c1-10
bigmem       up 14-00:00:0      1  drain c6-3
bigmem       up 14-00:00:0     10    mix c3-[29-31,53,56],c6-[1,4-7]
bigmem       up 14-00:00:0      1  alloc c6-2
bigmem       up 14-00:00:0     24   idle c3-[32-52,54-55],c6-8
accel        up 14-00:00:0      3    mix c7-[1-2,8]
accel        up 14-00:00:0      5  alloc c7-[3-7]
optimist     up   infinite      1  down* c5-12
optimist     up   infinite      1  drain c6-3
optimist     up   infinite    100    mix c1-[7,13-15,19,22-24,28,31,33,35,37...
optimist     up   infinite    116  alloc c1-[1-6,8-9,11-12,16-18,20-21,25-27...
optimist     up   infinite     25   idle c1-10,c3-[32-52,54-55],c6-8
```
```
{: .output}
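
A listing like this is often easier to read once filtered. As a rough sketch, the helper below (the name `idle_nodes` is made up for this example) prints the number of idle nodes per partition. It assumes the default `sinfo` column order shown above, and the sample rows are copied from that output so the block runs without a scheduler.

```bash
# A few rows copied from the sinfo output above, so this runs anywhere.
sinfo_sample='PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
normal* up 7-00:00:00 1 idle c1-10
bigmem up 14-00:00:0 24 idle c3-[32-52,54-55],c6-8
accel up 14-00:00:0 5 alloc c7-[3-7]
optimist up infinite 25 idle c1-10,c3-[32-52,54-55],c6-8'

# idle_nodes (hypothetical helper): skip the header, keep rows whose STATE
# column is "idle", strip the "*" that marks the default partition, and
# print the partition name with its node count.
idle_nodes() {
  awk 'NR > 1 && $5 == "idle" { sub(/\*$/, "", $1); print $1, $4 }'
}

printf '%s\n' "$sinfo_sample" | idle_nodes
# normal 1
# bigmem 24
# optimist 25
```

On a cluster, the same filter could be applied to live data with `sinfo | idle_nodes`.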
@@ -0,0 +1,12 @@
> ## Explore a Worker Node
>
> Finally, let's look at the resources available on the worker nodes where your
> jobs will actually run. Try running this command to see the name, CPUs and
> memory available on the worker nodes (the instructors will give you the ID of
> the compute node to use):
>
> ```
> {{ site.remote.prompt }} sinfo --node c3-12 -o "%n %c %m"
> ```
> {: .language-bash}
{: .challenge}
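
In the format string above, `%n` is the node's hostname, `%c` the number of CPUs per node, and `%m` the memory per node in megabytes. The sketch below turns such a line into a more readable summary. The sample output is hypothetical (the CPU and memory figures are invented for illustration), and `describe_node` is a name chosen for this example.

```bash
# Hypothetical sample of what the sinfo command above might print;
# the CPU count and memory size are made up for illustration.
sample='HOSTNAMES CPUS MEMORY
c3-12 40 182784'

# describe_node (hypothetical helper): skip the header line and report
# CPUs and memory, converting megabytes to gigabytes by dividing by 1024.
describe_node() {
  awk 'NR > 1 { printf "%s: %d CPUs, %.1f GB memory\n", $1, $2, $3 / 1024 }'
}

printf '%s\n' "$sample" | describe_node
# c3-12: 40 CPUs, 178.5 GB memory
```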
@@ -0,0 +1,26 @@
```
---------------------- /cluster/modulefiles/all -------------------------
4ti2/1.6.9-GCC-8.2.0-2.31.1 gmsh/4.5.6-foss-2019b-Python-3.7.4
ABySS/2.0.2-gompi-2019a gnuplot/5.2.6-GCCcore-8.2.0
AdapterRemoval/2.3.1-foss-2018b gnuplot/5.2.8-GCCcore-8.3.0
AdapterRemoval/2.3.1-GCC-8.2.0-2.31.1 Go/1.13.1
ADF/2019.103+StaticMKL gompi/2018b
AdmixTools/5.1-GCC-7.3.0-2.30 gompi/2019a
ADMIXTURE/1.3.0 gompi/2019b
[removed most of the output here for clarity]
----------------------- /cluster/modulefiles/external ---------------------
appusage/1.0 hpcx/2.4 hpcx/2.5 hpcx/2.6
Where:
S: Module is Sticky, requires --force to unload or purge
L: Module is loaded
Aliases: Aliases exist: foo/1.2.3 (1.2) means that "module load foo/1.2"
will load foo/1.2.3
Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".
```
{: .output}
@@ -0,0 +1,9 @@
```
/usr/bin/which:no python3 in
(/opt/software/slurm/16.05.9/bin:
/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/11.3.4.258/bin:
/opt/software/bin:/opt/puppetlabs/puppet/bin:/opt/software/slurm/current/bin:
/usr/local/bin:/usr/bin:/usr/local/sbin:
/usr/sbin:/home/yourUsername/.local/bin:/home/yourUsername/bin)
```
{: .output}
@@ -0,0 +1,5 @@
```
{{ site.remote.prompt }} module load Python/3.7.2-GCCcore-8.2.0
{{ site.remote.prompt }} which python3
```
{: .language-bash}
@@ -0,0 +1,4 @@
```
/cluster/software/Python/3.7.2-GCCcore-8.2.0/bin/python
```
{: .output}
@@ -0,0 +1,4 @@
```
{{ site.remote.prompt }} ls /cluster/software/Python/3.8.2-GCCcore-9.3.0/bin
```
{: .language-bash}
@@ -0,0 +1,19 @@
```
2to3 idle3.7 pytest rst2odt.py
2to3-3.7 isympy python rst2pseudoxml.py
chardetect netaddr python3 rst2s5.py
cygdb nosetests python3.7 rst2xetex.py
cython nosetests-3.7 python3.7-config rst2xml.py
cythonize pasteurize python3.7m rstpep2html.py
dijitso pbr python3.7m-config runxlrd.py
easy_install pip python3-config sphinx-apidoc
easy_install-3.7 pip3 pyvenv sphinx-autogen
f2py pip3.7 pyvenv-3.7 sphinx-build
f2py3 pybabel rst2html4.py sphinx-quickstart
f2py3.7 __pycache__ rst2html5.py tabulate
ffc pydoc3 rst2html.py virtualenv
ffc-3 pydoc3.7 rst2latex.py wheel
futurize pygmentize rst2man.py
idle3 py.test rst2odt_prepstyles.py
```
{: .output}
@@ -0,0 +1,10 @@
```
/cluster/software/Python/3.8.2-GCCcore-9.3.0/bin:
/cluster/software/XZ/5.2.5-GCCcore-9.3.0/bin:
/cluster/software/bzip2/1.0.8-GCCcore-9.3.0/bin:
/cluster/software/binutils/2.34-GCCcore-9.3.0/bin:
/cluster/software/GCCcore/9.3.0/bin:/node/bin:/usr/local/bin:
/usr/bin:/usr/local/sbin:/usr/sbin:/cluster/bin:
/cluster/home/sabryr/.local/bin:/cluster/home/sabryr/bin
```
{: .output}
@@ -0,0 +1,93 @@
To demonstrate, let's use `module list`. `module list` shows all loaded
software modules.

```
{{ site.remote.prompt }} module list
```
{: .language-bash}
```
Currently Loaded Modules:
1) StdEnv (S) 6) libreadline/8.0-GCCcore-8.2.0 (H)
2) GCCcore/8.2.0 7) XZ/5.2.4-GCCcore-8.2.0 (H)
3) bzip2/1.0.6-GCCcore-8.2.0 (H) 8) GMP/6.1.2-GCCcore-8.2.0 (H)
4) zlib/1.2.11-GCCcore-8.2.0 (H) 9) libffi/3.2.1-GCCcore-8.2.0 (H)
5) ncurses/6.1-GCCcore-8.2.0 (H) 10) Python/3.7.2-GCCcore-8.2.0
Where:
S: Module is Sticky, requires --force to unload or purge
H: Hidden Module
```
{: .output}

```
{{ site.remote.prompt }} module load Beast/2.5.2-GCC-8.2.0-2.31.1
{{ site.remote.prompt }} module list
```
{: .language-bash}

```
Currently Loaded Modules:
1) StdEnv (S) 9) libffi/3.2.1-GCCcore-8.2.0
2) GCCcore/8.2.0 10) Python/3.7.2-GCCcore-8.2.0
3) bzip2/1.0.6-GCCcore-8.2.0 (H) 11) binutils/2.31.1-GCCcore-8.2.0
4) zlib/1.2.11-GCCcore-8.2.0 (H) 12) GCC/8.2.0-2.31.1
5) ncurses/6.1-GCCcore-8.2.0 (H) 13) Java/11.0.2
6) libreadline/8.0-GCCcore-8.2.0 (H) 14) beagle-lib/3.1.2-GCC-8.2.0-2.31.1
7) XZ/5.2.4-GCCcore-8.2.0 (H) 15) Beast/2.5.2-GCC-8.2.0-2.31.1
8) GMP/6.1.2-GCCcore-8.2.0 (H)
Where:
S: Module is Sticky, requires --force to unload or purge
H: Hidden Module
```
{: .output}

In this case, loading the `Beast` module (a bioinformatics software package)
also loaded its dependencies, including `Java/11.0.2` and
`beagle-lib/3.1.2-GCC-8.2.0-2.31.1`. Let's try unloading the `Beast` package.

```
{{ site.remote.prompt }} module unload Beast/2.5.2-GCC-8.2.0-2.31.1
{{ site.remote.prompt }} module list
```
{: .language-bash}

```
Currently Loaded Modules:
1) StdEnv (S) 8) GMP/6.1.2-GCCcore-8.2.0 (H)
2) GCCcore/8.2.0 9) libffi/3.2.1-GCCcore-8.2.0 (H)
3) bzip2/1.0.6-GCCcore-8.2.0 (H) 10) Python/3.7.2-GCCcore-8.2.0
4) zlib/1.2.11-GCCcore-8.2.0 (H) 11) binutils/2.31.1-GCCcore-8.2.0 (H)
5) ncurses/6.1-GCCcore-8.2.0 (H) 12) GCC/8.2.0-2.31.1
6) libreadline/8.0-GCCcore-8.2.0 (H) 13) Java/11.0.2
7) XZ/5.2.4-GCCcore-8.2.0 (H) 14) beagle-lib/3.1.2-GCC-8.2.0-2.31.1
Where:
S: Module is Sticky, requires --force to unload or purge
H: Hidden Module
```
{: .output}

Here `module unload` removed only the `Beast` module itself: the dependencies
it pulled in, such as `Java/11.0.2` and `beagle-lib`, are still listed. If we
wanted to unload everything at once, we could run `module purge`.

```
{{ site.remote.prompt }} module purge
```
{: .language-bash}
```
The following modules were not unloaded:
(Use "module --force purge" to unload all):
1) StdEnv
```
{: .output}

Note that `module purge` is informative. It tells us that all but a default
set of modules have been unloaded, and how to unload those too if we really
wanted to.
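
Comparing two `module list` outputs by eye gets tedious as the lists grow. The sketch below shows one way to automate it: it extracts the module names from two saved listings and prints those present only in the second. The helper names `module_names` and `new_modules` are invented for this example, and the sample listings are shortened versions of the output above.

```bash
# module_names (hypothetical helper): print one module name per line,
# sorted, by taking the word that follows each "N)" counter.
module_names() {
  awk '{ for (i = 1; i < NF; i++) if ($i ~ /^[0-9]+\)$/) print $(i + 1) }' | sort
}

# new_modules (hypothetical helper): names listed in file $2 but not in $1.
new_modules() {
  before=$(mktemp)
  after=$(mktemp)
  module_names < "$1" > "$before"
  module_names < "$2" > "$after"
  comm -13 "$before" "$after"
  rm -f "$before" "$after"
}

# Shortened sample listings based on the output shown above.
tmpdir=$(mktemp -d)
cat > "$tmpdir/before.txt" <<'EOF'
  1) StdEnv (S)   2) GCCcore/8.2.0   3) Python/3.7.2-GCCcore-8.2.0
EOF
cat > "$tmpdir/after.txt" <<'EOF'
  1) StdEnv (S)   2) GCCcore/8.2.0   3) Python/3.7.2-GCCcore-8.2.0
  4) Java/11.0.2  5) Beast/2.5.2-GCC-8.2.0-2.31.1
EOF

added=$(new_modules "$tmpdir/before.txt" "$tmpdir/after.txt")
echo "$added"
rm -rf "$tmpdir"
```

On a cluster the listings could be captured with `module list 2>&1 > before.txt`-style redirection adapted to your shell; this assumes Lmod, which normally writes its listing to standard error.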
@@ -0,0 +1,34 @@
Let's take a closer look at the `GCC` module. GCC is an extremely widely used
C/C++/Fortran compiler. A great deal of software depends on the GCC version
and might not compile or run if the wrong version is loaded. In this case,
there are a few different versions:

`GCC/7.3.0-2.30 GCC/8.2.0-2.31.1 GCC/8.3.0 GCC/9.3.0`

How do we load each copy and which copy is the default?

On SAGA and Fram there are no default modules, so we must use the full module
name to load one.

```
{{ site.remote.prompt }} module load gcc
```
{: .language-bash}

```
Lmod has detected the following error: The following module(s) are unknown:
"gcc"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
$ module --ignore-cache load "gcc"
```
{: .output}

To load a software module we must specify the full module name:

```
{{ site.remote.prompt }} module load GCC/8.2.0-2.31.1
{{ site.remote.prompt }} gcc --version
```
{: .language-bash}
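
When full names are required, it can help to resolve one programmatically rather than typing it from memory. The sketch below picks the highest version from the four GCC modules listed earlier using version-aware sorting. The helper name `newest_module` is hypothetical, `sort -V` is assumed to be available (it is in GNU coreutils), and the newest version is not necessarily the right choice for a given piece of software.

```bash
# Module names copied from the list of GCC versions above.
versions='GCC/7.3.0-2.30
GCC/8.2.0-2.31.1
GCC/8.3.0
GCC/9.3.0'

# newest_module (hypothetical helper): version-aware sort, keep the last.
newest_module() {
  sort -V | tail -n 1
}

newest=$(printf '%s\n' "$versions" | newest_module)
echo "module load $newest"
# module load GCC/9.3.0
```

On a cluster the list might instead come from `module -t avail GCC 2>&1`; this assumes Lmod, and the terse listing may include directory header lines that would need filtering out first.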
@@ -0,0 +1,15 @@
```
{{ site.remote.bash_shebang }}
{{ site.sched.comment }} {{ site.sched.flag.name }} parallel-pi
{{ site.sched.comment }} {{ site.sched.flag.queue }} {{ site.sched.queue.testing }}
{{ site.sched.comment }} -N 1
{{ site.sched.comment }} -n 4
{{ site.sched.comment }} --mem=3G
# Load the computing environment we need
module load python3
# Execute the task
mpiexec python pi.py 100000000
```
{: .language-bash}
@@ -0,0 +1,15 @@
```
{{ site.remote.bash_shebang }}
{{ site.sched.comment }} {{ site.sched.flag.name }} serial-pi
{{ site.sched.comment }} {{ site.sched.flag.queue }} {{ site.sched.queue.testing }}
{{ site.sched.comment }} -N 1
{{ site.sched.comment }} -n 1
{{ site.sched.comment }} --mem=3G
# Load the computing environment we need
module load python3
# Execute the task
python pi.py 100000000
```
{: .language-bash}
@@ -0,0 +1,6 @@
```
JobID JobName Partition Account AllocCPUS State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
991167 Sxxxx normal nn9299k 128 COMPLETED 0:0
```
{: .output}
@@ -0,0 +1,18 @@
> ## Benchmarking `fastqc`
>
> Create a job that runs the following command in the same directory as the
> `.fastq` files:
>
> ```
> fastqc name_of_fastq_file
> ```
> {: .language-bash}
>
> The `fastqc` command is provided by the `fastqc` module. You'll need to
> figure out a sensible amount of resources to request for this first "test
> run". You might also want the scheduler to email you when the job is done.
>
> Hint: the job only needs 1 CPU and not too much memory or time. The trick is
> figuring out just how much you'll need!
{: .challenge}
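
One possible starting point for the test run is sketched below. It writes a job script to `fastqc-test.sh` and prints it. Every resource value and the e-mail address are placeholder guesses rather than recommendations, the partition name `normal` is taken from the testing queue in the configuration, and the file name is an assumption made for this example.

```bash
# Write a first "test run" job script. The memory, time limit, and e-mail
# address are placeholders to be adjusted, not measured requirements.
cat > fastqc-test.sh <<'EOF'
#!/bin/bash
#SBATCH -J fastqc-test
#SBATCH -p normal
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --mem=1G
#SBATCH -t 00:10:00
#SBATCH --mail-type=END
#SBATCH --mail-user=yourUsername@example.org

# Load the module that provides fastqc, then run it on one file
module load fastqc
fastqc name_of_fastq_file
EOF

# Show the generated script for review before submitting it
cat fastqc-test.sh
```

The script would then be submitted with `sbatch fastqc-test.sh`. After it finishes, `sacct -j <jobid> --format=JobID,Elapsed,MaxRSS` reports the time and peak memory the job actually used, which gives a basis for the next request.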