How to Install Slurm on CentOS 7 Cluster

Slurm is an open-source workload manager designed for Linux clusters of all sizes. It’s a great system for queuing jobs for your HPC applications. I’m going to show you how to install Slurm on a CentOS 7 cluster.

  1. Delete failed installation of Slurm
  2. Create the global users
  3. Install Munge
  4. Install Slurm
  5. Use Slurm

 

Cluster Server and Compute Nodes

I configured our nodes with the following hostnames ahead of time; a sketch of the matching /etc/hosts entries follows the list. Our server is:

buhpc3

The clients are:

buhpc1
buhpc2
buhpc3
buhpc4
buhpc5
buhpc6
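
Every node needs to resolve these hostnames. For reference, here is a sketch of the /etc/hosts entries you would need on each node if you are not resolving them through DNS, using the addresses that appear later in slurm.conf:

128.197.115.158 buhpc1
128.197.115.7   buhpc2
128.197.115.176 buhpc3
128.197.115.17  buhpc4
128.197.115.9   buhpc5
128.197.115.15  buhpc6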

 

Delete failed installation of Slurm

This step is optional, in case you previously tried to install Slurm and it didn't work. We want to uninstall everything related to Slurm, unless you're using the dependencies for something else.

First, I remove the database where I kept Slurm’s accounting.

yum remove mariadb-server mariadb-devel -y

Next, I remove Slurm and Munge. Munge is the authentication service that Slurm uses to verify that messages come from trusted machines in the cluster.

yum remove slurm munge munge-libs munge-devel -y

I check whether the slurm and munge users exist.

cat /etc/passwd | grep slurm
cat /etc/passwd | grep munge

Then, I delete the users along with their home directories.

userdel -r slurm
userdel -r munge

If userdel reports that the user is still in use, kill the offending process and try again:

userdel: user munge is currently used by process 26278
kill 26278
userdel -r munge

Slurm, Munge, and MariaDB should be adequately wiped. Now, we can start a fresh installation that actually works.

 

Create the global users

Slurm and Munge require consistent UID and GID across every node in the cluster.

If your cluster is already configured and you are just adding new nodes, copy /etc/munge/munge.key from one of the existing, configured nodes to each new node:

scp /etc/munge/munge.key buhpc02:/etc/munge/munge.key

If you are creating a new cluster, run the following on every node before you install Munge or Slurm:

export MUNGEUSER=1001
groupadd -g $MUNGEUSER munge
useradd  -m -c "MUNGE Uid 'N' Gid Emporium" -d /var/lib/munge -u $MUNGEUSER -g munge  -s /sbin/nologin munge
export SLURMUSER=1002
groupadd -g $SLURMUSER slurm
useradd  -m -c "SLURM workload manager" -d /var/lib/slurm -u $SLURMUSER -g slurm  -s /bin/bash slurm

 

Install Munge

Since I’m using CentOS 7, I need to get the latest EPEL repository.

yum install epel-release -y

Now, I can install Munge.

yum install munge munge-libs munge-devel -y

After installing Munge, I need to create a secret key on the server. My server is the node with hostname buhpc3. Choose one of your nodes to be the server node.

First, we install rng-tools to properly create the key.

yum install rng-tools -y
rngd -r /dev/urandom

Now, we create the secret key. You only need to create the secret key on the server node.

/usr/sbin/create-munge-key -r
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
chown munge: /etc/munge/munge.key
chmod 400 /etc/munge/munge.key

After the secret key is created, you will need to send this key to all of the compute nodes.

scp /etc/munge/munge.key root@buhpc1:/etc/munge
scp /etc/munge/munge.key root@buhpc2:/etc/munge
scp /etc/munge/munge.key root@buhpc4:/etc/munge
scp /etc/munge/munge.key root@buhpc5:/etc/munge
scp /etc/munge/munge.key root@buhpc6:/etc/munge

Now, we SSH into every node, correct the permissions, and start the Munge service.

chown -R munge: /etc/munge/ /var/log/munge/
chmod 0700 /etc/munge/ /var/log/munge/
systemctl enable munge
systemctl start munge
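
If you have SSH access as root to all of the nodes, a loop like this (a sketch; adjust the hostnames to match your cluster) saves logging in to each node by hand:

for host in buhpc1 buhpc2 buhpc3 buhpc4 buhpc5 buhpc6; do
  ssh root@$host "chown -R munge: /etc/munge/ /var/log/munge/; chmod 0700 /etc/munge/ /var/log/munge/; systemctl enable munge; systemctl start munge"
done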

To test Munge, we can try to access another node with Munge from our server node, buhpc3.

munge -n
munge -n | unmunge
munge -n | ssh 3.buhpc.com unmunge
remunge

If you encounter no errors, then Munge is working as expected.

 

Install Slurm

Slurm has a few dependencies that we need to install before proceeding.

yum install openssl openssl-devel pam-devel mariadb-server mariadb-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel man2html libibmad libibumad perl-Switch http-parser-devel json-c-devel lua-json -y

Now, we download the latest version of Slurm, preferably into our shared folder. The latest version of Slurm may differ from the one used here.

Download the latest stable release of Slurm (slurm-17.11.2.tar.bz2 at the time of writing) from the Slurm downloads page and place it in the shared folder.

cd /nfs
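
If you prefer the command line, release tarballs are usually available from SchedMD's download archive (URL assumed; double-check the current version on the Slurm downloads page):

wget https://download.schedmd.com/slurm/slurm-17.11.2.tar.bz2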

If you don’t have rpmbuild yet:

yum install rpm-build python3 cpanm* gcc gcc-c++ -y

Then build the RPMs from the tarball:

rpmbuild -ta --with lua --with hwloc slurm-17.11.2.tar.bz2

We will check the rpms created by rpmbuild.

cd /root/rpmbuild/RPMS/x86_64
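
Listing this directory shows the packages that were built; the exact file names depend on the Slurm version and the OS release:

ls *.rpm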

Now, we copy the Slurm RPMs into the shared folder so that the server and compute nodes can install them.

mkdir /nfs/slurm-rpms
cp slurm-*.rpm /nfs/slurm-rpms

We also copy the Munge authentication and credential plugins from the build tree into Slurm's plugin directory, /usr/lib64/slurm/:

cp /root/rpmbuild/BUILD/slurm-17.11.2/src/plugins/auth/munge/.libs/auth_munge.so /usr/lib64/slurm/
cp /root/rpmbuild/BUILD/slurm-17.11.2/src/plugins/cred/munge/.libs/cred_munge.so /usr/lib64/slurm/

On every node that will act as the server or as a compute node, install those RPMs. In our case, every node is a compute node.

yum --nogpgcheck localinstall slurm-17.11.2-1.el7.centos.x86_64.rpm slurm-devel-17.11.2-1.el7.centos.x86_64.rpm \
    slurm-munge-17.11.2-1.el7.centos.x86_64.rpm slurm-perlapi-17.11.2-1.el7.centos.x86_64.rpm slurm-plugins-17.11.2-1.el7.centos.x86_64.rpm \
    slurm-sjobexit-17.11.2-1.el7.centos.x86_64.rpm slurm-sjstat-17.11.2-1.el7.centos.x86_64.rpm slurm-torque-17.11.2-1.el7.centos.x86_64.rpm

After we have installed Slurm on every machine, we need to configure it. Slurm's "configurator easy" form (configurator.easy.html, shipped with the Slurm documentation and hosted on the Slurm website) generates a slurm.conf for you.

I leave everything at its default except:

ControlMachine: buhpc3
ControlAddr: 128.197.115.176
NodeName: buhpc[1-6]
CPUs: 4
StateSaveLocation: /var/spool/slurmctld
SlurmctldLogFile: /var/log/slurm/slurmctld.log
SlurmdLogFile: /var/log/slurm/slurmd.log
ClusterName: buhpc

After you hit Submit on the form, you will be given the full Slurm configuration file to copy.

On the server node, which is buhpc3:

cd /etc/slurm
vim slurm.conf

Paste the configuration file generated by the form into slurm.conf. We still need to change one thing in that file.

Underneath the "# COMPUTE NODES" section of slurm.conf, we see that the generated file expects Slurm to determine the node addresses automatically with this one line:

NodeName=buhpc[1-6] CPUs=4 State=UNKNOWN

My nodes' IP addresses are not in sequential order, so I delete that one line and list each node explicitly:

NodeName=buhpc1 NodeAddr=128.197.115.158 CPUs=4 State=UNKNOWN
NodeName=buhpc2 NodeAddr=128.197.115.7 CPUs=4 State=UNKNOWN
NodeName=buhpc3 NodeAddr=128.197.115.176 CPUs=4 State=UNKNOWN
NodeName=buhpc4 NodeAddr=128.197.115.17 CPUs=4 State=UNKNOWN
NodeName=buhpc5 NodeAddr=128.197.115.9 CPUs=4 State=UNKNOWN
NodeName=buhpc6 NodeAddr=128.197.115.15 CPUs=4 State=UNKNOWN

After you explicitly put in the NodeAddr IP addresses, you can save and quit. Here is my full slurm.conf:

# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=buhpc3
ControlAddr=128.197.115.176
#
#MailProg=/bin/mail
MpiDefault=none
#MpiParams=ports=#-#
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=buhpc
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm/slurmctld.log
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm/slurmd.log
#
#
# COMPUTE NODES
NodeName=buhpc1 NodeAddr=128.197.115.158 CPUs=4 State=UNKNOWN
NodeName=buhpc2 NodeAddr=128.197.115.7 CPUs=4 State=UNKNOWN
NodeName=buhpc3 NodeAddr=128.197.115.176 CPUs=4 State=UNKNOWN
NodeName=buhpc4 NodeAddr=128.197.115.17 CPUs=4 State=UNKNOWN
NodeName=buhpc5 NodeAddr=128.197.115.9 CPUs=4 State=UNKNOWN
NodeName=buhpc6 NodeAddr=128.197.115.15 CPUs=4 State=UNKNOWN
PartitionName=debug Nodes=buhpc[1-6] Default=YES MaxTime=INFINITE State=UP

Now that the server node has the correct slurm.conf, we need to send this file to the other compute nodes.

scp slurm.conf root@buhpc1:/etc/slurm/slurm.conf
scp slurm.conf root@buhpc2:/etc/slurm/slurm.conf
scp slurm.conf root@buhpc4:/etc/slurm/slurm.conf
scp slurm.conf root@buhpc5:/etc/slurm/slurm.conf
scp slurm.conf root@buhpc6:/etc/slurm/slurm.conf

Or, if you manage the cluster with xCAT, you can run this on the management node to push the file to all nodes:

xdcp all /etc/slurm/slurm.conf /etc/slurm/slurm.conf

Now, we will configure the server node, buhpc3. We need to make sure that the server has all the right configurations and files.

mkdir /var/spool/slurmctld
chown slurm: /var/spool/slurmctld
chmod 755 /var/spool/slurmctld
mkdir  /var/log/slurm
touch /var/log/slurm/slurmctld.log
touch /var/log/slurm/slurm_jobacct.log /var/log/slurm/slurm_jobcomp.log
chown -R slurm:slurm /var/log/slurm

Now, we will configure all the compute nodes, buhpc[1-6]. We need to make sure that all the compute nodes have the right configurations and files.

mkdir /var/spool/slurmd
chown slurm: /var/spool/slurmd
chmod 755 /var/spool/slurmd
mkdir /var/log/slurm
touch /var/log/slurm/slurmd.log
chown -R slurm:slurm /var/log/slurm

Use the following command to make sure that slurmd is configured properly.

slurmd -C

You should get something like this:

ClusterName=(null) NodeName=buhpc3 CPUs=4 Boards=1 SocketsPerBoard=2 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=7822 TmpDisk=45753
UpTime=13-14:27:52

The firewall will block connections between the nodes, so I normally disable firewalld on all of the compute nodes except the server node, buhpc3.

systemctl stop firewalld
systemctl disable firewalld

On the server node, buhpc3, I usually open the default ports that Slurm uses:

firewall-cmd --permanent --zone=public --add-port=6817/udp
firewall-cmd --permanent --zone=public --add-port=6817/tcp
firewall-cmd --permanent --zone=public --add-port=6818/tcp
firewall-cmd --permanent --zone=public --add-port=6818/udp
firewall-cmd --permanent --zone=public --add-port=7321/tcp
firewall-cmd --permanent --zone=public --add-port=7321/udp
firewall-cmd --reload

If opening the ports does not work, stop firewalld while testing. Next, we need to make sure that the clocks are in sync across the cluster. On every node:

yum install ntp -y
chkconfig ntpd on
ntpdate pool.ntp.org
systemctl start ntpd

Create the cluster in Slurm's accounting database (the name should match the ClusterName in slurm.conf; this step only applies if you have set up accounting with slurmdbd, which the configuration above does not use):

sacctmgr create cluster clustername

The clocks should be synced, so we can try starting Slurm! On all the compute nodes, buhpc[1-6]:

systemctl enable slurmd.service
systemctl start slurmd.service
systemctl status slurmd.service

Now, on the server node, buhpc3:

systemctl enable slurmctld.service
systemctl start slurmctld.service
systemctl status slurmctld.service

When you check the status of slurmd and slurmctld, you should see whether they started successfully. If there are problems, check the logs!

Compute node bugs: tail /var/log/slurm/slurmd.log
Server node bugs: tail /var/log/slurm/slurmctld.log
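
Once slurmctld and all of the slurmd daemons are running, a quick way to confirm the cluster is healthy is to run sinfo on the server node; all six nodes should appear in the debug partition in the idle state. Nodes stuck in a down or unknown state usually point to a Munge, firewall, or clock problem.

sinfo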

 

Use Slurm

To display the compute nodes:

scontrol show nodes

The -N option lets you choose how many compute nodes to use. To run a job from the server node, buhpc3:

srun -N5 /bin/hostname
buhpc3
buhpc2
buhpc4
buhpc5
buhpc1

To display the job queue:

scontrol show jobs
JobId=16 JobName=hostname
UserId=root(0) GroupId=root(0)
Priority=4294901746 Nice=0 Account=(null) QOS=(null)
JobState=COMPLETED Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2016-04-10T16:26:04 EligibleTime=2016-04-10T16:26:04
StartTime=2016-04-10T16:26:04 EndTime=2016-04-10T16:26:04
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=debug AllocNode:Sid=buhpc3:1834
ReqNodeList=(null) ExcNodeList=(null)
NodeList=buhpc[1-5]
BatchHost=buhpc1
NumNodes=5 NumCPUs=20 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=20,node=5
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) Gres=(null) Reservation=(null)
Shared=0 Contiguous=0 Licenses=(null) Network=(null)
Command=/bin/hostname
WorkDir=/root
Power= SICP=0

To submit script jobs, create a script file that contains the commands that you want to run. Then:

sbatch -N2 script-file
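
For example, a minimal script-file (here called test.job; the name and contents are only an illustration) could look like this:

#!/bin/bash
#SBATCH --job-name=hostname-test
#SBATCH --nodes=2
#SBATCH --output=hostname-test.out

srun /bin/hostname

Submitting it with sbatch test.job queues the job; options passed on the command line, like the -N2 above, override the matching #SBATCH directives in the script.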

Slurm has a lot of useful commands. You may have heard of other queuing tools like Torque. Here's a useful link comparing the commands: http://www.sdsc.edu/~hocks/FG/PBS.slurm.html