frontier

Neil Lindquist edited this page Aug 1, 2023 · 7 revisions

Frontier and Crusher at Oak Ridge National Laboratory

OLCF currently provides the same software stack on Frontier and Crusher, since Crusher is a development version of Frontier, so the same configuration works on both machines. Note, however, that account numbers on Crusher have _crusher appended.

Compiling

The SLATE repo can be cloned as

git clone --recursive https://github.com/icl-utk-edu/slate.git

Currently, the GCC stack appears to be the most robust. It can be configured as:

module load craype-accel-amd-gfx90a
module load PrgEnv-gnu rocm
export MPICH_GPU_SUPPORT_ENABLED=1
export CPATH=${ROCM_PATH}/include:${CPATH}
export LIBRARY_PATH=${ROCM_PATH}/lib:$LIBRARY_PATH
export LD_LIBRARY_PATH="${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}"

cat > make.inc << END
CXX=CC
FC=ftn
CXXFLAGS+=-I${ROCM_PATH}/include -craype-verbose -g
LDFLAGS+=-L${ROCM_PATH}/lib -craype-verbose
blas=libsci
gpu_backend=hip
hip_arch=gfx90a
mpi=cray
END
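
Because the heredoc above is unquoted, ${ROCM_PATH} is expanded when make.inc is written, so it is worth confirming the file came out as intended before building. The following is a minimal sketch of such a check, not part of SLATE: missing_keys is a hypothetical helper name, and the list of keys to look for is simply the one used in the make.inc above.

```shell
# Hypothetical helper (not part of SLATE): print each KEY that has no
# assignment in a make.inc-style file.
# Usage: missing_keys FILE KEY...
missing_keys() {
    local file=$1
    shift
    local key
    for key in "$@"; do
        # Accept "KEY=" or "KEY+=" at the start of a line,
        # with optional whitespace before the operator.
        grep -Eq "^${key}[[:space:]]*\+?=" "$file" || echo "$key"
    done
}
```

For example, missing_keys make.inc CXX FC blas gpu_backend hip_arch mpi prints nothing when every key is present, and otherwise prints one missing key per line.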

Then, the tester can be compiled with nice make -j 8 tester.

Running

The SLATE tests can be run as

srun -A $ACCOUNT_NUM -t 5:00 -J slate_example -N 1 -n 4 -c 7 --gpus-per-node=8 --ntasks-per-gpu=1 --threads-per-core=1 --gpu-bind=closest test/tester  --type d --nb 512 gesv

Alternatively, the job can be submitted as a batch script.

#!/bin/bash
# Replace ACCOUNT_NUM with your project account; #SBATCH lines are not shell-expanded.
#SBATCH -A ACCOUNT_NUM
#SBATCH -t 5:00
#SBATCH -J slate_example
#SBATCH -o %x-%j.out
#SBATCH -N 1

srun -n 4 -c 7 --gpus-per-node=4 --ntasks-per-gpu=1 --threads-per-core=1 --gpu-bind=closest test/tester --type d --nb 512 gesv
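
When scaling the batch job to more nodes, the total rank count passed to -n has to grow with -N. The sketch below only prints an srun line in the style of the one above, without running it. slate_srun_cmd is a hypothetical helper name, and the sketch assumes one rank per GPU and the same -c 7 and tester arguments as in the batch script; adjust these for your own runs.

```shell
# Hypothetical helper (not part of SLATE): print, but do not run, an srun
# line for the SLATE tester.
# Arguments: number of nodes, MPI ranks per node (assumed one GPU per rank).
slate_srun_cmd() {
    local nodes=$1
    local ranks_per_node=$2
    # Total ranks across all nodes, passed to srun as -n.
    local ntasks=$((nodes * ranks_per_node))
    echo "srun -N ${nodes} -n ${ntasks} -c 7 --gpus-per-node=${ranks_per_node}" \
         "--ntasks-per-gpu=1 --threads-per-core=1 --gpu-bind=closest" \
         "test/tester --type d --nb 512 gesv"
}
```

Calling slate_srun_cmd 1 4 reproduces the srun line in the batch script above with -N 1 added explicitly; the #SBATCH -N value would need to be set to the same node count.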