Showing 72 changed files with 5,442 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
This repository documents the software stack and the experimental setup | ||
of the manuscript: Automatic Code Generation for High-Performance | ||
Discontinuous Galerkin Methods on Modern Architectures. | ||
|
||
It provides the technical details of how the results of the manuscript were | ||
achieved, but reproduction might require adapting its scripts and configuration | ||
files to your computing environment. | ||
|
||
In general the procedure is: | ||
* Clone this repository recursively | ||
* Run ./patch.sh | ||
* Run ./build_{haswell,skylake}.sh | ||
* Go into the individual subfolders of build/dune-codegen-paper and apply the following steps to produce a plot: | ||
  * Run the execution script (like `donkey.sbatch` or `skylake.sh`) | ||
    (watch out for requirements on the working directory) | ||
  * If present, run `process.sh` to distill the data | ||
  * Copy the resulting `csv` files into the source and commit them through `git-lfs` | ||
  * Run `plot.py` in a suitable environment (e.g. through `run-in-dune-env`) to generate PDFs. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
ml gcc/6.4.0 | ||
ml benchmark/1.4.0 | ||
ml python/3.6.3 | ||
ml openmpi | ||
ml cmake | ||
ml openblas | ||
ml metis | ||
ml suite-sparse | ||
ml superlu | ||
ml parmetis | ||
|
||
SuiteSparse_ROOT=$SUITESPARSE_DIR | ||
MAKE_FLAGS=-j40 ./dune-common/bin/dunecontrol --module=dune-codegen-paper --builddir=$(pwd)/build --opts=./opts/haswell.opts all | ||
|
||
pushd build/dune-codegen-paper | ||
make -j40 haswell_poisson | ||
make -j40 haswell_stokes | ||
make -j40 costmodel_verification_poisson | ||
make -j40 costmodel_verification_poisson_skeleton | ||
popd |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
MAKE_FLAGS=-j40 ./dune-common/bin/dunecontrol --opts=skylake.opts --builddir=$(pwd)/build --module=dune-codegen-paper all | ||
|
||
# Spin up the autotune build server, which runs exclusively on socket 0, and have code generation happen exclusively | ||
# on socket 1. The two communicate through the file system; see the server implementation. | ||
hwloc-bind socket:0 ./dune-codegen-paper/autotune-scripts/autotune_build_server.py mpirun --bind-to core -np 20 & | ||
sleep 0.2 | ||
|
||
AUTOTUNE_MAKE="hwloc-bind socket:1 make -j20" | ||
|
||
pushd build/dune-codegen-paper | ||
$AUTOTUNE_MAKE skylake_poisson | ||
$AUTOTUNE_MAKE skylake_stokes | ||
popd | ||
|
||
echo "exit" >> tasks.txt |
Submodule dune-codegen updated from ff379a to e713c6
48 changes: 48 additions & 0 deletions
dune-codegen-paper/autotune-scripts/autotune_build_server.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
#!/usr/bin/env python | ||
|
||
""" | ||
This is a small server-like application that can run commands that were written | ||
to a file. This is necessary to separate the code generation process from the | ||
autotune run environment, using the file system for synchronization. | ||
""" | ||
|
||
import filelock | ||
import os | ||
import subprocess | ||
import sys | ||
import time | ||
|
||
filename = "tasks.txt" | ||
open(filename, "w").close() | ||
lock = "{}.lock".format(filename) | ||
arg_prefix = sys.argv[1:] | ||
|
||
keep_running = True | ||
while keep_running: | ||
with filelock.FileLock(lock): | ||
with open(filename, "r") as f: | ||
lines = f.readlines() | ||
|
||
command = None | ||
if lines: | ||
command = lines[0].strip("\n") | ||
|
||
if command == "exit": | ||
keep_running = False | ||
command = None | ||
|
||
if command is not None: | ||
subprocess.call(arg_prefix + command.split()) | ||
|
||
with filelock.FileLock(lock): | ||
with open(filename, "r") as f: | ||
lines = f.readlines() | ||
|
||
with open(filename, "w") as f: | ||
for line in lines[1:]: | ||
f.write(line) | ||
|
||
time.sleep(0.1) | ||
|
||
os.remove(filename) | ||
os.remove(lock) |
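The queue protocol used by the server above can be illustrated with the standard library alone. This is a minimal sketch, not the script from the commit: the helper names `submit`, `peek_first`, `remove_first`, `drain` and `demo` are hypothetical, commands are handed to a callback instead of `subprocess.call`, the loop returns on an empty file where the real server sleeps and polls again, and the `filelock` locking that the original wraps around every file access is left out for brevity.

```python
import os
import tempfile


def submit(path, command):
    """Append one command line to the task file (the client side)."""
    with open(path, "a") as f:
        f.write(command + "\n")


def peek_first(path):
    """Return the first queued command without removing it, or None."""
    with open(path) as f:
        lines = f.readlines()
    return lines[0].strip("\n") if lines else None


def remove_first(path):
    """Rewrite the task file without its first line."""
    with open(path) as f:
        lines = f.readlines()
    with open(path, "w") as f:
        f.writelines(lines[1:])


def drain(path, arg_prefix, run):
    """Serve tasks in FIFO order until the sentinel "exit" is read.

    A task stays in the file while it runs and is removed only
    afterwards, so a client can detect completion by polling the file.
    """
    while True:
        command = peek_first(path)
        if command is None:
            return  # sketch only: the real server would sleep and poll again
        if command != "exit":
            run(arg_prefix + command.split())
        remove_first(path)
        if command == "exit":
            return


def demo():
    executed = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tasks.txt")
        submit(path, "./benchmark_a config.ini")  # hypothetical commands
        submit(path, "./benchmark_b")
        submit(path, "exit")
        submit(path, "./never_run")
        drain(path, ["mpirun", "-np", "20"], executed.append)
        leftover = peek_first(path)
    return executed, leftover


if __name__ == "__main__":
    print(demo())
```

The point of removing a task only after it has finished is that the file content doubles as the completion signal for the submitting process.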
26 changes: 26 additions & 0 deletions
dune-codegen-paper/autotune-scripts/skylake_benchmark_wrapper.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
#!/usr/bin/env python | ||
|
||
import filelock | ||
import sys | ||
import time | ||
|
||
filename = "/home/dkempf/dune-codegen-paper-software/tasks.txt" | ||
lock = "{}.lock".format(filename) | ||
command = " ".join(sys.argv[1:]) | ||
|
||
# Submit the command into the queue | ||
with filelock.FileLock(lock): | ||
with open(filename, "a") as f: | ||
f.write("{}\n".format(command)) | ||
|
||
# Poll the queue until our command is no longer in it | ||
while True: | ||
found = False | ||
for line in open(filename, "r"): | ||
if command in line: | ||
found = True | ||
|
||
if not found: | ||
sys.exit(0) | ||
|
||
time.sleep(0.1) |
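The wrapper above decides whether its task is still pending with `command in line`, which is a substring test. The sketch below contrasts that check with a whole-line comparison; both function names and the example commands are hypothetical, and the exact-match variant is only one possible refinement, not something the commit contains.

```python
def still_queued_substring(queue_lines, command):
    """Mirror of the wrapper's check: substring containment per line."""
    return any(command in line for line in queue_lines)


def still_queued_exact(queue_lines, command):
    """Hypothetical stricter variant: compare whole lines."""
    return any(line.rstrip("\n") == command for line in queue_lines)


if __name__ == "__main__":
    # Our own task has finished, but a longer command sharing the prefix is queued
    queue = ["./poisson deg10.ini\n"]
    print(still_queued_substring(queue, "./poisson deg1"))
    print(still_queued_exact(queue, "./poisson deg1"))
```

With the substring test, a client whose command is a prefix of another queued command keeps waiting until that other task is gone as well. Whether this can occur depends on the commands the autotuner actually submits.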
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
add_subdirectory(poisson_skeleton) | ||
add_subdirectory(poisson_volume) |
5 changes: 5 additions & 0 deletions
dune-codegen-paper/costmodel-verification/poisson_skeleton/CMakeLists.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
dune_add_formcompiler_system_test(UFLFILE poisson_dg_tensor_skeleton.ufl | ||
BASENAME costmodel_verification_poisson_skeleton | ||
INIFILE costmodel_verification_poisson_skeleton.mini | ||
NO_TESTS | ||
) |
Binary file added (+11.8 KB, not shown): dune-codegen-paper/costmodel-verification/poisson_skeleton/costmodel_poissonskeleton.pdf
63 changes: 63 additions & 0 deletions
...aper/costmodel-verification/poisson_skeleton/costmodel_verification_poisson_skeleton.mini
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
__name = costmodel_verification_poisson_skeleton_{__exec_suffix} | ||
__exec_suffix = strategy{sample_str}_{opcount_suffix} | ||
opcount_suffix = opcount, nonopcount | expand opcount | ||
|
||
# All keys that define the sampling range/rate for this cost model verification plot | ||
num_samples = 101 | ||
mincost = 7500 | ||
maxcost = 40000 | ||
sample = 0 | range {num_samples} | expand | toint | ||
sample_str = {sample} | zfill 4 | ||
target = {mincost} + ({sample} / ({num_samples} - 1)) * ({maxcost} - {mincost}) | eval | ||
|
||
# Calculate the size of the grid to equilibrate it to 100 MB/rank | ||
# Input parameters | ||
dim = 3 | ||
mbperrank = 100 | ||
ranks = 32 | ||
floatingbytes = 8 | ||
|
||
# Metaini Calculations | ||
memperrank = {mbperrank} * 1048576 | eval | ||
dofsperdir = {formcompiler.ufl_variants.degree} + 1 | eval | ||
celldofs = {dofsperdir} ** {dim} | eval | ||
cellsperrank = {memperrank} / ({floatingbytes} * {celldofs}) | eval | ||
cellsperdir = {cellsperrank} ** (1/{dim}) | eval | toint | ||
firstdircells = {ranks} * {cellsperdir} | eval | ||
dimminusone = {dim} - 1 | eval | ||
ones = 1 | repeat {dimminusone} | ||
otherdircells = {cellsperdir} | repeat {dimminusone} | ||
|
||
# Setup the grid! | ||
extension = 1.0 | repeat {dim} | ||
cells = {firstdircells} {otherdircells} | ||
partitioning = {ranks} {ones} | ||
periodic = true | repeat {dim} | ||
|
||
# Set up the timing identifier | ||
identifier = deg{formcompiler.ufl_variants.degree}_strategy{sample_str} | ||
|
||
[wrapper.vtkcompare] | ||
name = {__name} | ||
extension = vtu | ||
|
||
[formcompiler] | ||
instrumentation_level = 2 | ||
opcounter = 1, 0 | expand opcount | ||
performance_measuring = 0, 1 | expand opcount | ||
|
||
[formcompiler.r] | ||
fastdg = 1 | ||
sumfact = 1 | ||
vectorization_quadloop = 1 | ||
vectorization_strategy = target | ||
vectorization_allow_quadrature_changes = 1 | ||
vectorization_target = {target} | ||
quadrature_order = {formcompiler.ufl_variants.degree} * 2 | eval | ||
assure_statement_ordering = 1 | ||
generate_jacobians = 0 | ||
matrix_free = 0 | ||
|
||
[formcompiler.ufl_variants] | ||
cell = hexahedron | ||
degree = 4 |
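The arithmetic in the meta ini file above can be reproduced in a few lines of Python. This is a sketch under two assumptions that the file itself does not state: that `range {num_samples}` yields the integers 0 to `num_samples - 1` (consistent with the 101 strategies in `doftimes.csv`), and that `toint` truncates towards zero. The function names are hypothetical.

```python
def sample_targets(mincost=7500, maxcost=40000, num_samples=101):
    """Evenly spaced vectorization targets, one per sample index."""
    return [
        mincost + (sample / (num_samples - 1)) * (maxcost - mincost)
        for sample in range(num_samples)
    ]


def grid_cells(degree=4, dim=3, mbperrank=100, ranks=32, floatingbytes=8):
    """Cells per direction for roughly `mbperrank` MB of DOFs per rank."""
    memperrank = mbperrank * 1048576          # bytes per rank
    dofsperdir = degree + 1
    celldofs = dofsperdir ** dim              # DOFs per cell
    cellsperrank = memperrank / (floatingbytes * celldofs)
    cellsperdir = int(cellsperrank ** (1 / dim))  # assumes `toint` truncates
    firstdircells = ranks * cellsperdir       # ranks are lined up along the first direction
    return [firstdircells] + [cellsperdir] * (dim - 1)


if __name__ == "__main__":
    targets = sample_targets()
    print(targets[0], targets[-1], len(targets))
    print(grid_cells())
```

For degree 4 in 3D this gives 125 DOFs per cell and 104857.6 cells per rank, whose cube root truncates to 47, so the `cells` key evaluates to `1504 47 47` with a `32 1 1` partitioning.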
101 changes: 101 additions & 0 deletions
dune-codegen-paper/costmodel-verification/poisson_skeleton/doftimes.csv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
deg4_strategy0000 4 residual_evaluation 28.914836542039204 | ||
deg4_strategy0001 4 residual_evaluation 28.786116077171844 | ||
deg4_strategy0002 4 residual_evaluation 27.296327222554076 | ||
deg4_strategy0003 4 residual_evaluation 27.37324618019523 | ||
deg4_strategy0004 4 residual_evaluation 25.07894009635136 | ||
deg4_strategy0005 4 residual_evaluation 25.15784382523873 | ||
deg4_strategy0006 4 residual_evaluation 21.215661945490687 | ||
deg4_strategy0007 4 residual_evaluation 21.03963004393431 | ||
deg4_strategy0008 4 residual_evaluation 20.739215200594472 | ||
deg4_strategy0009 4 residual_evaluation 20.718025877826648 | ||
deg4_strategy0010 4 residual_evaluation 20.854223109234244 | ||
deg4_strategy0011 4 residual_evaluation 20.745480964742892 | ||
deg4_strategy0012 4 residual_evaluation 20.20386987698219 | ||
deg4_strategy0013 4 residual_evaluation 20.168202063150176 | ||
deg4_strategy0014 4 residual_evaluation 19.880871408831737 | ||
deg4_strategy0015 4 residual_evaluation 19.782682592961823 | ||
deg4_strategy0016 4 residual_evaluation 19.309928981985856 | ||
deg4_strategy0017 4 residual_evaluation 18.81085903460431 | ||
deg4_strategy0018 4 residual_evaluation 18.760362894657337 | ||
deg4_strategy0019 4 residual_evaluation 18.825757795546068 | ||
deg4_strategy0020 4 residual_evaluation 18.690789606334594 | ||
deg4_strategy0021 4 residual_evaluation 18.641310046553446 | ||
deg4_strategy0022 4 residual_evaluation 18.88836574329448 | ||
deg4_strategy0023 4 residual_evaluation 18.787929457000736 | ||
deg4_strategy0024 4 residual_evaluation 18.814595151766436 | ||
deg4_strategy0025 4 residual_evaluation 18.757326021885202 | ||
deg4_strategy0026 4 residual_evaluation 17.35397475104736 | ||
deg4_strategy0027 4 residual_evaluation 17.478313511009237 | ||
deg4_strategy0028 4 residual_evaluation 16.981901761015838 | ||
deg4_strategy0029 4 residual_evaluation 17.03799517396567 | ||
deg4_strategy0030 4 residual_evaluation 16.618699738769656 | ||
deg4_strategy0031 4 residual_evaluation 16.9878146794563 | ||
deg4_strategy0032 4 residual_evaluation 16.819390048237306 | ||
deg4_strategy0033 4 residual_evaluation 16.80586436819416 | ||
deg4_strategy0034 4 residual_evaluation 16.596788034032823 | ||
deg4_strategy0035 4 residual_evaluation 16.865751071176543 | ||
deg4_strategy0036 4 residual_evaluation 16.850575324537054 | ||
deg4_strategy0037 4 residual_evaluation 16.368576520489924 | ||
deg4_strategy0038 4 residual_evaluation 16.329875983350362 | ||
deg4_strategy0039 4 residual_evaluation 16.33067738107672 | ||
deg4_strategy0040 4 residual_evaluation 16.26328657394992 | ||
deg4_strategy0041 4 residual_evaluation 16.564884862857824 | ||
deg4_strategy0042 4 residual_evaluation 16.89552481692433 | ||
deg4_strategy0043 4 residual_evaluation 16.811524645611254 | ||
deg4_strategy0044 4 residual_evaluation 15.857696029794866 | ||
deg4_strategy0045 4 residual_evaluation 15.668974335136726 | ||
deg4_strategy0046 4 residual_evaluation 15.975069640797507 | ||
deg4_strategy0047 4 residual_evaluation 16.062859632424562 | ||
deg4_strategy0048 4 residual_evaluation 15.095765630688737 | ||
deg4_strategy0049 4 residual_evaluation 14.987158359826916 | ||
deg4_strategy0050 4 residual_evaluation 15.15365142529533 | ||
deg4_strategy0051 4 residual_evaluation 14.41769845634266 | ||
deg4_strategy0052 4 residual_evaluation 14.325582858587321 | ||
deg4_strategy0053 4 residual_evaluation 14.414271576910732 | ||
deg4_strategy0054 4 residual_evaluation 14.121689483059358 | ||
deg4_strategy0055 4 residual_evaluation 14.27016705866571 | ||
deg4_strategy0056 4 residual_evaluation 14.155684579775654 | ||
deg4_strategy0057 4 residual_evaluation 14.415824680200656 | ||
deg4_strategy0058 4 residual_evaluation 14.439354239334211 | ||
deg4_strategy0059 4 residual_evaluation 14.354897226219567 | ||
deg4_strategy0060 4 residual_evaluation 14.013167840020731 | ||
deg4_strategy0061 4 residual_evaluation 14.41151844321232 | ||
deg4_strategy0062 4 residual_evaluation 14.033806682396909 | ||
deg4_strategy0063 4 residual_evaluation 14.18315993093047 | ||
deg4_strategy0064 4 residual_evaluation 14.404304462291838 | ||
deg4_strategy0065 4 residual_evaluation 14.136056222353657 | ||
deg4_strategy0066 4 residual_evaluation 14.316890850582423 | ||
deg4_strategy0067 4 residual_evaluation 14.189657566523545 | ||
deg4_strategy0068 4 residual_evaluation 14.45981840931552 | ||
deg4_strategy0069 4 residual_evaluation 13.882841934331323 | ||
deg4_strategy0070 4 residual_evaluation 14.188075255028135 | ||
deg4_strategy0071 4 residual_evaluation 13.832266433035754 | ||
deg4_strategy0072 4 residual_evaluation 14.079695663871965 | ||
deg4_strategy0073 4 residual_evaluation 13.957420745327852 | ||
deg4_strategy0074 4 residual_evaluation 14.147798937542175 | ||
deg4_strategy0075 4 residual_evaluation 13.915262760605422 | ||
deg4_strategy0076 4 residual_evaluation 13.816980208033899 | ||
deg4_strategy0077 4 residual_evaluation 14.181625559080294 | ||
deg4_strategy0078 4 residual_evaluation 14.058221181343917 | ||
deg4_strategy0079 4 residual_evaluation 12.839338537183789 | ||
deg4_strategy0080 4 residual_evaluation 13.147983611971688 | ||
deg4_strategy0081 4 residual_evaluation 13.209503646440332 | ||
deg4_strategy0082 4 residual_evaluation 12.477766133374358 | ||
deg4_strategy0083 4 residual_evaluation 12.850271801015912 | ||
deg4_strategy0084 4 residual_evaluation 12.663441741557138 | ||
deg4_strategy0085 4 residual_evaluation 12.71094515181195 | ||
deg4_strategy0086 4 residual_evaluation 12.57058795040682 | ||
deg4_strategy0087 4 residual_evaluation 12.888045324091083 | ||
deg4_strategy0088 4 residual_evaluation 12.907272220951395 | ||
deg4_strategy0089 4 residual_evaluation 12.693912184434208 | ||
deg4_strategy0090 4 residual_evaluation 12.785202005772998 | ||
deg4_strategy0091 4 residual_evaluation 12.907914105547931 | ||
deg4_strategy0092 4 residual_evaluation 12.739893784112773 | ||
deg4_strategy0093 4 residual_evaluation 12.640992548580334 | ||
deg4_strategy0094 4 residual_evaluation 12.674077365547818 | ||
deg4_strategy0095 4 residual_evaluation 12.3204554947976 | ||
deg4_strategy0096 4 residual_evaluation 12.413198595873704 | ||
deg4_strategy0097 4 residual_evaluation 12.35670351433441 | ||
deg4_strategy0098 4 residual_evaluation 12.380987588364926 | ||
deg4_strategy0099 4 residual_evaluation 12.506625356564644 | ||
deg4_strategy0100 4 residual_evaluation 12.257501629249035 |
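A short sketch of how such a file could be read for plotting follows. The column interpretation is an assumption, since the file carries no header: identifier, polynomial degree, measured kernel, measured value (the unit is not stated in the source). The names `parse_doftimes` and `SAMPLE` are hypothetical, and the three sample rows are copied from the data above.

```python
def parse_doftimes(text):
    """Parse whitespace-separated rows: identifier, degree, kernel, value."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        identifier, degree, kernel, value = fields
        rows.append((identifier, int(degree), kernel, float(value)))
    return rows


# Three rows copied from the file above
SAMPLE = """\
deg4_strategy0000 4 residual_evaluation 28.914836542039204
deg4_strategy0050 4 residual_evaluation 15.15365142529533
deg4_strategy0100 4 residual_evaluation 12.257501629249035
"""

if __name__ == "__main__":
    rows = parse_doftimes(SAMPLE)
    best = min(rows, key=lambda row: row[3])
    print(best[0], best[3])
```

Because the strategy index is zero-padded to four digits, sorting the identifiers lexicographically also sorts them by sample number.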
66 changes: 66 additions & 0 deletions
dune-codegen-paper/costmodel-verification/poisson_skeleton/donkey.sbatch
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
#!/bin/bash | ||
|
||
# IMPORTANT | ||
# Remember to set the working directory of this script to | ||
# the top level build directory of dune-perftool-paperplots | ||
# through: | ||
# sbatch -D <workdir> | ||
|
||
# Set a name for the job | ||
#SBATCH -J costmodel | ||
|
||
# Number of processes | ||
#SBATCH -n 32 | ||
|
||
# Choose the SLURM partition (sinfo for overview) | ||
#SBATCH -p haswell16c | ||
|
||
# Each process needs two PUs: circumvent hyperthreading | ||
#SBATCH -c 2 | ||
|
||
set -e | ||
|
||
# Load modules | ||
ml gcc/6.4.0 | ||
ml python/3.6.3 | ||
ml openmpi | ||
ml cmake | ||
ml openblas | ||
ml metis | ||
ml suite-sparse | ||
ml superlu | ||
|
||
# Pin processes to cores | ||
# (Possible values: socket, core) | ||
SRUNOPT="--cpu_bind=verbose,core" | ||
|
||
# Delete old measurement results | ||
rm -f *.csv | ||
|
||
# Search for runnable executables and execute all of them | ||
FILES=$(ls ./costmodel-verification/poisson_skeleton/*.ini) | ||
for inifile in $FILES | ||
do | ||
line=$(grep ^"opcounter = " $inifile) | ||
extract=${line##opcounter = } | ||
UPPER=10 | ||
if [ $extract -eq 1 ] | ||
then | ||
UPPER=1 | ||
fi | ||
COUNT=0 | ||
while [ $COUNT -lt $UPPER ]; do | ||
exec=${inifile%.ini} | ||
srun $SRUNOPT ./$exec $inifile | ||
COUNT=$((COUNT + 1)) | ||
done | ||
done | ||
|
||
# Process the measurement results | ||
./run-in-dune-env process_measurements.py | ||
|
||
# Move it to the correct subfolder | ||
cp floprates.csv doftimes.csv ./costmodel-verification/poisson_skeleton | ||
|
||
# And delete all intermediate data | ||
rm *.csv |
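The repetition logic of the batch script above can be sketched in Python: an executable whose ini file sets `opcounter = 1` is run once, every other executable is run ten times. The function names are hypothetical, and unlike the shell loop this sketch raises an error when no `opcounter` line is found.

```python
def repetitions(ini_text, default=10):
    """Number of runs for one ini file, following the opcounter key."""
    prefix = "opcounter = "
    for line in ini_text.splitlines():
        if line.startswith(prefix):  # same anchoring as grep ^"opcounter = "
            return 1 if int(line[len(prefix):]) == 1 else default
    raise ValueError("no 'opcounter = ' line found")


def executable_for(inifile):
    """Executable path derived from the ini path, like ${inifile%.ini}."""
    suffix = ".ini"
    return inifile[:-len(suffix)] if inifile.endswith(suffix) else inifile


if __name__ == "__main__":
    print(repetitions("opcounter = 1\nperformance_measuring = 0\n"))
    print(repetitions("opcounter = 0\nperformance_measuring = 1\n"))
    print(executable_for("./costmodel-verification/poisson_skeleton/example.ini"))
```

A plausible reading, which the script does not spell out, is that operation counts do not vary between runs while timings do, so only the timing variants are repeated.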