The Run 3 validation framework is a tool for easy execution, testing and validation of Run 3 analysis code on large local samples.
Its features include:
- simple specification of input datasets,
- simple configuration and activation of analysis tasks,
- easy generation of the O2 command for complex workflow topologies,
- job parallelisation,
- output merging,
- error checking and reporting,
- specification of postprocessing.
It also provides tools for:
- post mortem debugging of failing jobs,
- comparison of histograms between ROOT files,
- visualisation of workflow dependencies,
- downloading of data samples from the Grid,
- maintenance of Git repositories and installations of aliBuild packages.
The original purpose of the Run 3 validation framework was to provide a compact and flexible tool for validation of the O2(Physics) analysis framework by comparison of its output to its AliPhysics counterpart. The general idea is to run the same analysis using AliPhysics and O2(Physics) and produce comparison plots.
However, it can be used without AliPhysics as well to run O2 analyses locally, similar to running trains on AliHyperloop. This makes it a convenient framework for local development, testing and debugging of O2(Physics) code.
The validation framework is a general configurable platform that gives the user full control over what is done. Its flexibility is enabled by strict separation of its specialised components into a system of Bash scripts. Configuration is separate from execution code, input configuration is separate from task configuration, and execution steps are separate from the main steering code.
- The steering script runtest.sh provides control parameters and an interface to the machinery for task execution.
- The user provides configuration Bash scripts which:
  - modify control parameters,
  - produce modified configuration files,
  - generate step scripts executed by the framework in the validation steps.
Execution code can be found in the exec directory.
The user should not touch anything in this directory!
The steering script runtest.sh performs the following execution steps:
- Load input specification.
- Load tasks configuration.
- Print out input description.
- Clean before running. (activated by DOCLEAN=1)
  - Deletes specified files (produced by previous runs).
- Generate list of input files.
- Modify the JSON file.
- Convert AliESDs.root to AO2D.root. (activated by DOCONVERT=1)
  - Executes the AliPhysics conversion macro in parallel jobs.
  - Specified input AliESDs.root files are converted into AO2D.root files in the output_conversion directory.
- Run AliPhysics tasks. (activated by DOALI=1)
  - Executes the AliPhysics step script in parallel jobs.
  - Produces the AnalysisResults_ALI.root file, resulting from merging output files in the output_ali directory.
- Run O2 tasks. (activated by DOO2=1)
  - Executes the O2 step script in parallel jobs.
  - Produces the AnalysisResults_O2.root file, resulting from merging output files in the output_o2 directory.
  - If SAVETREES=1, tables are saved as trees in the AnalysisResults_trees_O2.root file.
  - Parameters of individual tasks are picked up from the JSON configuration file (dpl-config.json by default).
  - By default, the list of input files includes files produced by the conversion step.
    - In case you want to use AO2D.root files as input directly, you can set INPUT_IS_O2=1 in your input specification and use it in your configuration to deactivate incompatible steps (typically the conversion and AliPhysics tasks).
- Run output postprocessing. (activated by DOPOSTPROCESS=1)
  - Executes the postprocessing step script.
  - This step typically compares AliPhysics and O2 output and produces plots.
- Clean after running. (activated by DOCLEAN=1)
  - Deletes specified (temporary) files.
- Done
  - This step is just a visual confirmation that all steps have finished without errors.
All steps are activated by default. Some of them can be disabled individually by setting the respective activation variables to 0 in the user's task configuration.
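As a minimal sketch of how the activation variables might be used in a task configuration, the following Bash fragment switches off the conversion and AliPhysics steps when the input is already in the O2 format. The variable names are taken from the step list above; the surrounding logic is an assumption for illustration, not the framework's actual code.

```shell
#!/bin/bash
# Sketch only: in the framework, INPUT_IS_O2 would be set in the input specification.
INPUT_IS_O2=1

# Activation switches (all steps are active by default).
DOCLEAN=1
DOCONVERT=1
DOALI=1
DOO2=1
DOPOSTPROCESS=1

# Deactivate the steps that are incompatible with O2 input.
if [ "$INPUT_IS_O2" -eq 1 ]; then
  DOCONVERT=0
  DOALI=0
fi

echo "DOCONVERT=$DOCONVERT DOALI=$DOALI DOO2=$DOO2"
```

Keeping this logic in the task configuration means that the same configuration can be reused for both ESD and AO2D input.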
The steering script runtest.sh can be executed with the following optional arguments:
bash [<path>/]runtest.sh [-h] [-i <input-configuration>] [-t <task-configuration>] [-d]
- <input-configuration>: Input specification script. See Input specification.
  - Defaults to config_input.sh (in the current directory).
- <task-configuration>: Task configuration script. See Task configuration.
  - Defaults to config_tasks.sh (in the current directory).
- -d: Debug mode. Prints out more information about settings and execution.
- -h: Help. Prints out the usage specification above.
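For illustration, the sketch below assembles a full command line with both configuration scripts and the debug option. The repository location, the assumption that runtest.sh sits in its exec directory, and the configuration file names are hypothetical; the sketch only prints the command and does not run it.

```shell
#!/bin/bash
# Hypothetical location of the cloned repository; adjust to your setup.
DIR_REPO="$HOME/Run3Analysisvalidation"
# Assumption: the steering script is located in the exec directory.
RUNTEST="$DIR_REPO/exec/runtest.sh"

# Hypothetical configuration scripts in the current directory.
CMD=(bash "$RUNTEST" -i config_input_test.sh -t config_tasks_test.sh -d)

# Print the command. To actually run it, execute: "${CMD[@]}"
echo "${CMD[*]}"
```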
The input specification script is a Bash script that sets input parameters used by the steering script.
This script defines which data will be processed and how.
These are the available input parameters and their default values:
- INPUT_LABEL="nothing": Input description
- INPUT_DIR="$PWD": Input directory
- INPUT_FILES="AliESDs.root": Input file pattern
- INPUT_SYS="pp": Collision system ("pp", "PbPb")
- INPUT_RUN=2: LHC Run (2, 3, 5)
- INPUT_IS_O2=0: Input files are in O2 format.
- INPUT_IS_MC=0: Input files are MC data.
- INPUT_PARENT_MASK="": Path replacement mask for the input directory of parent files in case of linked derived O2 input. Set to ";" if no replacement needed.
- INPUT_TASK_CONFIG="": Input-specific task configuration (e.g. enabling converters), overriding the task configuration. String of space-separated commands.
- JSON="dpl-config.json": O2 device configuration
This allows you to define several input datasets and switch between them easily by setting the corresponding value of INPUT_CASE.
Other available parameters allow you to specify how many input files to process and how to parallelise the job execution.
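A minimal sketch of what such an input specification could look like is shown below. It sets the documented defaults and then overrides some of them depending on INPUT_CASE. All dataset labels and paths are hypothetical, and the use of a case statement is an assumption about how the switching might be organised.

```shell
#!/bin/bash
# Sketch of an input specification; labels and paths are hypothetical.

# Documented default values
INPUT_LABEL="nothing"
INPUT_DIR="$PWD"
INPUT_FILES="AliESDs.root"
INPUT_SYS="pp"
INPUT_RUN=2
INPUT_IS_O2=0
INPUT_IS_MC=0
JSON="dpl-config.json"

# Select one of the datasets defined below.
INPUT_CASE=2

case $INPUT_CASE in
  1)
    INPUT_LABEL="Run 2, pp, real data (hypothetical)"
    INPUT_DIR="/data/run2/pp/real"
    ;;
  2)
    INPUT_LABEL="Run 2, pp, MC (hypothetical)"
    INPUT_DIR="/data/run2/pp/mc"
    INPUT_IS_MC=1
    ;;
  3)
    INPUT_LABEL="Run 3, pp, real data in O2 format (hypothetical)"
    INPUT_DIR="/data/run3/pp/real"
    INPUT_FILES="AO2D.root"
    INPUT_RUN=3
    INPUT_IS_O2=1
    ;;
  *)
    echo "Unknown input case: $INPUT_CASE"
    ;;
esac

echo "Selected input: $INPUT_LABEL ($INPUT_DIR)"
```

Switching to another dataset then only requires changing the value of INPUT_CASE.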
The task configuration script is a Bash script that modifies the task parameters used by the steering script.
This script defines which validation steps will run and what they will do.
- It cleans the directory, deactivates incompatible steps, modifies the JSON file and generates step scripts.
- The body of the script has to provide these mandatory functions:
  - Clean: Performs cleanup before and after running.
  - AdjustJson: Modifies the JSON file (e.g. selection cut activation).
  - MakeScriptAli: Generates the AliPhysics step script script_ali.sh.
  - MakeScriptO2: Generates the O2 step script script_o2.sh.
  - MakeScriptPostprocess: Generates the postprocessing step script script_postprocess.sh (e.g. plotting).
- The Clean function takes one argument: $1=1 for cleaning before running, $1=2 for cleaning after running.
- The AliPhysics and O2 step scripts take two arguments: $1="<input-file>", $2="<JSON-file>".
- The postprocessing step script takes two arguments: $1="<O2-output-file>", $2="<AliPhysics-output-file>".
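The interface described above can be sketched as a skeleton of a task configuration script. The function names, the step script names and the argument conventions follow the description; the function bodies are placeholders invented for illustration and do not reproduce the framework's actual implementation. The sketch works in a scratch directory so that it can be tried out safely.

```shell
#!/bin/bash
# Sketch of the mandatory functions of a task configuration script.
# All function bodies are placeholders, not the framework's actual code.

# Work in a scratch directory so that the sketch does not touch real files.
cd "$(mktemp -d)" || exit 1

JSON="dpl-config.json"

# $1=1: cleaning before running, $1=2: cleaning after running
Clean() {
  case "$1" in
    1) rm -f script_ali.sh script_o2.sh script_postprocess.sh ;;
    2) rm -f ./*.tmp ;; # hypothetical temporary files
  esac
}

# Placeholder: a real configuration would edit the file "$JSON" here.
AdjustJson() {
  :
}

# The AliPhysics and O2 step scripts receive: $1 = input file, $2 = JSON file.
MakeScriptAli() {
  cat << 'EOF' > script_ali.sh
#!/bin/bash
echo "AliPhysics step: input $1, JSON $2"
EOF
}

MakeScriptO2() {
  cat << 'EOF' > script_o2.sh
#!/bin/bash
echo "O2 step: input $1, JSON $2"
EOF
}

# The postprocessing script receives: $1 = O2 output, $2 = AliPhysics output.
MakeScriptPostprocess() {
  cat << 'EOF' > script_postprocess.sh
#!/bin/bash
echo "Postprocessing: O2 $1, AliPhysics $2"
EOF
}

# Call the functions roughly in the order of the validation steps.
Clean 1
AdjustJson
MakeScriptAli
MakeScriptO2
MakeScriptPostprocess
O2_MESSAGE="$(bash script_o2.sh AO2D.root "$JSON")"
echo "$O2_MESSAGE"
Clean 2
```

In the framework itself, the steering script calls these functions and executes the generated step scripts; the explicit calls at the end only stand in for that.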
Configuration that should be defined in the task configuration includes:
- Deactivation of the validation steps (DOCLEAN, DOCONVERT, DOALI, DOO2, DOPOSTPROCESS)
- Customisation of the commands loading the AliPhysics, O2Physics and postprocessing environments (ENV_ALI, ENV_O2, ENV_POST). By default the latest builds of AliPhysics, O2Physics and ROOT are used, respectively.
- Any other parameters related to "what should run and how", e.g. SAVETREES, MAKE_GRAPH, USEALIEVCUTS
The full O2 command, executed in the O2 step script to run the activated O2 workflows, is generated in the MakeScriptO2 function using a dedicated Python script make_command_o2.py.
This script generates the command using a YAML database (workflows.yml) that specifies workflow options and how workflows depend on each other.
You can consider a workflow specification in this database to be the equivalent of a wagon definition on AliHyperloop, including the definition of the wagon name, the workflow name, the dependencies and the derived data. The main difference is that the device configuration is stored in the JSON file.
The workflow database has two sections: options and workflows.
The options section defines global options, used once at the end of the command, and local options, used for every workflow.
The workflows section contains the "wagon" definitions.
The available parameters are:
- executable: Workflow command, if different from the "wagon" name
  - This allows you to define multiple wagons for the same workflow.
- dependencies: Direct dependencies (i.e. other wagons directly needed to run this wagon)
  - Allowed formats: string, list of strings
  - Direct dependencies are wagons that produce tables consumed by this wagon. You can figure them out using the find_dependencies.py script in O2Physics.
- requires_mc: Boolean parameter to specify whether the workflow can only run on MC
- options: Command line options. (Currently not supported on AliHyperloop.)
  - Allowed formats: string, list of strings, dictionary with keys default, real, mc
- tables: Descriptions of output tables to be saved as trees
  - Allowed formats: string, list of strings, dictionary with keys default, real, mc
The make_command_o2.py script allows you to generate a topology graph to visualise the dependencies defined in the database, using Graphviz.
Generation of the topology graph can be conveniently enabled with MAKE_GRAPH=1 in the task configuration.
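To make the structure of the workflow database more concrete, the sketch below writes a small YAML file following the description above and lists the wagons it defines. The layout is inferred from that description only; all wagon names, option strings and table descriptions are hypothetical, and the real workflows.yml may differ in detail.

```shell
#!/bin/bash
# Write a hypothetical workflow database to a temporary file.
DATABASE="$(mktemp)"
cat << 'EOF' > "$DATABASE"
options:
  global: ""                       # used once at the end of the command
  local:                           # used for every workflow
    - "--hypothetical-local-option"
workflows:
  o2-analysis-producer-example:    # hypothetical wagon producing a table
    tables: EXAMPLETABLE
  o2-analysis-task-example-mc:     # hypothetical second wagon for one workflow
    executable: o2-analysis-task-example
    dependencies: o2-analysis-producer-example
    requires_mc: true
    options:
      default: "--hypothetical-option"
      mc: "--hypothetical-mc-option"
EOF

# List the wagon names, i.e. the keys indented by two spaces under "workflows".
WAGONS="$(sed -n '/^workflows:/,$p' "$DATABASE" | grep -E '^  [^ #]+:')"
echo "$WAGONS"
```

The second wagon shows the role of executable: the wagon name differs from the workflow command, so several wagons can share one workflow.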
Dummy examples of the configuration files can be found in:
Follow the official AliPhysics installation and O2(Physics) installation instructions.
Make sure the AliPhysics and O2Physics environments can be entered using the following respective commands.
alienv enter AliPhysics/latest
alienv enter O2Physics/latest
git clone --origin upstream https://github.com/AliceO2Group/Run3Analysisvalidation.git
cd Run3Analysisvalidation
- Create your fork repository on GitHub.
- Add it as your remote:
git remote add origin git@github.com:<your-github-username>/Run3Analysisvalidation.git
The execution of validation steps is parallelised using the GNU Parallel tool. You need to have it installed on your machine to run the code in parallel jobs. You can install GNU Parallel on Debian/Ubuntu-based systems with:
sudo apt install parallel
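A quick way to check whether a parallel command is available before running the framework is sketched below. The check only tests that some executable called parallel is on the PATH; the variable name HAS_PARALLEL is introduced here for illustration.

```shell
#!/bin/bash
# Check whether a "parallel" executable can be found on the PATH.
if command -v parallel > /dev/null 2>&1; then
  HAS_PARALLEL=1
  echo "parallel found: $(command -v parallel)"
else
  HAS_PARALLEL=0
  echo "parallel not found. Install GNU Parallel before running the framework."
fi
```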
Now you are ready to run the validation code.
Make sure that your Bash environment is clean! Do not load ROOT, AliPhysics, O2, O2Physics or any other aliBuild package environment before running the framework!
Enter any directory and execute the steering script runtest.sh.
(You can create a symlink for convenience.)
All the processing will take place in the current directory.
The JSON file is expected in the current directory unless specified otherwise.
The variable DIR_TASKS stores the path to the task configuration script directory.
It can be used inside that script to refer to other files in the same directory (e.g. cleaning script, ROOT macros).
Use the debug command line option -d to see more details in the terminal.
If everything went fine, the script will exit with the message Done and you should have all the output files in the current directory.
If any step fails, the script will display an error message and you should look into the respective log file to investigate the problem.
If the main log file of a validation step mentions "parallel: This job failed:", inspect the respective log file in the directory of the corresponding job.
To add a new workflow in the framework configuration, you need to follow these steps.
- Add the workflow in the task configuration:
  - Add the activation switch: DOO2_...=0 # name of the workflow (without o2-analysis).
  - Add the application of the switch in the MakeScriptO2 function: [ $DOO2_... -eq 1 ] && WORKFLOWS+=" o2-analysis-...".
  - If needed, add lines in the AdjustJson function to modify the JSON configuration.
- Add the workflow specification in the workflow database:
  - See the dummy example o2-analysis-workflow for the full list of options.
- Add the device configuration in the default JSON file.
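The first step can be sketched as follows for two made-up workflows. The switch and workflow names (DOO2_EXAMPLE, DOO2_OTHER, o2-analysis-example, o2-analysis-other) are hypothetical, and the MakeScriptO2 function is reduced to the part that collects the activated workflows.

```shell
#!/bin/bash
# Activation switches; the workflow names are hypothetical.
DOO2_EXAMPLE=1 # example
DOO2_OTHER=0   # other

# Reduced sketch of MakeScriptO2: only collects the activated workflows.
MakeScriptO2() {
  WORKFLOWS=""
  [ $DOO2_EXAMPLE -eq 1 ] && WORKFLOWS+=" o2-analysis-example"
  [ $DOO2_OTHER -eq 1 ] && WORKFLOWS+=" o2-analysis-other"
  echo "Activated workflows:$WORKFLOWS"
}

MakeScriptO2
```

Only the workflow whose switch is set to 1 ends up in WORKFLOWS, so a new workflow stays inactive until its switch is enabled.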
If you run many parallelised jobs and some of them don't finish successfully, you can make use of the debugging script debug.sh in the exec directory, which can help you figure out what went wrong, where and why.
You can execute the script from the current working directory using the following syntax (options can be combined):
bash [<path>/]debug.sh [-h] [-t TYPE] [-b [-u]] [-f] [-w] [-e]
- -h: Print out the usage help.
- TYPE: Job type: conversion, ali, o2 (o2 by default)
- -b: Show bad jobs (without output file or successful end).
- -u: Mark unfinished jobs (running, hanging, aborted). (Requires -b.)
- -f: Show input files of bad jobs.
- -w: Show warnings (for all jobs).
- -e: Show errors (for all jobs).
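Two possible combinations of these options are sketched below: one listing bad O2 jobs together with their input files and marking the unfinished ones, and one showing warnings and errors of the AliPhysics jobs. The repository location is hypothetical, and the sketch only prints the commands instead of running them.

```shell
#!/bin/bash
# Hypothetical location of the cloned repository; adjust to your setup.
DEBUG_SCRIPT="$HOME/Run3Analysisvalidation/exec/debug.sh"

# Bad O2 jobs, with unfinished jobs marked and input files shown.
CMD_BAD=(bash "$DEBUG_SCRIPT" -t o2 -b -u -f)
# Warnings and errors of the AliPhysics jobs.
CMD_LOG=(bash "$DEBUG_SCRIPT" -t ali -w -e)

# Print the commands. To run one of them, execute e.g. "${CMD_BAD[@]}"
echo "${CMD_BAD[*]}"
echo "${CMD_LOG[*]}"
```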
Enter the codeHF directory and see the README.
Enter the codeJE directory.
With the ongoing fast development, it can easily happen that updating the O2Physics part of the validation also requires updating the O2 and AliPhysics installations, which in turn requires updating the alidist recipes. Also, when requesting changes in the main repository via a pull request, it is strongly recommended to update one's personal fork repository first, apply the changes on top of the main branch and rebuild the installation to make sure that the new commits can be seamlessly merged into the main repository.
All these maintenance steps can be fully automated using the update_packages.py Python script,
which takes care of keeping your (local and remote) repositories
and installations up to date with the latest development in the respective main branches.
This includes updating alidist, AliPhysics, O2(Physics), and this Run 3 validation code repository,
as well as re-building your AliPhysics and O2(Physics) installations via aliBuild and deleting obsolete builds.
You can execute the script from any directory on your system using the following syntax:
python [<path>/]exec/update_packages.py [-h] [-d] [-l] [-c] database
optional arguments:
- -h, --help: show the help message and exit
- -d, --debug: print debugging info
- -l: print latest commits and exit
- -c: print configuration and exit
The positional argument database is a YAML database with configuration and options.
The Run3Analysisvalidation repository provides a ready-to-use configuration file with a full list of options at config/packages.yml.
All you need to do is to make sure that the settings in the database correspond to your local setup and
adjust the activation switches, if needed, to change the list of steps to be executed.
By default, all packages are activated for build and update.
If you are happy with the configuration, you can then start the script and it will take care of the full update of your code and installations for all the activated packages.
If your repository is currently on a feature branch (different from the main branch), both the main branch and your feature branch will be updated from the main branch in the main (upstream) remote repository. The main and the current branch are each first updated from their respective counterparts in the fork remote repository and then from the upstream main branch. The updated history of each branch is then force-pushed to the fork repository. This allows for synchronisation across machines where commits pushed to the fork repository from another machine are first incorporated locally before pushing new commits. All your personal changes (committed and uncommitted) are preserved via rebasing and stashing. Check the description of the script behaviour inside the script itself for more details.
If clean: 1, obsolete builds are deleted from the sw directory at the end.
If clean_purge: 1, a deeper purging is done by deleting all builds that are not needed to run the latest AliPhysics and O2(Physics) builds.
WARNING: Do not enable the purging if you need to keep several builds of AliPhysics or O2(Physics) (e.g. for different branches or commits) or builds of other development packages not specified in your configuration!
If any error occurs during the script execution, the script will report the error and exit immediately.
You can easily extend the script to include any other local Git repository and any other aliBuild development package on your machine that you wish to be updated in the same way.
Validity and quality of the code in the repository are checked on GitHub by several tools (linters) that support many coding languages. Linters run automatically for every push or pull request event. Please make sure that your code passes all the tests before making a pull request.
It is possible to check your code locally (before even committing or pushing):
bash [<path>/]exec/check_spaces.sh
clang-format -style=file -i <file>
npx mega-linter-runner