Parametric Neural Networks (pNNs) are a class of neural networks, developed by Baldi et al., that are mainly used for signal-background classification in High-Energy Physics (HEP). In our journal paper, we propose several improvements to the original pNN. One of them is the affine architecture, which interleaves multiple affine-conditioning layers with the rest of the network; we call the resulting architecture (dropout omitted) the AffinePNN.
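As a rough illustration of the idea, the sketch below interleaves dense layers with affine-conditioning layers in plain NumPy. It assumes a FiLM-style form in which the hidden features are scaled and shifted by linear functions of the physics parameter `m`; the function names, layer sizes, and initialization are hypothetical and are not taken from the code under `\script`.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(x, w, b):
    """Plain fully connected layer with ReLU activation."""
    return np.maximum(x @ w + b, 0.0)


def affine_conditioning(h, m, w_scale, w_bias):
    """Scale and shift features h with linear functions of m (assumed form).

    h: (batch, units) hidden features
    m: (batch, 1) physics parameter, e.g. the mass
    """
    scale = m @ w_scale  # (batch, units)
    bias = m @ w_bias    # (batch, units)
    return scale * h + bias


def init_params(num_features, units=(16, 16)):
    """Random weights for a small illustrative network."""
    hidden, fan_in = [], num_features
    for n in units:
        hidden.append({
            "w": rng.normal(0.0, 0.1, (fan_in, n)),
            "b": np.zeros(n),
            "w_scale": rng.normal(1.0, 0.1, (1, n)),
            "w_bias": rng.normal(0.0, 0.1, (1, n)),
        })
        fan_in = n
    return {
        "hidden": hidden,
        "w_out": rng.normal(0.0, 0.1, (fan_in, 1)),
        "b_out": np.zeros(1),
    }


def affine_pnn_forward(x, m, params):
    """Interleave dense layers with affine-conditioning layers."""
    h = x
    for layer in params["hidden"]:
        h = dense(h, layer["w"], layer["b"])
        h = affine_conditioning(h, m, layer["w_scale"], layer["w_bias"])
    logits = h @ params["w_out"] + params["b_out"]
    return 1.0 / (1.0 + np.exp(-logits))  # predicted signal probability


# Synthetic inputs: 4 events, 5 features, one (rescaled) mass value per event.
x = rng.normal(size=(4, 5))
m = np.array([[0.5], [0.75], [1.0], [1.5]])
probs = affine_pnn_forward(x, m, init_params(num_features=5))
print(probs.shape)
```

Because the conditioning is applied after every hidden layer, the mass can influence each level of the representation rather than only the input.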
We also demonstrate the effectiveness of our balanced training procedure, in which we build balanced mini-batches by leveraging the structure of both the signal and the background, and we discuss the possible choices for distributing the background's mass.
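One way to picture a balanced mini-batch is sketched below. This is a minimal example under stated assumptions, not the procedure implemented in the repository: each batch is half signal and half background, the signal half is split evenly across the distinct mass values, and every background event is assigned a mass drawn uniformly from those values (just one of the possible choices). The function name `balanced_batch` and the synthetic data are hypothetical.

```python
import numpy as np


def balanced_batch(sig_x, sig_m, bkg_x, batch_size, rng):
    """Draw one mini-batch: half signal, half background.

    sig_x: (n_sig, features) signal features
    sig_m: (n_sig,) mass of each signal event
    bkg_x: (n_bkg, features) background features (no intrinsic mass)
    """
    masses = np.unique(sig_m)
    per_mass = (batch_size // 2) // len(masses)

    xs, ms = [], []
    for mass in masses:
        # Same number of signal events for every mass value.
        idx = rng.choice(np.flatnonzero(sig_m == mass), size=per_mass, replace=True)
        xs.append(sig_x[idx])
        ms.append(np.full(per_mass, mass))

    num_sig = per_mass * len(masses)
    bkg_idx = rng.choice(len(bkg_x), size=num_sig, replace=True)
    xs.append(bkg_x[bkg_idx])
    # Assumption: background masses are sampled uniformly from the signal masses.
    ms.append(rng.choice(masses, size=num_sig))

    x = np.concatenate(xs)
    m = np.concatenate(ms).reshape(-1, 1)
    y = np.concatenate([np.ones(num_sig), np.zeros(num_sig)])
    return x, m, y


# Synthetic data with illustrative mass values.
rng = np.random.default_rng(0)
sig_m = rng.choice([500.0, 750.0, 1000.0, 1250.0, 1500.0], size=1000)
sig_x = rng.normal(size=(1000, 5))
bkg_x = rng.normal(size=(3000, 5))
x, m, y = balanced_batch(sig_x, sig_m, bkg_x, batch_size=100, rng=rng)
print(x.shape, m.shape, int(y.sum()))
```

Sampling with replacement keeps the sketch short; a real data pipeline would more likely iterate over shuffled per-mass pools.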
With our proposed improvements, we achieve better classification and interpolation performance.
UPDATE 25/02/2023: added new results on the impact of the conditioning mechanism, i.e. the mechanism a pNN uses to combine the features with the physics parameter.
With an interpolation test, we found that both the biasing and the affine conditioning mechanisms are robust (i.e. able to generalize to the missing mass points) and performant (i.e. achieve high AUC). The best results are achieved when the conditioning is performed on all layers.
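The shape of such an interpolation test can be sketched as follows: some mass points are withheld from training, and the AUC is then measured on exactly those points. The helper names are hypothetical, the split is an assumption about how the held-out points are chosen, and the AUC is computed with the rank-sum (Mann-Whitney U) formula, which does not handle tied scores.

```python
import numpy as np


def interpolation_split(m, held_out):
    """Boolean masks: train on every mass except held_out, test on held_out."""
    test_mask = np.isin(m, held_out)
    return ~test_mask, test_mask


def auc(y_true, scores):
    """ROC AUC from the rank-sum statistic (assumes no tied scores)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Synthetic example: withhold one (illustrative) mass value from training.
masses = np.array([500, 750, 1000, 750, 1250, 1500])
train_mask, test_mask = interpolation_split(masses, held_out=[750])
print(train_mask, test_mask)
```

A model that interpolates well should reach an AUC on the withheld masses close to what it achieves on the masses it was trained on.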
- Clone the repository: `git clone https://github.com/Luca96/affine-parametric-networks.git`, then `cd affine-parametric-networks`
- Install all the required libraries (usage of `virtualenv` is highly suggested): `pip install -r requirements.txt`
- Download the `HEPMASS` dataset from here: click on "data folder", then download and extract `all_test.csv.gz` and `all_train.csv.gz`; the latter is only required for the notebook `hep-training.ipynb`. IMPORTANT: save the `.csv` files under the repository folder at `\data\hepmass`.
- Conversion of HEPMASS: some small changes are made to the original csv files of HEPMASS; you can find the procedure at the beginning of the notebook `HEPMASS.ipynb` or here.
- Download and extract our dataset `HEPMASS-IMB`, here: there are two files in total, `imbalanced_background.csv` and `imbalanced_signal.csv`. As before, save the `.csv` files within `\data\hepmass`.
- Now you're ready to run the notebooks, or to use the pretrained weights.
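Before opening the notebooks, it can help to confirm that the data files ended up in the expected folder. The snippet below is a small convenience sketch, not part of the repository: the helper `missing_data_files` is hypothetical, and it assumes the extracted files keep the names listed above (minus the `.gz` suffix) and live under `data/hepmass` relative to the repository root.

```python
from pathlib import Path

# File names taken from the setup steps; all_train.csv is only needed
# by hep-training.ipynb, so it is listed separately.
REQUIRED = ["all_test.csv", "imbalanced_background.csv", "imbalanced_signal.csv"]
OPTIONAL = ["all_train.csv"]


def missing_data_files(repo_root=".", names=REQUIRED):
    """Return the names that are not present under <repo_root>/data/hepmass."""
    data_dir = Path(repo_root) / "data" / "hepmass"
    return [name for name in names if not (data_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing files in data/hepmass:", ", ".join(missing))
    else:
        print("All required data files found.")
```

Run it from the repository root; pass `names=REQUIRED + OPTIONAL` to also check for the training file.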
The repository is organized as follows:
- `\script`: contains the source files used in the notebooks.
- `\weights`: contains all the pretrained weights of our experiments, for `HEPMASS` (under `\hep`) and `HEPMASS-IMB` (under `\hep-imb`).
- `\data`: should contain a folder `\hepmass` in which the dataset files (i.e. the `.csv` files) are stored.
- `HEPMASS.ipynb`: data exploration of the `HEPMASS` dataset.
- `hep-training.ipynb`: training and evaluation of pNNs for `HEPMASS`.
- `hep-imbalanced.ipynb`: construction of `HEPMASS-IMB`, along with training and evaluation of pNNs on it.
- `mass_representation.ipynb`: contains some t-SNE visualizations of the internal representation learned by the trained model.
- `conditioning.ipynb` and `conditioning-interpolation.ipynb`: study the impact of both the type of conditioning mechanism and where in the network it is applied.
If you use the code and/or the dataset we provide for your own project or research, please cite our paper:
@article{anzalone2022improving,
title={Improving parametric neural networks for high-energy physics (and beyond)},
author={Anzalone, Luca and Diotalevi, Tommaso and Bonacorsi, Daniele},
journal={Machine Learning: Science and Technology},
volume={3},
number={3},
pages={035017},
year={2022},
publisher={IOP Publishing}
}
Dataset citation:
@dataset{hepmass_imb,
author={Luca Anzalone and Tommaso Diotalevi and Daniele Bonacorsi},
title={HEPMASS-IMB},
month=apr,
year=2022,
publisher={Zenodo},
doi={10.5281/zenodo.6453048},
url={https://doi.org/10.5281/zenodo.6453048}
}