This repository provides the official implementation of our papers and conference contributions regarding rate-distortion explanations:
- Explaining Neural Network Decisions Is Hard
  J. Macdonald, S. Wäldchen, S. Hauch and G. Kutyniok (2020)
  ICML 2020 Workshop XXAI: Extending Explainable AI Beyond Deep Models and Classifiers
- A Rate-Distortion Framework for Explaining Neural Network Decisions
  J. Macdonald, S. Wäldchen, S. Hauch and G. Kutyniok (2019)
Each sub-directory contains the code for a separate experiment.
Individual experiments contain definitions of the neural networks used, scripts for loading the data, as well as scripts for computing and evaluating several interpretability methods.
The pretrained network weights are not included due to size restrictions. The data sets for MNIST, STL-10, and an8flower are not included in this repository but are publicly available at the respective websites.
The `boolean-string` experiment uses synthetically generated data and is completely self-contained within this repository. For the three other experiments, the data needs to be downloaded and unpacked into the respective prepared `data` sub-directories, in a format readable by Keras data generators via `flow_from_directory`, as specified in the `keras_generator.py` file (you can also move the `data` directory somewhere else and adapt `DATAPATH` in the `instances.py` file).
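For illustration, here is a minimal sketch of how such a directory can be consumed with Keras. The actual generator configuration lives in `keras_generator.py`; the path, image size, and options below are placeholders.

```python
from keras.preprocessing.image import ImageDataGenerator

# Expected layout: one sub-directory per class, e.g.
#   data/mnist/train/0/img001.png, data/mnist/train/1/img042.png, ...
datagen = ImageDataGenerator(rescale=1.0 / 255)
train_flow = datagen.flow_from_directory(
    "data/mnist/train",        # placeholder path, see DATAPATH in instances.py
    target_size=(28, 28),      # placeholder image size
    color_mode="grayscale",
    class_mode="categorical",
    batch_size=32,
)
```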
Data set statistics have to be computed the first time you run a relevance mapping script (this is done using the `statstream` package).
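Conceptually, these statistics are running means and variances accumulated batch by batch, so the full data set never has to fit into memory at once. The following numpy sketch illustrates the idea only; it does not use the actual `statstream` API.

```python
import numpy as np

def streaming_mean_std(batches):
    """Accumulate per-feature sums over batches of data, then derive
    mean and standard deviation at the end (illustrative sketch only)."""
    n, s, ss = 0, 0.0, 0.0
    for batch in batches:              # e.g. batches from flow_from_directory
        n += batch.shape[0]
        s += batch.sum(axis=0)
        ss += (batch ** 2).sum(axis=0)
    mean = s / n
    std = np.sqrt(ss / n - mean ** 2)  # note: this naive two-sum formula can
    return mean, std                   # be numerically unstable for large n
```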
Use the files `script_*.py` to compute relevance maps using different interpretability methods. Brief usage sketches for the individual methods follow the list.

- `script_adf.py`: Our rate-distortion explanation (RDE) method using the assumed density filtering (ADF) implementation. Different variants are available (`diag`, `half` a.k.a. `low-rank`, and `full`, which is only feasible for low-dimensional signals and small networks; we recommend always using `diag` or `half`).
- `script_shap.py`: Shapley value based implementation by Scott Lundberg et al., see https://github.com/slundberg/shap.
- `script_lime.py`: Local Interpretable Model-Agnostic Explanations (LIME) implementation by Marco Ribeiro et al., see https://github.com/marcotcr/lime.
- `script_innvestigate.py`: Collection of sensitivity and backpropagation based interpretation method implementations by Maximilian Alber et al., see https://github.com/albermax/innvestigate. Used for:
  - Deep Taylor Decompositions
  - Guided Backprop
  - Layerwise Relevance Propagation (LRP)
  - Sensitivity Analysis
  - SmoothGrad
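The core idea of RDE is to find a sparse (low rate) mask over the input components such that fixing only the marked components and randomizing the rest leaves the network output (almost) undistorted. The repository computes the expected distortion in closed form via ADF; the following is a simplified Monte Carlo sketch of the same objective, written with PyTorch purely for brevity (the repository itself uses Keras/TensorFlow), with all names and hyperparameters illustrative.

```python
import torch

def rde_mask(model, x, lam=1.0, n_samples=16, steps=200, lr=0.05):
    """Monte Carlo sketch of the rate-distortion objective (illustrative only;
    the repository computes the expected distortion in closed form via ADF)."""
    target = model(x).detach()                      # prediction to preserve
    s_logit = torch.zeros_like(x, requires_grad=True)
    optimizer = torch.optim.Adam([s_logit], lr=lr)
    for _ in range(steps):
        s = torch.sigmoid(s_logit)                  # relaxed mask in [0, 1]
        distortion = 0.0
        for _ in range(n_samples):
            noise = torch.randn_like(x)             # reference distribution
            obfuscation = s * x + (1 - s) * noise   # keep marked, randomize rest
            distortion = distortion + ((model(obfuscation) - target) ** 2).mean()
        loss = distortion / n_samples + lam * s.abs().mean()  # distortion + rate
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return torch.sigmoid(s_logit).detach()          # relevance map
```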
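For `script_shap.py`, typical use of the shap package (API as of v0.34) looks roughly like this; `model`, `background`, and `x_test` are placeholders:

```python
import shap

# background: a small set of training samples used to integrate out features
explainer = shap.DeepExplainer(model, background)
# per-class relevance attributions for a few test samples
shap_values = explainer.shap_values(x_test[:3])
```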
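Similarly, a minimal sketch of lime's image explainer; the input names are placeholders, and lime expects RGB images, so grayscale data would need to be stacked to three channels first:

```python
from lime import lime_image

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,           # HxWx3 numpy array
    model.predict,   # function mapping a batch of images to class probabilities
    top_labels=1,
    num_samples=1000,
)
# binary mask marking the superpixels deemed relevant for the top class
_, mask = explanation.get_image_and_mask(explanation.top_labels[0])
```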
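And a sketch of the innvestigate API (v1.0.8) for one of the listed methods; the model and the analyzer choice are placeholders:

```python
import innvestigate
import innvestigate.utils as iutils

# analyzers should be applied to the pre-softmax output of the network
model_wo_softmax = iutils.model_wo_softmax(model)
# e.g. "guided_backprop", "deep_taylor", "smoothgrad", "lrp.epsilon", ...
analyzer = innvestigate.create_analyzer("lrp.epsilon", model_wo_softmax)
relevance = analyzer.analyze(x_batch)
```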
Use the files `pixelflip.py` and `pixelflip_visualize.py` to evaluate the computed relevance maps via pixel-flipping and to visualize the results (a simplified sketch of the idea follows below).
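In a pixel-flipping evaluation, the most relevant pixels are replaced first and the degradation of the prediction is recorded; a good relevance map yields a fast drop in confidence. The following sketch is illustrative only (function name, arguments, and the flip value are not the repository's exact procedure):

```python
import numpy as np

def pixelflip_curve(model, x, relevance, num_steps=50, flip_value=0.0):
    """Flip pixels in order of decreasing relevance and record the model's
    confidence in the originally predicted class (illustrative sketch)."""
    order = np.argsort(relevance.ravel())[::-1]   # most relevant pixels first
    target = model.predict(x[None])[0].argmax()   # originally predicted class
    scores, flipped = [], x.copy().ravel()
    chunk = max(1, len(order) // num_steps)
    for i in range(0, len(order), chunk):
        flipped[order[i:i + chunk]] = flip_value
        probs = model.predict(flipped.reshape(x.shape)[None])[0]
        scores.append(probs[target])              # confidence after flipping
    return np.array(scores)
```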
The package versions listed below are the ones we used. Other versions might work as well.

- `cudatoolkit` (v10.0.130)
- `cudnn` (v7.6.5)
- `h5py` (v2.10.0)
- `hdf5` (v1.10.5)
- `innvestigate` (v1.0.8)
- `keras-adf` (v19.1.0)
- `keras` (v2.2.4)
- `lime` (v0.1.1.37)
- `matplotlib` (v3.1.2)
- `numpy` (v1.17)
- `pillow` (v8.0.1)
- `python` (v3.7.8)
- `scikit-image` (v0.16.2)
- `scipy` (v1.4.1)
- `shap` (v0.34.0)
- `statstream` (v19.1.0)
- `tensorflow-gpu` (v1.15)
- `tikzplotlib` (v0.9.8)
- `tqdm` (v4.53.0)
This repository is MIT licensed, as found in the LICENSE file.