Commit

Merge pull request #505 from MLecardonnel/feature/remove_backend
Removes ACV from shapash and fixes dependencies
ThomasBouche authored Nov 2, 2023
2 parents 3b9577a + 4a85e0b commit 7d59855
Showing 20 changed files with 59 additions and 806 deletions.
2 changes: 0 additions & 2 deletions README.md
@@ -42,7 +42,6 @@
| 2.0.x | Refactoring Shapash <br> | Refactoring attributes of compile methods and init. Refactoring implementation for new backends | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/modular.png" width="50" title="modular">](https://github.com/MAIF/shapash/blob/master/tutorial/explainer_and_backend/tuto-expl06-Shapash-custom-backend.ipynb)
| 1.7.x | Variabilize Colors <br> | Giving possibility to have your own colour palette for outputs adapted to your design | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/variabilize-colors.png" width="50" title="variabilize-colors">](https://github.com/MAIF/shapash/blob/master/tutorial/common/tuto-common02-colors.ipynb)
| 1.6.x | Explainability Quality Metrics <br> [Article](https://towardsdatascience.com/building-confidence-on-explainability-methods-66b9ee575514) | To help increase confidence in explainability methods, you can evaluate the relevance of your explainability using 3 metrics: **Stability**, **Consistency** and **Compacity** | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/quality-metrics.png" width="50" title="quality-metrics">](https://github.com/MAIF/shapash/blob/master/tutorial/explainability_quality/tuto-quality01-Builing-confidence-explainability.ipynb)
| 1.5.x | ACV Backend <br> | A new way of estimating Shapley values using ACV. [More info about ACV here](https://towardsdatascience.com/the-right-way-to-compute-your-shapley-values-cfea30509254). | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/wheel.png" width="50" title="wheel-acv-backend">](tutorial/explainer_and_backend/tuto-expl03-Shapash-acv-backend.ipynb) |
| 1.4.x | Groups of features <br> [Demo](https://shapash-demo2.ossbymaif.fr/) | You can now regroup features that share common properties together. <br>This option can be useful if your model has a lot of features. | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/groups_features.gif" width="120" title="groups-features">](https://github.com/MAIF/shapash/blob/master/tutorial/common/tuto-common01-groups_of_features.ipynb) |
| 1.3.x | Shapash Report <br> [Demo](https://shapash.readthedocs.io/en/latest/report.html) | A standalone HTML report that constitutes a basis of an audit document. | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/report-icon.png" width="50" title="shapash-report">](https://github.com/MAIF/shapash/blob/master/tutorial/generate_report/tuto-shapash-report01.ipynb) |

@@ -287,7 +286,6 @@ This github repository offers many tutorials to allow you to easily get started

- [Compute Shapley Contributions using **Shap**](tutorial/explainer_and_backend/tuto-expl01-Shapash-Viz-using-Shap-contributions.ipynb)
- [Use **Lime** to compute local explanation, Summarize-it with **Shapash**](tutorial/explainer_and_backend/tuto-expl02-Shapash-Viz-using-Lime-contributions.ipynb)
- [Use **ACV backend** to compute Active Shapley Values and SDP global importance](tutorial/explainer_and_backend/tuto-expl03-Shapash-acv-backend.ipynb)
- [Compile faster Lime and consistency of contributions](tutorial/explainer_and_backend/tuto-expl04-Shapash-compute-Lime-faster.ipynb)
- [Use **FastTreeSHAP** or add contributions from another backend](tutorial/explainer_and_backend/tuto-expl05-Shapash-using-Fasttreeshap.ipynb)
- [Use Class Shapash Backend](tutorial/explainer_and_backend/tuto-expl06-Shapash-custom-backend.ipynb)
2 changes: 1 addition & 1 deletion docs/index.html
@@ -64,7 +64,7 @@
<div class="details">
<h1>Features</h1>
<ul>
<li>Compatible with Shap, Lime and ACV</li>
<li>Compatible with Shap and Lime</li>
<li>Uses shap backend to display results in a few lines of code</li>
<li>Encoders objects and features dictionaries used for clear results</li>
<li>Compatible with category_encoders & Sklearn ColumnTransformer</li>
5 changes: 2 additions & 3 deletions requirements.dev.txt
@@ -1,5 +1,5 @@
pip>=23.2.0
numpy==1.21.6
numpy>1.18.0
dash==2.3.1
catboost>=1.0.1
category-encoders>=2.6.0
@@ -32,13 +32,12 @@ numba>=0.53.1
nbconvert>=6.0.7
papermill>=2.0.0
matplotlib>=3.3.0
seaborn>=0.12.2
seaborn==0.12.2
scipy>=0.19.1
notebook>=6.0.0
jupyter-client<8.0.0
Jinja2>=2.11.0
phik>=0.12.0
skranger>=0.8.0
acv-exp>=1.2.3
lime>=0.2.0.0
regex
3 changes: 1 addition & 2 deletions setup.py
@@ -44,7 +44,7 @@
'nbconvert>=6.0.7',
'papermill>=2.0.0',
'jupyter-client>=7.4.0',
'seaborn>=0.12.2',
'seaborn==0.12.2',
'notebook',
'Jinja2>=2.11.0',
'phik'
@@ -53,7 +53,6 @@
extras['xgboost'] = ['xgboost>=1.0.0']
extras['lightgbm'] = ['lightgbm>=2.3.0']
extras['catboost'] = ['catboost>=1.0.1']
extras['acv'] = ['acv-exp>=1.2.0']
extras['lime'] = ['lime>=0.2.0.0']

setup_requirements = ['pytest-runner', ]
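With the `acv` extra dropped from setup.py, the remaining optional dependency groups can be sketched as a plain dict (an illustration of the `extras` mapping in the diff above, not the full setup.py):

```python
# Sketch of the optional-dependency groups left after removing the 'acv' extra.
extras = {}
extras["xgboost"] = ["xgboost>=1.0.0"]
extras["lightgbm"] = ["lightgbm>=2.3.0"]
extras["catboost"] = ["catboost>=1.0.1"]
extras["lime"] = ["lime>=0.2.0.0"]

# 'acv' is no longer an installable extra.
assert "acv" not in extras
```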
1 change: 0 additions & 1 deletion shapash/backend/__init__.py
@@ -3,7 +3,6 @@

from .base_backend import BaseBackend
from .shap_backend import ShapBackend
from .acv_backend import AcvBackend
from .lime_backend import LimeBackend


122 changes: 0 additions & 122 deletions shapash/backend/acv_backend.py

This file was deleted.

4 changes: 0 additions & 4 deletions shapash/decomposition/contributions.py
@@ -29,10 +29,6 @@ def inverse_transform_contributions(contributions, preprocessing=None, agg_colum
The preprocessing applied to the original data.
agg_columns : str (default: 'sum')
Type of aggregation performed. For Shap we want to sum contributions of one hot encoded variables.
For ACV we want to take any value as ACV computes contributions of coalition of variables (like
one hot encoded variables) differently from Shap and then give the same value to each variable of the
coalition. As a result we just need to take the value of one of these variables to get the contribution
value of the group.
Returns
-------
70 changes: 8 additions & 62 deletions shapash/explainer/consistency.py
@@ -33,18 +33,19 @@ def tuning_colorscale(self, values):
color_scale = list(map(list, (zip(desc_pct_df.values.flatten(), self._style_dict["init_contrib_colorscale"]))))
return color_scale

def compile(self, x=None, model=None, preprocessing=None, contributions=None, methods=["shap", "acv", "lime"]):
"""If not provided, compute contributions according to provided methods (default are shap, acv, lime).
If provided, check whether they respect the correct format:
def compile(self, contributions, x=None, preprocessing=None):
"""Check whether the contributions respect the correct format:
contributions = {"method_name_1": contrib_1, "method_name_2": contrib_2, ...}
where each contrib_i is a pandas DataFrame
Parameters
----------
contributions : dict
Contributions provided by the user if no compute is required.
Format must be {"method_name_1": contrib_1, "method_name_2": contrib_2, ...}
where each contrib_i is a pandas DataFrame. By default None
x : DataFrame, optional
Dataset on which to compute consistency metrics, by default None
model : model object, optional
Model used to compute contributions, by default None
preprocessing : category_encoders, ColumnTransformer, list, dict, optional (default: None)
--> Different types of preprocessing are available:
@@ -54,72 +55,17 @@ def compile(self, x=None, model=None, preprocessing=None, contributions=None, me
- A list with a single ColumnTransformer with optional (dict, list of dict)
- A dict
- A list of dict
contributions : dict, optional
Contributions provided by the user if no compute is required.
Format must be {"method_name_1": contrib_1, "method_name_2": contrib_2, ...}
where each contrib_i is a pandas DataFrame. By default None
methods : list
Methods used to compute contributions, by default ["shap", "acv", "lime"]
"""
self.x = x
self.preprocessing = preprocessing
if contributions is None:
if (self.x is None) or (model is None):
raise ValueError('If no contributions are provided, parameters "x" and "model" must be defined')
contributions = self.compute_contributions(self.x, model, methods, self.preprocessing)
else:
if not isinstance(contributions, dict):
raise ValueError('Contributions must be a dictionary')
if not isinstance(contributions, dict):
raise ValueError('Contributions must be a dictionary')
self.methods = list(contributions.keys())
self.weights = list(contributions.values())

self.check_consistency_contributions(self.weights)
self.index = self.weights[0].index

def compute_contributions(self, x, model, methods, preprocessing):
"""
Compute contributions based on specified methods
Parameters
----------
x : pandas.DataFrame
Prediction set.
IMPORTANT: this should be the raw prediction set, whose values are seen by the end user.
x is a preprocessed dataset: Shapash can apply the model to it
model : model object
Model used for the consistency check. The model object can also be used by some methods to compute
predict and predict_proba values
methods : list, optional
When contributions is None, list of methods to use to calculate contributions, by default ["shap", "acv"]
preprocessing : category_encoders, ColumnTransformer, list, dict
--> Different types of preprocessing are available:
- A single category_encoders (OrdinalEncoder/OnehotEncoder/BaseNEncoder/BinaryEncoder/TargetEncoder)
- A single ColumnTransformer with scikit-learn encoding or category_encoders transformers
- A list with multiple category_encoders with optional (dict, list of dict)
- A list with a single ColumnTransformer with optional (dict, list of dict)
- A dict
- A list of dict
Returns
-------
contributions : dict
Dict whose keys are method names and values are the corresponding contributions
"""
contributions = {}

for backend in methods:
xpl = SmartExplainer(model=model, preprocessing=preprocessing, backend=backend)
xpl.compile(x=x)
if xpl._case == "classification" and len(xpl._classes) == 2:
contributions[backend] = xpl.contributions[1]
elif xpl._case == "classification" and len(xpl._classes) > 2:
raise AssertionError("Multi-class classification is not supported")
else:
contributions[backend] = xpl.contributions

return contributions
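The new `compile` contract above, which now requires a `contributions` dict instead of computing one, can be illustrated with a stand-alone sketch (plain pandas, hypothetical method names and values, no shapash import):

```python
import pandas as pd

# Contributions in the format compile() now expects:
# {"method_name": DataFrame}, one DataFrame per explanation method,
# all frames sharing the same index and columns.
contributions = {
    "shap": pd.DataFrame([[0.10, -0.20], [0.30, 0.00]], columns=["age", "income"]),
    "lime": pd.DataFrame([[0.20, -0.10], [0.20, 0.10]], columns=["age", "income"]),
}

# The same bookkeeping compile() performs on its input:
if not isinstance(contributions, dict):
    raise ValueError("Contributions must be a dictionary")
methods = list(contributions.keys())
weights = list(contributions.values())
index = weights[0].index
```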

def check_consistency_contributions(self, weights):
"""
Assert contributions calculated from different methods are dataframes
2 changes: 1 addition & 1 deletion shapash/explainer/smart_explainer.py
@@ -42,7 +42,7 @@ class SmartExplainer:
predict and predict_proba values
backend : str or shapash.backend object (default: 'shap')
Select which computation method to use in order to compute contributions
and feature importance. Possible values are 'shap', 'acv' or 'lime'. Default is 'shap'.
and feature importance. Possible values are 'shap' or 'lime'. Default is 'shap'.
It is also possible to pass a backend class inherited from shapash.backend.BaseBackend.
preprocessing : category_encoders, ColumnTransformer, list, dict, optional (default: None)
--> Different types of preprocessing are available:
4 changes: 0 additions & 4 deletions shapash/explainer/smart_state.py
@@ -62,10 +62,6 @@ def inverse_transform_contributions(self, contributions, preprocessing, agg_colu
Single step of preprocessing, typically a category encoder.
agg_columns : str (default: 'sum')
Type of aggregation performed. For Shap we want to sum contributions of one hot encoded variables.
For ACV we want to take any value as ACV computes contributions of coalition of variables (like
one hot encoded variables) differently from Shap and then give the same value to each variable of the
coalition. As a result we just need to take the value of one of these variables to get the contribution
value of the group.
Returns
-------
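The `sum` aggregation described in the docstring above can be sketched in isolation (hypothetical column names, not shapash's actual implementation):

```python
import pandas as pd

# Contributions of one-hot encoded columns for a categorical variable "color",
# plus a numeric variable "age" that needs no aggregation.
contrib = pd.DataFrame({
    "color_red": [0.10, 0.00],
    "color_blue": [0.05, 0.20],
    "age": [0.30, -0.10],
})

# agg_columns='sum': sum the one-hot columns back into the original variable.
inv = pd.DataFrame({
    "color": contrib[["color_red", "color_blue"]].sum(axis=1),
    "age": contrib["age"],
})
```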
4 changes: 0 additions & 4 deletions shapash/utils/category_encoder_backend.py
@@ -198,10 +198,6 @@ def calc_inv_contrib_ce(x_contrib, encoding, agg_columns):
The preprocessing applied to the original data.
agg_columns : str (default: 'sum')
Type of aggregation performed. For Shap we want to sum contributions of one hot encoded variables.
For ACV we want to take any value as ACV computes contributions of coalition of variables (like
one hot encoded variables) differently from Shap and then give the same value to each variable of the
coalition. As a result we just need to take the value of one of these variables to get the contribution
value of the group.
Returns
-------
4 changes: 0 additions & 4 deletions shapash/utils/columntransformer_backend.py
@@ -195,10 +195,6 @@ def calc_inv_contrib_ct(x_contrib, encoding, agg_columns):
The preprocessing applied to the original data.
agg_columns : str (default: 'sum')
Type of aggregation performed. For Shap we want to sum contributions of one hot encoded variables.
For ACV we want to take any value as ACV computes contributions of coalition of variables (like
one hot encoded variables) differently from Shap and then give the same value to each variable of the
coalition. As a result we just need to take the value of one of these variables to get the contribution
value of the group.
Returns
-------
2 changes: 1 addition & 1 deletion shapash/utils/utils.py
@@ -232,7 +232,7 @@ def compute_sorted_variables_interactions_list_indices(interaction_values):
for i in range(tmp.shape[0]):
tmp[i, i:] = 0

interaction_contrib_sorted_indices = np.dstack(np.unravel_index(np.argsort(tmp.ravel()), tmp.shape))[0][::-1]
interaction_contrib_sorted_indices = np.dstack(np.unravel_index(np.argsort(tmp.ravel(), kind="stable"), tmp.shape))[0][::-1]
return interaction_contrib_sorted_indices
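The changed line above switches `np.argsort` to `kind="stable"`, so ties between equal interaction values keep a deterministic order across runs. A minimal reproduction of the indexing trick (toy matrix, not shapash data):

```python
import numpy as np

# Lower-triangular matrix of interaction strengths; the upper triangle
# (including the diagonal) has been zeroed out, as in the loop above.
tmp = np.array([[0.0, 0.0],
                [0.5, 0.0]])

# Rank every cell from strongest to weakest, as (row, col) index pairs.
# kind="stable" makes the ordering of tied (zero) cells deterministic.
order = np.dstack(
    np.unravel_index(np.argsort(tmp.ravel(), kind="stable"), tmp.shape)
)[0][::-1]

# order[0] is the location of the strongest interaction: row 1, col 0.
```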


Expand Down