eXplainable Machine Learning course for Machine Learning (MSc) studies at the University of Warsaw.
Winter semester 2023/24 @pbiecek @hbaniecki
Previous year: https://github.com/mim-uw/eXplainableMachineLearning-2023
Plan for the winter semester 2023/2024. MIM_UW classes are on Fridays.
- 2023-10-06 -- Introduction, Slides, Audio. Extra reading: Lipton 2017, Rudin 2019
- 2023-10-13 -- Fairness, Slides, Audio. Extra reading: Fairness and ML, Cirillo 2020
- 2023-10-20 -- LIME and friends, Slides, Audio+Video. Extra reading: Why Should I Trust You?, LORE
- 2023-10-27 -- SHAP and friends, Slides, Audio. Extra reading: General Pitfalls of Model-Agnostic Interpretation Methods for Machine Learning Models; Shapley Flow: A Graph-based Approach to Interpreting Model Predictions
- 2023-10-30 -- PROJECT: First checkpoint (remote video)
- 2023-11-10 -- PDP and friends, Slides, Audio. Extra reading: AlePlot, Peeking Inside the Black Box
- 2023-11-17 -- VIP / MCR, Slides, Audio. Extra reading: The Two Cultures, More models, Rashomon quartet
- 2023-11-24 -- Explanations specific to neural networks (Slides) & Evaluation of explanations (Slides), Audio
- 2023-12-01 -- From local explanations to global understanding, prof. Wojciech Samek
- 2023-12-08 -- PROJECT: Second checkpoint (remote video)
- 2023-12-15 -- Adversarial XAI
- 2024-01-12 -- Student presentations of research papers
- 2024-01-19 -- XAI-Based Model Debugging, prof. Wojciech Samek
- 2024-01-26 -- PROJECT: Final presentation (in-person presentations)
The final grade is based on activity in four areas:
- mandatory: Project (0-35)
- mandatory: Exam (0-30)
- optional: Homework assignments (0-24)
- optional: Presentation (0-10)
In total you can get from 0 to 100 points. 51 points are needed to pass this course.
Grades:
- 51-60: (3) dst (satisfactory)
- 61-70: (3.5) dst+ (satisfactory plus)
- 71-80: (4) db (good)
- 81-90: (4.5) db+ (good plus)
- 91-100: (5) bdb (very good)
- Homework 1 for 0-4 points. Deadline: 2023-10-12 - graded by HBA
- Homework 2 for 0-4 points. Deadline: 2023-10-19 - graded by PBI
- Homework 3 for 0-4 points. Deadline: 2023-10-26 - graded by PBI
- Homework 4 for 0-4 points. Deadline: 2023-11-09 - graded by HBA
- Homework 5 for 0-4 points. Deadline: 2023-11-16 - graded by HBA
- Homework 6 for 0-4 points. Deadline: 2023-11-23 - graded by PBI
This year's project involves conducting a vulnerability analysis of a predictive model using XAI tools. The analysis should be carried out for a selected model, and the results should be summarised in a short RedTeaming report.
- Projects can be done in groups of 1, 2 or 3 students
- One model can be analysed by multiple groups (but the discovered vulnerabilities must not be repeated)
- The harder the project, the easier it is to obtain a higher grade (see the point conversion factors below).
- 2023-10-30 – First checkpoint: Students choose a model and create a work plan (to be discussed in class). Deliverables: a 3-minute presentation based on one slide. (0-5 points)
- 2023-12-08 – Second checkpoint: Provide initial experimental results. At least one vulnerability should have been found by this point. (0-10 points)
- 2024-01-26 – Final checkpoint: Presentation of all identified vulnerabilities. (0-20 points)
The RedTeaming analysis should be carried out for a selected model. Depending on the difficulty of the model, you may receive more or fewer points (a worked example follows this list):
- the analysis concerns your own model (e.g. from homework). Point conversion factor: x0.8
- the analysis concerns a model from Hugging Face (any of the models available there, for any modality). Point conversion factor: x1
- the analysis concerns one of the popular foundation models (e.g. TabPFN https://arxiv.org/abs/2207.01848, Segment Anything https://arxiv.org/abs/2304.02643, Llama 2 https://arxiv.org/abs/2307.09288, RETFound https://www.nature.com/articles/s41586-023-06555-x). Point conversion factor: x1.25
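To make the scaling concrete, here is a minimal sketch of the conversion. The function name is ours, and we assume the factor simply multiplies the raw checkpoint points; whether the total is capped is not specified above.

```python
# Illustrative only: project_points is a hypothetical helper, and the assumption
# that the factor multiplies raw checkpoint points (with no cap) is ours.
FACTORS = {
    "own model": 0.80,         # e.g. a model trained for homework
    "hugging face": 1.00,      # any model hosted on Hugging Face
    "foundation model": 1.25,  # e.g. TabPFN, Segment Anything, Llama 2, RETFound
}

def project_points(raw_points: float, difficulty: str) -> float:
    """Scale raw project points (0-35 across the three checkpoints) by difficulty."""
    return raw_points * FACTORS[difficulty]

print(project_points(20, "own model"))         # 16.0
print(project_points(20, "foundation model"))  # 25.0
```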
Examples of directions in which to look for vulnerabilities (creativity will be appreciated); a fairness-check sketch follows this list:
- bias / fairness. Does the model discriminate based on a protected attribute? (e.g. https://arxiv.org/abs/2105.02317)
- using XAI to find instance-level artifacts (e.g. Clever Hans https://doi.org/10.1016/j.inffus.2021.07.015, wolf/snow https://arxiv.org/abs/1602.04938), large residuals, and explanations of why predictions were wrong
- unintended memorisation (e.g. https://arxiv.org/abs/1802.08232)
- a drift or gap in performance between datasets (e.g. when evaluated on another image or text dataset)
- model malfunction due to unintended use
- wrong behaviour on out-of-distribution samples
- the SATML 2024 competition on CNN interpretability, based on the NeurIPS 2023 paper Red Teaming Deep Neural Networks with Feature Synthesis Tools
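For the bias / fairness direction above, a minimal sketch of a group-fairness check with the dalex package (which accompanies the Explanatory Model Analysis book listed below). The synthetic data, column names, and model choice are illustrative assumptions, not part of the project requirements.

```python
import numpy as np
import pandas as pd
import dalex as dx  # pip install dalex
from sklearn.ensemble import RandomForestClassifier

# Synthetic, illustrative data: the protected attribute shifts a feature,
# so the trained model ends up discriminating indirectly.
rng = np.random.default_rng(0)
n = 2000
sex = rng.choice(["male", "female"], size=n)
income = rng.normal(50, 10, size=n) + (sex == "male") * 5
X = pd.DataFrame({"income": income, "age": rng.integers(18, 70, size=n)})
y = (income + rng.normal(0, 5, size=n) > 55).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)
explainer = dx.Explainer(model, X, y, label="rf", verbose=False)

# Compare TPR, FPR, PPV, accuracy, and statistical parity across groups;
# fairness_check() flags metric ratios falling outside the four-fifths rule.
fairness = explainer.model_fairness(protected=sex, privileged="male")
fairness.fairness_check()
```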
The final report will be a short paper (up to 4 pages) in the JMLR template. See an example.
We recommend diving deep into the following books and exploring their references on a particular topic of interest:
- Explanatory Model Analysis. Explore, Explain and Examine Predictive Models by Przemysław Biecek, Tomasz Burzykowski
- Fairness and Machine Learning: Limitations and Opportunities by Solon Barocas, Moritz Hardt, Arvind Narayanan
- Interpretable Machine Learning. A Guide for Making Black Box Models Explainable by Christoph Molnar