Repository of the paper: Towards a more efficient computation of individual attribute and policy contribution for post-hoc explanation of cooperative multi-agent systems using Myerson values
See the .sh files for the commands that run the scripts.
The code computes the Myerson values (Shapley values on a graph) of both the policy and the stats of the agents in a 3v3 "arena" game.
To reproduce the experiments, run the coop_xy.sh scripts, where x is the policy of team A (r = random, s = smart, n = noop) and y is the policy of team B.
For example, coop_ss.sh contains:
python -W ignore main.py --exact 15 --full 1 --sim_num 72 --pol_a smart --pol_b smart
This generates two .json files and two .npz files containing the results. Once you have run the experiments for all combinations of policies, put the output in the folder statisticaltests and analyze it with results_table.py, which generates a LaTeX table reporting whether the importance of each feature differs significantly from zero and comparing the Myerson and Shapley outputs.
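As a convenience, the nine policy combinations can be enumerated programmatically. The following is a minimal sketch, assuming that one coop_xy.sh script exists for every pair of policy codes and that the scripts are launched with bash from the repository root; the helper names `script_names` and `run_all` are hypothetical and not part of the repository.

```python
import itertools
import subprocess

# Policy codes used in the script names: r = random, s = smart, n = noop.
POLICY_CODES = ("r", "s", "n")


def script_names(codes=POLICY_CODES):
    """Return the coop_xy.sh name for every (team A, team B) policy pair."""
    return [f"coop_{a}{b}.sh" for a, b in itertools.product(codes, repeat=2)]


def run_all(dry_run=True):
    """Print the commands (dry run) or run each script in sequence.

    Assumption: all nine scripts exist in the current directory.
    """
    for name in script_names():
        if dry_run:
            print("bash", name)
        else:
            subprocess.run(["bash", name], check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Running with `dry_run=True` only prints the commands, which is a safe way to check the list before starting the actual experiments.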
This game is inspired by World of Warcraft 3v3 arenas.
Two teams, each made up of a Warrior, a Mage and a Priest, fight each other.
The teams take turns performing their sequences of actions.
At the beginning of each fight, one team is chosen to act first.
Within each team, the players act in the following order: 1) Warrior, 2) Mage, 3) Priest.
Every player has a policy and a set of stats:
- Maximum Health Points
- Attack Power
- Healing Power
- Control Chance
The Warrior can only attack an enemy player.
He damages the enemy by an amount equal to his Attack Power.
The Mage can only control (put to sleep) an enemy player.
His chance of controlling the enemy is equal to his Control Chance * (1 + Attack Power/20).
When an enemy player is put to sleep, he cannot perform any action during the next turn.
The Priest can only heal a teammate.
He heals the teammate by an amount equal to his Healing Power.
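The three actions described above can be sketched in a few lines. This is an illustrative sketch, not the repository's implementation: the `Player` class and the function names are hypothetical, and two details are assumptions that the text does not state, namely that the control probability is clipped to 1 and that healing cannot exceed Maximum Health Points.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Player:
    role: str
    max_hp: float
    attack_power: float
    healing_power: float
    control_chance: float
    hp: float = field(init=False)
    asleep: bool = field(default=False, init=False)

    def __post_init__(self):
        # Every player starts the fight at full health.
        self.hp = self.max_hp


def attack(warrior, target):
    """Warrior action: damage equal to the warrior's Attack Power."""
    target.hp = max(0.0, target.hp - warrior.attack_power)


def control_probability(mage):
    """Control Chance * (1 + Attack Power / 20), clipped to 1 (assumption)."""
    return min(1.0, mage.control_chance * (1 + mage.attack_power / 20))


def control(mage, target, rng=random):
    """Mage action: put the target to sleep with the probability above."""
    target.asleep = rng.random() < control_probability(mage)
    return target.asleep


def heal(priest, target):
    """Priest action: heal by Healing Power, capped at max HP (assumption)."""
    target.hp = min(target.max_hp, target.hp + priest.healing_power)
```

For instance, a Mage with Control Chance 0.5 and Attack Power 10 would succeed with probability 0.5 * (1 + 10/20) = 0.75 under this formula.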
Four different policies are available: 1) Random, 2) Smart, 3) No-op, 4) Deep RL (A2C).
Random: the targets of the Warrior and the Mage are chosen uniformly at random among the living enemies. The target of the Priest is chosen among the living teammates.
Smart: the Warrior and the Mage target the enemies following this priority list: 1) Priest, 2) Mage, 3) Warrior. The Priest always heals the teammate with the least HP.
No-op: no agent can act.
Deep RL: the agents act following the output of a trained Deep RL algorithm (StableBaselines3's A2C).
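The target selection of the Random and Smart policies can be sketched as follows. This is a hedged illustration: players are represented as plain dictionaries with hypothetical keys `role` and `hp`, the function names are not taken from the repository, and the sketch assumes that the Priest counts as one of his own possible heal targets, which the description above leaves open.

```python
import random

# Smart policy: offensive target priority.
SMART_PRIORITY = ("priest", "mage", "warrior")


def alive(players):
    """Keep only the players that still have health points."""
    return [p for p in players if p["hp"] > 0]


def random_target(candidates, rng=random):
    """Random policy: pick uniformly among the living candidates."""
    living = alive(candidates)
    return rng.choice(living) if living else None


def smart_offensive_target(enemies):
    """Smart policy for Warrior and Mage: first living enemy in priority order."""
    living = alive(enemies)
    for role in SMART_PRIORITY:
        for player in living:
            if player["role"] == role:
                return player
    return None


def smart_heal_target(teammates):
    """Smart policy for the Priest: the living teammate with the least HP."""
    living = alive(teammates)
    return min(living, key=lambda p: p["hp"]) if living else None
```

The No-op policy needs no selection logic, since the agent simply skips its action.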
The Myerson values are computed for the following characteristics:
- Warrior Maximum Health Points
- Warrior Policy
- Warrior Attack Power
- Warrior Healing Power
- Warrior Control Chance
- Mage Maximum Health Points
- Mage Policy
- Mage Attack Power
- Mage Healing Power
- Mage Control Chance
- Priest Maximum Health Points
- Priest Policy
- Priest Attack Power
- Priest Healing Power
- Priest Control Chance
When a characteristic is not present in a coalition, it is set to 0. When a policy is not present in a coalition, the agent does not perform any action.
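The masking rule above can be illustrated with a short sketch. The data layout (a dictionary of stats per role, a dictionary of policy names per role, and a coalition given as a set of `(role, feature)` pairs) and the function name `apply_coalition` are assumptions made for this example, not the repository's actual interface.

```python
ROLES = ("warrior", "mage", "priest")
STATS = ("max_hp", "attack_power", "healing_power", "control_chance")

# The 15 characteristics: four stats plus the policy for each of the three roles.
FEATURES = [(role, feat) for role in ROLES for feat in STATS + ("policy",)]


def apply_coalition(stats, policies, coalition):
    """Return (stats, policies) with everything outside the coalition neutralised.

    A stat that is not in the coalition is set to 0; a policy that is not in
    the coalition is replaced by "noop", so the agent does not act.
    """
    masked_stats = {
        role: {
            stat: (value if (role, stat) in coalition else 0)
            for stat, value in role_stats.items()
        }
        for role, role_stats in stats.items()
    }
    masked_policies = {
        role: (policy if (role, "policy") in coalition else "noop")
        for role, policy in policies.items()
    }
    return masked_stats, masked_policies
```

With this layout, the full coalition is simply `set(FEATURES)`, and the empty coalition yields a team with all stats at 0 and no acting agent.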
Computing the Shapley values for this set of characteristics is already computationally expensive, given the huge number of possible coalitions.
However, a-priori knowledge about the structure of the game allows us to build a graph and to compute the Myerson values on it. Note that, to compute the Myerson values, the utility of a coalition is the sum of the utilities of its connected components. This greatly reduces the complexity of the approach.
Since a player whose Max HP is set to 0 is dead, and a player whose policy is not present does not act, we can build the following graph for the game:
It is clear that when the Max HP of a player is not in a coalition, the whole branch linked to it does not contribute to the coalition utility (the same holds for the Policy).
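The definition used above (the utility of a coalition is the sum of the utilities of its connected components, and the Myerson value is the Shapley value of this graph-restricted game) can be written down directly. The sketch below is a brute-force illustration on a toy game and does not reproduce the efficient computation of the paper or the repository; the three-node chain hp, policy, attack and the utility function `toy_utility` are hypothetical and chosen only to keep the example small.

```python
from itertools import combinations
from math import factorial


def connected_components(nodes, edges):
    """Connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    adj = {n: set() for n in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            cur = stack.pop()
            if cur not in comp:
                comp.add(cur)
                stack.extend(adj[cur] - comp)
        seen |= comp
        components.append(frozenset(comp))
    return components


def graph_restricted(v, edges):
    """Graph-restricted game: sum v over the components of each coalition."""
    return lambda s: sum(v(c) for c in connected_components(s, edges))


def shapley_values(players, v):
    """Exact Shapley values by enumerating all coalitions (exponential cost)."""
    n = len(players)
    values = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (v(s | {i}) - v(s))
        values[i] = total
    return values


def myerson_values(players, edges, v):
    """Myerson values: Shapley values of the graph-restricted game."""
    return shapley_values(players, graph_restricted(v, edges))


# Toy example (hypothetical): a chain hp - policy - attack.
PLAYERS = ["hp", "policy", "attack"]
EDGES = [("hp", "policy"), ("policy", "attack")]


def toy_utility(coalition):
    # Pays 1 only when both "hp" and "attack" are present.
    return 1.0 if {"hp", "attack"} <= set(coalition) else 0.0


if __name__ == "__main__":
    print(shapley_values(PLAYERS, toy_utility))
    print(myerson_values(PLAYERS, EDGES, toy_utility))
```

In this toy game, "hp" and "attack" are only connected through "policy", so a coalition without "policy" splits into components that each have utility 0. The Myerson values therefore give "policy" a share of the credit, whereas the plain Shapley values give it none.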
If you used the research from the paper "Towards a more efficient computation of individual attribute and policy contribution for post-hoc explanation of cooperative multi-agent systems using Myerson values" by Giorgio Angelotti and Natalia Díaz-Rodríguez, published in the journal "Knowledge-Based Systems" in 2023, please cite it as follows:
@article{angelotti2023towards,
title={Towards a more efficient computation of individual attribute and policy contribution for post-hoc explanation of cooperative multi-agent systems using Myerson values},
author={Giorgio Angelotti and Natalia Díaz-Rodríguez},
journal={Knowledge-Based Systems},
volume={260},
pages={110189},
year={2023},
issn={0950-7051},
doi={https://doi.org/10.1016/j.knosys.2022.110189},
keywords={Explainable Multi-Agent Systems, Explainable Artificial Intelligence, Myerson values, Shapley values, A-priori knowledge graphs},
}
Copyright 2022 Giorgio Angelotti