Two players take turns rolling dice, each keeping a running total across turns.
Each player is trying to reach a total inside a specified target range, from a fixed low value (LTarget) to a fixed high value (UTarget), inclusive. A player whose total lands in the target range immediately wins.
A player whose total exceeds the high value immediately loses.
On each turn, a player chooses how many dice to roll, from 1 up to a fixed maximum (NDice).
- NSides (int): The number of sides of the die. The die is numbered 1 to NSides, and all outcomes are equally likely.
- LTarget (int): The lowest winning value.
- UTarget (int): The highest winning value.
- NDice (int): The maximum number of dice a player may roll.
- M (float): A hyperparameter that controls the "explore/exploit" trade-off.
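
To make the rules and parameters above concrete, here is a minimal sketch of how a single turn could be resolved. The function name `play_turn` and its return convention are assumptions made for illustration; they are not taken from the program itself.

```python
import random


def play_turn(total, n_dice, n_sides, l_target, u_target, rng=random):
    """Roll n_dice fair dice and classify the player's new total.

    Hypothetical helper: returns (new_total, outcome), where outcome is
    "win", "lose", or "continue".
    """
    # Each die is numbered 1..n_sides with equally likely outcomes.
    roll = sum(rng.randint(1, n_sides) for _ in range(n_dice))
    new_total = total + roll
    if new_total > u_target:
        return new_total, "lose"      # overshot the highest winning value
    if new_total >= l_target:
        return new_total, "win"       # landed in [l_target, u_target]
    return new_total, "continue"      # still below the target range


# Example: NSides=6, LTarget=15, UTarget=17, rolling 2 dice from a total of 10.
print(play_turn(10, 2, 6, 15, 17))
```

The number of dice passed in should lie between 1 and NDice; the sketch leaves that check to the caller.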
- Make sure Python3 is installed.
- Run the program from the terminal using "python3 <program_name>.py".
Note that this is not an interactive game. It is a program that simulates games between two players in order to give the user an optimal playing strategy.
The program outputs two LTarget × LTarget arrays, both indexed by the state (X, Y): the first gives the correct number of dice to roll in that state, and the second gives the probability of winning if you roll that number of dice.
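
Each array has LTarget rows and columns because a game still in progress can only have totals from 0 to LTarget - 1; any higher total has already ended the game. The sketch below shows one way the two arrays could be consulted. The names `recommend`, `dice_table`, and `prob_table`, the `[x][y]` index order, and the numbers in the tables are all assumptions for illustration, not actual program output.

```python
def recommend(dice_table, prob_table, x, y):
    """Look up the suggested dice count and win probability for state (x, y).

    Hypothetical helper: assumes both tables are square lists of lists
    indexed as table[x][y], with x and y in 0..LTarget-1.
    """
    size = len(dice_table)
    if not (0 <= x < size and 0 <= y < size):
        raise IndexError("state is outside the LTarget x LTarget table")
    return dice_table[x][y], prob_table[x][y]


# Placeholder tables for LTarget = 2; the values are invented, not computed.
dice_table = [[2, 1],
              [1, 1]]
prob_table = [[0.5, 0.5],
              [0.5, 0.5]]

n_dice, p_win = recommend(dice_table, prob_table, 0, 1)
print(n_dice, p_win)
```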