Differential Game Gym

This repository contains examples of finite-horizon zero-sum differential games implemented as environments (Markov games) for multi-agent reinforcement learning algorithms. Because the problems are originally described by differential equations, they are formalized as Markov games through a uniform time discretization with step (diameter) dt. Note that in finite-horizon games the agents' optimal policies depend not only on the phase vector $x$ but also on the time $t$. The result is a family of Markov games, parameterized by dt, with a continuous state space $S$ of states $s=(t,x)$ and a continuous action space $A$.
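To make the discretization concrete, here is a minimal sketch of one transition of such a Markov game. It assumes an explicit Euler scheme and a scalar phase variable; the README does not say which scheme the repository uses, and the dynamics `f` and the helper `discretized_step` are hypothetical names chosen only for illustration.

```python
def f(t, x, u, v):
    # Hypothetical dynamics dx/dt = u - v for a scalar phase variable x.
    return u - v


def discretized_step(state, u, v, dt):
    # The state s = (t, x) carries the time t together with the phase variable x.
    t, x = state
    # Explicit Euler update (an assumption, not necessarily the repository's scheme).
    return (t + dt, x + dt * f(t, x, u, v))


state = (0.0, 1.0)   # s = (t, x) at the initial time
dt = 0.5             # time-discretization step
state = discretized_step(state, u=1.0, v=0.0, dt=dt)
```

Since the time $t$ is part of the state, a policy defined on $S$ can depend on the time remaining until the end of the game.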

Interface

The finite-horizon zero-sum differential games are implemented as environments (Markov games) whose interface is close to that of OpenAI Gym. Each environment provides the following attributes and methods:

  • state_dim - the state space dimension;
  • u_action_dim - the action space dimension of the first agent;
  • v_action_dim - the action space dimension of the second agent;
  • terminal_time - the time at which the game ends (the finite horizon);
  • dt - the time-discretization diameter;
  • reset() - to get an initial state (deterministic);
  • step(u_action, v_action) - to get next_state, current reward, done (True if t > terminal_time, otherwise False), info;
  • virtual_step(state, u_action, v_action) - to get the same as from step(u_action, v_action), but with the current state also given as an argument.
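The interface above can be illustrated with a small toy environment and a rollout loop. This is only a sketch under stated assumptions: the class `ToyGame`, its dynamics, its initial state, and its terminal payoff are hypothetical and not taken from the repository. It also assumes that `virtual_step` leaves the environment's stored state unchanged, which the list above does not specify.

```python
class ToyGame:
    """Hypothetical environment following the attribute list above."""

    def __init__(self, dt=0.25, terminal_time=1.0):
        self.state_dim = 2          # state s = (t, x) with a scalar x
        self.u_action_dim = 1       # first agent's action dimension
        self.v_action_dim = 1       # second agent's action dimension
        self.terminal_time = terminal_time
        self.dt = dt
        self.state = None

    def reset(self):
        # Deterministic initial state (t, x) = (0, 1); the values are illustrative.
        self.state = (0.0, 1.0)
        return self.state

    def virtual_step(self, state, u_action, v_action):
        # Assumption: computes the transition from the given state
        # without modifying the stored state.
        t, x = state
        next_state = (t + self.dt, x + self.dt * (u_action - v_action))
        done = next_state[0] > self.terminal_time
        # Hypothetical payoff: paid once, when the game ends.
        reward = -next_state[1] ** 2 if done else 0.0
        return next_state, reward, done, {}

    def step(self, u_action, v_action):
        # Same transition as virtual_step, applied to the stored state.
        result = self.virtual_step(self.state, u_action, v_action)
        self.state = result[0]
        return result


env = ToyGame()
state = env.reset()
done = False
n_steps = 0
total_reward = 0.0
while not done:
    # Constant actions stand in for the two agents' policies.
    state, reward, done, info = env.step(u_action=0.5, v_action=0.5)
    total_reward += reward
    n_steps += 1
```

With `done` defined as `t > terminal_time`, the episode in this sketch ends on the first step whose time exceeds `terminal_time`, so the loop takes one step past `t = terminal_time`.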