Examples of finite-horizon zero-sum differential games implemented as environments for reinforcement learning algorithms

imm-rl-lab/differential_game_gym

Differential Game Gym

The repository contains examples of finite-horizon zero-sum differential games implemented as environments (Markov games) for multi-agent reinforcement learning algorithms. Because the problems are originally described by differential equations, they are formalized as Markov games using a uniform time discretization with step (diameter) dt. Note that in finite-horizon games the agents' optimal policies depend not only on the phase vector $x$ but also on the time $t$. The result is a family of Markov games, parameterized by dt, with a continuous state space $S$ of states $s=(t,x)$ and a continuous action space $A$.
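As a rough illustration of how a differential equation becomes a Markov game transition, the sketch below applies an explicit Euler step to toy dynamics $\dot{x} = u - v$ and carries the time $t$ inside the state $s=(t,x)$. The dynamics, the function name `euler_step`, and the choice of the Euler scheme are assumptions made for this example; the repository may use different dynamics and a different integration scheme.

```python
import numpy as np

def euler_step(state, u, v, dt):
    """One explicit Euler step for the toy dynamics dx/dt = u - v.

    The state is s = (t, x), stored as a flat array [t, x_1, ..., x_n].
    This function and the dynamics are hypothetical, for illustration only.
    """
    t, x = state[0], state[1:]
    x_next = x + dt * (np.asarray(u) - np.asarray(v))
    # Time is part of the state, so it advances together with x.
    return np.concatenate(([t + dt], x_next))

dt = 0.5
state = np.array([0.0, 1.0])  # t = 0, x = 1
state = euler_step(state, u=[1.0], v=[0.0], dt=dt)
state = euler_step(state, u=[1.0], v=[0.0], dt=dt)
print(state)
```

A smaller dt gives a Markov game that follows the continuous-time dynamics more closely, at the cost of more steps per episode.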

Interface

The finite-horizon zero-sum differential games are implemented as environments (Markov games) with an interface close to OpenAI Gym with the following attributes:

  • state_dim - the state space dimension;
  • u_action_dim - the action space dimension of the first agent;
  • v_action_dim - the action space dimension of the second agent;
  • terminal_time - the terminal time of the game (the horizon);
  • dt - the time-discretization diameter;
  • reset() - to get an initial state (deterministic);
  • step(u_action, v_action) - to get next_state, current reward, done (True if t > terminal_time, otherwise False), info;
  • virtual_step(state, u_action, v_action) - to get the same as from step(u_action, v_action), but with the starting state passed in explicitly as state.
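The interface above can be sketched with a small stand-in environment and a rollout loop. The class `ToyGame`, its dynamics, its terminal-only reward, and the assumption that `virtual_step` leaves the environment's own state untouched are all hypothetical; only the attribute and method names come from the list above.

```python
import numpy as np

class ToyGame:
    """Hypothetical environment that mirrors the attribute list above."""

    def __init__(self, dt=0.25, terminal_time=1.0):
        self.state_dim = 2       # s = (t, x) with scalar x
        self.u_action_dim = 1    # first agent
        self.v_action_dim = 1    # second agent
        self.terminal_time = terminal_time
        self.dt = dt
        self.initial_state = np.array([0.0, 1.0])
        self.state = self.initial_state.copy()

    def reset(self):
        # Deterministic initial state.
        self.state = self.initial_state.copy()
        return self.state

    def virtual_step(self, state, u_action, v_action):
        # Assumed behaviour: compute the transition from the given state
        # without modifying self.state.
        t, x = state
        x_next = x + self.dt * (u_action[0] - v_action[0])
        next_state = np.array([t + self.dt, x_next])
        done = bool(next_state[0] > self.terminal_time)
        # Assumed payoff: a terminal reward only, zero elsewhere.
        reward = -x_next ** 2 if done else 0.0
        return next_state, reward, done, None

    def step(self, u_action, v_action):
        next_state, reward, done, info = self.virtual_step(
            self.state, u_action, v_action
        )
        self.state = next_state
        return next_state, reward, done, info

env = ToyGame()
state = env.reset()
done, total_reward, n_steps = False, 0.0, 0
while not done:
    # Placeholder policies: both agents play zero actions.
    u_action = np.zeros(env.u_action_dim)
    v_action = np.zeros(env.v_action_dim)
    state, reward, done, info = env.step(u_action, v_action)
    total_reward += reward
    n_steps += 1
print(n_steps, total_reward, state)
```

In this sketch `step` delegates to `virtual_step`, so a planning or model-based method could query transitions from arbitrary states while the actual episode advances only through `step`.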
