mikirov/Elsys-Thesis-Work-2019-2020


Brawler Arena Shooter Game with Reinforcement Learning Artificial Intelligence

This project implements and compares a reinforcement learning AI and a conventional behavior tree AI on top of a 3D multiplayer brawler arena shooter game.

The Game:

Weapons:

  • The game implements a data-driven weapon system with multiple projectile types, ammunition, particle systems, and reload animations. Everything runs server-side and is synchronized across all player instances.
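A data-driven weapon system like the one described above can be sketched as a plain data struct that fully parameterizes a weapon, plus a thin state object driven by it. This is an illustrative sketch, not the project's actual classes; all names and values here are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical data-driven weapon definition: every tunable value lives
// in data, so adding a new weapon needs no new code paths.
struct WeaponData {
    std::string name;
    int magazineSize;       // rounds per magazine
    float fireRate;         // shots per second
    float projectileSpeed;  // units per second
    float reloadTime;       // seconds
};

// Minimal runtime state driven entirely by the data above.
struct WeaponState {
    const WeaponData* data;
    int roundsLeft;

    bool Fire() {
        if (roundsLeft <= 0) return false;  // empty magazine: must reload
        --roundsLeft;
        return true;
    }
    void Reload() { roundsLeft = data->magazineSize; }
};
```

In a networked game the server would own `WeaponState` and replicate the results of `Fire()` to clients, which matches the README's note that everything is server-side.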

[GIF: bazooka]

[GIF: default gun]

Co-op game mode:

  • The game implements a lobby that waits for four players to join before entering the main arena, then spawns behavior tree and reinforcement learning AI enemies against the four players.

[Image: lobby]

[Image: lobby2]

Various animations:

  • The players have animations for different actions; all animation logic is written in C++.

[GIF: walk animation]

[GIF: death animation]

[GIF: reload animation]

Chat:

  • A fully synchronized C++ chat is built into the game, with animations and a scroll button.

[GIF: chat animation]

The artificial intelligence:

Behavior tree AI:

  • The game implements a behavior tree enemy AI. All nodes are implemented in C++.

[Image: behavior tree]
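The core idea of a behavior tree can be sketched in a few lines of C++: each node ticks and reports Success, Failure, or Running, and composite nodes combine children. This is a minimal illustrative sketch, not the project's actual node classes.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Every tick, a node reports one of three results.
enum class Status { Success, Failure, Running };

struct Node {
    virtual ~Node() = default;
    virtual Status Tick() = 0;
};

// Leaf node wrapping an arbitrary action (e.g. "move to player", "shoot").
struct ActionNode : Node {
    std::function<Status()> fn;
    explicit ActionNode(std::function<Status()> f) : fn(std::move(f)) {}
    Status Tick() override { return fn(); }
};

// Sequence: succeeds only if every child succeeds, in order;
// stops at the first child that fails or is still running.
struct Sequence : Node {
    std::vector<Node*> children;
    Status Tick() override {
        for (Node* child : children) {
            Status s = child->Tick();
            if (s != Status::Success) return s;
        }
        return Status::Success;
    }
};
```

A real enemy AI would add Selector nodes, decorators, and a blackboard for shared state, but the tick-and-report contract above is the part every behavior tree implementation shares.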

Reinforcement learning AI:

[GIF: training]

  • The reinforcement learning AI consists of two phases: training the agent and playing against the agent:
    • A fully customizable training user interface was created. It allows the developer to tweak training parameters in real time to achieve the desired behavior.
    • Progress is being made towards exporting the RL code to a plugin.
    • The agent stores its progress when killed, so when another one spawns in the next epoch, training continues where it left off.
  • After an agent is trained, it reads the model data from its Q-table and performs the best action.
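The "reads the model data from its Q-table and performs the best action" step is a greedy argmax over the Q-values for the current state. A minimal sketch, with illustrative types rather than the project's actual classes:

```cpp
#include <cassert>
#include <map>
#include <utility>

// Tabular Q-function: maps a (state, action) pair to its learned value.
// States and actions are plain ints here for illustration.
using QTable = std::map<std::pair<int, int>, double>;

// Greedy policy of a trained agent: look up the current state's row
// and pick the action with the highest Q-value.
int BestAction(const QTable& q, int state, int numActions) {
    int best = 0;
    double bestValue = -1e18;
    for (int a = 0; a < numActions; ++a) {
        auto it = q.find({state, a});
        double v = (it != q.end()) ? it->second : 0.0;  // unseen pairs count as 0
        if (v > bestValue) { bestValue = v; best = a; }
    }
    return best;
}
```

Because the table is just data, persisting it between epochs (as the bullet about storing progress describes) amounts to serializing this map to disk and reloading it when the next agent spawns.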

Explanation of the reinforcement learning algorithm:

[Images: algorithm explanation]

  • The algorithm implemented in this project is tabular Q-learning.
  • It takes an agent and an environment.
  • The agent is given a finite set of possible actions it can perform.
  • The environment gives the agent the current state of the game.
  • The environment gives the agent the reward for the previous action it performed.
  • The agent tries to maximize the total predicted future reward by choosing, in each state, the action with the highest predicted value.


About

Thesis work for our final year project at ELSYS
