SynchronousGoExplore

A first, bare-bones parallel implementation of Go Explore, as described in the Uber Engineering blog post.

Currently, no deep learning is incorporated into the project. The available exploration policies are random and Markov chain.
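To give a rough idea of what a Markov chain exploration policy can look like, here is a minimal sketch. It is not the code used in this project: the class name `MarkovChainPolicy` and the `repeat_prob` parameter are hypothetical, and the transition rule (repeat the previous action with a fixed probability, otherwise pick uniformly at random) is only one simple choice of chain.

```python
import random


class MarkovChainPolicy:
    """Hypothetical policy: the next action depends only on the previous one."""

    def __init__(self, n_actions, repeat_prob=0.8, seed=None):
        if n_actions < 1:
            raise ValueError("n_actions must be at least 1")
        self.n_actions = n_actions
        self.repeat_prob = repeat_prob  # assumed probability of repeating the last action
        self.rng = random.Random(seed)
        self.prev = None

    def reset(self):
        """Forget the previous action, e.g. when exploration restarts from a new cell."""
        self.prev = None

    def next_action(self):
        # Repeat the previous action with probability repeat_prob;
        # otherwise (or on the first step) sample uniformly from all actions.
        if self.prev is not None and self.rng.random() < self.repeat_prob:
            action = self.prev
        else:
            action = self.rng.randrange(self.n_actions)
        self.prev = action
        return action
```

Setting `repeat_prob=0.0` reduces this to the purely random policy, since every action is then sampled uniformly.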

The notebook syncGoExplore.ipynb demonstrates the use of Go Explore to create a speedrun of a level in a gym environment using multiple threads.
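The synchronous loop can be sketched roughly as follows. This is an illustration only, not the notebook's code: it uses `concurrent.futures` from the standard library instead of ray, a toy one-dimensional environment instead of a gym retro game, and uniform cell selection. The names `GOAL`, `replay`, `explore` and `sync_go_explore` are all hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor

GOAL = 10  # hypothetical end-of-level position in the toy environment


def replay(actions, start=0):
    """Replay a sequence of -1/+1 actions and return the final position."""
    position = start
    for action in actions:
        position = max(0, position + action)  # positions are clamped at 0
    return position


def explore(cell, trajectory, steps, seed):
    """Start from an archived cell, act randomly, and report every cell visited."""
    rng = random.Random(seed)
    position, actions, found = cell, list(trajectory), []
    for _ in range(steps):
        action = rng.choice((-1, 1))
        position = max(0, position + action)
        actions.append(action)
        found.append((position, tuple(actions)))
        if position == GOAL:
            break
    return found


def sync_go_explore(n_workers=4, rounds=200, steps=5, seed=0):
    rng = random.Random(seed)
    archive = {0: ()}  # cell -> shortest known action sequence that reaches it
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(rounds):
            # Select one archived cell per worker (uniformly, for simplicity).
            cells = [rng.choice(list(archive)) for _ in range(n_workers)]
            futures = [
                pool.submit(explore, c, archive[c], steps, rng.randrange(2**32))
                for c in cells
            ]
            # Synchronous step: wait for every worker, then merge into the archive.
            for future in futures:
                for cell, actions in future.result():
                    # Speedrun criterion: keep the shorter route to each cell.
                    if cell not in archive or len(actions) < len(archive[cell]):
                        archive[cell] = actions
    return archive
```

Calling `sync_go_explore()` returns the archive; if `GOAL` is among its keys, `archive[GOAL]` holds the shortest action sequence found for reaching the end of the toy level.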

Dependencies:

ray (Linux and macOS only)
gym retro
imageio (also needs FreeImage)
ROM file for the game environment

To do:

Add smarter exploration policies (fast simple models and deep learning)

Asynchronous Go Explore, i.e. allow workers to play continuously and update only when ready or necessary

Add iterative deepening

Add procedures for experiments to search for good hyperparameters

Add the comb operation - sequentially go to each state encountered in a run that reaches the end of the level
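As a starting point for the comb operation, one possible reading is sketched below. It is an assumption, not a settled design: it takes a run that reaches the end of the level, visits each state along it in order, and hands that state to an exploration callback. The toy environment (integer positions, -1/+1 actions) and the names `replay_states`, `comb` and `explore_from` are hypothetical.

```python
def replay_states(actions, start=0):
    """Return every state visited while replaying `actions` in the toy environment."""
    states, position = [start], start
    for action in actions:
        position = max(0, position + action)  # positions are clamped at 0
        states.append(position)
    return states


def comb(winning_actions, explore_from, start=0):
    """Go to each state of a winning run, in order, and explore from it.

    `explore_from(state, prefix)` is a hypothetical callback that receives the
    state and the action prefix that reaches it, and returns whatever it finds.
    """
    results = []
    for step, state in enumerate(replay_states(winning_actions, start)):
        prefix = tuple(winning_actions[:step])
        results.append(explore_from(state, prefix))
    return results
```

In a real emulator, the replay step would presumably be replaced by restoring saved emulator states rather than replaying actions from the start.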

Some early gameplay:

A very polished run:
