Berkeley CS 294: Deep Reinforcement Learning - Study Group

Project name: CS294: Deep Reinforcement Learning
Project leader: Vatsal Mahajan
Project leader Slack username: @vatsal
Project Slack channel: #cs294_reinforcement

Description

We will work through the material from the Fall 2017 offering of CS 294. The group will meet to review the lectures and work on the assignments. The course has 20 lectures in total, so we will review 3 lectures every time we meet.

The assignments use environments from OpenAI Gym. The general setup for the assignments includes TensorFlow, OpenAI Gym, MuJoCo, and Anaconda.
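For anyone new to Gym-style environments, here is a minimal sketch of the agent-environment loop the assignments build on. It does not use the real Gym library: `CoinGuessEnv` and `run_episode` are hypothetical names made up for illustration. The sketch only mimics the classic interface, where `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`. Newer Gym releases may use different signatures, so check the version pinned by the assignments.

```python
import random


class CoinGuessEnv:
    """Hypothetical toy environment (not part of Gym).

    The observation is the current coin face (0 or 1). The agent earns a
    reward of 1.0 whenever its action matches that face.
    """

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        # Score the action against the current coin, then flip a new one.
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        done = self.t >= self.episode_length
        self.coin = self.rng.randint(0, 1)
        return self.coin, reward, done, {}


def run_episode(env, policy):
    """Run one episode and return the total reward collected."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total
```

A policy that simply copies the observation, `run_episode(CoinGuessEnv(), lambda obs: obs)`, collects the maximum return of 10.0 here. A policy that ignores the observation generally collects less.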

At these meetings, we will focus on:

  1. Reviewing the lectures - one member will probably present a summary of the topics covered in each lecture. (So, if you did not find time to go through the lecture, you can still come.)
  2. Q&A - Ask questions about the topics that you did not understand, or rework the derivations.
  3. Working on the assignments.

Course Syllabus: http://rll.berkeley.edu/deeprlcourse/

Why RL?

Reinforcement learning (RL) is the subfield of machine learning concerned with decision making and motor control. It studies how an agent can learn how to achieve goals in a complex, uncertain environment. It’s exciting for two reasons:

RL is very general, encompassing all problems that involve making a sequence of decisions: for example, controlling a robot’s motors so that it’s able to run and jump, making business decisions like pricing and inventory management, or playing video games and board games. RL can even be applied to supervised learning problems with sequential or structured outputs.

RL algorithms have started to achieve good results in many difficult environments. RL has a long history, but until recent advances in deep learning, it required lots of problem-specific engineering. DeepMind’s Atari results, BRETT from Pieter Abbeel’s group, and AlphaGo all used deep RL algorithms which did not make too many assumptions about their environment and thus can be applied in other settings.

Group meetup schedule

We can meet once every 2 weeks. (We can make the schedule more flexible once we have more members in the group.)

Jan 30: First meetup

I will add the complete schedule after the first meetup.

Thanks to the Berkeley course staff for making the material publicly available.