deep reinforcement learning

papers

  • Asynchronous Methods for Deep Reinforcement Learning. [arxiv] ⭐
  • Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto, et al., ICLR. [arxiv]
  • A New Softmax Operator for Reinforcement Learning. [url]
  • Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML. [arxiv]
  • Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR. [arxiv]
  • Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR. [arxiv]
  • Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv. [url]
  • Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML. [arxiv]
  • Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML. [arxiv]
  • Continuous control with deep reinforcement learning. [arxiv] ⭐
  • Deep Successor Reinforcement Learning. [arxiv]
  • Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop. [arxiv]
  • Deep Exploration via Bootstrapped DQN. [arxiv] ⭐
  • Deep Reinforcement Learning for Dialogue Generation. [arxiv] [tensorflow]
  • Deep Reinforcement Learning in Parameterized Action Space. [arxiv] ⭐
  • Deep Reinforcement Learning with Successor Features for Navigation across Similar Environments. [url]
  • Designing Neural Network Architectures using Reinforcement Learning. [arxiv] [code]
  • Dialogue manager domain adaptation using Gaussian process reinforcement learning. [arxiv]
  • End-to-End Reinforcement Learning of Dialogue Agents for Information Access. [arxiv]
  • Generating Text with Deep Reinforcement Learning. [arxiv]
  • Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv. [arxiv]
  • Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv. [arxiv]
  • Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv. [arxiv]
  • Hierarchical Object Detection with Deep Reinforcement Learning. [arxiv]
  • High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR. [arxiv]
  • Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI. [arxiv]
  • Interactive Spoken Content Retrieval by Deep Reinforcement Learning. [arxiv]
  • Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv. [url]
  • Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv. [url]
  • Learning to compose words into sentences with reinforcement learning. [url]
  • Loss is its own Reward: Self-Supervision for Reinforcement Learning. [arxiv]
  • Model-Free Episodic Control. [arxiv]
  • Mastering the game of Go with deep neural networks and tree search. [nature] ⭐
  • MazeBase: A Sandbox for Learning from Games. [arxiv]
  • Neural Architecture Search with Reinforcement Learning. [pdf]
  • Neural Combinatorial Optimization with Reinforcement Learning. [arxiv]
  • Non-Deterministic Policy Improvement Stabilizes Approximated Reinforcement Learning. [url]
  • Online Sequence-to-Sequence Active Learning for Open-Domain Dialogue Generation. arXiv. [arxiv]
  • Policy Distillation, A. A. Rusu et al., ICLR. [arxiv]
  • Prioritized Experience Replay. [arxiv] ⭐
  • Reinforcement Learning Using Quantum Boltzmann Machines. [arxiv]
  • Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al. [arxiv]
  • Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving. [arxiv]
  • Sample-efficient Deep Reinforcement Learning for Dialog Control. [url]
  • Self-Correcting Models for Model-Based Reinforcement Learning. [url]
  • Unifying Count-Based Exploration and Intrinsic Motivation. [arxiv]
  • Value Iteration Networks. [arxiv]