The path forward:
- The Alberta Plan for AI Research
- Reward-respecting subtasks for model-based reinforcement learning
- FM/LLM-powered RL Agents
- A2C: Advantage Actor-Critic
- ACER: Sample Efficient Actor-Critic with Experience Replay
- ACKTR: Actor Critic using Kronecker-Factored Trust Region
- AQT: Action Q-Transformer
- DQN: Deep Q-Network
- DDPG: Deep Deterministic Policy Gradient
- FuNs: Feudal Networks for Hierarchical Reinforcement Learning
- GAE: High-Dimensional Continuous Control Using Generalized Advantage Estimation
- GAIL: Generative Adversarial Imitation Learning
- GCL: Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization
- HER: Hindsight Experience Replay
- IMPALA: Importance Weighted Actor-Learner Architectures
- NAF: Normalised Advantage Functions
- NEC: Neural Episodic Control
- OK: The option keyboard: Combining skills in reinforcement learning
- Option-Critic: The Option-Critic Architecture
- PPO: Proximal Policy Optimization
- Continual PPO: Loss of Plasticity in Deep Continual Learning
  - Continual PPO is presented in Appendix E
  - The paper introduces "continual backpropagation": a utility measure is tracked for each hidden unit and used to guide selective re-initialisation of parameters (see the sketch after this list)
- TRPO: Trust-Region Policy Optimization
- Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
- REINFORCE: Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
- Baseline: REINFORCE with State-Value Baseline
- SAC: Soft Actor-Critic
- World Model: Recurrent World Models Facilitate Policy Evolution
- AlphaZero: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
- AlphaGo Zero: Mastering the game of Go without human knowledge
- DreamerV3: Mastering Diverse Domains through World Models
- I2A: Imagination-augmented agents for deep reinforcement learning
- ICM: Curiosity-driven Exploration by Self-supervised Prediction
- PETS: Probabilistic Ensembles with Trajectory Sampling
- SAVE: Search with Amortized Value Estimates
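
The continual backpropagation entry above is the one item that describes a mechanism, so here is a minimal sketch of the idea: track a per-unit utility and periodically re-initialise the least useful, mature units. The utility formula, `replacement_rate`, and `maturity_threshold` below are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of utility-guided re-initialisation ("continual backpropagation"
# from Loss of Plasticity in Deep Continual Learning). Illustrative only: the
# utility definition and hyperparameters here are assumptions.
import numpy as np

class ContinualBackpropLayer:
    """One ReLU hidden layer that re-initialises its least useful units."""

    def __init__(self, n_in, n_units, replacement_rate=1e-4,
                 maturity_threshold=100, decay=0.99, rng=None):
        self.rng = rng or np.random.default_rng(0)
        self.n_in = n_in
        self.W_in = self.rng.normal(0, np.sqrt(2.0 / n_in), (n_in, n_units))
        self.W_out = self.rng.normal(0, np.sqrt(2.0 / n_units), (n_units, 1))
        self.utility = np.zeros(n_units)          # running utility per unit
        self.age = np.zeros(n_units, dtype=int)   # steps since (re)initialisation
        self.replacement_rate = replacement_rate
        self.maturity_threshold = maturity_threshold
        self.decay = decay
        self._to_replace = 0.0                    # fractional replacement budget

    def forward(self, x):
        # x: (batch, n_in) input batch
        h = np.maximum(0.0, x @ self.W_in)        # ReLU hidden activations
        # Utility proxy (assumed): how much each unit contributes downstream,
        # tracked as an exponential moving average.
        contribution = np.abs(h).mean(axis=0) * np.abs(self.W_out).sum(axis=1)
        self.utility = self.decay * self.utility + (1 - self.decay) * contribution
        self.age += 1
        return h @ self.W_out

    def reinitialise_low_utility_units(self):
        """Replace a small fraction of mature, low-utility units each step."""
        mature = self.age > self.maturity_threshold
        self._to_replace += self.replacement_rate * mature.sum()
        n_replace = int(self._to_replace)
        if n_replace == 0:
            return
        self._to_replace -= n_replace
        candidates = np.where(mature)[0]
        worst = candidates[np.argsort(self.utility[candidates])[:n_replace]]
        for j in worst:
            self.W_in[:, j] = self.rng.normal(0, np.sqrt(2.0 / self.n_in), self.n_in)
            self.W_out[j, :] = 0.0                # new unit starts with no downstream effect
            self.utility[j] = 0.0
            self.age[j] = 0
```

The key design point is that only a tiny fraction of mature, low-utility units is replaced per step, so the network keeps learning on the current task while regaining the plasticity that standard backpropagation gradually loses.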