Commit b9f72a6
* Updated documentation
nproellochs committed Apr 6, 2017
1 parent 765b792 commit b9f72a6
Showing 18 changed files with 66 additions and 67 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1 +1,2 @@
inst/doc
.Rproj.user
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -8,8 +8,8 @@ Authors@R: c(person("Nicolas", "Proellochs", email="[email protected]
person("Stefan", "Feuerriegel", email="[email protected]",
role=c("aut")))
Maintainer: Nicolas Proellochs <[email protected]>
Description: Performs model-free reinforcement learning in R. This implementation allows to learn
an optimal policy based on sample sequences consisting of states, actions and rewards. In
Description: Performs model-free reinforcement learning in R. This implementation enables the learning
of an optimal policy based on sample sequences consisting of states, actions and rewards. In
addition, it supplies multiple predefined reinforcement learning algorithms, such as experience
replay.
License: MIT + file LICENSE
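
The revised description summarizes the package's core use case: learning a policy from pre-collected sequences of states, actions and rewards. A minimal sketch of that use (not part of this commit, and assuming the package's documented `ReinforcementLearning()` interface with its default column and control parameter names) might look as follows:

```r
library(ReinforcementLearning)

# Toy input: each row is one observed transition tuple (s, a, r, s_new).
d <- data.frame(
  State     = c("s1", "s2", "s1", "s2"),
  Action    = c("right", "left", "left", "right"),
  Reward    = c(-1, 1, -1, 1),
  NextState = c("s2", "s1", "s1", "s1"),
  stringsAsFactors = FALSE
)

# Learn a state-action table Q and a policy from the sample sequences.
model <- ReinforcementLearning(d, s = "State", a = "Action",
                               r = "Reward", s_new = "NextState",
                               control = list(alpha = 0.2, gamma = 0.4, epsilon = 0.1))

model$Q       # learned state-action values
model$Policy  # best known action per state
```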
3 changes: 1 addition & 2 deletions R/actionSelection.R
@@ -10,8 +10,8 @@
#' @return Character value defining the next action.
#' @import hash
#' @importFrom stats runif
#' @references Sutton and Barto (1998). Reinforcement Learning: An Introduction, Adaptive
#' Computation and Machine Learning, MIT Press, Cambridge, MA.
#' @references Sutton and Barto (1998). "Reinforcement Learning: An Introduction", MIT Press, Cambridge, MA.
#' @export
epsilonGreedyActionSelection <- function(Q, state, epsilon) {
if (runif(1) <= epsilon) {
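
The function body is only partially visible in this hunk. As a self-contained illustration of the epsilon-greedy rule it documents (a sketch on a plain named vector of action values, not the package's hash-based Q-table):

```r
set.seed(1)

# Hypothetical action values for a single state.
q_values <- c(up = 0.1, down = 0.7, left = -0.2, right = 0.3)
epsilon  <- 0.1

select_action <- function(q, epsilon) {
  if (runif(1) <= epsilon) {
    sample(names(q), 1)            # explore: pick a random action
  } else {
    names(q)[which.max(q)]         # exploit: pick the greedy action
  }
}

select_action(q_values, epsilon)   # usually "down", occasionally a random action
```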
8 changes: 4 additions & 4 deletions R/data.R
@@ -1,14 +1,14 @@
#' Game states of 100,000 randomly sampled Tic Tac Toe games.
#' Game states of 100,000 randomly sampled Tic-Tac-Toe games.
#'
#' A dataset containing 406,541 games states of Tic Tac Toe.
#' A dataset containing 406,541 game states of Tic-Tac-Toe.
#' The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.
#' All states are observed from the perspective of player X who is also assumed to have played first.
#' All states are observed from the perspective of player X, who is also assumed to have played first.
#'
#' @format A data frame with 406,541 rows and 4 variables:
#' \describe{
#' \item{State}{The current game state, i.e. the state of the 3x3 grid.}
#' \item{Action}{The move of player X in the current game state.}
#' \item{NextState}{The next observed state after action selection of player X and B.}
#' \item{NextState}{The next observed state after action selection of players X and B.}
#' \item{Reward}{Indicates terminal and non-terminal game states. Reward is +1 for 'win', 0 for 'draw', and -1 for 'loss'.}
#' }
"tictactoe"
8 changes: 4 additions & 4 deletions R/experienceReplay.R
@@ -1,8 +1,8 @@
#' Performs experience replay
#'
#' Performs experience replay. Experience replay allows reinforcement learning agents to remember and reuse experiences from the past.
#' The algorithm solely requires input data in the form of sample sequences consisting of states, actions and rewards.
#' The result of the learning process is a state-action table Q that allows to infer the best possible action in each state.
#' The algorithm requires input data in the form of sample sequences consisting of states, actions and rewards.
#' The result of the learning process is a state-action table Q that allows one to infer the best possible action in each state.
#'
#' @param D A \code{dataframe} containing the input data for reinforcement learning.
#' Each row represents a state transition tuple \code{(s,a,r,s_new)}.
@@ -11,8 +11,8 @@
#' @param ... Additional parameters passed to function.
#' @return Returns an object of class \code{hash} that contains the learned Q-table.
#' @seealso \code{\link{ReinforcementLearning}}
#' @references Lin (1992) Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning.
#' @references Watkins (1992). Q-learning. Machine Learning.
#' @references Lin (1992). "Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching", Machine Learning (8:3), pp. 293--321.
#' @references Watkins (1992). "Q-learning". Machine Learning (8:3), pp. 279--292.
#' @import hash
#' @export
experienceReplay <- function(D, Q, control, ...) {
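
The references added above point to Watkins' Q-learning rule, which experience replay applies repeatedly to the stored tuples. A standalone sketch of that update on a plain matrix (the package's own implementation uses a hash-based Q-table and its own control structure):

```r
# One Q-learning step: move Q(s,a) toward the bootstrapped target.
q_update <- function(Q, s, a, r, s_new, alpha = 0.1, gamma = 0.9) {
  target <- r + gamma * max(Q[s_new, ])            # reward plus discounted best next value
  Q[s, a] <- Q[s, a] + alpha * (target - Q[s, a])  # small step toward the target
  Q
}

# Replaying a batch of remembered tuples means applying the update to each row.
Q <- matrix(0, nrow = 2, ncol = 2,
            dimnames = list(c("s1", "s2"), c("left", "right")))
D <- data.frame(State = c("s1", "s2"), Action = c("right", "left"),
                Reward = c(-1, 1), NextState = c("s2", "s1"),
                stringsAsFactors = FALSE)

for (i in seq_len(nrow(D))) {
  Q <- q_update(Q, D$State[i], D$Action[i], D$Reward[i], D$NextState[i])
}
Q
```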
8 changes: 4 additions & 4 deletions R/gridworld.R
@@ -1,10 +1,10 @@
#' Defines an environment for a gridworld example
#'
#' Function defines an environment for a 2x2 gridworld example. Here, an agent is intended to
#' navigate from an arbitrary start position to a goal position. The grid is surrounded by a wall
#' Function defines an environment for a 2x2 gridworld example. Here an agent is intended to
#' navigate from an arbitrary starting position to a goal position. The grid is surrounded by a wall,
#' which makes it impossible for the agent to move off the grid. In addition, the agent faces a wall between s1 and s4.
#' If the agent reaches the goal position, it earns a reward of 10. Crossing each square of the grid leads
#' to a negative reward of -1.
#' If the agent reaches the goal position, it earns a reward of 10. Crossing each square of the grid results in
#' a negative reward of -1.
#'
#' @param state The current state.
#' @param action Action to be executed.
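
A possible interaction with the environment function described above (state and action labels are taken from the 2x2 gridworld documentation; the exact return values depend on the grid layout):

```r
library(ReinforcementLearning)

env <- gridworldEnvironment

# One step: the environment returns the next state and the reward.
env(state = "s1", action = "down")
# e.g. $NextState "s2", $Reward -1 (crossing a square costs -1)
```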
2 changes: 1 addition & 1 deletion R/learningRule.R
@@ -1,7 +1,7 @@
#' Loads reinforcement learning algorithm
#'
#' Decides upon a learning rule for reinforcement learning.
#' Input is a name for the learning rule, output is the corresponding function object.
#' Input is a name for the learning rule, while output is the corresponding function object.
#'
#' @param type A string denoting the learning rule. Allowed values are \code{experienceReplay}.
#' @seealso \code{\link{ReinforcementLearning}}
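
A one-line illustration of the lookup described above, assuming the function is exported under the name given by its man page (man/lookupLearningRule.Rd):

```r
library(ReinforcementLearning)

f <- lookupLearningRule("experienceReplay")  # name -> function object
identical(f, experienceReplay)               # TRUE if the lookup resolves as documented
```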
6 changes: 3 additions & 3 deletions R/policy.R
@@ -1,11 +1,11 @@
#' Calculates the reinforcement learning policy
#'
#' Calculates reinforcement learning policy from a given state-action table Q.
#' The policy is the decision making function of the agent and defines the learning
#' agent's way of behaving at a given time.
#' The policy is the decision-making function of the agent and defines the learning
#' agent's behavior at a given time.
#'
#' @param x Variable which encodes the behavior of the agent. This can be
#' either a \code{matrix}, \code{data.frame} or a \code{\link{rl}} object.
#' either a \code{matrix}, \code{data.frame} or an \code{\link{rl}} object.
#' @seealso \code{\link{ReinforcementLearning}}
#' @return Returns the learned policy.
#' @rdname policy
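
Because the documentation above says `policy()` also accepts a plain state-action table, a small sketch with a hypothetical Q matrix (rows assumed to index states, columns actions):

```r
library(ReinforcementLearning)

# Hypothetical Q-values: two states, two actions.
Q <- matrix(c(-0.2, 0.8,
               0.5, 0.1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("s1", "s2"), c("left", "right")))

policy(Q)  # expected: "right" for s1, "left" for s2
```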
6 changes: 3 additions & 3 deletions README.Rmd
@@ -21,16 +21,16 @@ set.seed(0)
[![Build Status](https://travis-ci.org/nproellochs/ReinforcementLearning.svg?branch=master)](https://travis-ci.org/nproellochs/ReinforcementLearning)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/ReinforcementLEarning)](https://cran.r-project.org/package=ReinforcementLearning)

**ReinforcementLearning** performs model-free reinforcement learning in R. This implementation allows to learn
**ReinforcementLearning** performs model-free reinforcement learning in R. This implementation enables the learning of
an optimal policy based on sample sequences consisting of states, actions and rewards. In
addition, it supplies multiple predefined reinforcement learning algorithms, such as experience
replay.

## Overview

The most important functions in **ReinforcementLearning** are:
The most important functions of **ReinforcementLearning** are:

- Learning an optimal policy from a fixed set of a priori-known transition samples
- Learning an optimal policy from a fixed set of a priori known transition samples
- Predefined learning rules and action selection modes
- A highly customizable framework for model-free reinforcement learning tasks

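
The README bullets above describe the intended workflow: generate transition samples from a predefined environment, then learn a policy from that fixed set. A sketch of that workflow, assuming the sampling helper and parameter names documented in the package:

```r
library(ReinforcementLearning)

env     <- gridworldEnvironment
states  <- c("s1", "s2", "s3", "s4")
actions <- c("up", "down", "left", "right")

# 1. Collect a fixed set of a priori known transition samples.
data <- sampleExperience(N = 1000, env = env, states = states, actions = actions)

# 2. Learn an optimal policy from the collected samples
#    (experience replay is the predefined learning rule).
model <- ReinforcementLearning(data, s = "State", a = "Action", r = "Reward",
                               s_new = "NextState",
                               control = list(alpha = 0.1, gamma = 0.5, epsilon = 0.1))

policy(model)  # recommended action in each state
```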
6 changes: 3 additions & 3 deletions README.md
@@ -5,14 +5,14 @@ Reinforcement Learning

[![Build Status](https://travis-ci.org/nproellochs/ReinforcementLearning.svg?branch=master)](https://travis-ci.org/nproellochs/ReinforcementLearning) [![CRAN\_Status\_Badge](http://www.r-pkg.org/badges/version/ReinforcementLEarning)](https://cran.r-project.org/package=ReinforcementLearning)

**ReinforcementLearning** performs model-free reinforcement learning in R. This implementation allows to learn an optimal policy based on sample sequences consisting of states, actions and rewards. In addition, it supplies multiple predefined reinforcement learning algorithms, such as experience replay.
**ReinforcementLearning** performs model-free reinforcement learning in R. This implementation enables the learning of an optimal policy based on sample sequences consisting of states, actions and rewards. In addition, it supplies multiple predefined reinforcement learning algorithms, such as experience replay.

Overview
--------

The most important functions in **ReinforcementLearning** are:
The most important functions of **ReinforcementLearning** are:

- Learning an optimal policy from a fixed set of a priori-known transition samples
- Learning an optimal policy from a fixed set of a priori known transition samples
- Predefined learning rules and action selection modes
- A highly customizable framework for model-free reinforcement learning tasks

Binary file modified ReinforcementLearning.pdf
Binary file not shown.
3 changes: 1 addition & 2 deletions man/epsilonGreedyActionSelection.Rd

Some generated files are not rendered by default.

8 changes: 4 additions & 4 deletions man/experienceReplay.Rd


8 changes: 4 additions & 4 deletions man/gridworldEnvironment.Rd


2 changes: 1 addition & 1 deletion man/lookupLearningRule.Rd


6 changes: 3 additions & 3 deletions man/policy.Rd


8 changes: 4 additions & 4 deletions man/tictactoe.Rd

