Commit b9f72a6
* Updated documentation
nproellochs committed Apr 6, 2017
1 parent 765b792 commit b9f72a6
Showing 18 changed files with 66 additions and 67 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1 +1,2 @@
inst/doc
.Rproj.user
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -8,8 +8,8 @@ Authors@R: c(person("Nicolas", "Proellochs", email="[email protected]
person("Stefan", "Feuerriegel", email="[email protected]",
role=c("aut")))
Maintainer: Nicolas Proellochs <[email protected]>
Description: Performs model-free reinforcement learning in R. This implementation allows to learn
an optimal policy based on sample sequences consisting of states, actions and rewards. In
Description: Performs model-free reinforcement learning in R. This implementation enables the learning
of an optimal policy based on sample sequences consisting of states, actions and rewards. In
addition, it supplies multiple predefined reinforcement learning algorithms, such as experience
replay.
License: MIT + file LICENSE
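
The revised description summarizes the package's core use case: learning a policy from pre-collected sequences of states, actions and rewards. A minimal sketch of that use (not part of this commit, and assuming the package's documented `ReinforcementLearning()` interface with its default column and control parameter names) might look as follows:

```r
library(ReinforcementLearning)

# Toy input: each row is one observed transition tuple (s, a, r, s_new).
d <- data.frame(
  State     = c("s1", "s2", "s1", "s2"),
  Action    = c("right", "left", "left", "right"),
  Reward    = c(-1, 1, -1, 1),
  NextState = c("s2", "s1", "s1", "s1"),
  stringsAsFactors = FALSE
)

# Learn a state-action table Q and a policy from the sample sequences.
model <- ReinforcementLearning(d, s = "State", a = "Action",
                               r = "Reward", s_new = "NextState",
                               control = list(alpha = 0.2, gamma = 0.4, epsilon = 0.1))

model$Q       # learned state-action values
model$Policy  # best known action per state
```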
3 changes: 1 addition & 2 deletions R/actionSelection.R
@@ -10,8 +10,8 @@
#' @return Character value defining the next action.
#' @import hash
#' @importFrom stats runif
#' @references Sutton and Barto (1998). Reinforcement Learning: An Introduction, Adaptive
#' Computation and Machine Learning, MIT Press, Cambridge, MA.
#' @references Sutton and Barto (1998). "Reinforcement Learning: An Introduction", MIT Press, Cambridge, MA.
#' @export
epsilonGreedyActionSelection <- function(Q, state, epsilon) {
if (runif(1) <= epsilon) {
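
The function body is only partially visible in this hunk. As a self-contained illustration of the epsilon-greedy rule it documents (a sketch on a plain named vector of action values, not the package's hash-based Q-table):

```r
set.seed(1)

# Hypothetical action values for a single state.
q_values <- c(up = 0.1, down = 0.7, left = -0.2, right = 0.3)
epsilon  <- 0.1

select_action <- function(q, epsilon) {
  if (runif(1) <= epsilon) {
    sample(names(q), 1)            # explore: pick a random action
  } else {
    names(q)[which.max(q)]         # exploit: pick the greedy action
  }
}

select_action(q_values, epsilon)   # usually "down", occasionally a random action
```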
8 changes: 4 additions & 4 deletions R/data.R
@@ -1,14 +1,14 @@
#' Game states of 100,000 randomly sampled Tic Tac Toe games.
#' Game states of 100,000 randomly sampled Tic-Tac-Toe games.
#'
#' A dataset containing 406,541 games states of Tic Tac Toe.
#' A dataset containing 406,541 game states of Tic-Tac-Toe.
#' The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.
#' All states are observed from the perspective of player X who is also assumed to have played first.
#' All states are observed from the perspective of player X, who is also assumed to have played first.
#'
#' @format A data frame with 406,541 rows and 4 variables:
#' \describe{
#' \item{State}{The current game state, i.e. the state of the 3x3 grid.}
#' \item{Action}{The move of player X in the current game state.}
#' \item{NextState}{The next observed state after action selection of player X and B.}
#' \item{NextState}{The next observed state after action selection of players X and B.}
#' \item{Reward}{Indicates terminal and non-terminal game states. Reward is +1 for 'win', 0 for 'draw', and -1 for 'loss'.}
#' }
"tictactoe"
8 changes: 4 additions & 4 deletions R/experienceReplay.R
@@ -1,8 +1,8 @@
#' Performs experience replay
#'
#' Performs experience replay. Experience replay allows reinforcement learning agents to remember and reuse experiences from the past.
#' The algorithm solely requires input data in the form of sample sequences consisting of states, actions and rewards.
#' The result of the learning process is a state-action table Q that allows to infer the best possible action in each state.
#' The algorithm requires input data in the form of sample sequences consisting of states, actions and rewards.
#' The result of the learning process is a state-action table Q that allows one to infer the best possible action in each state.
#'
#' @param D A \code{dataframe} containing the input data for reinforcement learning.
#' Each row represents a state transition tuple \code{(s,a,r,s_new)}.
@@ -11,8 +11,8 @@
#' @param ... Additional parameters passed to function.
#' @return Returns an object of class \code{hash} that contains the learned Q-table.
#' @seealso \code{\link{ReinforcementLearning}}
#' @references Lin (1992) Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning.
#' @references Watkins (1992). Q-learning. Machine Learning.
#' @references Lin (1992). "Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching", Machine Learning (8:3), pp. 293--321.
#' @references Watkins (1992). "Q-learning". Machine Learning (8:3), pp. 279--292.
#' @import hash
#' @export
experienceReplay <- function(D, Q, control, ...) {
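
The references added above point to Watkins' Q-learning rule, which experience replay applies repeatedly to the stored tuples. A standalone sketch of that update on a plain matrix (the package's own implementation uses a hash-based Q-table and its own control structure):

```r
# One Q-learning step: move Q(s,a) toward the bootstrapped target.
q_update <- function(Q, s, a, r, s_new, alpha = 0.1, gamma = 0.9) {
  target <- r + gamma * max(Q[s_new, ])            # reward plus discounted best next value
  Q[s, a] <- Q[s, a] + alpha * (target - Q[s, a])  # small step toward the target
  Q
}

# Replaying a batch of remembered tuples means applying the update to each row.
Q <- matrix(0, nrow = 2, ncol = 2,
            dimnames = list(c("s1", "s2"), c("left", "right")))
D <- data.frame(State = c("s1", "s2"), Action = c("right", "left"),
                Reward = c(-1, 1), NextState = c("s2", "s1"),
                stringsAsFactors = FALSE)

for (i in seq_len(nrow(D))) {
  Q <- q_update(Q, D$State[i], D$Action[i], D$Reward[i], D$NextState[i])
}
Q
```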
8 changes: 4 additions & 4 deletions R/gridworld.R
@@ -1,10 +1,10 @@
#' Defines an environment for a gridworld example
#'
#' Function defines an environment for a 2x2 gridworld example. Here, an agent is intended to
#' navigate from an arbitrary start position to a goal position. The grid is surrounded by a wall
#' Function defines an environment for a 2x2 gridworld example. Here an agent is intended to
#' navigate from an arbitrary starting position to a goal position. The grid is surrounded by a wall,
#' which makes it impossible for the agent to move off the grid. In addition, the agent faces a wall between s1 and s4.
#' If the agent reaches the goal position, it earns a reward of 10. Crossing each square of the grid leads
#' to a negative reward of -1.
#' If the agent reaches the goal position, it earns a reward of 10. Crossing each square of the grid results in
#' a negative reward of -1.
#'
#' @param state The current state.
#' @param action Action to be executed.
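
A possible interaction with the environment function described above (state and action labels are taken from the 2x2 gridworld documentation; the exact return values depend on the grid layout):

```r
library(ReinforcementLearning)

env <- gridworldEnvironment

# One step: the environment returns the next state and the reward.
env(state = "s1", action = "down")
# e.g. $NextState "s2", $Reward -1 (crossing a square costs -1)
```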
2 changes: 1 addition & 1 deletion R/learningRule.R
@@ -1,7 +1,7 @@
#' Loads reinforcement learning algorithm
#'
#' Decides upon a learning rule for reinforcement learning.
#' Input is a name for the learning rule, output is the corresponding function object.
#' Input is a name for the learning rule, while output is the corresponding function object.
#'
#' @param type A string denoting the learning rule. Allowed values are \code{experienceReplay}.
#' @seealso \code{\link{ReinforcementLearning}}
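
A one-line illustration of the lookup described above, assuming the function is exported under the name given by its man page (man/lookupLearningRule.Rd):

```r
library(ReinforcementLearning)

f <- lookupLearningRule("experienceReplay")  # name -> function object
identical(f, experienceReplay)               # TRUE if the lookup resolves as documented
```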
6 changes: 3 additions & 3 deletions R/policy.R
@@ -1,11 +1,11 @@
#' Calculates the reinforcement learning policy
#'
#' Calculates reinforcement learning policy from a given state-action table Q.
#' The policy is the decision making function of the agent and defines the learning
#' agent's way of behaving at a given time.
#' The policy is the decision-making function of the agent and defines the learning
#' agent's behavior at a given time.
#'
#' @param x Variable which encodes the behavior of the agent. This can be
#' either a \code{matrix}, \code{data.frame} or a \code{\link{rl}} object.
#' either a \code{matrix}, \code{data.frame} or an \code{\link{rl}} object.
#' @seealso \code{\link{ReinforcementLearning}}
#' @return Returns the learned policy.
#' @rdname policy
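
Because the documentation above says `policy()` also accepts a plain state-action table, a small sketch with a hypothetical Q matrix (rows assumed to index states, columns actions):

```r
library(ReinforcementLearning)

# Hypothetical Q-values: two states, two actions.
Q <- matrix(c(-0.2, 0.8,
               0.5, 0.1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("s1", "s2"), c("left", "right")))

policy(Q)  # expected: "right" for s1, "left" for s2
```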
6 changes: 3 additions & 3 deletions README.Rmd
@@ -21,16 +21,16 @@ set.seed(0)
[![Build Status](https://travis-ci.org/nproellochs/ReinforcementLearning.svg?branch=master)](https://travis-ci.org/nproellochs/ReinforcementLearning)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/ReinforcementLEarning)](https://cran.r-project.org/package=ReinforcementLearning)

**ReinforcementLearning** performs model-free reinforcement learning in R. This implementation allows to learn
**ReinforcementLearning** performs model-free reinforcement learning in R. This implementation enables the learning of
an optimal policy based on sample sequences consisting of states, actions and rewards. In
addition, it supplies multiple predefined reinforcement learning algorithms, such as experience
replay.

## Overview

The most important functions in **ReinforcementLearning** are:
The most important functions of **ReinforcementLearning** are:

- Learning an optimal policy from a fixed set of a priori-known transition samples
- Learning an optimal policy from a fixed set of a priori known transition samples
- Predefined learning rules and action selection modes
- A highly customizable framework for model-free reinforcement learning tasks

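
The README bullets above describe the intended workflow: generate transition samples from a predefined environment, then learn a policy from that fixed set. A sketch of that workflow, assuming the sampling helper and parameter names documented in the package:

```r
library(ReinforcementLearning)

env     <- gridworldEnvironment
states  <- c("s1", "s2", "s3", "s4")
actions <- c("up", "down", "left", "right")

# 1. Collect a fixed set of a priori known transition samples.
data <- sampleExperience(N = 1000, env = env, states = states, actions = actions)

# 2. Learn an optimal policy from the collected samples
#    (experience replay is the predefined learning rule).
model <- ReinforcementLearning(data, s = "State", a = "Action", r = "Reward",
                               s_new = "NextState",
                               control = list(alpha = 0.1, gamma = 0.5, epsilon = 0.1))

policy(model)  # recommended action in each state
```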
6 changes: 3 additions & 3 deletions README.md
@@ -5,14 +5,14 @@ Reinforcement Learning

[![Build Status](https://travis-ci.org/nproellochs/ReinforcementLearning.svg?branch=master)](https://travis-ci.org/nproellochs/ReinforcementLearning) [![CRAN\_Status\_Badge](http://www.r-pkg.org/badges/version/ReinforcementLEarning)](https://cran.r-project.org/package=ReinforcementLearning)

**ReinforcementLearning** performs model-free reinforcement learning in R. This implementation allows to learn an optimal policy based on sample sequences consisting of states, actions and rewards. In addition, it supplies multiple predefined reinforcement learning algorithms, such as experience replay.
**ReinforcementLearning** performs model-free reinforcement learning in R. This implementation enables the learning of an optimal policy based on sample sequences consisting of states, actions and rewards. In addition, it supplies multiple predefined reinforcement learning algorithms, such as experience replay.

Overview
--------

The most important functions in **ReinforcementLearning** are:
The most important functions of **ReinforcementLearning** are:

- Learning an optimal policy from a fixed set of a priori-known transition samples
- Learning an optimal policy from a fixed set of a priori known transition samples
- Predefined learning rules and action selection modes
- A highly customizable framework for model-free reinforcement learning tasks

Binary file modified ReinforcementLearning.pdf
Binary file not shown.
3 changes: 1 addition & 2 deletions man/epsilonGreedyActionSelection.Rd

Some generated files are not rendered by default.

8 changes: 4 additions & 4 deletions man/experienceReplay.Rd


8 changes: 4 additions & 4 deletions man/gridworldEnvironment.Rd


2 changes: 1 addition & 1 deletion man/lookupLearningRule.Rd


6 changes: 3 additions & 3 deletions man/policy.Rd


8 changes: 4 additions & 4 deletions man/tictactoe.Rd

