Repository for the final project of the Planning Algorithms course at My University.
We want to build a reinforcement learning algorithm to trade stocks with maximum efficiency. To do this, we use:
- Rapidly exploring random tree (RRT)
- Value Iteration (VI)
- Deep Q-learning
- Temporal difference
- Model-predictive control
Repository structure:
- data: folder that holds the data we train and test on
- utils: contains some useful functions, e.g. to prepare data for the models
- models: folder with a subfolder dedicated to each algorithm we use
- pipeline.ipynb: the main file, which runs the whole pipeline
Todo, see collaborators :D
- Our actions are: sell, buy, do nothing
- Our environment is (vector of prices, vector of pct_changes, current day and day of week)
- Observation is (last 10 prices, last 10 pct_changes (maybe 9?), current day of week, number of stocks on the agent's balance, money the agent has)
- Reward function: total portfolio value minus starting money, i.e. (n_stocks * stock_price + curr_money) - start_money
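
The setup above can be sketched as a tiny environment. This is only an illustration, not the code in this repo: the class name `TradingEnv`, the integer action codes, trading one share per action, and taking day of week as `day % 5` are all assumptions made for the example. It uses 10 pct_changes per observation by including the change into the oldest price of the window, which is one possible answer to the "maybe 9?" question.

```python
from typing import List, Tuple

# Hypothetical integer encoding of the three actions.
SELL, BUY, HOLD = 0, 1, 2
WINDOW = 10  # number of past prices / pct_changes in an observation


def pct_changes(prices: List[float]) -> List[float]:
    """Fractional change between consecutive prices; the first entry is 0.0."""
    return [0.0] + [(b - a) / a for a, b in zip(prices, prices[1:])]


class TradingEnv:
    """Minimal sketch of the environment described above (assumptions noted)."""

    def __init__(self, prices: List[float], start_money: float) -> None:
        self.prices = prices
        self.changes = pct_changes(prices)
        self.start_money = start_money
        self.curr_money = start_money
        self.n_stocks = 0
        self.day = WINDOW - 1  # first day that has a full window behind it

    def observe(self) -> Tuple:
        lo = self.day - WINDOW + 1
        return (
            tuple(self.prices[lo:self.day + 1]),
            tuple(self.changes[lo:self.day + 1]),
            self.day % 5,  # assumption: 5 trading days per week
            self.n_stocks,
            self.curr_money,
        )

    def reward(self) -> float:
        # (n_stocks * stock_price + curr_money) - start_money
        price = self.prices[self.day]
        return self.n_stocks * price + self.curr_money - self.start_money

    def step(self, action: int) -> Tuple[Tuple, float, bool]:
        price = self.prices[self.day]
        # Assumption: each buy or sell trades exactly one share.
        if action == BUY and self.curr_money >= price:
            self.n_stocks += 1
            self.curr_money -= price
        elif action == SELL and self.n_stocks > 0:
            self.n_stocks -= 1
            self.curr_money += price
        self.day += 1
        done = self.day >= len(self.prices) - 1  # stop on the last day
        return self.observe(), self.reward(), done


if __name__ == "__main__":
    env = TradingEnv([10.0] * 10 + [12.0, 11.0], start_money=100.0)
    obs, r, done = env.step(BUY)  # buy one share at 10.0, price moves to 12.0
    print(r, done)
```

The reward here is computed at the price of the day after the action, so buying before a price increase is rewarded immediately. Whether the reward should be the running total or the per-step difference is a design choice this sketch does not settle.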