---
output: pdf_document
geometry: margin = 1in
---

## 5. Conclusion

This bachelor thesis covers three areas. First, the architecture of a neural network is investigated; then the model is analyzed using an XAI approach. In the last step, the findings of this thesis and general time series analysis are combined to form a trading strategy. The methodology follows the guiding questions defined in chapter [1](#introduction):

1) How does the selection of the network architecture influence the quality of predictions? What influence can be found concerning the Sharpe ratio of a simple trading strategy?

Pursuing the first guiding question, all possible network architectures between one layer with one neuron and three layers with ten neurons each are compared quantitatively. The plots of the mean squared error (MSE) in chapter [3.2.4](#evaluate_nn) show that a more complex model is more prone to overfitting. The Sharpe ratio is used to assess trading performance. Here, the inverse relationship between in-sample and out-of-sample performance likewise points to overfitting. In conclusion, no rule of thumb can be found that works well without testing and comparing. The model should be complex enough to be accurate, but not so complex that it overfits. Ultimately, it seems sensible to choose a number of neurons in the range of the number of inputs.
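
The size of this comparison can be illustrated with a minimal sketch. It is not the code used in this thesis: the names `architectures` and `sharpe_ratio` are hypothetical, and the Sharpe ratio assumes a zero risk-free rate and 365 trading days per year, which is an assumption made here for illustration.

```r
# Hypothetical sketch: enumerate every architecture from (1) up to (10, 10, 10).
max_layers  <- 3
max_neurons <- 10

architectures <- unlist(
  lapply(seq_len(max_layers), function(n_layers) {
    # all combinations of 1..max_neurons neurons over n_layers hidden layers
    m <- as.matrix(expand.grid(rep(list(seq_len(max_neurons)), n_layers)))
    # one vector of hidden-layer sizes per row
    lapply(seq_len(nrow(m)), function(i) unname(m[i, ]))
  }),
  recursive = FALSE
)

n_architectures <- length(architectures)  # 10 + 10^2 + 10^3 candidates

# Annualized Sharpe ratio of daily log returns
# (assumption: zero risk-free rate, 365 trading days per year).
sharpe_ratio <- function(log_returns, periods_per_year = 365) {
  sqrt(periods_per_year) * mean(log_returns) / sd(log_returns)
}
```

Each element of `architectures` could be passed as the `hidden` argument of a network fit, and `sharpe_ratio()` applied separately to the in-sample and out-of-sample returns of the resulting strategy.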

2) What is the added value of enhancing this neural network with aspects of explainable artificial intelligence (XAI)? 

Of the methods presented, the linear parameter data approach (LPD) turns out to be useful for financial time series. The main reasons are that the data remain chronologically ordered and that the dependency structure stays intact. The derivatives for the intercept and the weights reveal important points. On the one hand, there is an analogy to linear regression, which suggests that some explanatory variables are more important than others at a given point in time. On the other hand, analogous to the autocorrelation of the Bitcoin log returns, LPD shows which lags are important and should be fed into the neural network. LPD cannot, however, explain the operation of the network, the role of weights and error terms, or the development of the backpropagation algorithm. In summary, the derivatives do a good job of revealing when the neural network is unstable and inaccurate outputs are therefore to be expected.
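
The regression analogy can be made concrete with a small sketch. It assumes that LPD amounts to the partial derivatives of the network output with respect to each input, evaluated at every point in time, plus the intercept of the resulting local linear replication. The function `toy_net` is a hypothetical stand-in for a fitted network, and the derivatives are approximated numerically rather than computed as in the thesis.

```r
# Hypothetical stand-in for a fitted network: maps an input vector
# (e.g. lagged log returns) to a scalar prediction.
toy_net <- function(x) tanh(0.8 * x[1] - 0.3 * x[2]) + 0.1

# Central finite differences of net() with respect to each input at point x.
lpd_at <- function(net, x, h = 1e-5) {
  weights <- vapply(seq_along(x), function(j) {
    step <- replace(numeric(length(x)), j, h)
    (net(x + step) - net(x - step)) / (2 * h)
  }, numeric(1))
  # Intercept of the local linear replication:
  # net(x) = intercept + sum(weights * x)
  intercept <- net(x) - sum(weights * x)
  c(intercept = intercept, setNames(weights, paste0("lag", seq_along(x))))
}

# One row of inputs per day (made-up values); the result holds one row of
# local "regression coefficients" per day.
X <- rbind(c(0.01, -0.02), c(0.03, 0.00), c(-0.05, 0.04))
lpd <- t(apply(X, 1, lpd_at, net = toy_net))
```

Plotting the columns of `lpd` over time would show where the local coefficients jump, which is the kind of instability the paragraph above refers to.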

3) How can the acquired information about neural networks, XAI, and general time series analysis be used to define an efficient trading strategy?

The selected neural network predicts the log return of the next day. Referring to the central limit theorem, 1,000 neural networks are optimized per prediction. A majority decision, which depends on a parameter $\beta$, then derives the final signal from these networks. At the same time, LPD shows in which time periods the model is unstable, so that the output should be treated with caution. Extreme events of the individual derivatives in the LPD plot are combined to generate an additional trading signal. In addition, a GARCH model estimates future volatility; if the estimates are high, the market is exited as a precaution. Combining this information, it is noticeable that the buy-and-hold benchmark beats all methods over the period tested. However, adding the signals from the LPD leads to better returns than the same model without LPD. Therefore, for the tested period, the addition of XAI seems to offer real added value.
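
A majority decision of this kind might look as follows. The exact rule used in the thesis is not restated here, so the thresholding below is an assumption: the signal is long if more than a share $\beta$ of the networks predict a positive log return, short if more than $\beta$ predict a negative one, and neutral otherwise. The name `majority_signal` is hypothetical.

```r
# Hypothetical majority rule over the predictions of many networks for one day.
# Returns 1 (long), -1 (short) or 0 (no clear majority).
majority_signal <- function(predictions, beta = 0.5) {
  share_up   <- mean(predictions > 0)
  share_down <- mean(predictions < 0)
  if (share_up > beta) {
    1
  } else if (share_down > beta) {
    -1
  } else {
    0
  }
}

# Made-up predictions: one row per day, one column per network.
preds <- rbind(
  c( 0.01,  0.02, -0.01,  0.03,  0.01),  # 4 of 5 positive
  c(-0.02, -0.01, -0.03,  0.01, -0.01),  # 4 of 5 negative
  c( 0.01, -0.01,  0.02, -0.02,  0.01)   # 3 of 5 positive
)
signals <- apply(preds, 1, majority_signal, beta = 0.6)
```

With `beta = 0.6`, the third day yields no signal because neither direction exceeds the threshold. A larger $\beta$ therefore trades less often but only on clearer majorities.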

Investing is about keeping money in the market. The thought process is therefore continued by making an alternative investment in Ether during phases in which the previous methods recommend exiting the market. Surprisingly, the strategy that combines the signals from the neural network and LPD and adds Ether on top outperforms buy-and-hold for the chosen period.


\newpage

### 5.1. Outlook

There are many different ways to implement a trading strategy. Due to the time limitation, this thesis only investigates a small part of the applications of feedforward networks in trading. Recent events in the cryptocurrency market, which occurred at the end of the writing phase of this thesis, made us curious how the model would perform during the recent drawdown. Therefore, we import the latest BTC and ETH prices in order to evaluate whether the strategy would have worked in recent times. The model handles the drawdown phase properly, as seen in figure \ref{fig:with_eth_sole_outlook}. The massive loss at the beginning of May 2021 is amplified. However, shortly after the trend reverses, the strategy makes up for the 'mistakes' and reaches a level better than buy-and-hold.

```{r with_eth_sole_outlook, fig.align='center', out.width='80%', fig.cap="Data collected at the end of the writing phase of this thesis",echo=FALSE, fig.width = 8, fig.height = 5, fig.keep='last'}

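# xts supplies merge(), plot(), addLegend() and addEventLines() for xts objects;
# latex2exp supplies TeX() for the math in the title and legend.
suppressPackageStartupMessages({
  library(xts)
  library(latex2exp)
})

# Restore previously saved results; this chunk expects the objects
# `data`, `nn_lpd` and `concl` to be available afterwards.
load("data/xai/7_7_withsignal_xai_in/performance_with_eth.rda")
load("data/xai/7_7_withsignal_xai_in/nn_lpd_without_eth.rda")
load("data/xai/7_7_withsignal_xai_in/nn_lpd_without_eth_outlook.rda")

# Trading signal on the BTC time index (only needed for the optional plot below).
signal <- data$BTC.USD.Close
signal$BTC.USD.Close <- nn_lpd

# Keep the two performance series shown in the figure (columns 1 and 5).
data <- data[, c(1, 5)]

# Cumulate the returns after 2021-03-27 and shift them by hard-coded offsets
# (assumed to be the levels reached at the end of the original period).
concl <- cumsum(concl["2021-03-27::"])
concl[, 1] <- concl[, 1] + 1.792302
concl[, 2] <- concl[, 2] + 2.45390

data <- merge(data, concl)

colors <- c("green", "darkorchid")
plot(data, col = colors,
     main = TeX(sprintf("Performance cumulated from 9 splits, $\\lambda = %d$", 1)))

addLegend("left",
          legend.names = c(TeX("Buy-and-hold"),
                           TeX(sprintf("LPD+NN+ETH if 0 $\\kappa = %.1f$", 0.2))),
          col = colors,
          lty = c(1, 1),
          lwd = c(2, 2),
          ncol = 1,
          bg = "white")

# Mark the end of the original evaluation period.
events <- xts("end", as.Date("2021-03-27"))

addEventLines(events, srt = 0, pos = 1, lty = 3, col = "blue", lwd = 2, cex = 1.2)
# Optional: show the trading signal in a separate panel.
# lines(signal, on = NA, lwd = 2, col = "red", ylim = c(-1.3, 1.3))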

```

Besides feedforward networks (FFN), there are other network types such as recurrent neural networks (RNN), gated recurrent units (GRU) and long short-term memory (LSTM) networks. It is reasonable to ask whether the choice of network type makes a difference for our case. This is exactly what we investigate in another econometrics project [@oeko3project]. In that work, we conclude that choosing a recurrent model does not add much value, especially because lagged log returns are used as features and our FFN therefore also has recurrent properties.

The idea could be expanded by testing other methods of traditional time series analysis in addition to GARCH. Specifically, we think of different ARMA or ARMA-APARCH models that could be combined with feedforward networks. In this thesis, only the log returns were used as features. It would also be conceivable to extend these features with smoothed moving average (MA) log returns. Speaking of additional inputs, it might make sense to include elements and indicators of technical chart analysis as well. For example, the Relative Strength Index (RSI) or the Moving Average Convergence Divergence (MACD) could help to better understand the direction and the momentum of a trend.
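
How lagged log returns and a smoothed variant could be assembled as features is sketched below. The prices are made-up values, and the number of lags and the window length are arbitrary choices for illustration, not the settings used in this thesis.

```r
# Hypothetical closing prices, for illustration only.
close_prices <- c(100, 102, 101, 105, 107, 106, 110, 108)
log_ret <- diff(log(close_prices))

# embed() puts the current value in column 1 and its lags in the next columns,
# so every row holds a target and the lagged log returns used to predict it.
n_lags <- 3
lagged <- embed(log_ret, n_lags + 1)
colnames(lagged) <- c("target", paste0("lag", seq_len(n_lags)))

# Trailing moving average of the log returns (window k) as a candidate
# additional feature; the first k - 1 values are NA.
k <- 3
ma_ret <- as.numeric(stats::filter(log_ret, rep(1 / k, k), sides = 1))
```

Like the raw log returns, the moving average would have to be lagged by at least one day before it is used as an input, so that no information from the day being predicted leaks into the features.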

Additionally, it should be mentioned that XAI is a young term and is still under research in the field of finance. While an added value of LPD was found within the scope of this bachelor thesis, further findings in the area of XAI could achieve similar or even larger effects. Another important point is that only the R package `neuralnet` was used in this work. Optimizing a neural network will probably behave differently if alternative packages like `keras` are used. Whether this difference is negligible or not could be investigated in a further step.

\newpage

Last but not least, it must be mentioned that this entire thesis is based on backtesting. Comparing against historical data helps to optimize a strategy and to get a feeling for it. However, this method does not say much about the future, as market conditions are constantly changing. Factors that played a large role in market behavior in the past may flatten out in future periods. This phenomenon can be observed in the stock markets with changing interest rates, market uncertainties and global news. The behavior of cryptocurrency markets has been researched sporadically, but these studies are based on data that is often less than 10 years old. Cryptocurrencies are banned in a few countries, have not yet been fully recognized as an asset class, and are not regulated by an exchange regulator. In order to make a clear and realistic statement about our trading strategies, it would be necessary to test them over a longer period of time and during different market phases.