With the explosion of data and the rapid growth of media technology in the big data era, applying machine learning to stock prediction has become a hot topic in Fintech. Traditional feature selection methods and linear models rely on statistical correlation and disregard causality, which leads to poor prediction in non-linear, evolving stock markets. Building on the Iterative Parent-Child based search of Markov Blanket (IPCMB) and the Classification and Regression Tree (CART), this paper proposes a novel IPCMB-CART model that emphasizes causal relationships to improve the prediction of stock rises and falls. The proposed model achieves the highest forecast accuracy, 58.0%, which is 7.7% above the benchmark model, and a 12-month cumulative return 35.88% higher than the CSI 300 index, demonstrating its effectiveness.
Causality-based prediction, Stock forecasting, Causal feature selection, IPCMB, CART
The dataset is too large to include here; please download it from https://pan.baidu.com/s/1B94Mmn-O8sj_11YEp5VtMg?pwd=8ecu
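
The abstract reports two kinds of figures: rise/fall forecast accuracy and a 12-month cumulative return compared with an index. The following is a minimal sketch of how such figures could be computed. It is not the paper's evaluation code. The function names, the 1 = rise / 0 = fall encoding, the compounding of period returns, and all numbers are assumptions made for illustration only.

```python
from math import prod


def directional_accuracy(predicted, actual):
    """Share of periods where the predicted direction matches the actual one.

    Assumed encoding (hypothetical): 1 = rise, 0 = fall.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(predicted)


def cumulative_return(period_returns):
    """Compounded return over consecutive periods: prod(1 + r) - 1.

    Compounding is an assumption; the source does not state how the
    cumulative return is aggregated.
    """
    return prod(1.0 + r for r in period_returns) - 1.0


def excess_cumulative_return(strategy_returns, benchmark_returns):
    """Difference between the strategy's and the benchmark's cumulative return."""
    return cumulative_return(strategy_returns) - cumulative_return(benchmark_returns)


# Toy inputs, invented for illustration; they are not taken from the dataset.
predicted = [1, 0, 1, 1]
actual = [1, 1, 1, 0]
strategy = [0.10, -0.10]
benchmark = [0.00, 0.00]

print("accuracy:", directional_accuracy(predicted, actual))
print("excess return:", excess_cumulative_return(strategy, benchmark))
```

With real data, `strategy` would hold the 12 monthly returns of the model-driven portfolio and `benchmark` the corresponding monthly returns of the CSI 300 index.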