Accompanying code for a master's thesis project. The full version of the thesis, as submitted on 4 July 2020, can be found in the file final-version.pdf.
The individual directories contain code for various parts of the project:
- ache-config contains scripts to deploy and control the ACHE crawler.
- cambridgeFeed contains files to repair and upload an SQL database. Some files from this directory could not be shared publicly.
- dataPrep contains Python scripts and Jupyter Notebooks to experiment with and prepare training data for the neural network.
- neuralNetFirst contains scripts to build, train and evaluate the neural network model in TensorFlow.
- pastebinFeed contains scripts and Jupyter Notebooks to download and save data from Pastebin via the psbdmp API.
- redditFeed contains a Jupyter Notebook to download and save data from Reddit via Pushshift.io.
- twitterFeed contains scripts to download and save data from the Twitter API. Credentials and some configuration files have not been shared publicly.