An end-to-end machine learning project in which I used tensorflow to build recurrent and convolutional neural networks, and scikit-learn to build linear regression and random forest models, for multi-step-ahead time series prediction of the mean ambient air temperature in Gale Crater on Mars.
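To illustrate what "multi-step-ahead" means here, the sketch below turns a one-dimensional series into input windows and multi-step targets. It is a minimal illustration and not the code used in the thesis: the function name `make_windows`, the window sizes, and the toy temperature values are all assumptions.

```python
import numpy as np


def make_windows(series, n_in, n_out):
    """Split a 1-D series into (inputs, targets) pairs.

    Each input row holds n_in consecutive values and each target row
    holds the n_out values that immediately follow it.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    X = np.stack([series[i : i + n_in] for i in range(n_samples)])
    y = np.stack([series[i + n_in : i + n_in + n_out] for i in range(n_samples)])
    return X, y


# Toy values standing in for mean temperatures per sol (not real REMS data).
temps = np.arange(10.0)
X, y = make_windows(temps, n_in=4, n_out=2)
print(X.shape, y.shape)
```

Arrays shaped like `X` and `y` can be passed directly to scikit-learn regressors that support multi-output targets, while recurrent and convolutional networks usually expect an extra trailing feature axis on `X`.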
Miniconda or Anaconda is recommended for installing the packages this repository depends on.
To install the environment without GPU acceleration, run the following commands:
conda install tensorflow scikit-learn statsmodels pandas numpy scipy requests beautifulsoup4
conda install -c conda-forge matplotlib keras-tuner tqdm
To install the environment with GPU acceleration, run the following commands:
conda install tensorflow-gpu scikit-learn statsmodels pandas numpy scipy requests beautifulsoup4
conda install -c conda-forge matplotlib keras-tuner tqdm
Sols 1 through 2837 were scraped with requests and beautifulsoup4 from the REDUCED_DATA archive of the NASA Curiosity rover's Rover Environmental Monitoring Station (REMS), available through the Mars Science Laboratory. Specifically, the REMS MODRDR data was used because it is the most processed product and therefore the best prepared for analysis. REMS_DESCRIPTION.txt (link) describes the instruments and operational capabilities of Curiosity's REMS. REMS_MORDR_DS.CAT (link) describes the cleaned MODRDR data. MODRDR6.FMT (link) describes the format of the MODRDR data files. Notably, the MODRDR data files have the suffix RMD (link).
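The scraping step can be sketched roughly as follows. This is a hedged illustration and not the repository's actual scraper: the function names, the example URL, the sample file names, and the assumption that each sol is served as an HTML directory listing are all hypothetical. Only the filter on RMD in the file name comes from the description above.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def fetch_listing(url, timeout=30):
    """Download one directory listing page and return its HTML."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def list_rmd_files(html, base_url):
    """Return absolute URLs of linked files whose name contains 'RMD'."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        # Keep only the final path component so directory names are ignored.
        name = href.rstrip("/").rsplit("/", 1)[-1]
        if "RMD" in name:
            urls.append(urljoin(base_url, href))
    return urls


# Hypothetical listing; real archive pages and file names may differ.
sample_html = """
<html><body>
<a href="../">Parent Directory</a>
<a href="EXAMPLE_0001RMD.TAB">EXAMPLE_0001RMD.TAB</a>
<a href="EXAMPLE_0001RMD.LBL">EXAMPLE_0001RMD.LBL</a>
<a href="EXAMPLE_0001RNV.TAB">EXAMPLE_0001RNV.TAB</a>
</body></html>
"""
links = list_rmd_files(sample_html, "https://example.org/DATA/SOL_00001/")
print(links)
```

In a real run, the HTML would come from `fetch_listing` called once per sol directory, ideally with a short pause between requests to avoid overloading the archive server.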
The docs folder contains the thesis itself, including all findings and methods.
In the future I will likely use an open source MLOps library such as neptune-client, because tracking hyperparameters using yaml proved to be a fairly annoying task.