This personal research project sought to contribute to machine learning for the healthcare and fitness industry by investigating the efficacy of neural models for consumer-grade wearable fitness tracking devices.
To learn more, please read through the full report, from which the abstract is displayed below. For even more detail, I've also included the full (quite messy) notebooks in which this project was completed: eda.ipynb, eda_xgboost.ipynb, preprocessing.ipynb, model_training_final.ipynb (for neural models), and model_training_xgboost.ipynb (for the XGBoost model).
Obesity and weight gain remain persistent public health concerns, driving strong demand for wearable devices that can accurately and conveniently monitor physical activity and energy expenditure. Recent advancements in neural networks, specifically Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) models, have shown promising abilities to extract complex features from raw data signals and yield high accuracies in energy expenditure estimation. However, these studies have either relied on additional data that modern wearable devices cannot provide, or have been limited in the types of physical activities performed during data collection. This research builds upon that work by evaluating and comparing its methods of CNN feature extraction, with and without the assistance of an LSTM, against XGBoost on a dataset that reflects the capabilities of modern wearable devices and a broader range of activities. After testing a total of 192 hyperparameter combinations to rule out model architecture issues, the best RMSE the neural models achieved was 5.08 kcal/min, corresponding to a 146% CV, which is far beyond any acceptable limit for real-world accuracy. Although the neural models fell well short of the best-performing XGBoost model, which achieved an RMSE of 1.02 kcal/min (39% CV), these findings highlight the continuing challenges of implementing these devices and algorithms on the consumer market.
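For readers unfamiliar with the two metrics quoted in the abstract, here is a minimal sketch of how RMSE and %CV can be computed. It assumes that CV means the RMSE divided by the mean of the observed energy expenditure, expressed as a percentage; the exact definition used in the report may differ. The function names and sample values are illustrative only and are not taken from the project notebooks.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def cv_percent(y_true, y_pred):
    """RMSE as a percentage of the mean observed value.

    Assumption: this is the CV definition behind the figures in the
    abstract; check the full report for the definition actually used.
    """
    mean_observed = sum(y_true) / len(y_true)
    return 100.0 * rmse(y_true, y_pred) / mean_observed


# Illustrative values in kcal/min (made up, not project data).
observed = [2.0, 4.0, 6.0]
predicted = [3.0, 4.0, 5.0]
print(f"RMSE: {rmse(observed, predicted):.2f} kcal/min")
print(f"CV:   {cv_percent(observed, predicted):.1f}%")
```

Under this definition, a CV above 100% means the typical prediction error is larger than the average energy expenditure being estimated.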