Uncertainty estimation algorithms for time series data in retail forecasting, featuring models including LightGBM, Bootstrapping, NGBoost, and more. TUM Data Innovation Lab project in cooperation with Lidl Stiftung & Co. KG.

sandordaroczi/uncertainty-estimation

Uncertainty Estimation in Regression Problems

Welcome to the repository for our collaborative project, which implements a range of uncertainty estimation models for a retail use case. The models are LightGBM, LightGBM Quantile Regressor, Bootstrapping, NGBoost, Probabilistic Gradient Boosting Machines (PGBM), Level Set Forecaster (LSF), Conformalized Regression, Temporal Fusion Transformer (TFT), and MQ-CNN.

Notebooks and Kaggle Datasets

The notebooks folder contains the Python notebooks used to train and evaluate the models on Kaggle datasets. The datasets used for evaluation are:

  1. Blue Book For Bulldozers (bulldozer)
  2. Rossmann Store Sales (rossmann)
  3. Corporación Favorita Grocery Sales Forecasting (favorita)

Getting Started

1. Installation

Ensure you have poetry installed on your local machine. After cloning the project (see "Build and Test"), create a virtual environment and install the packages listed in the pyproject.toml file by running the following from the top level of the repo:

poetry update
poetry install
poetry build
pip install dist/uncertainty_estimation_models-0.1.0-py3-none-any.whl
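
To confirm that the wheel ended up in the active environment, a quick check along the following lines may help. This is a minimal sketch, not part of the project: it assumes the installed distribution name matches the wheel filename above (uncertainty_estimation_models), and the helper name installed_version is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the distribution name is taken from the wheel filename.
version = installed_version("uncertainty_estimation_models")
if version is None:
    print("Package not found in this environment; re-run the install steps.")
else:
    print(f"Installed version: {version}")
```

If the package is reported as missing, check that pip and poetry are pointing at the same virtual environment.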

Make sure to separately install the various dependencies listed below.

Dependencies

Windows users who run into problems installing the PGBM package should install Build Tools for Visual Studio and add the compiler cl to the PATH environment variable (see here). To verify that Windows can find cl, run where cl in a Windows command-line terminal.
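
The same PATH check can be made from Python, which is convenient inside a notebook. The snippet below is a minimal sketch using only the standard library; the helper name find_tool is hypothetical and not part of this project.

```python
import shutil
from typing import Optional


def find_tool(name: str = "cl") -> Optional[str]:
    """Return the full path of an executable found on PATH, or None if missing."""
    return shutil.which(name)


compiler_path = find_tool("cl")
if compiler_path is None:
    print("cl was not found on PATH; check the Build Tools installation.")
else:
    print(f"cl found at: {compiler_path}")
```

On non-Windows machines cl is normally absent, so a "not found" message there is expected.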

2. Build and Test

Navigate to the desired directory for the git repo and run:

git clone https://github.com/daroczisandor/uncertainty-estimation.git
cd uncertainty-estimation

Obtain the required datasets from Google Drive and store them in a folder called "datasets" at the top level of this repo.
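
Before running the notebooks, it can be useful to confirm that the datasets folder is where the instructions above expect it. The following is a minimal sketch that assumes it is run from the top level of the repo; the helper name find_datasets_dir is hypothetical, and nothing is assumed about the files inside the folder.

```python
from pathlib import Path
from typing import Optional, Union


def find_datasets_dir(repo_root: Union[str, Path] = ".") -> Optional[Path]:
    """Return the path to <repo_root>/datasets if it is a directory, else None."""
    candidate = Path(repo_root) / "datasets"
    return candidate if candidate.is_dir() else None


datasets = find_datasets_dir(".")
if datasets is None:
    print('No "datasets" folder found at the top level of the repo.')
else:
    # List the top-level entries so missing downloads are easy to spot.
    entries = sorted(p.name for p in datasets.iterdir())
    print(f"Found {len(entries)} entries in {datasets}: {entries}")
```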

Contribute

Contributions are welcome! Feel free to create issues or submit pull requests to enhance and extend the project. 🚀
