feat: add windows support for lstm-bayesian and time series transformer, modify api calls for gemini usage.
freddysongg · Nov 25, 2024 · 029de86
1 parent 1015bc9
Showing 17 changed files with 1,535 additions and 197 deletions.
diff --git a/.env b/.env
@@ -0,0 +1 @@
+GEMINI_API_KEY = 'AIzaSyDVGTJsgu2T3vpF6ZndIu9vJqA1GMx4yBI
diff --git a/.gitignore b/.gitignore
@@ -2,6 +2,8 @@ venv/
 __pycache__/
 .ipynb_checkpoints/
 
-models/
-
 logs/
+
+data/__pycache__/
+src/__pycache__/
+
diff --git a/README.md b/README.md
@@ -1,51 +1,85 @@
 # CaféCast ☕📊
 
-**CaféCast** is an AI-powered sales forecasting and product analysis application. This project focuses on leveraging advanced machine learning models and data visualization techniques to explore various datasets, fine-tune hyperparameters, and analyze complex temporal patterns. While not optimized for a production-grade frontend, the application demonstrates expertise in applying cutting-edge algorithms and interpretability techniques to time series forecasting and recommendation tasks.
+**CaféCast** is an AI-powered sales forecasting and product analysis application. This project focuses on testing and enhancing my knowledge of machine learning models, hyperparameter fine-tuning, and model interpretability. It serves as a learning platform to explore advanced techniques, refine my skills, and deepen my understanding of how to apply machine learning to complex real-world problems. The main goal is to practice and improve while exploring the capabilities of various machine learning models and methodologies.
 
 ---
 
 ## Features 🚀
 
-- **Sales Forecasting:**
-  - Advanced LSTM-based forecasting with Bayesian optimization and iterative tuning approaches.
-  - State-of-the-art Time Series Transformers for handling complex temporal dependencies.
-  - Traditional ARIMA modeling for interpretable short-term forecasts.
-- **Data Visualization:** Insights into sales trends, seasonality, and product demand patterns.
-- **Explainable AI:** Integration of SHAP values and attention mechanisms for model interpretability.
-- **Customization-Ready:** Easily adaptable for analyzing and forecasting data across various café datasets.
+- **Cross-Platform Model Execution:**
+  - Supports **TensorFlow (CPU)** for macOS users to ensure compatibility and efficient local execution.
+  - Utilizes **PyTorch (CUDA)** for Windows users with GPU support for accelerated training and predictions.
+  - Automatic detection of the operating system to ensure the correct model implementation is selected.
+- **Broad Forecasting Capabilities:**
+  - Handles **multiple concurrent predictions**, enabling comprehensive sales and demand insights.
+  - Adaptable to a wide range of café datasets and forecasting requirements.
+- **Sales Forecasting Models:**
+  - Advanced LSTM-based models with Bayesian optimization and iterative tuning for precise forecasts.
+  - Time Series Transformers for capturing complex temporal dependencies.
+  - Classical ARIMA modeling for interpretable, short-term linear trend forecasts.
+- **Enhanced Model Flexibility:**
+  - Configurable hyperparameters to suit the unique characteristics of each dataset.
+  - Dynamic support for both platform-optimized and manually fine-tuned workflows.
+- **Explainable AI:** SHAP values and attention mechanisms provide transparency into model decisions.
+- **Data Visualization:** Visual insights into sales trends, seasonality, and demand patterns.
+
+---
+
+## Purpose and Motivation 🎯
+
+The primary objective of **CaféCast** is to test my knowledge, challenge myself, and practice advanced machine learning techniques. Through this project, I aim to:
+
+- **Deepen Understanding:** Dive into various machine learning models, including LSTM, Transformers, and ARIMA.
+- **Refine Skills:** Gain hands-on experience in hyperparameter fine-tuning using Bayesian optimization and manual iterative tuning.
+- **Explore Interpretability:** Learn and apply techniques like SHAP values and attention mechanisms to make models more transparent.
+- **Emphasize Learning:** Approach this as a learning process, with the goal of improving my practical skills in applying machine learning models to real-world forecasting tasks.
+
+This project is a testament to continuous learning and experimentation in the field of AI and machine learning.
 
 ---
 
 ## Models and Methodologies 📘
 
 ### 1. **LSTM with Bayesian Optimization**
-- Utilizes Bayesian optimization to automatically tune critical hyperparameters such as:
+- Automatically tunes hyperparameters, including:
   - Learning rate
-  - Number of layers
-  - Neurons per layer
+  - Number of layers and neurons per layer
   - Dropout rates
-- This approach balances exploration and exploitation, ensuring optimal configurations with reduced computational overhead.
+- Strikes a balance between exploration and exploitation to achieve efficient and optimal configurations.
 
 ### 2. **Iterative LSTM Tuning**
-- Applies a manual, systematic method to refine model performance through:
+- A hands-on approach for refining models through:
   - Adjustments to sequence length and batch size
   - Monitoring validation error trends
-  - Incremental fine-tuning of parameters based on observed performance
-- Ensures the LSTM models are tailored to the unique characteristics of each dataset.
+  - Incremental parameter tuning based on observed performance
 
 ### 3. **Time Series Transformers**
-- Implements Transformer-based models for time series forecasting, leveraging self-attention mechanisms to:
-  - Capture long-term temporal dependencies effectively
-  - Model complex seasonality and trends in data
-  - Provide highly interpretable attention weights for feature importance
+- Uses self-attention mechanisms to:
+  - Capture long-term dependencies and seasonality
+  - Model complex temporal patterns in sales data
+- Supports multi-target predictions for key metrics like sales and revenue.
 
 ### 4. **ARIMA Model**
-- Integrates an ARIMA (AutoRegressive Integrated Moving Average) model for classical time series analysis.
-- Features:
-  - Strong interpretability for short-term linear trends and seasonality
-  - Complements deep learning models for robust hybrid forecasting strategies
+- A classical time series model for capturing linear trends and seasonality.
+- Complements deep learning models for hybrid forecasting strategies.
 
-By combining these methodologies, CaféCast excels in flexibility and precision, adapting seamlessly to varying datasets.
+---
+
+## Recent Enhancements 🌟
+
+1. **Platform-Specific Execution:**
+   - macOS: TensorFlow-based models optimized for CPU execution.
+   - Windows: PyTorch-based models leveraging CUDA for GPU acceleration.
+   - Automatic logging to indicate which implementation is running.
+   - Note: this is all for training of the model, mainly just convenience for me :D
+
+2. **Broader Prediction Capabilities:**
+   - Added support for handling multiple predictions across various datasets.
+   - Enhanced flexibility to adapt to different temporal forecasting requirements.
+
+3. **Improved Model Interpretability:**
+   - SHAP values for LSTM and ARIMA models to explain predictions.
+   - Attention weights in Transformer models to provide insights into feature importance.
 
 ---
 
@@ -55,13 +89,14 @@ By combining these methodologies, CaféCast excels in flexibility and precision,
 - **Core Libraries:**
   - NumPy, Pandas for data manipulation
   - Matplotlib for visualization
-  - TensorFlow & PyTorch for deep learning
+  - TensorFlow (CPU) for macOS
+  - PyTorch (CUDA) for Windows
   - scikit-learn and statsmodels for preprocessing and ARIMA modeling
 - **Optimization Techniques:**
   - Bayesian Optimization for automated hyperparameter tuning
-  - Iterative tuning for manual refinement of LSTM models
+  - Iterative tuning for manual refinement of models
 - **Model Interpretability:** SHAP, Attention Mechanisms
-- **Environment:** Designed for local execution on M1 Pro MacBook Pro or similar hardware
+- **Environment:** Tested on macOS and Windows with hardware-optimized configurations
 
 ---
 
@@ -77,17 +112,21 @@ By combining these methodologies, CaféCast excels in flexibility and precision,
     source venv/bin/activate  # On Windows: venv\Scripts\activate
 3. Install dependencies:
     ```bash
-    pip install -r requirements.txt
+    pip install -r requirements.txt  # On Windows: also run pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 for Torch-CUDA compatibility
 
 ---
 
 ## Usage 💡
-1. **Load Dataset:** Start by uploading your time series dataset in CSV format.
-2. **Choose a Model:** Options include:
-    - LSTM with Bayesian Optimization for automated fine-tuning and precision forecasts.
-    - Iterative LSTM for a hands-on, customized modeling experience.
-    - Time Series Transformers for advanced temporal analysis with attention mechanisms.
-    - ARIMA for interpretable, classical time series modeling.
+1. **Run the Application:** 
+    ```bash
+    python src/main.py
+2. **Menu Options:**
+    1: Run LSTM Model
+    2: Run Time Series Transformer Model
+    3: Run Bayesian LSTM Optimization
+    4: Clear LSTM Model Parameters
+    5: Clear Transformer Model Parameters
+    6: Run ARIMA Model
 3. **Run Forecasts:** Generate detailed predictions for various time frames and visualize results.
 4. **Analyze Results:** Use visualizations and SHAP-based interpretability tools to gain insights into trends and model behavior.
 
@@ -99,8 +138,8 @@ cafecast/
 ├── data/               # Sample datasets and preprocessing scripts
 ├── logs/               # Logging files for training and debugging
 ├── models/             # Model definitions and training scripts
-├── notebooks/          # Jupyter notebooks for exploration and prototyping
 ├── params/             # Hyperparameter files and configurations
+├── src/                # Main application code and model implementations
 ├── venv/               # Virtual environment for dependency management
 ├── requirements.txt    # Project dependencies
 └── README.md           # Project documentation
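
Note on the README's "automatic detection of the operating system": the detection code itself is not part of this diff, so the following is only a minimal sketch of how such a switch might look. The function name `select_backend` and the returned labels are hypothetical, and the behavior on other systems (raising an error) is an assumption, since the README only mentions macOS and Windows.

```python
import platform


def select_backend(system_name=None):
    """Return a label for the framework the README associates with an OS.

    `system_name` is optional so the mapping can be checked without
    running on each platform; by default it comes from platform.system().
    """
    # platform.system() returns "Darwin" on macOS and "Windows" on Windows.
    name = system_name or platform.system()
    if name == "Darwin":
        return "tensorflow-cpu"   # README: TensorFlow (CPU) for macOS
    if name == "Windows":
        return "pytorch-cuda"     # README: PyTorch (CUDA) for Windows
    # Assumption: the README does not say what happens elsewhere.
    raise RuntimeError(f"No backend configured for {name!r}")
```

Calling `select_backend()` with no argument uses the current machine; logging the returned label at startup would match the README's "automatic logging to indicate which implementation is running".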

diff --git a/data/preprocess_data.py b/data/preprocess_data.py
@@ -102,7 +102,7 @@ def process_data(file_path, output_path):
     Returns:
         pd.DataFrame: Processed DataFrame.
     """
-    data = pd.read_excel(file_path, engine='openpyxl')
+    data = pd.read_csv(file_path)
 
     data['transaction_date'] = pd.to_datetime(data['transaction_date'])
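
Note on the hunk above: it swaps `pd.read_excel` for `pd.read_csv`, so `process_data` now expects a CSV input. If both formats need to keep working, one option is to choose the reader from the file extension. This is a sketch only; `pick_reader` and `load_table` are hypothetical helpers, not functions from this repository.

```python
from pathlib import Path


def pick_reader(file_path):
    """Name the pandas reader that matches the file extension."""
    suffix = Path(file_path).suffix.lower()
    if suffix == ".csv":
        return "read_csv"
    if suffix in {".xlsx", ".xlsm"}:
        return "read_excel"
    raise ValueError(f"Unsupported file type: {suffix!r}")


def load_table(file_path):
    """Load a CSV or Excel file into a DataFrame (hypothetical helper)."""
    import pandas as pd  # imported lazily so pick_reader works without pandas

    reader = getattr(pd, pick_reader(file_path))
    # The removed line passed engine='openpyxl' to read_excel; keep that here.
    kwargs = {"engine": "openpyxl"} if reader is pd.read_excel else {}
    return reader(file_path, **kwargs)
```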
 

diff --git a/models/best_lstm_model.pth b/models/best_lstm_model.pth
diff --git a/models/best_ts_transformer_model.pth b/models/best_ts_transformer_model.pth
diff --git a/models/scaler_lstm.pkl b/models/scaler_lstm.pkl
diff --git a/models/scaler_ts_transformer.pkl b/models/scaler_ts_transformer.pkl
diff --git a/params/best_ts_transformer_params.json b/params/best_ts_transformer_params.json
@@ -1,9 +1,9 @@
 {
-    "num_layers": 2,
     "num_heads": 16,
+    "num_layers": 2,
     "d_model": 64,
-    "dim_feedforward": 192,
     "ff_dim": 128,
     "learning_rate": 0.001,
-    "dropout_rate": 0.1
+    "dropout_rate": 0.1,
+    "dim_feedforward": 192
 }
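
Note on the parameter file: multi-head attention splits `d_model` evenly across heads, so `d_model` has to be divisible by `num_heads`. With the values above that gives 64 / 16 = 4 dimensions per head. The file also carries both `ff_dim` and `dim_feedforward`; which of the two the training code reads is not visible in this diff. A small check along these lines could run when the JSON is loaded. The helper name `head_dim` is hypothetical, and the JSON is inlined here only to keep the sketch self-contained.

```python
import json

# Values copied from params/best_ts_transformer_params.json after this commit.
params = json.loads("""
{
    "num_heads": 16,
    "num_layers": 2,
    "d_model": 64,
    "ff_dim": 128,
    "learning_rate": 0.001,
    "dropout_rate": 0.1,
    "dim_feedforward": 192
}
""")


def head_dim(params):
    """Return the per-head width, or raise if the split is uneven."""
    d_model, num_heads = params["d_model"], params["num_heads"]
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    return d_model // num_heads


print(head_dim(params))  # prints 4
```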