DataInertia/core-framework

DataInertia

DataInertia is a lightweight framework for preparing datasets for machine learning workflows. It streamlines data preprocessing, cleaning, feature engineering, and reporting, so that these steps can be assembled into a reusable pipeline.

Key Features

  • Preprocessing:
    • Normalize and scale numeric features.
    • Encode categorical variables (one-hot or label encoding).
    • Handle missing values with imputation.
  • Cleaning:
    • Identify and remove duplicate rows.
    • Detect and handle outliers (IQR or Z-score methods).
  • Feature Engineering:
    • Generate polynomial features and interaction terms.
    • Scale features using Min-Max or Standard scaling.
  • Pipelines:
    • Build preprocessing pipelines.
    • Seamlessly integrate with machine learning models.
  • Reporting:
    • Generate PDF summary reports.
    • Visualize missing data with heatmaps.
    • Create diagnostics files with dataset insights.
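
The README does not document DataInertia's own API, so the following is only a minimal sketch of the techniques named above (duplicate removal, mean imputation, IQR and Z-score outlier detection, Min-Max and Standard scaling), written with the Python standard library. Every function name and default (`k=1.5`, `threshold=3.0`) is hypothetical, chosen for illustration, and is not taken from the framework.

```python
import statistics


# NOTE: all names below are illustrative assumptions, not DataInertia's API.

def drop_duplicates(rows):
    """Remove duplicate rows, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_mean(values):
    """Replace missing entries (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = statistics.fmean(known)
    return [mean if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # constant column: nothing can be an outlier
    return [v for v in values if abs((v - mean) / std) > threshold]


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Rescale values to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant column
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    rows = drop_duplicates([[1, 2], [1, 2], [3, 4]])
    column = impute_mean([1.0, None, 3.0])
    print(rows, column, min_max_scale(column))
    print(iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

Note that the Z-score method needs a reasonably large sample: with population standard deviation, no value in a column of n entries can have an absolute Z-score above sqrt(n - 1), so a threshold of 3 never fires on ten or fewer values. The IQR method has no such limit, which is one reason to offer both.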

Quick Start

  1. Install Dependencies:

    pip install -r requirements.txt
    
    
  2. Explore Examples: Run an example script to see the framework in action, or run the test suite:

    # Run a single example
    python examples/<example_file>.py

    # Or run the unit tests in tests/
    python -m unittest discover -s tests
