DataInertia

DataInertia is a lightweight framework for preparing datasets for machine learning workflows. It covers data preprocessing, cleaning, feature engineering, and reporting, and lets you combine these steps into preprocessing pipelines.

Key Features

Preprocessing:
- Normalize and scale numeric features.
- Encode categorical variables (one-hot or label encoding).
- Handle missing values with imputation.
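This README does not show DataInertia's own API, so the following is only a minimal sketch of the ideas behind imputation and categorical encoding, written with the Python standard library. The function names (`impute_mean`, `one_hot`, `label_encode`) are illustrative assumptions, not DataInertia functions.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)  # raises StatisticsError if every value is missing
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Map each category to a 0/1 indicator row; columns follow sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def label_encode(values):
    """Map each category to an integer index, assigned in sorted order."""
    index = {c: i for i, c in enumerate(sorted(set(values)))}
    return [index[v] for v in values]


ages = impute_mean([20, None, 40])
colors = ["red", "blue", "red"]
print(ages)                  # the missing age becomes the mean of 20 and 40
print(one_hot(colors))       # columns are: blue, red
print(label_encode(colors))  # blue -> 0, red -> 1
```

One-hot encoding avoids implying an order between categories, while label encoding keeps a single column at the cost of introducing an arbitrary numeric order.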
Cleaning:
- Identify and remove duplicate rows.
- Detect and handle outliers (IQR or Z-score methods).
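As a rough illustration of the two outlier rules named above, here is a standard-library sketch. It is not DataInertia's implementation; the function names, the `k=1.5` multiplier, and the default threshold of 3 are common conventions assumed here for the example.

```python
from statistics import mean, pstdev, quantiles


def drop_duplicates(rows):
    """Remove repeated rows, keeping the first occurrence and the original order."""
    return list(dict.fromkeys(rows))  # rows must be hashable, e.g. tuples


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # all values identical, so nothing can be an outlier
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


rows = [(1, "a"), (2, "b"), (1, "a")]
readings = [10, 12, 11, 13, 12, 11, 100]
print(drop_duplicates(rows))
print(iqr_outliers(readings))
# With only seven values no point can reach |z| = 3, so a lower threshold is used.
print(zscore_outliers(readings, threshold=2.0))
```

The IQR rule is less sensitive to the outliers themselves, because quartiles barely move when one extreme value is added, whereas the mean and standard deviation used by the Z-score both shift.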
Feature Engineering:
- Generate polynomial features and interaction terms.
- Scale features using Min-Max or Standard scaling.
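The sketch below shows what polynomial features, interaction terms, and the two scaling methods compute, using only the standard library. The function names are hypothetical and chosen for this example; DataInertia's actual interface may look different.

```python
from itertools import combinations_with_replacement
from math import prod
from statistics import mean, pstdev


def polynomial_features(row, degree=2):
    """Return all products of the inputs up to the given degree (no bias term)."""
    features = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(row, d):
            features.append(prod(combo))
    return features


def min_max_scale(values):
    """Rescale values linearly onto [0, 1]; assumes max > min."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Shift to zero mean and unit (population) standard deviation; assumes it is > 0."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# For inputs (a, b) = (2, 3) the degree-2 features are a, b, a*a, a*b, b*b.
print(polynomial_features([2, 3]))
print(min_max_scale([0, 5, 10]))
print(standard_scale([1, 2, 3]))
```

Here `a*b` is the interaction term, and `a*a` and `b*b` are the pure polynomial terms.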
Pipelines:
- Build preprocessing pipelines.
- Seamlessly integrate with machine learning models.
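A preprocessing pipeline can be thought of as a chain of steps where each step receives the previous step's output. The sketch below illustrates that idea in plain Python; `chain_steps`, `drop_missing`, and `rescale` are hypothetical names for this example and are not taken from DataInertia.

```python
from functools import reduce


def chain_steps(*steps):
    """Combine single-argument steps into one callable that applies them in order."""
    def run(data):
        return reduce(lambda acc, step: step(acc), steps, data)
    return run


def drop_missing(values):
    """Discard None entries."""
    return [v for v in values if v is not None]


def rescale(values):
    """Min-max scale onto [0, 1]; assumes at least two distinct values."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


prepare = chain_steps(drop_missing, rescale)
print(prepare([0, None, 5, 10]))
```

Because the chained pipeline is itself a single callable, its output can be handed directly to whatever model-fitting code comes next.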
Reporting:
- Generate PDF summary reports.
- Visualize missing data with heatmaps.
- Create diagnostics files with dataset insights.
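To make the reporting features more concrete, here is a small standard-library sketch of the data behind a missing-value heatmap and a diagnostics file. It does not produce a PDF or a plot, and the function names and the JSON layout are assumptions made for this example rather than DataInertia's actual output format.

```python
import json


def missing_mask(rows):
    """Return a 0/1 grid marking missing cells: the matrix a heatmap would draw."""
    return [[1 if cell is None else 0 for cell in row] for row in rows]


def diagnostics(columns, rows):
    """Summarize the row count and per-column missing counts as a plain dict."""
    mask = missing_mask(rows)
    missing = {name: sum(r[i] for r in mask) for i, name in enumerate(columns)}
    return {"rows": len(rows), "missing": missing}


columns = ["age", "city"]
rows = [[34, "Oslo"], [None, "Lima"], [29, None], [None, "Pune"]]
print(missing_mask(rows))
print(json.dumps(diagnostics(columns, rows), indent=2))
```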

Quick Start

Install Dependencies:
```
pip install -r requirements.txt
```

Explore Examples: Run any example script to see the framework in action:

    python examples/<example_file>.py

Run Tests: Run the test suite with unittest discovery:

    python -m unittest discover -s tests

