The project aims to create a predictive model for financial risk in Indian women-led households, introducing a new financial vulnerability index. It will utilize exploratory data analysis (EDA) for insights and visualize hotspots of financial vulnerability across India. This effort seeks to enhance financial resilience and empower these households.
https://www.isdm.org.in/isdm-code-for-change
Problem Statement :
https://drive.google.com/file/d/17tXpKGaFbKowS8ex2fb91GmhY1qwhTQR/view https://drive.google.com/file/d/1qGKPnNiICJErxzRZXwJtqzEqJCSSJVRH/view
Dataset :
https://drive.google.com/file/d/1ukaTvnp_Fm2je4gh3AwORW6fxvISRXqi/view
Solution:
Step 1 : Data Understanding
Preprocessing Steps:
1) Data Cleaning: Removed irrelevant columns and handled missing values through imputation or deletion.
2) Feature Engineering: Created new features by combining or transforming existing ones to improve model performance.
3) Normalization/Standardization: Scaled numerical features to a similar range so that no single feature dominates.
4) Encoding Categorical Variables: Converted categorical variables into numerical format using techniques like one-hot encoding or label encoding.
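The imputation, scaling, and encoding steps above can be sketched in plain Python. This is only an illustration: the column names (income, education) and the toy records are hypothetical and not taken from the project dataset, and median imputation is one assumed choice among the imputation options mentioned.

```python
from statistics import mean, median, pstdev

# Hypothetical toy records; the column names are illustrative, not from the dataset.
rows = [
    {"income": 12000.0, "education": "primary"},
    {"income": None, "education": "secondary"},
    {"income": 30000.0, "education": "none"},
    {"income": 18000.0, "education": "primary"},
]

def impute_median(values):
    """Replace missing (None) entries with the median of the observed values."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def standardize(values):
    """Rescale values to zero mean and unit variance (z-scores)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    """Return the sorted category names and a 0/1 indicator vector per value."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]

income = standardize(impute_median([r["income"] for r in rows]))
categories, education = one_hot([r["education"] for r in rows])

# One numeric feature followed by one indicator column per education category.
features = [[x] + flags for x, flags in zip(income, education)]
```

In practice a library such as scikit-learn would usually handle these transformations, fitted on the training split only so that test data does not leak into the scaling statistics.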
Step 2 : EDA
Results
Step 3 : Model Selection
Algorithm Choices: Random Forest, Logistic Regression, Naive Bayes, SVM.
Evaluation Criteria: Candidate models were compared on accuracy, precision, and recall.
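For reference, the evaluation metrics can be computed directly from true and predicted labels. The sketch below assumes a binary target where 1 marks a financially vulnerable household; the label vectors are made up for illustration. It also returns the F1-score, which the evaluation step reports alongside the other three.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up labels: 1 = vulnerable, 0 = not vulnerable.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Recall is often the metric to watch in this setting, since a missed vulnerable household (a false negative) is typically costlier than a false alarm.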
Step 4 : Model Development
Feature Scaling: Normalized numerical features so that scale-sensitive algorithms such as Logistic Regression and SVM are not dominated by large-valued features.
Model Training: Trained selected algorithms using training data.
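A minimal sketch of training the four algorithm families with scikit-learn is shown below. It runs on synthetic data from make_classification rather than the project dataset, and the specific choices (GaussianNB as the Naive Bayes variant, default SVC kernel, 100 trees, a 75/25 split) are assumptions for illustration, not the project's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the household features and the vulnerability label.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling is wrapped in a pipeline so it is fitted on the training data only.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Naive Bayes": GaussianNB(),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
```

The scores this produces on synthetic data say nothing about the project's reported results; the sketch only shows the shape of the training loop.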
Step 5 : Model Evaluation
Performance Metrics: Evaluated models on testing data using metrics like accuracy, precision, recall, and F1-score.
Cross-Validation: Applied cross-validation techniques to ensure robustness of models.
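To show what cross-validation does mechanically, here is a small k-fold sketch in plain Python. The labels and the majority-class baseline are hypothetical stand-ins for the real data and models; the number of folds (5) is also an assumption, since the write-up does not state which scheme was used.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_validate(fit_and_score, n_samples, k=5):
    """Average a scoring function over k train/test splits."""
    scores = [fit_and_score(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores), scores

# Made-up binary labels standing in for the vulnerability target.
labels = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

def majority_baseline(train_idx, test_idx):
    """Predict the most common training label and return accuracy on the test fold."""
    train_labels = [labels[i] for i in train_idx]
    guess = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] == guess for i in test_idx) / len(test_idx)

mean_score, scores = cross_validate(majority_baseline, len(labels), k=5)
```

Reporting the spread of the per-fold scores alongside the mean gives a rough sense of how stable a model's performance is across different subsets of households.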
Step 6 : Model Selection and Tuning
Hyperparameter Tuning: Optimized model parameters using techniques like grid search.
Final Model Selection: Chose Random Forest as the best-performing model based on evaluation results.
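Grid search over Random Forest hyperparameters might look like the following with scikit-learn's GridSearchCV. The parameter grid, the 3-fold setting, and the synthetic data are all assumptions made for illustration; the write-up does not list the values that were actually searched.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the preprocessed household features.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hypothetical grid: every combination is scored with 3-fold cross-validation.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

best_params = search.best_params_  # combination with the highest mean CV accuracy
best_score = search.best_score_    # that mean CV accuracy
```

Because the search itself uses cross-validation scores to pick parameters, a separate held-out test set is still needed for an unbiased estimate of the tuned model's accuracy.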
Step 7 : Results
Accuracy:
Random Forest: 85% accuracy
Logistic Regression: 78% accuracy
Naive Bayes: 72% accuracy
SVM: 80% accuracy
Visualize financial vulnerability hotspots across India
Conclusion:
Our analysis identified critical factors affecting financial vulnerability in women-led households, including education level and household income. Of the four models compared, Random Forest achieved the highest accuracy (85%) in predicting financial vulnerability, which makes it a reasonable basis for targeting interventions. Predictive analytics can play a vital role in empowering women in rural India by directing resources effectively. Understanding financial vulnerability can inform policy decisions and interventions aimed at socio-economic empowerment.