Machine learning classification on the German credit data from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)
- File germancredit contains the data visualisation, the preprocessing steps, and everything else needed to find the best model, including parameter settings.
- File Final_Model contains the final, best classifying models
- SVC
- Gaussian Naive Bayes
- Random Forest Classifier
- Extra Trees Classifier
- Gradient Boosting Classifier
- AdaBoost Classifier
- Bagging Classifier
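
A minimal sketch of how the classifiers listed above could be compared with scikit-learn. It is not the code from the repository files: the data is a synthetic stand-in generated with `make_classification`, all models use default hyperparameters, and the names `candidates` and `results` are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Assumption: synthetic stand-in data, NOT the real German credit data.
X, y = make_classification(
    n_samples=300, n_features=20, weights=[0.3, 0.7], random_state=0
)

# Assumption: default hyperparameters, not the tuned settings from the repository.
candidates = {
    "SVC": SVC(),
    "Gaussian Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Extra Trees": ExtraTreesClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
}

# Mean ROC AUC over 5 stratified folds for each candidate.
results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    results[name] = scores.mean()

# Print the candidates from best to worst mean ROC AUC.
for name, mean_auc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {mean_auc:.2f}")
```

On synthetic data the ranking says nothing about the real dataset; the sketch only shows the comparison loop.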
The best algorithm is the Gradient Boosting Classifier, evaluated with 10-fold cross-validation:
- Cross Validation Precision: 0.85 (+/- 0.17)
- Cross Validation Recall: 0.86 (+/- 0.04)
- Cross Validation roc_auc: 0.91 (+/- 0.09)
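
Scores of this form can be produced with scikit-learn's `cross_validate`. The following is a sketch under stated assumptions, not the repository's code: the data is synthetic (sized like the 1000-row, 20-attribute dataset), the model uses default hyperparameters, and the `+/-` value is computed as two standard deviations across folds, which is only a guess at how the figures above were derived. It will not reproduce the numbers reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

# Assumption: synthetic stand-in data, NOT the real German credit data.
X, y = make_classification(
    n_samples=1000, n_features=20, weights=[0.3, 0.7], random_state=0
)

# Assumption: default hyperparameters, not the tuned final model.
model = GradientBoostingClassifier(random_state=0)

# 10 stratified folds, shuffled with a fixed seed for repeatability.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

metrics = ["precision", "recall", "roc_auc"]
cv_results = cross_validate(model, X, y, cv=cv, scoring=metrics)

# Assumption: "+/-" is taken to mean two standard deviations across folds.
summary = {}
for metric in metrics:
    scores = cv_results[f"test_{metric}"]
    summary[metric] = (scores.mean(), 2 * scores.std())
    print(f"Cross Validation {metric}: {scores.mean():.2f} (+/- {2 * scores.std():.2f})")
```

Passing a list of scorers to `cross_validate` fits each fold once and evaluates all three metrics on it, which avoids refitting the model per metric.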