Binary classification of incomes as <=50K or >50K using decision trees and random forests in R.
Data can be found here: https://www.kaggle.com/uciml/adult-census-income
Kaggle notebook: https://www.kaggle.com/lavanyask/adult-census-income-classify
The project aims to classify incomes as <=50K or >50K based on census data. It is organised as follows:
- Data exploration
- Data cleaning and preprocessing
- Training a decision tree model and a random forest model
- Evaluating the performance of both models using ROC curves and AUC
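
The steps above can be sketched in a few lines of R. This is a minimal illustration, not the code from the notebook. It assumes the Kaggle file is saved locally as `adult.csv`, that missing values are coded as `?`, and that the target column is named `income` with levels `<=50K` and `>50K`. The helper name `fit_and_evaluate`, the 70/30 split, the seed, and `ntree = 200` are all hypothetical choices made for the example. It relies on the `rpart`, `randomForest`, and `pROC` packages.

```r
# Sketch only: file name, column names, split ratio, and seed are assumptions.
library(rpart)         # decision trees
library(randomForest)  # random forests
library(pROC)          # ROC curves and AUC

# Fit both models on a data frame whose factor column `income` has
# levels "<=50K" and ">50K"; return the test-set ROC objects.
fit_and_evaluate <- function(df, train_frac = 0.7, seed = 42) {
  set.seed(seed)
  train_idx <- sample(nrow(df), size = floor(train_frac * nrow(df)))
  train <- df[train_idx, ]
  test  <- df[-train_idx, ]

  tree   <- rpart(income ~ ., data = train, method = "class")
  forest <- randomForest(income ~ ., data = train, ntree = 200)

  # Predicted probability of the ">50K" class on the held-out rows
  p_tree   <- predict(tree, newdata = test, type = "prob")[, ">50K"]
  p_forest <- predict(forest, newdata = test, type = "prob")[, ">50K"]

  lv <- c("<=50K", ">50K")  # control level first, case level second
  list(
    tree   = roc(test$income, p_tree, levels = lv, direction = "<"),
    forest = roc(test$income, p_forest, levels = lv, direction = "<")
  )
}

# Run on the real data only if the (assumed) file is present.
if (file.exists("adult.csv")) {
  adult <- read.csv("adult.csv", na.strings = "?", strip.white = TRUE,
                    stringsAsFactors = TRUE)
  adult <- na.omit(adult)  # simplest cleaning step: drop incomplete rows

  rocs <- fit_and_evaluate(adult)
  plot(rocs$tree, col = "blue")
  plot(rocs$forest, col = "red", add = TRUE)
  cat("Tree AUC:  ", as.numeric(auc(rocs$tree)), "\n")
  cat("Forest AUC:", as.numeric(auc(rocs$forest)), "\n")
}
```

Dropping incomplete rows with `na.omit` is only the simplest possible cleaning step; imputing or recoding the missing categories may be preferable.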