Predicting Heart Disease risk using combined UCI data and machine learning

Basic info

In this Jupyter notebook (Predicting Heart Disease risk using combined UCI data and machine learning.ipynb), we use scikit-learn to train and test classifiers that decide whether a person has, or is at risk of, heart disease, based on the patient's results on simple medical tests.
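As a rough illustration of the train-and-test workflow described above, here is a minimal sketch that fits a few scikit-learn classifiers on one stratified split and reports their held-out accuracy. The helper name `compare_classifiers`, the choice of estimators, and the synthetic stand-in data are all assumptions made for this example; the notebook may use different models and settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_classifiers(X, y, test_size=0.2, seed=0):
    """Fit a few classifiers on one split and return their test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    # Hypothetical model selection; the notebook may use other estimators.
    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "k_nearest_neighbors": make_pipeline(
            StandardScaler(), KNeighborsClassifier(n_neighbors=5)
        ),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out rows
    return scores


# Synthetic stand-in data so the sketch runs without heart.csv.
X_demo, y_demo = make_classification(n_samples=300, n_features=11, random_state=0)
demo_scores = compare_classifiers(X_demo, y_demo)
for name, acc in demo_scores.items():
    print(f"{name}: {acc:.3f}")
```

Scaling is wrapped into the pipelines for the distance-based and linear models so that the scaler is fitted on the training rows only.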

Dataset

The data (heart.csv) was collected from https://www.kaggle.com/fedesoriano/heart-failure-prediction. It combines 5 different datasets with overlapping attributes. See the link for more information. The columns are:

Age: age of the patient [years]
Sex: sex of the patient [M: Male, F: Female]
ChestPainType: chest pain type [TA: Typical Angina, ATA: Atypical Angina, NAP: Non-Anginal Pain, ASY: Asymptomatic]
RestingBP: resting blood pressure [mm Hg]
Cholesterol: serum cholesterol [mg/dl]
FastingBS: fasting blood sugar [1: if FastingBS > 120 mg/dl, 0: otherwise]
RestingECG: resting electrocardiogram results [Normal: Normal, ST: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), LVH: showing probable or definite left ventricular hypertrophy by Estes' criteria]
MaxHR: maximum heart rate achieved [Numeric value between 60 and 202]
ExerciseAngina: exercise-induced angina [Y: Yes, N: No]
Oldpeak: ST depression induced by exercise relative to rest [Numeric value]
ST_Slope: the slope of the peak exercise ST segment [Up: upsloping, Flat: flat, Down: downsloping]
HeartDisease: output class [1: heart disease, 0: Normal]

The dataset contains 918 (non-duplicate) observations. Note that the target label is binary, for simplicity.
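Several of the columns listed above hold text codes, so they need encoding before most scikit-learn estimators can use them. The following is a minimal sketch of one way to do that with pandas, assuming the CSV header uses exactly the column names listed above. The helper `prepare_features` and the two sample rows are made up for illustration and are not taken from heart.csv.

```python
import io

import pandas as pd

# Text-valued columns, per the column list in this README.
CATEGORICAL = ["Sex", "ChestPainType", "RestingECG", "ExerciseAngina", "ST_Slope"]
TARGET = "HeartDisease"


def prepare_features(df):
    """Drop duplicate rows, one-hot encode text columns, split off the target."""
    df = df.drop_duplicates()
    y = df[TARGET]
    X = pd.get_dummies(df.drop(columns=[TARGET]), columns=CATEGORICAL)
    return X, y


# Two made-up rows in the assumed heart.csv layout, for illustration only.
sample_csv = io.StringIO(
    "Age,Sex,ChestPainType,RestingBP,Cholesterol,FastingBS,RestingECG,"
    "MaxHR,ExerciseAngina,Oldpeak,ST_Slope,HeartDisease\n"
    "50,M,ATA,130,250,0,Normal,160,N,0.0,Up,0\n"
    "60,F,ASY,150,280,1,ST,110,Y,2.0,Flat,1\n"
)
X, y = prepare_features(pd.read_csv(sample_csv))
print(X.shape, y.tolist())
```

With the real file, the same function could be called as `prepare_features(pd.read_csv("heart.csv"))`.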

Results

After training and testing, we conclude that many of the trained models are successful in predicting the correct output label. These are sample confusion matrices for some of the models:
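For readers who want to compute such a matrix themselves, here is a minimal sketch using `sklearn.metrics.confusion_matrix`. The helper `summarize_confusion` and the label vectors are hypothetical and chosen only to show the layout; they are not results from the notebook.

```python
from sklearn.metrics import confusion_matrix


def summarize_confusion(y_true, y_pred):
    """Return the 2x2 confusion matrix plus sensitivity and specificity."""
    # labels=[0, 1] fixes the layout: rows are true classes, columns are predictions.
    cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
    tn, fp, fn, tp = cm.ravel()
    return {
        "matrix": cm,
        "sensitivity": tp / (tp + fn),  # share of heart disease cases detected
        "specificity": tn / (tn + fp),  # share of normal cases correctly cleared
    }


# Made-up labels for illustration (1: heart disease, 0: Normal).
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
summary = summarize_confusion(y_true, y_pred)
print(summary["matrix"])
```

In a screening setting, sensitivity is often the figure to watch, since a false negative means a patient with heart disease is missed.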

References

fedesoriano. (September 2021). Heart Failure Prediction Dataset. Retrieved [Date Retrieved] from https://www.kaggle.com/fedesoriano/heart-failure-prediction.

AlexStratou/Predicting-heart-disease-risk-with-scikit-learn
