❤️ Predicting Heart Disease: SVC vs Random Forest
Heart disease remains one of the leading causes of death worldwide. With the right data, machine learning can help us spot early signs of risk — but which algorithms work best?
In this project, I compared a Support Vector Classifier (SVC) and a Random Forest Classifier (RFC) on the task of predicting heart disease.
The Dataset
I used the Heart Disease dataset, which contains 303 patient records with features like:
- Age
- Blood pressure
- Cholesterol levels
- ECG results
- Exercise-induced angina
- Maximum heart rate achieved
The goal: classify patients as at risk of heart disease or not at risk.
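To make the setup concrete, here is a minimal sketch of separating features from the label and holding out a test set. The rows are made up for illustration, the column names are hypothetical, and the stratified 75/25 split is my assumption rather than the split used in the project.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up rows for illustration only; column names are hypothetical,
# not the actual headers of the Heart Disease dataset.
df = pd.DataFrame({
    "age": [63, 41, 57, 45, 68, 52, 39, 60],
    "resting_bp": [145, 130, 120, 110, 150, 128, 118, 140],
    "cholesterol": [233, 204, 354, 264, 274, 205, 199, 258],
    "max_heart_rate": [150, 172, 163, 132, 120, 184, 179, 141],
    "exercise_angina": [0, 0, 1, 0, 1, 0, 0, 1],
    "target": [1, 0, 1, 0, 1, 0, 0, 1],  # 1 = at risk, 0 = not at risk
})

X = df.drop(columns="target")  # feature matrix
y = df["target"]               # binary label

# Stratifying keeps the class ratio the same in both splits,
# which matters when there are only a few hundred records.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
```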
The Approach
Two models were tested:
- Support Vector Classifier (SVC, RBF kernel) – finds the best boundary between classes.
- Random Forest Classifier (RFC) – combines multiple decision trees to improve predictions.
Performance was measured using accuracy, precision, recall, and F1-score.
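A comparison like this can be sketched with scikit-learn as follows. This is not the project's actual code: the data is a synthetic stand-in generated with `make_classification`, and the feature count, the 80/20 split, and the Random Forest settings are assumptions chosen only to keep the example runnable.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in with the same number of records (303); NOT the real data.
X, y = make_classification(
    n_samples=303, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "SVC (RBF)": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Precision, recall, and F1 are reported for the positive class (label 1).
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

The numbers this prints come from synthetic data, so they say nothing about the real comparison.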
Results
SVC outperformed RFC on most metrics:
- Accuracy: 87% (vs 80% for RFC)
- SVC precision and recall: balanced, at roughly 0.86–0.88
- RFC showed slightly better recall for positive cases (heart disease present) but weaker precision.
- SVC was more consistent across both classes, making it more reliable for medical screening.
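To see how a model can have better recall but weaker precision, it helps to work through the arithmetic. The counts below are hypothetical and are not the project's confusion matrices; they only illustrate the trade-off.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 for the positive class from raw counts."""
    precision = tp / (tp + fp)  # of all predicted positives, how many were right
    recall = tp / (tp + fn)     # of all actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: a model that flags many patients as positive.
# It misses few true cases (high recall) but raises more false alarms.
high_recall = precision_recall_f1(tp=30, fp=10, fn=2)

# Hypothetical counts: a model with balanced errors in both directions.
balanced = precision_recall_f1(tp=28, fp=4, fn=4)

for label, (p, r, f) in [("high-recall", high_recall), ("balanced", balanced)]:
    print(f"{label}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this made-up case the balanced model ends up with the higher F1 even though it finds fewer of the true positives.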
Insights
- SVC gave balanced predictions across both classes. That matters in healthcare, where false negatives (missed cases) can be dangerous and false positives lead to unnecessary follow-up.
- Random Forest offered feature importance (useful for interpretability) but showed more bias toward one class, trading precision for recall on positive cases.
- Both models performed acceptably, but SVC was the more robust of the two on this dataset.
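The feature importance mentioned above is exposed by scikit-learn's Random Forest through the `feature_importances_` attribute. Here is a minimal sketch on synthetic data; the feature names are placeholders, not the dataset's real columns, and the forest settings are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder names and synthetic data; NOT the real heart disease features.
feature_names = [f"feature_{i}" for i in range(6)]
X, y = make_classification(
    n_samples=303, n_features=6, n_informative=3, n_redundant=0, random_state=0
)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

# Impurity-based importances: one non-negative value per feature, summing to 1.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

An SVC with an RBF kernel has no equivalent built-in attribute, which is part of why Random Forest is the easier model to interpret.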
Next Steps
If I extended this project, I would:
- Apply robust feature scaling (e.g., StandardScaler, RobustScaler).
- Perform feature engineering using domain knowledge (e.g., cholesterol ratios, blood pressure categories).
- Explore polynomial features and interaction terms to improve predictive power.
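Scaling and interaction terms can be combined in a single scikit-learn pipeline. The sketch below uses synthetic data, and the degree, the 5-fold cross-validation, and the F1 scoring are my assumptions, not choices made in the project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data; NOT the real heart disease records.
X, y = make_classification(
    n_samples=303, n_features=8, n_informative=4, random_state=0
)

# interaction_only=True adds pairwise products (x_i * x_j) but no squared terms.
interactions = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
n_expanded = interactions.fit_transform(X).shape[1]
print("features after expansion:", n_expanded)  # 8 originals + 28 pairwise products

pipeline = make_pipeline(
    StandardScaler(),  # zero mean, unit variance per feature
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    SVC(kernel="rbf"),
)

# Putting the preprocessing inside the pipeline means it is re-fitted on each
# training fold, so no statistics leak from the validation fold.
scores = cross_val_score(pipeline, X, y, cv=5, scoring="f1")
print("mean F1 across folds:", round(scores.mean(), 3))
```

Swapping `StandardScaler` for `RobustScaler` is a one-line change if outliers in measurements like cholesterol turn out to be a problem.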
Conclusion
This study shows how ML models can support early diagnosis of heart disease. While Random Forest is useful, the Support Vector Classifier gave more reliable results for this dataset.
📄 You can view the full report here.