- Project Overview
- Dataset
- Project Workflow
- Project Deliverables
- Quick Run Instructions
- Key Learnings
- Results
- Streamlit App
- Tools & Technologies Used
- Skills Demonstrated
- Connect with Me
This project demonstrates how Machine Learning can assist in early detection of heart disease, which is critical in saving lives and reducing healthcare costs.
The model predicts the likelihood of heart disease from clinical features such as age, resting blood pressure, cholesterol, and fasting blood sugar, supporting early diagnosis and preventive care.
Source: Cleveland dataset from the UCI Machine Learning Repository.
Includes clinical attributes such as:
- Age, Sex
- Resting Blood Pressure
- Cholesterol levels
- Fasting Blood Sugar
- ECG results
- Maximum Heart Rate Achieved
- Exercise Induced Angina
- Oldpeak, Slope, Thalassemia, etc.
Exploratory Data Analysis (EDA) 🔍
- Identified correlations between features.
- Visualized patterns in patients with and without heart disease.
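As a rough illustration of the correlation step, the sketch below computes a Pearson correlation coefficient between two features in plain Python. The function name `pearson` and the sample values are made up for illustration; the notebook itself may well use Pandas or Seaborn for this.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant feature")
    return cov / sqrt(var_x * var_y)


# Made-up values, not taken from the dataset.
age = [40, 50, 60, 70]
max_heart_rate = [180, 170, 160, 150]
print(pearson(age, max_heart_rate))
```

A value near -1 or +1 indicates a strong linear relationship; a value near 0 indicates little linear relationship.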
Data Preprocessing 🧹
- Handled missing values.
- Encoded categorical variables.
- Scaled numerical features.
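The three preprocessing steps above could be combined roughly as follows with Scikit-learn. This is a minimal sketch on toy rows: the column positions, the choice of median and most-frequent imputation, and the use of one-hot encoding are assumptions, not necessarily what the notebook does.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy rows: [age, resting_bp, cholesterol, chest_pain_type]; values are made up.
X = np.array([
    [63.0, 145.0, 233.0, 1.0],
    [41.0, 130.0, np.nan, 2.0],
    [57.0, np.nan, 354.0, 0.0],
    [52.0, 120.0, 250.0, np.nan],
])

numeric_cols = [0, 1, 2]   # hypothetical column positions
categorical_cols = [3]     # hypothetical column position

# Numeric features: fill gaps with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical features: fill gaps with the most frequent value, then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", categorical_steps, categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

X_ready = preprocess.fit_transform(X)
print(X_ready.shape)
```

Fitting the transformer on the training split only, then applying it to the test split, keeps test-set statistics from leaking into training.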
Model Training 🤖
- Logistic Regression
- Random Forest Classifier
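A minimal sketch of training both models side by side is shown below. It uses synthetic data from `make_classification` as a stand-in, so the scores it prints say nothing about the project's results; the 80/20 split and the hyperparameter values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with 13 features; NOT the Cleveland dataset.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Assumed 80/20 split, stratified so both splits keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)  # mean accuracy
    print(f"{name}: test accuracy = {test_accuracy[name]:.2f}")
```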
Evaluation Metrics 📊
- Confusion Matrix
- Precision, Recall, F1-Score
- ROC-AUC
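To make these metrics concrete, the sketch below computes them from scratch in plain Python on made-up labels and scores. The function names are hypothetical, and the 0.5 decision threshold is an assumption; in practice Scikit-learn's metrics module provides the same quantities.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed threshold of 0.5

print(confusion_counts(y_true, y_pred))     # (3, 1, 1, 3)
print(precision_recall_f1(y_true, y_pred))  # (0.75, 0.75, 0.75)
print(roc_auc(y_true, y_score))             # 0.9375
```

Recall is the metric tied to false negatives: here one of the four true positives is missed, giving a recall of 0.75.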
Deployment 🌐
- Built an interactive Streamlit App for real-time predictions.
This repository contains everything needed to run and deploy the project:
- Dataset CSV file 📂 (used for training and testing)
- Trained Model File (model.pkl) 🤖
- Streamlit Application Script (app.py) 🌐
- Environment File (env/) ⚙️ containing all required Python libraries
- Jupyter Notebook (Project.ipynb) 📒 with EDA, preprocessing, training, and evaluation
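For readers unfamiliar with how a file like model.pkl is produced and consumed, here is a minimal sketch of a pickle round trip. It trains a throwaway model on synthetic data and writes to a temporary directory; whether the project uses `pickle` or another serializer such as `joblib` is an assumption.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Throwaway model on synthetic data; NOT the project's trained model.
X, y = make_classification(n_samples=100, n_features=13, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"

    # Save the fitted model to disk.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Load it back, as an app script would at startup.
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions.
same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print(same_predictions)
```

Pickle files can execute arbitrary code when loaded, so only load model files from a trusted source, and load them with the same Scikit-learn version used to save them where possible.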
For technical reviewers who want to test the app locally:
pip install -r requirements.txt
streamlit run app.py
- Importance of EDA in understanding feature distribution and spotting outliers.
- Selection of Evaluation Metrics (Precision, Recall, F1-score, ROC-AUC) is critical in healthcare, where false negatives can be dangerous.
- Hyperparameter Tuning improves model generalization and reduces overfitting.
- End-to-end pipeline: from raw data → model → deployment on Streamlit.
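The hyperparameter tuning point can be sketched with a cross-validated grid search. The parameter grid, the choice of Random Forest, and scoring on recall (to penalize false negatives) are assumptions for illustration, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data; NOT the Cleveland dataset.
X, y = make_classification(
    n_samples=200, n_features=13, n_informative=6, random_state=0
)

# Hypothetical grid; shallower trees act as a guard against overfitting.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, 5, None],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="recall",  # favor catching positives, since false negatives are costly
    cv=5,              # 5-fold cross-validation
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Because each candidate is scored on held-out folds, the selected settings reflect generalization rather than fit to the training data.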
- Achieved 89% Accuracy ✅
- ROC-AUC Score: 0.92 🏆
- Reliable predictions with balanced Precision & Recall
Figures: Confusion Matrix | ROC-AUC Curve | Train/Test Score Plot | Streamlit App Screenshot
Try the live interactive app here 👉 Heart Disease Prediction App
- Python 🐍
- Pandas, NumPy for data handling
- Matplotlib, Seaborn for EDA & Visualization
- Scikit-learn for ML modeling
- Streamlit for deployment
- Data Cleaning & Preprocessing
- Exploratory Data Analysis (EDA)
- Feature Engineering
- Model Training & Evaluation
- Hyperparameter Tuning
- Deployment with Streamlit
📌 GitHub: Check out my GitHub Profile for other projects
🌐 Live App: Streamlit App
💼 LinkedIn: Mohammad Navaman Jamadar



