
Breast Cancer Prediction Web Application


🚀 Live App: breast-cancer-prediction-app-by-rudra.streamlit.app




📌 Overview

This is a Streamlit-based web application that predicts whether a breast tumor is benign or malignant using a trained logistic regression model. It is built on the Breast Cancer Wisconsin (Diagnostic) dataset and covers the complete data science lifecycle, from problem definition to model deployment.

🎯 Objective: Build a reliable, interactive, and modular diagnostic tool that aids in early breast cancer detection using medical imaging features.


⭐ Project Highlights

  • ✅ Demonstrates end-to-end ML workflow
  • ✅ Real-time prediction app with deployed UI
  • ✅ Modular, readable, and scalable Python code
  • ✅ Excellent use case for health tech/AI in diagnostics

🧠 Key Features

  • Predict tumor type (Benign or Malignant)
  • Real-time probability scores
  • User-friendly UI with input sliders
  • Modular codebase for easy scalability
  • Live deployment on Streamlit Cloud

🗂️ Project Architecture

├── assets/                # Visual assets (e.g., logos, screenshots)
├── data/                 # Dataset and derived files
├── model/                # Trained model (Pickle file)
├── notebooks/            # Exploratory and preprocessing notebooks
│   ├── p1-understand-the-data.ipynb
│   ├── p2-eda.ipynb
│   └── p3-outliers.ipynb
├── utils/                # Custom modules for charts, sidebar, data
│   ├── charts.py
│   ├── data_model.py
│   ├── sidebar.py
├── streamlit_app.py      # Main app entry point
├── requirements.txt      # Python dependencies
└── README.md             # Project documentation

📊 Features Used for Prediction

The model uses 30 numeric measurements of cell nuclei, computed from digitized images. They fall into three groups, for example:

  • Mean: Radius, Texture, Perimeter, Area, Smoothness
  • Standard Error: Radius SE, Perimeter SE, Concavity SE, etc.
  • Worst (largest): Texture worst, Area worst, Symmetry worst, etc.

Each of these is captured using an intuitive sidebar interface in the app.
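As a rough illustration of the three feature groups, the sketch below lists them from the copy of the dataset bundled with scikit-learn. This is an assumption for illustration only: the file in this repository's data/ folder may name its columns differently (the scikit-learn copy uses names such as "mean radius", "radius error", and "worst radius").

```python
from sklearn.datasets import load_breast_cancer

# Assumption: the copy of the Wisconsin Diagnostic dataset that ships with
# scikit-learn stands in for the project's own data file.
data = load_breast_cancer()
feature_names = list(data.feature_names)

# Sort each feature into one of the three groups by its naming pattern.
groups = {"mean": [], "standard error": [], "worst": []}
for name in feature_names:
    if name.startswith("mean "):
        groups["mean"].append(name)
    elif name.endswith(" error"):
        groups["standard error"].append(name)
    elif name.startswith("worst "):
        groups["worst"].append(name)

for group, names in groups.items():
    print(f"{group} ({len(names)} features): {', '.join(names[:3])}, ...")
```

Each group holds the same ten underlying measurements (radius, texture, perimeter, and so on), which is where the total of 30 comes from.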


🧪 Model Details

Note: The pickle file is version-sensitive. Loading it with a scikit-learn version other than the one used for training may fail or raise warnings, so scikit-learn 1.6.1 is recommended.
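One way to guard against a version mismatch is to compare the installed scikit-learn version with the recommended one before unpickling. The sketch below is not taken from this repository; the helper names are hypothetical, and the model path is left as a parameter because the actual file name inside model/ is not stated here.

```python
import pickle
import warnings
from importlib import metadata

RECOMMENDED_SKLEARN = "1.6.1"  # version named in the note above


def installed_sklearn_version():
    """Return the installed scikit-learn version, or None if it is missing."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


def version_matches(installed, recommended=RECOMMENDED_SKLEARN):
    """True only when the installed version equals the recommended one."""
    return installed == recommended


def load_model(path, installed=None):
    """Unpickle a model, warning first if the scikit-learn version differs.

    `path` is whatever pickle file the project ships (hypothetical here).
    """
    installed = installed or installed_sklearn_version()
    if not version_matches(installed):
        warnings.warn(
            f"scikit-learn {installed} is installed, but "
            f"{RECOMMENDED_SKLEARN} is recommended for this pickle."
        )
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Recent scikit-learn releases also emit their own warning when an estimator pickled under a different version is loaded, so this check mainly serves to surface the problem earlier and with a clearer message.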


📌 Step-by-Step Workflow

This project was developed following an end-to-end data science pipeline:

  1. Problem Definition
  2. Data Collection & Cleaning
  3. Exploratory Data Analysis (EDA)
  4. Outlier Handling & Feature Engineering
  5. Model Building & Evaluation
  6. Prediction Pipeline Creation
  7. Web App Development using Streamlit
  8. Deployment on Streamlit Cloud

🧰 Tech Stack

| Tool/Library         | Usage                          |
|----------------------|--------------------------------|
| Python               | Core programming language      |
| Pandas / NumPy       | Data manipulation              |
| Matplotlib / Seaborn | Data visualization             |
| Plotly               | Interactive data visualization |
| Scikit-learn         | Model building & evaluation    |
| Streamlit            | Web interface & deployment     |
| Pickle               | Model serialization            |

📷 App Interface

Example inputs include cell features like radius, area, smoothness, etc.
The prediction result is shown instantly with confidence levels.
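The flow from slider inputs to a displayed result might look like the sketch below. It is illustrative only and not copied from streamlit_app.py: `describe_prediction`, `render`, and the `feature_ranges` mapping are hypothetical names, and the model is assumed to expose `predict_proba` with column 1 holding the probability of a benign tumor.

```python
def describe_prediction(proba_benign: float) -> tuple[str, float]:
    """Map P(benign) to a label and the confidence in that label."""
    if not 0.0 <= proba_benign <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if proba_benign >= 0.5:
        return "Benign", proba_benign
    return "Malignant", 1.0 - proba_benign


def render(model, feature_ranges):
    """Draw one sidebar slider per feature and show the prediction.

    model: fitted classifier with predict_proba (assumption: column 1 is
        the probability of the benign class).
    feature_ranges: hypothetical mapping {name: (min, max, default)}.
    """
    # Imported lazily so the helper above works without Streamlit installed.
    import streamlit as st

    values = [
        st.sidebar.slider(
            name,
            min_value=float(low),
            max_value=float(high),
            value=float(default),
        )
        for name, (low, high, default) in feature_ranges.items()
    ]
    proba_benign = float(model.predict_proba([values])[0][1])
    label, confidence = describe_prediction(proba_benign)
    st.subheader(f"Prediction: {label}")
    st.write(f"Confidence: {confidence:.1%}")
```

To try it, call `render(model, feature_ranges)` from the script passed to `streamlit run`. Streamlit reruns the script whenever a slider moves, which is what makes the result appear to update instantly.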

Demo video: cancer-prediction.mp4

🧪 Run the App Locally

  • Clone the repository

    git clone https://github.com/Rudra-G-23/breast-cancer-prediction-app.git
    cd breast-cancer-prediction-app
  • Install dependencies

    pip install -r requirements.txt
  • Launch the app

    streamlit run streamlit_app.py

🙋‍♂️ Author

Rudra Prasad Bhuyan
📧 rudraprasadbhuyan000@gmail.com
🔗 GitHub | LinkedIn | Kaggle

