Learn every major ML algorithm with math intuition, from-scratch implementation, and real-world projects.
Most ML tutorials give you either:
- 🔴 Copy-paste code with no explanation of why it works, or
- 🔴 Dense math papers with no practical implementation
These notebooks bridge that gap. Every notebook gives you:
✅ Plain English explanation → ✅ Mathematical derivation → ✅ Visual intuition → ✅ From-scratch NumPy code → ✅ Production sklearn code
Whether you are a complete beginner or an experienced engineer brushing up before an interview, this is your one-stop reference.
| 🎓 Students | 👨‍💻 Engineers | 🎯 Interview Prep |
|---|---|---|
| Learning ML from scratch with one structured, progressive path | Quickly look up syntax, parameters, and best practices mid-project | Algorithms explained conceptually + common interview Q&A included |
```
ml-mastery/
│
├── 01_libraries/                    # 📦 Core Python libraries for ML
│   ├── numpy.ipynb                  # Arrays, operations, broadcasting, indexing
│   ├── pandas.ipynb                 # DataFrames, cleaning, groupby, merging
│   └── matplotlib.ipynb             # Plots, subplots, styling, saving figures
│
├── 02_ml_concepts/                  # 🧠 ML algorithms with math intuition + code
│   ├── linear_regression.ipynb      # OLS, gradient descent, cost function
│   ├── logistic_regression.ipynb    # Sigmoid, log loss, decision boundary
│   ├── decision_trees.ipynb         # Entropy, Gini, information gain
│   ├── random_forest.ipynb          # Bagging, feature importance, OOB error
│   ├── svm.ipynb                    # Hyperplane, margin, kernel trick
│   ├── knn.ipynb                    # Distance metrics, choosing K
│   ├── naive_bayes.ipynb            # Bayes theorem, conditional probability
│   ├── unsupervised.ipynb           # K-Means, DBSCAN, PCA, t-SNE
│   ├── feature_engineering.ipynb    # Encoding, scaling, selection, pipelines
│   └── model_evaluation.ipynb       # Metrics, cross-validation, bias-variance
│
├── 03_projects/                     # 🏗️ End-to-end ML projects on real datasets
│   ├── titanic.ipynb                # Binary classification — survival prediction
│   └── house_price.ipynb            # Regression — price prediction
│
├── extras/                          # 📌 Quick reference materials
│   ├── cheatsheet.md                # Most-used commands across all libraries
│   └── interview_qa.md              # Common ML interview questions and answers
│
├── requirements.txt                 # All dependencies
├── CONTRIBUTING.md                  # How to contribute
└── README.md                        # You are here
```
Click a notebook name to view it on GitHub, or the Colab badge to open it directly in Google Colab — no setup required.
| Notebook | What You'll Learn | Open |
|---|---|---|
| numpy.ipynb | Arrays, broadcasting, vectorized math, indexing | |
| pandas.ipynb | DataFrames, data cleaning, groupby, merging | |
| matplotlib.ipynb | Plots, subplots, styling, saving figures | |
| Notebook | What You'll Learn | Open |
|---|---|---|
| linear_regression.ipynb | OLS, gradient descent, cost function, R² | |
| logistic_regression.ipynb | Sigmoid, log loss, decision boundary | |
| decision_trees.ipynb | Entropy, Gini impurity, information gain | |
| random_forest.ipynb | Bagging, feature importance, OOB error | |
| svm.ipynb | Hyperplane, margin maximization, kernel trick | |
| knn.ipynb | Distance metrics, K selection, curse of dimensionality | |
| naive_bayes.ipynb | Bayes theorem, conditional probability, Laplace smoothing | |
| unsupervised.ipynb | K-Means, DBSCAN, PCA, t-SNE dimensionality reduction | |
| feature_engineering.ipynb | Encoding, scaling, feature selection, pipelines | |
| model_evaluation.ipynb | Metrics, cross-validation, bias-variance tradeoff | |
| Notebook | Problem Type | Dataset | Open |
|---|---|---|---|
| titanic.ipynb | Binary Classification | Titanic survival | |
| house_price.ipynb | Regression | Housing prices | |
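Both projects follow the standard sklearn workflow: preprocess with a `ColumnTransformer`, then fit inside a `Pipeline`. As a rough sketch of that shape (the tiny DataFrame, column names, and model choice below are illustrative stand-ins, not the notebooks' actual code or data):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up stand-in for Titanic-style data (hypothetical columns)
df = pd.DataFrame({
    "sex": ["male", "female", "female", "male", "female", "male"] * 5,
    "age": [22, 38, 26, 35, 27, 54] * 5,
    "fare": [7.25, 71.3, 7.9, 53.1, 11.1, 51.9] * 5,
    "survived": [0, 1, 1, 0, 1, 0] * 5,
})

# One-hot encode categoricals, scale numerics, then classify
pre = ColumnTransformer([
    ("cat", OneHotEncoder(), ["sex"]),
    ("num", StandardScaler(), ["age", "fare"]),
])
model = Pipeline([("prep", pre), ("clf", LogisticRegression())])
model.fit(df[["sex", "age", "fare"]], df["survived"])
print(model.score(df[["sex", "age", "fare"]], df["survived"]))
```

Bundling preprocessing into the pipeline means the same transformations are applied at predict time automatically, which is the pattern the project notebooks build up to.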
Follow this order if you are starting from scratch:
```
📦 NumPy → Pandas → Matplotlib
        ↓
🧠 Linear Regression → Logistic Regression
        ↓
   Decision Trees → Random Forest
        ↓
   SVM → KNN → Naive Bayes
        ↓
   Feature Engineering → Model Evaluation
        ↓
   Unsupervised Learning
        ↓
🏗️ Projects (Titanic → House Price)
```
Pro tip: Don't skip the libraries section. Everything in `02_ml_concepts/` depends heavily on NumPy and Pandas.
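A quick illustration of why: operations that would need explicit loops in plain Python become one-liners once you know NumPy broadcasting (a toy example, not taken from the notebooks):

```python
import numpy as np

# Center each column of a matrix without writing a loop:
# a (3, 2) array minus a (2,) array broadcasts row-wise.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
centered = X - X.mean(axis=0)
print(centered)  # each column now has mean 0
```

Idioms like this show up constantly in the from-scratch implementations, so they are worth internalizing first.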
Every notebook in 02_ml_concepts/ follows this proven structure:
| Section | What You Get |
|---|---|
| 🗣️ Concept Overview | Plain English explanation of what the algorithm does and when to use it |
| 📐 Math Intuition | The actual math — cost function, derivation, key equations |
| 📊 Visual Intuition | Plots and diagrams built from scratch to show what is happening |
| 💻 Code from Scratch | Full implementation using only NumPy — no black boxes |
| ⚙️ Sklearn Implementation | The production-ready way with all parameters explained |
| ⚠️ Common Pitfalls | What goes wrong, why, and how to fix it |
| 🧩 Exercises | Practice problems with worked solutions |
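To give a flavor of the "Code from Scratch" sections, here is a minimal gradient-descent fit of a line in plain NumPy. This is a toy sketch in the spirit of `linear_regression.ipynb`, not the notebook's exact code:

```python
import numpy as np

# Toy data: y = 2x + 1 plus a little noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=100)
y = 2.0 * X + 1.0 + rng.normal(0, 0.1, size=100)

# Gradient descent on the MSE cost J(w, b) = mean((w*x + b - y)^2)
w, b, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    err = w * X + b - y
    w -= lr * 2 * np.mean(err * X)  # dJ/dw
    b -= lr * 2 * np.mean(err)      # dJ/db

print(f"w ≈ {w:.2f}, b ≈ {b:.2f}")  # recovers roughly (2, 1)
```

Each notebook then shows the equivalent one-liner in sklearn, so you can connect the math to the production API.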
Option 1 — Google Colab (recommended, zero setup)
Click any "Open in Colab" badge above. Everything runs instantly in your browser — no installation needed.
Option 2 — Run Locally
```bash
git clone https://github.com/himanshu231204/ml-mastery.git
cd ml-mastery
pip install -r requirements.txt
jupyter notebook
```

| Resource | Description |
|---|---|
| 📋 Cheatsheet | Most-used NumPy, Pandas, Matplotlib, and sklearn commands with output examples |
| 🎯 Interview Q&A | Common ML interview questions with detailed answers |
| Library | Version | Purpose |
|---|---|---|
| numpy | ≥ 1.24 | Array operations, mathematical computing |
| pandas | ≥ 2.0 | Data manipulation and analysis |
| matplotlib | ≥ 3.7 | Data visualization |
| seaborn | ≥ 0.12 | Statistical data visualization |
| scikit-learn | ≥ 1.3 | Machine learning algorithms |
| scipy | ≥ 1.11 | Scientific and statistical computing |
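If you installed locally, a small optional sanity check (not part of the repo) can confirm the minimum versions in the table above are met:

```python
from importlib import metadata

# Minimum versions from the dependency table above
MINIMUMS = {
    "numpy": "1.24",
    "pandas": "2.0",
    "matplotlib": "3.7",
    "seaborn": "0.12",
    "scikit-learn": "1.3",
    "scipy": "1.11",
}

def check(pkg: str, minimum: str) -> str:
    """Report whether pkg is installed at or above the minimum version."""
    try:
        installed = metadata.version(pkg)
    except metadata.PackageNotFoundError:
        return f"{pkg}: MISSING (pip install '{pkg}>={minimum}')"
    # Compare (major, minor) numerically, not as strings
    ok = tuple(map(int, installed.split(".")[:2])) >= tuple(map(int, minimum.split(".")))
    return f"{pkg} {installed}: {'OK' if ok else f'too old, need >= {minimum}'}"

for pkg, minimum in MINIMUMS.items():
    print(check(pkg, minimum))
```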
Contributions are welcome and appreciated! Whether it's fixing a typo, adding a new algorithm notebook, or improving an explanation — every contribution helps.
See CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License — free to use, share, and modify with attribution. See LICENSE for details.
If this repository helped you learn or saved you time, please consider:
- ⭐ Starring this repo — it helps others discover it
- 🍴 Forking it — to build your own ML reference
- 📢 Sharing it — with friends, colleagues, or on social media
Every star motivates continued improvement. Thank you! 🙏