
🤖 ML Mastery

The Complete Machine Learning Reference — From Zero to Production

Learn every major ML algorithm with math intuition, from-scratch implementation, and real-world projects.



🚀 Why ML Mastery?

Most ML tutorials give you either:

  • 🔴 Copy-paste code with no explanation of why it works, or
  • 🔴 Dense math papers with no practical implementation

These notebooks bridge that gap. Every notebook gives you:

✅ Plain English explanation → ✅ Mathematical derivation → ✅ Visual intuition → ✅ From-scratch NumPy code → ✅ Production sklearn code

Whether you are a complete beginner or an experienced engineer brushing up before an interview, this is your one-stop reference.


👥 Who is this for?

| 🎓 Students | 👨‍💻 Engineers | 🎯 Interview Prep |
| --- | --- | --- |
| Learning ML from scratch with one structured, progressive path | Quickly look up syntax, parameters, and best practices mid-project | Algorithms explained conceptually + common interview Q&A included |

📁 Repository Structure

ml-mastery/
│
├── 01_libraries/               # 📦 Core Python libraries for ML
│   ├── numpy.ipynb             # Arrays, operations, broadcasting, indexing
│   ├── pandas.ipynb            # DataFrames, cleaning, groupby, merging
│   └── matplotlib.ipynb        # Plots, subplots, styling, saving figures
│
├── 02_ml_concepts/             # 🧠 ML algorithms with math intuition + code
│   ├── linear_regression.ipynb     # OLS, gradient descent, cost function
│   ├── logistic_regression.ipynb   # Sigmoid, log loss, decision boundary
│   ├── decision_trees.ipynb        # Entropy, Gini, information gain
│   ├── random_forest.ipynb         # Bagging, feature importance, OOB error
│   ├── svm.ipynb                   # Hyperplane, margin, kernel trick
│   ├── knn.ipynb                   # Distance metrics, choosing K
│   ├── naive_bayes.ipynb           # Bayes theorem, conditional probability
│   ├── unsupervised.ipynb          # K-Means, DBSCAN, PCA, t-SNE
│   ├── feature_engineering.ipynb   # Encoding, scaling, selection, pipelines
│   └── model_evaluation.ipynb      # Metrics, cross-validation, bias-variance
│
├── 03_projects/                # 🏗️ End-to-end ML projects on real datasets
│   ├── titanic.ipynb           # Binary classification — survival prediction
│   └── house_price.ipynb       # Regression — price prediction
│
├── extras/                     # 📌 Quick reference materials
│   ├── cheatsheet.md           # Most-used commands across all libraries
│   └── interview_qa.md         # Common ML interview questions and answers
│
├── requirements.txt            # All dependencies
├── CONTRIBUTING.md             # How to contribute
└── README.md                   # You are here

📓 Notebooks Quick Access

Click a notebook name to view it on GitHub, or the Colab badge to open it directly in Google Colab — no setup required.

📦 01 — Libraries

| Notebook | What You'll Learn | Open |
| --- | --- | --- |
| numpy.ipynb | Arrays, broadcasting, vectorized math, indexing | Colab |
| pandas.ipynb | DataFrames, data cleaning, groupby, merging | Colab |
| matplotlib.ipynb | Plots, subplots, styling, saving figures | Colab |

🧠 02 — ML Concepts

| Notebook | What You'll Learn | Open |
| --- | --- | --- |
| linear_regression.ipynb | OLS, gradient descent, cost function, R² | Colab |
| logistic_regression.ipynb | Sigmoid, log loss, decision boundary | Colab |
| decision_trees.ipynb | Entropy, Gini impurity, information gain | Colab |
| random_forest.ipynb | Bagging, feature importance, OOB error | Colab |
| svm.ipynb | Hyperplane, margin maximization, kernel trick | Colab |
| knn.ipynb | Distance metrics, K selection, curse of dimensionality | Colab |
| naive_bayes.ipynb | Bayes theorem, conditional probability, Laplace smoothing | Colab |
| unsupervised.ipynb | K-Means and DBSCAN clustering; PCA and t-SNE dimensionality reduction | Colab |
| feature_engineering.ipynb | Encoding, scaling, feature selection, pipelines | Colab |
| model_evaluation.ipynb | Metrics, cross-validation, bias-variance tradeoff | Colab |

🏗️ 03 — Real-World Projects

| Notebook | Problem Type | Dataset | Open |
| --- | --- | --- | --- |
| titanic.ipynb | Binary Classification | Titanic survival | Colab |
| house_price.ipynb | Regression | Housing prices | Colab |

🗺️ Recommended Learning Path

Follow this order if you are starting from scratch:

📦 NumPy → Pandas → Matplotlib
                ↓
🧠 Linear Regression → Logistic Regression
                ↓
        Decision Trees → Random Forest
                ↓
         SVM → KNN → Naive Bayes
                ↓
  Feature Engineering → Model Evaluation
                ↓
        Unsupervised Learning
                ↓
🏗️ Projects (Titanic → House Price)

Pro tip: Don't skip the libraries section. Everything in 02_ml_concepts/ depends heavily on NumPy and Pandas.


🔬 What's Inside Each Algorithm Notebook

Every notebook in 02_ml_concepts/ follows this proven structure:

| Section | What You Get |
| --- | --- |
| 🗣️ Concept Overview | Plain English explanation of what the algorithm does and when to use it |
| 📐 Math Intuition | The actual math — cost function, derivation, key equations |
| 📊 Visual Intuition | Plots and diagrams built from scratch to show what is happening |
| 💻 Code from Scratch | Full implementation using only NumPy — no black boxes |
| ⚙️ Sklearn Implementation | The production-ready way with all parameters explained |
| ⚠️ Common Mistakes | What goes wrong, why, and how to fix it |
| 🧩 Exercises | Practice problems with worked solutions |
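To make the "from scratch vs. sklearn" contrast concrete, here is an illustrative sketch (not taken from the notebooks themselves) that fits a line both ways; the synthetic data, learning rate, and iteration count are assumptions chosen for the example:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data: y = 3x + 2 plus a little noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 0.5, size=100)

# From scratch: batch gradient descent on the MSE cost J(w, b) = mean((wx + b - y)^2)
w, b = 0.0, 0.0
lr = 0.01
for _ in range(2000):
    pred = w * X[:, 0] + b
    error = pred - y
    w -= lr * 2 * np.mean(error * X[:, 0])  # dJ/dw
    b -= lr * 2 * np.mean(error)            # dJ/db

# Production: the same model via scikit-learn
model = LinearRegression().fit(X, y)

print(f"scratch: w={w:.2f}, b={b:.2f}")
print(f"sklearn: w={model.coef_[0]:.2f}, b={model.intercept_:.2f}")
```

Both approaches recover roughly the same coefficients on this synthetic data; the notebooks expand each step with the cost-function derivation and plots.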

⚡ Quick Start

Option 1 — Google Colab (recommended, zero setup)

Click any "Open in Colab" badge above. Everything runs in your browser — no local installation needed.

Option 2 — Run Locally

git clone https://github.com/himanshu231204/ml-mastery.git
cd ml-mastery
pip install -r requirements.txt
jupyter notebook
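After installing, a quick sanity check (a minimal sketch; the package names match requirements.txt) confirms the core libraries import and reports their versions:

```python
# Verify the core ML stack is importable and print installed versions
import numpy
import pandas
import sklearn

print("numpy:", numpy.__version__)
print("pandas:", pandas.__version__)
print("scikit-learn:", sklearn.__version__)
```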

📚 Extra Resources

| Resource | Description |
| --- | --- |
| 📋 Cheatsheet | Most-used NumPy, Pandas, Matplotlib, and sklearn commands with output examples |
| 🎯 Interview Q&A | Common ML interview questions with detailed answers |

🛠️ Dependencies

| Library | Version | Purpose |
| --- | --- | --- |
| numpy | ≥ 1.24 | Array operations, mathematical computing |
| pandas | ≥ 2.0 | Data manipulation and analysis |
| matplotlib | ≥ 3.7 | Data visualization |
| seaborn | ≥ 0.12 | Statistical data visualization |
| scikit-learn | ≥ 1.3 | Machine learning algorithms |
| scipy | ≥ 1.11 | Scientific and statistical computing |
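As a sketch, a requirements.txt matching the minimum versions above would look like this (the repository's actual file may pin versions differently):

```txt
numpy>=1.24
pandas>=2.0
matplotlib>=3.7
seaborn>=0.12
scikit-learn>=1.3
scipy>=1.11
```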

🤝 Contributing

Contributions are welcome and appreciated! Whether it's fixing a typo, adding a new algorithm notebook, or improving an explanation — every contribution helps.

See CONTRIBUTING.md for guidelines.


📄 License

This project is licensed under the MIT License — free to use, share, and modify with attribution. See LICENSE for details.


⭐ Support This Project

If this repository helped you learn or saved you time, please consider:

  • ⭐ Starring this repo — it helps others discover it
  • 🍴 Forking it — to build your own ML reference
  • 📢 Sharing it — with friends, colleagues, or on social media

Every star motivates continued improvement. Thank you! 🙏


👨‍💻 Author

Himanshu Kumar

GitHub LinkedIn Twitter Email

Building tools that make machine learning accessible to everyone.
