A machine learning application that analyzes California housing data and serves real-time house price predictions through a Streamlit web interface. The project demonstrates an end-to-end data science workflow, from exploratory data analysis to model deployment.
House Price Predictor Web App
- Interactive Data Analysis: Comprehensive Jupyter notebook with data exploration and visualization
- Multiple ML Models: Linear Regression and Random Forest algorithms for price prediction
- Beautiful Web Interface: Modern Streamlit dashboard with customizable backgrounds
- Real-time Predictions: Instant house price estimates based on user inputs
- Model Performance Metrics: R² scores and accuracy measurements
- User-friendly Design: Intuitive sliders and input controls
House-Price-Predictor/
├── 1.Data Analysis.ipynb   # Exploratory data analysis & model experiments
├── housing.csv             # Training dataset (California housing data)
├── streamlit_app.py        # Interactive web application
├── requirements.txt        # Python dependencies
├── LICENSE                 # MIT License
└── README.md               # Project documentation
- Backend: Python 3.8+
- ML Libraries: Scikit-learn, Pandas, NumPy
- Frontend: Streamlit
- Data Analysis: Jupyter Notebook
- Visualization: Matplotlib, Seaborn (in notebook)
- Python 3.8 or higher
- Git (optional)
- PowerShell (Windows) or Terminal (Mac/Linux)
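- Clone the repository: `git clone https://github.com/sandudul/House-Price-Predictor.git`, then `cd House-Price-Predictor`
- Create a virtual environment (recommended): `python -m venv .\.venv`, then activate it with `.\.venv\Scripts\Activate.ps1` (on Mac/Linux: `source .venv/bin/activate`)
- Install dependencies: `pip install -r .\requirements.txt`
- Launch the application: `streamlit run .\streamlit_app.py`
- Open in browser: navigate to http://localhost:8501 to access the application

The commands above use PowerShell path syntax. On Mac/Linux, drop the `.\` prefix (for example, `pip install -r requirements.txt`).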
- Launch the App: Follow the installation steps above
- Input House Features: Use the sidebar sliders to adjust:
  - House age
  - Average rooms
  - Population density
  - Median income
  - Geographic coordinates
- Get Predictions: View real-time price estimates from multiple models
- Compare Models: Analyze R² scores to understand model performance
The application uses the California Housing Dataset, which includes:
| Feature | Description |
|---|---|
| `housing_median_age` | Median age of houses in the block |
| `total_rooms` | Total number of rooms in the block |
| `total_bedrooms` | Total number of bedrooms in the block |
| `population` | Population in the block |
| `households` | Number of households in the block |
| `median_income` | Median income of households |
| `latitude` | Latitude coordinate |
| `longitude` | Longitude coordinate |
Target Variable: median_house_value (in hundreds of thousands of dollars)
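The usage steps mention "average rooms", while the dataset stores block-level totals. One plausible way to bridge the two is to divide totals by `households`. The sketch below assumes that derivation; the app itself may compute its inputs differently. The helper name `per_household_features` and the two sample rows are hypothetical and not taken from `housing.csv`.

```python
import csv
import io

# Two made-up rows in the column layout described above (not real data).
SAMPLE_CSV = """housing_median_age,total_rooms,total_bedrooms,population,households,median_income,latitude,longitude
30,1200,250,900,300,4.5,34.05,-118.25
15,800,,500,200,3.2,37.77,-122.42
"""


def per_household_features(row):
    """Turn block-level totals into per-household averages (assumed derivation)."""
    households = float(row["households"])
    features = {
        "rooms_per_household": float(row["total_rooms"]) / households,
        "population_per_household": float(row["population"]) / households,
    }
    # A blank total_bedrooms cell (as in the second sample row) yields None
    # instead of raising a ValueError.
    bedrooms = row["total_bedrooms"].strip()
    features["bedrooms_per_household"] = (
        float(bedrooms) / households if bedrooms else None
    )
    return features


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
derived = [per_household_features(r) for r in rows]
for d in derived:
    print(d)
```

Returning `None` for a blank cell keeps the decision about imputation (median fill, row removal, and so on) with the caller.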
| Model | Description | Use Case |
|---|---|---|
| Linear Regression | Simple linear relationship modeling | Baseline performance & interpretability |
| Random Forest | Ensemble method with decision trees | Handling non-linear patterns & feature interactions |
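To make the comparison concrete, here is a minimal sketch of training both model types and scoring them with R² using scikit-learn. It runs on synthetic stand-in data so it works without `housing.csv`; the feature count, the hyperparameters, and the 80/20 split are assumptions for illustration and may not match what `streamlit_app.py` does.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): three numeric features, 500 rows.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 10.0, size=(500, 3))
# The target mixes a linear term with a feature interaction plus noise,
# so a purely linear model cannot capture all of it.
y = 2.0 * X[:, 0] + X[:, 1] * X[:, 2] + rng.normal(0.0, 0.5, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R2 = {scores[name]:.3f}")
```

Scoring on a held-out split rather than the training data gives a fairer picture, since a Random Forest can fit its training set very closely.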
Place your image file in the project directory and modify the background path in streamlit_app.py:
`set_background("your_image.jpg")`

For production deployment, consider:
- Saving trained models using `joblib` or `pickle`
- Implementing model versioning
- Adding cross-validation
- Feature engineering enhancements
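As a rough illustration of the first two points, the sketch below saves a model with the standard-library `pickle` module and stores a version tag alongside it. The helpers `save_model` and `load_model`, the file name, and the version string are hypothetical. A plain dict stands in for a fitted model so the example runs without extra dependencies.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path, version):
    """Write the model together with a version tag to one file."""
    with open(path, "wb") as fh:
        pickle.dump({"version": version, "model": model}, fh)


def load_model(path, expected_version):
    """Load a saved model; only unpickle files from a source you trust."""
    with open(path, "rb") as fh:
        artifact = pickle.load(fh)
    if artifact["version"] != expected_version:
        raise ValueError(
            f"expected version {expected_version}, got {artifact['version']}"
        )
    return artifact["model"]


# Hypothetical stand-in for a fitted model: a plain dict of learned parameters.
fitted = {"coef": [0.5, -1.25], "intercept": 2.0}

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model_v1.pkl"  # hypothetical file name
    save_model(fitted, model_path, version="v1")
    restored = load_model(model_path, expected_version="v1")

print(restored == fitted)  # → True
```

The same `pickle.dump` and `pickle.load` calls work on fitted scikit-learn estimators. Loading a saved model at startup avoids retraining every time the app launches.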
The application displays model performance using:
- R² Score: Coefficient of determination
- Prediction Accuracy: Real-time validation
- Training Time: Model efficiency metrics
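For reference, R² compares a model's squared error with that of always predicting the mean: R² = 1 - SS_res / SS_tot. The following dependency-free sketch shows the calculation. The function name `r_squared` is illustrative; in practice the app would more likely rely on scikit-learn's `r2_score`.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of a baseline that always predicts the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


actual = [200.0, 300.0, 400.0]
print(r_squared(actual, actual))                 # → 1.0 (perfect predictions)
print(r_squared(actual, [300.0, 300.0, 300.0]))  # → 0.0 (always the mean)
```

A score of 1.0 means perfect predictions, 0.0 means no better than the mean, and negative values mean the model is worse than the mean baseline.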
| Issue | Solution |
|---|---|
| Streamlit won't start | Ensure virtual environment is activated and dependencies are installed |
| Missing dataset | Verify housing.csv exists in the project root directory |
| Import errors | Run pip install -r requirements.txt to install missing packages |
| Performance issues | Consider using saved models instead of training at startup |
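The "Missing dataset" row can be checked programmatically before the app starts. This is a minimal sketch, assuming `housing.csv` uses the column names listed in the dataset section; the function name `check_dataset` is hypothetical and not part of `streamlit_app.py`.

```python
import csv
from pathlib import Path

# Column names taken from the dataset description in this README.
REQUIRED_COLUMNS = {
    "housing_median_age", "total_rooms", "total_bedrooms", "population",
    "households", "median_income", "latitude", "longitude",
    "median_house_value",
}


def check_dataset(path):
    """Return a list of problems found with the dataset file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} not found; place housing.csv in the project root"]
    # Only the header row is read, so the check stays fast on large files.
    with path.open(newline="") as fh:
        header = next(csv.reader(fh), [])
    missing = sorted(REQUIRED_COLUMNS - set(header))
    return [f"missing column: {name}" for name in missing]


problems = check_dataset("housing.csv")
for problem in problems:
    print(problem)
```

An empty list means the file exists and has the expected columns. Extra columns in the file are ignored.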
Contributions are welcome! Please feel free to submit a Pull Request. For major changes:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
This project demonstrates:
- Data Science Workflow: From EDA to model deployment
- Machine Learning: Regression algorithms and performance evaluation
- Web Development: Interactive dashboards with Streamlit
- Best Practices: Code organization, documentation, and version control
sandu - @sandudul
Project Link: https://github.com/sandudul/House-Price-Predictor
Star this repository if you found it helpful!