Skip to content

asaygom/p1-plan-vs-actual-powerbi

Repository files navigation

P1: Plan vs Actual - EVM Analytics Dashboard

Weekly Plan vs Actual dashboard for engineering/construction projects. Power BI semantic model with DAX-driven variance hotspots and performance indices (SPI/CPI).

Power BI Python Pandas


🎯 Business Problem

Project managers in construction/engineering need weekly visibility into:

  • Which projects are off-track (schedule/cost)?
  • What's the forecasted budget at completion?
  • Where should we intervene first?

Traditional reporting took 4-6 hours/week per project, consolidating Primavera P6 exports, Excel calculations, and manual variance analysis.

✅ Solution

Automated ETL pipeline + Power BI dashboard that:

  • Reduces reporting time 60% (from 6h to 2.5h per portfolio)
  • Surfaces variance 2 weeks earlier via EVM indicators (SPI/CPI)
  • One-page executive view with drill-through to project/task detail

Key Features

  • Earned Value Management (EVM): SPI, CPI, variance tracking
  • S-Curves: Planned vs Actual spend over time
  • Risk Scoring: Automated flagging of at-risk projects
  • Weekly Trending: 12-week historical performance
  • Task Breakdown: Granular progress by category

📊 Dashboard Preview

Dashboard Screenshot

Live Demo: Power BI Service (synthetic data)
Video Walkthrough: 3-min demo


🏗️ Architecture

┌─────────────┐      ┌──────────────┐        ┌─────────────┐
│  Raw Data   │─────▶│ ETL Pipeline │─────▶ │  Power BI   │
│  (CSV/XER)  │      │  (Python)    │        │  Dashboard  │
└─────────────┘      └──────────────┘        └─────────────┘
     │                      │                      │
     │                      │                      │
  Projects              Transform              Semantic
  Tasks                 Calculate EVM           Model
  Snapshots            Quality Checks          DAX Measures

Data Flow

  1. Extract: Projects, tasks, and weekly snapshots from CSV (or Primavera P6 XER)
  2. Transform: Calculate SPI/CPI, risk scores, forecasts
  3. Validate: Data quality checks (nulls, ranges, consistency)
  4. Load: Output transformed CSVs for Power BI import
  5. Visualize: Power BI connects to transformed data, applies star schema

🚀 Quick Start

Prerequisites

  • Python 3.9+
  • Power BI Desktop (for .pbix file)

Installation

# Clone repo
git clone https://github.com/asaygom/p1-plan-vs-actual-powerbi.git
cd p1-plan-vs-actual-powerbi

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Generate Synthetic Data

# Create sample projects (10 projects, 12 weeks history)
python scripts/generate_data.py

# Output: data/projects.csv, data/weekly_snapshots.csv, data/tasks.csv

Run ETL Pipeline

# Execute full pipeline: Extract → Transform → Load
python pipeline/run_etl.py

# Output: output/projects_transformed.csv + analytics views

Validate Data Quality

# Run quality checks
python pipeline/quality_checks.py

# Output: output/quality_report.csv

Open Dashboard

# Open Power BI file
open powerbi/PlanVsActual.pbix  # macOS
# or double-click PlanVsActual.pbix on Windows

# Refresh data sources to load transformed CSVs

📁 Project Structure

p1-plan-vs-actual-powerbi/
├── README.md                    # This file
├── requirements.txt             # Python dependencies
├── .gitignore
│
├── data/                        # Raw input data
│   ├── projects.csv
│   ├── weekly_snapshots.csv
│   └── tasks.csv
│
├── pipeline/                    # ETL scripts
│   ├── run_etl.py              # Main pipeline
│   ├── quality_checks.py       # Data validation
│   └── __init__.py
│
├── scripts/                     # Utility scripts
│   └── generate_data.py        # Synthetic data generator
│
├── output/                      # Transformed data (gitignored)
│   ├── projects_transformed.csv
│   ├── analytics_*.csv
│   └── quality_report.csv
│
├── powerbi/                     # Power BI files
│   ├── PlanVsActual.pbix       # Main dashboard
│   └── measures.dax            # DAX measures documentation
│
└── docs/                        # Documentation
    ├── data_dictionary.md      # Column definitions
    ├── architecture.md         # Technical architecture
    ├── dashboard_preview.png
    └── demo_video.mp4

📈 Key Metrics

Earned Value Management (EVM)

Metric Formula Meaning
SPI (Schedule Performance Index) EV / PV < 1.0 = Behind schedule
CPI (Cost Performance Index) EV / AC < 1.0 = Over budget
Cost Variance EV - AC Negative = Cost overrun
Schedule Variance EV - PV Negative = Schedule slip
EAC (Estimate at Completion) BAC / CPI Forecasted final cost

Risk Scoring

risk_score = (1 - SPI) * 50 + (1 - CPI) * 50
# 0-10: Low risk (green)
# 10-30: Medium risk (yellow)
# 30+: High risk (red)

🔧 Tech Stack

Component Technology Purpose
Data Generation Python (NumPy, Pandas) Synthetic realistic project data
ETL Pipeline Python (Pandas) Extract, transform, validate
Data Quality Custom validators Null checks, range validation, consistency
Semantic Model Power BI (DAX) Star schema, calculated measures
Visualization Power BI Desktop Interactive dashboards
Version Control Git/GitHub Code management

📝 Data Dictionary

Projects Table

Column Type Description
project_id String Unique identifier (PRJ-001)
project_name String Project display name
planned_budget_usd Float Original approved budget
earned_value_usd Float Value of work completed
actual_cost_usd Float Actual spend to date
spi Float Schedule Performance Index
cpi Float Cost Performance Index
progress_pct Float Physical % complete (0-100)
risk_score Float Calculated risk (0-100)

See full data dictionary for all columns.


🎓 Use Cases

1. Weekly PMO Report

  • Portfolio-level SPI/CPI
  • Top 5 at-risk projects
  • Budget forecast vs baseline

2. Executive Dashboard

  • One-page summary (traffic lights)
  • S-curves: planned vs actual spend
  • Variance drill-through

3. Project Deep-Dive

  • Task-level progress
  • Schedule slippage by category
  • Cost variance by work package

🔄 Extending the Pipeline

Connect to Primavera P6

# pipeline/extractors/p6_extractor.py
from xerparser import Xer

def extract_p6_data(xer_file_path):
    xer = Xer(xer_file_path)
    
    projects = [{
        'project_id': proj.proj_id,
        'project_name': proj.proj_short_name,
        'planned_budget_usd': proj.orig_cost,
        # ... map P6 fields
    } for proj in xer.projects]
    
    return pd.DataFrame(projects)

Add Forecasting Model

# pipeline/forecasting.py
from sklearn.linear_model import LinearRegression

def forecast_completion_date(weekly_progress):
    """Predict completion based on velocity."""
    X = weekly_progress[['week_number']].values
    y = weekly_progress['progress_pct'].values
    
    model = LinearRegression().fit(X, y)
    weeks_to_100 = (100 - y[-1]) / model.coef_[0]
    
    return datetime.now() + timedelta(weeks=weeks_to_100)

📊 Sample Output

Portfolio Summary (Console)

=== PORTFOLIO SUMMARY ===
Total Projects: 10
Active: 6 | Completed: 4

Portfolio SPI: 0.94 (6% behind schedule)
Portfolio CPI: 1.02 (2% under budget)

Projects at Risk: 3
- PRJ-003: SPI 0.82, CPI 0.91 (Risk: 28.5)
- PRJ-007: SPI 0.79, CPI 0.88 (Risk: 32.1)
- PRJ-009: SPI 0.85, CPI 0.89 (Risk: 25.5)

🧪 Testing

# Run quality checks (should pass all)
python pipeline/quality_checks.py

# Generate fresh data and pipeline
python scripts/generate_data.py && python pipeline/run_etl.py

# Check output files exist
ls output/

🤝 Contributing

This is a demo project with synthetic data. For production use:

  1. Replace data source: Connect to actual P6/Procore/Buildertrend API
  2. Add authentication: Secure data access
  3. Implement incremental refresh: Only process new/changed records
  4. Deploy to Power BI Service: Scheduled refresh + RLS

📄 License

MIT License - Feel free to use for learning/portfolio purposes.


👤 Author

Alexis Sayago
Analytics Engineer | Fabric · Power BI · SQL · Python


📚 Related Projects


Built with ❤️ for project controls and data engineering

About

Weekly Plan vs Actual dashboard for engineering/construction projects. Power BI (DAX/modeling), variance hotspots and a simple pace index. Demo data included.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages