P1: Plan vs Actual - EVM Analytics Dashboard

Weekly Plan vs Actual dashboard for engineering/construction projects. Power BI semantic model with DAX-driven variance hotspots and performance indices (SPI/CPI).

🎯 Business Problem

Project managers in construction/engineering need weekly visibility into:

Which projects are off-track (schedule/cost)?
What's the forecasted budget at completion?
Where should we intervene first?

Traditional reporting took 4-6 hours/week per project, consolidating Primavera P6 exports, Excel calculations, and manual variance analysis.

✅ Solution

Automated ETL pipeline + Power BI dashboard that:

Reduces reporting time 60% (from 6h to 2.5h per portfolio)
Surfaces variance 2 weeks earlier via EVM indicators (SPI/CPI)
One-page executive view with drill-through to project/task detail

Key Features

Earned Value Management (EVM): SPI, CPI, variance tracking
S-Curves: Planned vs Actual spend over time
Risk Scoring: Automated flagging of at-risk projects
Weekly Trending: 12-week historical performance
Task Breakdown: Granular progress by category

📊 Dashboard Preview

Live Demo: Power BI Service (synthetic data)
Video Walkthrough: 3-min demo

🏗️ Architecture

┌─────────────┐      ┌──────────────┐        ┌─────────────┐
│  Raw Data   │─────▶│ ETL Pipeline │─────▶ │  Power BI   │
│  (CSV/XER)  │      │  (Python)    │        │  Dashboard  │
└─────────────┘      └──────────────┘        └─────────────┘
     │                      │                      │
     │                      │                      │
  Projects              Transform              Semantic
  Tasks                 Calculate EVM           Model
  Snapshots            Quality Checks          DAX Measures

Data Flow

Extract: Projects, tasks, and weekly snapshots from CSV (or Primavera P6 XER)
Transform: Calculate SPI/CPI, risk scores, forecasts
Validate: Data quality checks (nulls, ranges, consistency)
Load: Output transformed CSVs for Power BI import
Visualize: Power BI connects to transformed data, applies star schema

🚀 Quick Start

Prerequisites

Python 3.9+
Power BI Desktop (for .pbix file)

Installation

# Clone repo
git clone https://github.com/asaygom/p1-plan-vs-actual-powerbi.git
cd p1-plan-vs-actual-powerbi

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Generate Synthetic Data

# Create sample projects (10 projects, 12 weeks history)
python scripts/generate_data.py

# Output: data/projects.csv, data/weekly_snapshots.csv, data/tasks.csv

Run ETL Pipeline

# Execute full pipeline: Extract → Transform → Load
python pipeline/run_etl.py

# Output: output/projects_transformed.csv + analytics views

Validate Data Quality

# Run quality checks
python pipeline/quality_checks.py

# Output: output/quality_report.csv

Open Dashboard

# Open Power BI file
open powerbi/PlanVsActual.pbix  # macOS
# or double-click PlanVsActual.pbix on Windows

# Refresh data sources to load transformed CSVs

📁 Project Structure

p1-plan-vs-actual-powerbi/
├── README.md                    # This file
├── requirements.txt             # Python dependencies
├── .gitignore
│
├── data/                        # Raw input data
│   ├── projects.csv
│   ├── weekly_snapshots.csv
│   └── tasks.csv
│
├── pipeline/                    # ETL scripts
│   ├── run_etl.py              # Main pipeline
│   ├── quality_checks.py       # Data validation
│   └── __init__.py
│
├── scripts/                     # Utility scripts
│   └── generate_data.py        # Synthetic data generator
│
├── output/                      # Transformed data (gitignored)
│   ├── projects_transformed.csv
│   ├── analytics_*.csv
│   └── quality_report.csv
│
├── powerbi/                     # Power BI files
│   ├── PlanVsActual.pbix       # Main dashboard
│   └── measures.dax            # DAX measures documentation
│
└── docs/                        # Documentation
    ├── data_dictionary.md      # Column definitions
    ├── architecture.md         # Technical architecture
    ├── dashboard_preview.png
    └── demo_video.mp4

📈 Key Metrics

Earned Value Management (EVM)

Metric	Formula	Meaning
SPI (Schedule Performance Index)	EV / PV	< 1.0 = Behind schedule
CPI (Cost Performance Index)	EV / AC	< 1.0 = Over budget
Cost Variance	EV - AC	Negative = Cost overrun
Schedule Variance	EV - PV	Negative = Schedule slip
EAC (Estimate at Completion)	BAC / CPI	Forecasted final cost

Risk Scoring

risk_score = (1 - SPI) * 50 + (1 - CPI) * 50
# 0-10: Low risk (green)
# 10-30: Medium risk (yellow)
# 30+: High risk (red)

🔧 Tech Stack

Component	Technology	Purpose
Data Generation	Python (NumPy, Pandas)	Synthetic realistic project data
ETL Pipeline	Python (Pandas)	Extract, transform, validate
Data Quality	Custom validators	Null checks, range validation, consistency
Semantic Model	Power BI (DAX)	Star schema, calculated measures
Visualization	Power BI Desktop	Interactive dashboards
Version Control	Git/GitHub	Code management

📝 Data Dictionary

Projects Table

Column	Type	Description
`project_id`	String	Unique identifier (PRJ-001)
`project_name`	String	Project display name
`planned_budget_usd`	Float	Original approved budget
`earned_value_usd`	Float	Value of work completed
`actual_cost_usd`	Float	Actual spend to date
`spi`	Float	Schedule Performance Index
`cpi`	Float	Cost Performance Index
`progress_pct`	Float	Physical % complete (0-100)
`risk_score`	Float	Calculated risk (0-100)

See full data dictionary for all columns.

🎓 Use Cases

1. Weekly PMO Report

Portfolio-level SPI/CPI
Top 5 at-risk projects
Budget forecast vs baseline

2. Executive Dashboard

One-page summary (traffic lights)
S-curves: planned vs actual spend
Variance drill-through

3. Project Deep-Dive

Task-level progress
Schedule slippage by category
Cost variance by work package

🔄 Extending the Pipeline

Connect to Primavera P6

# pipeline/extractors/p6_extractor.py
from xerparser import Xer

def extract_p6_data(xer_file_path):
    xer = Xer(xer_file_path)
    
    projects = [{
        'project_id': proj.proj_id,
        'project_name': proj.proj_short_name,
        'planned_budget_usd': proj.orig_cost,
        # ... map P6 fields
    } for proj in xer.projects]
    
    return pd.DataFrame(projects)

Add Forecasting Model

# pipeline/forecasting.py
from sklearn.linear_model import LinearRegression

def forecast_completion_date(weekly_progress):
    """Predict completion based on velocity."""
    X = weekly_progress[['week_number']].values
    y = weekly_progress['progress_pct'].values
    
    model = LinearRegression().fit(X, y)
    weeks_to_100 = (100 - y[-1]) / model.coef_[0]
    
    return datetime.now() + timedelta(weeks=weeks_to_100)

📊 Sample Output

Portfolio Summary (Console)

=== PORTFOLIO SUMMARY ===
Total Projects: 10
Active: 6 | Completed: 4

Portfolio SPI: 0.94 (6% behind schedule)
Portfolio CPI: 1.02 (2% under budget)

Projects at Risk: 3
- PRJ-003: SPI 0.82, CPI 0.91 (Risk: 28.5)
- PRJ-007: SPI 0.79, CPI 0.88 (Risk: 32.1)
- PRJ-009: SPI 0.85, CPI 0.89 (Risk: 25.5)

🧪 Testing

# Run quality checks (should pass all)
python pipeline/quality_checks.py

# Generate fresh data and pipeline
python scripts/generate_data.py && python pipeline/run_etl.py

# Check output files exist
ls output/

🤝 Contributing

This is a demo project with synthetic data. For production use:

Replace data source: Connect to actual P6/Procore/Buildertrend API
Add authentication: Secure data access
Implement incremental refresh: Only process new/changed records
Deploy to Power BI Service: Scheduled refresh + RLS

📄 License

MIT License - Feel free to use for learning/portfolio purposes.

👤 Author

Alexis Sayago
Analytics Engineer | Fabric · Power BI · SQL · Python

LinkedIn: linkedin.com/in/asaygom
GitHub: github.com/asaygom

📚 Related Projects

P2: Nordic Infrastructure Sustainability - Medallion architecture with Fabric
P3: Supply Chain Analytics (coming soon)

Built with ❤️ for project controls and data engineering

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
assets		assets
data		data
pipeline		pipeline
powerbi		powerbi
scripts		scripts
.gitattributes		.gitattributes
.gitattributes-LFS.example		.gitattributes-LFS.example
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

P1: Plan vs Actual - EVM Analytics Dashboard

🎯 Business Problem

✅ Solution

Key Features

📊 Dashboard Preview

🏗️ Architecture

Data Flow

🚀 Quick Start

Prerequisites

Installation

Generate Synthetic Data

Run ETL Pipeline

Validate Data Quality

Open Dashboard

📁 Project Structure

📈 Key Metrics

Earned Value Management (EVM)

Risk Scoring

🔧 Tech Stack

📝 Data Dictionary

Projects Table

🎓 Use Cases

1. Weekly PMO Report

2. Executive Dashboard

3. Project Deep-Dive

🔄 Extending the Pipeline

Connect to Primavera P6

Add Forecasting Model

📊 Sample Output

Portfolio Summary (Console)

🧪 Testing

🤝 Contributing

📄 License

👤 Author

📚 Related Projects

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages