Weekly Plan vs Actual dashboard for engineering/construction projects. Power BI semantic model with DAX-driven variance hotspots and performance indices (SPI/CPI).
Project managers in construction/engineering need weekly visibility into:
- Which projects are off-track (schedule/cost)?
- What's the forecasted budget at completion?
- Where should we intervene first?
Traditional reporting took 4-6 hours/week per project, consolidating Primavera P6 exports, Excel calculations, and manual variance analysis.
Automated ETL pipeline + Power BI dashboard that:
- Reduces reporting time 60% (from 6h to 2.5h per portfolio)
- Surfaces variance 2 weeks earlier via EVM indicators (SPI/CPI)
- One-page executive view with drill-through to project/task detail
- Earned Value Management (EVM): SPI, CPI, variance tracking
- S-Curves: Planned vs Actual spend over time
- Risk Scoring: Automated flagging of at-risk projects
- Weekly Trending: 12-week historical performance
- Task Breakdown: Granular progress by category
Live Demo: Power BI Service (synthetic data)
Video Walkthrough: 3-min demo
┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│ Raw Data │─────▶│ ETL Pipeline │─────▶ │ Power BI │
│ (CSV/XER) │ │ (Python) │ │ Dashboard │
└─────────────┘ └──────────────┘ └─────────────┘
│ │ │
│ │ │
Projects Transform Semantic
Tasks Calculate EVM Model
Snapshots Quality Checks DAX Measures
- Extract: Projects, tasks, and weekly snapshots from CSV (or Primavera P6 XER)
- Transform: Calculate SPI/CPI, risk scores, forecasts
- Validate: Data quality checks (nulls, ranges, consistency)
- Load: Output transformed CSVs for Power BI import
- Visualize: Power BI connects to transformed data, applies star schema
- Python 3.9+
- Power BI Desktop (for
.pbixfile)
# Clone repo
git clone https://github.com/asaygom/p1-plan-vs-actual-powerbi.git
cd p1-plan-vs-actual-powerbi
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt# Create sample projects (10 projects, 12 weeks history)
python scripts/generate_data.py
# Output: data/projects.csv, data/weekly_snapshots.csv, data/tasks.csv# Execute full pipeline: Extract → Transform → Load
python pipeline/run_etl.py
# Output: output/projects_transformed.csv + analytics views# Run quality checks
python pipeline/quality_checks.py
# Output: output/quality_report.csv# Open Power BI file
open powerbi/PlanVsActual.pbix # macOS
# or double-click PlanVsActual.pbix on Windows
# Refresh data sources to load transformed CSVsp1-plan-vs-actual-powerbi/
├── README.md # This file
├── requirements.txt # Python dependencies
├── .gitignore
│
├── data/ # Raw input data
│ ├── projects.csv
│ ├── weekly_snapshots.csv
│ └── tasks.csv
│
├── pipeline/ # ETL scripts
│ ├── run_etl.py # Main pipeline
│ ├── quality_checks.py # Data validation
│ └── __init__.py
│
├── scripts/ # Utility scripts
│ └── generate_data.py # Synthetic data generator
│
├── output/ # Transformed data (gitignored)
│ ├── projects_transformed.csv
│ ├── analytics_*.csv
│ └── quality_report.csv
│
├── powerbi/ # Power BI files
│ ├── PlanVsActual.pbix # Main dashboard
│ └── measures.dax # DAX measures documentation
│
└── docs/ # Documentation
├── data_dictionary.md # Column definitions
├── architecture.md # Technical architecture
├── dashboard_preview.png
└── demo_video.mp4
| Metric | Formula | Meaning |
|---|---|---|
| SPI (Schedule Performance Index) | EV / PV | < 1.0 = Behind schedule |
| CPI (Cost Performance Index) | EV / AC | < 1.0 = Over budget |
| Cost Variance | EV - AC | Negative = Cost overrun |
| Schedule Variance | EV - PV | Negative = Schedule slip |
| EAC (Estimate at Completion) | BAC / CPI | Forecasted final cost |
risk_score = (1 - SPI) * 50 + (1 - CPI) * 50
# 0-10: Low risk (green)
# 10-30: Medium risk (yellow)
# 30+: High risk (red)| Component | Technology | Purpose |
|---|---|---|
| Data Generation | Python (NumPy, Pandas) | Synthetic realistic project data |
| ETL Pipeline | Python (Pandas) | Extract, transform, validate |
| Data Quality | Custom validators | Null checks, range validation, consistency |
| Semantic Model | Power BI (DAX) | Star schema, calculated measures |
| Visualization | Power BI Desktop | Interactive dashboards |
| Version Control | Git/GitHub | Code management |
| Column | Type | Description |
|---|---|---|
project_id |
String | Unique identifier (PRJ-001) |
project_name |
String | Project display name |
planned_budget_usd |
Float | Original approved budget |
earned_value_usd |
Float | Value of work completed |
actual_cost_usd |
Float | Actual spend to date |
spi |
Float | Schedule Performance Index |
cpi |
Float | Cost Performance Index |
progress_pct |
Float | Physical % complete (0-100) |
risk_score |
Float | Calculated risk (0-100) |
See full data dictionary for all columns.
- Portfolio-level SPI/CPI
- Top 5 at-risk projects
- Budget forecast vs baseline
- One-page summary (traffic lights)
- S-curves: planned vs actual spend
- Variance drill-through
- Task-level progress
- Schedule slippage by category
- Cost variance by work package
# pipeline/extractors/p6_extractor.py
from xerparser import Xer
def extract_p6_data(xer_file_path):
xer = Xer(xer_file_path)
projects = [{
'project_id': proj.proj_id,
'project_name': proj.proj_short_name,
'planned_budget_usd': proj.orig_cost,
# ... map P6 fields
} for proj in xer.projects]
return pd.DataFrame(projects)# pipeline/forecasting.py
from sklearn.linear_model import LinearRegression
def forecast_completion_date(weekly_progress):
"""Predict completion based on velocity."""
X = weekly_progress[['week_number']].values
y = weekly_progress['progress_pct'].values
model = LinearRegression().fit(X, y)
weeks_to_100 = (100 - y[-1]) / model.coef_[0]
return datetime.now() + timedelta(weeks=weeks_to_100)=== PORTFOLIO SUMMARY ===
Total Projects: 10
Active: 6 | Completed: 4
Portfolio SPI: 0.94 (6% behind schedule)
Portfolio CPI: 1.02 (2% under budget)
Projects at Risk: 3
- PRJ-003: SPI 0.82, CPI 0.91 (Risk: 28.5)
- PRJ-007: SPI 0.79, CPI 0.88 (Risk: 32.1)
- PRJ-009: SPI 0.85, CPI 0.89 (Risk: 25.5)
# Run quality checks (should pass all)
python pipeline/quality_checks.py
# Generate fresh data and pipeline
python scripts/generate_data.py && python pipeline/run_etl.py
# Check output files exist
ls output/This is a demo project with synthetic data. For production use:
- Replace data source: Connect to actual P6/Procore/Buildertrend API
- Add authentication: Secure data access
- Implement incremental refresh: Only process new/changed records
- Deploy to Power BI Service: Scheduled refresh + RLS
MIT License - Feel free to use for learning/portfolio purposes.
Alexis Sayago
Analytics Engineer | Fabric · Power BI · SQL · Python
- LinkedIn: linkedin.com/in/asaygom
- GitHub: github.com/asaygom
- P2: Nordic Infrastructure Sustainability - Medallion architecture with Fabric
- P3: Supply Chain Analytics (coming soon)
Built with ❤️ for project controls and data engineering
