Raw NHL game JSON data scraped from the NHL API via fastRhockey.
graph LR;
A[fastRhockey-nhl-raw]-->B[fastRhockey-nhl-data];
B[fastRhockey-nhl-data]-->C1[nhl_pbp_full];
B[fastRhockey-nhl-data]-->C2[nhl_pbp_lite];
B[fastRhockey-nhl-data]-->C3[nhl_team_boxscores];
B[fastRhockey-nhl-data]-->C4[nhl_player_boxscores];
B[fastRhockey-nhl-data]-->C5[nhl_rosters];
B[fastRhockey-nhl-data]-->C6[nhl_schedules];
flowchart TB;
subgraph A[fastRhockey-nhl-raw];
direction TB;
A1[scripts/daily_nhl_scraper.sh]-->A2[R/scrape_nhl_raw.R];
end;
subgraph B[fastRhockey-nhl-data];
direction TB;
B1[scripts/daily_nhl_R_processor.sh]-->B2[R/nhl_data_creation.R];
end;
subgraph C[sportsdataverse Releases];
direction TB;
C1[nhl_pbp_full];
C2[nhl_pbp_lite];
C3[nhl_team_boxscores];
C4[nhl_player_boxscores];
C5[nhl_rosters];
C6[nhl_schedules];
end;
A-->B;
B-->C1;
B-->C2;
B-->C3;
B-->C4;
B-->C5;
B-->C6;
click C1 "https://github.com/sportsdataverse/sportsdataverse-data/releases/tag/nhl_pbp_full" _blank;
click C2 "https://github.com/sportsdataverse/sportsdataverse-data/releases/tag/nhl_pbp_lite" _blank;
click C3 "https://github.com/sportsdataverse/sportsdataverse-data/releases/tag/nhl_team_boxscores" _blank;
click C4 "https://github.com/sportsdataverse/sportsdataverse-data/releases/tag/nhl_player_boxscores" _blank;
click C5 "https://github.com/sportsdataverse/sportsdataverse-data/releases/tag/nhl_rosters" _blank;
click C6 "https://github.com/sportsdataverse/sportsdataverse-data/releases/tag/nhl_schedules" _blank;
| Release tag | Content |
|---|---|
| nhl_pbp_full | NHL play-by-play data (full, includes shifts) |
| nhl_pbp_lite | NHL play-by-play data (lite, no shift CHANGE events) |
| nhl_team_boxscores | NHL team box scores |
| nhl_player_boxscores | NHL player box scores (skaters + goalies) |
| nhl_rosters | NHL rosters |
| nhl_schedules | NHL schedules |
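
The release pages linked from the diagram all share one URL pattern, so a tag name from the table can be turned into its release page link. The following is a minimal Python sketch, assuming that pattern holds for every tag listed here; `release_url` and `RELEASE_TAGS` are hypothetical names for illustration and are not part of fastRhockey.

```python
# Release tags as listed in the table above.
RELEASE_TAGS = (
    "nhl_pbp_full",
    "nhl_pbp_lite",
    "nhl_team_boxscores",
    "nhl_player_boxscores",
    "nhl_rosters",
    "nhl_schedules",
)

# URL prefix taken from the diagram's click targets.
RELEASE_BASE = "https://github.com/sportsdataverse/sportsdataverse-data/releases/tag"


def release_url(tag: str) -> str:
    """Return the release page URL for a known tag (hypothetical helper)."""
    if tag not in RELEASE_TAGS:
        raise ValueError(f"unknown release tag: {tag!r}")
    return f"{RELEASE_BASE}/{tag}"
```

Calling `release_url("nhl_rosters")` yields the same link the diagram attaches to the `nhl_rosters` node. The names of the files attached to each release are not covered here, since they are not stated in this README.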
nhl/
├── json/
│   ├── raw/                      # Raw NHL API responses per game
│   └── final/                    # Processed via fastRhockey pipeline (PBP, box scores, game info)
├── schedules/
│   ├── rds/                      # Season schedules (nhl_schedule_{year}.rds)
│   └── parquet/                  # Season schedules in parquet format
├── nhl_schedule_master.rds       # Combined schedule across all seasons
└── nhl_schedule_master.parquet
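
The layout above can be expressed as a couple of path helpers. This is a sketch in Python, assuming paths are relative to the repository root; the function names are hypothetical, and the README does not say whether `{year}` is the season's start or end year, so the value is passed through unchanged. Per-season parquet file names are not given in the tree and are therefore left out.

```python
from pathlib import PurePosixPath

# Top-level data directory from the tree above.
NHL_ROOT = PurePosixPath("nhl")


def season_schedule_rds(year: int) -> PurePosixPath:
    """Path of one season's schedule, following nhl_schedule_{year}.rds."""
    return NHL_ROOT / "schedules" / "rds" / f"nhl_schedule_{year}.rds"


def master_schedule(fmt: str = "rds") -> PurePosixPath:
    """Path of the combined schedule in either of the two listed formats."""
    if fmt not in ("rds", "parquet"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return NHL_ROOT / f"nhl_schedule_master.{fmt}"
```

`PurePosixPath` is used so the helpers build forward-slash paths on any platform without touching the filesystem.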
- Scraping workflow runs daily during the NHL season
- Each push triggers the fastRhockey-nhl-data repo to compile the datasets
fastRhockey-nhl-raw data repository (source: NHL API)
fastRhockey-nhl-data repository (source: NHL API)
fastRhockey-pwhl-raw data repository (source: HockeyTech API)
fastRhockey-pwhl-data repository (source: HockeyTech API)
fastRhockey-data legacy repository (archived; sources: NHL Stats API + PHF)