A GTFS file is a ZIP archive containing a number of CSV files. A "GTFS Diff v2" file is a single JSON document that describes all differences between two GTFS archives, combining aggregate summaries with capped per-file details.
- Single file output: one JSON document contains summary + capped details.
- Capped for performance: max 50 row changes per file; full counts always preserved in summary.
- Explicit scope: files outside the supported scope are reported in metadata.unsupported_files rather than silently ignored.
- Spec-aligned: compatible with the existing GTFS Diff v1 specification.
This version of the schema only supports files defined in the official GTFS Schedule reference.
Any unsupported file in the GTFS archive (including non-.txt files like readme.pdf, locations.geojson, etc.) is not diffed. Instead, it is reported in metadata.unsupported_files.
GtfsDiffOutput
├── metadata
│   ├── schema_version
│   ├── generated_at
│   ├── row_changes_cap_per_file
│   ├── base_feed                 # source (URL or local path), downloaded_at
│   ├── new_feed                  # source (URL or local path), downloaded_at
│   └── unsupported_files[]       # files skipped by the diff engine
├── summary                       # true aggregate counts (drives file tree sidebar)
│   └── files[]                   # per-file: name + true counts by action
└── file_diffs[]                  # one entry per changed supported file
    ├── file_name
    ├── file_action               # "added" | "deleted" | "modified"
    ├── columns_added[]
    ├── columns_deleted[]
    ├── row_changes
    │   ├── primary_key[]
    │   ├── columns[]             # union of base + new; base column order first
    │   ├── added[]               # capped
    │   ├── deleted[]             # capped
    │   └── modified[]            # capped
    └── truncated                 # cap metadata (omitted counts)
Fields of the metadata object:

| Field | Type | Required | Description |
|---|---|---|---|
| schema_version | String | Required | The version of this schema (e.g. "2.0.0"). |
| generated_at | String (ISO 8601) | Required | Timestamp of when the diff was generated. |
| row_changes_cap_per_file | Integer | Required | Maximum number of row changes included per file in file_diffs. |
| base_feed | Object | Required | Information about the base (old) GTFS archive. |
| base_feed.source | String | Required | URL or local path to the base feed. |
| base_feed.downloaded_at | String (ISO 8601) | Required | When the base feed was downloaded. |
| new_feed | Object | Required | Information about the new GTFS archive. |
| new_feed.source | String | Required | URL or local path to the new feed. |
| new_feed.downloaded_at | String (ISO 8601) | Required | When the new feed was downloaded. |
| unsupported_files | Array | Required | List of files in the archives that are outside the supported scope. |
| unsupported_files[].file_name | String | Required | File name as it appears in the archive. |
| unsupported_files[].present_in | String. Enum: base, new, both | Required | Which archive(s) contain this file. |
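As an illustration, a metadata object with these fields could be assembled as shown below. This is a minimal sketch: the helper name `build_metadata`, the example URLs, and the timestamps are hypothetical and not part of the specification.

```python
from datetime import datetime, timezone


def build_metadata(base_feed, new_feed, unsupported_files, cap=50,
                   schema_version="2.0.0"):
    """Assemble a metadata object (hypothetical helper, not part of the spec)."""
    return {
        "schema_version": schema_version,
        # ISO 8601 timestamp in UTC for when the diff was generated.
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "row_changes_cap_per_file": cap,
        "base_feed": base_feed,
        "new_feed": new_feed,
        "unsupported_files": unsupported_files,
    }


# Example values are made up for illustration.
metadata = build_metadata(
    base_feed={"source": "https://example.com/base/gtfs.zip",
               "downloaded_at": "2024-01-01T00:00:00+00:00"},
    new_feed={"source": "https://example.com/new/gtfs.zip",
              "downloaded_at": "2024-02-01T00:00:00+00:00"},
    unsupported_files=[{"file_name": "readme.pdf", "present_in": "both"}],
)
```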
Fields of the summary object:

| Field | Type | Required | Description |
|---|---|---|---|
| total_changes | Integer | Required | Total number of changes across all files. |
| files_added | Integer | Required | Number of files added. |
| files_deleted | Integer | Required | Number of files deleted. |
| files_modified | Integer | Required | Number of files modified. |
| files | Array | Required | Per-file summary with true (uncapped) counts. |
| files[].file_name | String | Required | Name of the GTFS file. |
| files[].status | String. Enum: added, deleted, modified | Required | The file-level status. |
| files[].columns_added | Integer | Optional | Number of columns added. Present when > 0. |
| files[].columns_deleted | Integer | Optional | Number of columns deleted. Present when > 0. |
| files[].rows_added | Integer | Optional | True count of rows added. Present when > 0. |
| files[].rows_deleted | Integer | Optional | True count of rows deleted. Present when > 0. |
| files[].rows_modified | Integer | Optional | True count of rows modified. Present when > 0. |
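The "present when > 0" rule for the optional counts can be sketched as follows. The function names `summarize_file` and `count_statuses` are hypothetical; only the field names come from the table above.

```python
def summarize_file(file_name, status, **counts):
    """Build one summary.files[] entry, omitting optional counts that are 0.

    Hypothetical helper; `counts` may contain columns_added, columns_deleted,
    rows_added, rows_deleted, and rows_modified.
    """
    entry = {"file_name": file_name, "status": status}
    for key in ("columns_added", "columns_deleted",
                "rows_added", "rows_deleted", "rows_modified"):
        value = counts.get(key, 0)
        if value > 0:  # optional fields are present only when > 0
            entry[key] = value
    return entry


def count_statuses(entries):
    """Derive files_added / files_deleted / files_modified from the entries."""
    return {
        f"files_{status}": sum(1 for e in entries if e["status"] == status)
        for status in ("added", "deleted", "modified")
    }


stops = summarize_file("stops.txt", "modified", columns_deleted=1,
                       rows_added=2, rows_deleted=1, rows_modified=5)
shapes = summarize_file("shapes.txt", "added")
file_counts = count_statuses([stops, shapes])
```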
Each file_diffs[] entry represents one changed supported file.
| Field | Type | Required | Description |
|---|---|---|---|
| file_name | String | Required | Name of the GTFS file. |
| file_action | String. Enum: "added", "deleted", "modified" | Required | How this file changed between the two archives. |
| columns_added | Array of String | Required | List of column names added to this file. |
| columns_deleted | Array of String | Required | List of column names deleted from this file. |
| row_changes | Object | Conditionally required | Present when the file has row-level changes (i.e. file_action is "modified"). |
| row_changes.primary_key | Array of String | Required | Column(s) that uniquely identify a row in this file. |
| row_changes.columns | Array of String | Required | Union of all columns across both the base and new versions of the file. The order matches the base feed's original column order; columns only present in the new feed are appended at the end. |
| row_changes.added | Array | Required | Added rows (capped). Each entry has identifier (primary key values), raw_value (the CSV row from the new file, field order matching columns), and new_line_number (1-based line number in the new CSV file). |
| row_changes.deleted | Array | Required | Deleted rows (capped). Each entry has identifier (primary key values), raw_value (the CSV row from the base file, field order matching columns), and base_line_number (1-based line number in the base CSV file). |
| row_changes.modified | Array | Required | Modified rows (capped). Each entry has identifier (primary key values), raw_value (the base CSV row, field order matching columns), base_line_number (1-based line number in the base CSV file), new_line_number (1-based line number in the new CSV file), and field_changes (array of {field, base_value, new_value}). |
| truncated | Object | Optional | Present only when row changes exceed the cap. |
| truncated.is_truncated | Boolean | Required | Always true when present. |
| truncated.omitted_count | Integer | Required | Number of row changes omitted due to the cap. |
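To make the column ordering and the shape of a modified entry concrete, here is a minimal sketch. It assumes that identifier is an object mapping primary key columns to values, that raw_value is a CSV-encoded string, and that a column missing on one side is treated as an empty string; none of these details are fixed by the table above. The helper names are hypothetical.

```python
import csv
import io


def union_columns(base_columns, new_columns):
    """Base column order first, then columns only present in the new feed."""
    seen = set(base_columns)
    return list(base_columns) + [c for c in new_columns if c not in seen]


def encode_row(row, columns):
    """CSV-encode a row dict in `columns` order (missing values become "")."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow([row.get(c, "") for c in columns])
    return buffer.getvalue().rstrip("\r\n")


def modified_entry(primary_key, columns, base_row, new_row, base_line, new_line):
    """Build one row_changes.modified[] entry (assumed shapes, see lead-in)."""
    field_changes = [
        {"field": c, "base_value": base_row.get(c, ""), "new_value": new_row.get(c, "")}
        for c in columns
        if base_row.get(c, "") != new_row.get(c, "")
    ]
    return {
        "identifier": {k: base_row[k] for k in primary_key},
        "raw_value": encode_row(base_row, columns),  # the base CSV row
        "base_line_number": base_line,
        "new_line_number": new_line,
        "field_changes": field_changes,
    }


columns = union_columns(["stop_id", "stop_name", "stop_lat"],
                        ["stop_id", "wheelchair_boarding", "stop_name", "stop_lat"])
entry = modified_entry(
    ["stop_id"], ["stop_id", "stop_name", "stop_lat"],
    {"stop_id": "S1", "stop_name": "Main St", "stop_lat": "45.5"},
    {"stop_id": "S1", "stop_name": "Main Street", "stop_lat": "45.5"},
    base_line=2, new_line=2,
)
```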
The cap applies to the combined count of added + deleted + modified rows per file, in first-encountered order. A file with 30 added, 15 deleted, and 200 modified rows (245 total) hits the cap at 50 and reports omitted_count: 195.
File-level changes (file_action) and column-level changes (columns_added, columns_deleted) are not capped.
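The cap rule described above can be sketched as follows. This is an illustration only: `apply_cap` is a hypothetical helper, and it assumes the engine already holds the row changes as a list of `(kind, entry)` pairs in first-encountered order.

```python
def apply_cap(changes, cap=50):
    """Split capped changes by kind and compute the truncated object.

    `changes` is a list of (kind, entry) pairs in first-encountered order,
    where kind is "added", "deleted", or "modified".
    Returns (row_lists, truncated); truncated is None when under the cap.
    """
    row_lists = {"added": [], "deleted": [], "modified": []}
    for kind, entry in changes[:cap]:  # cap applies to the combined count
        row_lists[kind].append(entry)

    omitted = len(changes) - cap
    truncated = None
    if omitted > 0:
        truncated = {"is_truncated": True, "omitted_count": omitted}
    return row_lists, truncated


# The example from the text: 30 added, 15 deleted, 200 modified (245 total).
# The order in which the kinds are encountered here is an assumption.
changes = ([("added", {})] * 30) + ([("deleted", {})] * 15) + ([("modified", {})] * 200)
row_lists, truncated = apply_cap(changes)

small_lists, small_truncated = apply_cap([("added", {})] * 8)
```

With this ordering, the 50 retained changes are 30 added, 15 deleted, and 5 modified rows; a different encounter order would change which rows are kept but not the omitted count.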
When the diff engine encounters a file that is not in the supported scope:
- It is not added to file_diffs[].
- It is not counted in summary.total_changes or summary.files_*.
- It is listed in metadata.unsupported_files[] with file_name and present_in.
This keeps the diff focused on what was actually compared, while still surfacing skipped files so the UI can show a "files not diffed" section.
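The handling of out-of-scope files could look roughly like the sketch below. `SUPPORTED_FILES` is only an illustrative subset of the GTFS Schedule file names, not the full supported list, and `find_unsupported` is a hypothetical helper. The file names could come, for example, from `zipfile.ZipFile.namelist()` on each archive.

```python
# Illustrative subset only; the real supported scope is the full set of files
# defined in the official GTFS Schedule reference.
SUPPORTED_FILES = {
    "agency.txt", "stops.txt", "routes.txt", "trips.txt", "stop_times.txt",
    "calendar.txt", "calendar_dates.txt", "shapes.txt", "feed_info.txt",
}


def find_unsupported(base_names, new_names, supported=SUPPORTED_FILES):
    """Return metadata.unsupported_files[] entries for out-of-scope files."""
    base, new = set(base_names), set(new_names)
    entries = []
    for name in sorted((base | new) - set(supported)):
        if name in base and name in new:
            present_in = "both"
        elif name in base:
            present_in = "base"
        else:
            present_in = "new"
        entries.append({"file_name": name, "present_in": present_in})
    return entries


unsupported = find_unsupported(
    ["stops.txt", "readme.pdf"],
    ["stops.txt", "shapes.txt", "readme.pdf", "custom_notes.txt"],
)
```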
A formal JSON Schema for validation is available at json_schema/gtfs_diff_v2_schema.json.
See examples/example_output.json for a full example, and below for a walkthrough.
Given two GTFS archives where:
- shapes.txt was added as a new file
- stop_times.txt had 1213 row changes (120 added, 45 deleted, 1048 modified)
- stops.txt had a column deleted and 8 row changes (2 added, 1 deleted, 5 modified)
- readme.pdf and custom_notes.txt are non-GTFS files present in the archives
The v2 output will:
- Report readme.pdf and custom_notes.txt in metadata.unsupported_files
- Show true counts in summary (total_changes: 1213)
- Cap stop_times.txt row details to 50, reporting omitted_count: 1163
- Show all 8 stops.txt row changes (under the cap, no truncated field)
- Show shapes.txt as a file-level addition with no row_changes
The examples folder contains a complete example output.
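A consumer such as a UI can compare the rows shown in file_diffs with the true counts in summary. The sketch below uses a hand-built, heavily abbreviated document that mirrors the walkthrough; the row entries are empty placeholders, which of the 50 stop_times.txt rows are retained is an assumption, and `shown_vs_true` is a hypothetical helper.

```python
ROW_KEYS = ("added", "deleted", "modified")


def shown_vs_true(diff):
    """Map file_name -> (rows shown in file_diffs, true row count in summary)."""
    true_counts = {
        f["file_name"]: sum(f.get(f"rows_{k}", 0) for k in ROW_KEYS)
        for f in diff["summary"]["files"]
    }
    report = {}
    for file_diff in diff["file_diffs"]:
        row_changes = file_diff.get("row_changes", {})  # absent for added files
        shown = sum(len(row_changes.get(k, [])) for k in ROW_KEYS)
        report[file_diff["file_name"]] = (shown, true_counts.get(file_diff["file_name"], 0))
    return report


# Abbreviated stand-in for the walkthrough output (placeholder row entries).
diff = {
    "summary": {"files": [
        {"file_name": "shapes.txt", "status": "added"},
        {"file_name": "stop_times.txt", "status": "modified",
         "rows_added": 120, "rows_deleted": 45, "rows_modified": 1048},
        {"file_name": "stops.txt", "status": "modified", "columns_deleted": 1,
         "rows_added": 2, "rows_deleted": 1, "rows_modified": 5},
    ]},
    "file_diffs": [
        {"file_name": "shapes.txt", "file_action": "added"},
        {"file_name": "stop_times.txt", "file_action": "modified",
         "row_changes": {"added": [{}] * 50, "deleted": [], "modified": []},
         "truncated": {"is_truncated": True, "omitted_count": 1163}},
        {"file_name": "stops.txt", "file_action": "modified",
         "row_changes": {"added": [{}] * 2, "deleted": [{}], "modified": [{}] * 5}},
    ],
}

report = shown_vs_true(diff)
```

For a truncated file, the shown count plus truncated.omitted_count should equal the true count (50 + 1163 = 1213 for stop_times.txt).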