Thunder Citizen — Transit data export
Date range: 2026-06-01 to 2026-06-07 (service dates, America/Thunder_Bay)
Source: unofficial, derived from observing Thunder Bay Transit's GTFS feeds.

These files are the minimum set to reproduce — or redefine — every metric on
the Metrics tab in a spreadsheet. All counts are raw; percentages are never
stored, so you sum the counts and divide once yourself.

FILES
-----
metrics_chunks.csv
    Pre-rolled aggregates: one row per route x service-day x 6-hour band.
    Columns hold raw counts (trip_count, on_time_count, scheduled_count,
    cancelled_count, no_notice_count) and SUM-stable headway sums
    (headway_count, headway_sum_sec, headway_sum_sec_sq, sched_headway_sec).
    This is the "answer key": aggregate it across rows for any KPI.
      OTP            = sum(on_time_count) / sum(trip_count)
      Cancel rate    = sum(cancelled_count) / sum(scheduled_count)
      Headway CV     = sqrt(sum(headway_sum_sec_sq)/sum(headway_count)
                           - (sum(headway_sum_sec)/sum(headway_count))^2)
                       / (sum(headway_sum_sec)/sum(headway_count)),
                       with sums taken per route, then averaged across routes.
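
    The three formulas above can be sketched in Python as follows. This is
    a minimal illustration, not the code behind the Metrics tab. It assumes
    the rows are dicts keyed by column name, that the route column is named
    "route_id" (not confirmed by this README), and that no count is blank.

```python
import math
from collections import defaultdict


def kpis(rows):
    """Aggregate metrics_chunks rows (dicts) into OTP, cancel rate, headway CV."""
    on_time = trips = cancelled = scheduled = 0
    # Per-route running sums: [headway_count, headway_sum_sec, headway_sum_sec_sq]
    per_route = defaultdict(lambda: [0.0, 0.0, 0.0])
    for r in rows:
        on_time += int(r["on_time_count"])
        trips += int(r["trip_count"])
        cancelled += int(r["cancelled_count"])
        scheduled += int(r["scheduled_count"])
        acc = per_route[r["route_id"]]  # "route_id" is an assumed column name
        acc[0] += float(r["headway_count"])
        acc[1] += float(r["headway_sum_sec"])
        acc[2] += float(r["headway_sum_sec_sq"])

    cvs = []
    for n, s, sq in per_route.values():
        if n == 0 or s == 0:
            continue  # no observed headways for this route
        mean = s / n
        var = max(sq / n - mean * mean, 0.0)  # clamp float round-off
        cvs.append(math.sqrt(var) / mean)

    return {
        "otp": on_time / trips if trips else None,
        "cancel_rate": cancelled / scheduled if scheduled else None,
        "headway_cv": sum(cvs) / len(cvs) if cvs else None,
    }
```

    Rows read with csv.DictReader from metrics_chunks.csv can be passed in
    directly. Filter the rows first (by route, date, or band) to get the
    same KPIs for any slice.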

timepoint_stop_events.csv
    The raw layer behind OTP. One row per trip per timepoint stop (timepoint
    membership already applied). delay_sec is arrival_delay, or
    departure_delay when arrival_delay is missing. on_time is OUR
    classification: delay_sec within [-60, 300] seconds (1 min early to
    5 min late). To use your own definition, ignore on_time and threshold
    delay_sec however you like.
    Our official OTP is per-TRIP: average delay_sec across a trip's timepoint
    stops, then apply the window — group by (date, trip_id) to reproduce it.
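
    The per-trip calculation can be sketched as below, with the window
    exposed as parameters so you can swap in your own definition. This is
    a minimal sketch: it assumes rows are dicts keyed by column name, that
    the window bounds are inclusive, and that a missing delay appears as an
    empty string (that encoding is an assumption).

```python
from collections import defaultdict


def per_trip_otp(events, early_sec=-60, late_sec=300):
    """Share of trips whose mean timepoint delay falls inside the window."""
    delays = defaultdict(list)
    for e in events:
        if e["delay_sec"] in ("", None):
            continue  # skip stops with no delay observation (assumed encoding)
        delays[(e["date"], e["trip_id"])].append(float(e["delay_sec"]))

    if not delays:
        return None
    on_time = sum(
        1 for d in delays.values()
        if early_sec <= sum(d) / len(d) <= late_sec
    )
    return on_time / len(delays)
```

    For example, per_trip_otp(rows, early_sec=0, late_sec=180) scores the
    same trips against a stricter 0 to 3 minute window.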

cancellations.csv
    One row per cancelled trip (trip_id + start_date). We observe these on
    every feed poll while the trip stays cancelled; rows are de-duplicated to
    one per trip, with first_seen / last_seen (when we first/last saw it) and
    poll_count (how many polls reported it). Notice lead = scheduled departure
    minus first_seen.
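
    Notice lead can be computed as in the sketch below. It assumes both
    values are ISO 8601 strings that carry a UTC offset. The parameter name
    scheduled_departure is hypothetical: this README does not name the
    column that holds the scheduled departure time.

```python
from datetime import datetime


def notice_lead_sec(scheduled_departure, first_seen):
    """Seconds between first_seen and departure; positive = advance notice."""
    dep = datetime.fromisoformat(scheduled_departure)
    seen = datetime.fromisoformat(first_seen)
    # Both values must be timezone-aware; mixing aware and naive raises TypeError.
    return (dep - seen).total_seconds()
```

    A negative result means the cancellation was first observed after the
    trip was already scheduled to have departed.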

alerts.csv
    One row per service alert (alert_id), de-duplicated the same way with
    first_seen / last_seen / poll_count. Content columns (header, description,
    affected_routes, ...) reflect the most recent poll.

NOTES
-----
- Delay columns (delay_sec and *_delay) are in seconds (negative = early).
- first_seen / last_seen are timestamps with timezone (UTC offset shown).
- Downloads are capped at one year per request.
- Raw vehicle GPS positions are not included: that feed is polled every six
  seconds and runs to gigabytes per year.
