Methodology
How we measure flight reliability
OnTimeStats turns public U.S. flight records into clear reliability figures. Here is exactly what every number on the site means and where it comes from.
Definition
What "on-time" means
A flight is on-time when it arrives less than 15 minutes after its scheduled gate arrival. This is the U.S. Department of Transportation’s standard measure (the ArrDel15 flag, which marks arrivals 15 or more minutes late). A flight 14 minutes late counts as on-time; 16 minutes late does not.
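As a minimal sketch, the rule above can be written as a single comparison. The function name is illustrative, not part of any published code, and the treatment of a delay of exactly 15 minutes as late is an assumption based on the ArrDel15 field marking delays of 15 minutes or more.

```python
def is_on_time(arrival_delay_minutes: float) -> bool:
    """Return True when a flight arrived fewer than 15 minutes late.

    Negative values mean the flight arrived early.
    Assumption: a delay of exactly 15 minutes counts as late,
    matching the ArrDel15 flag (15 minutes or more).
    """
    return arrival_delay_minutes < 15


# Hypothetical arrival delays in minutes, for illustration only.
for delay in (-5, 14, 16):
    label = "on-time" if is_on_time(delay) else "late"
    print(f"{delay} min: {label}")
```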
Source
Where the data comes from
Every figure is computed from the U.S. DOT Bureau of Transportation Statistics (BTS) On-Time Performance dataset — the same records U.S. airlines are required to report for domestic flights. This is public data; we aggregate it, we do not estimate or model it.
Window
The time period covered
Figures currently cover flights from 2000–2025. Every page states the window and the number of flights behind its numbers so you always know how much data a figure rests on.
Confidence
Sample size and low-data airports
Reliability over a handful of flights is noise, not signal. We show the flight count behind each figure, and rankings such as the best/worst on-time list only include airports with enough traffic to be meaningful. Small, low-traffic airports are not ranked against major hubs.
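The sketch below shows one way such a traffic cutoff could work. The threshold of 1,000 flights, the `AirportStats` record, and the airport codes are all hypothetical; the site's actual cutoff is not stated here.

```python
from dataclasses import dataclass
from typing import List

MIN_FLIGHTS_FOR_RANKING = 1000  # hypothetical cutoff, not the site's real value


@dataclass
class AirportStats:
    code: str
    flights: int   # total flights in the window
    on_time: int   # flights that arrived on time


def rank_airports(stats: List[AirportStats],
                  min_flights: int = MIN_FLIGHTS_FOR_RANKING) -> List[AirportStats]:
    """Rank airports by on-time share, skipping those with too little traffic."""
    eligible = [s for s in stats if s.flights >= min_flights]
    return sorted(eligible, key=lambda s: s.on_time / s.flights, reverse=True)


# Made-up airports: TINY has a perfect record over 10 flights but is not ranked.
sample = [
    AirportStats("BIG", 50_000, 41_000),
    AirportStats("HUB", 80_000, 60_000),
    AirportStats("TINY", 10, 10),
]
for airport in rank_airports(sample):
    print(airport.code, airport.flights, f"{airport.on_time / airport.flights:.1%}")
```

The flight count stays attached to each record, so it can be shown next to the figure it supports.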
Tail risk
The "bad day" delay (p90)
Averages hide the flights that ruin a trip. The 90th-percentile delay (p90) is the figure that 9 in 10 flights beat, which gives a realistic picture of a bad day. A route can have a great average and still have a punishing p90.
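To make the contrast concrete, here is a small sketch that computes a p90 with the nearest-rank method. The choice of nearest-rank is an assumption (other percentile definitions interpolate between values), and the delays are made up for illustration.

```python
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Smallest value that at least pct percent of the data is at or below."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    # Ceiling of pct * n / 100 using integer arithmetic to avoid float rounding.
    rank = max((pct * len(ordered) + 99) // 100, 1)
    return ordered[rank - 1]


# Ten made-up arrival delays in minutes: mostly fine, two bad days.
delays = [-5, -3, 0, 0, 1, 2, 3, 4, 60, 120]
typical = percentile_nearest_rank(delays, 50)
bad_day = percentile_nearest_rank(delays, 90)
print(f"median: {typical} min, p90: {bad_day} min")
```

Here the median delay is a single minute while the p90 is an hour, which is the gap the p90 figure is meant to expose.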
Airlines
How we handle airline mergers
An airline’s headline figures use only its own DOT-coded flights. When a carrier absorbed a predecessor, the predecessor is shown separately and clearly labeled — we never retroactively credit one airline with another’s history.
Predictor
Two time windows on this site
Headline reliability figures use the full history available. The predictor and connection tools use only the most recent ~3 years, because they need day-of-week and time-of-day detail that the long history doesn’t carry, and because recent flying is the better guide to what happens next. That’s why a prediction can differ from a route’s all-time average.
Predictor
When your exact slice is thin
If your exact slice — airline, month, day, time of day — has too few flights to be reliable, the predictor generalizes one step at a time: it drops day-of-week first, then month, then time of day, and keeps the airline longest, ending at the all-airline figure for the route. It always tells you which slice actually answered. If you choose weekday or weekend, that choice is kept longest.
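The fallback order described above can be sketched as a loop that drops one dimension at a time. Everything named here is hypothetical: the 30-flight minimum, the dictionary keys, and the `count_flights` callback stand in for whatever the real predictor uses, and the weekday/weekend special case is not modeled.

```python
from typing import Callable, Dict

# Dimensions in the order they are dropped: day of week first, airline last.
DROP_ORDER = ["day_of_week", "month", "time_of_day", "airline"]
MIN_FLIGHTS = 30  # hypothetical minimum for a slice to count as reliable


def pick_slice(query: Dict[str, object],
               count_flights: Callable[[Dict[str, object]], int],
               min_flights: int = MIN_FLIGHTS) -> Dict[str, object]:
    """Return the most specific slice that has enough flights behind it."""
    current = dict(query)
    for dimension in DROP_ORDER:
        if count_flights(current) >= min_flights:
            return current
        current.pop(dimension, None)  # generalize by one step
    return current  # empty dict: the all-airline figure for the route


def fake_count(slice_: Dict[str, object]) -> int:
    """Made-up flight counts keyed by how many dimensions remain."""
    return {4: 3, 3: 12, 2: 45, 1: 400, 0: 5000}[len(slice_)]


query = {"airline": "AA", "month": 7, "day_of_week": "Fri", "time_of_day": "evening"}
answered_by = pick_slice(query, fake_count)
print("answered by slice:", answered_by)
```

Returning the slice itself, rather than only the figure, is what lets the result report which slice actually answered.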
Connection
Time needed to change flights
We don’t have each airline’s official minimum connection time, so we start from a sensible estimate based on the airport’s size — and let you adjust it. It assumes a domestic-to-domestic, same-terminal connection; international arrivals, terminal changes, or customs need more time. You know your situation best, so changing this number makes the estimate more honest, not less.
Connection
Cancelled and diverted flights count as missed
If your inbound flight is cancelled or diverted you can’t make the connection — and the odds we show already include that chance. The make-it probability is “out of all the times this flight was scheduled,” not “assuming it operates.”
Connection
A deliberately conservative estimate
The make-it odds assume your onward flight leaves exactly on time. In reality, if it’s also delayed you get more time to connect, not less — so your real chances are usually equal to or better than what we show. We’d rather under-promise than over-promise on a tight connection.
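A minimal sketch of how these connection rules combine, under stated assumptions: the record type, function name, and sample numbers are hypothetical, and a cancelled or diverted inbound flight is represented by a missing delay value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InboundRecord:
    # Arrival delay in minutes; None when the flight was cancelled or diverted.
    arrival_delay_minutes: Optional[float]


def make_it_probability(records: List[InboundRecord],
                        layover_minutes: float,
                        min_connection_minutes: float) -> float:
    """Share of all scheduled inbound flights that would have made the connection.

    Assumes the onward flight leaves exactly on time, so the only slack is
    the scheduled layover minus the time needed to change flights.
    """
    if not records:
        raise ValueError("need at least one scheduled flight")
    slack = layover_minutes - min_connection_minutes
    made = sum(
        1 for r in records
        if r.arrival_delay_minutes is not None and r.arrival_delay_minutes <= slack
    )
    # Denominator is every scheduled flight, so cancellations count as misses.
    return made / len(records)


# Made-up history: two usable arrivals, one long delay, one cancellation.
history = [InboundRecord(0), InboundRecord(10), InboundRecord(50), InboundRecord(None)]
odds = make_it_probability(history, layover_minutes=60, min_connection_minutes=40)
print(f"make-it odds: {odds:.0%}")
```

Raising `min_connection_minutes` shrinks the slack and lowers the odds, which is the effect of adjusting the connection-time estimate.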
Uncertainty
Two different kinds of uncertainty
We separate how much flights vary from how confident we are. Variation is the percentile band (“most flights landed between X and Y”): real flight-to-flight spread, which is why we never show a misleading plus-or-minus average. Confidence is the sample size and the low-data flag: a slice with few flights gives a shakier estimate, and we say so.
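The two ideas can be kept apart in code as two separate outputs. This is a sketch only: the 10th-to-90th percentile band, the 30-flight low-data cutoff, and the function name are assumptions, and the percentiles come from the standard library's `statistics.quantiles`, which interpolates between values.

```python
import statistics
from typing import Dict, Sequence

LOW_DATA_CUTOFF = 30  # hypothetical: fewer flights than this gets flagged


def summarize_slice(delays: Sequence[float]) -> Dict[str, object]:
    """Report spread (percentile band) and confidence (sample size) separately."""
    if len(delays) < 2:
        raise ValueError("need at least two flights to form a band")
    cuts = statistics.quantiles(delays, n=10)  # nine cut points: p10 ... p90
    return {
        "band": (cuts[0], cuts[-1]),               # variation: where most flights landed
        "flights": len(delays),                    # confidence: how much data there is
        "low_data": len(delays) < LOW_DATA_CUTOFF,
    }


# Made-up delays in minutes for a well-sampled slice.
summary = summarize_slice(list(range(1, 100)))
print(summary)
```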