Methodology

How we measure flight reliability

OnTimeStats turns public U.S. flight records into clear reliability figures. Here is exactly what every number on the site means and where it comes from.

Definition

What "on-time" means

A flight is on-time when it arrives less than 15 minutes after its scheduled gate arrival. This is the U.S. Department of Transportation’s standard measure: the ArrDel15 flag marks arrivals delayed 15 minutes or more. A flight 14 minutes late counts as on-time; a flight 15 or more minutes late does not.
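
As a minimal sketch of that rule, the snippet below classifies arrival delays and computes an on-time share. The function names and the sample delays are illustrative assumptions, not the site's actual code. Treating a delay of exactly 15 minutes as late follows the ArrDel15 definition (delayed 15 minutes or more).

```python
ON_TIME_THRESHOLD_MIN = 15  # ArrDel15 marks arrivals delayed 15 minutes or more


def is_on_time(arrival_delay_min: float) -> bool:
    """True when the flight reached the gate fewer than 15 minutes late.

    arrival_delay_min is actual minus scheduled gate arrival, in minutes;
    negative values mean the flight arrived early.
    """
    return arrival_delay_min < ON_TIME_THRESHOLD_MIN


def on_time_share(arrival_delays_min: list[float]) -> float:
    """Fraction of the listed flights that count as on-time."""
    if not arrival_delays_min:
        raise ValueError("need at least one flight")
    on_time = sum(is_on_time(d) for d in arrival_delays_min)
    return on_time / len(arrival_delays_min)


# Illustrative delays only, not real flight records.
sample_delays = [-5, 0, 14, 15, 16, 40]
print(on_time_share(sample_delays))
```

In this sample the early flight, the on-schedule flight, and the 14-minute delay count as on-time; the 15-, 16-, and 40-minute delays do not.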

Source

Where the data comes from

Every figure is computed from the U.S. DOT Bureau of Transportation Statistics (BTS) On-Time Performance dataset — the same records U.S. airlines are required to report for domestic flights. This is public data; we aggregate it, we do not estimate or model it.

Window

The time period covered

Figures currently cover flights from 2000 to 2025. Every page states the window and the number of flights behind its numbers so you always know how much data a figure rests on.

Confidence

Sample size and low-data airports

Reliability over a handful of flights is noise, not signal. We show the flight count behind each figure, and rankings such as the best/worst on-time list only include airports with enough traffic to be meaningful. Small, low-traffic airports are not ranked against major hubs.
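
The traffic filter behind a ranking could look roughly like the sketch below. The cut-off value, the `AirportStats` type, and the airport codes are hypothetical; the site's actual threshold is not stated here.

```python
from dataclasses import dataclass

# Hypothetical cut-off: the real minimum flight count is not published in this text.
MIN_FLIGHTS_FOR_RANKING = 1_000


@dataclass
class AirportStats:
    code: str      # airport code (placeholder values below)
    flights: int   # flights behind the figure
    on_time: int   # flights that counted as on-time

    @property
    def on_time_rate(self) -> float:
        return self.on_time / self.flights if self.flights else 0.0


def rank_airports(stats, min_flights=MIN_FLIGHTS_FOR_RANKING):
    """Rank best to worst, skipping airports with too little traffic."""
    eligible = [s for s in stats if s.flights >= min_flights]
    return sorted(eligible, key=lambda s: s.on_time_rate, reverse=True)


# Made-up airports: "BBB" has a perfect record, but over only 12 flights.
sample = [
    AirportStats("AAA", 50_000, 41_000),
    AirportStats("BBB", 12, 12),
    AirportStats("CCC", 8_000, 6_000),
]
for airport in rank_airports(sample):
    print(airport.code, airport.flights, round(airport.on_time_rate, 3))
```

The low-traffic airport is left out of the ranking rather than placed at the top on the strength of a dozen flights.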

Tail risk

The "bad day" delay (p90)

Averages hide the flights that ruin a trip. The 90th-percentile delay (p90) is the delay that 9 in 10 flights come in at or under, which makes it a realistic picture of a bad day. A route can have a great average and still have a punishing p90.
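
To illustrate how an average and a p90 can diverge, here is a small sketch using the nearest-rank percentile method. The choice of method is an assumption (the site may interpolate differently), and the delays are made up.

```python
import math


def percentile_nearest_rank(values: list[float], pct: int) -> float:
    """Smallest value such that at least pct percent of values are <= it."""
    if not values:
        raise ValueError("need at least one value")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Illustrative arrival delays in minutes (negative = early), not real data.
sample_delays = [-10, -8, -5, -5, -3, 0, 0, 2, 45, 90]

mean_delay = sum(sample_delays) / len(sample_delays)
p90_delay = percentile_nearest_rank(sample_delays, 90)
print(f"mean: {mean_delay:.1f} min, p90: {p90_delay} min")
```

Eight of the ten flights arrive within two minutes of schedule, yet the two long delays are what the p90 brings into view.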

Airlines

How we handle airline mergers

An airline’s headline figures use only its own DOT-coded flights. When a carrier absorbed a predecessor, the predecessor is shown separately and clearly labeled — we never retroactively credit one airline with another’s history.

Predictor

Two time windows on this site

Headline reliability figures use the full history available. The predictor and connection tools use only the most recent ~3 years, for two reasons: they need day-of-week and time-of-day detail that the long history doesn’t carry, and recent flying is the better guide to what happens next. That’s why a prediction can differ from a route’s all-time average.

Predictor

When your exact slice is thin

If your exact slice — airline, month, day, time of day — has too few flights to be reliable, the predictor generalizes one step at a time: it drops day-of-week first, then month, then time of day, and keeps the airline longest, ending at the all-airline figure for the route. It always tells you which slice actually answered. If you choose weekday or weekend, that choice is kept longest.
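
The fallback order described above can be sketched as follows. The field names, the record layout, and the minimum slice size are assumptions made for illustration, and the sketch does not model the weekday/weekend exception.

```python
# Hypothetical minimum; the real threshold is not stated in this text.
MIN_SLICE_FLIGHTS = 30

# Filters are dropped in this order: day-of-week first, airline last.
DROP_ORDER = ["day_of_week", "month", "time_of_day", "airline"]


def pick_slice(flights, query, min_flights=MIN_SLICE_FLIGHTS):
    """Return (matching flights, filters still applied) for one route.

    flights: list of dicts with the DROP_ORDER keys, all on the same route.
    query:   dict of filter name -> requested value.
    """
    active = {k: query[k] for k in DROP_ORDER if k in query}
    to_drop = list(DROP_ORDER)
    while True:
        matched = [f for f in flights
                   if all(f[k] == v for k, v in active.items())]
        # Stop when the slice is big enough, or nothing is left to drop
        # (the all-airline figure for the route).
        if len(matched) >= min_flights or not active:
            return matched, tuple(k for k in DROP_ORDER if k in active)
        # Generalize one step: remove the next filter that is still active.
        while to_drop:
            field = to_drop.pop(0)
            if field in active:
                del active[field]
                break


# Made-up route history: 40 flights on airline "XX", 5 on airline "YY".
history = [{"airline": "XX", "month": 7, "day_of_week": i % 7 + 1,
            "time_of_day": "morning"} for i in range(40)]
history += [{"airline": "YY", "month": 7, "day_of_week": 1,
             "time_of_day": "evening"} for _ in range(5)]

query = {"airline": "XX", "month": 7, "day_of_week": 2, "time_of_day": "morning"}
matched, used = pick_slice(history, query)
print(len(matched), used)
```

Returning the filters that were still applied is what lets the caller report which slice actually answered.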

Connection

Time needed to change flights

We don’t have each airline’s official minimum connection time, so we start from a sensible estimate based on the airport’s size — and let you adjust it. It assumes a domestic-to-domestic, same-terminal connection; international arrivals, terminal changes, or customs need more time. You know your situation best, so changing this number makes the estimate more honest, not less.

Connection

Cancelled and diverted flights count as missed

If your inbound flight is cancelled or diverted you can’t make the connection — and the odds we show already include that chance. The make-it probability is “out of all the times this flight was scheduled,” not “assuming it operates.”
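
A minimal sketch of that denominator choice is shown below. It assumes the onward flight leaves exactly on schedule and uses `None` to stand for a cancelled or diverted inbound flight; the function names and sample values are hypothetical.

```python
from typing import Optional


def make_it_probability(inbound_delays_min: list[Optional[float]],
                        layover_min: float,
                        min_connect_min: float) -> float:
    """Share of ALL scheduled inbound flights that left enough time to connect.

    None marks a cancelled or diverted flight, which always counts as a miss.
    """
    if not inbound_delays_min:
        raise ValueError("need at least one scheduled flight")
    made = sum(
        1 for d in inbound_delays_min
        if d is not None and layover_min - d >= min_connect_min
    )
    return made / len(inbound_delays_min)


def make_it_if_operated(inbound_delays_min, layover_min, min_connect_min):
    """The rosier figure that ignores cancellations, shown for contrast only."""
    operated = [d for d in inbound_delays_min if d is not None]
    return make_it_probability(operated, layover_min, min_connect_min)


# Illustrative history: arrival delays in minutes, None = cancelled or diverted.
history = [0, 10, 40, None, -5, 25, None, 5, 15, 60]

print(make_it_probability(history, layover_min=60, min_connect_min=40))
print(make_it_if_operated(history, layover_min=60, min_connect_min=40))
```

With a 60-minute layover and a 40-minute connection time, any delay above 20 minutes is a miss. Counting the two cancelled flights in the denominator gives the lower of the two figures.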

Connection

A deliberately conservative estimate

The make-it odds assume your onward flight leaves exactly on time. In reality, if it’s also delayed you get more time to connect, not less — so your real chances are usually equal to or better than what we show. We’d rather under-promise than over-promise on a tight connection.

Uncertainty

Two different kinds of uncertainty

We separate how much flights vary from how confident we are. Variation is the percentile band (“most flights landed between X and Y”), which reflects real flight-to-flight spread; that is why we never show a misleading plus-or-minus average. Confidence is the sample size and the low-data flag: a slice with few flights gives a shakier estimate, and we say so.
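
One way to keep the two ideas apart in a single summary is sketched below. The 10th-to-90th percentile band, the nearest-rank method, and the low-data cut-off are all assumptions chosen for illustration.

```python
import math

LOW_DATA_BELOW = 30  # hypothetical cut-off for the low-data flag


def nearest_rank(ordered: list[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(delays_min: list[float], low_data_below: int = LOW_DATA_BELOW) -> dict:
    """Report variation (percentile band) separately from confidence (sample size)."""
    if not delays_min:
        raise ValueError("need at least one flight")
    ordered = sorted(delays_min)
    return {
        # Variation: where most flights landed.
        "band_low": nearest_rank(ordered, 10),
        "band_high": nearest_rank(ordered, 90),
        # Confidence: how much data the band rests on.
        "flights": len(ordered),
        "low_data": len(ordered) < low_data_below,
    }


# Illustrative inputs only.
print(summarize(list(range(1, 101))))
print(summarize([5, 50, 200]))
```

The second summary has a wide band and a raised low-data flag, so both the spread and the shakiness of the estimate are visible instead of being folded into one number.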