We have written a fair amount at Ars recently about the superiority of the European forecast model, suggesting to readers that they focus on the ensemble runs of this system to get a good handle on track forecasts for Hurricane Irma. Then we checked out some of the preliminary data on model performance during this major hurricane, and it was truly eye-opening.

Brian Tang, an atmospheric scientist at the University at Albany, tabulates data on "mean absolute error": the average distance between where a storm's center actually was at a given time and where a model had forecast it to be at that time. Hurricane Irma has been a thing for about a week now, so we have started to get a decent sample size (at least 10 model runs) to assess performance.
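
For readers who want to see how such a number comes together, here is a minimal sketch in Python. It is not Tang's code. It assumes each forecast is reduced to a lead time, a valid time, and a forecast latitude and longitude, and that the storm's observed center is known for each valid time. The function names, the input layout, and the use of the haversine great-circle formula are illustrative choices of ours.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_position_error(forecasts, observed):
    """Average position error in km, grouped by forecast lead time.

    forecasts: iterable of (lead_hours, valid_time, lat, lon) tuples
               (hypothetical layout, one tuple per forecast point).
    observed:  dict mapping valid_time -> (lat, lon) of the actual center.
    Forecasts whose valid time has no observation yet are skipped.
    """
    errors = defaultdict(list)
    for lead_hours, valid_time, lat, lon in forecasts:
        if valid_time not in observed:
            continue
        obs_lat, obs_lon = observed[valid_time]
        errors[lead_hours].append(great_circle_km(lat, lon, obs_lat, obs_lon))
    return {lead: sum(vals) / len(vals) for lead, vals in errors.items()}
```

Each model run contributes one distance per lead time, which is why a week's worth of runs is needed before the five-day averages mean much.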

The model data

The chart below is extremely busy, but when you understand how to read it, the data is striking. It shows the average position error (in kilometers) at forecast lead times of 12, 24, 48, 72, 96, and 120 hours (so, out to five days). It compares several different classes of models, including global models that forecast conditions around the planet, nested models focused on hurricanes, and consensus forecasts. Specifically, the models are referenced as follows:

AVNO: US Global Forecast System, or GFS. The premier US global model

CMC: Canadian global model

UKM: UK Met Office global model

ECMWF: European global model

NGX: US Navy global model

HMON: New, experimental US hurricane model

HWRF: Operational US hurricane model

TVCN: Consensus model, essentially an average of the global models

OFCL: Official National Hurricane Center forecast

Forecast models typically show their skill with three-, four-, and five-day forecasts. For simplicity's sake, we will focus on 120-hour forecasts. At this lead time, the average error of the European model with respect to Irma has been about 175km in its position forecast. The next best forecast is from the hurricane center, with an average error of slightly more than 300km. An automated model, then, has so far beaten human forecasters at the National Hurricane Center (who have all of this model data to look at) by a wide margin. That's pretty astounding.

What is particularly embarrassing for NOAA, however, is the comparison between the European model and the various US forecast modeling efforts. The average 120-hour error of the GFS model is about 475km. The operational, hurricane-specific model, HWRF, does better, with an average error of 325km. But the experimental HMON model does terribly, at nearly 550km of error. A similar disparity in quality goes all the way down to 24-hour forecasts.
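
To put those gaps on a common scale, the short Python sketch below divides each 120-hour error by the European model's. The figures are the rounded values quoted above ("about," "nearly," "slightly more than"), not the exact numbers behind the chart, and the official forecast is entered as 300km, its stated lower bound.

```python
# Approximate 120-hour average track errors for Irma, in km, as quoted in
# the text. These are rounded readings, not the chart's underlying data.
errors_km = {
    "ECMWF": 175,  # European global model
    "OFCL": 300,   # official NHC forecast ("slightly more than 300km")
    "HWRF": 325,   # operational US hurricane model
    "AVNO": 475,   # GFS
    "HMON": 550,   # experimental US hurricane model ("nearly 550km")
}

baseline = errors_km["ECMWF"]
ratios = {name: err / baseline for name, err in errors_km.items()}

# Print models from smallest to largest error relative to ECMWF.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name:6s} {errors_km[name]:4d} km  {ratio:.2f}x the ECMWF error")
```

On these rounded figures, the GFS error at five days is roughly 2.7 times the European model's, and the HMON error is a little more than three times as large.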

Another method of determining track accuracy is by looking at trend maps, which show a time series of tracks. In the gallery below, you can gauge the consistency of forecast models and their accuracy based upon the actual track of the storm.

Brian Tang/U Albany

Why the US lags

So what's the deal here? The overall performance of the National Weather Service's GFS model has lagged for years behind the European forecast system, which is backed by superior resources and computing power. This year, the GFS was finally upgraded. However, even before those upgrades went into effect, hurricane forecasters were raising concerns about the new GFS.

Shortly before the beginning of the 2017 Atlantic hurricane season, in fact, forecasters at the National Hurricane Center in Miami pushed back against the upgrade. They had noted degraded performance during internal tests of the GFS model on Atlantic tropical cyclones. The track forecasts were about 10 percent worse with the newer version of the model than the older one.

In a presentation posted on the National Weather Service website, first reported by Mashable, the hurricane center officials said, "The loss of short- to medium-range [tropical cyclone] track and intensity forecast skill for the Atlantic basin in the proposed 2017 GFS is unacceptable to the National Hurricane Center." Ultimately, the upgrade was initiated anyway.

An independent expert on global forecast models, Ryan Maue, said the NOAA office responsible for developing US computer models, the National Centers for Environmental Prediction, is understaffed and has less funding than the European forecasting center, which is based in the United Kingdom. America, he said, is getting what it pays for.

"NOAA and the National Weather Service are stretched a mile [wide] and an inch deep in some places for all of the responsibilities that they have," said Maue, a research meteorologist at the Cato Institute. "If we want to focus on having the best weather forecast in the world, we should focus on having the best weather forecast."