As one of the first global coupled climate models to simulate and predict category 4 and 5 (Saffir–Simpson scale) tropical cyclones (TCs) and their interannual variations, the High-Resolution Forecast-Oriented Low Ocean Resolution (HiFLOR) model at the Geophysical Fluid Dynamics Laboratory (GFDL) represents a novel source of insight on how the entire TC intensification distribution could be transformed because of climate change. In this study, three 70-yr HiFLOR experiments are performed to identify the effects of climate change on TC intensity and intensification. For each of the experiments, sea surface temperature (SST) is nudged to different climatological targets and atmospheric radiative forcing is specified, allowing us to explore the sensitivity of TCs to these conditions. First, a control experiment, which uses prescribed climatological ocean and radiative forcing based on observations during the years 1986–2005, is compared to two observational records and evaluated for its ability to capture the mean TC behavior during these years. The simulated intensification distributions as well as the percentage of TCs that become major hurricanes show similarities with observations. The control experiment is then compared to two twenty-first-century experiments, in which the climatological SSTs from the control experiment are perturbed by multimodel projected SST anomalies and atmospheric radiative forcing from either 2016–35 or 2081–2100 (RCP4.5 scenario). The frequency, intensity, and intensification distribution of TCs all shift to higher values as the twenty-first century progresses. HiFLOR’s unique response to climate change and fidelity in simulating the present climate lay the groundwork for future studies involving models of this type.

Section 2 provides more details on the methodology of each experiment along with a description of the HiFLOR model and observational data sources. The next two sections are composed of the analyses of the HiFLOR experiments. We first compare the mean global TC intensity behavior observed for the years 1986–2005 to the HiFLOR CTL simulation, and then our analysis pivots toward understanding how the intensity and intensification distribution of TCs will evolve during the twenty-first century. The final section of the paper includes a summary of the results and a discussion of future research inspired by this study.

Motivated by HiFLOR’s success at capturing TC intensity behavior and structure, this study explores whether HiFLOR can also recover the intensification distribution of TCs and how climate change can affect that distribution. Three 70-yr HiFLOR experiments (introduced in van der Wiel et al. 2017) are performed to identify the effects of climate change on TC intensity characteristics. The “control” (CTL) experiment aims to represent the observed climate during the period 1986–2005, while the “early” and “late” experiments respectively project the climate during 2016–35 and 2081–2100 under the RCP4.5 scenario. Two observational datasets are used to validate the HiFLOR CTL experiment: the International Best Track Archive for Climate Stewardship (IBTrACS) produced by the National Hurricane Center (NHC) and the Joint Typhoon Warning Center (JTWC) (Knapp et al. 2010), and the advanced Dvorak technique-Hurricane Satellite-B1 (ADT-HURSAT) (Kossin et al. 2013).

To analyze how TCs will respond to different climate conditions, dynamical downscaling inputs the TC vortex structure and the environmental conditions of CGCMs or AGCMs into a higher-resolution regional model (first utilized by Knutson et al. 1998 and Knutson and Tuleya 2004). Statistical–dynamical downscaling is another inexpensive method for extracting information from climate models and was first discussed by Emanuel et al. (2006) and Emanuel (2006). Using this approach, TCs are randomly seeded before a beta-and-advection model and a coupled air–sea model (the Coupled Hurricane Intensity Prediction System) respectively control the track and intensity evolutions of TCs. Although dynamical and statistical–dynamical downscaling techniques were able to reproduce past TC activity very well (Knutson et al. 2007; Emanuel et al. 2008), the assumptions implicit in these techniques introduce additional uncertainties into their TC projections.

CGCMs represent an appealing alternative because they explicitly resolve physical processes and their nonlinear interactions on a variety of time and spatial scales in the ocean, atmosphere, and the ocean–atmosphere interface. CGCMs are arguably the most seamless approach to projecting climate change effects on TCs, but they require tremendous computational resources (e.g., Small et al. 2014) and often have significant biases in the mean state that decrease the likelihood of accurate future projections. For these reasons, lower-resolution CGCMs and AGCMs are often paired with downscaling techniques to reproduce the strongest TCs.

The ability of society to adapt to future climate change could be enhanced by furthering research on how the most intense tropical cyclones (TCs) will respond to climate change. Between 1900 and 2005, major hurricanes [wind speeds greater than 95 kt (1 kt = 0.5144 m s⁻¹); categories 3–5 on the Saffir–Simpson scale] accounted for 85% of the total damage of all storms in the United States (Pielke et al. 2008). A recent study by Lee et al. (2016) highlighted that almost all of these intense tropical cyclones undergo rapid intensification (RI; commonly defined as the 95th percentile of all 24-h intensity changes) during their lifetime. RI events are responsible for intensity forecasts with the highest errors, and hurricanes that rapidly intensify before landfall cause a majority of the fatalities and damage from TCs (Emanuel 2017). Therefore, researching whether the frequency of major hurricanes and their associated intensification rates are likely to change during the twenty-first century is critical for the development of adaptation strategies and resiliency efforts for coastal cities.

A comparison of the nudged-SST and fully coupled experiments indicates that the nudged-SST framework can capture the response of TCs to radiative forcing (G. Vecchi et al. 2018, manuscript submitted to Climate Dyn.). Meanwhile, the response of TCs to carbon dioxide forcing in HiFLOR is sensitive to the model’s underlying SST biases (G. Vecchi et al. 2018, manuscript submitted to Climate Dyn.). These two factors argue in favor of experiments in which the CTL run climatology is kept close to the observed climatology and motivate the experimental setup used here.

SSTs are restored toward the prescribed target SST_T according to dSST/dt = φ + (SST_T − SST)/τ, in which φ is the model-computed tendency for SST, τ is the restoring time scale (5 days), and t is time. The restoring time-scale length was carefully selected in Murakami et al. (2015) to be short enough to prevent climate drift in the model but long enough to maintain SST cooling effects from a storm’s wake. Without φ, this equation would simply bring the model’s SSTs toward the prescribed values with an e-folding time scale of τ. The model SST tendency term, which involves advection, mixing, and heat fluxes, enables the SSTs of the model to drift away from SST_T on short and long time scales.
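The restoring behavior described above can be illustrated with a minimal sketch. It assumes the relaxation form dSST/dt = φ + (SST_T − SST)/τ implied by the definitions in the text, treats φ as a constant (in the model it is computed from advection, mixing, and heat fluxes), and uses a simple forward-Euler step. The function name `nudge_sst` and all parameter values are hypothetical and are not taken from the HiFLOR code.

```python
import math

def nudge_sst(sst0, sst_target, tau_days=5.0, phi=0.0, dt_days=0.05, n_days=5.0):
    """Forward-Euler integration of dSST/dt = phi + (sst_target - sst) / tau.

    phi is held constant here (an assumption); units are K and days.
    """
    sst = sst0
    steps = round(n_days / dt_days)
    for _ in range(steps):
        sst += dt_days * (phi + (sst_target - sst) / tau_days)
    return sst

# With phi = 0, an initial 2-K offset decays to roughly 2/e K after one tau (5 days).
remaining_offset = nudge_sst(30.0, 28.0) - 28.0

# With a constant nonzero phi, the SST settles at sst_target + phi * tau,
# i.e., the model tendency lets the SST sit away from the target.
equilibrium = nudge_sst(28.0, 28.0, phi=0.1, n_days=200.0)
```

The second call shows why a finite τ matters: a persistent tendency such as cold-wake cooling is damped but not erased, which is consistent with the stated rationale for choosing a 5-day time scale.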

HiFLOR was integrated for 70 years in each experiment. SSTs were relaxed to climatological SST values representative of the 20-yr period of interest. Thus, each of the 70 years has approximately the same SSTs and radiative forcing, which substantially reduces the interannual and decadal variability of the climate system. This experimental setup will not be able to capture the rectified response of climatological TC activity to changes in the interannual or decadal variability of SSTs and radiative forcing. However, we hypothesize that these variations are smaller than the response of climatological TC activity driven by climatological changes in SST, a hypothesis that is supported by fully coupled and nudged-SST experiments in G. Vecchi et al. (2018, manuscript submitted to Climate Dyn.). For the CTL experiment, the prescribed SST target was the monthly varying climatology from the Met Office Hadley Centre Sea Ice and SST dataset (HadISST1.1; Rayner et al. 2003) over the years 1986–2005. The early and late experiments used the same climatological values of SSTs, sea ice, and greenhouse gas concentrations from the CTL experiment plus the projected changes derived from a multimodel mean of 17 CMIP5 models. These anomalies were based on the RCP4.5 pathway (Van Vuuren et al. 2011).

Even with their deficiencies, IBTrACS and ADT-HURSAT represent the best options for evaluating the skill of HiFLOR. To maintain consistency when we compare HiFLOR to observations, all data sources are rounded to the nearest 5 kt. We only consider TCs that are active for at least 72 h and exceed wind speeds of 34 kt for at least 36 h, criteria identical to those of the HiFLOR tracker (Murakami et al. 2015). Although we do not have warm-core information (also part of the HiFLOR criteria), the available criteria help us consider TCs of similar longevity and strength to those tracked in HiFLOR. We restrict our analysis sample to cases where the TC center is located over the ocean. When a TC traverses land, intensification is controlled by unique physical processes, which are poorly resolved in climate models. Additionally, we do not examine TC intensity changes poleward of 40° latitude because storms often undergo extratropical transition at higher latitudes (Liu et al. 2017). These TCs lose their warm core, and their intensity evolution is controlled by mechanisms that are not typical of tropical systems.
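The selection rules above can be sketched as a small filter. This is an illustration under stated assumptions, not the tracker used in the study: each track is a hypothetical list of `(lat_deg, wind_kt, over_ocean)` fixes at a fixed 6-h spacing, winds are rounded before the 34-kt test, and the 36 h above 34 kt is counted cumulatively rather than consecutively.

```python
def round_to_5kt(wind_kt):
    """Round a non-negative wind speed to the nearest 5 kt (halves round up)."""
    return 5 * int(wind_kt / 5 + 0.5)

def qualifies(track, step_h=6):
    """Storm-level test: active >= 72 h and above 34 kt for >= 36 h.

    track is a list of (lat_deg, wind_kt, over_ocean) fixes spaced step_h apart.
    Counting each fix above 34 kt as one step_h interval is an assumption.
    """
    lifetime_h = (len(track) - 1) * step_h
    gale_h = sum(step_h for _, wind, _ in track if round_to_5kt(wind) > 34)
    return lifetime_h >= 72 and gale_h >= 36

def usable_fixes(track):
    """Fix-level test: keep fixes over the ocean and within 40 degrees of the equator."""
    return [(lat, round_to_5kt(wind), ocean)
            for lat, wind, ocean in track
            if ocean and abs(lat) <= 40]

# A 72-h storm with six fixes at 40 kt passes; dropping one fix makes it too short.
long_track = [(15.0, 40.0, True)] * 6 + [(15.0, 30.0, True)] * 7
short_track = long_track[:12]
kept = usable_fixes([(10.0, 37.4, True), (10.0, 50.0, False), (45.0, 50.0, True)])
```

Separating the storm-level test from the fix-level test mirrors the text, which first selects storms by longevity and strength and then discards individual fixes over land or at high latitude.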

To assemble a more homogeneous record of TC intensity, Kossin et al. (2007, 2013) developed an automated approach called ADT-HURSAT. The creation of ADT-HURSAT consists of four main steps. Geostationary satellite imagery is first analyzed from International Satellite Cloud Climatology Project (ISCCP)-B1 data (Knapp and Kossin 2007; Knapp 2008a,b). Then, the data are centered on IBTrACS TCs and subsampled to be both spatially and temporally homogeneous. Finally, a simplified version of the advanced Dvorak technique (Olander and Velden 2007) is used to evaluate the data and determine a maximum TC wind speed. ADT-HURSAT data are produced every 3 h based on satellite data that have been uniformly subsampled to a horizontal resolution of 8 km, and wind speeds are recorded to the nearest tenth of a Dvorak “T-number” (depending on the current intensity, between 1 and 3 kt). ADT-HURSAT maintains the same protocol to determine TC intensities, but its efforts to stay homogeneous prevent it from using the best technology and analysis techniques available. As a result, TC intensities determined with ADT-HURSAT average higher errors than those in the IBTrACS dataset (Kossin et al. 2013; Olander and Velden 2007).

As an official archiving and distribution resource for TC best track data, IBTrACS is the primary dataset used in meteorological studies to validate model performance and compute observational trends in TC metrics. With the exception of the Indian Ocean (which did not gain continuous satellite coverage until 1998), IBTrACS data quality significantly improved in the early 1980s when satellites were deployed globally. Satellites allowed warning centers to employ the Dvorak technique, which is a robust way to estimate TC intensities based on infrared satellite imagery (Kossin et al. 2007; Knapp et al. 2010). As a result, Chu et al. (2002) noted that JTWC’s data since 1985 are more reliable than those from earlier years. Observational tools have continued to evolve to supplement the Dvorak technique, and new geostationary satellites, aerial Doppler radars, and stepped frequency microwave radiometers (SFMRs; Uhlhorn et al. 2007) have recently become available. Therefore, uncertainty in the best track intensity estimates has decreased with time, which introduces temporal heterogeneities in the quality of the data. There are also spatial inconsistencies in observational quality because the number of measurements available in different basins varies considerably.

TC intensity observations during 1986–2005 were obtained from IBTrACS and ADT-HURSAT to evaluate the performance of the HiFLOR CTL experiment. IBTrACS is composed of global “best track” data, which are recordings of TC locations and intensities from forecasting agencies across the world. Best track data start as operational estimates of the intensity and track of a TC and are refined at the end of a TC’s lifetime with a combination of in situ (e.g., dropsondes, scatterometers, buoys), radar, and satellite measurements. Best track intensity and position estimates are available every 6 h at the four synoptic times (0000, 0600, 1200, and 1800 UTC) and are recorded to the nearest 5 kt and 0.1° latitude/longitude (Landsea and Franklin 2013). For our analysis, we utilize IBTrACS, v03r09, but only consider data from NHC for the Atlantic and east Pacific and the JTWC for the remainder of the globe. One of the benefits of only using data from these U.S. agencies is that they follow the same definition of maximum winds: the highest 1-min average at 10-m height over a smooth surface (Harper et al. 2010).

The mean TC intensity and intensification behavior observed between 1986 and 2005 in IBTrACS and ADT-HURSAT is compared to the HiFLOR CTL simulation. Specifically, we examine the realism of the following: the annual frequency of TCs and major hurricanes, the spatial distribution of TCs and major hurricanes, the relationship between lifetime maximum intensity (LMI) and RI, the probability density of 6- and 24-h wind speed changes, and the spatial distribution of RI rates.

For the map in Fig. 3 depicting the difference in major hurricanes between HiFLOR and IBTrACS, the east and west Pacific basins are again emphasized as the main areas where the two datasets differ. Only 39% of plotted grid boxes are statistically significant in the major hurricane map, while 52% of plotted grid boxes are statistically significant in the TC map. Clearly, HiFLOR is better at reproducing the geographical distribution of stronger TCs than weaker TCs. The number of major hurricanes captured in the HiFLOR CTL simulation is impressive for a CGCM and enables HiFLOR to resolve the entire breadth of the LMI distribution observed in TCs. Lee et al. (2016) recently showed that RI is a fundamental characteristic of the storms located in the high-intensity tail of the LMI distribution, which broaches the question of whether RI is a common occurrence for strong storms in HiFLOR.

The percent difference in annual mean (top) TC and (bottom) major hurricane days between the HiFLOR control simulation and IBTrACS. TC (major hurricane) days are calculated by counting the number of times a TC (major hurricane) passes into a 5° × 5° grid box and dividing by the number of observation increments in a day. Red grid boxes highlight areas where HiFLOR has more days than IBTrACS, and blue grid boxes highlight areas where HiFLOR has fewer days than IBTrACS. Data are only plotted in a grid box if there is at least one TC day in HiFLOR and ¼ of a TC day in IBTrACS. Grid boxes that achieve a p value less than 0.05 using the Mann–Wilcoxon–Whitney test are considered statistically significant. White “X” marks are located in grid boxes that are not statistically significant.


Data are only plotted in a grid box if there is at least one TC day per year in HiFLOR and ¼ of a TC day per year in IBTrACS. Criteria thresholds were independently selected to compensate for the approximately 4 times as many storms in the HiFLOR sample. A two-sided Mann–Wilcoxon–Whitney test, using Eqs. (5.22a,b) and (5.23a,b) from Wilks (2011), determines whether grid boxes are significant. Grid boxes with a p value less than 0.05 are considered statistically significant, and all other grid boxes are demarcated with a white “X.” HiFLOR produces significantly more TCs than IBTrACS for large portions of every basin. The only exception is the area surrounding Mexico in the Caribbean Sea, Gulf of Mexico, and east Pacific Ocean. HiFLOR has significantly fewer TCs than IBTrACS for much of this region.
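A two-sided rank-sum test of this kind can be sketched as follows. The sketch assumes the cited equations correspond to the standard Mann–Whitney U statistic, U1 = R1 − n1(n1 + 1)/2, and its large-sample Gaussian approximation with mean n1 n2/2 and variance n1 n2 (n1 + n2 + 1)/12. No tie correction is applied to the variance, and the function name `rank_sum_p_value` is hypothetical.

```python
import math

def rank_sum_p_value(sample1, sample2):
    """Two-sided p value from the Gaussian approximation to the Mann-Whitney U statistic."""
    pooled = sorted([(v, 0) for v in sample1] + [(v, 1) for v in sample2])
    # Assign 1-based ranks, giving tied values the average of their ranks.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(sample1), len(sample2)
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu_u) / sigma_u
    # Two-sided tail probability of a standard normal variate.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical annual TC-day samples for one grid box.
p_separated = rank_sum_p_value(list(range(20)), list(range(20, 40)))
p_interleaved = rank_sum_p_value(list(range(0, 40, 2)), list(range(1, 40, 2)))
```

A rank-based test is a natural fit for TC-day counts in a single grid box, since those counts are often zero-heavy and far from Gaussian.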

Figure 2 also demonstrates that the two observational datasets disagree on the annual number of TCs. A comparison of the top and bottom maps reveals that the difference in TC days can be attributed to ADT-HURSAT producing more weak TCs. In the top map of Fig. 2, the additional blue contours without concurrent green contours, and magenta contours without concurrent red contours, highlight areas where ADT-HURSAT identifies TCs but IBTrACS does not. In the bottom map of Fig. 2, the colored contours almost overlap, confirming that ADT-HURSAT and IBTrACS agree well on the location and frequency of major hurricanes. The known deficiencies in the ADT-HURSAT scheme (in particular, that the absolute accuracy of the intensity estimates is necessarily compromised by the temporal homogenization process) likely cause the slight differences in the plotted fields, and therefore only IBTrACS is included in the following two figures.

The annual mean (top) TC and (bottom) major hurricane days in the 70-yr HiFLOR control simulation are shaded in gray. Days are calculated by counting the number of times a TC passes into a 5° × 5° grid box and dividing by the number of observation increments in a day (i.e., four 6-h increments per day in IBTrACS and HiFLOR, eight 3-h increments per day in ADT-HURSAT). The data are then smoothed with linear interpolation. The highest value on the top and bottom color bars is set ~30% lower than the maximum recorded value so that the west Pacific maximum does not prevent other geographical locations from displaying contours. Blue and green contours, respectively, demarcate areas where ADT-HURSAT and IBTrACS annually average one (¼) more TC (major hurricane) day than HiFLOR. Red and magenta contours, respectively, demarcate areas where ADT-HURSAT and IBTrACS annually average one (¼) fewer TC (major hurricane) day than HiFLOR. Storms must meet the criteria outlined in section 2 to contribute to the day totals in a grid box.
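The TC-day calculation in this caption can be sketched as a simple gridding step. The sketch interprets "the number of times a TC passes into a grid box" as the number of position fixes that fall inside the box, which is an assumption. The function name `annual_tc_days` and the box indexing are hypothetical, and the smoothing by linear interpolation is omitted.

```python
import math
from collections import defaultdict

def annual_tc_days(fixes, n_years, increments_per_day=4, box_deg=5.0):
    """Annual mean TC days per grid box.

    fixes is an iterable of (lat_deg, lon_deg) positions, one per observation time.
    Use increments_per_day=4 for 6-h data and 8 for 3-h data.
    Boxes are keyed by (lat_index, lon_index), i.e., floor(coordinate / box_deg).
    """
    counts = defaultdict(int)
    for lat, lon in fixes:
        box = (math.floor(lat / box_deg), math.floor((lon % 360) / box_deg))
        counts[box] += 1
    return {box: n / increments_per_day / n_years for box, n in counts.items()}

# Eight 6-h fixes in one box during one year amount to two TC days.
six_hourly = annual_tc_days([(12.0, 130.0)] * 8, n_years=1)
# The same eight fixes at 3-h spacing amount to one TC day.
three_hourly = annual_tc_days([(12.0, 130.0)] * 8, n_years=1, increments_per_day=8)
```

Dividing by the number of increments per day is what makes the 6-h and 3-h datasets comparable on the same map.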


Figure 2 illustrates the spatial distribution of mean annual TC (top) and major hurricane (bottom) density in the 70-yr HiFLOR CTL simulation and compares it to the 1986–2005 annual average of these fields in IBTrACS and ADT-HURSAT. In a large section of the Atlantic, north Indian, South Pacific, and central Pacific regions, HiFLOR matches well with observations. In general, HiFLOR appears to resolve the locations of major hurricanes better than the locations of TCs. As expected from Fig. 1, the main differences between HiFLOR and the observational datasets are visible in the east and west Pacific basins. Both the top and bottom maps show blue and green contours in the east Pacific and red and magenta contours in the west Pacific, respectively highlighting HiFLOR’s main areas of low and high biases. Figure 2 also highlights HiFLOR’s poor performance in the south Indian Ocean, specifically in the 10°–25°S latitude band between Madagascar and northwest Australia, which was not apparent in Fig. 1. Just east of Madagascar and west of Australia, HiFLOR generates too many TCs and major hurricanes, but in the area between these two landmasses, HiFLOR generates too few TCs and major hurricanes. Additional analysis into spatial biases in this region will be presented in a future study.

The Pacific Ocean biases are common features of CGCMs that resolve the strongest TCs (Small et al. 2014; McClean et al. 2011). In Murakami et al. (2015), two types of HiFLOR simulations were also examined: one where SSTs were allowed to evolve “freely” (no flux adjustments) and one where SSTs were restored to the interannually varying monthly mean values derived from HadISST1.1 (Rayner et al. 2003). Even though HiFLOR was much better at representing SSTs in the flux-adjusted run, both simulations showed biases in TC frequency that closely resemble Fig. 1. The comparable error behavior in the HiFLOR experiments with and without interannual variability suggests that our omission of El Niño–Southern Oscillation (ENSO) is likely not the source of the unique Pacific Ocean biases. In Murakami et al. (2015), the correlations between synoptic-scale parameters in observations and HiFLOR were also high, which suggests that poor representation of the TC environment is not the main source of errors in TC totals. As in most CGCMs, the atmospheric physics and resolution of the model are likely at fault for the large biases, and all conclusions presented here are subject to these uncertainties. In an upcoming study, we examine whether synoptic-scale variables provide an explanation for global and basin-specific biases.

HiFLOR develops significantly more TCs and major hurricanes than both observational datasets. Globally, HiFLOR averages approximately 13 more storms per year than ADT-HURSAT and 26 more storms per year than IBTrACS. In the west Pacific, HiFLOR produces significantly more TCs and major hurricanes than observations. HiFLOR exhibits the opposite behavior in the east Pacific basin, generating far fewer TCs than observations and rarely developing them into strong TCs. HiFLOR most closely resembles ADT-HURSAT and IBTrACS in the Atlantic basin, which is a positive sign for HiFLOR because the Atlantic basin has the most reliable data quality owing to its superior observational network (Kossin et al. 2013).

After applying the TC criteria discussed in the previous section, ADT-HURSAT and IBTrACS appear to have slightly different storm totals. This unexpected discrepancy arises because ADT-HURSAT derives TC locations from the version of IBTrACS that uses data from all agencies and information sources, not just NHC and JTWC (Kossin et al. 2013). There are also TCs in IBTrACS that have position fixes with no associated intensity estimates. For these cases, ADT-HURSAT still provides an intensity estimate because the algorithm only requires the storm position to retrospectively utilize satellite imagery. Therefore, ADT-HURSAT records higher annual storm totals because TCs have longer lifetimes than in IBTrACS.

Histograms represent the annual frequency of TCs (background bars) and major hurricanes (foreground bars) for IBTrACS (yellow), ADT-HURSAT (green), and the HiFLOR CTL simulation (red). The title of each histogram is an abbreviation for the adjacent basin that provides the data to compute the bars. The dashed lines separate the boundaries of the different basins considered in this study. The dataset identifier on the x axis of the histograms is underlined with orange, green, or red if the annual TC count for that dataset is significantly greater than IBTrACS, ADT-HURSAT, or HiFLOR, respectively. An orange, green, or red asterisk on the top-left corner of the dataset identifier indicates the dataset has significantly more major hurricanes than IBTrACS, ADT-HURSAT, or HiFLOR, respectively.


Figure 1 shows annual totals of TCs (background) and major hurricanes (foreground) for each ocean basin. Histograms are located near the basin that corresponds to the displayed data, and different data sources are represented by each bar. Dashed lines demarcate the boundaries between basins. Two-variable, unpaired t tests (e.g., Wilks 2011) are computed for each basin to establish statistical significance between the datasets. Equation (5.8) from Wilks (2011), adjusted to account for serial correlation between the forecasts [see Wilks’s (2011) Eq. (5.12)], is used to determine the Gaussian test statistic z, which is converted to a p value. When the p value is less than the significance threshold of 0.05 for the two-sided test, the difference in the mean TC or major hurricane count is considered statistically significant.
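The test described above can be sketched as follows. The sketch assumes that the cited equations take their usual forms: a two-sample statistic z = (mean1 − mean2) / sqrt(s1²/n1' + s2²/n2'), with each sample size replaced by an effective sample size n' = n(1 − ρ1)/(1 + ρ1) based on the lag-1 autocorrelation ρ1. Capping n' at n is an additional choice made here, and all function names are hypothetical.

```python
import math
from statistics import mean, variance

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a sequence of annual counts."""
    m = mean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den if den else 0.0

def effective_n(x):
    """n' = n (1 - rho1) / (1 + rho1), capped at n (the cap is an assumption)."""
    rho = lag1_autocorr(x)
    return min(len(x), len(x) * (1 - rho) / (1 + rho))

def two_sample_p_value(x1, x2):
    """Two-sided p value for a difference in means, Gaussian approximation."""
    z = (mean(x1) - mean(x2)) / math.sqrt(
        variance(x1) / effective_n(x1) + variance(x2) / effective_n(x2))
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical annual storm counts for two datasets in one basin.
model_counts = [85, 90, 88, 92, 87, 91, 89, 86, 90, 88]
obs_counts = [60, 62, 58, 61, 63, 59, 60, 62, 61, 58]
p = two_sample_p_value(model_counts, obs_counts)
```

Positive year-to-year persistence shrinks n', which widens the standard error and makes the test more conservative than one that treats every year as independent.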

Figure 4 illustrates how the LMI distribution and the relationship between RI and LMI compare in HiFLOR, ADT-HURSAT, and IBTrACS. Probability density functions (PDFs) are plotted to represent the global LMI distribution of all storms, storms that undergo RI, and storms that do not undergo RI. Following the work of Lee et al. (2016), the intensification rate of 30 kt in 24 h is used as the RI threshold. The same smoothing technique (moving average with window width of 15 kt) is also applied here.
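The quantities behind these PDFs can be sketched in a few lines. The sketch assumes 6-hourly wind series in knots, treats a 24-h increase of at least 30 kt as RI (whether the threshold is inclusive is an assumption), and smooths a 5-kt histogram with a centered three-bin (15-kt) moving average that is truncated at the edges. All function names are hypothetical.

```python
def lifetime_max_intensity(winds_kt):
    """LMI: the highest wind speed a storm reaches during its lifetime."""
    return max(winds_kt)

def undergoes_ri(winds_kt, step_h=6, threshold_kt=30):
    """True if any 24-h intensity change reaches the RI threshold."""
    lag = 24 // step_h
    return any(later - earlier >= threshold_kt
               for earlier, later in zip(winds_kt, winds_kt[lag:]))

def smoothed_pdf(lmis_kt, bin_kt=5, window_kt=15, max_kt=200):
    """Probability density in bin_kt bins, smoothed by a centered moving average."""
    n_bins = max_kt // bin_kt
    counts = [0] * n_bins
    for lmi in lmis_kt:
        counts[min(int(lmi // bin_kt), n_bins - 1)] += 1
    density = [c / (len(lmis_kt) * bin_kt) for c in counts]
    half = (window_kt // bin_kt) // 2
    smoothed = []
    for i in range(n_bins):
        window = density[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Hypothetical storms: the first gains 30 kt in 24 h, the second only 20 kt.
ri_storm = [30, 35, 45, 55, 60, 65]
non_ri_storm = [30, 35, 40, 45, 50, 55]
pdf = smoothed_pdf([60, 60, 60, 100])
```

Splitting storms with `undergoes_ri` before calling `smoothed_pdf` on each group would give curves analogous to the RI and non-RI PDFs described in the text.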

Fig. 4. PDFs of global TC LMI for IBTrACS, ADT-HURSAT, and HiFLOR. ADT-HURSAT (blue) and IBTrACS (black) distributions are calculated using data from 1986 to 2005. HiFLOR (red) distributions are calculated using the 70-yr control run that is nudged to the mean climate during 1986–2005. Raw data are grouped in 5-kt bins and smoothed by a moving average with window width of 15 kt. The solid, dashed, and dotted lines show the smoothed PDF for all storms, storms that undergo RI during their lifetime (RI storms), and those that do not (non-RI storms), respectively. The percent of storms that undergo RI is listed in the title with each percent colored according to the dataset.

Figure 4 agrees well with the main results of Lee et al. (2016). PDFs for IBTrACS and ADT-HURSAT are hypothesized to appear less smooth than those in Lee et al. (2016), because we focus on a shorter time period. IBTrACS and ADT-HURSAT LMI distributions have a bimodal structure and the most frequently attained LMI for both datasets is between 55 and 65 kt. However, ADT-HURSAT shows an unphysical peak in the number of storms whose LMI is between 55 and 65 kt (Kossin et al. 2013). It is well documented that ADT-HURSAT typically outputs TC intensity just below hurricane strength when the eye of a TC is not visible in the infrared imagery. Therefore, ADT-HURSAT suggests that an anomalous number of storms maintain their intensities between 55 and 65 kt, even though their actual intensities are much higher. After the eye appears, the automated algorithm artificially augments the intensification rate of TCs. IBTrACS intensity estimates are better in this intensity regime because forecasters can use microwave or radar imagery to see the eye of a TC when it is hidden by a cirrus shield (Landsea and Franklin 2013). Thus, ADT-HURSAT LMI features two maxima but its primary maximum is unrealistically large.

The probability distribution in HiFLOR peaks at higher LMI values than both observational datasets. Additionally, the HiFLOR curve does not have the two well-defined maxima found in observational datasets and high-resolution AGCMs (Manganello et al. 2012; Murakami et al. 2012). Murakami et al. (2015) also noted this inconsistency, which suggests it is likely a by-product of the tracking algorithm or deficiencies in the model physics. For example, the warm-core requirements and other components of the tracking procedure in HiFLOR could lower the number of weak TCs, which shifts the LMI distribution and peak to higher values. Alternatively, the atmospheric physics and horizontal resolution of HiFLOR could prevent it from capturing the intensification processes that cause the two distinct peaks.

Another possible explanation for the LMI differences is that HiFLOR could be a better representation of the true LMI distribution of TCs than the observational data. Outside the Atlantic basin, most of the world relies on the Dvorak technique for intensity estimation. Dvorak estimates usually make use of satellite imagery with higher resolution than ADT, but there are still large errors when the eye of a TC is shielded by cirrus clouds. The LMI bimodality in observations is potentially an artifact of the uncertainty associated with intensity estimation (Uhlhorn and Nolan 2012). Model output in HiFLOR does not have these issues. Additionally, HiFLOR samples many more storms, which could result in a smoother distribution. More analysis on the relationship between RI and LMI in high-resolution numerical simulations of TCs could help clarify whether HiFLOR’s unique behavior for low-LMI TCs is caused by model flaws.

Above 120 kt, HiFLOR LMI mirrors the LMI of the two observational datasets. Like ADT-HURSAT and IBTrACS, HiFLOR shows a connection between TCs with high intensification rates and high LMIs. In all three datasets, a majority of the strongest storms undergo RI at some point during their life cycle. However, the unique intensity calculation procedure followed by ADT-HURSAT that causes an exaggerated peak in LMI also results in a disproportionate number of major hurricanes (wind speeds greater than 95 kt) that undergo RI. Of major hurricanes, 97.9% experience RI in ADT-HURSAT, while only 80.6% and 81.0% experience RI in IBTrACS and HiFLOR, respectively. Although HiFLOR captures the relationship between RI and major hurricanes, it differs from ADT-HURSAT and IBTrACS when considering all storms that undergo RI. The percentages at the top of Fig. 4 indicate that 42.6% of all storms rapidly intensify in HiFLOR but only 31.4% and 32.7% of all storms rapidly intensify in IBTrACS and ADT-HURSAT, respectively.

Figure 5 is similar to Fig. 4 but only contains the solid curves that represent the PDFs of LMI for all storms in the west Pacific (top) and Atlantic basin (bottom). The individual basins have distinct characteristics that agree well with the results displayed in Figs. 1–3. HiFLOR produces TCs in the west Pacific that reach higher LMIs than TCs in both observational datasets, which results in one broad LMI maximum at high wind speeds. ADT-HURSAT and IBTrACS maintain their bimodal distributions, and the relative size of the second, high-intensity maximum becomes larger. HiFLOR LMI peaks at almost the same intensity as the second LMI peak in the observational datasets, and it matches well with observations beyond 125 kt. In the Atlantic basin, the LMI distribution of HiFLOR appears to follow a more bimodal distribution. Unlike in the west Pacific, HiFLOR produces too few storms above 140 kt in the Atlantic but otherwise matches the observational datasets better.

The intensification characteristics of the different datasets are examined to explain the discrepancies in their RI frequencies and LMI distributions. We only consider the intensity changes that meet the TC longevity and location conditions discussed in section 2. Figure 6 shows the common logarithm of probability densities calculated from the ADT-HURSAT, IBTrACS, and HiFLOR 24-h intensity changes that meet these criteria. The three plots show global (Fig. 6, top), west Pacific (Fig. 6, bottom left), and Atlantic (Fig. 6, bottom right) results. The construction of this figure closely follows the methodology used to create Fig. 2 in Kowch and Emanuel (2015). HiFLOR contains substantially more 24-h intensity changes than IBTrACS and ADT-HURSAT (70 years of data compared to 20 years). To account for the smaller sample sizes of the observational datasets, we randomly subsample the HiFLOR data at the same rate as the observational dataset with the fewest cases. This procedure is repeated 1000 times, and the probability densities of the intensity changes for each subsample are then calculated. The mean bin values from all the subsamples are plotted as a solid red curve to represent HiFLOR’s PDF. Red dashed lines demarcate the 5th and 95th percentiles of the subsamples’ probability densities for each bin.
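The subsampling procedure described above can be sketched as follows. This is a minimal illustration rather than the code used in the study. It assumes that each subsample draws, without replacement, as many HiFLOR cases as the smallest observational dataset contains, that the mean and percentiles are taken over densities before the logarithm is applied, and that a nearest-rank percentile is adequate. All function names, the bin edges, and the synthetic input are hypothetical.

```python
import math
import random

def bin_density(changes, edges):
    """Probability density of `changes` in each [edges[b], edges[b+1]) bin."""
    counts = [0] * (len(edges) - 1)
    for x in changes:
        for b in range(len(counts)):
            if edges[b] <= x < edges[b + 1]:
                counts[b] += 1
                break
    n = len(changes)
    return [c / (n * (edges[b + 1] - edges[b])) for b, c in enumerate(counts)]

def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def subsampled_pdf(model_changes, n_obs, edges, n_rep=1000, seed=0):
    """Mean, 5th and 95th percentile density per bin over n_rep subsamples."""
    rng = random.Random(seed)
    per_bin = [[] for _ in range(len(edges) - 1)]
    for _ in range(n_rep):
        sample = rng.sample(model_changes, n_obs)  # draw without replacement
        for b, d in enumerate(bin_density(sample, edges)):
            per_bin[b].append(d)
    mean = [sum(v) / n_rep for v in per_bin]
    p05 = [percentile(v, 5) for v in per_bin]
    p95 = [percentile(v, 95) for v in per_bin]
    return mean, p05, p95

def log10_floor(densities, floor=1e-5):
    """Common logarithm of each density, bounded below by `floor`."""
    return [math.log10(max(d, floor)) for d in densities]

# Synthetic stand-in for model 24-h intensity changes (kt); not real data.
gen = random.Random(1)
model_changes = [gen.triangular(-60, 60, 0) for _ in range(5000)]
edges = list(range(-70, 71, 10))  # hypothetical 10-kt bins
mean, p05, p95 = subsampled_pdf(model_changes, n_obs=300, edges=edges)
red_curve = log10_floor(mean)  # analogue of the solid red curve
```

Fixing the seed keeps the sketch reproducible. The spread between `p05` and `p95` gives a sense of how much of the difference between curves could arise from sampling alone.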

Fig. 6. Common logarithm of the probability densities calculated from IBTrACS (black), ADT-HURSAT (blue), and HiFLOR (red) 24-h intensity changes. Plots generated with (top) global data, (bottom left) west Pacific basin data, and (bottom right) Atlantic basin data. To be included in the sample, the intensity change must occur over open ocean between 40°N and 40°S, and the beginning and ending intensity of the TC must be greater than 34 kt. Additional criteria to select storms for evaluation are described in section 2. The number of cases for each dataset is listed in the legend. HiFLOR tracks data are subsampled at the rate of the IBTrACS data for each intensity change bin. The red dashed lines indicate the 5th and 95th percentiles of 1000 subsamples. All distributions are bounded below by 10⁻⁵. The PDFs are truncated horizontally when two of the PDFs have logged a bin with a y value below or equal to 10⁻⁵.

For the global data, the shape of the HiFLOR 24-h intensity change distribution closely mirrors IBTrACS, and the two datasets agree on the probabilities of some bins. For both of these datasets, the bin entries on either side of zero intensity change appear linear, conveying that the probability densities are exponentially distributed. ADT-HURSAT exhibits different behavior, with comparatively lower probabilities for smaller intensity changes and higher probabilities for a majority of the larger intensity changes. One of the most distinctive features of the ADT-HURSAT distribution is that the probabilities of the bins between 30 and 70 kt are almost equal, and the probabilities of bins above this range are significantly higher than those in IBTrACS and HiFLOR. The “shelf” between 30 and 70 kt is likely caused by scene-type changes (from noneye to eye) that lead to large, spurious intensification rates in ADT-HURSAT (Olander and Velden 2007). This explanation is consistent with the higher percentage of major hurricanes that undergo RI in ADT-HURSAT, which was discussed in reference to Fig. 4.
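The link between a linear appearance on a logarithmic axis and an exponential distribution can be made explicit. As a sketch, assume that the density on one side of zero intensity change x follows an exponential form with amplitude A and decay scale λ; both symbols are illustrative and do not appear in the original analysis. Taking the common logarithm gives a straight line in |x| with slope −1/(λ ln 10):

```latex
p(x) = A\, e^{-|x|/\lambda}
\quad\Longrightarrow\quad
\log_{10} p(x) = \log_{10} A - \frac{|x|}{\lambda \ln 10}
```

A curve that flattens into a shelf, as described for ADT-HURSAT between 30 and 70 kt, departs from this straight-line form.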

Between −40 and 20 kt of intensity change, all three datasets have similar probabilities. These bins account for approximately 83% of all IBTrACS cases, 86% of all HiFLOR cases, and 88% of all ADT-HURSAT cases. However, for a majority of the bins outside of this intensity range, sampling error does not explain the differences in the curves. HiFLOR has significantly fewer occurrences of the largest positive 24-h intensity changes than both observational datasets. This result is counterintuitive because Fig. 4 revealed that more storms increase their intensity by greater than 30 kt in HiFLOR compared to ADT-HURSAT and IBTrACS. However, for storms that rapidly intensify in HiFLOR, only 6.7% of all the recorded 24-h intensity changes for those storms are greater than 30 kt, while the corresponding percentages in ADT-HURSAT and IBTrACS are 11.1% and 14.2%, respectively. Thus, HiFLOR is able to capture the highest 24-h intensity changes observed in nature, but the probability of attaining the extreme intensification rates is too low.

In the west Pacific basin, the ADT-HURSAT probability distribution closely resembles its distribution for the global data. However, ADT-HURSAT still greatly differs from IBTrACS and HiFLOR, which display remarkable agreement for the west Pacific data. HiFLOR and IBTrACS probability distributions are almost identical for the highest intensification and decay rates. Therefore, HiFLOR appears to skillfully resolve intensification processes in the west Pacific, which indicates that exaggerated genesis rates might be the cause of HiFLOR overestimating the annual major hurricane count in this basin.

In the Atlantic basin, HiFLOR, ADT-HURSAT, and IBTrACS have similar probability densities between −20 and 20 kt. However, at higher intensification rates, HiFLOR has lower probabilities than both observational datasets. The lack of extreme RI events for HiFLOR helps explain the shortage of major hurricanes visible in Fig. 3. However, the annual number of TCs in HiFLOR matches well with the observational datasets in Fig. 1, which suggests that HiFLOR likely generates a realistic number of TCs in the Atlantic basin but is a little conservative in the intensification of TCs.

Figure 6 demonstrates that HiFLOR successfully captures the shape of the PDF for 24-h intensity changes but slightly underrepresents the highest intensification events. HiFLOR resembles IBTrACS for individual basins and globally but often deviates from ADT-HURSAT. Many of the known deficiencies in ADT-HURSAT, most notably the documented artificial kurtosis in the LMI distribution, are manifested in the anomalous 24-h intensity change PDFs. Therefore, ADT-HURSAT achieves its original goal of serving as an excellent resource for trend analysis of TCs that achieve hurricane status but is less reliable than IBTrACS for verifying TC intensification. As a result, the following figure focuses only on the geographical distributions of RI in IBTrACS and HiFLOR.

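Figure 7 provides a spatial perspective on how well HiFLOR reproduces the largest 24-h intensity changes. Percent difference in RI ratio between IBTrACS and HiFLOR is plotted in each 5° × 5° grid box, where RI ratio is defined as the fraction of 24-h intensity changes in a grid box that exceed 30 kt:

\[ \text{RI ratio} = \frac{\text{number of 24-h intensity changes} > 30\ \text{kt}}{\text{total number of 24-h intensity changes}} \]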

As in Fig. 3, data are only plotted in a grid box if there is at least one TC day per year in HiFLOR and ¼ of a TC day per year in IBTrACS. Statistical significance is computed using a binomial proportion test with p values below 0.05 considered significant (Suissa and Shuster 1985). Grid boxes that are not statistically significant are demarcated with a white “X.” Figure 7 reinforces Fig. 6. The prevalence of blue boxes throughout the bottom map conveys that a smaller percentage of HiFLOR TCs are undergoing RI compared to those in IBTrACS and suggests that HiFLOR underestimates TC intensification in parts of every basin. The western half of the west Pacific basin contains the only sizable area where HiFLOR has a significantly higher RI ratio than IBTrACS. However, less than a third of all grid boxes that contain TCs show significant differences in RI ratio between IBTrACS and HiFLOR. It is also important to note that the choice of grid box size, significance test, and RI threshold all affect the interpretation of Fig. 7, so its results should be viewed in tandem with Fig. 6.
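The grid-box comparison can be illustrated with a short sketch. Following the Fig. 7 caption, the RI ratio is taken here to be the fraction of 24-h intensity changes in a grid box that exceed 30 kt. The study cites the exact unconditional test of Suissa and Shuster (1985); the sketch below substitutes a simpler pooled two-proportion z test based on the normal approximation, so its p values will not match the published ones. The function names, the percentage-point form of the difference, and the counts are hypothetical.

```python
import math

def ri_ratio(changes, threshold=30.0):
    """Fraction of 24-h intensity changes (kt) in a grid box above the threshold."""
    return sum(1 for c in changes if c > threshold) / len(changes)

def two_proportion_z_test(k1, n1, k2, n2):
    """Two-sided p value for H0: p1 == p2 (pooled normal approximation).

    This is a stand-in for the exact unconditional test cited in the text.
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0.0:
        return 1.0  # every case or no case is RI in both samples
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical counts for one 5-degree grid box; not values from the study.
k_obs, n_obs = 12, 80      # observed cases above 30 kt, total observed cases
k_mod, n_mod = 20, 400     # model cases above 30 kt, total model cases

# Difference expressed in percentage points (an assumed convention).
diff_pct = 100.0 * (k_obs / n_obs - k_mod / n_mod)
p_value = two_proportion_z_test(k_obs, n_obs, k_mod, n_mod)
significant = p_value < 0.05
```

With small counts per grid box the normal approximation becomes unreliable, which is one reason an exact test is preferable for an analysis of this kind.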

Fig. 7. The percent difference in RI ratio between IBTrACS and HiFLOR is plotted in each 5° × 5° grid box. Blue (red) squares indicate grid boxes where a larger (smaller) percentage of 24-h intensity changes exceed 30 kt in IBTrACS than in HiFLOR. Grid boxes that achieve a p value of 0.05 using a binomial proportion test are considered statistically significant. White “X” marks are located in grid boxes that are not statistically significant.

Further insight into the ability of HiFLOR to resolve TC intensity evolution is possible by evaluating 6-h intensity changes. Figure 8 is similar to Fig. 6, but it compares the common logarithm of probability densities calculated from ADT-HURSAT, IBTrACS, and HiFLOR 6-h intensity changes. Again, plots are created using global (Fig. 8, top), west Pacific (Fig. 8, bottom left), and Atlantic (Fig. 8, bottom right) data. The 3-h measurements for ADT-HURSAT are subsampled every 6 h to match the other two datasets. The intensity changes are expressed as rates and binned in increments of 2 kt h−1 instead of being listed as absolute changes over 6-h periods. Unlike in Fig. 6, the two observational datasets show more similarities with each other than with HiFLOR. HiFLOR produces higher-magnitude intensity variations than IBTrACS and ADT-HURSAT, which is the opposite of the relationship observed for 24-h intensity changes. The probability distributions for the west Pacific are very similar to the global ones. In the Atlantic basin, the HiFLOR curve more closely follows the two observational datasets.
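The preprocessing behind the 6-h comparison can be sketched briefly. The example assumes that 3-hourly records are thinned by keeping every second value, that the 6-h change is divided by 6 to give a rate in kt per hour, and that bins are aligned on multiples of 2 kt per hour. The bin alignment, function names, and sample values are assumptions made for illustration.

```python
import math

def every_six_hours(three_hourly):
    """Keep every second record of a 3-hourly series to obtain 6-hourly values."""
    return three_hourly[::2]

def six_hour_rates(winds_6h):
    """Intensity change between consecutive 6-hourly records, in kt per hour."""
    return [(b - a) / 6.0 for a, b in zip(winds_6h, winds_6h[1:])]

def rate_bin(rate, width=2.0):
    """Lower edge (kt per hour) of the bin of the given width containing `rate`."""
    return math.floor(rate / width) * width

# Hypothetical 3-hourly wind record (kt); not taken from any dataset.
winds_3h = [35, 38, 40, 47, 55, 60, 70, 72, 65]
winds_6h = every_six_hours(winds_3h)     # [35, 40, 55, 70, 65]
rates = six_hour_rates(winds_6h)         # kt per hour
bins = [rate_bin(r) for r in rates]      # lower bin edges
```

Expressing the changes as rates rather than absolute 6-h differences keeps the bin width directly comparable to the stated 2 kt h−1 increment.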

The unique shape of the PDFs for 6- and 24-h intensity changes in HiFLOR could be attributed to flawed model physics that prevent TCs from maintaining a steady intensity. Considering this reasoning along with the conclusions from Fig. 6, it appears that HiFLOR generates erratic convection that causes too many high-frequency intensity variations, but it does not produce as much sustained and organized convection. Alternatively, the significantly fewer 6-h intensity changes of 0 kt in HiFLOR could arise because the model framework enables high-frequency intensity variations to be easily detected; in nature, calculations of acceleration rates are much less precise (Uhlhorn and Nolan 2012).