Water vapor is the most important greenhouse gas in the atmosphere although changes in carbon dioxide constitute the “control knob” for surface temperatures. While the latter fact is well recognized, resulting in extensive space‐borne and ground‐based measurement programs for carbon dioxide as detailed in the studies by Keeling et al. (1996), Kuze et al. (2009), and Liu et al. (2014), the need for an accurate characterization of the long‐term changes in upper tropospheric and lower stratospheric (UTLS) water vapor has not yet resulted in sufficiently extensive long‐term international measurement programs (although first steps have been taken). Here, we argue for the implementation of a long‐term balloon‐borne measurement program for UTLS water vapor covering the entire globe that will likely have to be sustained for hundreds of years.

1 Introduction

Water vapor is arguably the most important greenhouse gas in the Earth's atmosphere; it accounts for about half of the present‐day greenhouse effect and is the most important gaseous source of infrared opacity in the atmosphere [Held and Soden, 2000; Schmidt et al., 2010]. Nonetheless, changes in atmospheric CO₂ constitute the principal “control knob” governing the temperature of the Earth's atmosphere. In contrast to CO₂, water vapor in the atmosphere can condense and precipitate; therefore, water vapor concentrations in the atmosphere are determined by condensation, and thus by temperature, rather than by water vapor sources [Lacis et al., 2010]. Increasing atmospheric CO₂ also causes changes in the tropospheric Hadley circulation, with implications for convective ascent, cloud patterns, and thus the tropospheric humidity distribution, particularly in the tropics [Lau and Kim, 2015]. Changes in non‐condensing greenhouse gases (like CO₂, N₂O, CH₄, chlorofluorocarbons, and ozone) cause a radiative forcing that changes the atmospheric temperature structure. This will alter water vapor and cloud abundances, producing an additional radiative forcing and changing the atmospheric temperature structure yet again. This forcing is referred to as the “water vapor feedback” mechanism [Dessler and Sherwood, 2009; Lacis et al., 2010]. Atmospheric water vapor effectively makes the climate more sensitive to forcing by non‐condensable greenhouse gases. The attenuation of outgoing long‐wave radiation, i.e., the water vapor feedback, is most sensitive to water vapor changes in the upper troposphere and lowermost stratosphere, where the air is coldest and driest [Held and Soden, 2000; Dessler and Sherwood, 2009]. In climate models, enhanced warming is found in the tropical upper troposphere due to changes in the temperature lapse rate.
This impacts both moisture changes and radiative effects in this region so that lapse rate changes and water vapor changes in the tropical upper troposphere need to be considered together. Moreover, water vapor changes in the lower stratosphere have been identified as an important driver of decadal global surface climate change [Forster and Shine, 2002; Solomon et al., 2010; Riese et al., 2012; Dessler et al., 2013]. Stratospheric water vapor will increase due to an anthropogenic increase in tropospheric methane concentrations [Rohs et al., 2006]; direct injection of water vapor into the stratosphere by large volcanic eruptions is also discussed as a process that might enhance stratospheric water vapor [Joshi and Jones, 2009].

2 Current Measurements of Atmospheric Water Vapor

In spite of its climatic importance, accurate measurements of water vapor in the upper troposphere and in the lower stratosphere are sparse, and trends and variability in this region are not well established [Hurst et al., 2011; Kunz et al., 2013; Hegglin et al., 2014]. Uncertainties are particularly pronounced close to the tropopause, the interface between large water vapor concentrations in the upper troposphere and very low water vapor concentrations in the stratosphere. The tropopause fluctuates in altitude, making trend determinations at constant altitudes problematic. Furthermore, deep convective injection and isentropic transport from the tropics are processes that have the potential to moisten the lowermost stratosphere in mid‐latitudes [Anderson et al., 2012; Ploeger et al., 2013; Vogel et al., 2014; Spang et al., 2015], leading to further variability of water vapor in this region. However, the region close to the tropopause is the region most relevant for radiative forcing of surface climate [Held and Soden, 2000; Solomon et al., 2010; Riese et al., 2012]. This fact represents a challenge for the vertical resolution of water vapor measurements in the vicinity of the tropopause. Studies of stratospheric water vapor variability and trends in mid‐latitudes often avoid the region close to the tropopause [Rosenlof et al., 2001; Scherer et al., 2008; Hurst et al., 2011] when calculating trends and variability in water vapor. Studies that focus on water vapor trends in the UTLS (−2/+4 km from the tropopause) [Oltmans and Hofmann, 1995; Oltmans et al., 2000; Kunz et al., 2013] show that these trends and the associated radiative forcing are highly uncertain and even of an undetermined sign. Despite its importance, there are only limited strategies for the long‐term monitoring of upper tropospheric and lower stratospheric water vapor.
The international network of weather balloons has been in operation for many decades but is considered insufficient for detecting trends and variability in upper tropospheric and lower stratospheric water vapor [Elliott and Gaffen, 1991; Soden and Lanzante, 1996; Seidel et al., 2009; Dee et al., 2011]. Valuable information on water vapor in the lowermost stratosphere and upper troposphere is obtained from measurements on board passenger aircraft within the framework of the In‐service Aircraft for a Global Observing System (IAGOS) project. There are two sub‐projects within IAGOS, namely IAGOS‐CORE (since 1994) and IAGOS‐CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container, since 2008). IAGOS‐CORE provides frequent measurements of reactive and greenhouse gases including upper tropospheric water vapor, while the less frequent IAGOS‐CARIBIC covers a much larger suite of tracer and aerosol measurements but is deployed only once per month on four long‐range flights [Petzold et al., 2015]. IAGOS‐CARIBIC provides water vapor measurements covering, in particular, the lower mixing ratios that prevail in the lowermost stratosphere [Zahn et al., 2014; Dyroff et al., 2015]. However, these measurements are restricted in height (in particular, they do not cover the tropical tropopause region) because most measurement time is spent at the cruising altitude of a commercial aircraft, and the measurements do not achieve true global coverage. Nonetheless, they should be sustained and extended in coverage. High‐quality measurements of water vapor in the UTLS are obtained regularly in a few long‐term monitoring programs and in dedicated measurement campaigns. The climatologies deduced by combining the individual measurements are a valuable resource for investigations on trends and variability of UTLS water vapor [Rosenlof et al., 2001; Krämer et al., 2009; Tilmes et al., 2010; Kunz et al., 2014; Meyer et al., 2015].
An example of such a water vapor climatology for the tropics, subtropics, and the polar regions is shown in Figure 1. However, the spatial coverage of such measurements at a given point in time is limited, and data from many years of measurements have to be combined for a climatology. In this way, seasonal and inter‐annual variability cannot be represented in the dataset. Thus, again, climatologies based on collections of temporally and spatially sparse data cannot replace a strategy for long‐term monitoring of UTLS water vapor.

Figure 1. Potential temperature‐based (10 K bins) frequency distribution of water vapor mixing ratios from 23 aircraft campaigns measured with the Fast In‐situ Stratospheric Hygrometer (FISH) [Meyer et al., 2015] from 1997 to 2014, divided into three latitude regimes: tropical (30°S to 30°N), sub‐tropical (60°S to 30°S and 30°N to 60°N), and polar regions (90°S to 60°S and 60°N to 90°N). The number of data points (measurements every second) is given at the top right of the respective panel.

Furthermore, since 1978 upper tropospheric water vapor concentrations have been derived from high‐resolution infrared radiation sounder (HIRS) instruments aboard National Oceanic and Atmospheric Administration (NOAA) operational satellites based on the 6.7‐µm water vapor channel [Soden et al., 2005; Shi and Bates, 2011; Chung et al., 2014]. Unfortunately, these nadir‐looking sensors were not designed for climate monitoring and only provide a poor vertical resolution. Long‐term datasets for upper tropospheric humidity in mid‐latitudes have been constructed through a combination of different types of HIRS instruments [Gierens et al., 2014]. However, the HIRS instruments suffer from inter‐satellite biases, and the continuity of the 6.7‐µm water vapor channel ends in 2005 because of the shift in the central wavelength to 6.5 µm for HIRS/3.
HIRS instruments are only sensitive to water vapor in a rather deep layer in the upper troposphere (roughly 200–500 hPa) [Soden et al., 2005; Shi and Bates, 2011; Chung et al., 2014]. Since October 2011, measurements of upper tropospheric humidity with a comparable vertical resolution are also available from the microwave radiometer SAPHIR on board the Megha‐Tropiques satellite [Brogniez et al., 2015]. Satellite measurements of lower stratospheric water vapor have been conducted for many decades by a variety of instruments, but many had short (in the order of years) lifetimes, often without much overlap between different instruments. Satellite instruments may also be susceptible to perturbations by volcanic aerosol and degradation in performance [Fujiwara et al., 2010; Fueglistaler et al., 2013]. Results from model simulations nudged to observed meteorology have been used to address such problems in an approach merging a number of different satellite datasets [Hegglin et al., 2014], but only offsets were applied with no investigation of measurement drifts. Observations at ± 2 km around the tropopause in mid‐latitudes by the Boulder balloon soundings and the corresponding HALogen Occultation Experiment (HALOE) satellite measurements (Figure 2) do not show the much‐discussed rapid drop in stratospheric water vapor in the lower stratosphere at the beginning of 2001 and in 2013 [e.g., Hurst et al., 2011; Kunz et al., 2013]. On the other hand, the drops in stratospheric water vapor in 2001 and in 2013 are detectable in the tropics immediately above the tropical tropopause and are correlated with rapidly decreasing tropical temperatures. These are further detectable ≈1.5 years later at greater altitudes in the tropics [e.g., Randel et al., 2006; Urban et al., 2014]. 
These observations are important for deducing the radiative impact of water vapor changes close to the tropopause and show that simple extrapolations of water vapor trends detected at greater altitudes in the lower stratosphere to the tropopause or from the tropics to mid‐latitudes are problematic. They also reveal the importance of stratospheric circulation changes and variability [Stiller et al., 2012; Ploeger et al., 2015] for the behavior of water vapor in the lower stratosphere [Rosenlof et al., 1997; Randel et al., 2006; Kunz et al., 2013; Hegglin et al., 2014; Tao et al., 2015].

Figure 2. Stratospheric water vapor between 1981 and 2014 from Boulder sonde data (Hurst et al., 2011; Kunz et al., 2013). Top panel (a) shows data for 75–85 hPa (≈18 km); the panels below show subsets of the data selected according to the altitude level of the tropopause at the individual sounding. Second panel (b) shows data for the tropical domain (tropopause greater than 14 km); third panel (c) for a transitional domain (tropopause between 14 and 12 km); and bottom panel (d) data for the extratropical domain (tropopause below 12 km). The black line in the top panel shows water vapor monthly means; 2‐year running means in all panels are shown in orange. Also shown are corresponding HALOE data for 1991–2005 (zonal average of the latitude band 35°N to 45°N); the white line shows the 2‐year running mean; the range of two standard deviations around the mean is shown as gray shading [Kunz et al., 2013]. Note that around the year 2000, no data are available for tropopause greater than 14 km (panel b). (Figure adapted from Kunz et al. [2013]; see reference for further information on the analysis.)
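The selection of soundings by tropopause altitude described in the Figure 2 caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis code of Kunz et al. [2013]: the function names are hypothetical, the example values are invented rather than measured, and assigning a tropopause at exactly 12 km or 14 km to the transitional domain is an assumption, since the caption does not specify the boundary handling.

```python
from collections import defaultdict
from statistics import mean


def tropopause_domain(tropopause_km: float) -> str:
    """Assign a sounding to a domain using the thresholds in the Figure 2 caption."""
    if tropopause_km > 14.0:
        return "tropical"
    if tropopause_km >= 12.0:
        # Assumption: the boundary values 12 km and 14 km count as transitional.
        return "transitional"
    return "extratropical"


def domain_means(soundings):
    """Average water vapor per domain for (tropopause_km, h2o_ppm) pairs."""
    groups = defaultdict(list)
    for tropopause_km, h2o_ppm in soundings:
        groups[tropopause_domain(tropopause_km)].append(h2o_ppm)
    return {domain: mean(values) for domain, values in groups.items()}


# Invented example soundings: (tropopause altitude in km, mixing ratio in ppm).
example_soundings = [(15.2, 3.8), (13.1, 4.4), (11.0, 5.0), (14.6, 4.0), (10.4, 5.4)]
example_means = domain_means(example_soundings)
```

Sorting each sounding by the tropopause altitude observed on the day of launch, rather than by the station latitude, is what allows tropical and mid‐latitude air masses sampled at a single site to be treated separately.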

3 A Future Measurement Program of Atmospheric Water Vapor

Here, we propose that a long‐term sustainable strategy for accurate global measurements of water vapor in the UTLS down to molar mixing ratios of parts per million (ppm; 10⁻⁶) should be developed. The backbone of this strategy should be a global network of stations for balloon‐borne instruments that are well inter‐calibrated. The chosen measurement technique will likely be frost point hygrometer sondes, which have been flown successfully, but only infrequently, for many years from a few stations [Oltmans and Hofmann, 1995; Fujiwara et al., 2010; Hurst et al., 2011, 2014; Davis et al., 2015]. The balloon measurements will allow an accurate determination of the tropopause height and thus a discrimination of upper tropospheric and lower stratospheric water vapor. Accurate and spatially well‐resolved water vapor measurements in the upper troposphere and lower stratosphere would also be helpful for improving data assimilation in this region [Kunz et al., 2014; Dyroff et al., 2015]. It will be a challenge to implement such a network with reasonable spatial coverage of the entire globe that will allow polar, mid‐latitude, and tropical air masses (see Figure 1) to be separated. The Global Climate Observing System (GCOS) Reference Upper‐Air Network (GRUAN) is an initiative [Seidel et al., 2009; Immler et al., 2010] that is making progress in establishing a global network of 30–40 measurement sites. Each site must meet stringent requirements regarding the accuracy and long‐term stability of its measurements of essential climate variables, including UTLS water vapor. GRUAN sites are required to perform a water vapor sounding with a balloon‐borne frost point hygrometer at least once per month. The progress of GRUAN in adding sites is slow because each site is responsible for procuring the funding needed to make such measurements.
Even if GRUAN builds up to 30–40 sites in the near future, we believe there is a critical need for many more UTLS water vapor measurement sites around the globe. The self‐funding approach of GRUAN sites also casts doubt upon the stability of the network over the long term. Establishing such a measurement program will require a substantial financial commitment. For example, a payload consisting of a frost point hygrometer [Vömel et al., 2007], an electrochemical concentration cell (ECC) ozonesonde, a compact optical backscatter aerosol detector (COBALD) sonde [Brabec et al., 2012], a radiosonde, and meteorological standard equipment together with the balloon itself will cost about €7000 per launch. Assuming a launch schedule of two flights per month and a 50% chance of recovery of the payload, this results in costs for disposables of about €90 000 per year and station. The receiving equipment requires a further one‐time investment of €10 000–30 000 per station, depending on the type of radiosonde flown. Of course, considerable additional costs will be incurred for infrastructure and personnel. To make optimum use of these investments, the spatial distribution of the stations and the temporal resolution of the soundings required for the detection of trends and long‐term variability should be studied in advance, e.g., through theoretical studies based on hypothetical water vapor time series. It has been demonstrated that the time required to detect trends in highly variable upper tropospheric water vapor is reduced more by increasing the measurement frequency than by improving the precision of measurements [Whiteman et al., 2011]. As stratospheric water vapor is much less variable, the trend detection time would likely depend less on measurement frequency and more on the number and global distribution of measurement sites and on the measurement precision. 
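The trade‐off between measurement frequency and measurement precision can be illustrated with a deliberately simple calculation on a hypothetical time series. The sketch below is not the method of Whiteman et al. [2011]; it assumes that each sounding carries independent (white) noise made up of natural variability and instrument error, fits a linear trend by ordinary least squares, and calls the trend detected once its magnitude exceeds twice the standard error of the fitted slope. All function names and input values are assumptions chosen for illustration, and autocorrelation, which lengthens real detection times, is ignored.

```python
import math


def slope_standard_error(sigma_total, per_year, years):
    """Standard error of a least-squares trend for equally spaced, white-noise samples."""
    n = int(round(per_year * years))
    if n < 3:
        return math.inf
    dt = 1.0 / per_year  # spacing between soundings in years
    # Sum of squared deviations of the sampling times from their mean.
    sxx = dt ** 2 * n * (n ** 2 - 1) / 12.0
    return sigma_total / math.sqrt(sxx)


def years_to_detect(trend_per_year, sigma_natural, sigma_instrument,
                    per_year, max_years=500):
    """First whole year in which |trend| >= 2 standard errors, or None."""
    # Natural variability and instrument noise are assumed independent.
    sigma_total = math.hypot(sigma_natural, sigma_instrument)
    for years in range(1, max_years + 1):
        if abs(trend_per_year) >= 2.0 * slope_standard_error(
                sigma_total, per_year, years):
            return years
    return None


# Hypothetical upper tropospheric case, all values in percent of the mean:
# trend 0.5 %/yr, natural variability 40 %, instrument noise 10 %.
baseline = years_to_detect(0.5, 40.0, 10.0, per_year=12)
more_frequent = years_to_detect(0.5, 40.0, 10.0, per_year=24)
more_precise = years_to_detect(0.5, 40.0, 5.0, per_year=12)
```

With these illustrative inputs, doubling the launch frequency shortens the estimated detection time considerably more than halving the instrument noise, because natural variability dominates the total scatter. If the natural variability is set below the instrument noise, as might be more appropriate for the stratosphere, the instrument precision becomes the controlling term instead.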
Some work in this direction has also been done in the field of ozone trends, addressing the question of the circumstances under which a leveling off or recovery of stratospheric ozone levels is detectable in a statistically significant manner [e.g., Reinsel et al., 2002; Várai et al., 2015]. However, the ozone recovery problem is a simpler problem as only one turning point is expected (decline of ozone to the turning point and recovery thereafter), while stratospheric water vapor levels will certainly show more variability, even on longer time scales of many years. In an analysis of the Boulder balloon measurement time series at 40°N, Kunz et al. [2013] found that the measurements cover different water vapor reservoirs. Thus, depending on the day of launch, either tropical or mid‐latitude air masses are measured at the Boulder station. Therefore, we suggest that for the soundings in the proposed water vapor network, a launching strategy should be developed that takes into account the local meteorology on the day of the measurement rather than following a simple (e.g., twice per month) schedule. Important components of a global network of balloon‐borne measurements of water vapor would be quality assurance and inter‐calibration of the measurements at individual stations. Here, one could follow the example of the worldwide ozonesonde network, where comparable challenges exist. In the case of ozonesondes, these problems have been addressed by the standardization of procedures for preparing the sondes and processing the sonde data and by an environmental simulation facility for regular calibration of the ozonesondes used in the network [Smit et al., 2007]. GRUAN provides an additional example of homogenized data processing and quality control by requiring all measurements from a specific instrument type to be processed by a central processing center.
This ensures that the measurements from different sites will receive the same treatment while being processed into final data products. It would be extremely valuable to augment the long‐term balloon network with satellite sensors, providing high vertical resolution (better than ≈2 km) measurements of water vapor in the UTLS on a time scale of decades. Such a high vertical resolution will very likely only be achievable through limb‐sounding satellite instruments. Satellite measurements would have the advantage of a good horizontal spatial coverage that cannot be achieved by a balloon sonde network. However, so far, it has proven very difficult to maintain satellite monitoring programs providing measurements at a high vertical resolution over periods of many years to decades [Urban et al., 2014]. Even if such a series of operational satellites existed, balloon‐borne water vapor measurements would provide a standard for long‐term stable calibration and a vertical resolution in the vicinity of the tropopause that cannot be achieved by satellite sensors. First steps have been taken to establish long‐term measurements of water vapor in the upper troposphere and lower stratosphere. In the IAGOS project, valuable water vapor measurements in the upper troposphere have been collected on board passenger aircraft since 1994 [Smit et al., 2014; Zahn et al., 2014; Petzold et al., 2015]. GRUAN [Seidel et al., 2009; Immler et al., 2010] aims at implementing a global network of about 30–40 measurement sites. Moreover, since 2005, in the Ticosonde program, balloon‐borne measurements of tropical water vapor in the UTLS have been obtained in San Jose, Costa Rica (10°N) [Fujiwara et al., 2010; Davis et al., 2015]. Frost point hygrometer measurements have also been conducted since 2004 in Lauder, New Zealand (45°S) and since 2010 in Hilo, USA (20°N) [Hurst et al., 2014; Davis et al., 2015].
Still, neither the balloon‐borne water vapor measurements in San Jose, Hilo, and Lauder nor the global GRUAN initiative currently has stable, long‐term funding. For a reliable assessment of the climatic effect of water vapor in the UTLS, greater effort is required. We suggest that a global, long‐term balloon‐borne measurement program for UTLS water vapor should be established. Stations should cover the entire globe with an optimized distribution of sites; the launching strategy should take into account the local meteorology; and the inter‐calibration of the different stations should be ensured. This program will likely have to be sustained for hundreds of years.
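To give a sense of the scale of such a commitment, the per‐launch figures quoted earlier in this section (about €7000 per launch, two flights per month, a 50% chance of recovering the payload) can be turned into rough bounds on the annual cost of disposables per station. The sketch below is a back‐of‐the‐envelope calculation only. The parameter reusable_fraction, the share of the launch cost that is saved when a payload is recovered, is a hypothetical quantity that is not given in the text, so the two calls simply bracket the possible range.

```python
def annual_disposables_cost(cost_per_launch_eur, launches_per_month,
                            recovery_chance, reusable_fraction):
    """Expected yearly cost per station in euros under the stated assumptions."""
    launches_per_year = 12 * launches_per_month
    expected_recovered = recovery_chance * launches_per_year
    # Assumption: each recovered payload saves this share of the launch cost.
    savings = expected_recovered * reusable_fraction * cost_per_launch_eur
    return launches_per_year * cost_per_launch_eur - savings


# Bounds for the figures quoted in the text.
upper_bound = annual_disposables_cost(7000, 2, 0.5, reusable_fraction=0.0)
lower_bound = annual_disposables_cost(7000, 2, 0.5, reusable_fraction=1.0)
```

Under these assumptions the bounds are €168 000 per year if nothing is reused and €84 000 per year if a recovered payload can be reflown at no extra cost. The quoted figure of about €90 000 lies close to the lower bound, which is consistent with most, but not all, of the launch cost residing in recoverable instruments.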

Acknowledgments

We are grateful to Andreas Petzold, Herman Smit, Michiel van Weele, and an anonymous reviewer for helpful comments on the paper. We thank J. Carter‐Sigglow for a grammatical and stylistic revision of the manuscript. This work was supported in part by the European Commission under grant number StratoClim‐603557‐FP7‐ENV.2013.6.1‐2. The data used in Figure 1 can be obtained by contacting the authors; data from the Boulder station (Figure 2) can be obtained from the following website: http://www.esrl.noaa.gov/gmd/dv/ftpdata.html.