Study design

In-situ observations of MAGT and active layer (seasonally thawed surface layer atop permafrost) thickness (ALT), geospatial environmental data, and a statistically based ensemble method were used to model the current (2000–2014) and future MAGT and ALT in the land areas north of 30 °N degrees latitude at 30 arc-second resolution23 (Supplementary Fig. 2). Whereas the ground thermal regime has commonly been examined using mechanistic transient models34, an approach based on statistical associations between dependent and predictor variables was used here22,23. This is because statistical models are computationally more cost-efficient than mechanistic models (which currently have limited high-resolution applicability in hemisphere scale investigations), and can account for variables related to topography and land cover that could be difficult to otherwise parametrize. Moreover, we focused on near-surface permafrost conditions (<15 m depth). Near-surface ground temperatures are strongly coupled with average atmospheric conditions, and are likely to adapt to prevailing climate conditions within a few years35. Cold permafrost (<−5 °C) sites and sites with low ice content and thin organic material layer normally have no substantial time lag between atmospheric forcing and ground temperature response at these depths36. On the contrary, ice-rich sites close to the melting point can experience a more substantial time lag; however, these sites have high potential for ground subsidence regardless of temperature development because of an abundance of excess ice in the soil37. More information on the strengths and weaknesses of statistical techniques in analysing permafrost in a changing climate can be found in refs 22,23.

Described below are: first, data for the modelling of MAGT and ALT, and infrastructure information; second, statistical analyses of MAGT and ALT; third, development and preparation of geohazard indices; and fourth, infrastructure hazard computations. Data and methods related to MAGT and ALT modelling are presented briefly and more information is presented in ref. 23.

Ground temperature and active layer data

Standardized observations of MAGT (n = 797) and ALT (n = 303) were compiled from the Global Terrestrial Network for Permafrost (GTN-P) database (gtnpdatabase.org)38 and additional datasets for a recent 15-year period (2000–2014). MAGT observations at or near (the closest to) the depth of zero annual amplitude (ZAA, annual temperature variation < 0.1 °C)3 were utilized. Whenever it was evident from source data or after a careful examination of the borehole location that a disturbance (e.g., the effect of geothermal heat on temperature−depth curves, recent forest fire, large water body, or anthropogenic heat source) had an effect on the ground thermal regime, the observation was excluded.

Geospatial environmental data

Physically relevant climate, ground material, water body, and topographic environmental (i.e., predictor) variables were used in the statistical analysis of MAGT and ALT23. All the data layers were sampled to the same spatial resolution (30 arc-seconds) and extent (north of 30 °N latitude). The high-resolution WorldClim data39 with a temporal adjustment23 was used to compute freezing (FDD) and thawing (TDD) indices (°C days), and precipitation (mm) in solid (precipitation sum for months below 0 °C; Prec T ≤ 0 °C ) and liquid form (precipitation sum for months above 0 °C; Prec T > 0 °C ). The creation and accuracy of the interpolated climate data are fully described in ref. 39. In brief, weather stations (n = 24,542) behind the data have relatively even spatial coverage (excluding Greenland) and the temperature and precipitation records have passed a quality control scheme. The production of the climate surfaces is based on spline interpolation where the spatial variation in average air temperature and precipitation sums were modelled as a function of latitude, longitude, and elevation. In general, the errors between the observed and the interpolated values were small, <0.3 °C for air temperature and mostly <5 mm for precipitation, when averaged over 12 months. Climate data for future conditions are based on downscaling of multiple global climate models (GCM) from the Climate Model Intercomparison Project Phase 5 database. The GCM outputs (15 models) have been downscaled and bias-corrected for several emission scenarios (RCP2.6, RCP4.5 and RCP8.5) using the WorldClim data for current conditions as a baseline. The GCM data used here are available alongside the data for baseline conditions in ref. 39. To control for inter-model variability in the analyses, ensemble averages over the GCM output were used for each time step and RCP scenario. Owing to a lack of suitable snow cover data for the RCP scenarios, the effect of snow cover on the ground thermal regime40 was indirectly illustrated with monthly precipitation−temperature-derived information. To account for the effect of organic material on the ground thermal regime37, soil organic carbon (SOC; g kg−1) content information was included from global SoilGrids1km data41. A global Water Bodies product (v 4.0) at 150 m resolution42 allowed us to determine the percentage cover of water bodies inside each 30 arc-second grid cell. Topography-derived solar radiation was computed using the NASA Shuttle Radar Topography mission (SRTM) digital elevation model (DEM) at a 30 arc-second spatial resolution43. ArcGIS 10.3 software was used to derive potential incoming solar radiation (PISR; MJ cm−2 year−1) for each grid cell44.

Infrastructure, population and hydrocarbon extraction fields

OpenStreetMap (OSM) data (www.openstreetmap.org) were the main source of infrastructure information. Considering spatial accuracy, the quality of these data is comparable to commercial geodata products45,46. Our goal of high-resolution (<1 km) analysis in this study demanded first-order positional accuracy for all data used. We extracted, merged, and reclassified buildings, roads, railways, industrial areas, and populated settlements from national and sub-national OSM data packages provided in ESRI Shapefiles by GeoFabrik47. Polygon footprints of buildings were converted to point features to examine their number within hazard zones. With roads, we adopted the reclassification method used by the International Transport Forum48 and chose only five top-tier road types. This was assumed to reduce the risk of data quality discrepancies between regions. We discarded winter roads whenever the seasonality of a road segment was evident from the data attributes. Tunnels and ruined or non-existent elements of roads, railways, and buildings were similarly excluded. Elements under construction, however, were included in all cases with the assumption that they will be in use during the hazard assessment periods. Industrial areas represent a single class in OSM land-use data, including areas and buildings used for industrial purposes (www.openstreetmap.org). Populated settlements are an excerpt from OSM places map feature49, which consists of a hierarchical set of locations with name and accompanying attributes. We included all types of populated settlements, encompassing isolated clusters of only a few houses to large cities.

The WikiProject Oil and Gas infrastructure served as a source for information about oil and gas pipelines50. The coverage and accuracy of pipeline data may vary regionally but we are not aware of any more consistent publicly available global product with comparably high spatial accuracy. In addition to OSM data, we included aviation transportation infrastructure, with ongoing global updating of data on locations of airports and airfields (OurAirports.com). Based on data attributes, currently inactive airports and seaplane bases were excluded. Human population for the year 2015 was determined from the Gridded Population of the World (GPW version 4) data in 30 arc-second resolution51. Population count estimates were based on national population censuses, which have been further adjusted to match 2015 Revision of United Nation World Population Prospects country totals51. As we are not aware of any population projection that would match with the extent of our study area, resolution, and periods analysed, we consider that using current population counts is the safest way to address human exposure to future hazards, even though changes in population, as well as in infrastructure, are probable but subject to (unpredictable) near-future socio-economic development. Hydrocarbon extraction fields in the Russian Federation were extracted in polygon shapefiles from Rosnedra’s online map interface (gis.sobr.geosys.ru). Prior to hazard computations spatially overlapping polygons were merged.

Geospatial data quality encompasses many aspects, e.g. location accuracy, completeness of elements or their attributes, which in the case of OSM have been extensively studied predominantly in highly developed areas45,46, whereas in remote regions these evaluations are scarce. The circumpolar applicability of OSM road and railway data has been demonstrated by their previous use in the production of global 100-m resolution grids for socio-economic/population (WorldPop Project52) and global travel time to cities mapping at 1 km resolution53. Reference 54 estimated that in 2015 the global OSM road network was ~83% complete albeit between-country differences were identified. Here, we included only five top-tier road types, as opposed to smaller roads included in their analysis54, which was assumed to reduce the risk of data quality discrepancies between regions48. Apart from roads, very few global-scale evaluations of the OSM data have been performed. Moreover, no systematic framework to evaluate OSM data yet exists55.

According to our calculations, the total length of WikiProject pipelines in Russia (baseline permafrost conditions) is ~5% greater than those in the federal Rosnedra database (gis.sobr.geosys.ru). This is attributed to higher spatial resolution and a more detailed presentation of pipeline networks within communities and oil/gas fields. Compared to the documented lengths of a few central pipeline systems (including non-permafrost areas), the data encompass 99.8% of TAPS (1285/1288 km, akpipelinesafety.org), 98.9% of ESPO (4702/4756 km, energybase.ru), 93.1% of Urengoy–Pomary–Uzhgorod pipeline (4142/4451 km) and 76.2% of Bovanenkovo–Ukhta–Torzhok (2009/2637 km), suggesting that they are geospatially mostly complete and accurate.

The analyses involving buildings presented here are preliminary, as the number of OSM buildings across our modelled permafrost domain was obviously much less than the actual number. Moreover, region-specific differences exist. A simple people-per-building -ratio (regional population divided by number of buildings) was calculated to provide a rough estimate of the validity of building counts in the geographical regions under consideration. Eurasia and North America had reasonable ratios, 23 and 12, respectively, while for central Asian mountains a ratio of nearly 700 indicated that a large number of buildings could be missing. Urban settlements, which contain the majority of buildings, had good coverage, while in some of the smaller populated places infrastructure may not have been mapped. In the context of this study, which includes all settlements ranging from isolated dwellings to cities, it is important to take into account the maximum extent of human activities. This was achieved with the OSM places map feature, which includes ca. 10 times more populated settlements than analogous open datasets (e.g., the Global Rural Urban Mapping Project, Naturalearthdata.com—Populated places) across the modelled permafrost region.

Statistical analysis

Observed MAGT and ALT were related to geospatial environmental variables using four statistical modelling techniques: generalized linear model56, generalized additive model57, generalized boosting method58 and random forest59. The detailed information of the model calibration and evaluation are provided in ref. 23.

The uncertainty of the model predictions (present and future) was assessed by producing 1000 ensemble predictions on 100,000 randomly chosen grid cells (glaciers masked out), at each time using a 70% random sample of observations60. The 95% prediction intervals for each cell over the 1000 replications were then calculated. The final uncertainty was considered as the 95th percentile of the prediction intervals across all 100,000 locations60. This procedure resulted in baseline prediction uncertainty of ±0.77 °C (MAGT) and ±37 cm (ALT) (Supplementary Fig. 5). For ALT, the spatial domain of the uncertainty analysis was limited to the modelled permafrost areas (i.e., MAGT ≤ 0 °C).

In the baseline period, permafrost was modelled to affect 15.1 ± 2.1×10−6 km2 (95% uncertainty range) and decreased by 39.5% to 9.1×10−6 km2 (7.5–11.2×10−6 km2) by the middle of the century61. These results are comparable with those presented recently62,63,64. However, an explicit comparison of the results of this study and the previous studies is difficult because of the differences in the spatial resolution of analyses (our ~1 km vs. common >100 km), extent of the study domain (e.g., our circumpolar vs. regional studies in Alaska, Siberia, Arctic Canada and Tibetan Plateau) and differences in basic settings in the analyses (e.g., depth of soil column considered, input parameters and baseline/projection periods).

Geohazard indices

We formulated a total of four indices (settlement index, risk zonation index, analytic hierarchy process (AHP) based index and a consensus of the former indices) depicting zones of hazard potential for infrastructure for periods 2041–2060 and 2061–2080 under three RCPs. First, the settlement index (I s ) was computed using the formula8:

$$I_{\mathrm s} = \Delta Z_{{\mathrm {ALT}}} \times V_{{\mathrm {ice}}},$$ (1)

where ∆Z ALT is the relative increase of ALT, and V ice is the volumetric proportion of excess ground-ice. We used volumetric ground-ice content (GIC) data65 and ALT modelling results23. Prior to index formulation, gridded (12.5 km resolution) GIC data were re-projected and class-specific values (5, 15 and 35%) were assigned for the original class intervals (0–10, 10–20, and over 20%). Resulting I s values were logarithmically transformed, and then reclassified into three classes using a nested-means procedure66 with the two lower classes combined for a conservative estimate8,67.

Second, the risk zonation index (I r )20 was developed using two-class data on surface properties (sediment/bedrock), frost susceptibility of ground material (high/low), ground-ice (high/low) and permafrost thaw potential (high/low). Assuming that permafrost thaw has a minor effect on exposed bedrock in the context of engineering, bedrock was directly assigned to the low-risk class20. To delimit exposed bedrock areas at 30 arc-second resolution, we used data on global soil thickness68. We used SoilGrids1km data41 to produce a variable separating coarse and fine sediments with varying frost susceptibility (high susceptibility = silt and finer particles, low susceptibility = sand and coarser particles). GIC65 was used to determine high (10–20% or more) and low (0–10%) ice content. Reference 20 developed a concept of permafrost thaw potential to describe the active layer depth increase between the present time and a future scenario (high > 2.5 m ≥ low). Owing to the greater uncertainties in ALT than in MAGT results23, we determined permafrost thaw potential using MAGT predictions (high potential = MAGT changed from ≤0 °C to >0 °C at the depth of ZAA; low potential = MAGT remained ≤0 °C at the depth of ZAA). To compile a three-class risk zonation (comparable to the other indices) we followed the decision flow diagram20 but merged the two classes of lowest risk (low risk and limited risk) together.

Third, AHP was used to produce a hazard index (I a ) that considers different factors with varying weights. The AHP developed is a multi-objective, multi-criteria decision-making approach to analyse complicated problems such as the determination of relative roles of factors affecting geohazards69,70. AHP requires the creation of a reciprocal pair-wise comparison matrix used to determine weighted coefficients for the computation of geohazard index. Entries into the matrix are determined by comparing each factor based on a 9-point rating scale71. If the factors are of equal importance for the final solution, a value of 1 is given, whereas 9 expresses the extreme importance of one criterion over another72.

We considered the following five factors in our circumpolar-scale AHP analysis: relative increase of ALT23; GIC; ground temperature (including permafrost thaw); fine-grained sediment content in the ground; and slope gradient. The relative importance of each variable was ranked using expert knowledge72 (see e.g., refs 21 and 70 for the application of AHP in geohazard context). Here, the ground temperature and thaw of near-surface permafrost was considered to be the most important factor affecting infrastructure hazard followed by GIC, relative increase of ALT, fine-grained sediment content and slope gradient. Using the expert judgement on the relative importance of factors and a reciprocal pair-wise comparison matrix we computed weighted coefficients for each factor (the resulting coefficients are shown in Eq. 2). However, the pair-wise comparison is subjective and the quality of the results is dependent on the expert’s judgement. To evaluate expert valuation, a consistency ratio was used to show the probability that the assessment matrix was randomly generated69. In a successful expert judgement, the consistency ratio should be ≤0.1 (ref. 69). In our AHP, the consistency ratio was 0.09, indicating acceptable assessment.

To compute I a the ground temperature, GIC, ALT, fine sediment content, and slope gradient variables were first classified into three classes (note that the GIC was originally a three-class factor) based on their corresponding contribution to infrastructure hazard in permafrost-controlled environment21 (3 = high hazard, 2 = moderate hazard, 1 = low hazard). The ground temperature factor was produced by reclassifying the MAGT predictions. The highest hazard value was assigned to areas where near-surface permafrost thaws (comparable to the permafrost thaw potential above). The areas where MAGT stays between –3 °C and 0 °C were considered moderate hazard areas, whereas areas with MAGT < –3 °C represented the lowest hazard level73. Due to the lack of applicable hazard-related threshold values the nested-means approach66 was utilized to classify the numerically continuous ALT, fine-grained sediment and slope variables into three-class factors. Fine-grained sediment content was derived from the SoilGrids1km41 and slope gradient from the 30 arc-second DEM43. The hazard potential of ALT, fine sediment and slope factors was considered to increase with increasing thaw depth, fine-grained sediment content, and slope inclination, respectively. We used the coefficients and three-class raster layers to compute the AHP-based geohazard index (I a ):

$${\begin{array}{l}I_{\mathrm a} = \left( {{\mathrm{ground}}\,{\mathrm{temperature}} \times 0.525} \right) + \left( {{\mathrm{GIC}} \times 0.248} \right) + \left( {{\mathrm{relative}}\,{\mathrm{increase}}\,{\mathrm{of}}\,{\mathrm{ALT}} \times 0.122} \right)\\ +\, \left( {{\mathrm{fine-grained}}\,{\mathrm{sediment}}\,{\mathrm{content}} \times 0.071} \right) + \left( {{\mathrm{slope}}\,{\mathrm{gradient}} \times 0.035} \right)\end{array}}.$$ (2)

The value of I a ranged from 1.0 (lowest hazard potential) to 3.0 (highest hazard potential). To achieve a three-fold classification comparable to the I s and I r , we used the nested-means procedure66. Finally, due to the different strengths and weaknesses of the I s , I r and I a we computed a consensus index (I c ) using information from all the three indices. For example, I s considers two important factors causing ground subsidence in permafrost areas. However, I s does not include information on permafrost thaw or other potential hazard-inducing factors and in our study, I s was based on relatively coarse-scaled GIC data. In contrast, I a considers different affecting factors and introduces case-specific expert knowledge into the process but represents a relatively complex and subjective approach in the development of an index for infrastructure hazard assessment. Thus, I c was computed using a majority vote procedure with ArcMap’s Cell Statistics tool. Whenever two or three of the indices shared a hazard value in a grid cell, the value was recorded to represent consensus. In draw situations, i.e., all three indices had different values, a moderate hazard value of 2 was forced manually. As a result, the high hazard zone comprised 13.8%, medium hazard 41.3% and low hazard 44.9% of the study area in a medium stabilization scenario RCP4.5 (ref. 25) for 2041–2060.

Basically, the comparison of spatial patterns of our geohazard results61 (I s , I r , I a , and I c ) to the previous studies17,19,20,21,67 is challenging owing to the geographical, scale, and methodological differences between the studies (see above). The main patterns of our indices and the results published in refs 17,]67 are comparable, although local differences exist.

Infrastructure hazard computations