In situ CO2 mole fraction observations

We use discrete (weekly) air samples from 105 sites and continuous (hourly) observations from 52 sites that are part of the global atmospheric surface CO2 observation network. These were taken from the Observation Package (ObsPack) obspack_co2_1_GLOBALVIEWplus_v2.1_2016_09_02 data product7 for 2015, and from obspack_co2_1_NRT_v3.3_2017-04-19 for 2016–20178; both datasets are produced by the National Oceanic and Atmospheric Administration (NOAA) Earth System Research Laboratory (ESRL).

Satellite observations of column CO2

We use column-averaged CO2 dry-air mole fraction (XCO2) data retrieved from the Japanese Greenhouse gases Observing SATellite (GOSAT) and the NASA Orbiting Carbon Observatory-2 (OCO-2). GOSAT11 was launched in January 2009 into a sun-synchronous orbit with an equatorial crossing time of 13:00 local time. We use two independent GOSAT XCO2 data products: v7.1 full-physics retrievals from the University of Leicester30 (UoL), and B7.3 retrievals from the NASA Atmospheric CO2 Observations from Space (ACOS31) activity. From OCO-2, we use 10-s averages over land of the bias-corrected XCO2 B7.1r data product32, the version currently used by the OCO-2 science team33,34.

Enhanced Vegetation Index

The Enhanced Vegetation Index (EVI) is a composite property of leaf area, chlorophyll and canopy structure35. We take EVI from MOD13C2 (MODIS/Terra Vegetation Indices Monthly L3 Global 0.05° CMG V006)36. We retain only pixels whose pixel reliability value indicates good data (0) or marginal data (1).
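The reliability screening described above can be sketched as follows. This is a minimal illustration, assuming the EVI field and its pixel reliability layer have already been read into NumPy arrays of the same shape; the function name and the toy values are hypothetical and not part of the MOD13C2 product.

```python
import numpy as np

def mask_evi_by_reliability(evi, reliability, keep_flags=(0, 1)):
    """Keep EVI only where the reliability flag is good (0) or marginal (1).

    Pixels with any other flag are set to NaN so that they drop out of
    later averages computed with NaN-aware functions.
    """
    evi = np.asarray(evi, dtype=float)
    reliability = np.asarray(reliability)
    keep = np.isin(reliability, keep_flags)
    return np.where(keep, evi, np.nan)

# Toy 2 x 2 example (values are illustrative only).
evi = np.array([[0.42, 0.15], [0.61, 0.30]])
reliability = np.array([[0, 2], [1, 3]])
masked = mask_evi_by_reliability(evi, reliability)
```

Setting rejected pixels to NaN rather than removing them keeps the grid shape intact, which simplifies later regridding or regional averaging.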

Gravity recovery and climate experiment

The Gravity Recovery and Climate Experiment (GRACE) provides information about changes in the water column37,38,39. Rooting depths of tropical terrestrial ecosystems are likely deep enough that we cannot establish a direct and immediate relationship between vegetation and changes in precipitation. Changes in gravity, due to changes in water column depth, are more strongly related to vegetation access to water. We use the surface mass change data based on the RL05 spherical harmonics from CSR (Center for Space Research at the University of Texas at Austin), JPL (Jet Propulsion Laboratory) and GFZ (GeoForschungsZentrum Potsdam). The three processing groups chose different parameters and solution strategies when deriving month-to-month gravity field variations from GRACE observations. We use the ensemble mean of the three data fields and multiply it by the provided scaling grid. Data are available from http://grace.jpl.nasa.gov.
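The ensemble-mean and scaling step can be written compactly. The sketch below assumes that the three solutions and the scaling grid are already on a common latitude-longitude grid and held as NumPy arrays; the function name and the numbers are hypothetical.

```python
import numpy as np

def grace_scaled_ensemble_mean(csr, jpl, gfz, scaling_grid):
    """Average the three GRACE solutions, then apply the scaling grid.

    All inputs are arrays of identical shape (for example lat x lon).
    """
    stack = np.stack([csr, jpl, gfz], axis=0)
    ensemble_mean = stack.mean(axis=0)
    return ensemble_mean * np.asarray(scaling_grid, dtype=float)

# Toy example with one row of two grid cells (illustrative values only).
csr = np.array([[1.0, 2.0]])
jpl = np.array([[2.0, 4.0]])
gfz = np.array([[3.0, 6.0]])
scaling = np.array([[1.5, 0.5]])
scaled = grace_scaled_ensemble_mean(csr, jpl, gfz, scaling)
```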

Formaldehyde columns

Formaldehyde (HCHO) columns are from the Ozone Monitoring Instrument40 (OMI) aboard the NASA Aura satellite, which was launched into a sun-synchronous orbit in 2004. We use the NASA OMHCHOv003 data product16 from the NASA Data and Information Services Center, which fits HCHO slant columns in the 328.5–356.5 nm window and accounts for competing absorbers, the Ring effect, and undersampling. HCHO is a high-yield product of hydrocarbon oxidation41,42. It is also emitted directly by incomplete combustion43,44. We use the active fire data product45 from the NASA Moderate Resolution Imaging Spectroradiometer (MODIS), derived from surface thermal IR anomalies, to isolate the pyrogenic HCHO signal.

Satellite observations of solar induced fluorescence

Satellite observations of solar induced fluorescence (SIF) are retrieved by the UoL from GOSAT measurements46. SIF is a by-product of plant pigments absorbing incoming sunlight as part of photosynthesis. Of the solar radiation absorbed, ~20% is eventually dissipated as heat and typically <1–2% is emitted as SIF in the range 650–800 nm, peaking at 685–690 nm and 730–740 nm. The GOSAT retrieval estimates SIF at 755 nm47. We use the GOSAT SIF data product as a crude measure of the photosynthetic capacity of regional ecosystems. We use a physically based retrieval scheme47 with a focus on the bias correction procedure, which has two stages. First, we isolate GOSAT measurements over non-vegetated areas using the ESA CCI Land Cover product V2.0.748 at 300 m resolution. Second, we apply a bias correction as an explicit function of time to ensure that instrumental effects are accounted for over the entire date range of the SIF product.
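The two-stage idea can be illustrated with a simple sketch. The text does not give the functional form of the time-dependent correction, so the code below makes a loud assumption: the bias in each month is taken to be the mean retrieved SIF over non-vegetated soundings in that month (where the true SIF should be close to zero), and it is subtracted from every sounding in the same month. The function name, the monthly binning and the values are hypothetical.

```python
import numpy as np

def correct_sif_bias(sif, month_index, is_non_vegetated):
    """Subtract a month-by-month offset estimated over non-vegetated scenes.

    ASSUMPTION: the time-dependent bias is approximated by the monthly mean
    SIF over non-vegetated soundings; the actual scheme may differ.
    """
    sif = np.asarray(sif, dtype=float)
    month_index = np.asarray(month_index)
    is_non_vegetated = np.asarray(is_non_vegetated, dtype=bool)

    corrected = sif.copy()
    for month in np.unique(month_index):
        in_month = month_index == month
        reference = in_month & is_non_vegetated
        if reference.any():
            corrected[in_month] -= sif[reference].mean()
        else:
            # No reference soundings: the offset cannot be estimated.
            corrected[in_month] = np.nan
    return corrected

# Toy example: two months, with a drifting offset (illustrative values).
sif = np.array([0.2, 0.1, 1.1, 0.4, 1.4])
month = np.array([0, 0, 0, 1, 1])
non_veg = np.array([True, True, False, True, False])
sif_corrected = correct_sif_bias(sif, month, non_veg)
```

In this toy case the estimated offsets are 0.15 for the first month and 0.4 for the second, so the two vegetated soundings become 0.95 and 1.0 after correction.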

Dry matter burned estimates

Dry matter (DM) burned estimates are taken from the Global Fire Emissions Database49 (GFED4). These estimates were derived by combining satellite remote sensing observations of burned area with active fire data from MODIS.

Atmospheric transport models and inverse methods

To describe the relationship between surface fluxes of CO2 and atmospheric CO2 we use three atmospheric transport models: (1) the GEOS-Chem global 3-D chemistry transport model50,51, v9.02; (2) the GSFC Parameterized Chemistry and Transport Model52 (PCTM); and (3) the Laboratoire de Météorologie Dynamique model (LMDZ), version LMDZ353.

We run GEOS-Chem with a horizontal resolution of 4° (latitude) × 5° (longitude), driven by the GEOS-5 meteorological analyses (GEOS-FP from 2013) from the Global Modeling and Assimilation Office (GMAO) Global Circulation Model based at NASA Goddard Space Flight Center. We run the model using 47 vertical terrain-following sigma-levels that describe the atmosphere from the surface to 0.01 hPa, of which about 30 are typically below the dynamic tropopause. We use well-established emission inventories as our a priori flux estimates: (1) weekly biomass burning emissions49; (2) monthly fossil fuel emissions54,55; (3) monthly climatological ocean fluxes56; and (4) three-hourly terrestrial biosphere fluxes57.

The GEOS-Chem model is used within an ensemble Kalman Filter (EnKF) framework18,58 to infer CO2 fluxes from ground-based or space-based measurements of atmospheric CO2. We use a total of 792 basis functions per month, split between 317 oceanic regions and 475 land regions. These regions are subdivisions of the 22 regions used in TransCom-39. We assume a 50% uncertainty for monthly land fluxes, and 40% for monthly ocean fluxes49. We assume that land (ocean) a priori flux errors are correlated with a correlation length of 500 (800) km. We assume no observation error correlations, but add 1.5 ppm to the reported observation errors to account for model transport errors. We determine the terrestrial biosphere flux by subtracting the fossil fuel and cement production emission estimate (FF) from the inferred total land flux. This is a common approach10,18,59, based on the assumption that knowledge of the FF flux is much better than that of the natural fluxes from the land and ocean.
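A prior error covariance consistent with these settings might be assembled as in the sketch below. Several choices are assumptions rather than statements from the text: correlations are taken to decay exponentially with great-circle distance between region centroids, and the fractional uncertainty is applied to the magnitude of the a priori flux. Function names, centroids and flux values are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def great_circle_km(lat_deg, lon_deg):
    """Pairwise great-circle distances (haversine formula) in km."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def prior_covariance(prior_flux, lat_deg, lon_deg, frac_uncertainty, corr_length_km):
    """Covariance = sigma_i * sigma_j * exp(-d_ij / L).

    ASSUMPTION: exponential decay with distance; the text gives only the
    correlation length, not the functional form.
    """
    sigma = frac_uncertainty * np.abs(np.asarray(prior_flux, dtype=float))
    corr = np.exp(-great_circle_km(lat_deg, lon_deg) / corr_length_km)
    return np.outer(sigma, sigma) * corr

# Three hypothetical land regions: 50% uncertainty, 500 km correlation length.
flux = np.array([2.0, -1.0, 0.5])
lat = np.array([0.0, 0.0, 10.0])
lon = np.array([0.0, 5.0, 0.0])
cov_land = prior_covariance(flux, lat, lon, frac_uncertainty=0.5, corr_length_km=500.0)
```

The same function would be called with a fractional uncertainty of 0.4 and a correlation length of 800 km for ocean regions.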

The LMDZ model is run at a regular horizontal resolution of 3.75° (longitude) × 1.875° (latitude), with 39 hybrid layers in the vertical. Winds are nudged towards the 6-hourly ECMWF reanalysis60 with a relaxation time of three hours. Fossil fuel burning emissions are taken from the ODIAC model54,55, including diurnal and day-of-week variability61. We also use monthly ocean fluxes56, three-hourly biomass burning emissions (GFED4.1s until 2015 and GFAS afterwards), and climatological three-hourly biosphere-atmosphere fluxes taken as the 1989–2010 mean of a simulation of the ORganizing Carbon and Hydrology In Dynamic EcosystEms model (ORCHIDEE62), version 1.9.5.2.

The LMDZ CAMS inversion tool currently generates the global CO2 atmospheric inversion product of the Copernicus Atmosphere Monitoring Service63,64. The minimum of the Bayesian cost function of the inversion problem is found iteratively using the Lanczos version of the conjugate gradient algorithm65. The inferred fluxes are estimated at each horizontal grid point of the transport model with a temporal resolution of eight days, separately for daytime and night-time. The state vector of the inversion system is therefore made of a succession of global maps with 9200 grid points. Per month it gathers 73,700 variables (four daytime maps and four night-time maps). It also includes a map of the total CO2 columns at the initial time step of the inversion window in order to account for the uncertainty in the initial state of CO2. Over land, the errors of the prior biosphere-atmosphere fluxes are assumed to dominate the error budget and the covariances are constrained by an analysis of mismatches with in situ flux measurements: temporal correlations on daily mean net ecosystem exchange (NEE) errors decay exponentially with a length of one month, but night-time errors are assumed to be uncorrelated with daytime errors; spatial correlations decay exponentially with a length of 500 km; standard deviations are set to 0.8 times the climatological daily-varying heterotrophic respiration flux simulated by ORCHIDEE, with a ceiling of 4 gC/m2/day. Over a full year, the total 1-sigma uncertainty for the prior land fluxes amounts to about 3.0 GtC/yr. The error statistics for the open ocean correspond to a global air-sea flux uncertainty of about 0.5 GtC/yr and are defined as follows: temporal correlations decay exponentially with a length of one month; unlike land, daytime and night-time flux errors are fully correlated; spatial correlations follow an e-folding length of 1000 km; standard deviations are set to 0.1 gC/m2/day. Land and ocean flux errors are not correlated.
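The size of the state vector and the land prior standard deviation can be checked with a few lines of arithmetic. The sketch assumes a regular grid with 360/3.75 longitudes and 180/1.875 latitudes; the actual model grid may treat the poles differently, which is why the text quotes rounded values. The function name and the respiration values are hypothetical.

```python
import numpy as np

# ASSUMPTION: a regular 3.75 x 1.875 degree grid with no special pole handling.
n_lon = round(360 / 3.75)    # number of longitudes
n_lat = round(180 / 1.875)   # number of latitudes
n_grid = n_lon * n_lat       # grid points per global map

# Four 8-day windows per month, each with a daytime and a night-time map.
maps_per_month = 4 + 4
n_flux_vars_per_month = maps_per_month * n_grid

def prior_land_sigma(heterotrophic_resp, scale=0.8, ceiling=4.0):
    """Prior standard deviation over land in gC/m2/day.

    Set to `scale` times the heterotrophic respiration flux, capped at `ceiling`.
    """
    resp = np.asarray(heterotrophic_resp, dtype=float)
    return np.minimum(scale * resp, ceiling)

# Illustrative respiration values (gC/m2/day); the second one hits the ceiling.
sigma_land = prior_land_sigma([1.0, 6.0])
```

Under this assumption the counts come to 9216 grid points and 73,728 flux variables per month, which round to the 9200 and 73,700 quoted in the text.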

PCTM is run at a horizontal resolution of 2.0° (latitude) × 2.5° (longitude) with 40 hybrid sigma levels in the vertical, driven by winds, surface pressure, and vertical mixing parameters from NASA MERRA2 reanalyses66. A priori fluxes for gross primary productivity, gross respiration, wildfires and biofuel emissions are taken from the CASA-GFED3 land biosphere model49,67,68. Fossil fuel burning emissions are taken from the ODIAC model54,55, including diurnal and day-of-week variability61, and air-sea CO2 fluxes are taken from three different sources: the NASA Ocean Biogeochemical Model (NOBM69), and two CO2 climatological flux products56,70.