In this section we discuss in greater detail the 27 indicators presented in Table 2, providing an overview of their origins, the repository where they are available, and their characteristics in relation to their assumed contribution to the food system sustainability metric. We then discuss a critical issue in relation to the construction of global composite maps, that is, the trade-offs between the number of countries and the number of indicators that can be included in those maps.

Food system sustainability datasets – an overview

The 27 indicators short-listed through steps 1–4 are listed in the fourth column (labelled ‘indicators’) of Table 2. Their detailed definitions, and where they can be retrieved, are provided in the Harvard Dataverse database “Sustainable food systems global index”36. For the Environmental dimension, seven indicators that satisfy the inclusion/exclusion criteria were identified from the literature. They cover five sub-dimensions of the environmental dimension: the quality of air, the quality and use of water, the quality and use of soils and land, the level of wildlife biodiversity and crop diversity, and the use of energy. For the Economic dimension of the sustainability metric, three indicators that satisfy all the inclusion/exclusion criteria were identified from the literature. They cover the financial performance, the level of employment, and the economic distribution of the wealth generated by the food system. Likewise, for the Social dimension, only three indicators satisfy all the inclusion/exclusion criteria. They cover gender equity and the degree of inclusion of the system (at both international and national levels). Finally, for the Food & Nutrition dimension, a richer set of indicators is available from the literature, and 14 indicators satisfying the inclusion/exclusion criteria were identified. They cover the four standard elements of food security (availability, access, utilization, and stability), plus food safety, food waste and use, and the four conventional dimensions of nutrition, that is, diet quality, undernutrition, overweight & obesity, and micronutrient deficiency.

The next column in Table 2, labelled “SR”, indicates the expected sign of the relationship between each individual indicator and the resulting level of sustainability. A positive (+) sign denotes cases where a positive relation is theoretically expected between the indicator under consideration and the overall level of sustainability. For instance, it is reasonable to assume that the higher the level of carbon in the soil, the higher the quality of the soil and the higher the sustainability of the system; likewise, the higher the diet diversity index, the better the quality of the diet and the higher the sustainability of the food system. Those indicators are therefore associated with + signs. In contrast, a negative (−) sign denotes cases where a high value of the indicator is expected to be associated with a low level of sustainability of the food system. Examples include the price volatility index and the prevalence of obesity. Overall, the SR column shows that all selected indicators have an expected monotonic relationship with the sustainability of the system. This is an important property, as it reduces the risk of the complications that non-monotonic relations would introduce for the interpretation of the global index. Note that, in that regard, the water pH data (capturing the water quality sub-dimension) have been transformed into the absolute value of the difference between the actual pH value and 7 (the reference value), so that the relationship for this dataset is also monotonic, with a negative SR sign.
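The pH transformation and the role of the SR sign can be illustrated with a minimal sketch. The function names below are hypothetical, and the sign-flipping rule in `align_direction` is an illustrative assumption about how an SR sign might be applied, not a description of the procedure actually used to build the metric.

```python
def ph_deviation(ph: float, reference: float = 7.0) -> float:
    """Absolute distance of a pH reading from the reference value (7).

    A raw pH value is non-monotonic with respect to water quality
    (both acidic and alkaline water are undesirable), whereas the
    deviation is monotonic: the larger it is, the lower the quality.
    """
    return abs(ph - reference)


def align_direction(value: float, sr_sign: str) -> float:
    """Orient a value so that larger always means 'more sustainable'.

    Assumption for illustration only: indicators with a negative SR
    sign are simply negated; indicators with a positive sign are kept.
    """
    if sr_sign == "+":
        return value
    if sr_sign == "-":
        return -value
    raise ValueError(f"unknown SR sign: {sr_sign!r}")


# Acidic and alkaline readings at the same distance from 7 are treated alike.
acidic = ph_deviation(6.2)
alkaline = ph_deviation(7.8)

# The transformed pH carries a negative SR sign, so a larger deviation
# maps to a lower oriented value.
oriented_small = align_direction(ph_deviation(7.1), "-")
oriented_large = align_direction(ph_deviation(5.5), "-")
```

With this convention, water close to neutral scores higher than strongly acidic or alkaline water, which is the monotonic behaviour the SR column requires.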

The next column, labelled “DP”, indicates the Degree of Proxy with respect to the food system, that is, whether the indicators included in the metric capture the process they are expected to measure in a comprehensive manner, or only part of it. For instance, the indicator used to reflect the degree of gender equity is the index “Female employment rate in agriculture” currently compiled by the World Bank, based on national statistics37. This index captures gender equity in the agriculture sector only. In its current form, it therefore says nothing about the situation in the other sectors of the food system, such as processing, retailing or distribution. This means that the indicator currently available for gender equity is only a proxy for the whole food system. As such, it is associated with a “P” (for ‘partial’) in the DP column of Table 2. In contrast, the level of biodiversity, which by definition concerns the pre-production sector, is captured adequately in the metric by the biodiversity index as computed by the Global Environment Facility38. This indicator can therefore be considered as covering comprehensively the part of the system concerned with this specific issue, and it is associated with a “C” (for ‘comprehensive’) in the DP column.

The overall proportion of Ps and Cs in the DP column provides a qualitative indication of the level of ‘coverage’ offered by the indicators that were found in the literature. As far as Food & Nutrition is concerned, the situation is relatively satisfactory, since all the indicators are characterised as C-indicators. The situation of the Environmental dimension is more mixed, with 5 Cs and 2 Ps. On the other hand, the Social and Economic dimensions of the metric are, at the present time, only partially captured: both dimensions are represented only by P-indicators. This partial coverage is mainly due to the fact that all the indicators currently available at a global level capture social or economic aspects of the agriculture sector only; they do not include information on the other sectors that are also part of food systems, such as transport, distribution, and transformation.

The next column in Table 2 indicates the original sources from which the indicators were retrieved. The large majority come from UN agencies, in particular the Food and Agriculture Organization, which generally collect data from their member countries’ national statistics. The exceptions are (i) the data on the number of fair trade organizations and producers, compiled by the NGO Fairtrade International, (ii) the estimated travel time to the nearest city, made available by the European Commission, (iii) the Price volatility index computed by the International Center for Tropical Agriculture (CIAT), and (iv) the Crop diversity index39. In the last two cases (price volatility and crop diversity), however, the datasets used to compute those higher-level indicators were themselves derived from UN-FAO datasets.

Last, on the right-hand side of Table 2, are the columns that indicate the time period and the number of countries for which these different datasets are available. They show that all datasets cover the period 2000–2017 of interest to us, and that (at the present time) the dataset with the lowest number of countries is the rate of under-employment in agriculture (72 countries), while the indicator with the largest number of countries and territories is the travel time (currently computed for 245 countries and territories).

Countries – optimizing indicator coverage

One critical issue, albeit rarely discussed in papers dealing with the construction of global metrics, is the trade-off that exists between the number of countries included in the analysis and the number of indicators used to build the metric. It is important to understand that each indicator in the metric is available for a specific subset of countries and that those countries are not always the same across indicators. For instance, although the FAO per capita food supply variability index and the Predominant fair trade organizations and producers dataset constructed by Fairtrade International cover a very similar number of countries (162 and 160 respectively), only 118 countries are common to the two datasets. The implication is that it is not possible to maximize the two dimensions of the metric (countries and indicators) at the same time, and a choice (trade-off) has to be made. In the present case, for instance, the maximum number of countries for which at least one indicator is available in each of the four dimensions of the metric is 164. At the other extreme, if we want to retain all 27 indicators initially identified, only 16 countries have complete datasets for all 27 indicators. This situation creates a ‘trade-off frontier’, displayed in Fig. 4. In parallel, Fig. 5 shows the two extreme scenarios mentioned above: the maximum number of countries (164) for which at least one indicator is available in each of the four dimensions of the metric (Fig. 5a); and the set of 16 countries for which data are available for all 27 indicators (Fig. 5b).

Fig. 4 Trade-off ‘frontier’ between the number of countries for which the datasets of indicators are complete and the number of indicators included in the metric. The frontier shows that the larger the number of indicators considered, the smaller the number of countries for which those indicator datasets are complete, and vice versa.
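The country/indicator trade-off described above comes down to set intersections, and a small sketch may help make it concrete. Everything in the example is a toy assumption: the indicator keys, the country codes A to E, the dimension mapping, and the exhaustive search over indicator subsets. It is not the procedure used to produce Fig. 4, and an exhaustive search of this kind would become expensive with 27 indicators.

```python
from itertools import combinations

# Hypothetical availability data: indicator -> countries with data.
availability = {
    "food_supply_variability": {"A", "B", "C", "D"},
    "fair_trade_producers": {"B", "C", "D", "E"},
    "soil_carbon": {"A", "B", "C"},
    "agri_underemployment": {"B", "C", "D", "E"},
    "crop_diversity": {"D", "E"},
}

# Hypothetical mapping of each indicator to one of the four dimensions.
dimension_of = {
    "food_supply_variability": "food_nutrition",
    "fair_trade_producers": "social",
    "soil_carbon": "environment",
    "agri_underemployment": "economic",
    "crop_diversity": "environment",
}


def complete_countries(indicators, availability):
    """Countries that have data for every indicator in `indicators`."""
    sets = [availability[name] for name in indicators]
    return set.intersection(*sets) if sets else set()


def tradeoff_frontier(availability):
    """For each subset size k, the largest number of countries with
    complete data over any choice of k indicators (exhaustive search)."""
    names = sorted(availability)
    return {
        k: max(
            len(complete_countries(subset, availability))
            for subset in combinations(names, k)
        )
        for k in range(1, len(names) + 1)
    }


def countries_covering_all_dimensions(availability, dimension_of):
    """Countries with at least one indicator available in every dimension."""
    per_dimension = {}
    for name, countries in availability.items():
        per_dimension.setdefault(dimension_of[name], set()).update(countries)
    return set.intersection(*per_dimension.values())


frontier = tradeoff_frontier(availability)
broadest = countries_covering_all_dimensions(availability, dimension_of)
strictest = complete_countries(availability.keys(), availability)
```

In this toy data, `frontier` never increases as indicators are added, `broadest` plays the role of the scenario in Fig. 5a (at least one indicator per dimension), and `strictest` plays the role of the scenario in Fig. 5b (all indicators complete).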