Abstract Low crop yields in Sub-Saharan Africa are associated with low fertilizer use. To better understand patterns of, and opportunities for, fertilizer use, location-specific fertilizer price data may be relevant. We compiled local market price data for urea fertilizer, a source of inorganic nitrogen, in 1729 locations in eighteen countries in two regions (West and East Africa) from 2010–2018 to understand patterns in the spatial variation in fertilizer prices. The average national price was lowest in Ghana (0.80 USD kg⁻¹), Kenya (0.97 USD kg⁻¹), and Nigeria (0.99 USD kg⁻¹). Urea was most expensive in three landlocked countries (Burundi: 1.51, Uganda: 1.49, and Burkina Faso: 1.49 USD kg⁻¹). Our study uncovers considerable spatial variation in fertilizer prices within African countries. We show that in many countries this variation can be predicted for unsampled locations by fitting models of prices as a function of longitude, latitude, and additional predictor variables that capture aspects of market access, demand and environmental conditions. Predicted within-country urea price variation (as a fraction of the median price) was particularly high in Kenya (0.77–1.12), Nigeria (0.83–1.34), Senegal (0.73–1.40), Tanzania (0.90–1.29) and Uganda (0.93–1.30), but much lower in Burkina Faso (0.96–1.04), Burundi (0.95–1.05), and Togo (0.94–1.05). The correlation coefficient of the country-level models ranged from 0.17 to 0.83 (mean 0.52) and the RMSE varied from 0.005 to 0.188 (mean 0.095). In 10 countries, predictions were at least 25% better than a null-model that assumes no spatial variation. Our work indicates new opportunities for incorporating spatial variation in prices into efforts to understand the profitability of agricultural technologies across rural areas in Sub-Saharan Africa.

Citation: Bonilla Cedrez C, Chamberlin J, Guo Z, Hijmans RJ (2020) Spatial variation in fertilizer prices in Sub-Saharan Africa. PLoS ONE 15(1): e0227764. https://doi.org/10.1371/journal.pone.0227764
Editor: Gerald Forkuor, United Nations University Institute for Natural Resources in Africa, GHANA
Received: July 11, 2019; Accepted: December 27, 2019; Published: January 14, 2020
Copyright: © 2020 Bonilla Cedrez et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data are available in: (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/E0EHLO).
Funding: Funding for this project was provided by the Feed the Future Sustainable Intensification Innovation Lab (SIIL) through the USAID (grant # AID-OOA-L-14-00006), by the Bill and Melinda Gates Foundation through the Taking Maize Agronomy to Scale in Africa (TAMASA) project (grant OPP1113374), and the MAIZE CGIAR Research Program led by the International Maize and Wheat Improvement Center (CIMMYT).
Competing interests: The authors have declared that no competing interests exist.

Introduction Crop yields in Sub-Saharan Africa (SSA) are generally very low [1, 2]. For example, the average cereal yield in SSA is 1266 kg ha⁻¹ while the global average is 3745 kg ha⁻¹ [3]. There are several factors contributing to this, but low soil fertility stands out and the increased use of inorganic fertilizers could strongly increase crop yields [4, 5, 6, 7]. The average application rate of inorganic fertilizer on arable land in Sub-Saharan Africa is 14 kg ha⁻¹, much lower than the 141 kg ha⁻¹ in South Asia, 154 kg ha⁻¹ in the European Union, 175 kg ha⁻¹ in South America, and 302 kg ha⁻¹ in East Asia [3]. A number of hypotheses have been proposed to explain the low usage of inorganic fertilizer in SSA: (a) on many African soils, crops do not respond well to fertilizer, perhaps because of a low soil organic matter content or soil degradation [8, 9, 10, 11, 12]; (b) farmers are not aware of, or do not believe in, the utility of inorganic fertilizer [13, 14, 15]; (c) farmers have a cash-flow problem, and need better access to credit to buy fertilizers [13, 16, 17]; (d) variability in rainfall makes it too risky to invest in fertilizers, and insurance programs may be needed to support fertilizer use [5, 18]; and (e) after accounting for agronomic responses to fertilizer, local input and output prices are such that fertilizer investments are insufficiently profitable for many farmers [18, 19, 20, 21]. It is challenging to evaluate these hypotheses, because each one of them may be true in some locations or under certain conditions, and because the location-specific data needed for analysis are not readily available. For example, to evaluate the economic returns to fertilizer in a particular location, we need to understand crop responses to fertilizer under local conditions, as well as the effective price of fertilizer, and the price that a farmer could get for the crop products. 
In this paper we focus on one particular data gap: the spatial variation in the price of fertilizer. While national level fertilizer prices may be available, we need to consider the extent to which prices vary within countries, reflecting transportation costs and other factors [22, 23, 24, 25]. In the absence of such data, simplifying assumptions are made that may not be valid. For example, evaluations of the returns to production technologies in developing country settings have often assumed spatially invariant input and output prices (e.g., [26, 27]) or modeled spatial variation in prices based on assumptions about price variation as a function of distance to markets (e.g. [24, 28, 29]). An obstacle to using empirical data on sub-national variation in fertilizer prices is the scarcity of such data. A few studies have utilized price variation obtained from averaging price survey data over arbitrary sub-national boundaries, such as districts or regions (e.g. [18, 30, 31]), but, to our knowledge, there has not been an attempt to systematically describe the variability of fertilizer prices within countries and our ability to estimate the price at unsampled locations. To address this gap, we compiled local fertilizer price data from eighteen countries in SSA. The main goal of this paper was to describe price variation within and between countries and to use spatial interpolation models to predict local prices of urea, a commonly used source of inorganic nitrogen. In addition to the main goal of investigating whether we can interpolate price data, we analyzed the relationship between the prices of urea and other fertilizer types and between subsidized and non-subsidized fertilizer prices. Our models enable us to predict prices at locations for which there are no empirical data available. The amount of sub-national price variation, and the performance of our interpolation models, differs strongly across countries in our sample. 
In countries where there is consistent regional price variation, we show that interpolation methods can be used to estimate unobserved local prices. The spatial price predictions (“price maps”) that are generated by our approach can be used to improve empirical research on technology use (by more accurately approximating local input and output prices faced by farmers), and can be incorporated into targeting and planning frameworks for agricultural investments (e.g. by targeting promotion efforts to areas where technologies are most profitable).

Materials and methods Price data We compiled fertilizer price data for eighteen countries in West and East Africa from two data sources: (1) Africa Fertilizer [32] and (2) the Living Standards Measurement Study-Integrated Surveys on Agriculture (LSMS-ISA, henceforth LSMS) [33]. Africa Fertilizer reports prices over time for major towns in different countries. Prices are for 25 kg or 50 kg bags and expressed in the national currency. We compiled 7823 observations for 878 locations in seventeen countries: Benin, Burundi, Burkina Faso, Côte d’Ivoire, Ghana, Kenya, Mali, Malawi, Mozambique, Niger, Nigeria, Rwanda, Senegal, Togo, Tanzania, Uganda, and Zambia from 2010–2018. We used the town name to assign geographic coordinates to each location. Africa Fertilizer reported non-subsidized and subsidized prices for several countries. LSMS is a nationally representative multi-topic household survey implemented in eight countries in SSA: Burkina Faso, Ethiopia, Malawi, Mali, Niger, Nigeria, Tanzania, and Uganda [33]. For Ethiopia, Mali, and Malawi, prices were reported as the total amount paid for a quantity of fertilizer purchased, expressed in the national currency. We used that information to calculate the price of fertilizer per kg. For Nigeria and Tanzania, the questions about the purchased fertilizer (i.e. type, source, and amount used) were at the plot level, while the reported price was for the total quantity of purchased fertilizer. To compute the fertilizer price, we assumed that the amount of fertilizer used on all plots was equal to the total amount purchased by the farmers. We did not use the LSMS data for Burkina Faso and Uganda, as the type of inorganic fertilizer purchased was not specified. We did not use the data for Niger as the locations of the households were not provided, and there were only 13 households that reported a price for urea, which is our main focus in this paper. 
Different types of fertilizer are reported by LSMS and Africa Fertilizer depending on the country, e.g. urea, calcium ammonium nitrate, diammonium phosphate, and ‘NPK’ (in the case of LSMS without specifying the N, P or K content). For this study, we focus on the urea price because urea is the product with the most observations across all the countries studied and because it is an unambiguous product (unlike, for example, fertilizers labeled as “NPK”, which may vary widely in composition). Urea [CO(NH₂)₂] is the world’s most widely used form of inorganic nitrogen fertilizer. It contains 46% N, and no other plant nutrients. We compared the price of urea with that of “NPK” (pooling all the ‘NPK’ formulations reported by Africa Fertilizer), CAN (calcium ammonium nitrate), and DAP (diammonium phosphate). CAN is also known as nitro-limestone. The chemical formula is variable (e.g. [NH₄NO₃ + CaCO₃·MgCO₃]) but in general it contains 8% of Ca and 21–27% of N as [NO₃⁻] and [NH₄⁺]. DAP [(NH₄)₂HPO₄] is the world’s most widely used phosphorus fertilizer and is also a source of N. It contains 18% N and 20% P. Data manipulation To allow for comparison of prices between years and countries, we converted prices from national currency to inflation-adjusted purchasing power parity United States dollars (USD) by dividing them by the product of the consumer price index (CPI) and the purchasing power parity (PPP) conversion factor for each year according to the World Bank [3]. The CPI measures changes in the cost to the average consumer of acquiring a basket of goods and services that may be fixed or changed between years. Our base year was 2010 (CPI = 1). A weakness of the CPI is that it may not be a very accurate measure of price changes in more rural areas. 
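The conversion described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name, argument names, and the example numbers are hypothetical, and it assumes the CPI has already been rescaled so that 2010 equals 1 and that the PPP factor is expressed as local currency units per international dollar.

```python
def price_per_kg_ppp_usd(bag_price_local, bag_kg, cpi, ppp_factor):
    """Convert a bag price in national currency to 2010 PPP USD per kg.

    bag_price_local: price of one bag in national currency (hypothetical input)
    bag_kg:          bag weight in kg (the source reports 25 kg or 50 kg bags)
    cpi:             consumer price index for the observation year, with 2010 == 1.0
    ppp_factor:      assumed to be local currency units per international dollar
    """
    if bag_kg <= 0 or cpi <= 0 or ppp_factor <= 0:
        raise ValueError("bag_kg, cpi and ppp_factor must be positive")
    per_kg_local = bag_price_local / bag_kg
    # Divide by the product of CPI and PPP factor, as described in the text.
    return per_kg_local / (cpi * ppp_factor)


# Made-up numbers for illustration only: a 50 kg bag costing 3000 units,
# a CPI of 1.25 relative to 2010, and a PPP factor of 40 units per dollar.
example_price = price_per_kg_ppp_usd(3000.0, 50.0, 1.25, 40.0)
print(round(example_price, 2))
```

With these made-up inputs the per-kg price is 60 local units, the deflator is 50, and the converted price is 1.2 PPP USD per kg.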
The PPP conversion factor represents how much of a country’s currency (expressed in USD using the international exchange rate) is required to buy the same amounts of goods and services in the domestic market as one USD would buy in the United States. To improve the quality of the data, we removed gross outliers that were clearly implausible. Outliers were defined as values more than 1.5 times the interquartile range below the lower quartile or above the upper quartile. Comparison of fertilizer prices Although we focus on the price for urea, we provide some comparison with the price for other fertilizers. For the Africa Fertilizer data set, we fitted: (a) a linear regression model with no intercept between the prices for urea and the other fertilizer types, for locations where both the urea price and that of another type were reported; (b) the same linear regression models for each country separately; and (c) a linear regression model with no intercept between subsidized and non-subsidized urea prices, for each country separately. We used no intercept because the slope is then a single number that can be readily interpreted (the price of y is a*x). Also, the implicit (0, 0) intercept should be correct for prices. Spatial prediction To study spatial variation in prices, we first removed temporal variation by calculating a spatial price index computed as the price in a location divided by the national median price for that year. For spatial predictions we focused on the non-subsidized urea prices for 2014 to 2018. We built predictive models of the spatial price index using location (longitude and latitude) and additional predictor variables that capture aspects of market access (travel time to market and distance to port), demand for product (population density and cropland) and one of the environmental factors that influence crop responses to fertilizer (precipitation). 
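As a hedged sketch of two of the data steps just described, the Python below applies the quartile-based outlier rule and computes the spatial price index as price divided by the national median for the same year. The function names, the record layout, and the sample values are hypothetical; the quartile convention (`method="inclusive"`) is an assumption, since the source does not state which one was used, nor whether outliers were screened per country or over the pooled data.

```python
from collections import defaultdict
from statistics import median, quantiles


def drop_iqr_outliers(values, k=1.5):
    """Keep values within k * IQR of the lower and upper quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


def spatial_price_index(records):
    """records: list of (country, year, location, price) tuples.

    Returns the same tuples with price replaced by
    price / median price of that country in that year.
    """
    by_group = defaultdict(list)
    for country, year, _location, price in records:
        by_group[(country, year)].append(price)
    medians = {key: median(prices) for key, prices in by_group.items()}
    return [
        (country, year, location, price / medians[(country, year)])
        for country, year, location, price in records
    ]


# Illustrative values only (not from the paper's data set).
cleaned = drop_iqr_outliers([1.0, 1.1, 1.2, 1.3, 9.0])
index = spatial_price_index([
    ("KE", 2015, "A", 0.9),
    ("KE", 2015, "B", 1.0),
    ("KE", 2015, "C", 1.2),
])
```

An index of 1 marks a location priced at the national median for that year, so values above or below 1 can be compared across years without a separate inflation adjustment.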
All predictor variables were organized as spatial raster data sets with a 5 arc-minute (about 9 × 9 km) spatial resolution. Specific market access variables used were travel time in hours to (1) the nearest port, (2) the nearest town with over 50,000 inhabitants, (3) the nearest town with over 100,000 inhabitants and (4) the nearest town with over 250,000 inhabitants [34]. Population and rural population density estimates were measured in persons per km² [35]. Cropland was computed by aggregating a 30 m spatial resolution crop mask [36] so that the values represent the fraction of land that is used for crop production, and smoothed by computing a (5x5 cells) focal average. We used annual precipitation (mm) from WorldClim version 2 [37]. The spatial data were handled with the “raster” package [38]. For the regional models we also used median country level price as a predictor variable. For each country and region (West and East Africa), Random Forest Regression and Thin Plate Spline algorithms were used to fit models. Random Forest has the benefit of flexibility for fitting potentially irregular surfaces resulting from complex interactions [39]. We used the Random Forest algorithm as implemented in the R-package ‘randomForest’ [40]. The tuneRF function was used to find the optimal number of variables available for splitting at each tree node (the ‘mtry’ parameter). Random Forest models tend to predict very well within the range of the observed values, but the predictions may show sudden breaks, and can be poor in extrapolating. Therefore, we also used Thin Plate Splines (TPS) to create smooth surfaces. TPS is a flexible local regression method that lends itself to interpolating noisy data with high levels of uncertainty [41]. The model is fit by minimizing the residual sum of squares subject to a constraint that the function has a certain level of smoothness (to avoid overfitting). 
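For readers who prefer a formula, the smoothness-constrained fit is commonly written in the equivalent penalized form below. This is the textbook two-dimensional thin plate spline objective, given here as an assumption about the general method rather than a statement of the exact criterion used in the paper; the symbols (y_i for the observed price index, u and v for longitude and latitude, λ for the smoothing weight) are introduced only for illustration.

```latex
\hat{f} \;=\; \arg\min_{f} \; \sum_{i=1}^{n} \bigl( y_i - f(u_i, v_i) \bigr)^2 \;+\; \lambda \, J(f),
\qquad
J(f) \;=\; \iint \left[
  \left( \frac{\partial^2 f}{\partial u^2} \right)^{2}
  + 2 \left( \frac{\partial^2 f}{\partial u \, \partial v} \right)^{2}
  + \left( \frac{\partial^2 f}{\partial v^2} \right)^{2}
\right] du \, dv
```

A larger λ gives a smoother surface that follows individual observations less closely, while λ near zero approaches exact interpolation of the observed values.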
The required level of smoothness is set by the roughness parameter, which is determined through cross-validation. Thus, while producing smooth surfaces, TPS will not necessarily return the observed value at a sampled point. This helps guard against over-fitting, especially for noisy datasets where the observations themselves are estimates based on small samples [42]. The TPS model was implemented using the ‘Tps’ function in the R-package ‘fields’ [43] with only longitude and latitude as predictor variables. We combined the Random Forest and TPS models into a single ensemble model by taking a weighted average of their predictions, with each model weighted by the inverse of its Root Mean Square Error (RMSE). We predicted the spatial price index and derived absolute prices from that by multiplying it by the current average price. We evaluated the models with five-fold cross-validation. We computed Pearson’s correlation coefficient and the RMSE as test statistics. We compared the RMSE of the interpolation (E_I) with that of a Null-model (E_0), in which we assumed no spatial price variation, by computing the relative RMSE as E_R = 1 − (E_I / E_0). E_R expresses how much better the interpolated predictions are than those obtained by using a single national price. We built separate models for the Africa Fertilizer and LSMS data. For Tanzania, we also fitted a third model by combining Africa Fertilizer and LSMS data, using a sample from LSMS with a sample size equal to the number of observations in the Africa Fertilizer data for Tanzania. For the regional model, there were many more samples in the countries for which we had LSMS data. To avoid an imbalance in the model evaluation, we used a sub-sample of the test data for these countries, with the sample size equal to the mean number of observations for the non-LSMS countries in the region. 
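The ensemble weighting and the comparison with the null model can be illustrated with a short Python sketch. It reflects one reading of the description above, not the authors' implementation: weights are taken proportional to 1/RMSE, and the null model is assumed to predict a spatial price index of 1 everywhere (the national median). All function names and numbers are hypothetical.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def inverse_rmse_ensemble(pred_a, pred_b, rmse_a, rmse_b):
    """Average two prediction vectors with weights proportional to 1 / RMSE."""
    w_a, w_b = 1.0 / rmse_a, 1.0 / rmse_b
    total = w_a + w_b
    return [(w_a * a + w_b * b) / total for a, b in zip(pred_a, pred_b)]


def relative_rmse(observed, predicted, null_value=1.0):
    """1 - E_I / E_0, where E_0 is the RMSE of a constant (null) prediction."""
    e_i = rmse(observed, predicted)
    e_0 = rmse(observed, [null_value] * len(observed))
    if e_0 == 0:
        raise ValueError("null-model RMSE is zero; relative RMSE is undefined")
    return 1.0 - e_i / e_0


# Illustrative held-out price indices and predictions (made-up values).
observed = [0.8, 1.0, 1.2]
predicted = [0.9, 1.0, 1.1]
skill = relative_rmse(observed, predicted)
blend = inverse_rmse_ensemble([1.0], [2.0], rmse_a=0.1, rmse_b=0.3)
```

A relative RMSE of 0 means the model is no better than a single national price, and a value of 0.25 corresponds to the "25% better than a null-model" threshold mentioned in the abstract.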
Variable importance was assessed for the Random Forest model using the percentage increase in Mean Squared Error (MSE), that is, the increase in MSE if the variable carried no information. Transportation costs For Ethiopia, Nigeria, and Tanzania, LSMS reported where the farmers purchased the fertilizer (i.e. within or near the village, within or near the nearest urban center, outside the district or region). In addition, for Nigeria and Malawi, LSMS reported the type of transportation used by farmers to get to the market and the transportation cost. We calculated the proportion of farmers using each transportation type and computed the average transportation cost for each type. We fitted a linear regression model with no intercept between transportation cost and travel time (in hours) to market.
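A regression with no intercept has a simple closed form: the least-squares slope is the sum of x·y divided by the sum of x². The Python sketch below shows this for cost against travel time; the function name and the sample values are hypothetical and are not taken from the LSMS data.

```python
def slope_through_origin(x, y):
    """Least-squares slope a for the model y = a * x (no intercept)."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    sum_xx = sum(v * v for v in x)
    if sum_xx == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sum(a * b for a, b in zip(x, y)) / sum_xx


# Made-up example: travel time to market in hours and transport cost paid.
travel_hours = [0.5, 1.0, 2.0]
transport_cost = [1.0, 2.0, 4.0]
cost_per_hour = slope_through_origin(travel_hours, transport_cost)
```

The same function applies to the price comparisons described earlier, where the slope is read as the price of one fertilizer expressed as a multiple of the price of another.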

Discussion and conclusions We have compiled a large amount of retail fertilizer price data to understand spatial variation in urea prices in eighteen countries in SSA. Our data covered much of the SSA cropland area, but we did not have data for a few large countries including South Africa, Sudan, and Zimbabwe. We found a considerable amount of spatial variation in urea prices in many countries. In most countries our spatial models fit the data reasonably well and outperformed the Null model of no spatial variation. Countries where the model performed very well, such as Côte d’Ivoire, Kenya, Senegal, Tanzania, and Uganda, had a relatively high range in observed prices. Countries where the spatial prediction model did not perform very well were relatively small (e.g. Rwanda), and thus one might not expect as much spatial price variation. In Rwanda and Malawi, the fertilizer subsidy program may also be lowering the variation and complicating the pattern of spatial price variation. For instance, in Rwanda the price of subsidized urea was the same in all locations and the number of observations was higher than for non-subsidized urea. Nigeria was the only large country where the spatial prediction model did not perform well. While the predictions based on the LSMS showed spatial variation, the model predictions were poor, perhaps due to low data quality. The model fitted with Africa Fertilizer data also performed poorly, perhaps due to the sparse spatial coverage. The LSMS would appear to be an attractive data source by virtue of the large number of records of prices paid by farmers. However, models fit with LSMS data generally performed poorly, possibly due to data quality issues which are not immediately apparent from examination of the values reported, and/or the fact that farmers in a given locality may purchase fertilizer from different market locations which are imperfectly identified in the data. 
For countries where Africa Fertilizer data were also available, models fit with these data generally did better (or both models were very poor). This suggests that the LSMS as a source of local price data is compromised by data quality issues. While we were able to remove outliers, we had no way of correcting errors in prices that were within a reasonable range of values. This problem has also been reported by others [21, 44]. The high number of unrealistic prices (excluded in our study as outliers) and poor model fit suggest the need for better quality control mechanisms in this survey. The consistency of the Africa Fertilizer data was generally much better. However, it is worth noting that these price data are only collected in towns and may not reflect the likely higher prices in more isolated rural areas. Also, because of the relatively low sample size, some important agricultural regions within countries were not sampled, making the predictions for these regions highly uncertain. An expansion of the Africa Fertilizer observation network within countries, and to other countries currently not covered, would be very helpful to more fully understand fertilizer price variation. Using regional models (i.e. pooled observations from multiple countries within a region), we predicted prices for the whole region with a single model. Regional models are attractive as they could be used to predict price variation in countries for which we have limited local market price data, but we did not investigate that here. It has been reported that fertilizer prices in SSA increase when moving away from main markets due to transportation costs from the port or blending facilities [45, 22, 23, 24]. For example, a price gradient was reported for Uganda with retail prices steadily rising when moving away from Kampala [30]. We found this pattern in a number of countries, including Côte d’Ivoire, Ethiopia, Kenya, Mali and Uganda. However, there were clear exceptions too. 
For example, in Nigeria, Ghana, and Benin, prices went down when moving away from the coast. One possible explanation is that market prices are lower in areas with higher demand [46, 47, 27]. Aggregate fertilizer use in northern Nigeria is higher than in the south because it has a larger cultivated area and produces high value crops [48, 49]. In addition, northern states have traditionally provided greater fertilizer subsidies [50]. In some cases, imports of fertilizer from neighboring countries with lower prices may affect prices. For example, imported fertilizer from Malawi could be reducing the price in south-west Tanzania. The allocation of subsidized fertilizer, and the degree to which this leaks into the non-subsidized market, could also influence prices [51]. Political influence can affect the allocation of subsidies, with politically well-connected villages receiving more inputs than less connected villages [52, 53, 54], or with subsidies being used to reward loyalty [55, 56, 57]. It is important to know the price at the farmgate, which includes the cost of the fertilizer per se plus the cost of reaching the market to get the product. Some authors reported little effect of changes in the cost of shipping fertilizer from the distributor to local retailer [21, 58]. However, like [59], they found that farmers were sensitive to the costs of reaching retailers. We only had local transportation cost data for Malawi and Nigeria; our results suggest that the additional costs were 2% and 5.8% of the total amount paid for the fertilizer. Our estimated prices can be adjusted when using them in computations for on-farm costs. Many studies of technology profitability have assumed spatially constant input and output prices (e.g., [26, 27]), which is at odds with rural economic reality. A few studies have utilized “local” prices calculated as sub-national mean prices (e.g. at district or province level) from survey data (e.g. [18, 30, 60]). 
The sizes of such sub-national units often vary widely, and the boundaries are somewhat arbitrary, and there may be strong unobserved price variation within such areas. Our interpolation method is very likely an improvement over such approaches, particularly in countries where available spatial covariates perform well as predictors. We used an ensemble model of Random Forest Regression and Thin Plate Spline methods. A downside of using such algorithmic methods is that there is no direct way to estimate uncertainty. However, elaborate Bayesian geostatistical approaches have been developed to estimate uncertainty in similar contexts (e.g., [61, 62]) and these methods could be applied to price data as well. This is particularly relevant to account for the large uncertainty of the predictions in areas far away from any observations. It is important to note that cross-validation results may be inflated for such regions because, on average, model skill should decrease with the distance from a location to locations with known prices (cf., [63]). An important avenue for future empirical work would be to evaluate the extent to which the subnational price variation we have documented is a useful explanatory factor for observed variation in smallholder fertilizer use in Sub-Saharan Africa, after controlling for local agronomic responses and output prices. One way to do that may be to integrate input (and output) price predictions into spatial crop models, and then evaluate the degree to which modeled fertilizer use profitability predicts observed fertilizer use rates across different locations. Farmers in remote rural settings generally face less favorable input-output price ratios than farmers in less remote settings [58, 64]. These differences can likely help explain differential patterns of agricultural input use. 
Yet the specific ways in which these spatial patterns play out are often obscured by insufficient data on local prices, and the patterns we have revealed did not always follow simple expectations (e.g., prices being higher in coastal Ghana). The evidence we have compiled in this paper suggests that, while investments in more comprehensive and spatially representative price data collection would be very useful, spatial price prediction models can extend the value of existing data by using interpolation to better reflect local price variation. Even if imperfect, such estimates almost certainly reflect farmers’ economic realities better than assumptions of spatially constant prices within a given country, for all but the smallest countries. We propose that spatial price estimation methods such as the ones we employ here may serve to better approximate heterogeneous economic market landscapes, until such time as truly comprehensive local market price information systems become available.

Acknowledgments We thank Drs. Aniruddha Ghosh and Travis J. Lybbert for helpful suggestions.