In our analysis, the P value for variable selection for binary partitioning was 0.05, and the minimum number of fields required to create intermediate or terminal nodes was different for the analysis of grain yield affected by weather variables or by management practices. For the weather analysis, a minimum of 20 fields were required to create an intermediate node, and 10 fields were required to create a terminal node (i.e., 20 and 10% of total observations). For the management practices analysis, a minimum of 10 fields (i.e., 10% of total observations) were required to create an intermediate node, and a minimum of five fields (i.e., 5% of the total observations) were required to create a terminal node. These criteria were established after performing a sensitivity analysis of different combinations of minimum numbers of plots per intermediate node (5–40% of total observations) and per terminal node (5–20% of total observations). We also allowed tree depth to vary during the sensitivity analysis (up to 30 nodes), but this parameter did not affect model selection. Fit statistics for model selection were R 2 and RMSE. For both analyses, the selected model improved R 2 in 0.07 (∼13 and 17% for weather and management) and decreased RMSE in 0.06 Mg ha −1 (∼7.4 and 5.5% for weather and management) as compared with the next best model.

The regression models ranked management factors in order of importance relative to the frequency of statistical significance of each factor, and ANOVA and regression helped interpret the individual effect of each management practice on wheat yield. However, one weakness of this approach is that the analysis of interactions between management effects was limited to hypothesis testing, which can be subjective in nature and depend on the researcher's understanding of the crop's response to input. Thus, we used conditional inference trees to better explore Level 1 factor interactions. This method is used to evaluate unbalanced data, and its use has recently increased in agricultural sciences (e.g., Tittonell et al., 2008a ; Mourtzinis et al., 2018a , 2018b ) due to properties such as the ability to explore interactions, lack of statistical distribution assumptions, the ability to handle both continuous and categorical variables, and the ability to handle outliers, multicollinearity, and heteroscedasticity ( Tittonell et al., 2008a ). Conditional inference trees consider data distribution to avoid overfitting issues and selection bias towards covariates with many splits or missing values ( Hothorn et al., 2006b ).

Management factors most often associated with wheat yield were further evaluated individually. Analysis of variance in PROC GLIMMIX was used for categorical variables considering year, region, and region nested within year as random effects, and computing df according to the Kenward–Rogers method. We used the LINES statement for comparisons between all pairs of least square means. Linear and nonlinear regression were performed for continuous variables either across the entire dataset or as quantile regression ( Koenker and Bassett, 1978 ) to explore the relationship between maximum yields (quantile = 0.99) and different management factors. These approaches have been largely used in the literature analyzing similar on‐farm data (e.g., Grassini et al., 2015 ; Long et al., 2017 ; Rattalino Edreira et al., 2017 ).

The second step was to perform a series of statistical procedures commonly used to analyze unstructured data ( Mourtzinis et al., 2018b ). We used stepwise, forward selection, backward elimination, least angle regression (LAR), least squared shrinkage operator (LASSO), elastic net, random forest regression, and conditional inference trees to rank the 23 management variables in descending order according to the frequency with which they were identified as significant ( Mourtzinis et al., 2018b ). If a variable was considered significant across all models and at the multilevel model, then its frequency would sum to nine. These models represent a range of traditional to modern regression methods, and as suggested by Mourtzinis et al. (2018b) , some have properties that can mitigate data multicollinearity (e.g., LASSO, LAR, and elastic net; Zou and Hastie, 2005 ; Dormann et al., 2013 ). Random forest and conditional inference trees were built using ‘randomForest’ and ‘partykit’ packages in R, and the remaining models using PROC GLMSELECT. Akaike's information criteria (AIC) was used for model selection.

We used descriptive statistics (i.e., mean, SD, minimum, and maximum) and calculated frequencies of adoption of agronomic practices to demonstrate the range of variation associated with the different variables. Then, causes of yield variation due to management or weather factors were evaluated using different approaches.

First, we calculated cumulative R s , cumulative precipitation, and average daily T min and T max for each field‐year for the intervals of (i) sowing to physiological maturity, (ii) sowing to anthesis, (iii) jointing to anthesis (period in which potential kernels per square meter are determined in wheat; Fischer, 1985 ), and (iv) anthesis to physiological maturity. Dates for phenological events were defined by field‐year from crop simulations. After defining relevant weather parameters for these four developmental phases, we evaluated the effects of weather on wheat Y a using two different approaches. We first compared the weather between the highest‐ vs. lowest‐yielding groups of fields determined as yield terciles in the dataset ( n = 33; Wiese, 1982a ) using posterior Bayesian estimates for group means and SDs ( Kruschke, 2013 ) using the ‘BEST’ package in the R software ( R Development Core Team, 2013 ). Then, we developed conditional inference trees for grain yield as affected by weather ( Hothorn et al., 2006a ). The final step was to quantify the frequency that each weather pattern typically occurs in Kansas, using significant weather thresholds for wheat Y a developed in the previous analyses. Here, we simulated wheat development using SSM‐Wheat for 30 weather stations across Kansas during the 1987 to 2016 harvest years ( n = 896) using actual weather and soil data to define dates for anthesis by site‐year. Then, we plotted mean grain fill temperature vs. total growing season precipitation and calculated the percentage of years in each quadrant for both semiarid and subhumid regions. For more details on crop modeling approach or geographic distribution of the 30 weather stations, please refer to Lollato et al. (2017) .

We used the Simple Simulation Model‐Wheat (SSM‐Wheat; Soltani and Sinclair, 2012 ) and field‐specific soil physical characteristics from the Web Soil Survey ( USDA‐NRCS, 2017 ), daily weather data, and optimum sowing date and plant population to simulate Y w for each field year ( n = 100), to define dates for stem elongation and anthesis, and to calculate ET c (simulated as function of crop biomass; Tanner and Sinclair, 1983 ). The SSM‐Wheat model has been previously validated for wheat development and Y w in the US Southern Great Plains ( Lollato et al., 2017 ). We derived average values for daily maximum ( T max ) and minimum ( T min ) temperatures and precipitation by interpolating daily data from three weather stations positioned near the geographic coordinate of each field ( Fig. 1 ) using inverse distance weighting. Weather data interpolation was performed using the inverse distance weight function ( Tovar, 2014 ) in MatLab R2017b ( MathWorks, 2017 ). Daily solar radiation ( R s ) was derived from the NASA Prediction of Worldwide Energy Resource (NASA POWER, http://power.larc.nasa.gov/ ) for the geo‐coordinate of each field‐year. We estimated plant available water at wheat sowing for each field‐year using non‐growing‐season precipitation and the soil's available water‐holding capacity ( Lollato et al., 2016 ). After simulating Y w , we calculated Y G for each field‐year as the difference between simulated Y w and measured Y a , and we also expressed it as a percentage of Y w .

Map of Kansas showing wheat area in green ( USDA‐NASS, 2016 ). Locations of fields entered in the Kansas Wheat Yield Contest during the 2010 to 2017 harvest seasons (yellow circles and triangles), weather stations used to derive mean values for different meteorological variables (solid triangles), and state boundaries are also shown. Geographical location of the study area within the continental United States is shown in the inset.

We used a database composed of 100 field‐years entered in the Kansas Wheat Yield Contest ( http://kswheat.com/growers/wheat‐yield‐contest ) during the 8‐yr period spanning the harvest years of 2010 through 2017 ( Fig. 1 ). All fields entered in the yield contest were managed under dryland conditions and for grain only. The short timeframe included in this study allowed us to assume negligible genetic gains in grain yield and time trends in agronomic technology. To characterize the typical climate experienced in each field, we determined aridity index (i.e., the ratio the between annual ET o and precipitation; Barrow, 1992 ) for 30 consecutive years and 68 weather stations across the US Great Plains, 30 of which were located in Kansas. We used thresholds of <0.5 for semiarid and <0.65 for subhumid regions, as suggested by Barrow (1992) , which resulted in 36 fields in the semiarid region and 64 fields in the subhumid region ( Fig. 1 ).

Kansas is the largest winter wheat producing state in the United States, with a yearly planted area ranging between 3.0 and 4.3 Mha and total production ranging from 6.7 to 13 million Tg ( USDA‐NASS, 2017 ). Average wheat grain yield was 3.3 Mg ha −1 in the 2008 to 2017 period, showing the same decreasing trend in yearly yield gains as many other wheat regions in the world: grain yields increased linearly at 43 kg ha −1 yr −1 (2% yr −1 ) from 1955 to 1980, and the rate has since decreased to 13 kg ha −1 yr −1 (0.5% yr −1 ) ( USDA‐NASS, 2017 ). Wheat fields in the region typically have mild slopes (1–6%) and common soil textures are silt loam and silty clay loam ( USDA‐NRCS, 2017 ). About 94% of Kansas wheat fields are managed under dryland conditions ( USDA‐NASS, 2018 ), and the majority are cultivated for grain only, although about 12% are grown for dual purpose (i.e., grazing and grain; Herbel and Dikeman, 2018 ). Cumulative precipitation during the winter wheat growing season (October–June) ranges from ∼200 mm in the west to >600 mm in the east, whereas cumulative reference evapotranspiration (ET o ) follows the opposite gradient, increasing from 600 mm in the east to ∼850 mm in the west ( Lollato et al., 2017 ). The climate is subhumid in the east and semiarid in the west ( Barrow, 1992 ).

RESULTS

Winter Wheat Grain Yield and Yield Gaps Mean Y a across all fields was 5.5 Mg ha−1 and ranged from 2.2 to 8.3 Mg ha−1 (Fig. 2a). No Y a values in the database exceeded literature‐reported transpiration efficiency (Fig. 2b). Year‐to‐year variability in Y a was greater than within‐year variability, as the mean yield difference between highest and lowest years was 2.1 Mg ha−1, whereas differences between semiarid and subhumid regions within a given year ranged from 0.4 to 1.6 Mg ha−1. Wheat Y a in our database was greater than the mean wheat yield for the state of Kansas across all years included in the study (5.5 vs. 2.96 Mg ha−1; USDA‐NASS, 2017), but yields were correlated (r = 0.88, P < 0.05). This correlation indicates a good distribution of fields within a given year and suggests that the interannual variation in regional Y w was reflected across different spatial scales. Figure 2 Open in figure viewer (a) Cumulative frequency distribution for actual yield measured from the yield contest (Y a ) and simulated water‐limited yield (Y w ) corresponding to the harvest seasons of 2010 to 2017. Mean and SE are shown in the inset table. (b) Growing season crop evapotranspiration (ET c ) simulated for each field‐year using the Simple Simulation Modeling–Wheat (SSM‐Wheat) model vs. Y a in the yield contest dataset, compared with published literature for wheat maximum transpiration efficiency to assess farmer‐reported data for quality. (c) Mean Y a and Y w by year and subregion (SH = subhumid, SA = semiarid) and yield gap (Y G ). The asterisks denote statistical differences between Y a and Y w for each particular region‐year combination using Bayesian posterior mean and confidence interval estimates, whereas “ns” stands for nonsignificant and “na” stands for for not applicable. Simulated Y w averaged 6.4 Mg ha−1 and ranged from 2.7 to ∼10 Mg ha−1 (Fig. 2a). Mean Y G was 0.8 Mg ha−1 (15% of Y w ) and ranged from 2% in 2014 to 24% in 2017. The high end of simulated Y w (>8.5 Mg ha−1) only occurred 9% of the time and agrees with yield measured in variety performance tests in Kansas (Lingenfelser et al., 2016). Simulated Y w was greater than the Y a in the subhumid region in 2010, 2011, 2012, and 2017 and in the semiarid region in 2012 (Fig. 2c). However, Y a was statistically the same as Y w in the semiarid region in 2010, 2011, 2013, and 2016 and in the subhumid region in 2013 and 2016 (Fig. 2c), suggesting that the fields entered in the Kansas wheat yield contest narrowed the Y G in most years, but not consistently.

Environmental Conditions and Weather Effects on Wheat Yield Across all field‐years, in‐season precipitation ranged from 172 to 751 mm, ET c ranged 335 to 760 mm, cumulative R S ranged from 2770 to 4400 MJ m−2, T min ranged from −1.7 to 4.5°C, and T max ranged from 12.5 to 18.2°C. Comparison between high‐ and low‐yielding field‐years (6.8 vs. 4.3 Mg ha−1) suggested that the weather during the jointing to anthesis and the anthesis to physiological maturity intervals (i.e., reproductive period) had greater influence on grain yield than the weather during the entire growing season or vegetative stages (Fig. 3). During the reproductive period, high‐yielding fields had greater precipitation, lower T max , lower T min , and greater R S than lower yielding fields (Fig. 3). During the vegetative phases, high‐yielding fields had greater T max than low‐yielding fields (Fig. 3). Analysis of a subset of the database from which TKW was available (n = 43) suggested that both T min and T max during grain fill decreased TKW (p < 0.01). We note in passing that there was lower R S during seed filling in the tercile of fields with lower TKW as compared with the higher counterpart (843 MJ m−2 in fields averaging TKW of 29.1 g vs. 926 MJ m−2 in fields averaging TKW of 36.9 g, data not shown). Simulated available water in the soil profile at sowing ranged from 5 to 159 mm and affected yield positively and linearly in the 5‐ to 23‐mm range. Beyond 23 mm, yield plateaued in response to increases in profile moisture (data not shown). Figure 3 Open in figure viewer Cumulative frequency distributions of growing season weather variables for the 33 low‐yielding (LY) field‐years (abbreviated as Reg.) and the 33 high‐yielding (HY) field‐years identified as terciles in the dataset. Solid lines represent no statistical difference between groups (determined using Bayesian posterior mean and confidence interval estimates), and dotted (LY) and dashed (HY) lines denotes statistical differences between groups. Cumulative precipitation (Precip.) and crop evapotranspiration (ET c ; Panels a, b, c, and d), average maximum (T max ) and minimum (T min ) temperatures (Panels e, f, g, and h), and cumulative solar radiation (R S , Panels i, j, k, l) are shown for the entire growing season (Panels a, e, and i), for the sowing to anthesis interval (Panels b, f, and j), for the jointing to anthesis interval (Panels c, g, and k), and for the anthesis to physiological maturity interval (Panels d, h, and l). The conditional inference tree of weather variables and their effect on grain yield (R2 = 0.59, RMSE = 0.75 Mg ha−1) is shown in Fig. 4a. Maximum temperature during grain fill was the most important meteorological variable influencing wheat yields across our dataset. The highest yields (7.2 Mg ha−1) were achieved in fields in which mean T max during grain fill was <27°C and growing season precipitation was <440 mm. Growing season precipitation >440 mm under similar cool grain‐filling conditions resulted in grain yield of 5.6 to 6.9 Mg ha−1, depending on cumulative R S during the growing season. Fields with mean grain filling T max >27°C had lower grain yield (∼4.5 to 5.1 Mg ha−1), a scenario within which greater R S (>798 MJ m−2) resulted in greater grain yield. Evaluation of the weather observed during the 1987 to 2016 harvest seasons indicated that the majority (∼75%) of the growing seasons in Kansas were characterized by mean grain fill T max >27°C, regardless of the region in the state (Fig. 4c and 4d). Only ∼25% of the growing seasons during the harvest years of 1987 to 2016 had mean grain‐filling T max <27°C, which was associated with increased grain yields in the yield contest fields. A total of 32.9 and 37.5% of the growing seasons included in our long‐term analysis had >440 mm of rainfall in the semiarid and subhumid regions. Comparison of the weather conditions observed in the field‐years included in this study with the range of values observed in the last 30 yr in Kansas suggests that fields entered in the contest were representative of the long‐term mean for the majority of the weather variables evaluated. For instance, the interquartile range for cumulative precipitation during the winter wheat growing season for the 30 weather stations included in the analysis of Fig. 4b to 4d ranged from 273 to 488 mm. Meanwhile, the yield contest fields ranged from 214 to 376 mm. Likewise, the interquartile range for long‐term T max and T min during grain filling ranged from 26.9 to 29.5°C and from 23.8 to 15.9°C in the long‐term data, and from 26.6 to 30.3°C and from 11.3 to 16.2°C in the yield contest data. Perhaps the most contrasting values were those observed for growing season ET c , as the interquartile range for the long‐term data ranged from 612 to 703 mm, whereas yield contest fields ranged from 506 to 599 mm. Nonetheless, none of the observations in this study were outside the values observed in the long term for the region (n = 896). Figure 4 Open in figure viewer (a) Conditional inference tree of meteorological variables’ impact on wheat grain yield. Each boxplot represents the interquartile range (gray box), mean (solid line), fifth and 95th percentiles (whiskers), and outliers (empty circles). The number of observations (n), mean, and model fit statistics (R2 and RMSE) are shown. Legend: Tmax_GF refers to maximum temperature during the grain filling period; Rain_GS refers to cumulative rainfall during the growing season; and Rs_GF and Rs_GS refer to cumulative solar radiation during the grain‐filling period and the entire crop cycle, respectively. Panels b through d show maximum temperature during grain filling (Grain filling T max ) vs. growing season precipitation (Precip.) for (b) the 100 fields entered in the Kansas wheat yield contest during the 2010 to 2017 harvest years, and for the historical comparison with 1987 to 2016 harvest years for 30 locations across Kansas for the (c) semiarid region (n = 483) and (d) subhumid region (n = 413). Dashed red lines denote the critical thresholds developed in Panel a of 27°C and 440 mm. Solid black lines denote mean for each particular dataset. The number and proportion of observations in each quadrant in reference to the red lines are also shown.

Wheat Management in Yield Contest Fields Mean sowing date was DOY 279 and ranged from 262 to 314 (Table 1). Although the majority of the fields were sown within the optimum window (Witt, 1996), the fifth percentile (n = 6) was sown at least 12 d earlier than the optimum date, and the 95th percentile (n = 6) was sown at least 21 d after the optimum date. Seeding rate averaged 285 seeds m−2 (85.1 kg ha−1), with first and third quartiles of 246 and 304 seeds m−2 (73.5 and 90.8 kg ha−1). Although a high proportion of the seeding rates was within the recommended optimum (i.e., 180–330 seeds m−2; Shroyer et al., 1996), seeding rates ranged from 93 to 714 seeds m−2 (27.8–213.3 kg ha−1). The highest seeding rates (667 and 714 seeds m−2, or 199.2 and 213.3 kg ha−1) were adopted in fields sown beyond the end of the optimum window. Six fields adopted extremely low seeding rates (approximately 93–121 seeds m−2 or 27.8–36.1 kg ha−1) and sustained high grain yields (>6.5 Mg ha−1). Table 1. Frequency of adoption and rate of a given agronomic practice among the 100 field‐years entered in the Kansas Wheat Yield Contest for the 2010, 2011, 2012, 2013, 2016, and 2017 growing seasons. Data for 2014 and 2015 harvest years are not shown because only information from the three contest winning fields were available. Number of fields (n) is also shown Harvest year Variable 2010 2011 2012 2013 2016 2017 n 18 13 21 12 21 9 Tillage method (%) No till 39 15 76 50 81 78 Reduced till 28 31 14 17 10 11 Conventional till 33 54 10 33 10 11 Previous crop (%) Wheat –† – 29 50 14 11 Fallow – – 19 25 57 22 Other crop‡ – – 52 25 29 67 Manure (%) 6 15 0 0 0 11 N fertilizer Frequency (%) 94 100 95 100 100 100 Rate (kg N ha−1) 75 101 88 91 91 132 P fertilizer Frequency (%) 67 85 90 83 71 56 Rate (kg P 2 O 5 ha−1) 24 37 32 37 34 38 K fertilizer Frequency (%) 17 8 24 42 24 22 Rate (kg K 2 O ha−1) 24 90 12 22 23 27 S fertilizer Frequency (%) 44 38 67 67 62 56 Rate (kg S ha−1) 16 7 15 12 14 12 Micronutrients (%) 0 46 19 33 14 44 Seed treatment (%) Insecticide 6 0 0 8 0 0 Fungicide 17 23 0 0 19 0 Insecticide + fungicide 56 31 62 75 81 89 None 22 46 38 17 0 11 In‐season pesticide (%) Insecticide 17 8 24 17 33 22 Herbicide Fall 11 23 33 17 5 33 Spring 56 69 43 83 71 56 Fungicide Feekes GS 6§ 17 62 29 67 38 67 Feekes GS 9–10.5¶ 94 54 90 75 100 100 No‐tillage practices were predominant during 2012, 2016, and 2017 (i.e., 76–81%), whereas the remaining years had frequencies of adoption of no till, reduced till, and conventional till ranging between 15 and 54% (Table 1). Canola (Brassica napus L.), 14‐mo fallow, maize (Zea mays L.), soybeans [Glycine max (L.) Merr.], or wheat were previous crops in 42 to 57% of the times. The majority of the studied fields (i.e., 94–100%) received N application either as preplant or topdressed during the growing season with an average rate of 94 kg N ha−1, ranging from 0 to 268 kg N ha−1. A relatively high frequency of fields (56–90%) received P fertilizer, either broadcast and incorporated or as a starter fertilizer, at an average rate of 34 kg P 2 O 5 ha−1. Depending on year, 17 to 67% received a foliar fungicide application at jointing, and 54 to 100% of the fields received a foliar fungicide application after flag leaf emergence.