The BNNs demonstrated that the average yields of cacao farmer groups in Sulawesi over distinct time periods are closely associated with the ENSO OI patterns 9 to 25 months before harvest. The ENSO OI short term pattern explained slightly less (69%) of the variation in the average yield than the long term pattern (77%). We consider both these levels of prediction to be high; however, the short term pattern was simpler and was used for further analysis. The linear regression between predicted and actual yields indicates that the model will tend to underestimate cacao productivity at high yields (e.g. in excess of 100 kg ha⁻¹ month⁻¹).
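The link between the regression of predicted on actual yields and underestimation at high yields can be illustrated with a minimal sketch. The numbers below are hypothetical and are not data from this study; the function name `fit_line` is likewise only illustrative. When the fitted slope is below 1, predictions fall below the 1:1 line for all actual yields above the crossover point, intercept / (1 − slope).

```python
# Hypothetical actual vs. predicted group yields (kg per ha per month).
# These values are invented for illustration; they are NOT from the study.
actual = [40.0, 60.0, 80.0, 100.0, 120.0]
predicted = [44.0, 61.0, 78.0, 95.0, 112.0]


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(actual, predicted)

# With a slope below 1, the fitted line crosses the 1:1 line at this yield;
# above it, the model tends to underestimate the actual yield.
crossover = intercept / (1.0 - slope) if slope != 1.0 else float("inf")

print(f"slope={slope:.2f}, intercept={intercept:.1f}, crossover={crossover:.1f}")
```

In this invented example the crossover lies well below 100 kg ha⁻¹ month⁻¹; the value at which underestimation begins in practice depends entirely on the fitted coefficients.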
The predictions made by the BNNs indicated that cacao yields are substantially impacted by ENSO conditions, which accords with prior observations21. The fertilizer response varied according to the ENSO profile: the greatest predicted response was in the Neutral ENSO profile, with a smaller response under the MinCent ENSO profile, especially when unfertilized yields were low, and essentially no response under the MaxCent ENSO profile. Hence, the analysis provides insights into the appropriate fertilizer regime for distinct ENSO OI patterns in the period 9 months before harvest. We also note that recent methods to improve prediction of future ENSO OI patterns make it possible to predict them with reasonable accuracy for up to 1 year3. Thus, it is possible to relate average cacao crop performance and management practices directly to ENSO patterns in a given region without the need for weather data when the following conditions are met: (1) data exist on crop performance in any given site over time with distinct management practices; and (2) the weather patterns are driven by ENSO OI. We have used cacao as a proof of principle, and suggest that this principle can readily be applied to other crops.
A great advantage that Bayesian methods have over other machine learning approaches is that they can utilise variance-based probability distributions to predict the likelihood of any given outcome. The model was used to predict the most likely monthly yield and expected standard deviation from each farm group under a specific ENSO profile when either fertilized or unfertilized. The standard deviations attained across all predicted responses were remarkably low, typically less than 1 kg ha⁻¹ per month. Both the construction of the model and the subsequent predictions were based upon the mean yield data from 10 farms in each group at each monthly harvest under a single management type. As a result, all variation in yield across those 10 farms was excluded from the network constructed. Consequently, while the predictions returned by the model might precisely reflect the mean response from each group, the limited input data mean that the range of possible outcomes under any predicted scenario is likely to be underestimated. Up to now we have established the proof-of-principle stage. The next stage will be first to improve the assessment of the predicted probability distributions, and then to develop channels for communicating the results of the analysis to farmers, followed by appraisal of their opinions and use of the information provided. Options for improving estimates of the probability distribution include incorporating all observations from within each group, to ensure that farm-to-farm variance is adequately captured, and extending the observations across more seasons, to ensure that the variability of response to contrasting ENSO profiles is better represented.
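Why training on group means leads to an underestimated range of outcomes can be shown with a small simulation. This is only a sketch under stated assumptions: the yields are synthetic, the additive structure (a persistent farm effect plus a shared monthly effect plus noise) and all parameter values are invented for illustration, and none of it is the authors' BNN.

```python
import random
import statistics


def simulate_group(n_farms=10, n_months=24, base=60.0,
                   farm_sd=15.0, month_sd=5.0, noise_sd=2.0, seed=0):
    """Synthetic monthly yields for one farmer group (assumed additive model)."""
    rng = random.Random(seed)
    # Persistent farm-to-farm differences (e.g. management skill).
    farm_effects = [rng.gauss(0.0, farm_sd) for _ in range(n_farms)]
    months = []
    for _ in range(n_months):
        # Effect shared by all farms in a given month (e.g. weather).
        month_effect = rng.gauss(0.0, month_sd)
        months.append([base + month_effect + f + rng.gauss(0.0, noise_sd)
                       for f in farm_effects])
    return months


yields = simulate_group()

# What a model trained on group means sees: one value per month.
monthly_means = [statistics.mean(month) for month in yields]
sd_of_means = statistics.stdev(monthly_means)

# What individual farms actually experience: every farm-month observation.
all_observations = [y for month in yields for y in month]
sd_of_farms = statistics.stdev(all_observations)

print(f"SD of monthly group means:      {sd_of_means:.1f}")
print(f"SD of individual farm yields:   {sd_of_farms:.1f}")
```

Because averaging removes the farm-level component, the spread of the group means is much narrower than the spread of individual farm yields, which is the sense in which a model fitted to means can be precise about the mean response yet understate the range of outcomes for any one farm.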
The analysis presented here is based on the average yields for each group of farmers. However, previous analysis indicates much variation in yield within the farmer groups20. Furthermore, those farmers with higher average yields tended to maintain their yield advantage relative to those with lower yields, even when conditions were adverse. This supports the view that the differences in yield between the high average yield and the low average yield farmers are due to management skills, rather than to more favorable soils and weather conditions20. This suggests that if the average yields of individual farmers relative to the mean of all farmers are known, then the ENSO predictions can be used to predict their yield levels, and also their response to fertilizer applications.
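One simple way to turn a group-level prediction into a farmer-level one is to scale it by the farmer's historical yield relative to the group. The sketch below assumes a multiplicative relationship, which the text does not specify; the function names `relative_yield_index` and `scale_prediction` and all numbers are hypothetical.

```python
def relative_yield_index(farmer_yields, group_means):
    """Ratio of a farmer's mean yield to the group mean over the same months."""
    if len(farmer_yields) != len(group_means) or not farmer_yields:
        raise ValueError("need matching, non-empty yield series")
    farmer_mean = sum(farmer_yields) / len(farmer_yields)
    group_mean = sum(group_means) / len(group_means)
    return farmer_mean / group_mean


def scale_prediction(group_prediction, index):
    """Assumed multiplicative adjustment of the group-level prediction."""
    return group_prediction * index


# Hypothetical history (kg per ha per month) for one farmer and their group.
farmer_history = [50.0, 60.0, 70.0]
group_history = [40.0, 50.0, 60.0]

index = relative_yield_index(farmer_history, group_history)

# Hypothetical group-level prediction under some ENSO profile.
farmer_prediction = scale_prediction(80.0, index)

print(f"relative yield index: {index:.2f}")
print(f"farmer-level prediction: {farmer_prediction:.1f}")
```

Whether a multiplicative or an additive adjustment is more appropriate would need to be checked against farm-level data, since the persistence of yield advantage reported in the text does not by itself determine the functional form.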
The demonstration that on-farm yields and the response to one management variable, fertilizer, can be linked directly to ENSO OI data supports the view that, in the future, with cacao or other crops, data on farm yields obtained with distinct management practices can be coupled with ENSO OI data both to determine probable crop yields and to define differential crop response to management at specific sites under distinct ENSO OI patterns, without the need for accurate weather data. The ENSO OI data exist; what is often lacking is data on yield with distinct management practices. To obtain this type of information in heterogeneous growing environments using traditional Randomized Control Trials is simply not possible. However, we suggest that schemes such as the one used to collect the cacao data presented here, with distinct management treatments superimposed on farmers' fields20, can be used. Furthermore, even without superimposing management practices, simply monitoring crop performance, weather and the variation in management practices of farmers can be used to relate yield to variation in weather patterns and management28,29,30. However, this is only effective if data from a large number of cropping events are brought together for analysis, which requires social organization and the willingness to share data28. Our experience with cacao indicates that small farmers are willing to share data, but an external agency is required to manage the overall process of data collection and compilation20. Similar experiences with CropCheck in Australia and Chile support this point of view31,32. The value of shared information through formation of farmer groups is well established33,34 and we suggest that the methodology described here could be implemented through farmer groups.
Hence, through monitoring of crop performance and management coupled with Bayesian-based machine learning tools and currently available ENSO OI information and predictions, farmers and agronomists can adjust management practices, in this case fertilizer applications, according to ENSO profiles. This will require social organization and support for the collection, compilation and analysis of the data; however, we believe it offers a route to provide farmers with an improved and cost-effective knowledge base, derived from sparse data resources, to better manage their crops.
Social organization is required not only for the collection of data to be analysed, but also for the dissemination to farmers of the knowledge generated through its interpretation. Current approaches to providing farmers with the basis to make better decisions recognise the restrictions of the linear model for extension and tend towards active farmer participation in the interpretation of data through such mechanisms as farmers' field schools35, formation of farmers' groups (see for example Montaner 200434) and innovation networks (see for example Klerkx et al. 201036, Wood et al. 201437, World Bank, 200838). Further development of farmers' organizations and innovation networks will be required to effectively deploy the concepts presented in this paper.
The principles developed here could be applied to other crops, such as coffee, olive and oil palm, and this type of analysis could be extended to other regions, such as Africa where data on crop response to management and weather variation is sparse. At the same time, we note that additional information on, inter alia, crop management, topography and soil types could substantially improve the predictive power of the networks. Furthermore, these machine learning techniques can be used to mine existing big data sets collected by large commercial interests, to discover relationships between environment, management and crop production, and thereby supplement, at low cost, the findings generated by formal controlled scientific experiments. In the case of small farmers, social organization and external support will be required.
There are several caveats on the use of this proposed methodology. First, the relationship between the ENSO phenomenon and the weather patterns will be specific to each location or recommendation domain. Hence, models and inferences for management cannot be readily transferred from one recommendation domain to another. Furthermore, the definition of the area that comprises a recommendation domain is not simple. Thus, whilst we consider the principles developed here to be universal, the models themselves will be specific to each recommendation domain. Such domains are currently still difficult to define, but new approaches for doing so are becoming increasingly available (e.g. Rubiano et al. 201618; Rattalino Edreira et al. 201817).
A further complication of the suggested approach is the lack of understanding of the underlying mechanisms that establish the associations. This deficiency limits the ability to identify the specific causes of different crop productivities, and thus limits our ability to resolve these unidentified problems.
Growers' decisions on how much to invest in their crop production practices depend on the expected prices of the commodities they produce: when prices are expected to be high, they will invest more, and when prices are low they may even abandon their crops. It has not escaped our notice that the predictive power of the machine learning resources would also provide the cacao industry as a whole with insights into the fluctuations in future cacao supply and hence prices. This would allow farmers and others in the cacao supply chain to minimize uncertainty and better manage the overall industry. These experiences strongly support the idea that machine learning is a useful tool in our armoury, opening the opportunity to utilize information from on-farm performance coupled with publicly available data to improve agricultural management.
Source: Ecology - nature.com