CropYield—Towards pre-harvest crop yield forecasts with satellite remote sensing : Final report
Yli-Heikkilä, Maria; Wittke, Samantha; Kokkinen, Mirva; Partala, Anneli (2021)
Publication series
Natural resources and bio-economy studies
Number
80/2021
Pages
28 p.
Natural Resources Institute Finland (Luke)
2021
© Natural Resources Institute Finland (Luke)
The permanent address of the publication is
http://urn.fi/URN:ISBN:978-952-380-308-4
Abstract
The general objective of this project was to enhance crop statistics. To this end, we established a pilot case for an automated process for improved crop yield statistics by merging Earth observation (EO) data, administrative data, agro-meteorological data and historical crop statistics survey data. The significance of the approach is that the previously very laborious process of acquiring data from different sources and running the multistep modelling is now fully automated by design and can thus reduce spending on professional surveying. The main achievement is a new artificial intelligence-based crop yield forecasting system that can produce pre-harvest yield predictions for four main cereals (oats, barley, wheat and rye).
Surveys are very costly in terms of time and expense. The same is true of gathering expert estimates on regional crop yield forecasts. During the last decade, EO systems have been shown to provide an effective means for large-scale crop monitoring and yield estimation. In this sense, this project has fulfilled its promise to establish a pilot case for an automated process of improved crop yield statistics by merging EO data and a data-driven modelling approach. As a result, we can produce several in-season crop yield forecasts, the first already in late June, around the same time as the Joint Research Centre’s Europe-wide forecast. From then on, the forecasts can be published, for example, at 10-day intervals.
The machine learning models implemented in this project achieved a highly promising level of accuracy in pre-harvest yield predictions for four main cereals (oats, barley, wheat and rye) when compared to the Joint Research Centre’s and Luke’s seasonal forecasts. However, the problem of choosing the best model remains. No single model reliably predicted yields at all times. Therefore, a model comparison will be the most important developmental task ahead.
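The model comparison named above could start from something as simple as ranking candidate models by their error against the final yield statistics. The following is only a minimal sketch under stated assumptions: the model names, the choice of root-mean-square error as the metric, and all numbers are hypothetical and are not taken from the report.

```python
from math import sqrt


def rmse(predicted, observed):
    """Root-mean-square error between paired forecasts and final yields."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared) / len(squared))


def rank_models(forecasts, observed):
    """Return (model_name, rmse) pairs sorted from lowest to highest error."""
    scores = {name: rmse(preds, observed) for name, preds in forecasts.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up yields (kg/ha) for three seasons; illustration only.
observed = [3500.0, 3200.0, 3800.0]
forecasts = {
    "model_a": [3400.0, 3300.0, 3700.0],  # hypothetical model
    "model_b": [3500.0, 3000.0, 3800.0],  # hypothetical model
}

ranking = rank_models(forecasts, observed)
for name, score in ranking:
    print(f"{name}: RMSE = {score:.1f} kg/ha")
```

A ranking over a handful of seasons is fragile, which is consistent with the observation that no model won at all times; a fuller comparison would likely repeat this per crop, per region and per forecast date.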
In the context of agricultural statistics, more accurate in-season forecasts of crop yields benefit sustainable agriculture and food security through better-informed political decisions. In addition, reliable crop forecasts have market impacts. Moreover, EO-based applications can be applied globally. We expect that within a few years our EO-based crop forecasting will be proven to be a sound method to replace in-season regional expert estimates, and in the foreseeable future it will also gradually replace annual farm surveys.
The uptake of EO as a new data source in statistical production was more complex than initially expected. There is a myriad of approaches to monitoring crop yields, the main decisions being whether to utilise: i) optical or radar satellite data or both, ii) image mosaics or single images, iii) pixel-based or object-based image analysis. In addition, remote sensing requires specialized expertise, not to mention the specialized expertise needed in predictive modelling. We acknowledged our lack of remote sensing expertise and decided at the start to outsource the pre-processing of satellite images. With outsourcing, the sustainability of a project may be jeopardized if the outsourced know-how cannot be fully transferred to statistical production. In this sense, one significant achievement of the project has been the uptake of EO knowledge, with a substantial contribution from the National Land Survey of Finland; this knowledge became a sound part of our production system. As a result, we have the in-house readiness to apply EO as a new data source to other statistical themes as well.
We concluded that country-wide forecasts already seemed to work in June, probably due to the sampling weights inherited from the crop production surveys. However, for the regional forecasts the sampling data was inadequate. For regional forecasts, we would need to sample fields so as to achieve equal spatial coverage. Moreover, for the northern regions crop forecasting is reasonable only from July onwards due to the later sowing dates. Therefore, further study is needed to evaluate the best physiologically grounded observation window for each region.
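To illustrate how survey sampling weights can carry field-level predictions up to a country-wide figure, the sketch below computes a weighted mean yield. It is an assumption about the general technique, not a description of the project's pipeline: the function name, the weights and the yields are all hypothetical.

```python
def weighted_mean_yield(field_yields, weights):
    """Weighted mean of field-level yield predictions.

    Each weight stands for how much of the population a sampled field
    represents (an assumed interpretation of the survey weights).
    """
    if len(field_yields) != len(weights) or not field_yields:
        raise ValueError("need two non-empty sequences of equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(y * w for y, w in zip(field_yields, weights))
    return weighted_sum / total_weight


# Made-up predictions (kg/ha) and sampling weights; illustration only.
predicted = [3000.0, 4000.0, 3500.0]
weights = [10.0, 30.0, 20.0]

national = weighted_mean_yield(predicted, weights)
print(f"weighted national yield: {national:.1f} kg/ha")
```

Weights designed for a national sample do not guarantee even coverage within each region, which is one reading of why the regional forecasts would need a separate field sample.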
Deploying the forecasting pipeline requires further automation. In particular, the validation of the results at the end of the pipeline needs further scrutiny. Uploading the predictions to statistical production databases requires modifications to existing ICT systems. In addition, prediction model architectures need to be revised and improved as new data become available in the coming years.