Accurate prediction of sugarcane yield using a random forest algorithm

Everingham, Yvette, Sexton, Justin, Skocaj, Danielle, and Inman-Bamber, Geoff (2016) Accurate prediction of sugarcane yield using a random forest algorithm. Agronomy for Sustainable Development, 36 (27). pp. 1-9.

Foreknowledge about sugarcane crop size can help industry members make more informed decisions. There exist many different combinations of climate variables, seasonal climate prediction indices, and crop model outputs that could prove useful in explaining sugarcane crop size. A data mining method like random forests can cope with generating a prediction model when the search space of predictor variables is large. Research that has investigated the accuracy of random forests to explain annual variation in sugarcane productivity and the suitability of predictor variables generated from crop models coupled with observed climate and seasonal climate prediction indices is limited. Simulated biomass from the APSIM (Agricultural Production Systems sIMulator) sugarcane crop model, seasonal climate prediction indices and observed rainfall, maximum and minimum temperature, and radiation were supplied as inputs to a random forest classifier and a random forest regression model to explain annual variation in regional sugarcane yields at Tully, in northeastern Australia. Prediction models were generated on 1 September in the year before harvest, and then on 1 January and 1 March in the year of harvest, which typically runs from June to November. Our results indicated that in 86.36 % of years, it was possible to determine as early as September in the year before harvest if production would be above the median. This accuracy improved to 95.45 % by January in the year of harvest. The R-squared of the random forest regression model gradually improved from 66.76 to 79.21 % from September in the year before harvest through to March in the year of harvest. All three sets of variables, namely (i) simulated biomass indices, (ii) observed climate, and (iii) seasonal climate prediction indices, typically featured in the models at various stages.
Better crop predictions allow farmers to improve their nitrogen management to meet the demands of the new crop, allow mill managers to better plan the mill's labor requirements and maintenance scheduling, and allow marketers to manage the forward sale and storage of the crop more confidently. Hence, accurate yield forecasts can improve industry sustainability by delivering better environmental and economic outcomes.
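
The classification result in the abstract (whether production would be above the median, and in what percentage of years the call was correct) can be illustrated with a minimal sketch. It shows only how a median-split target and a percentage accuracy might be computed; it is not the random forest itself and not the authors' code. The function names and the six yield values are hypothetical, chosen for illustration, and are not data from the paper.

```python
from statistics import median


def label_above_median(yields):
    """Return True for each year whose yield is strictly above the median."""
    m = median(yields)
    return [y > m for y in yields]


def accuracy_percent(predicted, actual):
    """Percentage of years in which the predicted class matches the actual class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Hypothetical regional yields (t/ha) for six harvest years; not data from the paper.
yields = [78.0, 91.5, 85.2, 69.4, 95.1, 88.3]
actual = label_above_median(yields)

# Hypothetical classifier output that gets one of the six years wrong.
predicted = [False, True, False, False, True, False]

print(actual)
print(round(accuracy_percent(predicted, actual), 2))
```

With these illustrative numbers, five of the six years are classified correctly. In the study itself, the predicted classes would come from the random forest classifier fitted to the biomass, climate, and seasonal prediction inputs described above.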

Item ID: 43901
Item Type: Article (Research - C1)
ISSN: 1773-0155
Keywords: APSIM; agriculture; nitrogen; fertilizer; value chain; random forest
Funders: Sugar Research Australia (SRA)
Date Deposited: 20 Jul 2016 04:44
FoR Codes: 49 MATHEMATICAL SCIENCES > 4905 Statistics > 490599 Statistics not elsewhere classified @ 50%
30 AGRICULTURAL, VETERINARY AND FOOD SCIENCES > 3004 Crop and pasture production > 300402 Agro-ecosystem function and prediction @ 50%
SEO Codes: 82 PLANT PRODUCTION AND PLANT PRIMARY PRODUCTS > 8203 Industrial Crops > 820304 Sugar @ 100%