Ensemble data mining approaches to forecast regional sugarcane crop production

Everingham, Y.L., Smyth, C.W., and Inman-Bamber, N.G. (2009) Ensemble data mining approaches to forecast regional sugarcane crop production. Agricultural and Forest Meteorology, 149 (3). pp. 689-696.


View at Publisher Website: http://dx.doi.org/10.1016/j.agrformet.20...
 


Abstract

Accurate yield forecasts are pivotal for the success of any agricultural industry that plans or sells ahead of the annual harvest. Biophysical models that integrate information about crop growing conditions can give early insight into the likely size of a crop. At a point scale, where highly detailed knowledge about environmental and management conditions is available, the performance of reputable crop modelling approaches like APSIM has been well established. However, regional growing conditions tend not to be homogeneous. Heterogeneity is common in many agricultural systems, and particularly in sugarcane systems. To overcome this obstacle, hundreds of model settings (‘models’ for convenience) that represent different environmental and management conditions were created for Ayr, a major sugarcane growing region in north-eastern Australia. Ensemble-based statistical data mining methods were used to select and assign weights to the best models. One technique, called a lasso approximation, produced the best results. This procedure produced a predictive correlation (rcv) of 0.71 when predicting end-of-season sugarcane yields some 4 months prior to the start of the harvest season, and 10 months prior to harvest completion. This continuous forecasting methodology based on statistical ensembles represents a considerable improvement upon previous research, where only categorical forecast predictions had been employed.
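
The idea of selecting and weighting ensemble members with a lasso approximation can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the approximation is an incremental forward stagewise regression (the term appears in the keywords below), that the predictive correlation is obtained by leave-one-out cross-validation, and that the data are synthetic. All function names, the step size and the number of steps are hypothetical choices made for illustration.

```python
import math
import random


def column_stats(X):
    """Mean and standard deviation of each column of X (a list of rows)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    sds = []
    for j in range(p):
        var = sum((row[j] - means[j]) ** 2 for row in X) / n
        sds.append(math.sqrt(var) or 1.0)  # guard against constant columns
    return means, sds


def forward_stagewise(X, y, step=0.05, n_steps=250):
    """Incremental forward stagewise regression (a lasso-like approximation).

    Each step nudges the weight of the model whose standardised output is
    most aligned with the current residual. Models that are never chosen
    keep a weight of exactly zero, so the procedure both selects and weights.
    Returns (intercept, weights) on the original scale of X.
    """
    n, p = len(X), len(X[0])
    means, sds = column_stats(X)
    Z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]
    beta = [0.0] * p
    for _ in range(n_steps):
        scores = [sum(Z[i][j] * resid[i] for i in range(n)) for j in range(p)]
        j = max(range(p), key=lambda k: abs(scores[k]))
        if scores[j] == 0.0:
            break
        delta = step if scores[j] > 0 else -step
        beta[j] += delta
        resid = [resid[i] - delta * Z[i][j] for i in range(n)]
    weights = [beta[j] / sds[j] for j in range(p)]
    intercept = y_mean - sum(w * m for w, m in zip(weights, means))
    return intercept, weights


def predict(intercept, weights, row):
    """Weighted combination of the ensemble members for one season."""
    return intercept + sum(w * x for w, x in zip(weights, row))


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def loo_cv_correlation(X, y, **kwargs):
    """Correlation between observed yields and leave-one-out predictions."""
    preds = []
    for i in range(len(X)):
        b0, w = forward_stagewise(X[:i] + X[i + 1:], y[:i] + y[i + 1:], **kwargs)
        preds.append(predict(b0, w, X[i]))
    return pearson(preds, y)


# Synthetic example: 30 seasons, 20 candidate model settings.
# Only the first two settings actually drive the "observed" yield.
random.seed(0)
n_years, n_models = 30, 20
X = [[random.gauss(100, 10) for _ in range(n_models)] for _ in range(n_years)]
y = [0.6 * row[0] + 0.4 * row[1] + random.gauss(0, 2) for row in X]

intercept, weights = forward_stagewise(X, y)
selected = [j for j, w in enumerate(weights) if w != 0.0]
r_cv = loo_cv_correlation(X, y)
print("selected models:", selected)
print("cross-validated correlation:", round(r_cv, 2))
```

In this sketch each row of `X` stands for one season and each column for the simulated yield from one model setting. Because every held-out season is predicted from weights fitted without it, `r_cv` measures out-of-sample skill rather than goodness of fit.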

Item ID: 11085
Item Type: Article (Research - C1)
ISSN: 1873-2240
Keywords: predict; forward stagewise; simulation; top-down; machine learning; lasso;
Date Deposited: 16 May 2010 23:43
FoR Codes: 01 MATHEMATICAL SCIENCES > 0104 Statistics > 010401 Applied Statistics @ 34%
04 EARTH SCIENCES > 0401 Atmospheric Sciences > 040105 Climatology (excl Climate Change Processes) @ 33%
07 AGRICULTURAL AND VETERINARY SCIENCES > 0701 Agriculture, Land and Farm Management > 070199 Agriculture, Land and Farm Management not elsewhere classified @ 33%
SEO Codes: 82 PLANT PRODUCTION AND PLANT PRIMARY PRODUCTS > 8203 Industrial Crops > 820304 Sugar @ 40%
96 ENVIRONMENT > 9609 Land and Water Management > 960999 Land and Water Management of Environments not elsewhere classified @ 30%
97 EXPANDING KNOWLEDGE > 970104 Expanding Knowledge in the Earth Sciences @ 30%