A comparison of non-linear regression methods for improved on-line near infrared spectroscopic analysis of a sugarcane quality measure

Sexton, Justin, Everingham, Yvette, Donald, David, Staunton, Steve, and White, Ronald (2018) A comparison of non-linear regression methods for improved on-line near infrared spectroscopic analysis of a sugarcane quality measure. Journal of Near Infrared Spectroscopy, 26 (5). pp. 297-310.


View at Publisher Website: https://doi.org/10.1177/0967033518802448


Abstract

On-line near infrared (NIR) spectroscopic analysis systems play an important role in assessing the quality of sugarcane in Australia. As quality measures are used to calculate the payment made to growers, it is imperative that NIR models are both accurate and robust. Machine learning and non-linear modelling approaches have been explored as methods for developing improved NIR models in a variety of industrial settings, yet there has been little research into their application to cane quality measures. The objective of this paper was to compare chemometric models of commercial cane sugar (CCS) based on four calibration techniques. CCS was estimated using partial least squares regression (PLS), support vector regression (SVR), artificial neural networks (ANNs) and gradient boosted trees (GBTs). Model performance was assessed on an independent validation data set using root mean square error of prediction (RMSEP) and r² values. SVR (RMSEP = 0.37%; r² = 0.92) and ANN (RMSEP = 0.36%; r² = 0.93) performed similarly to PLS (RMSEP = 0.37%; r² = 0.92) on the validation data set, while GBT exhibited a much lower skill (RMSEP = 0.51%; r² = 0.85). Analysis of important wavelengths in each model showed that PLS regression, SVR and ANN techniques emphasized the importance of similar spectral regions. Future research should consider testing model robustness over seasons and/or regions. Comparisons of chemometric models should consider reporting variable importance as a way of understanding how models use spectral information.
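The two validation statistics named in the abstract can be illustrated with a minimal sketch. It assumes RMSEP is the square root of the mean squared difference between reference and predicted values, and that the r² statistic is the squared Pearson correlation between them; the abstract does not state which definition of r² was used, so that choice is an assumption. The function names and the sample CCS values are hypothetical and are not taken from the paper.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over paired values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(reference, predicted):
    """Squared Pearson correlation (assumed definition of r-squared)."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    covariance = sum((r - mean_ref) * (p - mean_pred)
                     for r, p in zip(reference, predicted))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_ref == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance ** 2 / (var_ref * var_pred)


if __name__ == "__main__":
    # Hypothetical laboratory CCS values (%) and model predictions (%).
    lab_ccs = [12.0, 13.5, 14.2, 15.1, 16.0]
    model_ccs = [12.3, 13.2, 14.5, 15.0, 16.4]
    print(f"RMSEP = {rmsep(lab_ccs, model_ccs):.2f}%")
    print(f"r2 = {r_squared(lab_ccs, model_ccs):.2f}")
```

Applying the same two functions to each model's predictions on one shared, independent validation set would give figures directly comparable across PLS, SVR, ANN and GBT, which is the form of comparison the abstract reports.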

Item ID: 56063
Item Type: Article (Research - C1)
ISSN: 1751-6552
Keywords: commercial cane sugar, sugarcane, gradient boosting, neural networks, cane analysis system, variable importance
Copyright Information: © The Author(s) 2018.
Funders: Sugar Research Australia (SRA), James Cook University (JCU)
Date Deposited: 07 Nov 2018 08:40
FoR Codes: 01 MATHEMATICAL SCIENCES > 0104 Statistics > 010401 Applied Statistics @ 50%
07 AGRICULTURAL AND VETERINARY SCIENCES > 0701 Agriculture, Land and Farm Management > 070199 Agriculture, Land and Farm Management not elsewhere classified @ 50%
SEO Codes: 82 PLANT PRODUCTION AND PLANT PRIMARY PRODUCTS > 8203 Industrial Crops > 820304 Sugar @ 100%
