SMOTE-ENN resampling technique with Bayesian optimization for multi-class classification of dry bean varieties
Mukherjee, Arnab, Chalak Qazani, Mohamadreza, Rana, B.M. Jewel, Akter, Shahina, Mohajerzadeh, Amirhossein, Sathi, Nusrat Jahan, Ali, Lasker Ershad, Khan, Md Salauddin, and Asadi, Houshyar (2025) SMOTE-ENN resampling technique with Bayesian optimization for multi-class classification of dry bean varieties. Applied Soft Computing, 181. 113467.
PDF (Published Version). Available under License Creative Commons Attribution.
Abstract
The imbalanced classification problem poses a significant challenge in machine learning, often resulting in biased models and poor performance for minority classes. This study introduces an innovative hybrid resampling technique combining Synthetic Minority Oversampling Technique and Edited Nearest Neighbours (SMOTE-ENN), optimized using Bayesian Optimization, to address these limitations. The proposed framework integrates advanced feature pre-processing, hybrid resampling, and machine learning models to enhance classification performance. Using the publicly available dry bean dataset containing 16 geometric features of seven seed varieties, the methodology demonstrates remarkable improvements in predictive accuracy and class balance. Employing cutting-edge classifiers, the improved Light Gradient Boosting Machine (LBM) with Bayesian optimization achieved an unprecedented accuracy of 99.59%, outperforming traditional approaches. Results reveal the potential of hybrid resampling techniques and Bayesian optimization in effectively capturing feature patterns, enhancing model diversity, and ensuring robust classification of imbalanced datasets. This research underscores the application of soft computing methods to real-world multi-class classification challenges, offering practical insights for similar domains.
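
As an illustration of the resampling idea named in the abstract, the following is a minimal standard-library sketch of SMOTE followed by Edited Nearest Neighbours. It is not the authors' implementation: the function names (`k_nearest`, `smote`, `enn`, `smote_enn`), the neighbour count `k`, the majority-vote cleaning rule, and the toy data are all assumptions made for this example, and it covers neither the Bayesian optimization step nor the gradient boosting classifier. In practice a maintained implementation such as `SMOTEENN` from the imbalanced-learn package would normally be preferred.

```python
import math
import random
from collections import Counter


def k_nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, k=3, rng=None):
    """Oversample every class up to the size of the largest class by
    interpolating between a sample and one of its same-class neighbours."""
    if rng is None:
        rng = random.Random(0)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # no same-class neighbour to interpolate with
        for _ in range(target - n):
            i = rng.randrange(len(members))
            j = rng.choice(k_nearest(i, members, min(k, len(members) - 1)))
            gap = rng.random()  # position along the segment from i to j
            synthetic = [a + gap * (b - a) for a, b in zip(members[i], members[j])]
            X_out.append(synthetic)
            y_out.append(label)
    return X_out, y_out


def enn(X, y, k=3):
    """Drop samples whose label disagrees with the majority label of their
    k nearest neighbours (assumed rule: majority vote, applied to all classes)."""
    keep = []
    for i in range(len(X)):
        neighbours = k_nearest(i, X, k)
        majority = Counter(y[j] for j in neighbours).most_common(1)[0][0]
        if majority == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, k=3, seed=0):
    """Oversample with SMOTE, then clean the result with ENN."""
    X_s, y_s = smote(X, y, k=k, rng=random.Random(seed))
    return enn(X_s, y_s, k=k)


# Hypothetical toy data: six majority samples and three minority samples.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8],
     [10, 10], [10, 11], [11, 10]]
y = [0] * 6 + [1] * 3
X_res, y_res = smote_enn(X, y, k=2, seed=1)
print(Counter(y_res))
```

The oversampling step balances the class counts, and the cleaning step then removes samples that sit among neighbours of a different class. On the well-separated toy data above nothing is removed; on overlapping classes the cleaning step would typically discard borderline samples from any class.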