SMOTE-ENN resampling technique with Bayesian optimization for multi-class classification of dry bean varieties

Mukherjee, Arnab, Chalak Qazani, Mohamadreza, Rana, B.M. Jewel, Akter, Shahina, Mohajerzadeh, Amirhossein, Sathi, Nusrat Jahan, Ali, Lasker Ershad, Khan, Md Salauddin, and Asadi, Houshyar (2025) SMOTE-ENN resampling technique with Bayesian optimization for multi-class classification of dry bean varieties. Applied Soft Computing, 181. 113467.

PDF (Published Version)
Available under License Creative Commons Attribution.

View at Publisher Website: https://doi.org/10.1016/j.asoc.2025.1134...


Abstract

The imbalanced classification problem poses a significant challenge in machine learning, often resulting in biased models and poor performance for minority classes. This study introduces an innovative hybrid resampling technique combining Synthetic Minority Oversampling Technique and Edited Nearest Neighbours (SMOTE-ENN), optimized using Bayesian Optimization, to address these limitations. The proposed framework integrates advanced feature pre-processing, hybrid resampling, and machine learning models to enhance classification performance. Using the publicly available dry bean dataset containing 16 geometric features of seven seed varieties, the methodology demonstrates remarkable improvements in predictive accuracy and class balance. Employing cutting-edge classifiers, the improved Light Gradient Boosting Machine (LBM) with Bayesian optimization achieved an unprecedented accuracy of 99.59%, outperforming traditional approaches. Results reveal the potential of hybrid resampling techniques and Bayesian optimization in effectively capturing feature patterns, enhancing model diversity, and ensuring robust classification of imbalanced datasets. This research underscores the application of soft computing methods to real-world multi-class classification challenges, offering practical insights for similar domains.
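
For readers unfamiliar with the resampling step named in the abstract, the following is a minimal, standard-library-only sketch of the general SMOTE-ENN idea: oversample smaller classes by interpolating between same-class neighbours, then remove samples whose label disagrees with most of their nearest neighbours. It is not the authors' implementation and omits the Bayesian optimization and gradient boosting stages entirely. The function names (`nearest`, `smote`, `enn`, `smote_enn`), the choice of `k=3`, the majority-vote editing rule, and the toy two-class data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def nearest(point, pool, k):
    """Return indices of the k points in pool closest to point (Euclidean)."""
    order = sorted(range(len(pool)), key=lambda i: math.dist(point, pool[i]))
    return order[:k]


def smote(X, y, k=3, seed=0):
    """Oversample every class up to the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, c in zip(X, y) if c == label]
        if len(members) < 2:
            continue  # interpolation needs at least two points
        for _ in range(target - n):
            base = rng.choice(members)
            # Drop the first hit, which is normally the base point itself.
            neighbours = nearest(base, members, k + 1)[1:]
            mate = members[rng.choice(neighbours)]
            gap = rng.random()
            # New point lies on the segment between base and its neighbour.
            X_out.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
            y_out.append(label)
    return X_out, y_out


def enn(X, y, k=3):
    """Keep only samples whose label matches the majority of their k neighbours."""
    keep_X, keep_y = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        others = [j for j in range(len(X)) if j != i]
        ranked = sorted(others, key=lambda j: math.dist(x, X[j]))[:k]
        majority = Counter(y[j] for j in ranked).most_common(1)[0][0]
        if majority == label:
            keep_X.append(x)
            keep_y.append(label)
    return keep_X, keep_y


def smote_enn(X, y, k=3, seed=0):
    """Apply SMOTE oversampling first, then ENN cleaning."""
    X_s, y_s = smote(X, y, k=k, seed=seed)
    return enn(X_s, y_s, k=k)


# Invented toy data: class "A" has 6 points, class "B" has 4,
# and the last "B" point sits inside the "A" cluster as label noise.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4), (0.4, 0.1),
     (5.0, 5.0), (5.2, 5.1), (5.1, 4.8),
     (0.15, 0.2)]
y = ["A"] * 6 + ["B"] * 4

X_res, y_res = smote_enn(X, y)
print("before:", dict(Counter(y)))
print("after: ", dict(Counter(y_res)))
```

In practice a maintained implementation such as `SMOTEENN` in the imbalanced-learn package would normally be preferred over a hand-written version like this one; the sketch only makes the two stages visible.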

Item ID: 86689
Item Type: Article (Research - C1)
ISSN: 1872-9681
Copyright Information: © 2025 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Date Deposited: 27 Aug 2025 01:11
FoR Codes: 30 AGRICULTURAL, VETERINARY AND FOOD SCIENCES > 3004 Crop and pasture production > 300408 Crop and pasture post harvest technologies (incl. transportation and storage) @ 40%
40 ENGINEERING > 4007 Control engineering, mechatronics and robotics > 400702 Automation engineering @ 20%
46 INFORMATION AND COMPUTING SCIENCES > 4611 Machine learning > 461104 Neural networks @ 40%
SEO Codes: 24 MANUFACTURING > 2401 Agricultural chemicals > 240103 Crop and pasture protection chemicals @ 30%
28 EXPANDING KNOWLEDGE > 2801 Expanding knowledge > 280101 Expanding knowledge in the agricultural, food and veterinary sciences @ 20%
22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220402 Applied computing @ 50%