Clustered Automated Machine Learning (CAML) model for clinical coding multi-label classification
Mustafa, Akram, and Rahimi Azghadi, Mostafa (2025) Clustered Automated Machine Learning (CAML) model for clinical coding multi-label classification. International Journal of Medical Informatics, 16. pp. 1507-1529.
|
PDF (Published Version)
- Published Version
Available under License Creative Commons Attribution. Download (929kB) | Preview |
Abstract
Clinical coding is a time-consuming task that involves manually identifying and classifying patients’ diseases. This task becomes even more challenging when classifying across multiple diagnoses and performing multi-label classification. Automated Machine Learning (AutoML) techniques can improve this classification process. However, no previous study has developed an AutoML-based approach for multi-label clinical coding. To address this gap, a novel approach, called Clustered Automated Machine Learning (CAML), is introduced in this paper. CAML utilizes the AutoML library Auto-Sklearn and cTAKES feature extraction method. CAML clusters binary diagnosis labels using Hamming distance and employs the AutoML library to select the best algorithm for each cluster. The effectiveness of CAML is evaluated by comparing its performance with that of the Auto-Sklearn model on five different datasets from the Medical Information Mart for Intensive Care (MIMIC III) database of reports. These datasets vary in size, label set, and related diseases. The results demonstrate that CAML outperforms Auto-Sklearn in terms of Micro F1-score and Weighted F1-score, with an overall improvement ratio of 35.15% and 40.56%, respectively. The CAML approach offers the potential to improve healthcare quality by facilitating more accurate diagnoses and treatment decisions, ultimately enhancing patient outcomes.
Item ID: | 85958 |
---|---|
Item Type: | Article (Research - C1) |
ISSN: | 1872-8243 |
Copyright Information: | This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
Date Deposited: | 24 Jun 2025 22:54 |
FoR Codes: | 46 INFORMATION AND COMPUTING SCIENCES > 4611 Machine learning > 461199 Machine learning not elsewhere classified @ 100% |
SEO Codes: | 22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220403 Artificial intelligence @ 100% |
Downloads: |
Total: 1 Last 12 Months: 1 |
More Statistics |