Predicting Overall Survival Using Machine Learning Algorithms in Oral Cavity Squamous Cell Carcinoma

Tan, Jia Yan, Adeoye, John, Thomson, Peter, Sharma, Dileep, Ramamurthy, Poornima, and Choi, Siu-Wai (2022) Predicting Overall Survival Using Machine Learning Algorithms in Oral Cavity Squamous Cell Carcinoma. Anticancer Research, 42 (12). pp. 5859-5866.

Background/Aim: Machine learning (ML) models are often developed to predict cancer prognosis but rarely account for spatial factors within a region. This study therefore explored ML algorithms that use Local Government Areas (LGAs) in Queensland, Australia, as a spatial feature to predict the 3- and 5-year prognosis of oral cancer patients, and to provide clinical interpretability of the outcomes predicted by the ML model.

Patients and Methods: Data from a total of 3,841 oral cancer patients were retrieved from the Queensland Cancer Registry (QCR). The synthetic minority oversampling technique combined with edited nearest neighbours (SMOTE-ENN) was used to pre-process the imbalanced datasets. Five ML models were trained: logistic regression, random forest classifier, XGBoost, Gaussian Naïve Bayes and a Voting Classifier. Predictive features were age, sex, LGA, tumour site and differentiation. Outcomes were 3- and 5-year overall survival of patients. Model performance on the test set was evaluated using the area under the curve and F1 score. The SHapley Additive exPlanations (SHAP) method was applied to the best-performing model to interpret its predicted outcomes.
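
As a rough illustration of the oversampling half of SMOTE-ENN, the sketch below creates synthetic minority-class samples by interpolating between a minority point and one of its nearest minority neighbours. It is a toy version on made-up numeric data, not the study's pipeline: the function name `smote_sketch`, the parameter `k` and the sample values are hypothetical, and the edited-nearest-neighbours cleaning step is omitted.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class samples with two numeric features.
minority_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority_points, n_new=3)
```

Because every synthetic point lies on a segment between two real minority samples, the new data stay inside the region the minority class already occupies. In practice a maintained implementation, such as `SMOTEENN` in the imbalanced-learn package, would normally be preferred over a hand-written version.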

Results: The Voting Classifier was the best-performing model, with F1 scores of 0.58 and 0.64 for 3- and 5-year overall survival, respectively. Age was the most important feature in the Voting Classifier for both 3- and 5-year prognosis prediction. LGA at diagnosis was among the top three predictive features in both the 3- and 5-year models.
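
To make the reported metric concrete, the following minimal sketch shows how a voting ensemble can combine base-model outputs and how an F1 score is then computed. The abstract does not state whether hard or soft voting was used, so soft voting (averaging predicted probabilities) is an assumption here; the probabilities, labels and the 0.5 threshold are invented for illustration only.

```python
def soft_vote(prob_lists):
    """Average per-model probabilities of the positive class, sample by sample."""
    n_models = len(prob_lists)
    return [sum(sample) / n_models for sample in zip(*prob_lists)]


def f1_score(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical positive-class probabilities from three base models for five patients.
model_probs = [
    [0.9, 0.2, 0.6, 0.4, 0.7],
    [0.8, 0.3, 0.4, 0.3, 0.6],
    [0.7, 0.1, 0.7, 0.6, 0.5],
]
y_true = [1, 0, 1, 1, 0]

avg_probs = soft_vote(model_probs)
y_pred = [1 if p >= 0.5 else 0 for p in avg_probs]
score = f1_score(y_true, y_pred)
```

F1 is the harmonic mean of precision and recall, so it penalises a model that does well on only one of the two, which is why it is commonly reported for imbalanced outcomes.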

Conclusion: The Voting Classifier demonstrated the best overall performance in classifying both 3- and 5-year overall survival of oral cancer patients in Queensland. The SHAP method provided clinical understanding of the predictive features of the Voting Classifier.
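
SHAP attributes a single prediction to individual features using Shapley values. The sketch below computes exact Shapley values for a toy three-feature model by averaging each feature's marginal contribution over all feature orderings. Everything in it is hypothetical: the scoring function `toy_risk`, its coefficients, the patient and the reference values are invented, and replacing an "absent" feature with a fixed reference value is just one simple convention. The SHAP library itself relies on approximations and background data rather than this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    features = list(x)
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)  # start with every feature at its reference value
        prev = model(current)
        for f in order:
            current[f] = x[f]  # switch feature f to its observed value
            new = model(current)
            contrib[f] += new - prev
            prev = new
    return {f: total / len(orderings) for f, total in contrib.items()}


def toy_risk(v):
    # Hypothetical risk score with an interaction between age and tumour site.
    return 0.04 * v["age"] + 0.1 * v["site"] + 0.01 * v["age"] * v["site"] + 0.1 * v["sex"]


patient = {"age": 70, "site": 1, "sex": 1}
reference = {"age": 50, "site": 0, "sex": 0}
phi = shapley_values(toy_risk, patient, reference)
```

The values in `phi` sum to the difference between the patient's score and the reference score, which is the property that lets a prediction be read as a sum of per-feature contributions.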

Item ID: 76952
Item Type: Article (Research - C1)
ISSN: 1791-7530
Copyright Information: Copyright © 2022 International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
Date Deposited: 07 Dec 2022 00:29
FoR Codes: 32 BIOMEDICAL AND CLINICAL SCIENCES > 3203 Dentistry > 320305 Oral and maxillofacial surgery @ 50%
32 BIOMEDICAL AND CLINICAL SCIENCES > 3211 Oncology and carcinogenesis > 321199 Oncology and carcinogenesis not elsewhere classified @ 50%
SEO Codes: 20 HEALTH > 2001 Clinical health > 200105 Treatment of human diseases and conditions @ 100%