Ensemble Regression Modelling for Genetic Network Inference

Gamage, Hasini Nakulugamuwa, Chetty, Madhu, Shatte, Adrian, and Hallinan, Jennifer (2022) Ensemble Regression Modelling for Genetic Network Inference. In: Proceedings of the IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology. From: CIBCB 2022: IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, 15-17 August 2022, Ottawa, Canada.


View at Publisher Website: https://doi.org/10.1109/CIBCB55180.2022....


Accurate reconstruction of Gene Regulatory Networks (GRNs) from time-series gene expression data is crucial for discovering complex biological interactions. Many existing GRN inference methods produce a high number of false-positive interactions and are unstable, requiring fine-tuning of many parameters. In this paper, we treat GRN inference as a regression problem and propose a simple ensemble regression-based feature selection model that combines cross-validated Lasso (LassoCV) and cross-validated Ridge (RidgeCV) algorithms to reconstruct GRNs. The proposed ensemble model is able to eliminate overfitting, multicollinearity issues, and irrelevant genes within one computational approach. In addition to identifying the type of each gene-gene regulatory interaction, the regression model identifies the direction of the interaction. A new coefficient of determination (R²)-based approach identifies which of LassoCV and RidgeCV best fits the data, and evaluates each model's importance in terms of the gene-wise maximum in-degree, which determines the maximum number of regulatory genes, including self-regulations, that can be selected by a given method. A majority voting technique based on the evaluated gene scores then aggregates the gene lists selected by each method. In our experiments, the proposed ensemble approach was evaluated on gene expression datasets from three small-scale real gene networks. The proposed model outperformed other state-of-the-art methods, producing a high number of true positives, fewer false positives, and high Structural Accuracy, while maintaining model stability and efficiency.
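The pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact scoring or in-degree rules, so the R²-share allocation and the weighted vote below are hypothetical stand-ins. The sketch assumes scikit-learn's `LassoCV` and `RidgeCV` estimators, and the function name `infer_regulators` and its parameters are made up for this example.

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV


def infer_regulators(expr, target, max_in_degree=3, cv=3):
    """Return (regulator, target, sign) edges for one target gene.

    expr is a (time points x genes) array. All names here are hypothetical.
    """
    # Regress the target at time t+1 on every gene (itself included) at time t.
    X, y = expr[:-1], expr[1:, target]
    # Standardise so coefficient magnitudes are comparable across genes
    # (assumes no gene is constant over time).
    X = (X - X.mean(axis=0)) / X.std(axis=0)

    models = {
        "lasso": LassoCV(cv=cv, max_iter=10000).fit(X, y),
        "ridge": RidgeCV(alphas=np.logspace(-3, 3, 13), cv=cv).fit(X, y),
    }
    # R^2 of each fitted model on the training data, floored at zero.
    r2 = {name: max(float(m.score(X, y)), 0.0) for name, m in models.items()}
    total = sum(r2.values())

    n_genes = expr.shape[1]
    scores = np.zeros(n_genes)
    signed = np.zeros(n_genes)
    for name, model in models.items():
        weight = r2[name] / total if total > 0 else 0.0
        # ASSUMPTION: a model's share of the in-degree budget follows its R^2 share.
        k = max(1, int(round(max_in_degree * weight)))
        ranked = np.argsort(-np.abs(model.coef_))[:k]
        for g in ranked:
            if model.coef_[g] != 0.0:  # Lasso may have dropped this gene
                scores[g] += weight
                signed[g] += weight * np.sign(model.coef_[g])

    # Weighted vote: keep the best-scoring genes, at most max_in_degree of them.
    order = sorted(range(n_genes), key=lambda g: -scores[g])
    chosen = [g for g in order if scores[g] > 0][:max_in_degree]
    return [(g, target, "+" if signed[g] >= 0 else "-") for g in chosen]
```

Calling `infer_regulators(expr, target=j)` for each gene `j` would give one directed, signed edge list per target; the sign of the aggregated coefficient stands in for the interaction type (activation or repression), and the regulator-to-target ordering for its direction. Because the target gene's own lagged expression is among the predictors, self-regulation can be selected as well.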

Item ID: 81625
Item Type: Conference Item (Research - E1)
ISBN: 9781665484626
Keywords: Systematics; Computational modeling; Biological system modeling; Time series analysis; Feature extraction; Stability analysis; Data models
Copyright Information: Copyright © 2022, IEEE
Date Deposited: 14 Feb 2024 23:34
FoR Codes: 31 BIOLOGICAL SCIENCES > 3102 Bioinformatics and computational biology > 310202 Biological network analysis @ 60%
46 INFORMATION AND COMPUTING SCIENCES > 4611 Machine learning > 461199 Machine learning not elsewhere classified @ 40%
SEO Codes: 28 EXPANDING KNOWLEDGE > 2801 Expanding knowledge > 280102 Expanding knowledge in the biological sciences @ 100%
Downloads: Total: 1