Prediction and validation of protein–protein interactors from genome-wide DNA-binding data using a knowledge-based machine-learning approach
Waardenberg, Ashley J., Homan, Bernou, Mohamed, Stephanie, Harvey, Richard P., and Bouveret, Romaric (2016) Prediction and validation of protein–protein interactors from genome-wide DNA-binding data using a knowledge-based machine-learning approach. Open Biology, 6. 160183.
Abstract
The ability to accurately predict the DNA targets and interacting cofactors of transcriptional regulators from genome-wide data can significantly advance our understanding of gene regulatory networks. NKX2-5 is a homeodomain transcription factor that sits high in the cardiac gene regulatory network and is essential for normal heart development. We previously identified genomic targets for NKX2-5 in mouse HL-1 atrial cardiomyocytes using DNA-adenine methyltransferase identification (DamID). Here, we apply machine learning algorithms and propose a knowledge-based feature selection method for predicting NKX2-5 protein–protein interactions based on motif grammar in genome-wide DNA-binding data. We assessed model performance using leave-one-out cross-validation and a completely independent DamID experiment performed with replicates. In addition to identifying previously described NKX2-5-interacting proteins, including GATA, HAND and TBX family members, a number of novel interactors were identified, with direct protein–protein interactions between NKX2-5 and retinoid X receptor (RXR), paired-related homeobox (PRRX) and Ikaros zinc fingers (IKZF) validated using the yeast two-hybrid assay. We also found that the interaction of RXRα with NKX2-5 mutations found in congenital heart disease (Q187H, R189G and R190H) was altered. These findings highlight an intuitive approach to accessing protein–protein interaction information of transcription factors in DNA-binding experiments.
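The abstract mentions leave-one-out cross-validation without describing the procedure. As a rough illustration only, the sketch below shows what that evaluation loop looks like on motif-count features. Everything in it is an assumption made for this example: the toy data are invented, the three "motif" columns are hypothetical, and the nearest-centroid classifier is a simple stand-in rather than any of the algorithms used in the paper.

```python
from typing import List, Sequence


def centroid(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_nearest_centroid(train_X, train_y, x) -> int:
    """Assign x to the class whose training centroid is closest.

    Stand-in classifier chosen for brevity, not the paper's model.
    """
    best_label, best_d = None, float("inf")
    for label in sorted(set(train_y)):
        members = [r for r, y in zip(train_X, train_y) if y == label]
        d = sq_dist(centroid(members), x)
        if d < best_d:
            best_label, best_d = label, d
    return best_label


def leave_one_out_accuracy(X, y) -> float:
    """Hold out each sample once, train on the rest, score the held-out one."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict_nearest_centroid(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Invented toy data: each row is a genomic region, each column the count of
# one hypothetical motif within that region.
X = [
    [3, 2, 0], [4, 1, 0], [3, 3, 1],  # label 1: regions treated as bound
    [0, 1, 3], [1, 0, 4], [0, 0, 3],  # label 0: regions treated as background
]
y = [1, 1, 1, 0, 0, 0]

accuracy = leave_one_out_accuracy(X, y)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

Because each sample is scored by a model that never saw it, the estimate uses all the data for training while still testing on unseen examples, which suits small datasets. It does not replace validation on an independent experiment, which the abstract reports separately.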
Item ID: | 55656 |
---|---|
Item Type: | Article (Research - C1) |
ISSN: | 2046-2441 |
Keywords: | machine learning, protein–protein interactions, transcription factors, gene regulatory networks |
Copyright Information: | Copyright © 2016 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited. |
Funders: | National Health and Medical Research Council (NHMRC), Australian Research Council (ARC), University of New South Wales (UNSW) |
Projects and Grants: | NHMRC 573703, NHMRC 1061539, NHMRC 573705, ARC DP0988507 |
Date Deposited: | 28 Sep 2018 00:29 |
FoR Codes: | 31 BIOLOGICAL SCIENCES > 3102 Bioinformatics and computational biology > 310202 Biological network analysis @ 100% |
SEO Codes: | 97 EXPANDING KNOWLEDGE > 970106 Expanding Knowledge in the Biological Sciences @ 100% |
Downloads: | Total: 737 |