Reliably detecting clinically important variants requires both combined variant calls and optimized filtering strategies

Field, Matthew A., Cho, Vicky, Andrews, T. Daniel, and Goodnow, Chris C. (2015) Reliably detecting clinically important variants requires both combined variant calls and optimized filtering strategies. PLoS ONE, 10 (11). e0143199. pp. 1-19.

Available under License Creative Commons Attribution.



A diversity of tools is available for identifying variants from genome sequence data. Given the current complexity of incorporating external software into a genome analysis infrastructure, there is a tendency to rely on the results of a single tool. The quality of the output variant calls is highly variable, however, depending on factors such as sequence library quality and the choice of short-read aligner, variant caller, and variant caller filtering strategy. Here we present a two-part study. First, we use the high-quality 'genome in a bottle' reference set to demonstrate the significant impact that the choice of aligner, variant caller, and variant caller filtering strategy has on overall variant call quality, and further show that certain variant callers outperform others as sample contamination increases, an important consideration when analyzing sequenced cancer samples. This analysis confirms previous work showing that combining the variant calls of multiple tools yields the best-quality variant set, for either specificity or sensitivity, depending on whether the intersection or the union of all variant calls is used, respectively. Second, we analyze a melanoma cell line derived from a control lymphocyte sample to determine whether software choices affect the detection of clinically important melanoma risk-factor variants, finding that only one of the three such variants is unanimously detected under all conditions. Finally, we describe a cogent strategy for implementing a clinical variant detection pipeline: a strategy that requires careful software selection, optimized variant caller filtering, and combined variant calls in order to effectively minimize false negative variants. While implementing such features increases complexity and computation, the results offer indisputable improvements in data quality.
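
The intersection and union strategies mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the tool labels (caller_a, caller_b, caller_c), and the variant records are all hypothetical. It assumes each caller's output has already been reduced to a set of (chromosome, position, reference allele, alternate allele) tuples.

```python
from typing import Dict, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
# This representation is an assumption made for illustration only.
Variant = Tuple[str, int, str, str]


def combine_calls(calls_by_tool: Dict[str, Set[Variant]]) -> Dict[str, Set[Variant]]:
    """Return the intersection and union of the per-tool call sets."""
    call_sets = list(calls_by_tool.values())
    if not call_sets:
        return {"intersection": set(), "union": set()}
    return {
        # Variants reported by every tool: favors specificity.
        "intersection": set.intersection(*call_sets),
        # Variants reported by at least one tool: favors sensitivity.
        "union": set.union(*call_sets),
    }


def is_unanimous(variant: Variant, calls_by_tool: Dict[str, Set[Variant]]) -> bool:
    """True if every tool reported the given variant."""
    return all(variant in calls for calls in calls_by_tool.values())


# Hypothetical call sets from three hypothetical callers.
calls = {
    "caller_a": {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")},
    "caller_b": {("chr1", 100, "A", "G"), ("chr3", 300, "G", "A")},
    "caller_c": {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")},
}
combined = combine_calls(calls)
```

In this toy input only the chr1 variant is in the intersection, while the union keeps all three. A variant such as the chr2 one, missed by a single caller, would be lost by any pipeline relying on that caller alone.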

Item ID: 43589
Item Type: Article (Research - C1)
ISSN: 1932-6203
Additional Information:

© 2015 Field et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funders: National Institutes of Health (NIH), National Health and Medical Research Council of Australia (NHMRC), National Collaborative Research Infrastructure Strategy, Melanoma Institute of Australia, Bioplatforms Australia
Projects and Grants: NIH Grant ID: U19 AII00627, NHMRC Grant ID Australia Fellowship 585490
Date Deposited: 13 Apr 2016 04:03
FoR Codes: 06 BIOLOGICAL SCIENCES > 0601 Biochemistry and Cell Biology > 060102 Bioinformatics @ 25%
06 BIOLOGICAL SCIENCES > 0604 Genetics > 060408 Genomics @ 25%
08 INFORMATION AND COMPUTING SCIENCES > 0803 Computer Software > 080301 Bioinformatics Software @ 50%
SEO Codes: 89 INFORMATION AND COMMUNICATION SERVICES > 8902 Computer Software and Services > 890299 Computer Software and Services not elsewhere classified @ 100%
Downloads: Total: 919
Last 12 Months: 51