Performance and precision of double digestion RAD (ddRAD) genotyping in large multiplexed datasets of marine fish species

Maroso, F., Hillen, J.E.J., Pardo, B.G., Gkagkavouzis, K., Coscia, I., Hermida, M., Franch, R., Hellemans, B., Van Houdt, J., Simionati, B., Taggart, J.B., Nielsen, E.E., Maes, G.E., Ciavaglia, S.A., Webster, L.M., Volckaert, F.A.M., Martínez, P., Bargelloni, L., Ogden, R., and AquaTrace Consortium (2018) Performance and precision of double digestion RAD (ddRAD) genotyping in large multiplexed datasets of marine fish species. Marine Genomics, 39. pp. 64-72.


View at Publisher Website: https://doi.org/10.1016/j.margen.2018.02...


Abstract

The development of Genotyping-By-Sequencing (GBS) technologies enables cost-effective analysis of large numbers of Single Nucleotide Polymorphisms (SNPs), especially in "non-model" species. Nevertheless, as such technologies enter a mature phase, biases and errors inherent to GBS are becoming evident. Here, we evaluated the performance of double digest Restriction enzyme Associated DNA (ddRAD) sequencing in SNP genotyping studies including a high number of samples. Datasets of sequence data were generated from three marine teleost species (> 5500 samples, > 2.5 × 10^12 bases in total), using a standardized protocol. A common bioinformatics pipeline based on STACKS was established, with and without the use of a reference genome. We performed analyses throughout the production and analysis of ddRAD data in order to explore (i) the loss of information due to heterogeneous raw read number across samples; (ii) the discrepancy between expected and observed tag length and coverage; (iii) the performance of reference-based vs. de novo approaches; and (iv) the sources of potential genotyping errors of the library preparation/bioinformatics protocol, by comparing technical replicates. Our results showed that use of a reference genome and a posteriori genotype correction improved genotyping precision. Individual read coverage was a key variable for reproducibility; variance in sequencing depth between loci in the same individual was also identified as an important factor and found to correlate with tag length. A comparison of downstream analyses carried out with ddRAD vs. single-SNP allele-specific assay genotypes provided information about the levels of genotyping imprecision that can have a significant impact on allele frequency estimations and population assignment. The results and insights presented here will help to select and improve approaches to the analysis of large datasets based on RAD-like methodologies.

Item ID: 54594
Item Type: Article (Research - C1)
ISSN: 1874-7478
Keywords: ddRAD, European sea bass, GBS, Gilthead sea bream, sequencing precision, turbot
Funders: European Community's Seventh Framework Programme, Flanders Innovation & Entrepreneurship, Onassis Foundation
Date Deposited: 18 Jul 2018 07:51
FoR Codes: 31 BIOLOGICAL SCIENCES > 3102 Bioinformatics and computational biology > 310207 Statistical and quantitative genetics @ 20%
31 BIOLOGICAL SCIENCES > 3105 Genetics > 310509 Genomics @ 30%
31 BIOLOGICAL SCIENCES > 3105 Genetics > 310599 Genetics not elsewhere classified @ 50%
SEO Codes: 83 ANIMAL PRODUCTION AND ANIMAL PRIMARY PRODUCTS > 8302 Fisheries - Wild Caught > 830204 Wild Caught Fin Fish (excl. Tuna) @ 30%
83 ANIMAL PRODUCTION AND ANIMAL PRIMARY PRODUCTS > 8399 Other Animal Production and Animal Primary Products > 839902 Fish Product Traceability and Quality Assurance @ 30%
96 ENVIRONMENT > 9608 Flora, Fauna and Biodiversity > 960808 Marine Flora, Fauna and Biodiversity @ 40%