Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome

Edwards, Richard J., Field, Matt A., Ferguson, James M., Dudchenko, Olga, Keilwagen, Jens, Rosen, Benjamin D., Johnson, Gary S., Rice, Edward S., Hillier, La Deanna, Hammond, Jillian M., Towarnicki, Samuel G., Omer, Arina, Khan, Ruqayya, Skvortsova, Ksenia, Bogdanovic, Ozren, Zammit, Robert A., Aiden, Erez Lieberman, Warren, Wesley C., and Ballard, J. William O. (2021) Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome. BMC Genomics, 22. 188.

PDF (Published Version)
Available under License Creative Commons Attribution.

View at Publisher Website: https://doi.org/10.1186/s12864-021-07493...
 


Abstract

Background: Basenjis are considered an ancient dog breed of central African origins that still live and hunt with tribesmen in the African Congo. Nicknamed the barkless dog, Basenjis possess unique phylogeny, geographical origins and traits, making their genome structure of great interest. The increasing number of available canid reference genomes allows us to examine the impact the choice of reference genome makes with regard to reference genome quality and breed relatedness.

Results: Here, we report two high quality de novo Basenji genome assemblies: a female, China (CanFam_Bas), and a male, Wags. We conduct pairwise comparisons and report structural variations between assembled genomes of three dog breeds: Basenji (CanFam_Bas), Boxer (CanFam3.1) and German Shepherd Dog (GSD) (CanFam_GSD). CanFam_Bas is superior to CanFam3.1 in terms of genome contiguity and comparable overall to the high quality CanFam_GSD assembly. By aligning short read data from 58 representative dog breeds to three reference genomes, we demonstrate how the choice of reference genome significantly impacts both read mapping and variant detection.

Conclusions: The growing number of high-quality canid reference genomes means the choice of reference genome is an increasingly critical decision in subsequent canid variant analyses. The basal position of the Basenji makes it suitable for variant analysis for targeted applications of specific dog breeds. However, we believe more comprehensive analyses across the entire family of canids are better suited to a pangenome approach. Collectively, this work highlights the importance of the choice of reference genome in all variation studies.

Item ID: 72406
Item Type: Article (Research - C1)
ISSN: 1471-2164
Keywords: Canine genome; Domestication; Comparative genomics; Artificial selection
Copyright Information: © The Author(s). 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Funders: National Health and Medical Research Council (NHMRC), Australian Research Council (ARC)
Projects and Grants: NHMRC APP5121190, ARC LP160100610, ARC LP180100721
Date Deposited: 16 Feb 2022 02:22
FoR Codes: 31 BIOLOGICAL SCIENCES > 3102 Bioinformatics and computational biology > 310204 Genomics and transcriptomics @ 50%
31 BIOLOGICAL SCIENCES > 3104 Evolutionary biology > 310405 Evolutionary ecology @ 50%
SEO Codes: 20 HEALTH > 2099 Other health > 209999 Other health not elsewhere classified @ 100%
Downloads: Total: 37
Last 12 Months: 15