Assessing the Effects of the New Atlantic Salmon (Salmo salar) Genome Assembly on Imputation Accuracy
Master thesis
Date
2021-06-01
Abstract
The Atlantic salmon is one of the most economically important species in modern aquaculture. Considerable effort has therefore gone into implementing and improving breeding programs for this species, achieving substantial genetic progress in a short period of time. Improvements in sequencing technologies have facilitated the use of genomic selection, which integrates molecular genetic information and increases selection response for key production traits with polygenic architecture. However, implementing genomic selection requires large, densely genotyped populations, which can prove challenging, especially in aquatic populations. Genotype imputation therefore constitutes a cost-efficient method for increasing the genotype density of large populations, allowing them to be genotyped on low-density, low-cost platforms. Although leading sequencing and bioinformatic methods were applied to build the first Atlantic salmon reference genome assembly, the high genomic complexity of the species severely impacted the quality of that assembly. Assembly errors are expected to primarily affect genotyping quality and, consequently, all downstream analyses. The recent release of a new genome assembly for Atlantic salmon (NCBI GenBank reference: GCA_905237065.2), constructed using long-read sequencing technologies, is expected to improve our understanding of salmon genetics and genomics and to contribute to the application of higher-quality genomic data in salmon breeding. In this study, we explored the improvements achieved in the new genome assembly as reflected in a genotype imputation analysis using a small sample of immediate relatives. We report large structural changes in the new genome assembly and discuss their impact on imputation accuracy as well as on currently available genotyping platforms.
We also offer considerations regarding local heterogeneity in imputation accuracy in relation to the salmon's high genomic complexity and the occurrence of structural variants. Finally, we discuss possible strengths and weaknesses of different imputation approaches given the limitations of our experimental sample.