Genomics exploring population structure and sex determination in Atlantic cod (Gadus morhua)
Abstract
Atlantic cod (Gadus morhua) is a benthopelagic cold-water marine species found across the North Atlantic Ocean. Owing to its historical and current importance as a food resource, it is one of the best-known and most thoroughly studied teleost fish species. Genetic studies of Atlantic cod have been conducted since the 1930s, but the advent of high-throughput DNA sequencing technologies has reinvigorated these investigations, providing high-resolution data that allow us to address previously unanswerable questions. This thesis explores the genome biology of Atlantic cod with three specific goals: i) understanding the genomics underlying population divergence between cod populations in Norwegian waters, ii) using long-read sequencing technology to build a genome assembly representing the southern cod populations, and iii) characterizing genomic regions associated with sex determination. The knowledge generated in this thesis can enhance the understanding of how genomic architecture influences the genome-wide variation that contributes to population structure and sex determination.
Atlantic cod in the Norwegian Sea are classified into one of two populations: migratory Northeast Arctic cod (NEAC) and stationary Norwegian coastal cod (NCC). In Paper I, we sought to understand how phenotypic and genetic differences are maintained despite interbreeding between NEAC and NCC. Using genotype data from 192 parents of farmed families of NEAC, NCC or NEACxNCC crosses, we identified extended linkage disequilibrium (LD) in a 17.4 Mb region on linkage group 1 (LG01). Furthermore, linkage analysis revealed two adjacent inversions within the region that repress meiotic recombination in NEACxNCC crosses. The haplotype block harbours 763 genes, including candidates regulating swim bladder pressure, heme synthesis and skeletal muscle organization, conferring NEAC adaptation to long-distance migration and vertical movements to large depths.
Our results document that inversion is the genetic mechanism that maintains genetic differentiation despite interbreeding, and we hypothesize that the co-occurrence of possibly multiple adaptive genes forms a ‘supergene’ advantageous to NEAC.
The public reference genomes for Atlantic cod have all been derived from NEAC samples and therefore represent the northernmost cod population, which is adapted to near-freezing temperatures. Several studies have demonstrated regions of genomic differentiation on linkage groups (LGs) 1, 2, 7 and 12 associated with adaptation to temperature along the north-south gradient (Bradbury et al. 2010; 2013). In Paper II, we generated a highly contiguous genome assembly representing the southern Celtic population of Atlantic cod using long-read nanopore sequencing data. By comparing this to the latest NEAC assembly, gadMor3, we were able to characterize in detail the rearrangements creating the ‘islands of genomic divergence’ on LGs 1, 2, 7 and 12. The long contiguous genome assembly also facilitated the identification of a putative centromere-specific repeat.
In Paper III, by comparing whole-genome short-read sequencing data from 49 male and 53 female cod, we detected a male-specific region of 9,149 bp on LG11. A diagnostic PCR test was developed and confirmed the sex-specific nature of this male