dc.contributor.author: Mekonnen, Yonatan Ayalew
dc.date.accessioned: 2015-05-28T11:11:04Z
dc.date.available: 2015-05-28T11:11:04Z
dc.date.copyright: 2015
dc.date.issued: 2015-05-28
dc.identifier.uri: http://hdl.handle.net/11250/284181
dc.description.abstract: Genetic association studies are primarily used to identify genes associated with complex disease. They can be conducted by genotyping intentionally selected or randomly chosen markers. Numerous statistical and computational algorithms have been developed to analyze genome-wide association study (GWAS) datasets. These are classified as parametric, non-parametric and Bayesian methods. However, there are methodological and computational challenges related to population stratification and to the vast volume of data generated by chip-based and sequencing-based technologies. The packages SNPRelate and GenABEL are built to overcome this burden. SNPRelate uses parallel computing and loads genotypes block by block to optimize the use of high-speed cache memory. It is designed for principal component analysis (PCA) and identity by descent (IBD) analyses, which are used for correcting population structure. GenABEL, by contrast, incorporates genome-wide rapid association using mixed model and regression (GRAMMAR). It was developed to overcome the limitations of efficiently storing, handling and analyzing GWAS data by integrating a data format called gwaa.data. In order to evaluate and compare these packages, this study obtained PLINK-formatted data from a study of heritable osteosarcoma in dogs. The PLINK data were then converted into a genomic data structure (GDS) file for SNPRelate and a gwaa.data file for GenABEL. Using GenABEL, the data were analyzed both ignoring population structure and taking population structure into account. In SNPRelate, LD-based pruning was performed prior to PCA and IBD calculation. For the three dog breeds, the first and second PCs account for almost 50% of the information. The IBD interpretation of the PCA indicates that Irish wolfhounds are inbred compared to the other two dog breeds. Correcting for population structure with PCA gives more accurate estimates than genomic control or including PCs as predictors. Comparing SNPRelate and GenABEL, the SNPRelate method for PCA calculation is faster and allows larger data sets than GenABEL, which uses EIGENSTRAT for PCA calculation. [nb_NO]
dc.language.iso: eng [nb_NO]
dc.publisher: Norwegian University of Life Sciences, Ås
dc.subject: GenABEL [nb_NO]
dc.subject: GWAS [nb_NO]
dc.subject: IBD [nb_NO]
dc.subject: SNPRelate [nb_NO]
dc.subject: parallel computing [nb_NO]
dc.subject: PCA [nb_NO]
dc.subject: population structure [nb_NO]
dc.title: Evaluation of GWAS Method Performance Focusing on Population Stratification and Cryptic Relatedness [nb_NO]
dc.type: Master thesis [nb_NO]
dc.subject.nsi: VDP::Mathematics and natural science: 400::Basic biosciences: 470::Genetics and genomics: 474 [nb_NO]
dc.source.pagenumber: 35 [nb_NO]
dc.description.localcode: M-BIAS [nb_NO]
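
Note on genomic control: the abstract names genomic control as one of the correction methods it compares. As a rough illustration only, and not the thesis's own code or GenABEL's implementation, the usual idea can be sketched in Python: estimate an inflation factor lambda as the median of the observed 1-df chi-square statistics divided by the theoretical median (about 0.4549), then divide every statistic by lambda. The function name genomic_control and the input values are hypothetical, and flooring lambda at 1 is a common convention assumed here.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (lambda_gc, corrected statistics) for 1-df chi-square statistics.

    Hypothetical helper for illustration; not part of SNPRelate or GenABEL.
    """
    if not chi2_stats:
        raise ValueError("need at least one test statistic")
    # Inflation factor: observed median relative to the expected median.
    lambda_gc = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Assumed convention: never inflate statistics, so the divisor is at least 1.
    scale = max(lambda_gc, 1.0)
    return lambda_gc, [s / scale for s in chi2_stats]


# Made-up test statistics, for illustration only.
stats = [0.2, 0.5, 0.91, 1.3, 6.0]
lam, corrected = genomic_control(stats)
print(f"lambda = {lam:.3f}")
```

A lambda close to 1 suggests little inflation from stratification; values well above 1 suggest that population structure or relatedness is inflating the test statistics.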