Show simple item record

dc.contributor.advisor: Kristian Hovde Liland
dc.contributor.advisor: Lars-Gustav Snipen
dc.contributor.author: Steffenssen, Halvor Hauge
dc.date.accessioned: 2024-08-23T16:28:50Z
dc.date.available: 2024-08-23T16:28:50Z
dc.date.issued: 2024
dc.identifier: no.nmbu:wiseflow:7110333:59110572
dc.identifier.uri: https://hdl.handle.net/11250/3147984
dc.description.abstract: With the amount of genetic data that modern sequencing technology can extract from nature, there is a growing need for tools that help classify and analyze these data. Machine learning algorithms such as Random Forest and Artificial Neural Networks are already in use in this field of bioinformatics. The Tsetlin Machine is a new type of machine learning algorithm that has shown much promise in DNA classification. It creates models using binary representations and logic that are close to how a computer operates. This thesis tests the Tsetlin Machine's ability to classify genetic data. A database with the DNA of 709 species commonly found in deep-sea sediments, selected based on the results of the AQUAeD project, is split into different datasets. The Tsetlin Machine, together with a Random Forest model, a Convolutional Neural Network, and a model that counts the number of GC bases, is given these datasets and tries to classify the sequences at multiple taxonomic ranks. The models are then evaluated on the accuracy of their classification and the speed of training. The results show that the Tsetlin Machine has great promise in this field: it achieved accuracy and speed similar to those of the Random Forest classifier and the Convolutional Neural Network.
dc.language: eng
dc.publisher: Norwegian University of Life Sciences
dc.title: Tsetlin machine for classifying genetic data from sea-floor species
dc.type: Master thesis

