Show simple item record

dc.contributor.advisor: Habib Ullah
dc.contributor.advisor: Fadi Al Machot
dc.contributor.author: Sivathas, Sathuriyan
dc.date.accessioned: 2024-08-23T16:28:31Z
dc.date.available: 2024-08-23T16:28:31Z
dc.date.issued: 2024
dc.identifier: no.nmbu:wiseflow:7110333:59110598
dc.identifier.uri: https://hdl.handle.net/11250/3147971
dc.description.abstract: Research on deep learning models is advancing rapidly and has emerged as an important field of study. Deep learning models play a vital role in the development of self-driving cars, making it essential to understand their performance thoroughly. This thesis evaluates the performance of selected deep learning models on several datasets of traffic objects and participants. The models evaluated are VGG-16, ResNet-50, WideResNet-50-2, EfficientNetB0 and Vision Transformer, all explored in a transfer-learning setting. These models were further trained on three datasets built from images in the Audi Autonomous Driving Dataset (A2D2), prepared specifically for classifying traffic participants and objects. The first dataset, referred to as the NAI dataset, comprises normal and augmented images. The second, referred to as the NANI dataset, comprises normal, augmented and noisy mixed-class images. The third, referred to as the NASI dataset, comprises normal, augmented and synthetic images generated by a Deep Convolutional Generative Adversarial Network (DCGAN). All models were trained on the NAI and NANI datasets, while only VGG-16 and Vision Transformer were trained on the NASI dataset. Through these datasets, the impact of normal and augmented images, of noisy mixed-class images, and of DCGAN-generated images is evaluated. The VGG-16 model outperformed all others and achieved consistent performance across all three datasets. The WideResNet-50-2 model also performed well on the two datasets it was evaluated on, but did not match the performance of VGG-16. The Vision Transformer likewise showed promising results across all datasets, particularly in terms of consistency and stability.
These results indicate that deep learning models can perform effectively and consistently across diverse datasets of traffic participants and objects, whether the images are normal, augmented, noisy, or synthetic.
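The transfer-learning setup described above (a pretrained backbone with a new classification head trained on the A2D2-derived datasets) can be sketched as follows. This is a minimal illustration assuming a PyTorch-style workflow; the tiny stand-in CNN below is hypothetical and merely takes the place of the actual pretrained feature extractor (e.g. torchvision's VGG-16), and `num_classes=10` is an assumed placeholder, not the thesis's class count.

```python
import torch
import torch.nn as nn

def build_classifier(backbone: nn.Module, feature_dim: int, num_classes: int) -> nn.Module:
    """Freeze a (pre-trained) feature extractor and attach a fresh classification head."""
    for p in backbone.parameters():
        p.requires_grad = False  # keep pretrained weights fixed; only the head is trained
    head = nn.Linear(feature_dim, num_classes)
    return nn.Sequential(backbone, nn.Flatten(), head)

# Hypothetical stand-in for a pretrained backbone such as VGG-16's feature layers.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),  # pool to one value per channel -> feature_dim = 8
)
model = build_classifier(backbone, feature_dim=8, num_classes=10)
out = model(torch.randn(2, 3, 32, 32))  # a dummy batch of two RGB images
print(out.shape)  # torch.Size([2, 10])
```

Only the head's weight and bias are trainable, which is what makes this transfer learning rather than training from scratch.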
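For the NASI dataset, the abstract describes synthetic images produced by a DCGAN. A minimal sketch of a DCGAN-style generator is shown below, assuming a PyTorch implementation; the layer sizes, latent dimension, and 16x16 output resolution are illustrative assumptions, not the configuration used in the thesis.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN-style generator: upsamples a latent vector into an RGB image
    via transposed convolutions (illustrative sizes, not the thesis's)."""
    def __init__(self, z_dim: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 64, 4, stride=1, padding=0),  # 1x1 -> 4x4
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),     # 4x4 -> 8x8
            nn.BatchNorm2d(32), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),      # 8x8 -> 16x16
            nn.Tanh(),  # image values in [-1, 1], the usual DCGAN convention
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

g = Generator()
imgs = g(torch.randn(2, 100, 1, 1))  # two latent vectors -> two synthetic images
print(imgs.shape)  # torch.Size([2, 3, 16, 16])
```

In a full DCGAN these generated images would be pitted against a convolutional discriminator during training; here the generator alone illustrates where the NASI dataset's synthetic images come from.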
dc.language: eng
dc.publisher: Norwegian University of Life Sciences
dc.title: Deep Learning based Models for Traffic Participant and Object Classification
dc.type: Master thesis

