dc.contributor.advisor: Kathrine Frey Frøslie
dc.contributor.advisor: Oliver Tomic
dc.contributor.author: Jentoft, Susannah Lucy
dc.date.accessioned: 2024-02-23T17:27:14Z
dc.date.available: 2024-02-23T17:27:14Z
dc.date.issued: 2023
dc.identifier: no.nmbu:wiseflow:6983034:56771808
dc.identifier.uri: https://hdl.handle.net/11250/3119743
dc.description.abstract: The data underlying any machine learning model is prone to change over time, a process called model drift. The extent of this change and its effect on model performance should be monitored in production settings to avoid declining predictive performance. This thesis explores model drift in a small case study of occupation classification based on text variables from the Norwegian Labour Force Survey. Kolmogorov-Smirnov drift detection is tested, together with a novel multivariate approach using the RV coefficient, to explore local changes within occupation classes. Drift mitigation is explored using four adaptive methods: fixed windows, weighting, Hoeffding adaptive trees and a new targeted matching approach for creating training data. Feature drift was detected using both descriptive and statistical methods for one of the groups explored: a model covering occupations of very different natures. Using RV values, drift was visualized and observed within classes for several of the occupations investigated. Slight decreases in model performance were observed when models were trained on a fixed, early period. Specific adaptive methods for learning under drift did not perform better than a generic approach using all data. However, within classes where gradual drift was visible, an adaptive weighting algorithm performed best. In the occupation class that showed a recurrent drift pattern, the novel targeted matching algorithm performed slightly better than the other methods. Further investigation of how these methods perform on larger classification models is recommended in order to generalize these findings.
dc.language: eng
dc.publisher: Norwegian University of Life Sciences
dc.title: Navigating Model Drift: A Case Study on Classifying Occupations Using Textual Data
dc.type: Master thesis