Show simple item record

dc.contributor.advisor: Tomic, Oliver
dc.contributor.advisor: Liland, Kristian Hovde
dc.contributor.author: Azadi, Ghazal Gazelle
dc.date.accessioned: 2021-08-26T10:26:11Z
dc.date.available: 2021-08-26T10:26:11Z
dc.date.issued: 2021
dc.identifier.uri: https://hdl.handle.net/11250/2771366
dc.description.abstract: Gastrointestinal neuroendocrine tumors (NETs) are slow-growing tumors, and survival is an important factor in this type of cancer. The current study takes the number of survival days as the target variable and tries to identify the features that affect it. The dataset was first preprocessed for use with machine learning algorithms. The Repeated Elastic Net Technique (RENT) was then used to select relatively important features, turning a relatively wide dataset with many features and few samples into a more stable one. Because we wanted to select features based on a model that was reasonably reliable in terms of prediction error (RMSEP) and R^2, we examined three complementary approaches. In the first approach, we used the full dataset without any missing items; the RENT models selected features with an average R^2 of -47% and -40% for the first and second block, respectively. In the second approach, we included two more features, which contain 9 missing items and therefore cost the dataset 9 samples; this change improved the RENT models' R^2 values to 20% and -36%. In the last approach, we excluded some samples that caused too much noise, removed, after consulting with experts, some features already known to be unimportant, and applied a Box-Cox transformation to the target, giving a normalized response vector with a symmetric distribution. This approach achieved average R^2 values of 34% and 21% for the first and second block, respectively. In the final step, the multiblock method ROSA (Response Oriented Sequential Alternation) was applied to the dataset obtained from the previous steps. ROSA gave an acceptable R^2 of 74% on the cross-validated data and also helped us rank the features by importance. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Norwegian University of Life Sciences, Ås [en_US]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/deed.no [*]
dc.subject: Box-Cox [en_US]
dc.subject: Cross validated data [en_US]
dc.subject: Repeated Elastic Net Technique (RENT) [en_US]
dc.subject: Response Oriented Sequential Alternation (ROSA) [en_US]
dc.title: Multi block analysis of gastrointestinal neuroendocrine tumors data using response oriented sequential alternation (ROSA) [en_US]
dc.type: Master thesis [en_US]
dc.subject.nsi: VDP::Mathematics and natural science: 400 [en_US]
dc.description.localcode: M-DV [en_US]
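
The abstract reports negative R^2 values, an RMSEP error measure, and a Box-Cox transformation of the target. As a reading aid, the sketch below shows how these quantities are commonly defined. It is not the thesis code: the function names and the sample numbers are hypothetical, and the Box-Cox parameter is passed in by hand here, whereas in practice it is usually estimated from the data.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform for a fixed lambda (assumed given, not estimated here)."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lambda = 0 case is defined as the natural logarithm.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot; negative when worse than predicting the mean."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical survival times in days, purely illustrative (not thesis data).
survival_days = [30.0, 120.0, 365.0, 900.0, 2500.0]
log_days = box_cox(survival_days, 0.0)

# Predictions that run opposite to the truth give a negative R^2.
print(r_squared([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))  # -3.0
# Always predicting the mean gives exactly 0.
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # 0.0
```

Read this way, an R^2 of -47% means the model predicted worse than a constant equal to the mean of the target, which is why the later approaches in the abstract count as improvements.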


Associated file(s)

This item appears in the following collection(s)

Unless otherwise stated, this item is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International