Truncation-PLS for Variable Selection : a simulation study

Yao, Ying
Master thesis
Master thesis (2.717Mb)
URI
http://hdl.handle.net/11250/216776
Date
2014-08-06
Collections
  • Master's theses (KBM) [742]
Abstract
Partial least squares (PLS) is a class of statistical methods for multivariate data analysis. The PLS regression (PLSR) algorithm performs regression, dimension reduction and analysis of the correlations among variables simultaneously. Over the past 20 years, as high-dimensional data have become common, PLS has been refined and applied in many fields.

In this research, a variable-selection procedure derived from Lenth's method was embedded into PLSR. This algorithm, known as Truncation PLS, was tested on several simulated datasets with different parameter designs. The R package relsim was used to simulate datasets with different properties. Another well-known wrapper method, Jackknife PLS, was applied to the same datasets as a reference. The purpose of this research is to evaluate these two methods and to explore how the properties of a dataset affect the performance of each method.
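
To illustrate the idea of truncating by Lenth's method, the following is a minimal sketch and not the thesis's implementation. It computes Lenth's pseudo standard error (PSE) for a vector of weights and sets the weights that fall within a cutoff of zero to exactly zero. The function names, the example weights and the `multiplier` parameter are assumptions made for illustration. Lenth's original procedure takes its cutoff from a t-quantile, which is replaced here by a fixed multiplier to keep the example dependency-free.

```python
from statistics import median


def lenth_pse(weights):
    """Lenth's pseudo standard error (PSE) of a list of effects."""
    abs_w = [abs(w) for w in weights]
    s0 = 1.5 * median(abs_w)
    # Re-estimate the scale using only the effects that look like noise.
    trimmed = [a for a in abs_w if a < 2.5 * s0]
    if not trimmed:  # degenerate case, e.g. all weights are zero
        return s0
    return 1.5 * median(trimmed)


def truncate_weights(weights, multiplier=2.0):
    """Set weights within multiplier * PSE of zero to exactly zero.

    `multiplier` is a hypothetical stand-in for the t-quantile that
    Lenth's method would normally use.
    """
    cutoff = multiplier * lenth_pse(weights)
    return [w if abs(w) > cutoff else 0.0 for w in weights]


# Made-up loading weights: six small (noise-like) and two large ones.
example = [0.01, -0.02, 0.03, -0.01, 0.02, 0.9, -0.8, 0.015]
truncated = truncate_weights(example)
selected = [j for j, w in enumerate(truncated) if w != 0.0]
```

In this sketch the variables whose weights survive truncation (`selected`) would be the ones treated as relevant; how the truncated weights are fed back into the PLSR steps is not specified in the abstract and is left out.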

After the two PLS methods had been applied to the datasets, the root mean squared error of prediction (RMSEP) for every parameter setting was obtained through cross-validation. RMSEP measures the predictive ability of a model. In addition, the variables selected by each method were compared with the variables known in advance to be relevant, and the resulting variable-selection accuracy was used to evaluate each method's ability to select variables.
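
The two evaluation measures can be sketched as follows. This is a minimal illustration under stated assumptions: RMSEP is computed from held-out predictions (for example, those collected across cross-validation folds), and accuracy is taken to be the proportion of variables correctly classified as relevant or irrelevant. The abstract does not give the exact accuracy definition used in the thesis, so that definition and all function names here are assumptions.

```python
from math import sqrt


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction over held-out samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(y_true))


def selection_accuracy(selected, relevant, n_variables):
    """Share of variables classified correctly (assumed definition).

    A variable counts as correct if it is relevant and selected,
    or irrelevant and not selected.
    """
    selected, relevant = set(selected), set(relevant)
    true_pos = len(selected & relevant)
    true_neg = n_variables - len(selected | relevant)
    return (true_pos + true_neg) / n_variables


# Made-up numbers: three held-out predictions, ten candidate variables.
error = rmsep([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
accuracy = selection_accuracy({0, 1, 2}, {0, 1, 3}, n_variables=10)
```

With simulated data the set of relevant variables is known by construction, which is what makes an accuracy of this kind computable.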

Both methods performed well and produced satisfactory RMSEP and accuracy values. However, Truncation PLS coped better with datasets that have high multicollinearity among the X-variables and a smaller variance in the relevant component. In addition, Truncation PLS is more efficient than Jackknife PLS in terms of computation and time consumption.
