Variable Selection
Course Description
Variable Selection addresses one of the most difficult problems in chemometrics: selecting variables for regression and classification. In many model-building situations, variable selection is useful for improving predictions, for minimizing the number of variables, and for other purposes such as reducing costs. But how should it be done? Genetic algorithms, forward selection, and jack-knifing are just a few of the possible approaches. This short course outlines the different approaches and presents the theory behind when to use each. Examples and exercises show how some approaches work well in some situations but not in others. The course includes hands-on computer time for participants to work example problems using PLS_Toolbox or Solo.
Prerequisites
MATLAB for Chemometricians and Chemometrics II--Regression and PLS, or equivalent experience.
Course Outline
- Motivation and Preliminary Examples: Why select variables?
- Available Variable Selection Methods:
  - A priori
  - A posteriori
  - Model based, e.g. on loadings
  - Genetic algorithms
  - Classical
  - Forward, backward selection
  - Best subset selection
  - Significance tests
  - Significance based on Jack-knife
- How to choose a variable selection method
- Variable selection in practice
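
As an illustration of one of the classical methods listed in the outline, forward selection can be sketched as follows. This is a minimal sketch and not course material: it assumes Python with NumPy rather than PLS_Toolbox or Solo, it uses ordinary least squares with cross-validated RMSE as the selection criterion (one possible choice among many), and the function names `cv_rmse` and `forward_selection` as well as the synthetic data are hypothetical.

```python
import numpy as np


def cv_rmse(X, y, n_folds=5):
    """Cross-validated RMSE of an ordinary least-squares fit with intercept."""
    n = X.shape[0]
    folds = np.arange(n) % n_folds  # deterministic fold assignment
    sq_err = 0.0
    for k in range(n_folds):
        test = folds == k
        train = ~test
        A_train = np.column_stack([np.ones(train.sum()), X[train]])
        A_test = np.column_stack([np.ones(test.sum()), X[test]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        sq_err += np.sum((y[test] - A_test @ coef) ** 2)
    return np.sqrt(sq_err / n)


def forward_selection(X, y, max_vars=None, n_folds=5):
    """Greedily add the variable that most reduces the cross-validated RMSE.

    Stops when no remaining variable improves the error or when
    max_vars variables have been selected.
    """
    n_vars = X.shape[1]
    max_vars = n_vars if max_vars is None else max_vars
    selected = []
    best_err = np.inf
    while len(selected) < max_vars:
        candidates = [j for j in range(n_vars) if j not in selected]
        errs = [cv_rmse(X[:, selected + [j]], y, n_folds) for j in candidates]
        i_best = int(np.argmin(errs))
        if errs[i_best] >= best_err:
            break  # no candidate improves the model
        best_err = errs[i_best]
        selected.append(candidates[i_best])
    return selected, best_err


# Synthetic example (assumed data): only columns 1 and 5 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = 3.0 * X[:, 1] - 2.0 * X[:, 5] + 0.1 * rng.normal(size=60)

selected, err = forward_selection(X, y)
print("selected columns:", selected)
print("cross-validated RMSE:", round(float(err), 3))
```

Because the search is greedy, it can miss combinations of variables that are only useful together, which is one reason an approach may work well in some situations and not in others.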