|
SAS® partial least squares for discriminant analysis
James B. Reeves, IIIᵃ and Stephen R. Delwicheᵇ
ᵃ USDA, ARS, Environmental Management and Byproduct Utilization Laboratory, Bldg 306, BARC East, Beltsville, MD 20705, USA. E-mail: james.reeves@ars.usda.gov
ᵇ USDA, ARS, Food Safety Laboratory, Bldg 303, BARC East, Beltsville, MD 20705, USA
ABSTRACT:
The objective of this work was to implement discriminant analysis using SAS® partial least squares (PLS) regression for analysis of
spectral data. This was done in combination with previous efforts, which implemented data pre-treatments including scatter correction, derivatives, mean centring and variance
scaling for spectral analysis. Partial least squares analysis is implemented in SAS® as type 2 (PLS2), in which a solution for multiple analytes (Y-variables) is determined
simultaneously, but it cannot work with non-numeric analyte values. For discriminant analysis, each sample belonging to one of Z classes is therefore coded with Z analytes: the analyte for
the class to which the sample belongs is coded as 1 and all others are coded as 0. Thus, for four classes, every sample is coded with one of four analyte combinations (1,0,0,0; 0,1,0,0; 0,0,1,0; or
0,0,0,1). This paper discusses a SAS® program designed to perform classification/discriminant analysis using SAS® PLS and, to a lesser extent, principal
component analysis and reduced rank regression. The authors’ previously written SAS® macros for pre-treatment of spectral data are implemented. Examples are
presented using two datasets: forages and by-products, and grains. The program allows multiple spectral pre-treatments to be tested in a single step, with a summary of all
results. The macro coding for the program and test data sets is available at: http://www.impublications.com/nir/page/software. Please note that the program will not work
properly on Unix-based systems due to DOS calls.
Keywords: PLS, partial least squares, principal components, PCA, SAS®, discriminant analysis
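The 0/1 class coding described in the abstract can be sketched in a few lines of Python. Python is used here only for illustration; the authors' program is written as SAS® macros. The function names and class labels below are hypothetical. The rule that assigns a sample to the class with the largest predicted value is a common convention assumed here, not something the abstract states.

```python
def encode_classes(labels, classes):
    """Code each sample as Z numeric analytes: 1 for its own class, 0 elsewhere."""
    index = {name: i for i, name in enumerate(classes)}
    rows = []
    for label in labels:
        row = [0] * len(classes)
        row[index[label]] = 1
        rows.append(row)
    return rows


def decode_prediction(predicted, classes):
    """Assumed decision rule: pick the class whose predicted value is largest."""
    best = max(range(len(classes)), key=lambda i: predicted[i])
    return classes[best]


# Hypothetical class names for a four-class problem.
classes = ["A", "B", "C", "D"]
labels = ["B", "D", "A"]

y = encode_classes(labels, classes)
print(y)  # [[0, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]]

# Hypothetical predicted Y-values for one sample from a multi-response model.
print(decode_prediction([0.1, 0.7, 0.3, -0.1], classes))  # B
```

The resulting indicator columns are purely numeric, which is what allows them to serve as the Y-variables of a multi-response PLS model.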
|