Preprocessing NIR spectra


David W. Hopkins (dhopkins)
Senior Member
Username: dhopkins

Post Number: 163
Registered: 10-2002
Posted on Friday, September 03, 2010 - 11:55 am:   

Hi Brendon,

Welcome to the discussion group!

The use of pretreatments should never become a default; they need to be tailored to the particular measurement of interest.

I think there are 2 main reasons to use a 2Der in preference to 1Der. 1) The 2Der is easier for me to interpret, because there is a (negative) peak where the original spectrum has a (positive) peak. This facilitates spectral interpretation. The 1Der goes through a reverse S shape that crosses the zero line at the position of the peak in the parent spectrum, and one can train oneself to read such spectra, it just takes practice. 2) The 2Der is able to compensate for sloping baselines better than the 1Der.
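To make the shapes described above concrete, here is a minimal sketch on a synthetic single band. The band position, width, 2 nm sampling interval and the 11-point quadratic Savitzky-Golay window are all assumptions chosen for illustration, not recommended settings.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic single band at 1450 nm on a 2 nm grid (all values are illustrative).
wavelengths = np.arange(1100.0, 1800.0, 2.0)
spectrum = np.exp(-0.5 * ((wavelengths - 1450.0) / 30.0) ** 2)

# Savitzky-Golay derivatives; the 11-point quadratic window is an arbitrary choice.
d1 = savgol_filter(spectrum, window_length=11, polyorder=2, deriv=1, delta=2.0)
d2 = savgol_filter(spectrum, window_length=11, polyorder=2, deriv=2, delta=2.0)

i = int(np.argmax(spectrum))  # index of the band maximum
print("band maximum at", wavelengths[i], "nm")
print("2Der minimum at", wavelengths[np.argmin(d2)], "nm")
print("1Der just below / at / just above the maximum:", d1[i - 1], d1[i], d1[i + 1])
```

In this sketch the 2Der has its (negative) minimum at the band position, while the 1Der is positive just below the band maximum, essentially zero at it, and negative just above it.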

Reason (1) holds little water for me, because the main objective of using pretreatments at all is to produce calibrations that obtain optimal SEP(s) on test set(s), with models that employ as few factors as possible. Fewer factors generally translate to calibrations that will be more 'robust' to future sample set variability. The quotes indicate that there is still considerable debate about defining the term robust, and about recognizing when you have a robust calibration.

Reason (2) is critical. In situations where the predominant spectral variation is a baseline offset, the use of 1Der is ideal. If the spectra indicate a considerable tilting effect, this suggests the involvement of a multiplicative effect, and the 2Der may offer some advantage. In that case, I usually find it useful to follow the 2Der with MSC or SNV pretreatment. There is considerable discussion of MSC and SNV on this site; you may wish to search for these terms.
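A rough sketch of these effects on synthetic data follows. The helper names `snv`, `msc` and `sg`, the band shape and the window settings are assumptions made for illustration; the SNV and MSC functions are the textbook forms (row-wise centering and scaling, and regression against the mean spectrum), not the implementation of any particular package.

```python
import numpy as np
from scipy.signal import savgol_filter

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for k, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)  # fit row = intercept + slope * ref
        corrected[k] = (row - intercept) / slope
    return corrected

def sg(y, deriv, delta):
    # 11-point quadratic Savitzky-Golay window; an arbitrary illustrative choice.
    return savgol_filter(y, window_length=11, polyorder=2, deriv=deriv, delta=delta)

x = np.linspace(0.0, 1.0, 200)
dx = x[1] - x[0]
base = np.exp(-0.5 * ((x - 0.5) / 0.08) ** 2)

offset_only = base + 0.3        # constant baseline offset
tilted = base + 0.3 + 0.5 * x   # offset plus linear slope
scaled = 1.4 * base             # purely multiplicative effect

# 1Der removes the constant offset; the linear slope needs the 2Der.
print(np.allclose(sg(offset_only, 1, dx), sg(base, 1, dx)))  # offset gone after 1Der
print(np.allclose(sg(tilted, 1, dx), sg(base, 1, dx)))       # slope survives 1Der
print(np.allclose(sg(tilted, 2, dx), sg(base, 2, dx)))       # slope gone after 2Der

# A multiplicative factor survives the derivative; SNV or MSC afterwards removes it.
d2_pair = np.vstack([sg(base, 2, dx), sg(scaled, 2, dx)])
snv_pair = snv(d2_pair)
msc_pair = msc(d2_pair)
print(np.allclose(snv_pair[0], snv_pair[1]), np.allclose(msc_pair[0], msc_pair[1]))
```

Real scatter effects are rarely this clean, so the sketch only shows which kind of variation each step can remove in principle.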

So, you hit the nail on the head: the prime factor in your choice of 1Der or 2Der needs to be the performance of the models on test sets. 1Der has a better chance of success with noisy data, because noise has a greater influence in the 2Der. With high quality spectra, you will often get good success with 2Der, and you may be able to interpret your loadings and B-vectors. Interpretation is more difficult as you move from smoothing convolutions to 1Der to 2Der convolutions, in my experience.
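The point about noise can be checked with a small simulation. This is only a sketch under assumed conditions: a single Gaussian band 10 points wide, white noise, and the same 11-point quadratic Savitzky-Golay window for smoothing, 1Der and 2Der. The signal-to-noise figure used here (largest feature divided by the noise standard deviation after the same transform) is one convenient definition, not the only one.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
x = np.arange(400)
band = np.exp(-0.5 * ((x - 200) / 10.0) ** 2)         # noise-free band
noise = rng.normal(scale=0.01, size=(500, x.size))     # 500 pure-noise "scans"

def sg(y, deriv):
    # Same window for all three transforms so only the derivative order changes.
    return savgol_filter(y, window_length=11, polyorder=2, deriv=deriv, axis=-1)

snr = {}
for deriv in (0, 1, 2):
    signal = np.abs(sg(band, deriv)).max()             # largest feature after the transform
    noise_sd = sg(noise, deriv)[:, 50:350].std()       # noise after the transform, edges excluded
    snr[deriv] = signal / noise_sd

for deriv, value in snr.items():
    print(f"deriv={deriv}: signal-to-noise about {value:.1f}")
```

With these assumptions the signal-to-noise ratio drops at each step from smoothing to 1Der to 2Der. Wider windows or narrower bands change the numbers, which is one reason the window has to be tuned to the data.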

Yes, variations in particle size correlate strongly with the slope and offset of spectra. That has recently been the subject of an interesting thread on the effect of particles in the coagulation of milk in cheese manufacturing. It is an important issue in understanding the spectra of powders and tablets in the pharmaceutical industry.

I am a firm believer in pragmatism. You should use the derivatives best suited to your application, and you need to optimize the derivatives to your particular samples and the wavelength sampling interval of your spectrometer.
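One way to read the remark about the sampling interval: a derivative window defined as a fixed number of points covers a different wavelength span on different instruments. The sketch below assumes a hypothetical helper, `sg_derivative`, that takes the window in nm and passes the sampling interval to SciPy's `savgol_filter` through its `delta` argument. The band and the 20 nm window are illustrative values only.

```python
import numpy as np
from scipy.signal import savgol_filter

def sg_derivative(spectrum, interval_nm, window_nm, deriv=2, polyorder=2):
    """Savitzky-Golay derivative with the window given in nm instead of points."""
    half = int(round(window_nm / (2.0 * interval_nm)))
    points = 2 * half + 1                      # always an odd number of points
    if points <= polyorder:
        raise ValueError("window too narrow for this polynomial order")
    return savgol_filter(spectrum, points, polyorder, deriv=deriv, delta=interval_nm)

def band(interval_nm):
    # The same synthetic band sampled at different intervals.
    wl = np.arange(1100.0, 1800.0, interval_nm)
    return np.exp(-0.5 * ((wl - 1450.0) / 30.0) ** 2)

fine, coarse = band(0.5), band(2.0)

# Same window in nm (41 points at 0.5 nm, 11 points at 2 nm): comparable 2Der depth.
same_nm_fine = sg_derivative(fine, 0.5, window_nm=20.0).min()
same_nm_coarse = sg_derivative(coarse, 2.0, window_nm=20.0).min()

# Same window in points, no interval given: the results are not comparable.
same_pts_fine = savgol_filter(fine, 11, 2, deriv=2).min()
same_pts_coarse = savgol_filter(coarse, 11, 2, deriv=2).min()

print(same_nm_fine, same_nm_coarse)
print(same_pts_fine, same_pts_coarse)
```

The best window still has to be found by testing on your own samples. The sketch only shows why settings copied from one spectrometer may not transfer to another with a different sampling interval.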

Best wishes,
Dave

Brendon Lyons (b_lyons)
New member
Username: b_lyons

Post Number: 1
Registered: 8-2010
Posted on Friday, September 03, 2010 - 11:05 am:   

Hello all,

Do NIR spectroscopists prefer the second derivative transform over other options (SNV, MSC, etc.)? I'm relatively new to this field (~8 mos) and after reading multiple journal articles and posts on this forum, I get the impression that 2nd der. is the default option for NIR data processing. Now, I understand we should always keep our options open, but why would someone prefer to use second derivative to process NIR spectra?

Some background: I have been working with PLS models based on diffuse reflectance NIR spectra of pharmaceutical excipient powders. I found that first derivative vastly improved my RMSEP based on predictions on an independent test set. That's a pragmatic solution, but in principle are there any benefits to be gained from using a second derivative?

Are the constant and linear offset of NIR spectra related to physical properties of the probed material (i.e. particle size, density, etc.)? If so, is the relationship direct or complicated?

Thanks,
Brendon

David W. Hopkins (dhopkins)
Senior Member
Username: dhopkins

Post Number: 157
Registered: 10-2002
Posted on Tuesday, July 20, 2010 - 12:30 pm:   

Hi all,

I am pleased to be corrected. I thought the "4T's" did discuss derivatives, but when I looked in the index, I couldn't find derivatives. So I looked on my bookshelf for another reference.

My apologies, Tony.

Best regards,
Dave

Tony Davies (td)
Moderator
Username: td

Post Number: 237
Registered: 1-2001
Posted on Tuesday, July 20, 2010 - 3:25 am:   

Just to put the record straight.

Near-Infrared Spectroscopy by Siesler, Ozaki, Kawata & Heise is an excellent book; it attempts to give a complete introduction to all aspects of NIR spectroscopy and applications. Smoothing and derivatives are described in four pages.

Multivariate Calibration and Classification describes derivatives and smoothing on pages 107-114. It also covers Savitzky-Golay, MSC, PMSC, PLC-MC, OSC, OS, and SNV methods. Do you need to know what all these abbreviations stand for? Not if you are just starting out in your discovery of the wonders of NIR spectroscopy!

Tony

David W. Hopkins (dhopkins)
Senior Member
Username: dhopkins

Post Number: 156
Registered: 10-2002
Posted on Monday, July 19, 2010 - 10:55 pm:   

Hi Pati,

In addition to the book Tony recommended, I would recommend the book Near-Infrared Spectroscopy by Siesler, Ozaki, Kawata & Heise. They discuss smoothing and derivatives, which is not presented in Tony's book.

If you send me your email address, I would be glad to send you some articles I wrote that go into greater detail on Savitzky-Golay smoothing and derivatives.

Best wishes,
Dave

Tony Davies (td)
Moderator
Username: td

Post Number: 236
Registered: 1-2001
Posted on Monday, July 19, 2010 - 5:11 pm:   

Hello Pati,

Welcome to the group!

As this is an NIR Publications website, it seems reasonable to recommend one of their publications (I suppose I should admit to being one of the authors!).
Click on the NIR Publications logo to your left, then go to "Book Shop", then select NIR: the first book is Multivariate Calibration and Classification. That's the one you need. It does cover a lot more.

Happy reading!

Tony

patrycja (pati)
New member
Username: pati

Post Number: 1
Registered: 7-2009
Posted on Monday, July 19, 2010 - 12:00 pm:   

Could you please recommend a book on NIR spectra preprocessing?


Thanks,
Patrycja

Klaas Faber (faber)
Senior Member
Username: faber

Post Number: 28
Registered: 9-2003
Posted on Monday, March 10, 2008 - 9:32 am:   

Hi Christian,

Autoscaling is seldom indicated for spectral data. If nevertheless you see little difference, that just means that the noise in the spectra has little impact on the model results (parameters, predictions).

For a detailed discussion of the various contributions to PLS model results, see, e.g.:

R. Wolthuis, G.C.H. Tjiang, G.J. Puppels, T.C. Bakker Schut
Estimating the influence of experimental parameters on the prediction error of PLS calibration models based on Raman spectra
Journal of Raman Spectroscopy, 37 (2006) 447-466

For a good article that shows how to estimate spectral noise and use that knowledge to improve on PLS, which is merely a black box in that respect, see:

N.P. Bhatt, A. Mitna, S. Narasimhan
Multivariate calibration of non-replicated measurements for heteroscedastic errors
Chemometrics and Intelligent Laboratory Systems 85 (2007) 70-81

Just for perspective: there is really much more to it than pulling x,y-data through PLS without any consideration of input noise.

Best regards,

Klaas Faber

Scott Ramos (lsramos)
New member
Username: lsramos

Post Number: 2
Registered: 1-2007
Posted on Wednesday, February 06, 2008 - 5:30 pm:   

Christian,

You need to be careful when applying a preprocessing method like autoscaling (dividing by stdev) because regions of low intensity are magnified to the same magnitude as those of high intensity. This is particularly of concern in spectroscopy and chromatography where there are likely regions of baseline (you don't want to magnify the baseline!). After a 2nd derivative, it can be even worse.

You should look at the preprocessed data (the capability should be available in most chemometrics packages). It is very instructive. Compare the profiles after mean centering and after autoscaling (with or without the derivatives). You will see what I mean.
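If your package does not make this easy, the comparison can be sketched with NumPy on synthetic data. Everything here is an assumption for illustration: 50 hypothetical samples, one band whose height varies from sample to sample, a flat baseline region with only noise, and column-wise autoscaling (mean centering followed by division by the standard deviation).

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 300)
band = np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)

# 50 hypothetical samples: band height varies between samples, plus small noise.
conc = rng.uniform(0.5, 1.5, size=50)
X = conc[:, None] * band[None, :] + rng.normal(scale=0.001, size=(50, x.size))

centered = X - X.mean(axis=0)                    # mean centering only
autoscaled = centered / X.std(axis=0, ddof=1)    # mean centering, then 1/SDev

baseline = x < 0.2                # region with no absorption, only noise
peak = np.abs(x - 0.5) < 0.02     # region around the band maximum

# Spread of the baseline variables relative to the spread of the peak variables.
ratio_centered = centered[:, baseline].std() / centered[:, peak].std()
ratio_autoscaled = autoscaled[:, baseline].std() / autoscaled[:, peak].std()

print("baseline/peak spread after mean centering:", ratio_centered)
print("baseline/peak spread after autoscaling:   ", ratio_autoscaled)
```

After mean centering the baseline variables carry almost no variance compared with the peak. After autoscaling they carry as much as the peak variables, which is the magnification of the baseline described above.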

The issue of interpretability of the loadings, the x residuals, even the regression vector, is clearly compounded by taking derivatives. But, it is still possible to do so by recalling the effect--positions of main 2nd derivative peaks still correspond to the original spectral peaks. Thus, you can still associate major features in the loadings with features in the raw spectra.

Will you always get improved standard errors? Maybe not. You get a noise reduction with derivatives but at the expense of signal-to-noise. It would be interesting not only to look at the model error but also at the prediction error.

Scott

Dongsheng Bu (dbu)
Intermediate Member
Username: dbu

Post Number: 20
Registered: 6-2006
Posted on Wednesday, February 06, 2008 - 4:34 pm:   

Hi Christian,

I forwarded your message to Prof. Kim Esbensen who is the author of the book.

I think it makes sense to autoscale the data after applying another pretreatment such as a 2nd derivative. As with FT-IR and Raman, 2nd derivative NIR spectra show sharp bands with bigger differences in magnitude among variables.

My understanding from the book is that the loading plot is hard to interpret when 1/SDev is applied; for example, the band profiles of 2nd derivative NIR spectra can no longer be seen in the loading plot. I am not sure about the other reason, "The main danger is that noise variables are over emphasized".

I tested different scaling settings with a public dataset (IDRC 2002 shootout). I found that RMSEP is better when 1/SDev is applied. I experienced difficulty with loading interpretation, but did not find noise variables over emphasized. Of course, it is a case-by-case judgment. As the book says, "not always" and "sometimes".

One reminder: the S-G derivative in The Unscrambler was modified in v9.5. You may find some new features and lots of improvements in the new version of The Unscrambler.

Regards,
Dongsheng

Christian Mora (cmora)
Intermediate Member
Username: cmora

Post Number: 18
Registered: 2-2007
Posted on Tuesday, February 05, 2008 - 1:18 pm:   

Dear all;

I'm kind of confused with the scaling issue of NIR spectra. I've been working with data collected on wood samples to predict wood density. The usual pretreatment we use is 2nd derivative by the SG method.

I found that if I request, after the SG transformation, the model to be fitted using mean-centered and scaled data, the results are somewhat better (lower RMSEP for instance).

I'm using Unscrambler 9.2. The thing is that according to the book "Multivariate Data Analysis" published by CAMO on page 77: "In spectroscopy scaling with 1/SDev is not always considered exclusively advantageous, because...scaling and standardization may result in losing out somewhat with respect to interpretability of the loadings..."

I'd like to have comments on this. Does it make sense to autoscale the data after applying another pretreatment?

Thanks
Christian
