"shrinking effect" of calibration... Log Out | Topics | Search
Moderators | Register | Edit Profile

NIR Discussion Forum » Bruce Campbell's List » Chemometrics » "shrinking effect" of calibration


Yang LIU (Angela)
Posted on Thursday, March 07, 2002 - 5:16 am:   

I'm modeling sensory properties with NIRS using PLS and ANN. The calibrations for some properties show the "shrinking effect", i.e., they tend to predict the mean value for the high- and low-value samples. And they didn't show obvious improvement when built on a subset of the calibration data.

In Martens's "Multivariate Calibration" there is a comment about "inverse calibration", pointing out that the shrinking effect is a pitfall of inverse calibration compared with classical calibration. I also found on this forum that Prof. Chris Brown posted a very enlightening comment on this problem (topic: Algorithm to Correct for Systematic Error), quoted below:

> Briefly, one reason may simply be a crummy model (with useless model coefficients, your model will tend to predict the mean of the data, which can end up looking like over-predicting the low values and under-predicting the high values). Or, you may be suffering from 'slope-depression' (*sigh*), which is a result of having a large amount of unmodelable error carried into your model coefficients (regression vector). This is a so-called "errors-in-variables" problem. It may also just be reference error (although often this can be ruled out based on good analytical knowledge). All of these will result in a model which tends to over-predict low values and under-predict high values, but there are a number of other reasons as well. You might start by building your model on a subset of your calibration data, and seeing if the effect is even more pronounced.


As to my case, I think this shrinking is probably 'slope-depression' and may be caused by (or indicate): 1. some factors related to the sensory properties are not captured by NIRS, or too many irrelevant variables are involved in the model fitting; 2. the observed sensory values are noisy compared with usual chemical analysis results, which adds more complexity to the model fitting.

Would anyone like to give more comments or suggestions, or provide some related references?


Yang Liu

Pierre Dardenne (Dardenne)
Posted on Thursday, March 07, 2002 - 5:57 am:   

Angela,

The shrinkage effect is a fundamental feature of the (inverse) least-squares fit. In two dimensions:

the slope is
b_yx = cov(x,y) / var(x) = cov(x,y) / sd_x^2

and the
correlation is
r = cov(x,y) / (sd_x * sd_y)

so b_yx = r * (sd_y / sd_x), and the predictions are shrunk toward the mean by the factor r. The effect depends only on r, which drops with lack of fit and/or too small a range of reference values.

It is similar with PLS, PCR and so on.
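Pierre's algebra can be checked numerically. A minimal sketch (all data, seeds, and variable names are invented for illustration): regressing y on a noisy x and comparing the spread of the fitted values to the spread of y shows the predictions shrunk by exactly the factor |r|.

```python
import numpy as np

rng = np.random.default_rng(0)
y_ref = rng.normal(10.0, 2.0, 500)       # "reference" values
x = y_ref + rng.normal(0.0, 2.0, 500)    # noisy predictor, so r < 1

# Inverse calibration: regress y on x with ordinary least squares
b = np.cov(x, y_ref)[0, 1] / np.var(x, ddof=1)   # slope = cov(x,y)/var(x)
a = y_ref.mean() - b * x.mean()
y_hat = a + b * x

r = np.corrcoef(x, y_ref)[0, 1]
# Since b = r * sd_y / sd_x, the fitted values have spread |r| * sd_y:
shrink = np.std(y_hat, ddof=1) / np.std(y_ref, ddof=1)
print(shrink, abs(r))    # the two numbers coincide
```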

Pierre

hlmark
Posted on Thursday, March 07, 2002 - 7:42 am:   

Angela - I'm afraid there's no magic. The mean is itself a least-squares estimator, and is what you'll get if your model uses zero wavelengths or factors; this is the fundamental reason for the phenomenon we call "shrinkage". From this we can deduce that one of two things is happening (which puts your own words in more blunt language):

1) There is no information in the NIR spectrum that is related to your sensory measurements.

2) Since Gauss first invented least-squares estimation, statisticians have known that the error or noise in the X variables causes the coefficients to be "biased toward zero", which is the statistical jargon for the effect that causes "shrinkage". This is known from the mathematical theory of regression, as well as having been well tested in practice. This is not unrelated to item #1, since the part of the noise that matters is the part that interferes with whatever information is in the spectrum. There's a good literature on the topic of calibration when there is error in the X variables; a reasonable starting point is the discussion in Draper and Smith, "Applied Regression Analysis", 2nd ed., Wiley, (1998).

A number of papers have appeared in which various statisticians have tackled the problem in general, and they usually wind up with a set of complicated matrix equations that are intractable and can't be solved. As far as I know, there is only one general statistical procedure that is reasonably simple in both theory and practice, although time-consuming and resource-intensive (aside from the possibility of reducing the error of the NIR measurements in the first place - you should try doing that as well). The procedure is to measure the spectrum of each sample many times, and average together the spectra corresponding to each sample; if the noise and error are random and independent from spectrum to spectrum, then this procedure will reduce the noise in accordance with the usual statistical criterion: i.e., as the square root of the number of readings that are averaged together.
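The square-root-of-n reduction Howard describes is easy to simulate. This is only a sketch under his stated assumption of independent random noise; the "spectrum", noise level, and replicate count are all invented:

```python
import numpy as np

rng = np.random.default_rng(1)
true_spectrum = np.sin(np.linspace(0.0, 3.0, 200))   # stand-in "true" spectrum
n_reps, noise_sd = 16, 0.05
# n_reps replicate scans, each with independent random noise
reps = true_spectrum + rng.normal(0.0, noise_sd, (n_reps, 200))

avg = reps.mean(axis=0)
single_err = np.std(reps[0] - true_spectrum)   # noise in one scan
avg_err = np.std(avg - true_spectrum)          # noise after averaging

print(single_err / avg_err)   # roughly sqrt(16) = 4
```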

Note that you must capture not only the instrument noise in the multiple readings, but ALL noise sources affecting the spectra. This includes sample noise: sampling error, particle size variation, etc. Sample noise is often far larger than instrument noise, and therefore the dominant noise source to reduce. This also applies to attempts to reduce the initial noise level as discussed above.

Hope this helps

Good luck

Howard

Christopher Brown
Posted on Thursday, March 07, 2002 - 9:16 am:   

For clarity of the discussion, we should separate "shrinkage" from what I've previously called "depression". Shrinkage, in the common calibration/regression usage, is highly useful, and refers to the enhanced precision of "shrunken estimators" such as PCR, PLS, ridge regression, etc. (while suffering from bias). The coefficient vector that is estimated by these techniques is 'shrunken' relative to the least-squares estimate. This is highly advantageous in most chemometric applications because a shorter regression vector will propagate less stochastic noise through to predictions (i.e., greater precision).

"Depression" as I used it previously, is certainly not desirable, as it is indicative of problems. Angela -- does your model have any predictive ability? In other words, is the RMSEP/CV any better than the standard deviation of the reference values? If so then your model IS predicting _something_, but as you suggest, it is most likely 'depressed' due to the junk (noise) in the sensory data. (If it isn't predicting anything, then the noise is overwhelming your estimates, and your model is meaningless.) The other important question is whether your reference values are known very accurately?? That can cause an _apparent_ depression in the slope, altough this effect is not usually as clearly pronounced as slope depression due to junk in your X-block.

You do have some options, though. Howard's suggestion will help in many situations, although multiple insertions/samples can be inconvenient in prediction. More calibration samples will inevitably improve the slope issue, since the depression arises from spurious correlations of the 'junk' with your reference values, and with more calibration samples the chances of those spurious correlations will decrease. Other options include errors-in-variables methods such as maximum likelihood PCR, for which you'll have to collect replicates, but which is straightforward to implement.

Happy hunting,
Christopher D. Brown
(not Prof. Chris W. Brown! ;-) )

Yang LIU (Angela)
Posted on Friday, March 08, 2002 - 9:48 am:   

Dear All (Pierre, Howard, Chris, Richard): thanks for your information and suggestions! (And sorry for my embarrassing mistake, Chris :-p)

Here are some more details about my problem: I have 75 samples in my calibration set and 25 in my validation set, so I don't think my models are starved for samples. But the low- and high-value samples are fewer (it's hard for me to collect more in practice). I have used their spectra to build calibrations for chemical components, and those work well. Can I deduce from that that there is no serious error in my spectra? As to predictive ability, the RMSEVs are close to, or a little bigger than, the RMSECs. And among the sensory models for 8 properties, some show good predictive ability, while others are poor.

Chris, you are right: my reference values are not so accurate, or I can say they are somewhat noisy, since they are estimated by a sensory panel. So the junk is in both X and Y. And maybe Y has more junk.

From the posts I feel that I misunderstood the concept of "shrinkage". "Shrinkage" is not a derogatory term, but just a feature of inverse least-squares regression. Is that right? And if my problem is caused by the error in the Y matrix and a lack of related information in the spectra (X matrix), should I call it "depression" (or some other term)?

And Chris, one more question: in your previous post you mentioned "It may also just be reference error". Could you give a little more explanation of that? Thanks!

Regards,
Angela

Christopher Brown
Posted on Friday, March 08, 2002 - 4:22 pm:   

Reference error is an oddball issue in inverse calibration. The form of the equation (whether it be via latent variable techniques, or straight-up MLR) assumes that the bulk of the error exists in the reference values, and very little or none exists in the spectra. If this is the case, estimation of the model coefficients is fine and dandy, and all appears well, except that your evaluation of the precision of the technique is limited by the error in the reference values (you are comparing your flawed model estimates to already flawed reference values, so the observed RMSECV/RMSEP reflects this combination of errors). So one question you likely want to investigate is the magnitude of the reference error in each of these sensory variables. If you find that the reference error in your sensory variables is nearly comparable to the RMSECVs you're seeing, you're stuck.

The other reference-related bug is that in the good ol' plots of "reference vs. predicted", we often put a least-squares line through the cloud to get a slope/R2/intercept estimate. Least squares is invalid in this case because the x variable has appreciable error. The result is that the slope is artificially low, the intercept is artificially high, and the R2 is REALLY artificially low. The correct procedure in this case is to use Deming regression (aka orthogonal distance regression), or maximum likelihood estimation, to get your slope/bias estimates. These are no more difficult to calculate nowadays than good ol' least squares.
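Chris's Deming-versus-least-squares point can be sketched numerically. Everything here (the data, the seed, the assumption of equal error variance on both axes) is illustrative, not from the thread: an unbiased model scored against noisy reference values gets an attenuated slope from ordinary least squares, while Deming regression with the equal-variance assumption does not.

```python
import numpy as np

rng = np.random.default_rng(2)
truth = rng.normal(50.0, 5.0, 400)
reference = truth + rng.normal(0.0, 2.0, 400)   # noisy reference (x axis)
predicted = truth + rng.normal(0.0, 2.0, 400)   # model predictions (y axis)

# Ordinary least squares of predicted on reference: slope is attenuated
sxy = np.cov(reference, predicted)[0, 1]
sxx = np.var(reference, ddof=1)
ols_slope = sxy / sxx

# Deming (orthogonal) regression, assuming equal error variance on both axes
syy = np.var(predicted, ddof=1)
deming_slope = (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)

print(ols_slope)      # noticeably below 1 despite an unbiased model
print(deming_slope)   # close to the true slope of 1
```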

If you have a lot of noise (anything not completely related to your sensory variables) in your spectra as well, then you inevitably suffer from what I've referred to as slope depression, because the inverse model inevitably regresses upon some noise in addition to the desired responses. Slope 'depression' isn't a widespread term in the literature, so I wouldn't strive to adopt it; there is no widespread term for this effect, since different disciplines call it by different names.

"Shrinkage" is an extremely common term in the literature, and you're right -- it's a good thing (usually), and not a bad thing. It's not so much a characteristic of inverse regression (MLR provides no shrinkage) as a characteristic of the so-called biased regression methods, of which PCR, PLSR, ridge regression, continuum regression are examples (of many).

Happy hunting,
Chris

paulgeladi
Posted on Tuesday, August 13, 2002 - 4:13 am:   

Technical question: Technicon 450

Hello All,
I have a Technicon 450 (very robust, 19 filters), but the computer and RS232 interface cable are lost. I have had a hard time finding the following:
-DOS
-BASICA
-Books about these
At the moment I have BASICA and GW-BASIC, but without manuals. Does anybody have a suggestion of how to get the L-values to a computer? Basic programs or other suggestions are welcome.

yonll
Posted on Thursday, October 03, 2002 - 12:07 pm:   

"And in sensory models for 8 properties, some of them show good predicting abilities, while others are poor."

One possible problem is that the samples used for NIR measurement and for sensory evaluation are different, and of course the condition of the samples is different. Currently, I am trying to develop two NIR models for the prediction of sensory properties. Details will follow soon.
