Visual examination of residuals

NIR Discussion Forum » Bruce Campbell's List » General, All others » Visual examination of residuals

Jim Burger (jburger)
New member
Username: jburger

Post Number: 3
Registered: 11-2010
Posted on Saturday, January 01, 2011 - 4:42 pm:   

Bruce - Visual interpretation of residuals is also used in Multivariate Image Analysis (MIA). In a hyperspectral image, sample surface variations (texture, shadows, and sample heterogeneity) may contribute to significant spectral variations. The "hypercube" of spectra is typically "unfolded" and processed as a massive matrix, with the resulting prediction vectors "refolded" to produce chemical prediction image maps. The residual vectors can also be refolded to yield spatial residual maps. Examination of these maps can often assist with the determination of the optimal number of factors for a calibration model. Visual interpretation of patterns exhibited in the residual maps may indicate whether residual variance is caused by scene features (objects) or by within-object variation (texture).

Since hyperspectral images provide potentially tens or even hundreds of thousands of sample spectra, residuals can also be visually examined as univariate histogram distributions. Again, the structure of these distributions can yield information relevant to the spatial variation of the sample.

Both of these visual interpretation techniques (residual image maps and histograms) are unique to hyperspectral image data, and certainly provide information to supplement the more traditional RMSEP.
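
As a rough illustration of the unfold / refold idea described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the tiny 2 x 3 pixel, 2-band "hypercube", the single unit-length loading vector, and the helper names unfold, refold, q_residual and histogram. A real MIA workflow would use a proper factor model with more components and an array library; this only shows how per-pixel residuals can be put back into image form and summarized as a histogram.

```python
def unfold(cube):
    """Flatten a rows x cols x bands hypercube into a list of pixel spectra."""
    return [spectrum for row in cube for spectrum in row]


def refold(values, n_rows, n_cols):
    """Rebuild a rows x cols map from one value per pixel."""
    return [values[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def q_residual(spectrum, loading):
    """Sum of squared spectral residuals after removing one factor.

    Assumes `loading` has unit length (hypothetical one-factor model).
    """
    score = sum(s * p for s, p in zip(spectrum, loading))
    return sum((s - score * p) ** 2 for s, p in zip(spectrum, loading))


def histogram(values, n_bins, low, high):
    """Count values into equal-width bins between low and high."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for v in values:
        idx = min(int((v - low) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Toy hypercube: 2 rows x 3 columns of pixels, 2 spectral bands each.
cube = [
    [[1.0, 0.0], [2.0, 0.0], [3.0, 0.5]],
    [[1.5, 0.0], [2.5, 0.5], [0.5, 0.0]],
]
loading = [1.0, 0.0]  # made-up loading: the model only explains band 1

spectra = unfold(cube)                                # 6 pixel spectra
q_values = [q_residual(s, loading) for s in spectra]  # one residual per pixel
residual_map = refold(q_values, n_rows=2, n_cols=3)   # back to image shape
counts = histogram(q_values, n_bins=2, low=0.0, high=0.5)

for row in residual_map:
    print(row)
print("histogram counts:", counts)
```

Displayed as an image, residual_map is where one would look for patterns that follow object outlines or texture; the histogram gives the univariate view of the same residuals.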

Bruce H. Campbell (campclan)
Moderator
Username: campclan

Post Number: 127
Registered: 4-2001
Posted on Saturday, January 01, 2011 - 3:30 pm:   

Howard,
Let me get back to you on your proposal.
Bruce

Howard Mark (hlmark)
Senior Member
Username: hlmark

Post Number: 388
Registered: 9-2001
Posted on Friday, December 31, 2010 - 4:31 pm:   

Bruce - something that occurred to me was: do you remember that tutorial we wrote a few years ago: "An Introduction to Near-Infrared Spectroscopy and Associated Chemometrics"? It's still posted on my web site at http://www.nearinfrared.com/nirandchemometrics.pdf and is available as a free download, just as we originally set it up.

I always thought that it was too wordy, and didn't have enough graphics. One way we could help promote the use of residual analysis is to put some examples into that tutorial. Would you want to take the lead in updating it, to add some information to the chemometrics section of it, that would discuss use of graphics in general, and plotting and analysis of residuals, as an aid to understanding calibration data?

\o/
/_\

Howard Mark (hlmark)
Senior Member
Username: hlmark

Post Number: 387
Registered: 9-2001
Posted on Friday, December 31, 2010 - 4:05 pm:   

Bruce - yes, you're right on the ball. Using analysis of residuals to examine and diagnose the data is an old and hallowed operation in statistical circles. I don't know how much it is used, or advocated, in routine NIR analysis, or in NIR courses these days, or in Chemometrics courses, but you're absolutely right: it should be.

When I learned about calibration, it was from an old-timer statistician who taught me to plot residuals against everything in sight: against predicted values, against calibration lab values, against all the X-variables in the calibration (we were using MLR at the time), against time, against sample number, against any other data, including extraneous data, you might have about the samples.

The first thing that pops out are any outliers; they are much more obvious when you plot residuals than when you plot variables. Then you can also see drift, how good or bad your sample distribution is, and all sorts of other effects.
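
A small numerical stand-in for that visual check, as a hedged sketch only: the residual values below are invented, and the rule used (distance from the median measured in units of 1.4826 times the median absolute deviation, with a cut-off of 3) is one common robust convention chosen for this example, not something prescribed in this thread. A plot remains the better tool; this just shows why a single bad sample stands out in the residuals.

```python
import statistics


def flag_outliers(residuals, threshold=3.0):
    """Return indices of residuals that sit far from the median.

    Scale is 1.4826 * MAD (median absolute deviation); the threshold of 3
    is an arbitrary choice for this sketch.
    """
    centre = statistics.median(residuals)
    mad = statistics.median([abs(r - centre) for r in residuals])
    scale = 1.4826 * mad
    if scale == 0:
        return []  # no spread to compare against
    return [
        i for i, r in enumerate(residuals)
        if abs(r - centre) / scale > threshold
    ]


# Invented residuals (predicted minus lab value), listed in sample order.
residuals = [0.1, -0.2, 0.0, 0.15, -0.1, 2.5, 0.05, -0.15, 0.2, -0.05]

suspects = flag_outliers(residuals)
print("suspect sample indices:", suspects)
```

The same residual list can then be paired with predicted values, lab values, time or sample number for plotting, which is where drift and poor sample distribution would show up.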

I did include some discussion of it in my book "Principles and Practice of Spectroscopic Calibration" but now there are so many competing books where residuals are hardly mentioned that I don't know if anyone pays any cognizance to them any more.

The only redeeming thing that I know of is that several of the software packages that are available do include residual plotting capability. But having the capability available isn't of much use if that capability is not used, or the results of using it aren't understood. So it's probably more important that the people teaching NIR and chemometrics courses start to advocate it.

\o/
/_\

Bruce H. Campbell (campclan)
Moderator
Username: campclan

Post Number: 126
Registered: 4-2001
Posted on Friday, December 31, 2010 - 8:04 am:   

Howard,
I realize that multidimensional "fits" are difficult/impossible to read and that there are programs to look at residuals in three dimensions, so let's back off. What about plotting residuals for the first factor to verify that a straight-line approximation is correct? Would that be possible in an MLR-type situation, and not only possible but of utility?

Yet another use of residuals could be to relate their size to the standard deviation (SD) of the reference method. If the residuals are about the same size as the reference SD, then using more factors is getting into the noise contributions, and it would imply that the SD from the NIR is smaller than that of the reference method. If the residuals never get smaller than the reference SD, then wouldn't that mean it is time to re-examine the NIR approach to discover variations that shouldn't be there, such as variation with time?
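
One way to make that comparison concrete is sketched below in Python. The residuals for each number of factors and the reference SD of 0.3 are made-up numbers for illustration, and compare_to_reference is a hypothetical helper, not a function from any chemometrics package. The sketch only computes the RMS residual per model and its ratio to the reference SD; how to act on the ratio is left to judgment.

```python
import math


def rms(residuals):
    """Root-mean-square size of a list of residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def compare_to_reference(residuals_by_factors, reference_sd):
    """Return (n_factors, rms_residual, ratio_to_reference_sd) tuples."""
    report = []
    for n_factors, residuals in sorted(residuals_by_factors.items()):
        value = rms(residuals)
        report.append((n_factors, value, value / reference_sd))
    return report


# Made-up residuals (NIR prediction minus reference value) per model size.
residuals_by_factors = {
    1: [0.9, -1.1, 0.8, -0.7, 1.0, -0.9],
    2: [0.5, -0.6, 0.4, -0.5, 0.6, -0.4],
    3: [0.3, -0.3, 0.2, -0.3, 0.3, -0.2],
    4: [0.1, -0.1, 0.1, -0.1, 0.1, -0.1],
}
reference_sd = 0.3  # hypothetical SD of the reference method

report = compare_to_reference(residuals_by_factors, reference_sd)
for n_factors, value, ratio in report:
    print(f"{n_factors} factors: RMS residual {value:.3f}, "
          f"ratio to reference SD {ratio:.2f}")
```

A ratio near 1 corresponds to residuals of about the same size as the reference SD; a ratio that stays well above 1 no matter how many factors are used is the case where the NIR approach itself deserves another look.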

What I'm really getting at is to use residuals to understand the NIR method better and to improve it so it becomes more robust and precise.

I don't remember seeing much, if any, discussion on this in this forum. I was wondering how many users include detailed examination of residuals in any way possible.

Howard Mark (hlmark)
Senior Member
Username: hlmark

Post Number: 384
Registered: 9-2001
Posted on Thursday, December 30, 2010 - 3:08 pm:   

Bruce - nice point. Curve-fitting is very analogous to multivariate calibration. In fact, you could actually do curve-fitting with an MLR program, by generating quadratic, cubic, quartic, etc. terms and using those as the "variables". You run into problems with that approach, however, in the correlations that occur between the various different functions, so it's better to use the standard method. Otherwise, however, it's theoretically correct.
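
The correlation problem with polynomial terms is easy to see numerically. The sketch below is only an illustration under assumed conditions: x runs over the integers 1 to 10 (an arbitrary choice), and pearson is a small helper written for this example. It builds x, x^2 and x^3 as if they were MLR "variables" and prints their pairwise correlations.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


x = [float(i) for i in range(1, 11)]  # arbitrary range for the illustration
variables = {
    "x": x,
    "x^2": [xi ** 2 for xi in x],
    "x^3": [xi ** 3 for xi in x],
}

names = list(variables)
correlations = {}
for i, a in enumerate(names):
    for b in names[i + 1:]:
        correlations[(a, b)] = pearson(variables[a], variables[b])
        print(f"r({a}, {b}) = {correlations[(a, b)]:.3f}")
```

Over this range all three pairs are strongly correlated, which is the kind of intercorrelation among the generated "variables" that the post warns about.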

In both cases, however (the curve fitting and the MLR), the underlying concept is the minimization of the sum of squares of the errors between the constructed model and the actual data. In this sense, the "goodness of fit" leads to small residuals just as it does in our standard calibration algorithms.

The problem with applying the concept to multidimensional chemometrics is simply that we can't "see" more than three dimensions. We can't even do that in three dimensions, much less when it's compressed onto the two dimensions of a computer display. There are, of course, several program packages, such as Infometrix's Pirouette, that display a three-dimensional data cloud on the two-dimensional screen, and I know that even there I have trouble keeping track of data points, and which are in front and which are in back.

The best we do is to use the calibration model to project all those dimensions onto the concentration axis, so that we can display it as a 2-dimensional plot. Statisticians do a bit more than that, and they will plot predicted values against individual variables, or residuals against individual variables, and several other possibilities. Some of these capabilities are entering the chemometric world, and you can find them in the software packages. But those are still attempts to squeeze the information into the two dimensions that we can see and deal with, rather than trying to plot four or more dimensions.

I have no idea as to how you'd go about trying to squeeze four or more dimensions into such a display. If you can solve that problem, you'll really have something!!

\o/
/_\

Bruce H. Campbell (campclan)
Moderator
Username: campclan

Post Number: 125
Registered: 4-2001
Posted on Thursday, December 30, 2010 - 2:22 pm:   

With simple, one-dimensional curve-fitting, a plot of residuals vs. sample intensity (percentage, weight, etc.) allowed the examiner to make a decision on the goodness of fit. Here, the goodness of fit isn't how small the residuals are but how close the underlying equation is to the actual one. I know there have been statistical examinations to buttress the visually based decision(s).
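
To illustrate the distinction between small residuals and a correct underlying equation, here is a minimal Python sketch. The data are synthetic and noise-free (a gently curved quadratic chosen for the example), and fit_line and sign_changes are helper names invented here. A straight line is fitted to the curved data, and the residuals are then inspected in order of x.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b


def sign_changes(values):
    """Count how often consecutive non-zero residuals switch sign."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# Synthetic, noise-free data that follow a curve, not a straight line.
x = [float(i) for i in range(11)]
y = [2.0 + 0.5 * xi + 0.05 * xi ** 2 for xi in x]

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

for xi, ri in zip(x, residuals):
    print(f"x = {xi:4.1f}   residual = {ri:+.3f}")
print("sign changes along x:", sign_changes(residuals))
```

The residuals are small compared with the range of y, yet they are positive at both ends and negative in the middle. That systematic pattern, rather than the size of the residuals, is what signals that the straight-line equation is not the right one.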

Has this ever been done for multidimensional chemometrics? If so, how usable was it?
