What are wavelets; and secondary calibrations

16. What are wavelets; and secondary calibrations.

Two questions. The first deals with "what are wavelets" and the second with indirect or secondary calibrations (where the regression, etc. is on a spectral characteristic that is only related to the constituent of interest; that constituent doesn't itself give a band).

A reply from Howard:

Too bad you weren't at the last Chambersburg meeting. One presentation was an absolutely fantastic presentation of wavelets: the kind that's all done in words of one syllable or less, as I like to put it. I can give the briefest of synopses here: basically, like Fourier & a whole slew of other transformations, it's a way of decomposing a signal (in spectroscopy read "spectrum" for "signal") into a set of more fundamental signals (called basis functions). In Fourier analysis, those are sinusoids. In Taylor analysis, those are polynomials, etc. Wavelets are another set of these basis functions. There are apparently many kinds, but one of the more common, & I guess more useful, is a very highly damped sinusoid:


>               xx
>              x  x
>             x    x
>            x      x
> xx        x        x       xx
>     x   x            x   x
>      xx                xx

Like a sine wave, the frequency, expressed as the width, can vary, and thus different degrees of detail in the original signal can be expressed. Unlike a sine wave, however, it is "finite", or at least localized, so that different parts of the original signal can be decomposed into different sets of wavelets. Thus, wavelets can express both the frequency and location of the different parts of the signal. Another way to express it is that it's as if each different section of the signal can have its own "frequency" (actually wavelet) analysis, rather than a single Fourier analysis serving for the entire signal. In terms of written descriptions, I vaguely recall seeing some recently, and not too bad, but I really don't recall where. Try the usual suspects: Anal. Chem., Appl. Spect., Spectroscopy, Photonics Spectra, Laser Focus World, etc., within the last year or two. What I saw has to be somewhere in that batch. If I come across it again, I'll try to remember to send it. Have you tried the Internet?
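To make the "frequency and location" point concrete, here is a minimal sketch in Python. It assumes the "Mexican hat" (Ricker) wavelet as a stand-in for the shape drawn above, uses a made-up two-band "spectrum", and the function names `ricker` and `wavelet_transform` are just illustrative choices, not anything from the presentation.

```python
import numpy as np

def ricker(n_points, width):
    """Mexican-hat (Ricker) wavelet sampled at n_points, centred, with scale `width`."""
    t = np.arange(n_points) - (n_points - 1) / 2.0
    x = t / width
    norm = 2.0 / (np.sqrt(3.0 * width) * np.pi ** 0.25)
    return norm * (1.0 - x ** 2) * np.exp(-0.5 * x ** 2)

def wavelet_transform(signal, widths):
    """Convolve the signal with the wavelet at each width; one output row per width."""
    rows = []
    for w in widths:
        n = 10 * int(w) + 1  # odd length keeps the wavelet centred on a sample
        rows.append(np.convolve(signal, ricker(n, w), mode="same"))
    return np.array(rows)

# Made-up "spectrum": a broad band centred at 150 and a narrow band at 350
x = np.arange(512)
spectrum = (np.exp(-0.5 * ((x - 150) / 30.0) ** 2)
            + np.exp(-0.5 * ((x - 350) / 4.0) ** 2))

widths = [4, 30]  # a narrow wavelet and a wide wavelet
coeffs = wavelet_transform(spectrum, widths)

# Where does each wavelet width respond most strongly?
narrow_pos = int(np.argmax(coeffs[0]))
broad_pos = int(np.argmax(coeffs[1]))
print("width 4 responds most near", narrow_pos)
print("width 30 responds most near", broad_pos)
```

The narrow wavelet picks out the narrow band and the wide wavelet picks out the broad band, each at its own position, which is the information a single Fourier analysis of the whole signal does not hand you directly.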

Re: your second topic: depending on a secondary correlation, which is what happens in such cases, is always risky. The correlation can hold up through calibration and any number of validation runs, and then break down later on for some unforeseen reason. A classic case is the measurement of salt (NaCl) in water based on the shift of the water peak with ionic strength. Different ions can have different effects, which can be sorted out to some extent - - - but heaven help you if the temperature changes at some later time! There's simply no absolute way to be sure. All you can do is do your best and try to think of all the possible problems in advance, so you can guard against them.

Keep an ongoing QC program for the measurements: a known "unknown" a couple of times a day should show up major discrepancies which will tell you to do some troubleshooting.
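As a rough sketch of that kind of check (the readings, the known value, and the tolerance below are all made-up numbers, and `check_sample_alarms` is a hypothetical helper name):

```python
def check_sample_alarms(readings, known_value, tolerance):
    """Return the indices of check-sample readings that miss the known value
    by more than the allowed tolerance."""
    return [i for i, r in enumerate(readings)
            if abs(r - known_value) > tolerance]

# Hypothetical NIR results for the same known "unknown", run twice a day
readings = [10.02, 9.97, 10.05, 10.41, 10.38]
alarms = check_sample_alarms(readings, known_value=10.0, tolerance=0.15)
print(alarms)  # [3, 4]
```

Any flagged reading is the cue to stop and troubleshoot before trusting further results. The tolerance would in practice come from the observed repeatability of the method, not from a guess like the one used here.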



And a response from Emil

I would agree on the second topic: trace components and secondary effects. Insofar as not including a trace material because it is not "seen" by NIR, this IS a fatal mistake. However, in addition to Howard's remarks, a shift in the spectra may be caused by any number of trace ingredients. In the case of a pharmaceutical mix (read: "solution"), degradation products, still too dilute to be determined directly, could give the same kinds of shifts as, say, salt or pH shifts in a product. While it makes a nice paper for PittCon, I would be hesitant to try "indirect" calibrations in a production setting.


p.s., I know from nothing about wavelets... thought that's what you got from a baby saying "bye-bye."


From: Gonzalez Panyko, Ana

In regard to your second topic, I have had experience with trying to create a calibration for a trace substance, albeit unsuccessfully. I attempted to calibrate NIR with the percentage of ash in flour, which according to some can be done, although the publications I have seen on this topic have been unhelpful. When I generated the calibration, my error, though small and in itself acceptable, was almost as large as the entire range of the ash that was represented in my teaching set. Normally this would indicate that the calibration's predictions were generated randomly. So to add to your question, is it necessary to "spike" samples to get a good error-to-range relationship, or is the error-to-range relationship not significant when you are dealing with a trace substance? Like you, I also am interested in what kind of validation would be required. Please post this to the group.
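The error-versus-range comparison described above can be sketched as follows. The ash values are invented for illustration, and `error_to_range` is a hypothetical helper; the extra "baseline" figure is the error you would get by simply quoting the mean of the teaching set for every sample.

```python
import numpy as np

def error_to_range(reference, predicted):
    """Compare prediction error with the spread of the reference values."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = float(np.sqrt(np.mean((predicted - reference) ** 2)))
    value_range = float(reference.max() - reference.min())
    # Error of the trivial model that always predicts the mean reference value
    baseline = float(np.sqrt(np.mean((reference - reference.mean()) ** 2)))
    return {
        "rmse": rmse,
        "range": value_range,
        "rmse_over_range": rmse / value_range,
        "rmse_over_baseline": rmse / baseline,
    }

# Made-up % ash values: lab reference vs. NIR prediction
reference = [0.45, 0.48, 0.50, 0.52, 0.55]
predicted = [0.50, 0.46, 0.53, 0.47, 0.51]
stats = error_to_range(reference, predicted)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this invented example the error is small in absolute terms, yet `rmse_over_baseline` comes out above 1, meaning the calibration does no better than quoting the mean, which is the situation described above.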



From: Keith L. Miller

On your second topic -

One must be extra careful with this approach, as you stated. As an example, measuring NaOH (or NaCl) in water (this has been published) would be done by measuring the changes in the location and shape of the water peak (multivariate analysis). However, there can be other changes in the system that have an identical effect on the water peak. These "other things" may be other salts, temperature, etc.

I've seen this technique before and always warn potential customers as to the pitfalls and dangers. It can work in some applications if approached properly.
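A toy illustration of the pitfall, assuming a purely hypothetical linear model in which both salt and temperature move the water peak (the coefficients are invented, not measured values):

```python
import numpy as np

def peak_shift(salt_pct, temp_c):
    """Hypothetical water-peak shift (nm): salt and temperature both move it."""
    return 0.8 * salt_pct - 0.3 * (temp_c - 25.0)

# Calibrate salt against peak shift with every sample held at 25 C
salt_cal = np.linspace(0.0, 10.0, 11)
shift_cal = peak_shift(salt_cal, 25.0)
slope, intercept = np.polyfit(shift_cal, salt_cal, 1)

def predict_salt(shift):
    return slope * shift + intercept

true_salt = 5.0
pred_same_temp = predict_salt(peak_shift(true_salt, 25.0))
pred_warmer = predict_salt(peak_shift(true_salt, 30.0))
print(f"true {true_salt:.2f}, at 25 C {pred_same_temp:.2f}, at 30 C {pred_warmer:.2f}")
```

The calibration is perfect as long as the temperature stays where it was during calibration, and is biased as soon as it does not, with nothing in the single shift measurement to tell the two causes apart.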

Keith L. Miller

Product Marketing Manager

UOP/Guided Wave


From: Jim Reeves, NCML, B-200

Another good example of this, and of the problems it can cause, is the determination of minerals in animal feeds by NIR. Since the minerals don't have absorptions in the NIR, the correlations are only as good as the relationships between the minerals and organic constituents, and those relationships generally don't hold for very long. In the book by Burns and Ciurczak there are some data presented which show that you can predict P and K as well using measures of protein and fiber as you can with NIR. My feeling would be that unless you have a very controlled situation, for example, a company making salt solutions using very pure salt and water, where only the concentration is varying and needs to be measured, it's best to avoid this if possible. I suspect there's probably more of this occurring than is realized, and it may account for problems. For example, at several meetings, NIR researchers have stated that the absorptions used in the calibration are due to S groups, only to be shot down by mid-infrared spectroscopists who point out that S bands in the mid-infrared are very weak and the overtones would be non-existent in the NIR region. In this case, one would assume that the rest of the molecule is what is being detected, but who knows for sure?

Jim Reeves


And another reply from Howard, in response to Ana's question:

As with most operations, "spiking" samples has both its advantages and dangers.

On the plus side, it may very well improve the model, and give a better fit to the data even in the restricted range that represents the "real" samples.

On the minus side: first of all, that is not guaranteed. Secondly, in order to accomplish that, you need to spike many, if not most, or even all, your "real" samples. This is necessary to avoid introducing a chance correlation between the constituent level and some other (possibly unknown) characteristic of the samples; i.e., to inadvertently create a correlation where one didn't exist already: this would definitely be counterproductive. You would also want to spike to a couple of different levels, so as to maintain a more-or-less uniform distribution of constituent values. This will increase the amount of work required by a factor of two or three or more.

Third, and worst, of all, is the fact that even with all this, you will probably not improve the absolute error of the analysis. This being the case, the correlation with the "real" samples over the range of the real samples will not improve, either, despite the improvement in the modelling process over the extended range.
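A small simulation shows the effect. It assumes, as stated above, that the absolute error of the analysis is the same with or without spiking; the ranges and the error level are invented numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
noise_sd = 0.05  # assumed absolute error of the analysis, same in both cases

def simulate(low, high, n=200):
    """Simulate reference values spread over [low, high] and predictions
    that differ from them by the same fixed absolute error."""
    reference = rng.uniform(low, high, n)
    predicted = reference + rng.normal(0.0, noise_sd, n)
    rmse = float(np.sqrt(np.mean((predicted - reference) ** 2)))
    r = float(np.corrcoef(reference, predicted)[0, 1])
    return rmse, r

rmse_real, r_real = simulate(0.45, 0.55)      # "real" samples only
rmse_spiked, r_spiked = simulate(0.45, 1.50)  # range extended by spiking

print(f"real range:   RMSE {rmse_real:.3f}, r {r_real:.2f}")
print(f"spiked range: RMSE {rmse_spiked:.3f}, r {r_spiked:.2f}")
```

The correlation coefficient looks far better over the extended range, but the error is essentially unchanged, so predictions for samples inside the narrow "real" range are no more useful than before.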

As far as the particular analysis is concerned (ash in flour): many years ago Dave Wetzel of KSU did a lot of work in this area; basically his final conclusion was that what the NIR was actually measuring was the bran (fiber) content, since that is where most of the inorganics that constitute the ash are present. On the other hand, it is the bran that affects the baking quality, and is what should be being measured, anyway. By historical quirk, it was a lot easier to measure ash than fiber directly using wet chemistry, so that "ash" analysis was used as the measure of baking quality. Now we are stuck with the situation where we try to use the NIR to measure something which is only indirectly correlated with what is of actual interest, when now we could more easily measure the quantity itself!

Dave is still around, why not contact him and get the information first-hand? If nothing else, I'm sure he will be at Chambersburg next summer.


From: de Noord, Onno

Subject: RE: Wavelets and ?

About wavelets:

Two general tutorials have appeared recently:

- B. Walczak and D.L. Massart, "Noise suppression and signal compression using the wavelet packet transform", Chemom. Intell. Lab. Syst., 36 (1997) 81-94

- B.K. Alsberg, A.M. Woodward and D.B. Kell, "An introduction to wavelet transforms for chemometricians: A time-frequency approach", Chemom. Intell. Lab. Syst., 37 (1997) 215-239

These tutorials include references to different applications in analytical chemistry.

Directly related to (qualitative and quantitative) NIR spectroscopy are:

- B. Walczak, B. van den Bogaert and D.L. Massart, "Application of wavelet packet transform in pattern recognition of near-IR data", Anal. Chem., 68 (1996) 1742-1747

- D. Jouan-Rimbaud, B. Walczak, R.J. Poppi, O.E. de Noord and D.L. Massart, "Application of wavelet transform to extract the relevant component from spectral data for multivariate calibration", Anal. Chem., scheduled for November.