Intermediate qualification by NIR

NIR Discussion Forum » Bruce Campbell's List » I need help » Intermediate qualification by NIR


Nuno Matos (Nmatos)
Posted on Tuesday, February 22, 2005 - 3:55 am:   

Hello everybody.

At the moment, I'm trying to use NIR to qualify production intermediates. The flow is: a calibration set with "good" samples and a validation set with "good" plus out-of-spec samples.

Does anyone have any suggestions to help me? I'm working with powders and I already have positive results.

Looking forward to your contributions.

Gratefully

Nuno

Noah
Posted on Tuesday, February 22, 2005 - 4:14 am:   

Nuno,
I have little experience with NIR, but I think you might mix "good" and out-of-spec samples in the calibration set, in order to cover the whole range well.
Am I wrong?
Regards

Noah

Nuno Matos (Nmatos)
Posted on Tuesday, February 22, 2005 - 4:18 am:   

Noah

If I added the out-of-spec samples to the calibration set, I would be teaching the model that those samples are acceptable. What I want the model to learn is what the good samples look like, so that every time it "sees" an out-of-spec sample the "alarm" rings.

I haven't mentioned it yet, but I'm using the Mahalanobis distance on PCA scores.
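The approach Nuno describes (PCA on "good" calibration spectra, then the Mahalanobis distance of new samples in score space) can be sketched as follows. This is an illustrative numpy sketch, not code from any particular chemometrics package; the function names are hypothetical:

```python
import numpy as np

def fit_pca_mahalanobis(X, n_components=3):
    """Fit PCA on 'good' calibration spectra (rows = samples, cols = wavelengths).

    Returns everything needed to score new spectra by Mahalanobis distance:
    the calibration mean, the PCA loadings, and the inverse score covariance.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the mean-centered data: rows of Vt are the loadings
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T          # (n_wavelengths, k)
    scores = Xc @ loadings                  # calibration scores (n_samples, k)
    cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))
    return mean, loadings, cov_inv

def mahalanobis_distance(x, mean, loadings, cov_inv):
    """Distance of one spectrum from the calibration cluster, in score space."""
    t = (x - mean) @ loadings
    return float(np.sqrt(t @ cov_inv @ t))
```

A new spectrum whose distance exceeds a limit derived from the calibration distances (see the threshold discussion below in the thread) would then ring the "alarm".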

Noah
Posted on Tuesday, February 22, 2005 - 5:15 am:   

Nuno,

May I ask what kind of instrument you are using?
Some of these comments will probably be useful for my projects as well.

Noah

Nuno Matos (Nmatos)
Posted on Tuesday, February 22, 2005 - 5:19 am:   

Noah,

I'm using an FT-NIR instrument with a resolution of 16 cm-1 and 64 scans.

Just to add: this qualification will be applied to raw materials as well as to production intermediates.

Nuno

hlmark
Posted on Tuesday, February 22, 2005 - 8:33 am:   

Nuno - whether adding out-of-spec samples to the calibration set is a good idea depends on the type of algorithm you're using. If you're using a quantitative method (e.g., MLR, PLS, PCR) to verify whether some component of the samples is within the required range of values, then adding out-of-spec samples is a good idea, because in that case Noah is correct, and the extra range will give the calibration more robustness.

On the other hand, if you're using a qualitative algorithm (e.g., Mahalanobis Distance, Spectral matching) then you should not add out-of-spec samples to the calibration data set because then you will be teaching the algorithm that out-of-spec samples are OK: the algorithm will be learning the wrong thing.

Howard

\o/
/_\

Nuno Matos (Nmatos)
Posted on Tuesday, February 22, 2005 - 9:11 am:   

Howard,

As I said before, I'm using the Mahalanobis distance (and, secondly, the spectral residual), and therefore I will use only in-spec samples for training.

What kind of pre-processing do you think I should use (besides mean-centering and MSC)?

David W. Hopkins (Dhopkins)
Posted on Tuesday, February 22, 2005 - 9:58 am:   

Nuno,

Yes, you will need both good (pass acceptance) and bad (fail acceptance) samples in your training sets. You need to set up a model to discriminate the two classes, and you say you have chosen Mahalanobis Distance on scores. This should be a good choice, but it is not the only one. You might want to try PLS-DA, too. That means you assign a value of, say, 0 to Class 1 (Good) and 1 to Class 2 (Bad), perform a normal PLS to predict the values, and set a threshold that divides the "0"s from the "1"s.

Any way you do the discrimination, you need to select the wavelengths carefully to exclude noise (usually at the ends of the spectra) and include a large number of critical regions. You need to have large enough sets of both Class 1 and Class 2 to set up good PCA models that will properly discriminate the 2 classes. One challenge will be in building a large enough set of Class 2 (Bad) samples that future Class 2 samples will be identified correctly. Another challenge will be having enough variability in the Class 1 samples that the acceptance tolerances will be wide enough to accept future Good samples even though they vary somewhat from the training samples. You will need to set a procedure to update the model in the future, when samples of Class 1 and 2 are not correctly identified.

Every software package has a slightly different flavor to the discrimination, but I hope that this general discussion will help you in your project.

Best wishes,
Dave
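Dave's PLS-DA recipe (0/1 class values, an ordinary PLS regression, then a threshold) can be sketched with a minimal PLS1/NIPALS implementation. This is an illustrative numpy sketch under the assumptions stated in the comments, not code from any specific package:

```python
import numpy as np

def pls1_fit(X, y, n_lv=2):
    """Minimal PLS1 (NIPALS) for PLS-DA: y holds the 0/1 class labels.

    Returns the X/y means and the regression vector B in centered X space.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_lv):
        w = Xc.T @ yc                     # weight vector: covariance direction
        w /= np.linalg.norm(w)
        t = Xc @ w                        # scores for this latent variable
        p = Xc.T @ t / (t @ t)            # X loadings
        q = (yc @ t) / (t @ t)            # y loading (scalar)
        Xc = Xc - np.outer(t, p)          # deflate X and y
        yc = yc - q * t
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    B = W @ np.linalg.inv(P.T @ W) @ Q    # regression vector
    return x_mean, y_mean, B

def pls1_predict(Xnew, x_mean, y_mean, B):
    """Predicted class values; a threshold (e.g. 0.5) then assigns the class."""
    return (Xnew - x_mean) @ B + y_mean
```

In practice the threshold between the "0"s and "1"s would be validated on held-out samples of both classes, as discussed below in the thread.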

Nuno Matos (Nmatos)
Posted on Tuesday, February 22, 2005 - 10:15 am:   

Well, the OOS samples were obtained by the R&D department and "simulate" situations that could happen. Using DPLS (or PLS-DA) is not a good idea here, since I do not want the model to discriminate only that "kind" of OOS. Instead, I want the model to flag everything that isn't "normal". That's why I've chosen the Mahalanobis distance on PCA scores.

Normally, DPLS is used for classification rather than discrimination.

I'm glad to see all this support from you. It's very important for me! Any more suggestions would be great.

Thanks

Nuno

David W. Hopkins (Dhopkins)
Posted on Tuesday, February 22, 2005 - 10:30 am:   

Nuno,

If you conceive the problem as a single PCA that is to accept the good samples and reject the bad ones, I agree, you should not put the bad ones in a training set. Rather, you require that as a test set, they fall outside the acceptance limit on the good samples.

How you do the pretreatment may depend on what variables are important for your process. If particle size is important, you may not want to use either MSC or derivatives, because they remove the offsets and slopes introduced by particle size variations. In fact, you should evaluate whether mean-centering is useful, again because the critical variation may be removed.

If particle size is not an issue, but you are concerned about contamination by foreign materials, MSC or SNV may be useful, and so may derivatives. You should check whether it is better to perform the derivatives before or after the MSC or SNV, as well as whether MSC is better than SNV in your application. If you use derivatives, you need to optimize the derivative procedure (first or second derivative, how many points in the window, and what order of polynomial). There are other ways to calculate derivatives, but I assume that your software has Savitzky-Golay derivatives available.

Derivatives have been discussed elsewhere in the discussion area; try a search for more details. There are few general rules here; you may need to try the alternatives and judge which is "best".

Best wishes,
Dave
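The pre-treatments discussed here, SNV and a Savitzky-Golay-style derivative, can be sketched as follows. This is an illustrative numpy version (a sliding polynomial fit rather than the usual precomputed convolution coefficients, and it simply drops the edge points); real software will handle the edges differently:

```python
import numpy as np

def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum (row) separately.

    This removes per-sample offset and multiplicative scatter effects.
    """
    s = np.asarray(spectra, dtype=float)
    return (s - s.mean(axis=1, keepdims=True)) / s.std(axis=1, keepdims=True)

def savgol_derivative(spectrum, window=7, polyorder=2, deriv=1):
    """Savitzky-Golay derivative: fit a polynomial in a sliding window and
    evaluate its derivative at the window centre. Edge points are dropped."""
    half = window // 2
    x = np.arange(-half, half + 1)
    out = []
    for i in range(half, len(spectrum) - half):
        coeffs = np.polyfit(x, spectrum[i - half:i + half + 1], polyorder)
        p = np.polyder(np.poly1d(coeffs), deriv)
        out.append(p(0.0))              # derivative at the window centre
    return np.array(out)
```

As Dave notes, the window width, polynomial order, derivative order, and the ordering of derivative vs. SNV/MSC all have to be tuned empirically for the application.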

YangLiu
Posted on Tuesday, February 22, 2005 - 10:33 am:   

Nuno,

Even when you use PCA + M-distance, you need to determine a threshold on the M-distance to distinguish the out-of-spec samples, which requires a group of out-of-spec samples as a validation set (although you don't put them into your PCA model). Right now you may use a default setting in the software (e.g., 3 times the average distance of your model). But eventually you want your model to be more robust to all sorts of variation, especially since you said you want to generalize the model to raw materials. I think David's suggestions are great.

Yang

hlmark
Posted on Tuesday, February 22, 2005 - 10:43 am:   

Nuno - yes, that message came in after I sent off my reply - you can't always rely on the I-net to be timely!!

As for the "best" data transformation, to some extent that is a matter of trial and error, and it also depends on the nature of the "out-of-spec" samples. NIR is a very empirical science; sometimes you have to try different things in order to find the one that works best.

But there are general guidelines you can follow. For example, if you think the samples will be out of spec because of contamination, those samples will show spectral differences at particular wavelengths; in this case a derivative treatment, possibly followed by the Standard Normal Variate, will tend to emphasize those differences and give you good discrimination. On the other hand, if particle size differences are important and will create out-of-spec conditions, then what David said is correct: derivatives and SNV or MSC are the last things you'd want to consider, since they diminish the particle size effects.

Howard

\o/
/_\

Nuno Matos (Nmatos)
Posted on Wednesday, February 23, 2005 - 2:49 am:   

David, Yang and Howard,

See my comments below.
____________________________________
David,

I'm applying MSC because I do not want to evaluate the particle size and other phenomena related to scattering. I use mean-centering because, applying PCA to raw spectra, one can see that the first component is essentially the average spectrum; using the raw spectra would therefore add an extra component for no reason.

On the derivatives question: I've tuned the window width (the smaller the width, the greater the discrimination power, but the less robust the model). Nevertheless, I haven't tried swapping the order of the derivative and the MSC/SNV pre-treatments. Is there a reason to?

Thanks for mentioning SNV; I had totally forgotten that SNV is a scatter-correction treatment.
________________________________________________
Yang:

The threshold I'm using is mean + 4×STD, where the statistics are calculated from the Mahalanobis distances of the calibration set.
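Nuno's acceptance limit can be sketched in a few lines (illustrative numpy code; `qualify` is a hypothetical helper name, and whether ddof=0 or ddof=1 is used for the standard deviation is a choice the original post does not specify):

```python
import numpy as np

def distance_threshold(cal_distances, k=4.0):
    """Acceptance limit: mean + k * SD of the calibration Mahalanobis distances."""
    d = np.asarray(cal_distances, dtype=float)
    return d.mean() + k * d.std(ddof=1)

def qualify(distance, limit):
    """True = sample accepted as 'normal'; False = the 'alarm' rings."""
    return distance <= limit
```

As Yang points out, the multiplier k (3 or 4 here) should itself be challenged against a validation set containing known out-of-spec samples.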
_______________________________________________
Howard,

As in my reply to David. But thanks again for the SNV.

Again, thank you all for this support.
I'm fully open to your suggestions.

Nuno

YangLiu
Posted on Wednesday, February 23, 2005 - 10:37 am:   

Nuno,
First of all, one correction: I meant average + 3×STD in my first post. My bad. This is a statistical approach, and in a given situation the OOS and good samples may vary. It's good to use a set of OOS samples (and some good ones) to validate and adjust the threshold (the 4 you use now) - as Howard said in the other post, to "challenge" the model. Just my 2 cents; every case has its own features.
Yang
