NIR Discussion Forum: Discrimination model: how many samples? which quality?





David W. Hopkins (dhopkins)
Senior Member
Username: dhopkins

Post Number: 92
Registered: 10-2002

Posted on Monday, September 04, 2006 - 10:26 pm:

Hi Nuno,

Please tell us how many samples you have in each of your groups.

I think SIMCA is a great method for discrimination of samples, but for only 2 groups, perhaps you need to look at the separation of the 2 groups and set a threshold at 1/2 the distance between them.
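One possible reading of the half-distance threshold, as a minimal sketch: project the PCA scores onto the line joining the two group means and cut at the midpoint. The function name `midpoint_classifier` and the synthetic scores are assumptions made for illustration, not something from this thread.

```python
import numpy as np

def midpoint_classifier(scores_a, scores_b):
    """Build a two-group classifier that cuts halfway between the group means.

    scores_a, scores_b: (n_samples, n_factors) arrays of PCA scores.
    Returns a classify() function and the distance between the two means.
    """
    mean_a = scores_a.mean(axis=0)
    mean_b = scores_b.mean(axis=0)
    direction = mean_b - mean_a
    separation = np.linalg.norm(direction)
    direction = direction / separation          # unit vector from A to B
    cut = ((mean_a + mean_b) / 2) @ direction   # midpoint projected on that line

    def classify(scores):
        proj = np.atleast_2d(scores) @ direction
        return np.where(proj > cut, "B", "A")

    return classify, separation

# Synthetic scores (assumption): a large group A and a small group B.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0, 0], scale=0.5, size=(40, 2))
group_b = rng.normal(loc=[4, 1], scale=0.5, size=(10, 2))

classify, separation = midpoint_classifier(group_a, group_b)
labels_a = classify(group_a)
labels_b = classify(group_b)
print(f"separation between means: {separation:.2f}")
print("fraction of A called A:", (labels_a == "A").mean())
print("fraction of B called B:", (labels_b == "B").mean())
```

A plain Euclidean midpoint ignores the different spreads of the two groups; scaling the projection by each group's standard deviation would be one way to refine it.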

Because SIMCA is PCA based, you can still enhance the separation by optimizing the separate PCA models. I usually find that 2 or 3 factors for each model work best. But you need to optimize the spectral range carefully: be sure to omit as much of the low-information low wavelengths as possible, and that portion of the high wavelengths that adds more noise than information. Maybe eliminate water band regions? And perhaps most important, evaluate whether transforms (derivatives, with or without subsequent MSC or SNV) will add specificity to the models, so that the groups are tighter in MD space and more widely separated.
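The pretreatment steps mentioned above (range trimming, a derivative, then SNV) can be sketched as follows. This is only an illustration: the helper names, the 1200 to 2300 nm window, and the synthetic spectra are assumptions, and the derivative is a plain finite difference via `np.gradient` rather than the Savitzky-Golay filter that is more common in NIR work.

```python
import numpy as np

def trim_range(spectra, wavelengths, low, high):
    """Keep only the columns whose wavelength lies in [low, high]."""
    keep = (wavelengths >= low) & (wavelengths <= high)
    return spectra[:, keep], wavelengths[keep]

def first_derivative(spectra, wavelengths):
    """Finite-difference first derivative along the wavelength axis."""
    return np.gradient(spectra, wavelengths, axis=1)

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

# Synthetic spectra (assumption): one Gaussian band with random offset and gain.
rng = np.random.default_rng(1)
wavelengths = np.arange(1100, 2500, 2.0)  # nm
band = np.exp(-0.5 * ((wavelengths - 1700) / 40) ** 2)
offsets = rng.uniform(0.0, 0.5, size=(5, 1))
gains = rng.uniform(0.8, 1.2, size=(5, 1))
raw = offsets + gains * band + rng.normal(0, 1e-3, size=(5, wavelengths.size))

# Hypothetical window; the useful range has to be found for each application.
trimmed, kept = trim_range(raw, wavelengths, 1200, 2300)
processed = snv(first_derivative(trimmed, kept))
print(processed.shape)
```

Comparing the group separation before and after each step is what tells you whether a given transform is actually helping.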

I would certainly suggest using all your data (omitting appropriate outliers). I find that SIMCA is very insensitive to the difference in the group sizes.
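To illustrate why separate models are little affected by group size, here is a minimal sketch of one PCA model per group with the Mahalanobis distance computed in that group's own score space. The class name `GroupPCAModel` and the synthetic data are assumptions; a full SIMCA implementation would also use the spectral residuals, which this sketch leaves out.

```python
import numpy as np

class GroupPCAModel:
    """PCA model of one group, with Mahalanobis distance in its score space."""

    def __init__(self, X, n_factors=2):
        self.mean = X.mean(axis=0)
        Xc = X - self.mean
        # Rows of vt are the principal directions (loadings).
        _, _, vt = np.linalg.svd(Xc, full_matrices=False)
        self.loadings = vt[:n_factors].T
        scores = Xc @ self.loadings
        # Scores are uncorrelated, so the covariance is diagonal.
        self.score_std = scores.std(axis=0, ddof=1)

    def distance(self, X):
        scores = (np.atleast_2d(X) - self.mean) @ self.loadings
        return np.sqrt(((scores / self.score_std) ** 2).sum(axis=1))

# Synthetic data (assumption): 60 major and 8 minor samples, 20 variables,
# both generated from the same two latent directions.
rng = np.random.default_rng(2)
P = rng.normal(size=(2, 20))
major = rng.normal(size=(60, 2)) @ P + rng.normal(0, 0.01, size=(60, 20))
minor_latent = rng.normal(loc=[8.0, 0.0], scale=0.5, size=(8, 2))
minor = minor_latent @ P + rng.normal(0, 0.01, size=(8, 20))

major_model = GroupPCAModel(major, n_factors=2)
minor_model = GroupPCAModel(minor, n_factors=2)

d_major_own = major_model.distance(major)       # major samples vs. own model
d_minor_own = minor_model.distance(minor)       # minor samples vs. own model
d_minor_to_major = major_model.distance(minor)  # minor samples vs. major model
print("max own-model distance, major:", d_major_own.max())
print("max own-model distance, minor:", d_minor_own.max())
print("median minor-to-major distance:", np.median(d_minor_to_major))
```

Each group is scaled by its own spread, so adding or removing samples from the major group changes nothing in the minor group's model.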

With only 2 groups, you might obtain more straightforward results using PLS-DA. I think you can find good discussion threads on this method in the archives. PLS-DA is rather sensitive to the sizes of the groups, however, and this influences where you set your threshold. You would still want to optimize your wavelength selection and pretreatments.
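The group-size effect can be seen in a small sketch: with classes coded 0 and 1, the mean predicted value equals the fraction of samples in class 1, so a fixed cut at 0.5 is not neutral when the groups are unbalanced. The NIPALS-style `pls1_fit` below is a bare-bones illustration written for this example, not a validated PLS implementation, and the data and the midpoint threshold are assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Minimal PLS1 (single response) fit; returns coefficients and means."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)      # weight vector
        t = Xk @ w                     # scores
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        c = yk @ t / tt                # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - c * t                # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

# Synthetic, unbalanced data (assumption): 60 samples in A, 8 in B.
rng = np.random.default_rng(3)
n_a, n_b, n_vars = 60, 8, 20
shift = np.zeros(n_vars)
shift[:5] = 1.0
X = np.vstack([rng.normal(size=(n_a, n_vars)),
               rng.normal(size=(n_b, n_vars)) + shift])
y = np.concatenate([np.zeros(n_a), np.ones(n_b)])  # dummy coding: A=0, B=1

coef, x_mean, y_mean = pls1_fit(X, y, n_components=2)
pred = (X - x_mean) @ coef + y_mean

mean_a, mean_b = pred[y == 0].mean(), pred[y == 1].mean()
threshold = (mean_a + mean_b) / 2  # one possible alternative to a fixed 0.5
print("mean prediction:", pred.mean(), "vs. fraction in B:", n_b / (n_a + n_b))
print("group means of predictions:", mean_a, mean_b, "midpoint:", threshold)
```

Whatever threshold is chosen, it should be checked on samples that were not used to build the model.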

If you tell us more about your application, dry powders, liquids, tablets, etc, perhaps we can offer more specific suggestions.

Hope this helps.

Best regards,
Dave

Nuno Matos (nmatos)
Advanced Member
Username: nmatos

Post Number: 38
Registered: 2-2005

Posted on Monday, September 04, 2006 - 3:25 am:

I am constructing a discrimination model using the Mahalanobis distance in PC space (PCA). I have two main groups: one for which I have the majority of the samples, and a smaller group. As expected, some samples of this minor group do not readily fall below a distance of 3 (the 3 sigma limit).
Do you think that I should include fewer samples from the major group in the calibration set? Will I lose robustness? How many samples should I use?

Best regards
Nuno Matos