Validation methods (NIR Discussion Forum, Chemometrics)

Vincent Soudant (Soudant)
Posted on Monday, March 26, 2001 - 5:15 am:   

Dear all,

I am working on my first assignment on NIR and chemometrics, after attending the most useful training given by NIRP. In this assignment we try to find a correlation between the strength properties of wood pulp for paper production (reference method) and NIR spectra. We use the Unscrambler software for this purpose.
After treating the spectra by smoothing and calculating the 2nd derivative, I perform the PLS regression using leverage correction and find excellent regression coefficients (> 0.95). I can even produce a model using training data, do a prediction using separate validation spectra, and the model proves to be as accurate as the reference method.
However, if I use cross-validation, the whole calibration falls apart and the regression coefficient drops to < 0.7.
I am aware that leverage correction should only be used as a 'quick and dirty' initial validation method, but can anybody give a (possible) explanation for the above behaviour?

Regards,

Vincent Soudant

Tony Davies (Td)
Posted on Thursday, March 29, 2001 - 12:15 am:   

Dear Vincent,

Thank you for the plug for the NIRP course. We have another one scheduled for May 1-3!

I assume you mean correlation coefficient (r^2). Regression coefficients are the b's in the regression equation.

You are correct: the recommendation is not to use leverage correction! Modern PCs are powerful enough to do cross-validation all the time. But it is an interesting question as to why leverage correction found a good model. I think we need some more information. How many samples are in the sets? How many factors are you using?
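For readers not tied to the Unscrambler, here is a minimal sketch of that comparison in Python with scikit-learn (the data are random placeholders; leverage correction itself is not implemented, the plain calibration-set fit simply stands in for the optimistic estimate):

import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import r2_score

# X: pretreated spectra (samples x wavelengths), y: reference values
rng = np.random.default_rng(0)
X = rng.random((75, 200))     # placeholder data, for illustration only
y = rng.random(75)

pls = PLSRegression(n_components=10)
pls.fit(X, y)
r2_cal = r2_score(y, pls.predict(X))                   # fit to the calibration set: optimistic
y_cv = cross_val_predict(pls, X, y, cv=LeaveOneOut())  # full leave-one-out cross-validation
r2_cv = r2_score(y, y_cv)
print(f"calibration r2 = {r2_cal:.2f}, cross-validated r2 = {r2_cv:.2f}")

A large gap between the two numbers is exactly the symptom Vincent describes: the model fits the calibration set far better than it predicts left-out samples.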

Best wishes,

Tony

Vincent Soudant (Soudant)
Posted on Monday, April 09, 2001 - 5:04 am:   

Dear Tony,

Nice to hear from you again. I hope to meet you again in Korea! Well, we have some 75 samples and need 10-plus components.

Best regards,

Vincent

Thomas Frenzel
Posted on Tuesday, October 16, 2001 - 4:02 am:   

Dear All,

We are working on our first analytical method in the field of NIR spectroscopy. We are using a Perkin Elmer FT-NIR spectrometer for the determination of chemical constituents in rice flour. In order to evaluate our method we calculated the SECV and SEP values. However, in our opinion this is not enough to assess the applicability of the NIR method. For example, an SEP = 1 % should be sufficient if the contents of a rice constituent to be predicted range from 10 % to 50 %. On the other hand, an SEP = 1 % is not sufficient, if the contents range from 1 % to 3 %.
Our question is: which parameters are commonly used to assess an SEP/SECV value in relation to the range of contents expected/observed for the samples to be predicted? Is there any literature regarding this question?
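One statistic often used to put the SEP on a scale relative to the spread of the reference values is the RPD (standard deviation of the reference values divided by the SEP); a minimal sketch, assuming NumPy and illustrative variable names:

import numpy as np

def sep_and_rpd(y_ref, y_nir):
    resid = np.asarray(y_nir) - np.asarray(y_ref)
    sep = resid.std(ddof=1)                    # SD of the residuals (bias removed)
    rpd = np.asarray(y_ref).std(ddof=1) / sep  # larger is better
    return sep, rpd

An SEP of 1% gives a very different RPD for a constituent spanning 10-50% than for one spanning 1-3%, which makes the point raised above explicit.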

Thank you for your help.
Thomas and Beate

Peter Tillmann
Posted on Tuesday, October 16, 2001 - 7:15 am:   

Dear Thomas,
dear Beate,

It depends on the question you have to solve, i.e. the demands given to you by your customer. If there are none, be glad and use statistics.

Useful measures are:
An equation is good
- if the SEP is within two times the laboratory error (i.e. repeatability);

- if the R squared in validation is at least 0.95 (if you are a statistician), 0.7 (if you are in plant breeding) or 0.999 (if you are a real chemist).

To sum up: the quality of an equation is determined by its fitness for use.

If you speak German, get a copy of my book "Kalibrationsentwicklung für NIRS-Geräte"; it is intended to answer exactly these questions.

And if you are running a real-life application, use the SEP on an independent dataset; otherwise you are proving nothing for your real-life application, only for your particular dataset.


Yours

Peter Tillmann

[email protected]

hlmark
Posted on Tuesday, October 16, 2001 - 11:57 am:   

Thomas - Peter's advice is good, but I think a bit limited. The accuracy you can expect to get depends on many things: the accuracy of the reference lab, the accuracy of the instrument, the quality of the fit of the model to the data and others. Some rules of thumb are:

1) Even if the NIR instrument is predicting absolute truth, you will still see a difference between the NIR and the reference data because of the error of the reference lab method. Since that is usually unknown, and the accuracy of the NIR is almost always unknown beforehand, we use a value of 1.5 to 2 times the reference lab error. If the reference error is unknown, there are ways to estimate it from the calibration results.

2) The quality of the fit of the model to the data depends on many things: the range of the constituent, the accuracy (again, of both measurements), whether there is nonlinearity or other systematic differences, how well the instrument and measurement techniques are suited to the measurement you want to make, and extraneous variations (temperature, particle size (for solids), sample homogeneity, interferences, etc.). These get very complicated to try to determine, and except for some research applications, it usually takes too much time and effort to turn every routine application into a full-blown research study. So the models are evaluated empirically, using various statistics to tell us how well (or badly) the model is behaving.

There are many books dealing with evaluating calibration models. The one I usually recommend is the one by Draper and Smith: "Applied Regression Analysis", Wiley. Even though it does not deal explicitly with PCR and PLS, the methods they recommend for evaluating and judging the quality of a calibration can be applied to all the algorithms that are commonly used.

As Peter says, though, the quality of the model you can EXPECT to get will depend on the nature of the samples you are dealing with, how well (accurately) the analyte values are known, and the nature and magnitude of the other interfering phenomena that might be present. These you will have to judge for yourself, based on your own experience. You may be able to get some idea of what can be accomplished from the literature, by seeing what sort of results have been obtained from sample types similar to the ones you have. There's a good deal of information in the several published proceedings of the International NIR meetings, and also in NIR News and in the Journal of NIR Spectroscopy. NIR News also contains a listing of current published research related to NIR.

Hope this helps.

Howard

Yang LIU
Posted on Sunday, October 21, 2001 - 11:19 pm:   

Dear Thomas and Beate,

I have used the PE FT-NIR and Quant+ to build calibration models, and I quite agree with Howard.

In my experience, once you have set a strict procedure for sample preparation and scanning, the accuracy of the reference data can be one of the major error sources. The samples we have often cluster in the middle of the calibration range, while far fewer samples lie at the edges of the range. As for the reference method, its accuracy may not be the same at different constituent concentrations; lower concentrations often carry a larger reference error.

To assess the calibration, use an independent dataset, just as Peter suggested, but make sure the independent data represent the practical situation. To optimize your calibration, you can build two sub-calibrations, for high and low concentrations respectively, if needed; this may help real-world performance.

There are some good books too, such as Richard Kramer's "Chemometric Techniques for Quantitative Analysis" as a guide and Harald Martens' "Multivariate Calibration" at an advanced level. PE's user's reference for Quant+ is also quite informative about the methods they applied.

Hope this can help.

Yang LIU

william foley
Posted on Saturday, January 05, 2002 - 6:25 pm:   

Dear Colleagues

Under what conditions could the SEP be smaller than the SECV?

We had an MPLS model (using cross-validation) of food intake in koalas that yielded an SECV of 5.9 g/d (n=36, r2=0.92).

We have since done a validation using an independent sample set (n=18) and get an SEP of 3.6 g/d (r2=0.81).

Repeatability (= "laboratory error") for these types of measurements is of the order of 3-5 g/d.

It strikes me as a bit odd that the SEP is so much better than the SECV... am I missing something here?

Are there circumstances where would you expect SEP < SECV?

Any thoughts appreciated...
Bill Foley

David Hopkins (Hopkins)
Posted on Sunday, January 06, 2002 - 12:43 pm:   

William Foley,

You have the reverse of the problem most of us have, but I think it is good that you ran an independent validation set and did not trust the SECV alone. We generally expect that the SEP should be approximately equal to the SECV. There are two surprises here: that your SEP is so much smaller than the SECV, and that the R2 is somewhat smaller. The large R2 for the calibration suggests that you had a large range of food intake values, and a useful calibration was apparently achieved. However, the validation set apparently has a far narrower range and a tighter fit, despite the lower R2.

Therefore interesting questions arise. Did the lab analyses for the validation cover as much time or as many analysts? Perhaps the lab error is significantly smaller for some reason. Did the calibration set comprise various populations of samples, some of which are not included in the validation? Perhaps you can achieve tighter predictions by splitting the samples into identifiable classes that, when separately calibrated, yield SEPs of 3.5 g/d.

So, your unusual statistics are telling you about the errors in the measurements of the 2 sets of samples, and it is up to you to look to your samples and methods and interpret those results and what you will do about them. It sounds to me like an interesting problem.

Christopher Brown
Posted on Monday, January 07, 2002 - 10:58 am:   

Bill,

While we do expect the SECV to be close to the SEP, it's often a matter of what "close" means. I would echo the sentiments of David Hopkins above regarding the range and character of the two sample sets, but would also add that with only 18 samples in the independent sample set, the 95% CI on your SEP is somewhere around +/- 1.2 g/d, and the CI on your SECV is around +/- 1.3 g/d.

william foley
Posted on Tuesday, January 08, 2002 - 11:23 pm:   

David and Christopher,

Thanks for those helpful comments. You are right that the range of values in the validation set is smaller than in the calibration set - and it is very difficult to measure intake of tree leaves when very little is eaten (as was the case for some samples of the calibration set). So I suspect that there is certainly a greater error in the "laboratory measurements" of the calibration set, which is in line with your comments.

The other thing I am having difficulty with is getting the ecologists (at whom this work is aimed) to appreciate the concept of cross-validation - it is the 'validation' part of the word that sets them off! So when I write this up I am thinking of describing cross-validation as a model optimization procedure and a guard against overfitting, rather than validation in the strict sense of the word... as you know, we have to educate each new group that we take NIRS to!

Well, I'll probably try that line unless I hear any screams from you or others on the board here...!!

Bill

David Hopkins (Hopkins)
Posted on Wednesday, January 09, 2002 - 6:48 am:   

William,

Thanks for your 'validation' of our comments. I think your approach to using the SECV is good. It is a very useful 'suggestion' for the number of factors to use in a model. However, because of the uncertainty in the error values themselves, noted by Chris, it is sometimes possible to select 1 or 2 (or more) factors fewer and still not lose accuracy. This might achieve a calibration that you would judge, from other features of the calibration coefficients, to be less influenced by noise, and it might yield more robust calibrations for future use. A lot depends upon your future plans and the risk involved in bad results: how much effort you put into the selection, testing and refinement of your calibration.

Chris, would you share the method you used to calculate the uncertainty limits on the SECV, or a citation, with us?

I would be glad to offer you my thoughts on your upcoming write-up, if you are interested.

Sincerely,
Dave Hopkins
[email protected]

Christopher Brown
Posted on Wednesday, January 09, 2002 - 9:08 am:   

Bill,

Cross-validation is a troublesome misnomer for some folks, and I've often had success avoiding the term. I find it useful to make a stronger distinction between an "internal validation" (which can really only check the ruggedness of the model on the samples that were acquired) and an "external verification" (which, hopefully, will be an unbiased measure of the ability of the model to predict the population of samples at large).

Regarding Dave's question on the standard error of the SECV: the sampling distribution of the standard deviation is given by

std_sigma = sigma/sqrt(2*m)

where m is the error degrees of freedom. I think most statistical inference texts would have this sampling distribution in there somewhere. For a quick stab at a 95% CI on an SECV, then, you can use

SECV +/- sqrt(2/m)*SECV

(the two-tailed z critical value for 95% confidence is 1.96 (~2))
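A minimal sketch of this calculation (assuming Python/NumPy, and taking m simply as the number of samples, which slightly overstates the error degrees of freedom):

import numpy as np

def se_ci95(se, m, z=1.96):
    # 95% CI on a standard error, using std(se) ~= se / sqrt(2m)
    half = z * se / np.sqrt(2 * m)
    return se - half, se + half

print(se_ci95(5.9, 36))  # SECV of the koala model (n=36): approx (4.5, 7.3)
print(se_ci95(3.6, 18))  # SEP of the independent set (n=18): approx (2.4, 4.8)

The half-widths agree with the +/- 1.3 and +/- 1.2 g/d quoted above to within rounding.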

I think I can dig up a chem reference or two if anyone desires it. Drop me a note offline.

~ Chris
[email protected]

william foley
Posted on Thursday, January 10, 2002 - 1:32 am:   

Thanks again Chris and David,

This has been a most useful exchange. I've agreed to describe some of our activities for NIR News later in the year so watch this space!

Cheers,
b

maria Segara
Posted on Wednesday, September 11, 2002 - 11:59 pm:   

Hello!
I am following the EMEA "guideline on the use of NIR spectroscopy..." as a validation plan for a quantitative NIR/PLS method. However, something is not clear concerning the definition of the standard error of calibration SEC.
It is mentioned that SEC = square root of (residual sum of squares / (n-p)), with n = number of batches and p = number of coefficients used in the calibration model.
I agree with this formula, but then they mention that "It should be understood that in cases of direct calibration for a method like PLS the p-value should be considered being larger than the number of included coefficients."
I do not understand this sentence, can somebody give me a clue? Does p=number of PLS factors or not?
The EMEA document can be found at: http://www.emea.eu.int/pdfs/human/qwp/330901en.pdf
Thanks a lot for your help!
Maria

Christopher D. Brown
Posted on Thursday, September 12, 2002 - 9:14 am:   

Maria,

For linear (in the parameters) calibration techniques such as multiple linear regression or principal component regression, the formula cited here is valid (p will be either the number of wavelengths or the number of factors in the model). PLS does not happily take 1 d.o.f. per factor (usually it takes more than 1), and so the formula you've given tends to underestimate the SEC with PLS models.

There are numerous methods in the literature to approximate the true (non-integer) degrees of freedom in PLS models. I don't have the references off the top of my head, but if you want them drop me a note (or look for Phatak, Van der Voet, or Denham). That being said, we usually prefer to look at validation (or cross-validation) performance, which is free of this complication. As this document alludes to, since we usually don't really rely on the SEC as a measure of performance (i.e., it doesn't really matter), the (n-p) approximation is often satisfactory.
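For concreteness, the guideline's SEC formula as a minimal sketch (assuming NumPy; the variable names are illustrative), with the PLS caveat from above as a comment:

import numpy as np

def sec(y_ref, y_fit, p):
    """SEC = sqrt(RSS / (n - p)), per the guideline.

    p = number of coefficients (wavelengths for MLR, factors for PCR).
    For PLS each factor typically consumes more than one degree of
    freedom, so this value tends to underestimate the true SEC."""
    resid = np.asarray(y_ref) - np.asarray(y_fit)
    n = resid.size
    return np.sqrt(np.sum(resid**2) / (n - p))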

Chris

Su-Chin Lo (Suchin)
Posted on Thursday, September 12, 2002 - 9:30 am:   

Maria,

Additional comments here:

The EMEA guideline is a draft document and is under revision now. There are a lot of inadequate descriptions or statements regarding the NIR and chemometric issues. The PASG (Pharmaceutical Analytical Sciences Group) and EFPIA both gave good comments on this guidance. Regarding the SEC term as described in the note: it is satisfactory for MLR models, but not appropriate for PCR/PLS models. In chemometric terminology, n is the number of samples (or batches) and p should be the number of variables; for example, p is the number of wavelengths used in an MLR calibration, or the number of factors (latent variables) used in PCR/PLS. If the spectral data/reference values are mean-centered (a pre-processing step that is the default procedure in FOSS NIR VISION and Bruker OPUS Quant software), then p-1 should be used in place of p.

Therefore the sentence "It should be understood that in cases of direct calibration for a method like PLS the p-value should be considered being larger than the number of included coefficients." does not make sense here; the meaning of p needs redefining. The sentence that follows it should also be deleted (see the original PDF file).
Overall, the section on Calibration is problematic and needs redrafting, according to EFPIA's comments. In addition, just review the Glossary entry for Chemometrics, defined there as "Mathematics to compare data": you will find that this guidance uses non-standard terminology and should undergo a major re-drafting.

Su-Chin Lo

Miguel
Posted on Saturday, October 30, 2004 - 1:53 am:   

Hello

We are a forage testing laboratory and we are in the process of accreditation under ISO 17025. We have bought equations from FOSS Electric for different parameters, but we have to validate them. My questions are:

1. Which parameters must be included in the validation document for quality assurance?
2. How do we evaluate the method if we have no reference repeatability and reproducibility limits?
3. We have seen that some R-squared values are below 0.50. Can we accept such values?

Thanks

Pierre Dardenne (Dardenne)
Posted on Tuesday, November 09, 2004 - 6:34 am:   

Miguel,

Check the GHs. If the average GH is > 5, do call Foss: the models do not fit your product very well.

If the GH is OK (< 5):

You have to know the SCEV of the models.
Run 10 samples with known wet chemistry.
Compute the RMSEP and SEP (SEP = standard deviation of the residuals). With a probability of 95%, your SEP must be less than 1.30*SECV. If that is not the case, redo some wet chemistry for the largest residuals and re-evaluate.

If the bias is significant (when the bias is higher than 0.6*SECV), you can correct your model for the bias.

If the test is ok, run the routine samples for a while, but you have to check the models regularly.

Do not use correlation for a validation set. Correlation depends on the range and you can have perfect predictions with no correlation.

Scan the same sample every day over 2 or 3 weeks and you will have data to calculate the repeatability and reproducibility. But generally NIR is very repeatable relative to its accuracy.
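A minimal sketch of this acceptance check (assuming NumPy; the 1.30 and 0.6 factors are the ones given above, and the variable names are illustrative):

import numpy as np

def check_model(y_ref, y_nir, secv):
    resid = np.asarray(y_nir) - np.asarray(y_ref)
    bias = resid.mean()
    rmsep = np.sqrt(np.mean(resid**2))
    sep = resid.std(ddof=1)   # SD of the residuals (bias removed)
    print(f"RMSEP={rmsep:.3f}  SEP={sep:.3f}  bias={bias:.3f}")
    if sep > 1.30 * secv:
        print("SEP too large: redo wet chemistry for the largest residuals")
    if abs(bias) > 0.6 * secv:
        print("significant bias: consider a bias correction of the model")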

David W. Hopkins (Dhopkins)
Posted on Tuesday, November 09, 2004 - 12:43 pm:   

Pierre,

What is a GH?

I assume you mean SECV in your third paragraph?

Best regards,
Dave

Pierre Dardenne (Dardenne)
Posted on Tuesday, November 09, 2004 - 1:29 pm:   

Dave,

GH = Global H = Mahalanobis distance in Infrasoft International software package.

GH is calculated in such a way that the average H for the calibration set is always 1, whatever the number of objects and components (PC or PLS).
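A minimal sketch of that normalization (assuming NumPy and scores from a PCA or PLS decomposition; the exact ISI implementation may differ in detail):

import numpy as np

def gh(scores_cal, scores_new):
    # Squared Mahalanobis distance in score space, divided by the number
    # of components, so that the calibration-set average is close to 1
    k = scores_cal.shape[1]
    mu = scores_cal.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(scores_cal, rowvar=False))
    d = np.atleast_2d(scores_new) - mu
    d2 = np.einsum('ij,jk,ik->i', d, cov_inv, d)
    return d2 / k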

Miguel
Posted on Wednesday, November 10, 2004 - 1:37 am:   

Thanks Pierre

I am grateful for your help, but let me ask you: do you think that Foss equations, built for a varied forage matrix, are valid for a Spanish matrix? Is it recommended to make a new equation with a Spanish forage matrix? Is the average Mahalanobis distance (GH) a good indicator of a mismatched matrix?
Please recommend me a good book on this theme.


Thanks

Pierre Dardenne (Dardenne)
Posted on Wednesday, November 10, 2004 - 2:34 am:   

Miguel,

send me your email at
[email protected]

Erik Skibsted
Posted on Wednesday, November 10, 2004 - 3:27 pm:   

I would advise using the ASTM standards... it seems these will become the gold standard in the future, as the FDA is supporting ASTM, and from what I know there is a lot of 'drive' in the ASTM towards generating the best standards.


@ Vincent... nice to hear from you up in cold Finland :-) On that note, please bear in mind that biological materials require a lot more samples for calibration/validation than "pure" chemical samples. Secondly, Faber et al. published a very nice paper with suggestions for equations to correct your RMSECV/RMSEP/SEC/SEP (or whatever notation you use) when you know the error of your reference analysis.

Another rough estimate of the error from the NIR spectrometer can be obtained by recording e.g. 100 repeated spectra of your sample without moving the sample, i.e. so that the spectral variation is only due to detector noise... then predict the concentration/property, calculate some statistics of these results, and compare them to your SEP/SEC.
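A minimal sketch of that comparison (assuming the predictions from the repeat scans have already been collected; the file name is illustrative):

import numpy as np

# ~100 predictions of one undisturbed sample, one value per line
preds = np.loadtxt("repeat_scan_predictions.txt")
noise_sd = preds.std(ddof=1)   # spread attributable to instrument noise alone
print(f"instrument noise SD = {noise_sd:.4f}; compare with the model's SEP/SEC")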

Cheers
Erik Skibsted

Erik
Posted on Wednesday, November 10, 2004 - 3:31 pm:   

Oops, found the reference:

Klaas Faber and Bruce R. Kowalski, "Improved Prediction Error Estimates for Multivariate Calibration by Correcting for the Measurement Error in the Reference Values", Applied Spectroscopy, vol. 51, no. 5, 1997.

Out
Erik

Dennis Karl (Dennisk)
Posted on Wednesday, November 10, 2004 - 4:44 pm:   

To Miguel,
Forage analysis and your Spanish matrix.

Our experience with purchased equations is that they do not always represent the unique sample matrices you may have. We find in New Zealand that our sample matrices are not always represented in the Foss (or other suppliers') sample sets. Maybe for Spain this is the same situation. For some sample types, such as soya meal, which is traded internationally, the purchased equations are OK for starting, but you do have to keep monitoring and upgrading to factor in your own laboratory conditions, for example sample preparation.
When you come to material like forages, we found it was better to start from scratch and develop our own equations, as our forages would not be represented at all. This is because our grass species are largely unique to NZ and vary widely from region to region. There are also changes in the seasonal characteristics: spring forages tend to have fewer weed species present, whereas late summer forages are more rank and have mature weeds present. It takes several years and changes in seasons to get good performance out of your equations. We were also developing equations for wet forage, where we simply received the forages, as sampled, into the lab within 24 hours (or less) of cutting, and scanned them without further sample preparation using the Foss Natural Product Cell. These calibrations were remarkably good after a few years' work.

So the message is: try the purchased equations. They may be OK as a starting point, but you will need to constantly monitor them and add your own unique matrices back into the data set.

Also: we have achieved ISO 17025 for Analysis of a number of analytes by NIR. To achieve this we needed to demonstrate a rigorous approach to our Quality procedures including the wet chemistries behind the calibration development. Your technical staff also need to demonstrate good knowledge of the principles of calibration development.
Once you have your procedures documented, and are following them, it all becomes quite easy and routine.

Good luck
Dennis

Carlos Álvarez
Posted on Saturday, November 20, 2004 - 1:50 pm:   

Hello All,
I am really excited about this forum; I didn't know about it until today.
Well, I have been analyzing powders (minerals) by NIRS to quantify clays and micas for a year and a half. I made a PLS calibration with cross-validation. One of the constituents is in the range of 0.5 to 2% in the calibration samples, so I would like to know whether a prediction of 5% or more for an unknown sample is reliable. The M-distance is < 3. I analyzed the same sample by XRD and the result was around 18%.
On the other hand, I have been looking for information about the analysis of minerals by NIRS and there isn't much available, so I'll appreciate your help in that direction.
Thanks to all

Carlos.

hlmark
Posted on Saturday, November 20, 2004 - 2:58 pm:   

Carlos - I think that your own data (NIR value of 5% versus XRD value of 18%) shows that SOMETHING is not reliable. Possibly both values are not reliable, but I can't speak to the XRD value. Do you have any independent evidence about the accuracy of the XRD results?

It is a general rule, however, that extrapolation of NIR predictions beyond the range of the calibration data is risky, at best, and I would not consider the NIR value trustworthy - as your own data shows.

What I find strange, however, is that the Mahalanobis distance is less than 3, given how far outside the calibration range the sample in question lies. I wonder if your calibration model is in fact predicting anything reasonable at all, but that is not easy to determine by remote control.

One thing you might try to do is to get more samples like the one that is giving the high value. Then you can expand the range of your calibration, so that future samples will not be so questionable.

Ron Rubinovitz has been presenting information about the analysis of inorganic materials at several recent conferences, but I don't think he has published any of it yet. If you can get to the next Pittcon you may be able to talk to him about your calibration and ask his opinion as to whether the results are real or spurious.

Howard

\o/
/_\

Carlos Álvarez
Posted on Tuesday, November 23, 2004 - 10:05 am:   

Thanks, Howard. Expanding the range of the calibration is something I had already planned to do.
About your comments: do you mean that an M-distance > 3 is not reliable? Is it equally unreliable if the M-distance is 5, or 20, or 100? Is the threshold of 3 really so decisive? If so, why do you consider the prediction of 5% with M-distance <= 3 untrustworthy? As far as I understand, the M-distance is a measure, in terms of standard deviations, from the mean of the calibration samples to the unknowns; that is, it gives a statistical measure of how well the spectrum of the unknown sample matches, or does not match, the original calibration spectra.
Howard, could you share Ron Rubinovitz's e-mail with me, or tell me how to contact him? It will be difficult for me to be at the next Pittcon. Thanks again.
Carlos.

hlmark
Posted on Tuesday, November 23, 2004 - 11:08 am:   

Carlos - no, that's not what I'm saying. A Mahalanobis distance > 3 is a reliable indicator of a difference between the spectra of the samples. But that's not the situation you have, since your M-distance is not greater than 3. According to the M-distance, the spectrum of the new sample is essentially the same as the calibration samples, even though the XRD results say they have a different composition. Therefore, either the XRD results are not reliable and the sample really is the same as the calibration samples, or the NIR is not responding to the change in composition. The latter may be because of a poor model, or because of a lack of a distinctive absorbance of the analyte.

I'll send you Ron's e-mail address privately off the discussion group, since I don't know that he wants his address spread far and wide.

Howard

\o/
/_\

hlmark
Posted on Tuesday, November 23, 2004 - 11:11 am:   

Carlos - I tried to send you an e-mail message, but your e-mail address is not in your message. Please let me know what your e-mail address is.

\o/
/_\

Carlos Álvarez
Posted on Tuesday, November 23, 2004 - 1:13 pm:   

Thank you Howard. I appreciate your support.
I sent you an e-mail so you would know my e-mail address, and it is now available with this message.
