Reference vs Unknowns



Michael C Mound (mike)
Senior Member
Username: mike

Post Number: 32
Registered: 7-2007
Posted on Wednesday, September 26, 2007 - 9:45 am:   

Hi, Howard,

You didn't give a wrong impression. The "help wanted" I was jesting about was related to your comment that sometimes NIR needs a little help.

Me, well, I have never been famous for being shy, so am not going to begin to be so.

Thanks,

(Not so) Helpless in der Schweiz,

Mike

Howard Mark (hlmark)
Senior Member
Username: hlmark

Post Number: 154
Registered: 9-2001
Posted on Wednesday, September 26, 2007 - 8:32 am:   

Mike - I hope I didn't leave a wrong impression. Call for help as often as you need it, as loudly as required. The "whispering" is only so that "outsiders" don't get the wrong idea about NIR and decide not to try it when they should.

\o/
/_\

Michael C Mound (mike)
Senior Member
Username: mike

Post Number: 31
Registered: 7-2007
Posted on Wednesday, September 26, 2007 - 1:41 am:   

Howard,

Agreed.

Glad we are in the same love-fest. I promise to only whisper the "help wanted" when appropriate.

Best,

Mike

Howard Mark (hlmark)
Senior Member
Username: hlmark

Post Number: 153
Registered: 9-2001
Posted on Tuesday, September 25, 2007 - 12:32 pm:   

Mike - all of us monitoring this discussion board "love" NIR!

But sometimes it needs a little help

And sometimes (say this VERY quietly) you need something else, when the problem is beyond its capabilities. Ya gotta be realistic, too.

\o/
/_\

Michael C Mound (mike)
Senior Member
Username: mike

Post Number: 30
Registered: 7-2007
Posted on Tuesday, September 25, 2007 - 10:57 am:   

Howard,

I don't think so. Actually, the analogy is not with the beach sand as representing unconsolidated materials. My point (and I apologize for not emphasizing this) was that the consolidated material (i.e., the rock) may contain all or some of these components, which in fact give the taxonomic rock type its characteristic name, and the formation must include all of the shells, bitumens, flotsam, etc., somewhere within it.

My point (apologies to Archimedes and all them Greeks who were spared) is that the characterization of the rock is what is at issue, and it is simply not practical to prepare samples if you are interested in a rapid analysis. Otherwise, there are plenty of alternative technologies that would work just fine. I am in love with NIRS, and will not surrender easily...(said Don Quixote).

Thanks, though, as your insights are always appreciated.

Mike

Howard Mark (hlmark)
Senior Member
Username: hlmark

Post Number: 152
Registered: 9-2001
Posted on Tuesday, September 25, 2007 - 8:51 am:   

Michael - I think you're missing a key point, and I'm somewhat surprised that the other respondents didn't mention it. The problem you're having is not unique to geology, nor even to spectroscopic analysis.

On the other hand, NIR in particular has grown under an implicit paradigm of avoiding sample preparation. In some cases that's a requirement imposed by the application; in other cases it's done simply for ease and convenience. And because it so often works!!

Historically, however, chemical analysis has virtually always been done in the face of interferences of all sorts, going back at least to Archimedes' need to analyze the king's crown for its gold content in the face of a possible unknown alloying metal. In Archimedes' case he was able to figure out a method that didn't require any "sample preparation" (a good thing for him, too, I think!).

But in general, sample preparation has often been a prerequisite for many forms of analysis, including spectroscopy.

Taking the case of your beach sand, for example, running the sample through a screen would remove most or all of the shells, rocks and other extraneous materials. If you had an interest in the non-sand portions, you could separate those further afterward, until you had reduced them to the sample you wanted.

So while we tend to avoid using sample preparation methods, there are situations where some sample pretreatment may be necessary and desirable. Obviously, as is normally the case, whatever you're going to do for analysis has to be done on the calibration samples as well, so you'll have to find a good compromise between the time and effort to prepare the samples and the time saved through the use of NIR.

\o/
/_\
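
To make the last point above concrete: whatever pretreatment is applied to the unknowns must also be applied to the calibration samples. Below is a minimal Python/NumPy sketch, with SNV (standard normal variate) standing in for whatever pretreatment is actually chosen; the array names and sizes are illustrative assumptions only, not anyone's real data.

# A minimal sketch, assuming spectra are held in plain NumPy arrays.
# The key point is that the SAME function is applied to the calibration
# spectra and to the unknowns.
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std

# Illustrative stand-ins for real data: 20 calibration and 5 unknown spectra.
rng = np.random.default_rng(0)
calibration_spectra = rng.random((20, 700))
unknown_spectra = rng.random((5, 700))

calibration_pretreated = snv(calibration_spectra)  # used to build the model
unknown_pretreated = snv(unknown_spectra)          # used for prediction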

Michael C Mound (mike)
Senior Member
Username: mike

Post Number: 29
Registered: 7-2007
Posted on Tuesday, September 25, 2007 - 7:16 am:   

Hi, Dave and Peter,

First of all, thanks for the recommended references. Most interesting.

From the mineralogical (i.e., exploration and exploitation) point of view, it is simply neither possible nor practical to obtain the large number of samples that other applications call for to validate a reference set. The reason is purely pragmatic: it is unlikely that one could secure a satisfactorily comfortable array of samples. Imagine that you are walking along a beach, and, as you stroll along (in a straight line), you stop every ten feet or so and scoop up a bucketful of sand and its contents. Sometimes you will have some starfish, sometimes not. Sometimes, you will have some oyster shells, or clam shells, or snail shells, and...sometimes not. Occasionally, you will have some tarry materials, various flotsam, and a mixture of silt, mud, evaporated salts of one sort or another, some zeolitic materials, some heavy minerals like rutile or illite, weathered granitic materials, shaly particles, lots of quartz, micaceous fragments, plenty of sand crabs, assorted woody fibers, seaweed, etc...you get my drift (pun intended)?

Now imagine that this kind of assorted biota, mineral matter, bitumens and various carbonaceous bric-a-brac, etc., becomes a sedimentary rock deposit, and your task is to characterize its mineralogy and chemistry. How would you propose to provide the "right" number of samples to validate a possibly spurious reference set?

The answer, of course, is, "it depends".

In actual practice, what is done is to "channel sample", i.e., to randomly select a pattern of bottoms and tops of the rock stratum in question and chip from bottom to top in a "channel" so as to get 100% of the vertical dimension between the rock stratum upper and lower boundaries. This is tantamount to coring, but is done where beds are exposed.

In spite of these seemingly insurmountable barriers to characterization, the science of this sampling technique works fairly well, as colleagues of mine, as well as myself, have made some interesting economic discoveries in both petroleum and mineralogical work. NIR in a couple of incarnations is used pretty much this way for mining work.

Though I cannot claim that this is the best way to do things, it is actually amazing to note how successfully one can manage with a paucity of good standards to work with.

Mike

David W. Hopkins (dhopkins)
Senior Member
Username: dhopkins

Post Number: 121
Registered: 10-2002
Posted on Monday, September 17, 2007 - 10:37 am:   

Michael,

That is a question that is likely to generate some good discussion. In fact, you will find a lot of discussion in the Discussion Group, if you search on "Validation" with the Utilities option. I recommend it.

Like Peter, my feelings on validation have changed over the years, but I highly recommend that a model should be tested with a validation set when you derive it, and with test samples on a routine basis when you are using it.

When you derive it, I think you should set aside a reasonable number of samples as a validation or test set before you even do the calibration. The number depends upon the total number of samples you have available. If you are able to do a DOE (design of experiments) using a small number of samples, you are fortunate; then you may use a small number of samples for the test. It will then serve as a sanity check that there have been no bad errors in your work. The number of samples should be chosen so that the statistics are meaningful and not sensitive to random errors. So that could be anywhere from 3 to 10 samples, again I stress, if you are using a limited number of samples for the calibration.

If you are like many of us, and cannot make up samples for a DOE, but must use "natural samples" that are available, the calibration may require many more samples, such as 20 to 100 or more samples. In that case, I like to use a number of validation samples that is roughly equivalent to the number of calibration samples, so that the statistics will have the same reliability.

If I recall, you are working with geological samples, so where on that spectrum of available samples do you fall?

Best wishes,
Dave
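
To make Dave's hold-out suggestion concrete, here is a minimal Python/NumPy sketch. The data are simulated stand-ins, and the sizing rule simply encodes his rough guidance (a handful of held-out samples for a small designed set, a validation set roughly as large as the calibration set for larger collections of natural samples); it is a sketch under those assumptions, not a prescription.

# A minimal sketch of the hold-out split described above, assuming the
# spectra (X) and reference values (y) are already in NumPy arrays.
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-ins for real data: 60 samples, 100 wavelengths.
X = rng.random((60, 100))
y = rng.random(60)

n = len(y)
# Rough guidance: ~half held out for larger "natural" sample sets,
# 3 to 10 held out when only a small designed set is available.
n_val = n // 2 if n >= 20 else min(10, max(3, n // 4))

idx = rng.permutation(n)
val_idx, cal_idx = idx[:n_val], idx[n_val:]

X_cal, y_cal = X[cal_idx], y[cal_idx]   # used to derive the model
X_val, y_val = X[val_idx], y[val_idx]   # set aside before calibration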

Peter Tillmann (tillmann)
Junior Member
Username: tillmann

Post Number: 11
Registered: 11-2001
Posted on Monday, September 17, 2007 - 5:07 am:   

Since this touches on the question of external validation, and implicitly cross-validation, I can strongly recommend looking at:

Forina et al. (1994), Analytica Chimica Acta, 295, 109 ff.

Martens and Dardenne (1997), Chemometrics and Intelligent Laboratory Systems, 44, 99 ff.

One Question:
How to validate calibration models?

Two approaches:
one case study and one Monte Carlo study

and one clear answer.


Peter Tillmann

P.S. Until recently I was praising independent external validation with new, unknown samples. I don't do it anymore.
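
For readers wondering what the Monte Carlo (repeated random split) style of validation compared in those papers looks like in practice, here is a rough Python/NumPy sketch. The principal-component regression below is only a stand-in for whatever calibration method is really used, and all of the data are simulated.

# A rough sketch of Monte Carlo validation: many random calibration/test
# splits, each giving an RMSEP, summarized at the end.
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((80, 120))                                   # simulated spectra
y = X[:, :5].sum(axis=1) + 0.05 * rng.standard_normal(80)   # simulated reference values

def pcr_predict(X_cal, y_cal, X_test, n_pc=5):
    """Fit a principal-component regression on the calibration set; predict the test set."""
    x_mean, y_mean = X_cal.mean(axis=0), y_cal.mean()
    Xc = X_cal - x_mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_pc].T
    coefs = np.linalg.lstsq(scores, y_cal - y_mean, rcond=None)[0]
    return (X_test - x_mean) @ Vt[:n_pc].T @ coefs + y_mean

rmseps = []
for _ in range(200):                       # many random calibration/test splits
    idx = rng.permutation(len(y))
    test, cal = idx[:20], idx[20:]
    pred = pcr_predict(X[cal], y[cal], X[test])
    rmseps.append(np.sqrt(np.mean((pred - y[test]) ** 2)))

print(f"Monte Carlo RMSEP: {np.mean(rmseps):.3f} +/- {np.std(rmseps):.3f}")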

Michael C Mound (mike)
Senior Member
Username: mike

Post Number: 28
Registered: 7-2007
Posted on Sunday, September 16, 2007 - 6:01 am:   

Folks:

As a practical matter, what is the general opinion on the number of "unknown" (i.e., "blind") samples, relative to the total number of known samples (i.e., reference standards) used to create a model, that is sufficient to convincingly validate the model and so proceed to use it?

That is, to state it differently, regardless of the program or method used for validation, is there a "magic" practical minimum number of unknowns to use relative to the number of knowns that created the model in the first place so as to confidently confirm the utility and robustness of the model?

It is assumed that the selection of the unknowns is within the range of the PCs defined in the model. This means that sufficient care has been taken not to include "blinds" in the test sequence that could be considered outliers.

Your opinions welcome and appreciated.
