NIR Discussion Forum: Indicator variables and PLS

Alisha (agnosus)
Junior Member
Username: agnosus

Post Number: 7
Registered: 1-2009

Posted on Tuesday, November 23, 2010 - 10:42 am:

A very helpful response indeed.
Thank you very much Howard.

Howard Mark (hlmark)
Senior Member
Username: hlmark

Post Number: 363
Registered: 9-2001

Posted on Tuesday, November 23, 2010 - 8:59 am:

Alisha - that's a little tricky to answer. It's true that the indicator variables will accommodate the bias differences between the different groups of samples; nevertheless, it's not the same as simply doing a bias correction.

The reason is that when you include indicator variables in the calibration (if you can do that properly) then the coefficients you calculate (and the factors, in the case of PLS) will be different than the ones you would calculate in the absence of the indicator variables. That is because, in the absence of the indicator variables, the model has to minimize the Sum Squared Error (SSE) that INCLUDES the contribution from the inter-group bias. If you include the indicator variables then the model only has to accommodate the other, presumably more fundamental variations of the data.
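As a rough illustration of that point (not taken from the thread), here is a minimal pure-Python sketch for the simplest MLR case: one predictor and two groups of samples. The data values, the +3 offset for the second group, and the helper names `slope` and `center_within_groups` are all made up for the example. It relies on the fact that, for ordinary least squares, fitting a single 0/1 indicator alongside x gives the same coefficient for x as centering x and y within each group first. It says nothing about how to do this in PLS.

```python
# Toy data (made-up numbers): y = 2*x for group A, y = 2*x + 3 for group B.
xs = [1, 2, 3, 4, 3, 4, 5, 6]
group = [0, 0, 0, 0, 1, 1, 1, 1]  # indicator variable: 0 = group A, 1 = group B
ys = [2.0 * x + 3.0 * g for x, g in zip(xs, group)]


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Least-squares slope of y on x for a model with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def center_within_groups(values, g):
    """Subtract each group's own mean from its members."""
    means = {k: mean([v for v, gk in zip(values, g) if gk == k]) for k in set(g)}
    return [v - means[gk] for v, gk in zip(values, g)]


# Without the indicator, the slope has to absorb part of the group offset.
slope_without = slope(xs, ys)

# With the indicator, only the within-group variation determines the slope.
slope_with = slope(center_within_groups(xs, group),
                   center_within_groups(ys, group))

print(round(slope_without, 4), round(slope_with, 4))  # 2.6667 2.0
```

In this toy case the fit without the indicator gives a slope of about 2.67 instead of the 2.0 used to generate the data, because the group offset happens to be correlated with x.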

If you don't include the indicator variables, then at the very least, you would have to include more factors (for PLS) or wavelengths (for MLR) to try to get a model that will give the same performance, and that will make the model development process more subject to overfitting.

The bottom line is, of course, as it always is, that you should try it out and see if you can get satisfactory results by simply bias-correcting the different subsets of the samples. If so, then you may be able to dispense with the indicator variables.
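One way to "try it out" on a small scale is sketched below in pure Python, under stated assumptions: a single predictor, two groups, made-up noise-free data in which the second group is offset by +3, and hypothetical helper names (`fit_line`, `bias_correct`). It fits one pooled line that ignores the groups, then subtracts each group's mean residual as a bias correction, and compares the SSE before and after.

```python
# Toy data (made-up numbers): y = 2*x for group A, y = 2*x + 3 for group B.
xs = [1, 2, 3, 4, 3, 4, 5, 6]
group = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = group A, 1 = group B
ys = [2.0 * x + 3.0 * g for x, g in zip(xs, group)]


def mean(values):
    return sum(values) / len(values)


def fit_line(x, y):
    """Ordinary least-squares intercept and slope of y on x."""
    mx, my = mean(x), mean(y)
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    return my - b1 * mx, b1


def bias_correct(residuals, g):
    """Subtract each group's mean residual; return new residuals and biases."""
    bias = {k: mean([r for r, gk in zip(residuals, g) if gk == k])
            for k in set(g)}
    return [r - bias[gk] for r, gk in zip(residuals, g)], bias


# Pooled model: one line for all samples, no indicator variable.
b0, b1 = fit_line(xs, ys)
raw = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Per-group bias correction applied after the fact.
corrected, bias = bias_correct(raw, group)

sse_raw = sum(r * r for r in raw)
sse_corrected = sum(r * r for r in corrected)
print(round(sse_raw, 4), round(sse_corrected, 4))  # 10.0 4.4444
```

Here the bias correction reduces the SSE from 10 to about 4.44 but does not remove it, because the pooled slope was already distorted by the offset. Since this toy data is exactly 2x plus a group offset, a fit that includes the indicator would leave no residual at all. Whether the gap matters on real spectra is exactly what has to be checked case by case.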

\o/
/_\

Alisha (agnosus)
Junior Member
Username: agnosus

Post Number: 6
Registered: 1-2009

Posted on Tuesday, November 23, 2010 - 4:19 am:

Thank you, Howard, for taking the time to reply.
Back to the question of the usefulness of indicator variables in PLS: am I correct in saying that 'bias correction' (adding a constant to, or subtracting one from, a model) is an easy alternative to using indicator variables, i.e. instead of having one PLS model with a few indicator variables we can have a few PLS models, one for each indicator variable? Isn't that the case for MLR as well?

BTW, these questions came to my mind after reading Don Burns's chapter on indicator variables in the Handbook of NIR analysis.

Thanks.
Alisha

Howard Mark (hlmark)
Senior Member
Username: hlmark

Post Number: 362
Registered: 9-2001

Posted on Friday, November 19, 2010 - 3:43 pm:

Alisha - in principle, indicator variables would work with PLS as with PCR or MLR. The problem is that they would have to be introduced into the calculations in a different way than the data is. Therefore, they can't just be included among the spectral data, so it would require specially-written software to introduce them into the calculations properly.

There are also some theoretical issues, in that they wouldn't have exactly the same meaning as in MLR or PCR. This may be moot, however, since I don't know if you'll be able to find software to do it.

\o/
/_\

Alisha (agnosus)
New member
Username: agnosus

Post Number: 5
Registered: 1-2009

Posted on Monday, November 15, 2010 - 8:04 am:

Dear all,

I have a problem extending the concept of 'indicator variables' to factor-based models (e.g. PLS).
Questions:
1. Is it useful/meaningful to do so?
2. Do I need to weight (scale) the indicator variable differently?
3. How should I test the significance of an indicator variable in PLS?

I would appreciate it if anyone could point me to a reference/article on this topic.

Thanks.

Alisha