How to infer from PCA

NIR Discussion Forum » Bruce Campbell's List » I need help » How to infer from PCA


Ian Michael (admin)
Board Administrator
Username: admin

Post Number: 32
Registered: 1-2006
Posted on Thursday, June 02, 2011 - 7:50 am:   

I'm sure others will come in with more direct help, but a few papers on bacteria have been published in JNIRS. See http://www.impublications.com/content/journal-search?terms=bacteria&Search=Search&fields=Abstract%2CAuthors%2CKeywords%2CTitle&searchJournal=jnirs I hope they are of some help/interest.

mathews m john (mathews)
New member
Username: mathews

Post Number: 2
Registered: 6-2011
Posted on Thursday, June 02, 2011 - 7:45 am:   

@gustavo: thanks for your reply. But is there some kind of library I can refer to for identifying my bacteria?
I understand that from the spectra I can infer which lipids, proteins, etc. are present. But almost all bacteria have these. Can someone help?

Gustavo Figueira de Paula (gustavo)
Senior Member
Username: gustavo

Post Number: 26
Registered: 6-2008
Posted on Thursday, June 02, 2011 - 7:24 am:   

Mathews,

I must add to my previous post. Why are LDA, LR or PLS more powerful? Because they take into account information that YOU have, and which you must give to the algorithm in the calibration phase: the actual classification of each sample. This is the way the machine "learns".

If you say that some samples are from bacteria A and some are from bacteria B, these methods will try to find linear combinations of variables that maximize the distinction between A and B, ignoring undesirable information, simply because YOU have told them which separation is the one to be evaluated.

The extra power compared to PCA comes just from this additional information, which makes all the difference... PCA doesn't know a priori how the data must be separated. It just evaluates variance. These other methods, on the other hand, know what the expected separation is, so they can search for an optimal combination of variables to maximize the RIGHT separation.
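To make the contrast concrete, here is a minimal sketch using only the Python standard library. It assumes synthetic two-variable data (not real NIR spectra) in which one variable carries large nuisance variation and the other carries the class difference. All function names and numbers are illustrative assumptions, and Fisher's discriminant stands in here for the LDA family.

```python
import math
import random


def mean2(rows):
    """Mean of a list of (x, y) pairs."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def scatter2(rows):
    """Sums of squared deviations from the mean, as (sxx, sxy, syy)."""
    mx, my = mean2(rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    syy = sum((y - my) ** 2 for _, y in rows)
    return sxx, sxy, syy


def pca_direction(rows):
    """First principal axis: uses no class labels, only total variance."""
    sxx, sxy, syy = scatter2(rows)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.cos(angle), math.sin(angle)


def fisher_direction(class_a, class_b):
    """Fisher discriminant direction: needs to know which sample is A or B."""
    axx, axy, ayy = scatter2(class_a)
    bxx, bxy, byy = scatter2(class_b)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy  # within-class scatter
    (a_mx, a_my), (b_mx, b_my) = mean2(class_a), mean2(class_b)
    dx, dy = b_mx - a_mx, b_my - a_my
    det = sxx * syy - sxy * sxy
    # w = Sw^-1 (mean_B - mean_A), with the 2x2 inverse written out.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return wx / norm, wy / norm


def separation(class_a, class_b, direction):
    """Distance between projected class means, in pooled standard deviations."""
    wx, wy = direction
    pa = [wx * x + wy * y for x, y in class_a]
    pb = [wx * x + wy * y for x, y in class_b]
    ma, mb = sum(pa) / len(pa), sum(pb) / len(pb)
    va = sum((p - ma) ** 2 for p in pa) / (len(pa) - 1)
    vb = sum((p - mb) ** 2 for p in pb) / (len(pb) - 1)
    return abs(mb - ma) / math.sqrt((va + vb) / 2.0)


random.seed(0)
# Variable 0: large nuisance variation shared by both classes.
# Variable 1: small variation, but its mean depends on the class.
class_a = [(random.gauss(0.0, 3.0), random.gauss(0.0, 0.3)) for _ in range(100)]
class_b = [(random.gauss(0.0, 3.0), random.gauss(1.0, 0.3)) for _ in range(100)]

pc1 = pca_direction(class_a + class_b)
fisher = fisher_direction(class_a, class_b)
sep_pca = separation(class_a, class_b, pc1)
sep_fisher = separation(class_a, class_b, fisher)

print("PC1 direction:   ", pc1, "separation:", round(sep_pca, 2))
print("Fisher direction:", fisher, "separation:", round(sep_fisher, 2))
```

In this toy setup PC1 follows the high-variance nuisance variable, while the supervised direction follows the variable that actually distinguishes A from B. Real spectra have many more variables, so a proper library implementation would be the practical choice.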

Gustavo Figueira de Paula (gustavo)
Advanced Member
Username: gustavo

Post Number: 25
Registered: 6-2008
Posted on Thursday, June 02, 2011 - 7:14 am:   

Mathews,

PCA will rotate your coordinate space to offer the maximum separation based on the variance of your data. In other words, it will show you the linear transformation (a rotation of the axes) whose first axis captures the maximum variance.
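The rotation idea can be sketched in two dimensions with only the Python standard library. This is an illustration under assumptions: the data are synthetic, the function names are made up for this example, and the closed-form angle used here only applies to the two-variable case (real spectra need a general eigendecomposition or SVD).

```python
import math
import random


def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def pca_rotate_2d(points):
    """Centre 2-D points and rotate them onto their principal axes.

    Returns (angle_in_radians, rotated_points); the first rotated
    coordinate is PC1, the direction of maximum variance.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centred) / (n - 1)
    syy = sum(y * y for _, y in centred) / (n - 1)
    sxy = sum(x * y for x, y in centred) / (n - 1)

    # Closed-form angle of the first principal axis in two dimensions.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(angle), math.sin(angle)
    rotated = [(c * x + s * y, -s * x + c * y) for x, y in centred]
    return angle, rotated


random.seed(0)
# Synthetic data: two correlated variables (stand-ins for two absorbances).
points = []
for _ in range(200):
    t = random.gauss(0.0, 1.0)
    points.append((t + random.gauss(0.0, 0.2), 0.5 * t + random.gauss(0.0, 0.2)))

angle, rotated = pca_rotate_2d(points)
var_before = (sample_variance([x for x, _ in points]),
              sample_variance([y for _, y in points]))
var_after = (sample_variance([p for p, _ in rotated]),
             sample_variance([q for _, q in rotated]))

print("rotation angle (degrees):", round(math.degrees(angle), 1))
print("variance per axis before:", var_before)
print("variance per axis after: ", var_after)
```

The total variance is the same before and after; the rotation only redistributes it so that the first new axis holds as much as possible and the two new axes are uncorrelated.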

Buuuut... are you sure the variance is due to the property you are studying (bacterial composition)? Could the variance be due to uncontrolled aspects like temperature, sample positioning error, or small changes in sample preparation? PCA will not answer that easily.

Linear methods are great, reasonably easy to understand, and should be used as a first approach (I'm sure you know what KISS means). I *LOVE* univariate linear regression; although it's almost useless in NIR spectroscopy, it can give you useful insights. Try plotting the linear correlation against wavelength to see what happens (give a number to each class you have); sometimes you get surprises...
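A rough sketch of that correlation-versus-wavelength curve, using only the Python standard library. The spectra are simulated, and the wavelength grid and the 1700 nm "informative" band are assumptions chosen purely for illustration; with real data you would load your own spectra and class codes instead.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(1)
wavelengths = list(range(1100, 2501, 100))  # nm; hypothetical grid
informative_band = 1700                     # nm; assumed, for illustration only
labels = [0] * 20 + [1] * 20                # class A -> 0, class B -> 1

# Simulated spectra: flat baseline plus noise, with a class-dependent
# offset at one band only.
spectra = []
for label in labels:
    spectrum = []
    for wl in wavelengths:
        absorbance = 0.5 + random.gauss(0.0, 0.02)
        if wl == informative_band:
            absorbance += 0.1 * label
        spectrum.append(absorbance)
    spectra.append(spectrum)

# One correlation coefficient per wavelength: absorbance versus class code.
correlations = [
    pearson([spectrum[j] for spectrum in spectra], labels)
    for j in range(len(wavelengths))
]

# Crude text "plot" of |r| against wavelength.
for wl, r in zip(wavelengths, correlations):
    bar = "#" * int(round(abs(r) * 40))
    print(f"{wl} nm  r = {r:+.2f}  {bar}")

best = max(range(len(wavelengths)), key=lambda j: abs(correlations[j]))
print("strongest correlation at", wavelengths[best], "nm")
```

Numeric class codes only make sense this way for two classes (or ordered ones); with several unordered classes, a one-class-versus-the-rest coding per curve is a common workaround.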

More powerful linear methods are Linear Discriminant Analysis, Logistic Regression and Partial Least Squares classification. I like logistic regression because it is simple to understand, gives reasonably good classification results, and the resulting equation is easy to implement in dedicated software.
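As an illustration of how small the resulting equation is, here is a hedged sketch of a two-variable logistic regression fitted by plain gradient descent with the Python standard library. The data are synthetic (two made-up absorbance values per sample), and the function names, learning rate and epoch count are assumptions for this example; in practice an established package would do the fitting.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def probability_of_b(x, weights, bias):
    """The fitted equation: p(B) = sigmoid(bias + sum of weight * variable)."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(rows, labels, learning_rate=1.0, epochs=1000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = probability_of_b(x, weights, bias) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


random.seed(0)
# Synthetic calibration set: class A coded 0, class B coded 1.
rows, labels = [], []
for _ in range(40):
    rows.append((random.gauss(0.4, 0.05), random.gauss(0.6, 0.05)))
    labels.append(0)
for _ in range(40):
    rows.append((random.gauss(0.6, 0.05), random.gauss(0.4, 0.05)))
    labels.append(1)

weights, bias = fit_logistic(rows, labels)
predictions = [1 if probability_of_b(x, weights, bias) >= 0.5 else 0 for x in rows]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

print("weights:", weights, "bias:", bias)
print("training accuracy:", accuracy)
```

Once fitted, only `weights` and `bias` need to be carried into the dedicated software; `probability_of_b` is the whole model. The accuracy printed here is on the calibration samples themselves, so it says nothing about performance on new samples.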

As an additional suggestion, try free software like Weka; it implements a lot of methods (linear and non-linear, for classification and regression).

mathews m john (mathews)
New member
Username: mathews

Post Number: 1
Registered: 6-2011
Posted on Thursday, June 02, 2011 - 5:19 am:   

I am new to NIRS. I have read a couple of papers on the detection and identification of bacteria using NIRS.
I do not entirely understand the use of PCA and SIMCA analysis. From what I could understand, these methods group closely related data together. But how can I infer what the data points represent?
