
No neighbourhood H (NH) values given in WinISI


Mathieu Jourdain (mathieu_jourdain)
New member
Username: mathieu_jourdain

Post Number: 2
Registered: 6-2010
Posted on Friday, April 12, 2013 - 12:18 am:   

Hi Tim,

The way to select samples according to NH in WinISI has already been given:
1. Create the scores and loadings files with "Create a score file from a spectral file". You can restrict the spectral range under "Options", as Pierre Dardenne said.
2. Remove any outliers.
3. Run "Select sample from spectra file", with your *.cal file as the input filename and your *.pca file as the loading filename.
4. Adjust the NH cut-off empirically according to the number of samples you want in the training/validation set.
As the cut-off value increases, fewer samples are selected.
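
To illustrate why a larger cut-off selects fewer samples, here is a minimal sketch of a greedy neighbourhood selection. It is not WinISI's implementation: the function name, the score values and the use of plain Euclidean distance on already standardised scores are all assumptions made for illustration.

```python
import math

def select_by_neighbourhood(scores, cutoff):
    """Greedy selection: keep a sample only if it lies farther than
    `cutoff` from every sample that has already been kept."""
    selected = []
    for i, s in enumerate(scores):
        if all(math.dist(s, scores[j]) > cutoff for j in selected):
            selected.append(i)
    return selected

# Hypothetical two-component PCA scores, assumed already standardised.
scores = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 0.2), (3.0, 3.0)]

for cutoff in (0.05, 0.5, 2.0):
    print(cutoff, select_by_neighbourhood(scores, cutoff))
# 0.05 [0, 1, 2, 3, 4]
# 0.5 [0, 2, 4]
# 2.0 [0, 4]
```

Samples that fall within the cut-off of an already selected sample are treated as its neighbours and skipped, so widening the cut-off thins out the selection.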

However, splitting the data this way may not be the best choice. There are some discussions on this subject on this forum.
From what I could read (and understand, I hope), it may be preferable to sort the samples by time and date of analysis (the default in WinISI) and move the last N samples to the test set.
Even if you're not sure the validation samples belong to the experimental domain of your CAL set (as is the case with NH selection), they should be completely independent of your training set.
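
The time-ordered split can be sketched as follows. The record layout (`id`, `scanned_at`) and the sample values are hypothetical; in practice the scan times would come from the cal file.

```python
from datetime import datetime

def split_by_time(samples, n_test):
    """Sort samples by scan time and hold out the last n_test as a test set."""
    if not 0 <= n_test <= len(samples):
        raise ValueError("n_test must be between 0 and the number of samples")
    ordered = sorted(samples, key=lambda s: s["scanned_at"])
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]

# Hypothetical sample records, deliberately out of order.
samples = [
    {"id": "S3", "scanned_at": datetime(2013, 3, 9, 14, 5)},
    {"id": "S1", "scanned_at": datetime(2013, 1, 15, 9, 30)},
    {"id": "S4", "scanned_at": datetime(2013, 4, 2, 11, 0)},
    {"id": "S2", "scanned_at": datetime(2013, 2, 20, 16, 45)},
]

calibration, test = split_by_time(samples, n_test=1)
print([s["id"] for s in calibration])  # ['S1', 'S2', 'S3']
print([s["id"] for s in test])         # ['S4']
```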

NH selection in WinISI is especially useful in two cases:
* you have a large library of samples scanned by NIR and you want to select a judicious subset for reference analysis and model development.
* you have already developed a model and you want to identify "samples of interest" among your routine analyses. You can then analyse them in the laboratory, merge them into your calibration database and update your model to improve its robustness.

Regards
Mathieu

Tim van der Weijde (timw)
New member
Username: timw

Post Number: 3
Registered: 4-2013
Posted on Thursday, April 11, 2013 - 4:29 pm:   

Hello everyone,

I could never have imagined being helped so well and so fast. Thanks a million for all the helpful comments. I have now achieved what I wanted and learned a lot more about how to validate my calibrations and how to select samples for a training set from spectral data.

I will surely keep this forum in my favorites!

All the best and see you around!

Tim

Pierre Dardenne (dardenne)
Senior Member
Username: dardenne

Post Number: 80
Registered: 3-2002
Posted on Thursday, April 11, 2013 - 3:50 pm:   

Hi,

SELECT in WinISI using PCA will select the most different samples based on their spectral variability.
Thus two points: 1) check for outliers first, otherwise they will always be selected; 2) a sample that is extreme in X in the PCA space is not necessarily extreme in Y, so it is more efficient to make a special wavelength selection with bands related to the Y that will be calibrated, and to run the PCA on those wavelengths.
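
As a rough sketch of the wavelength selection in point 2, the spectra can be cut down to chosen bands before the PCA is computed. The band limits, the wavelength grid and the function name below are hypothetical placeholders, not bands recommended for any particular constituent.

```python
def select_bands(wavelengths, spectrum, bands):
    """Keep only the absorbances whose wavelength falls inside one of the
    (low, high) bands, both limits inclusive."""
    keep = [i for i, w in enumerate(wavelengths)
            if any(low <= w <= high for low, high in bands)]
    return [wavelengths[i] for i in keep], [spectrum[i] for i in keep]

# Hypothetical coarse wavelength grid (nm) and one absorbance spectrum.
wavelengths = list(range(1100, 2500, 200))
spectrum = [0.21, 0.25, 0.40, 0.33, 0.58, 0.62, 0.70]

# Hypothetical bands assumed to relate to the Y of interest.
bands = [(1400, 1600), (2000, 2200)]

kept_wavelengths, kept_values = select_bands(wavelengths, spectrum, bands)
print(kept_wavelengths)  # [1500, 2100]
print(kept_values)       # [0.4, 0.62]
```

The PCA would then be run on the reduced spectra only, so that the selected samples span the variation in the bands that matter for Y.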

Pierre

Daniel Alomar (dalomar)
Member
Username: dalomar

Post Number: 13
Registered: 2-2009
Posted on Thursday, April 11, 2013 - 2:58 pm:   

Tim
I could add that by changing the cut-off value you can select a different number of samples for your validation set.
Regards
Daniel

Jose Miguel Hernandez Hierro (jmhhierro)
Advanced Member
Username: jmhhierro

Post Number: 25
Registered: 4-2008
Posted on Thursday, April 11, 2013 - 12:58 pm:   

Dear Tim,

I agree with Fernando; maybe an NH cut-off value of 0.6 will be appropriate.
Please bear in mind that all the files should have the same name.

Regards

Fernando Morgado (fmorgado)
Senior Member
Username: fmorgado

Post Number: 36
Registered: 12-2005
Posted on Thursday, April 11, 2013 - 12:58 pm:   

Hello :
One of the problems for Foss users is that they cannot use other programs to work with the *.nir or *.cal files. That is a limitation for research. If you want, I can convert your cal or nir files to JDX files. That format is supported by many chemometrics programs. Just contact me.
Regards
Fernando

Fernando Morgado (fmorgado)
Senior Member
Username: fmorgado

Post Number: 35
Registered: 12-2005
Posted on Thursday, April 11, 2013 - 12:39 pm:   

Hello :

Maybe use the option "Select sample from spectra file". There you enter your cal file and it automatically shows the PCA file calculated earlier. After the calculation you will see the NH distance from each sample to its nearest sample.
Fernando

Tim van der Weijde (timw)
New member
Username: timw

Post Number: 2
Registered: 4-2013
Posted on Thursday, April 11, 2013 - 12:18 pm:   

That makes a lot of sense. Thanks, guys, for the fast and good response. But how do I then select samples for a training set on the basis of NH values? I have 132 samples scanned and now I want to select a training set from them on the basis of their spectral differences, using NH. What's the appropriate procedure? I do in fact have chemical data for all of them. That's why I used all of them to make a calibration, hence the zeros for the NH values when monitoring the result. But now I want to do some more lab work for a different parameter, but only on a training set that represents the spectral variation.
Thanks in advance!!!
Tim

Fernando Morgado (fmorgado)
Senior Member
Username: fmorgado

Post Number: 34
Registered: 12-2005
Posted on Thursday, April 11, 2013 - 11:38 am:   

Hello :
I ran a model to check the "problem". In Monitor, H appears but NH is zero. Thinking about it, Tim and Mathieu are right. The program shows the correct result. NH for each sample of the database is zero, since it calculates the distance from the predicted sample to the nearest sample in the database, and that sample is in the "same position". The distance from me to me is zero.
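
A small sketch of that effect, using plain Euclidean distance on hypothetical PCA scores as a stand-in for the NH computation (the function name and the values are assumptions, not WinISI output):

```python
import math

def nearest_neighbour_distance(sample, library):
    """Distance from `sample` to the closest entry in `library`."""
    return min(math.dist(sample, ref) for ref in library)

# Hypothetical PCA scores of the calibration (library) samples.
library = [(0.0, 0.0), (1.0, 0.5), (2.0, 2.0)]

# Monitoring the calibration set against itself: each sample finds itself.
print([nearest_neighbour_distance(s, library) for s in library])
# [0.0, 0.0, 0.0]

# A sample that is not in the library gets a non-zero distance.
new_sample = (1.0, 1.0)
print(nearest_neighbour_distance(new_sample, library))
# 0.5
```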

Jose Miguel Hernandez Hierro (jmhhierro)
Advanced Member
Username: jmhhierro

Post Number: 24
Registered: 4-2008
Posted on Thursday, April 11, 2013 - 11:11 am:   

Dear Tim,

I agree with Mathieu: you have used the same samples for both calibration and prediction, so the closest sample in each case is the sample itself.

Best regards

Fernando Morgado (fmorgado)
Senior Member
Username: fmorgado

Post Number: 33
Registered: 12-2005
Posted on Thursday, April 11, 2013 - 10:38 am:   

Dear Tim :

You are using the wrong calibration option.
Use "Develop Equation using indicator variable". There you need to enter a name for the EQA, PCA and LIB files and tick the option to create the loading and score files. After running the model, you will find H and NH in Monitor.
Regards
Fernando

Mathieu Jourdain (mathieu_jourdain)
New member
Username: mathieu_jourdain

Post Number: 1
Registered: 6-2010
Posted on Thursday, April 11, 2013 - 10:37 am:   

Hi,

If you are monitoring your CAL set, then it's normal to get all NH values at 0.
Indeed, in this case, the closest neighbour of each sample is itself.

Otherwise, did you give exactly the same name to your equation, loading and score files? For example: product.eqa, product.pca and product.lib.
If that's the case, there is probably a problem with your files. Is it possible to send them to me by mail or post them on this forum?

I hope this helps.

Best regards,
Mathieu Jourdain

Tim van der Weijde (timw)
New member
Username: timw

Post Number: 1
Registered: 4-2013
Posted on Thursday, April 11, 2013 - 10:06 am:   

Dear all,
I work with WinISI4 and I want to make a subset of my samples based on the neighbourhood H (NH) values. My cal file contains the spectral and reference data and is used to make the pca and lib files, both of which contain data points, so I know they are not empty. I then use them to make an eqa file and monitor the results to see the predicted values, the reference values, the global H and the neighbourhood H. Every time, I get only 0s for the NH. I do get the global H and everything else, but never the NH. All the files have exactly the same name, according to the instructions.

Does anyone have advice on what I can check or do to solve this?

Thanks in advance!
Tim van der Weijde
