NIR Discussion Forum: Splitting the NIR spectra into training and test set



Ciaccheri Leonardo (leonardo)
Member
Username: leonardo

Post Number: 12
Registered: 5-2010

Posted on Thursday, March 28, 2013 - 8:34 am:

Hi Bilal,
according to what my teacher said, it is better to put all replicates of the same sample in the same set (calibration or validation).
The reason is that if a sample is somewhat contaminated, the contamination affects all of its spectra. If you split those spectra between the two sets, they will bias both the calibration and the validation set, so your estimate of SEP will be optimistic.
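
As a rough illustration of keeping replicates together, here is a minimal Python sketch that splits by sample ID rather than by spectrum. The function name `split_by_sample`, the integer sample IDs, and the 1/3 test fraction are assumptions made for the example; they are not taken from any particular chemometrics package.

```python
import random

def split_by_sample(sample_ids, test_fraction=1/3, seed=0):
    """Return (train_idx, test_idx) so that all spectra sharing a
    sample ID end up in the same set."""
    unique_ids = sorted(set(sample_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)  # randomize which samples go to the test set
    n_test = round(len(unique_ids) * test_fraction)
    test_ids = set(unique_ids[:n_test])
    train_idx = [i for i, s in enumerate(sample_ids) if s not in test_ids]
    test_idx = [i for i, s in enumerate(sample_ids) if s in test_ids]
    return train_idx, test_idx

# Hypothetical layout: 30 samples x 3 replicates = 90 spectra,
# where sample_ids[i] is the sample that spectrum i was measured on.
sample_ids = [s for s in range(30) for _ in range(3)]
train_idx, test_idx = split_by_sample(sample_ids)
```

The returned index lists can be used to select rows of the spectra matrix and the reference values. If scikit-learn is available, `GroupShuffleSplit` in `sklearn.model_selection` serves the same purpose when the sample IDs are passed as `groups`.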

Leonardo Ciaccheri

Bilal Ahmad Malik (elp09bm)
Member
Username: elp09bm

Post Number: 14
Registered: 7-2011

Posted on Thursday, March 28, 2013 - 4:59 am:

If we collect 90 NIR spectra from 30 samples by measuring each sample 3 times, what is the advantage of using the 3 replicates of each sample? What is the best way to split the data into a training set and a test set? Could we simply split the data 2/3 and 1/3 into training and test sets respectively, or should we split the three replicates of each sample between the two sets? Will that affect the results? What is the recommended way?

If the dataset is small, is it better to use cross-validation only? Will cross-validation be sufficient, say 10-fold cross-validation or leave-one-out cross-validation?

Would it be more appropriate to do many random splits (2/3, 1/3), say 100 splits, and see whether the results hold up across, say, 95 of the 100 splits?
Would this be a better approach than a single split, or will leave-one-out cross-validation give nearly the same results?
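
The repeated-split idea in the question could be sketched as follows. This is only a sketch: it assumes the splits are drawn over samples (so the three replicates of a sample always stay together), and the name `repeated_sample_splits` and its parameters are made up for the example.

```python
import random

def repeated_sample_splits(sample_ids, n_splits=100, test_fraction=1/3, seed=0):
    """Yield (train_idx, test_idx) pairs. Each split is drawn over
    sample IDs, not over individual spectra."""
    unique_ids = sorted(set(sample_ids))
    n_test = round(len(unique_ids) * test_fraction)
    rng = random.Random(seed)
    for _ in range(n_splits):
        # Pick the held-out samples for this repetition.
        test_ids = set(rng.sample(unique_ids, n_test))
        train_idx = [i for i, s in enumerate(sample_ids) if s not in test_ids]
        test_idx = [i for i, s in enumerate(sample_ids) if s in test_ids]
        yield train_idx, test_idx

# Hypothetical layout: 30 samples x 3 replicates = 90 spectra.
sample_ids = [s for s in range(30) for _ in range(3)]
splits = list(repeated_sample_splits(sample_ids))
```

For each pair, one would fit the calibration model on `train_idx`, compute the prediction error on `test_idx`, and then look at how the 100 error values are spread. The same grouping applies to cross-validation: leaving out one sample (all of its replicates) per fold, rather than one spectrum, keeps replicates from appearing on both sides.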