Deep learning with electroencephalography (EEG) data

Question:

I am building a convolutional network to classify EEG data. The data come from an experiment in which participants are shown images from 3 different classes with 2 subclasses each. To give a rough idea of the dataset size, each subclass has roughly 300 epochs per participant (this applies to all subclasses). The 3 classes are:

Object
Color
Number

I have 5 participants in my training dataset. I took 15% of each participant's data and put it in the test dataset. Can I consider this 15% unseen data, even though data from the same participants was used to train the model?

Asked By: William

Answers:

It depends on what you want to test. A test set is used to estimate generalization (i.e. performance on unseen data). So the question is:

Do you want to estimate the generalization to unseen data from the same participants (whose data was used to train the classifier)?
Or do you want to estimate the generalization to unseen participants (the general population)?

This really depends on your goal or the claim you are trying to make. I can think of situations for both approaches:

Think of brain-computer interfaces (BCIs) that need to be retrained for every user. Here, you would test on data from the same individual.
On the other hand, if you make a very general claim (e.g. I can decode some relevant signal from a certain brain region across the population), then a test set consisting of participants who were not included in the training set would lend much stronger support to your claim. (Whether this works in practice is another question, though.)
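
To make the two options concrete, here is a minimal sketch of both splitting strategies in plain Python. The function names and the data layout (5 participants with 6 subclasses of 300 epochs each, so 1800 epochs per participant) are assumptions for illustration only, not something taken from your code. The splits work on epoch indices, so they can be applied to whatever array holds your epochs.

```python
import random


def within_participant_split(participant_ids, test_fraction=0.15, seed=0):
    """Hold out a fraction of every participant's epochs.

    The same participants appear in both train and test, so the test score
    estimates generalization to new data from already-seen participants.
    """
    rng = random.Random(seed)
    by_participant = {}
    for idx, pid in enumerate(participant_ids):
        by_participant.setdefault(pid, []).append(idx)

    train, test = [], []
    for indices in by_participant.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return sorted(train), sorted(test)


def leave_one_participant_out(participant_ids):
    """Yield (held_out, train, test) with one whole participant held out.

    No participant contributes to both sets, so the test score estimates
    generalization to unseen participants.
    """
    for held_out in sorted(set(participant_ids)):
        train = [i for i, pid in enumerate(participant_ids) if pid != held_out]
        test = [i for i, pid in enumerate(participant_ids) if pid == held_out]
        yield held_out, train, test


# Hypothetical layout: one participant label per epoch, 5 x 1800 epochs.
participant_ids = [p for p in range(5) for _ in range(1800)]

train_idx, test_idx = within_participant_split(participant_ids)
print("within-participant:", len(train_idx), "train /", len(test_idx), "test")

for held_out, train_idx_lopo, test_idx_lopo in leave_one_participant_out(participant_ids):
    print("held-out participant", held_out, ":",
          len(train_idx_lopo), "train /", len(test_idx_lopo), "test")
```

The first function corresponds to what you are doing now (15% of each participant held out); the second is the stricter across-participant test. If you use scikit-learn, `LeaveOneGroupOut` and `GroupKFold` in `sklearn.model_selection` provide the second kind of split, with the participant label passed as `groups`.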

Answered By: cheersmate