For example, if 90% of all observations fall in class A, then a classifier that
always assigns new observations to this class will have 90% overall accuracy.
It will also correctly classify 100% of class A observations.
However, it will correctly identify 0% of class B observations,
because the classifier always predicts class A.
So despite its high overall accuracy, this is not a very useful classifier.
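To make this concrete, here is a minimal Python sketch of the same idea. The labels and class proportions are entirely synthetic, chosen only to mirror the 90/10 example above:

```python
import numpy as np

# Synthetic labels: 90% class A (coded 0), 10% class B (coded 1).
rng = np.random.default_rng(0)
y_true = rng.choice([0, 1], size=1000, p=[0.9, 0.1])

# A "classifier" that always predicts class A.
y_pred = np.zeros_like(y_true)

overall = np.mean(y_pred == y_true)         # roughly 0.90
acc_A = np.mean(y_pred[y_true == 0] == 0)   # 1.0: every class A case is caught
acc_B = np.mean(y_pred[y_true == 1] == 1)   # 0.0: every class B case is missed

print(f"overall accuracy: {overall:.2f}")
print(f"class A accuracy: {acc_A:.2f}")
print(f"class B accuracy: {acc_B:.2f}")
```

Reporting the per-class accuracies alongside the overall accuracy immediately exposes the problem that the single 90% figure hides.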
An important question is determining which voxels drive the classification.
The classifier weights can be mapped back onto the brain to provide
information about each voxel's contribution to the classifier's performance.
Note that different classifiers may produce different maps,
as they are sensitive to different features in the data.
In this particular example, we have w, the classifier weights. Let's say,
for the sake of argument, that it is a vector of length v, where v corresponds
to the number of voxels in the brain.
We can then map these weights back onto the brain to see
the relative contribution of each voxel,
and we can use this map for interpretation purposes.
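As a rough illustration, here is a hedged Python sketch of mapping a linear classifier's weight vector back into a brain volume. Everything here is an assumption for illustration: the data are random, the brain mask is a placeholder, and the choice of scikit-learn's LinearSVC is just one common option for MVPA-style analyses, not a method prescribed by this lecture:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Synthetic data: n_trials observations, each a pattern over v voxels.
n_trials, v = 200, 5000
rng = np.random.default_rng(0)
X = rng.standard_normal((n_trials, v))
y = rng.choice([0, 1], size=n_trials)     # two experimental conditions

# Fit a linear classifier; its weight vector w has length v,
# one weight per voxel inside the mask.
clf = LinearSVC().fit(X, y)
w = clf.coef_.ravel()                     # shape (v,)

# Placeholder brain mask: a 3-D boolean volume with exactly v True voxels.
# In practice this would come from an anatomical or functional mask image.
mask = np.zeros((30, 36, 30), dtype=bool)
mask.flat[:v] = True

# Scatter the weights back into the volume to get a voxelwise map.
weight_map = np.zeros(mask.shape)
weight_map[mask] = w
```

In a real analysis, the resulting weight_map would typically be written out as a NIfTI image (e.g., with nibabel) and overlaid on an anatomical scan; and, as noted above, swapping in a different linear classifier would generally yield a somewhat different map.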
In fMRI it is important to make analysis choices that
balance interpretability with predictive power.