
Common Spatial Pattern (CSP) and Its Improvements

Introduction of Common Spatial Pattern

1. Why is spatial filtering important?

Raw EEG scalp potentials have poor spatial resolution owing to volume conduction. Data-driven approaches that compute subject-specific spatial filters have proven useful in this situation, as shown in the following figure [1].

Spectra of left versus right hand motor imagery. All plots are calculated from the same dataset but using different spatial filters. The discrimination between the two conditions is quantified by the $r^2$-value. CAR stands for common average reference.

From the figure above, we can see that CSP is very useful for the discrimination of motor imagery.

2. How does CSP work?
CSP is a technique to analyze multichannel data based on recordings from two classes (conditions). CSP yields a data-driven supervised decomposition of the signal parameterized by a matrix that projects the signal in the original sensor space to the surrogate space. In a nutshell, CSP filters maximize the variance of the spatially filtered signal under one condition while minimizing it for the other condition. The effect of CSP is shown below.

Effect of spatial CSP filtering. CSP analysis was performed to obtain four spatial filters that discriminate left from right hand motor imagery. The graph shows continuous band-pass filtered EEG after applying the CSP filters. The resulting signals in filters CSP:L1 and CSP:L2 have larger variance during right hand imagery (segments shaded in green), while signals in filters CSP:R1 and CSP:R2 have larger variance during left hand imagery (segments shaded in red).

A toy example of CSP filtering in 2-D. Two sets of samples marked by red crosses and blue circles are drawn from two Gaussian distributions. In (a), the distribution of samples before filtering is shown. Two ellipses show the estimated covariances and dashed lines show the direction of the CSP projections $w_j$ ($j = 1, 2$). In (b), the distribution of samples after the filtering is shown. Note that both classes are decorrelated at the same time; the horizontal (vertical) axis gives the largest variance in the red (blue) class and the smallest in the blue (red) class, respectively.
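The toy example can be reproduced numerically. The following is a minimal sketch, not the code behind the figure: the mixing matrices `A1` and `A2` and the sample count `n` are made-up values, and the filters are taken as generalized eigenvectors of the two estimated class covariances.

```matlab
% Toy 2-D CSP (a sketch; A1, A2 and n are hypothetical values).
n  = 2000;                               % samples per class
A1 = [2.0 0.0; 1.2 0.5];                 % mixing matrix of class 1 ("red")
A2 = [0.5 0.0; -1.0 1.8];                % mixing matrix of class 2 ("blue")
X1 = randn(n, 2) * A1';                  % rows are samples, columns are channels
X2 = randn(n, 2) * A2';

C1 = cov(X1);                            % estimated covariance of each class
C2 = cov(X2);

% CSP projections w_j: generalized eigenvectors of the pair (C1, C1 + C2)
[W, D] = eig(C1, C1 + C2);
[lambda, order] = sort(diag(D), 'descend');
W = W(:, order);                         % w_1 favours class 1, w_2 favours class 2

Y1 = X1 * W;                             % filtered samples of class 1
Y2 = X2 * W;                             % filtered samples of class 2

S1 = cov(Y1);                            % equals W' * C1 * W
S2 = cov(Y2);                            % equals W' * C2 * W
disp(S1); disp(S2);                      % both are diagonal up to round-off
```

After filtering, both covariance matrices are diagonal, and `lambda(j)` is the share of the variance along $w_j$ that belongs to class 1, so the first axis is dominated by class 1 and the second by class 2.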

Math Behind CSP

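First, we calculate the covariance matrix.

Define the band-pass filtered EEG signal of one trial as $X \in \mathbb{R}^{N \times T}$, where $N$ is the number of channels and $T$ is the number of time samples. Every row holds one channel, and every column $x(t)$ is one multichannel sample.

Because band-pass filtered EEG signals have zero mean, we get:

$$E[x(t)] = 0$$

So,

$$C = E\left[(x - E[x])(x - E[x])^\top\right] = E\left[x x^\top\right]$$

In matrix form, the above formula can be rewritten for each class as:

$$C^{+} = \frac{1}{T} X^{+} {X^{+}}^\top, \qquad C^{-} = \frac{1}{T} X^{-} {X^{-}}^\top$$

where $+$ means left-hand imagery and $-$ means right-hand imagery. In practice, each class covariance is averaged over all trials of that class.

In the following section, we whiten the covariance matrices.

First, define $G = C^{+} + C^{-}$. Because the covariance matrices are symmetric, $G$ is symmetric, too.

Second, compute the eigendecomposition of $G$: $G = U_g \Lambda_g U_g^{-1}$.
Because $G$ is symmetric, $U_g$ is orthogonal, i.e., $U_g^{-1} = U_g^\top$.
So, $G = U_g \Lambda_g U_g^\top$.

Third, define $P = \Lambda_g^{-1/2} U_g^\top$. So, we get $P G P^\top = \Lambda_g^{-1/2} U_g^\top U_g \Lambda_g U_g^\top U_g \Lambda_g^{-1/2} = \Lambda_g^{-1/2} \Lambda_g \Lambda_g^{-1/2}$. Because $\Lambda_g$ is a diagonal matrix, we get:

$$P G P^\top = I$$

So after all this effort, we get a very important formula:

$$P C^{+} P^\top + P C^{-} P^\top = I$$

This formula explains exactly why CSP filters maximize the variance of the spatially filtered signal under one condition while minimizing it for the other condition: the two whitened covariances sum to the identity, so a direction with large variance in one class must have small variance in the other.

Finally, whiten the matrices:

$$S^{+} = P C^{+} P^\top, \qquad S^{-} = P C^{-} P^\top$$

In the end, we solve the generalized eigendecomposition of $S^{+}$ and $S^{-}$:

$$S^{+} v = \lambda S^{-} v$$

Because $S^{-} = I - S^{+}$, every generalized eigenvector $v$ is also an ordinary eigenvector of both $S^{+}$ and $S^{-}$. Because the covariance matrices are symmetric, the whitened covariance matrices are symmetric, too. So, the matrix $V$ of their eigenvectors is orthogonal, i.e., $V^\top V = I$. So:

$$S^{+} = V \Lambda^{+} V^\top, \qquad S^{-} = V \Lambda^{-} V^\top$$

Because $S^{+} + S^{-} = I$, we get:

$$\Lambda^{+} + \Lambda^{-} = V^\top \left(S^{+} + S^{-}\right) V = I$$

So:

$$\lambda_j^{+} + \lambda_j^{-} = 1$$

That is, the eigenvector with the largest eigenvalue for one class has the smallest eigenvalue for the other class.

After the processing, we can transform the signals to the new space:

$$W = P^\top V, \qquad Z = W^\top X$$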
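The derivation can be checked numerically. The following is a minimal sketch on synthetic data, under the assumptions that a trial is stored as channels by samples and has zero mean; the mixing matrix `M`, the scalings and all variable names (`Xp`, `Xm`, `Sp`, `Sm`, `Dp`, `Dm`) are hypothetical.

```matlab
% Sketch of the whitening route on synthetic data (all values hypothetical).
N = 4;  T = 500;                         % channels, samples per trial
M = [1 .5 0 0; 0 1 .5 0; 0 0 1 .5; 0 0 0 1];
Xp = M * diag([3 2 1 0.5]) * randn(N, T);  % stand-in for one "+" trial (N x T)
Xm = M * diag([0.5 1 2 3]) * randn(N, T);  % stand-in for one "-" trial (N x T)

Cp = (Xp * Xp') / T;                     % covariance of "+" (zero-mean assumption)
Cm = (Xm * Xm') / T;                     % covariance of "-"

G = Cp + Cm;                             % composite covariance
[U, Lambda] = eig(G);                    % G = U * Lambda * U'
P = diag(1 ./ sqrt(diag(Lambda))) * U';  % whitening matrix, P * G * P' = I

Sp = P * Cp * P';                        % whitened covariances, Sp + Sm = I
Sm = P * Cm * P';
Sp = (Sp + Sp') / 2;                     % remove round-off asymmetry before eig

[V, Dp] = eig(Sp);                       % Sp = V * Dp * V' with V orthogonal
Dm = V' * Sm * V;                        % diagonal, and Dp + Dm = I

W  = P' * V;                             % CSP filters (one per column)
Zp = W' * Xp;                            % the "+" trial in the new space
```

The variances of the rows of `Zp` are the diagonal entries of `Dp`, and the matching entries of `Dm` are one minus those values.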

Some Key Code of CSP

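Fea_NUM = 3; % Number of CSP filters kept from each end of the eigenvalue spectrum
% Assumed inputs: X{k} is trial k (samples x channels), label(k) is its class (1, 2 or 3)
nTrial = numel(X);
C = cell(nTrial,1);
for k = 1:nTrial
    C{k} = cov(X{k}); % Calculate the covariance matrix of each trial
end
for i = 1:3
    C_mean{i} = mean(cat(3,C{label==i}),3); % Calculate the average covariance matrix of class i
end
% G = C_mean{1}+C_mean{2}; % Whitening is optional: eig(A,B) below solves the generalized problem directly.
% [Ug,Lambdag] = eig(G);
% P = sqrt(Lambdag^(-1))*Ug';
% for i = 1:3
%     C_mean_w{i} = P*C_mean{i}*P'; % Calculate whitened covariance matrix
% end
[V1,D1]=eig(C_mean{1},C_mean{2}); % Solve the generalized eigendecomposition problem
[~,order]=sort(diag(D1)); V1=V1(:,order); % Order the filters by ascending eigenvalue
fea_idx=[1:Fea_NUM,size(V1,2)+(-Fea_NUM+1:0)]; % First and last Fea_NUM filters
for s_idx = 1:nTrial
    fea(s_idx,:)=log(diag(V1(:,fea_idx)'*C{s_idx}*V1(:,fea_idx))); % Feature extraction: log-variance
end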
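The whitening steps can be skipped because `eig` with two arguments solves the generalized problem directly. The following sketch compares the two routes on synthetic covariances; the data, the mixing matrix `M` and the variable names are hypothetical. If $\lambda$ is an eigenvalue of the whitened class-1 covariance, the matching generalized eigenvalue of the pair (class 1, class 2) is $\lambda / (1 - \lambda)$, and the filters agree up to scale and sign.

```matlab
% Compare eig(C1, C2) with the whitening route (synthetic, hypothetical data).
N  = 4;
M  = [1 .5 0 0; 0 1 .5 0; 0 0 1 .5; 0 0 0 1];
C1 = cov(randn(300, N) * diag([3 2 1 0.5]) * M');   % class-1 covariance
C2 = cov(randn(300, N) * diag([0.5 1 2 3]) * M');   % class-2 covariance

% Route 1: generalized eigendecomposition, no whitening
[V1, D1] = eig(C1, C2);
[mu, i1] = sort(diag(D1));
V1 = V1(:, i1);

% Route 2: whiten with P, then ordinary eigendecomposition
[Ug, Lambdag] = eig(C1 + C2);
P  = diag(1 ./ sqrt(diag(Lambdag))) * Ug';
S1 = P * C1 * P';
S1 = (S1 + S1') / 2;                     % remove round-off asymmetry
[B, L1] = eig(S1);
[lam, i2] = sort(diag(L1));
W = P' * B(:, i2);                       % filters mapped back to sensor space

% Compare directions: normalize the columns and ignore the sign
Vn = bsxfun(@rdivide, V1, sqrt(sum(V1.^2, 1)));
Wn = bsxfun(@rdivide, W,  sqrt(sum(W.^2, 1)));
align = abs(sum(Vn .* Wn, 1));           % |cosine| between matching filters
```

Every entry of `align` should be numerically equal to 1, and `mu` should match `lam ./ (1 - lam)`.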

Improvements of CSP

  1. Multiclass: In its original form, CSP is restricted to binary problems. A general way to extend this algorithm to the multiclass case is to apply CSP to a set of binary subproblems (all binary pairs or, preferably, in a one-versus-rest scheme). A more direct approach by approximate simultaneous diagonalization was proposed in [2]. The code below applies CSP to all binary pairs of three classes.
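[V1,~]=eig(C_mean{1},C_mean{2}); % Class 1 vs. class 2
[V2,~]=eig(C_mean{1},C_mean{3}); % Class 1 vs. class 3
[V3,~]=eig(C_mean{2},C_mean{3}); % Class 2 vs. class 3
fea_idx=[1:Fea_NUM,size(V1,2)+(-Fea_NUM+1:0)]; % First and last Fea_NUM filters of each pair
V=[V1(:,fea_idx),V2(:,fea_idx),V3(:,fea_idx)]; % Stack the filters of all binary pairs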
  2. The common spatio-spectral pattern (CSSP): It solves the standard CSP problem on the EEG time series augmented by delayed copies of the original signal, thereby obtaining simultaneously optimized spatial filters in conjunction with simple frequency filters. More specifically, CSP is applied to the original signal $x(t)$ concatenated with its version delayed by $\tau$ ms, $x(t - \tau)$. This amounts to an optimization in an extended spatial domain, where the delayed signals are treated as new channels. This technique automatically neglects or emphasizes specific frequency bands at each electrode position in a way that is optimal for the discrimination of two given classes of signals.
  3. The common sparse spectral spatial pattern (CSSSP): It eludes the problem of manually selecting the frequency band in a different way. Here a temporal FIR filter is optimized simultaneously with a spatial filter. In contrast to CSSP, only one temporal filter is used, but this filter can be of higher complexity. In order to control the complexity of the temporal filter, a regularization scheme is introduced which favors sparse solutions for the FIR coefficients. Although some values of the regularization parameter seem to give good results in most cases, for optimal performance a model selection has to be performed.
  4. An iterative method (SPEC-CSP): It alternates between spatial filter optimization in the CSP sense and the optimization of a spectral weighting. As a result, one obtains a spatial decomposition and a temporal filter which are jointly optimized for the given classification problem.
  5. Connection to a discriminative model: This connection is of theoretical interest in itself, and can also be used to further elaborate new variants of CSP.
  6. Regularizing CSP: In [3], a regularization on the CSP filter coefficients was proposed to enforce a sparse solution; that is, many filter coefficients become numerically zero at the optimum. Therefore, it provides a clean way of selecting the number and the positions of electrodes. Their results have shown that the number of electrodes can be reduced to 10–20 without a significant drop in performance.
  7. Some advanced techniques towards reducing calibration data: More details in [1].
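The channel augmentation used by CSSP can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation from the literature: the sampling rate `fs`, the delay `tau_ms` and the random stand-in trials are hypothetical, and trials are stored as channels by samples.

```matlab
% CSSP sketch: append a delayed copy of every channel, then run plain CSP.
% fs, tau_ms and the random "trials" are hypothetical placeholders.
fs     = 100;                            % sampling rate in Hz (assumed)
tau_ms = 50;                             % delay tau in ms (assumed)
d      = round(tau_ms / 1000 * fs);      % delay in samples
N = 3;  T = 400;                         % channels, samples per trial

X1 = randn(N, T);                        % class 1: temporally white noise
E  = randn(N, T + d);
X2 = E(:, d+1:end) + 0.8 * E(:, 1:T);    % class 2: correlated with itself at lag d

% Stack x(t) on top of x(t - tau): 2N "channels", T - d samples
Xa1 = [X1(:, d+1:end); X1(:, 1:end-d)];
Xa2 = [X2(:, d+1:end); X2(:, 1:end-d)];

Ca1 = cov(Xa1');                         % cov expects samples in rows
Ca2 = cov(Xa2');

% Standard CSP on the augmented covariances
[W, D] = eig(Ca1, Ca1 + Ca2);
lam = diag(D);                           % share of class-1 variance per filter
```

Each column of `W` holds $2N$ weights: the first $N$ act on $x(t)$ and the last $N$ on $x(t - \tau)$, so every channel effectively gets a two-tap FIR filter together with its spatial weight. In this toy setup the two classes differ only in their temporal structure, which is exactly the kind of difference the delayed channels expose.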

References:

[1] B. Blankertz et al., “Optimizing spatial filters for robust EEG single-trial analysis,” IEEE Signal Process. Mag., vol. 25, no. 1, pp. 41–56, 2008.

[2] G. Dornhege, B. Blankertz, G. Curio, and K.-R. Müller, “Boosting bit rates in non-invasive EEG single-trial classifications by feature combination and multi-class paradigms,” IEEE Trans. Biomed. Eng., vol. 51, no. 6, pp. 993–1002, June 2004.

[3] J. Farquhar, N.J. Hill, T.N. Lal, and B. Schölkopf, “Regularised CSP for sensor selection in BCI,” in Proc. 3rd Int. Brain-Computer Interface Workshop Training Course 2006, Verlag der Technischen Universität Graz, Graz, Austria, pp. 14–15.