Approach for epileptic EEG detection based on gradient boosting

2015-07-06 CHEN Shuangshuang, ZHOU Weidong, GENG Shujuan, YUAN Qi, WANG Jiwen

Journal of Measurement Science and Instrumentation, 2015, Issue 1

CHEN Shuang-shuang1,2, ZHOU Wei-dong1,2, GENG Shu-juan1,2, YUAN Qi1,2, WANG Ji-wen3

(1. Suzhou Institute of Shandong University, Suzhou 215123, China; 2. School of Information Science and Engineering, Shandong University, Jinan 250100, China; 3. Qilu Hospital, Shandong University, Jinan 250100, China)

Automatic seizure detection is important for epilepsy diagnosis and can reduce the workload of inspecting prolonged electroencephalogram (EEG) recordings. This paper presents and investigates a novel machine learning approach that uses gradient boosting to detect seizures in long-term EEG. We apply the relative fluctuation index to extract features from long-term intracranial EEG data. A classifier trained with the gradient boosting algorithm is adopted to discriminate between seizure and non-seizure EEG signals. Smoothing and the collar technique are finally used as post-processing to further improve the detection accuracy. The seizure detection method is assessed on the Freiburg EEG datasets from 21 patients. The experimental results indicate that the proposed method yields an average sensitivity of 94.60% with a false detection rate of 0.18/h.

electroencephalogram (EEG); seizure detection; wavelet transform; fluctuation index; gradient boosting

0 Introduction

Epileptic seizures are sudden abnormal reactions in the brain, manifested as loss of awareness or consciousness and disturbances of movement, sensation, mood or mental function[1]. Electroencephalogram (EEG) signal analysis is widely used for assessing disorders of brain function, especially for epilepsy diagnosis. Visual inspection of long-term EEG recordings for seizures is very tedious and time-consuming. Therefore, the development of an automatic seizure detection system plays an important role in analyzing EEG recordings.

The automatic seizure detection method presented by Gotman was the first widely applicable technique[2]. In this method, EEG signals were decomposed into half-waves, and then features such as peak amplitude, duration, slope and sharpness were extracted for detection. Expanding on this work, Khan and Gotman employed the discrete wavelet transform (DWT) to decompose EEG signals into sub-bands and computed features, such as energy, coefficient of variation and relative amplitude, on the DWT coefficients for seizure detection[3].

Several classifiers capable of classifying seizure and non-seizure EEG signals have been presented in the literature. Gardner et al. employed a one-class support vector machine (SVM) to classify short-time, energy-based statistics computed from one-second windows of data[4]. Temko et al. presented a multi-channel patient-independent neonatal seizure detection system based on the SVM classifier[5]. An algorithm for automatic seizure detection using a self-organizing map (SOM) neural network (NN) with unsupervised training was proposed by Gabor et al.[6]. Gradient boosting is a machine learning technique for regression problems, which produces a prediction model in the form of an ensemble of weak prediction models. By minimizing different loss functions, gradient boosting can deal with not only regression problems but also classification problems. Gradient boosting has been applied to motor imagery classification with improved performance[7].

In this paper, we propose a method to detect seizures from EEG signals using the gradient boosting algorithm in conjunction with ordinary least squares (OLS) regression. Gradient boosting with OLS is an interesting alternative to state-of-the-art algorithms for epileptic seizure detection. The algorithm builds linear classification rules, so only a small number of operations are needed to apply the classifier to new data. The relative fluctuation index is employed to characterize the EEG signals from each channel. Furthermore, smoothing and the collar technique are used as post-processing to improve the accuracy of this method. The experimental results show that this method can detect seizures with a high sensitivity and a low false detection rate.

1 Data acquisition and preprocessing

1.1 Data acquisition

All the EEG data used in our study come from the Epilepsy Center of the University Hospital of Freiburg, Germany[8]. The EEG database contains invasive EEG recordings of 21 patients suffering from medically intractable focal epilepsy. The EEG data were acquired using a Neurofile NT digital video EEG system with 128 channels, 256 Hz sampling rate, and a 16 bit analog-to-digital converter.

For each of the patients, there are datasets called “ictal” and “interictal”. The former contains files with epileptic seizures and at least 50 min pre-ictal data, and the latter contains approximately 24 h EEG recordings without seizure activity. At least 24 h continuous interictal recordings are available for 13 patients. For the remaining patients interictal invasive EEG data consisting of less than 24 h were joined together, to end up with at least 24 h per patient. For each patient, the recordings of three focal and three extra-focal electrode contacts are available.

1.2 Data processing

The traditional signal analysis based on Fourier transform only obtains frequency information, but there are no transient features in Fourier coefficients. Compared with the short time Fourier transform (STFT), the wavelet transform adapts the window size for good resolution and localization performance in both the time and frequency domains. Long time windows are employed to get precise frequency information, and short time windows are used to obtain accurate time information[9]. In this way, the wavelet transform has an optimal time-frequency resolution.

In this work, the EEG signal is broken down into epochs containing 1 024 points (4 s). The 5-scale wavelet transform, using a Daubechies-4 wavelet, is performed on each 4 s epoch of data in each channel. The EEG signals with a sampling rate of 256 Hz are decomposed into five scales, giving the approximation coefficients representing 0-4 Hz (a5) and detail coefficients representing 64-128 Hz (d1), 32-64 Hz (d2), 16-32 Hz (d3), 8-16 Hz (d4), and 4-8 Hz (d5). Seizure activity is characterized by scales 3, 4 and 5, since it is most often between 3 and 29 Hz[3]. The lowest-frequency band (a5) is not used because activity in this band can occur frequently in non-ictal sleep EEG. Then, the signals at scales 3, 4 and 5 are decomposed into half-waves using the method developed by Gotman[2] to eliminate superimposed fast activity with small amplitude.
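The band edges quoted above follow from the dyadic structure of the DWT. The sketch below only computes these nominal edges for a given sampling rate; it does not perform a wavelet transform, and the function name `dwt_bands` is an illustrative assumption. Real Daubechies-4 filters overlap between neighbouring bands, so the edges are approximate.

```python
def dwt_bands(fs, levels):
    """Return nominal (low, high) band edges in Hz for a dyadic DWT.

    Detail level j covers [fs / 2**(j + 1), fs / 2**j]; the final
    approximation covers everything below the last detail band.
    """
    nyquist = fs / 2.0
    bands = {}
    for j in range(1, levels + 1):
        bands[f"d{j}"] = (nyquist / 2 ** j, nyquist / 2 ** (j - 1))
    bands[f"a{levels}"] = (0.0, nyquist / 2 ** levels)
    return bands


# Settings taken from the text: 256 Hz sampling, 5 scales, 1 024-point epochs.
bands = dwt_bands(fs=256, levels=5)
epoch_seconds = 1024 / 256  # 4 s per epoch

for name in ("d1", "d2", "d3", "d4", "d5", "a5"):
    low, high = bands[name]
    print(f"{name}: {low:g}-{high:g} Hz")
```

Scales 3 to 5 together span a nominal 4-32 Hz, which covers most of the 3-29 Hz range in which seizure activity is reported to occur.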

1.3 Feature extraction

We propose to employ the relative fluctuation index $RFI$ to measure the intensity of the fluctuation of EEG signals. The fluctuation index can be expressed as

$$FI_i=\frac{1}{LEN}\sum_{j=1}^{LEN-1}\left|a_i(j+1)-a_i(j)\right|,$$

where $a_i$ denotes the amplitude of the filtered signal with length $LEN$ at scale $i$. Fig.1 illustrates the fluctuation index from two hundred randomly selected epochs of seizure data and non-seizure data. It can be seen from Fig.1 that the fluctuation index features from seizure data are higher than those from non-seizure data.
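As a minimal sketch, the fluctuation index can be computed as the summed absolute difference between consecutive samples of the filtered signal. The normalization by the signal length $LEN$ and the function name `fluctuation_index` are assumptions made for this illustration.

```python
def fluctuation_index(a):
    """Sum of |a[j+1] - a[j]| over the signal, divided by its length.

    `a` is one epoch of the filtered signal at a single wavelet scale.
    Dividing by len(a) is an assumed normalization.
    """
    n = len(a)
    if n < 2:
        raise ValueError("need at least two samples")
    return sum(abs(a[j + 1] - a[j]) for j in range(n - 1)) / n


# A rapidly alternating signal fluctuates more than a flat one.
busy = [0.0, 1.0, 0.0, 1.0]
flat = [0.5, 0.5, 0.5, 0.5]
print(fluctuation_index(busy), fluctuation_index(flat))
```

The toy signals only show the qualitative behaviour: larger and faster amplitude changes give a larger index.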

Fig.1 Comparison of fluctuation index for seizure and non-seizure EEG epochs

Fig.2 shows the mean values and standard deviations of the fluctuation index features extracted from these EEG samples. The statistical analysis indicates that the difference in fluctuation index between non-seizure and seizure EEG epochs is significant.

Fig.2 Means and standard deviations of fluctuation index between seizure and non-seizure EEG epochs

The fluctuation index relative to the background is the ratio of the fluctuation index of the analyzed EEG epoch to the average fluctuation index of the background. Empirically, the background is defined as 120 s EEG data ending 60 s prior to the analyzed EEG epoch. The gap of 60 s is selected to allow a gradual onset of a seizure. The background of 120 s is needed to get a steady estimation of the background fluctuation index[10].
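The background window described above can be sketched on a per-epoch basis. With 4 s epochs, 120 s of background corresponds to 30 epochs and the 60 s gap to 15 epochs. The function name, the epoch-aligned windowing and the choice to return `None` when not enough history is available are assumptions of this sketch.

```python
def relative_fluctuation_index(fi, k, epoch_s=4, background_s=120, gap_s=60):
    """Ratio of the fluctuation index of epoch k to the mean background index.

    `fi` holds one fluctuation index per consecutive epoch. The background
    is the `background_s` seconds that end `gap_s` seconds before epoch k.
    Returns None if the recording does not yet contain a full background.
    """
    n_bg = background_s // epoch_s   # 30 epochs for the values in the text
    n_gap = gap_s // epoch_s         # 15 epochs for the values in the text
    start = k - n_gap - n_bg
    if start < 0:
        return None
    background = fi[start:k - n_gap]
    mean_bg = sum(background) / len(background)
    if mean_bg <= 0:
        return None
    return fi[k] / mean_bg


# Hypothetical sequence: 45 quiet epochs followed by one strongly fluctuating epoch.
fi_series = [1.0] * 45 + [3.0]
print(relative_fluctuation_index(fi_series, 45))
```

Because the background lags the analyzed epoch by 60 s, a gradually building seizure does not inflate its own reference level.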

2 Classifier design

2.1 Gradient boosting

In essence, gradient boosting is a machine learning method that builds one strong classifier from many weak classifiers. Its main idea is to compute the gradient of the loss function being minimized with respect to the model values at each training data point[11], and to improve the model by moving it in the direction along which the loss function decreases. Gradient boosting with OLS[7] can be described as follows.

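We denote the segmented EEG training data by $W$ and the corresponding class labels by $Y$, in which 0 represents a non-seizure segment and 1 represents a seizure segment, and the length of every segment by $LEN$. In a single EEG channel $n$, the feature for scale $j$ is labeled as $RFI_{j,n}$. The feature vector $w_i$ is formed by combining $RFI_{j,n}$ for scales 3-5 in 6 channels. We thus obtain two sets $W=\{w_i\in\mathbb{R}^K, i=1,2,\ldots,N\}$ and $Y=\{y_i\in\{0,1\}, i=1,2,\ldots,N\}$, in which $K=C\times S$ is the number of features, $C$ is the number of EEG channels, $S$ is the number of wavelet scales and $N$ is the number of segments. The final model we need to build is

$$F_M(w_i)=\sum_{m=1}^{M}\varepsilon\gamma_m f_m(w_i),$$

where $f_m$ is the weak classifier selected at step $m$, $\gamma_m$ is its step size and $\varepsilon$ is the learning rate.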

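Before training, we set an initial guess $F_0(w_i)=0$, $i=1,2,\ldots,N$, and then iterate for $m=1,2,\ldots,M$, where $m$ is the step index and $M$ is the number of iterations. In order to improve the model along the gradient descent direction of its loss function, we first need the loss function itself. With the training data, the loss function of the model can be expressed as the negative Bernoulli log-likelihood

$$L(F_m;W,Y)=-\sum_{i=1}^{N}\left[y_i\log p_m(y_i=1|w_i)+(1-y_i)\log\left(1-p_m(y_i=1|w_i)\right)\right],$$

where $p_m(y_i=1|w_i)=\dfrac{e^{F_m(w_i)}}{e^{F_m(w_i)}+e^{-F_m(w_i)}}$.

Then the gradient used for the update, i.e. the negative derivative of the loss function with respect to the model value at each training point, can be computed as

$$-\frac{\partial L(F_{m-1};W,Y)}{\partial F_{m-1}(w_i)}=2\left(y_i-p_{m-1}(y_i=1|w_i)\right).$$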
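The factor of 2 in this gradient can be checked with a short derivation. It assumes that the model output $F$ is mapped to a probability by $p=e^{F}/(e^{F}+e^{-F})$ and that the per-sample loss $\ell$ is the negative Bernoulli log-likelihood; both are assumptions consistent with the initial values $F_0=0$ and $p_0=0.5$ used in the algorithm.

```latex
\begin{align}
p &= \frac{e^{F}}{e^{F}+e^{-F}} = \frac{1}{1+e^{-2F}},
  \qquad \frac{\partial p}{\partial F} = 2p(1-p), \\
\ell &= -\left[\,y\log p + (1-y)\log(1-p)\,\right], \\
-\frac{\partial \ell}{\partial F}
  &= \left[\frac{y}{p}-\frac{1-y}{1-p}\right]2p(1-p)
   = 2\left[\,y(1-p)-(1-y)p\,\right]
   = 2\,(y-p).
\end{align}
```

At the first step $p_0=0.5$, so the gradient is $+1$ for every seizure epoch and $-1$ for every non-seizure epoch.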

After computing the gradient of the loss function, the weak classifier $f_m$ that best fits the gradient in a least-squares sense is selected as

$$f_m=\arg\min_{f}\sum_{i=1}^{N}\left(2\left(y_i-p_{m-1}(y_i=1|w_i)\right)-f(w_i)\right)^2.$$

We use weak classifiers that have a $C$-dimensional vector of regression coefficients $\alpha$ and a time index $t$ as parameters. The output of a weak classifier is the projection of the vector $w_i(t)$ of EEG samples at time $t$ onto the regression coefficients,

$$f(w_i;\alpha,t)=\alpha^{T}w_i(t),$$

so that

$$f_m(w_i)=f(w_i;\alpha_m,t_m).$$

Now the step size $\gamma_m$ is determined by a line search,

$$\gamma_m=\arg\min_{\gamma}L(F_{m-1}+\gamma f_m;W,Y).$$

In order to improve the generalization performance of the boosting algorithm, $\gamma_m$ is shrunk through multiplication by a small $\varepsilon$ at each step,

$$F_m=F_{m-1}+\varepsilon\gamma_m f_m,$$

where the parameter $0<\varepsilon\le 1$ controls the learning rate of the procedure[12]. The algorithm using gradient boosting can be summarized as

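1) $p_0(y_i=1|w_i)=0.5$, $i=1,2,\ldots,N$

2) $F_0(w_i)=0$, $i=1,2,\ldots,N$

3) For $m=1$ to $M$ do:

a. Compute the gradient $2\left(y_i-p_{m-1}(y_i=1|w_i)\right)$, $i=1,2,\ldots,N$

b. Select the weak classifier $f_m$ that best fits the gradient in a least-squares sense

c. Determine the step size $\gamma_m$

d. $F_m=F_{m-1}+\varepsilon\gamma_m f_m$

e. Update the probabilities $p_m(y_i=1|w_i)$ from $F_m(w_i)$, $i=1,2,\ldots,N$

end For

4) End algorithm.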
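The loop above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation. The weak learner is simplified to a no-intercept least-squares fit on a single feature, the probability is taken as $e^{F}/(e^{F}+e^{-F})$, the step size is found by a ternary search on the arbitrary interval [0, 10], and the toy feature values are made up and centred so that no intercept is needed.

```python
import math


def prob(F):
    """p(y=1|w) = e^F / (e^F + e^-F), evaluated as a stable sigmoid of 2F."""
    z = 2.0 * F
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss(F, y):
    """Negative Bernoulli log-likelihood (probabilities clipped to avoid log(0))."""
    total = 0.0
    for Fi, yi in zip(F, y):
        p = min(max(prob(Fi), 1e-12), 1.0 - 1e-12)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_weak(W, g):
    """Pick the feature whose no-intercept OLS fit to g has the smallest residual."""
    best = None
    for k in range(len(W[0])):
        col = [w[k] for w in W]
        sxx = sum(x * x for x in col)
        if sxx == 0.0:
            continue
        alpha = sum(x * gi for x, gi in zip(col, g)) / sxx
        rss = sum((gi - alpha * x) ** 2 for x, gi in zip(col, g))
        if best is None or rss < best[0]:
            best = (rss, k, alpha)
    return best[1], best[2]


def line_search(F, f, y, hi=10.0, iters=60):
    """Ternary search for the step size on [0, hi]; the loss is convex in gamma."""
    lo = 0.0
    for _ in range(iters):
        g1 = lo + (hi - lo) / 3.0
        g2 = hi - (hi - lo) / 3.0
        l1 = loss([Fi + g1 * fi for Fi, fi in zip(F, f)], y)
        l2 = loss([Fi + g2 * fi for Fi, fi in zip(F, f)], y)
        if l1 < l2:
            hi = g2
        else:
            lo = g1
    return 0.5 * (lo + hi)


def boost(W, y, M=20, eps=0.1):
    """Return a list of (feature index, coefficient) pairs, one per boosting step."""
    F = [0.0] * len(W)                                       # step 2: F_0 = 0
    model = []
    for _ in range(M):
        g = [2.0 * (yi - prob(Fi)) for yi, Fi in zip(y, F)]  # step a: gradient
        k, alpha = fit_weak(W, g)                            # step b: weak classifier
        f = [alpha * w[k] for w in W]
        gamma = line_search(F, f, y)                         # step c: step size
        F = [Fi + eps * gamma * fi for Fi, fi in zip(F, f)]  # step d: shrunken update
        model.append((k, eps * gamma * alpha))
    return model


def predict(model, w):
    """Probability that the feature vector w belongs to a seizure epoch."""
    return prob(sum(c * w[k] for k, c in model))


# Hypothetical, centred toy features: three seizure and three non-seizure epochs.
W_toy = [[1.2, 0.8], [0.9, 1.5], [1.5, 1.1], [-0.7, -0.9], [-1.1, -0.6], [-0.8, -1.2]]
y_toy = [1, 1, 1, 0, 0, 0]
model = boost(W_toy, y_toy)
print([round(predict(model, w), 3) for w in W_toy])
```

Since every weak classifier is linear in the features, the final model is also linear, which is why applying it to a new epoch costs only a few multiplications and additions.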

2.2 Classifier design

In this work, all six channels of EEG were used for seizure detection. For each patient, one or two hours of EEG signals (depending on the total amount of data containing seizures for this patient) were used as training data and the remaining data as testing data. The training data of each patient contained some 4 s seizure epochs, which were separated from the non-seizure epochs for training. Features extracted from the training data formed the feature vectors. The same operation was performed on the testing data. Then the classifier was trained using gradient boosting to obtain the best model. We trained a separate classifier for each patient so that each classifier was optimally fitted to its patient.

The output value of the classifier obtained with gradient boosting usually fluctuates between 0 and 1 (0 represents a non-seizure epoch and 1 represents a seizure epoch). For this reason, post-processing is necessary. The post-processing scheme consists of smoothing and the collar technique.

1) Smoothing

Since the output of the classifier is an estimate of the probability that an epoch contains seizure activity, a moving average filter is applied to the output of the classifier to remove short-time jump points. The moving average filter used here can be expressed as

$$y(k)=\frac{1}{2N+1}\sum_{j=-N}^{N}x(k+j),$$

where $x$ is the input of the filter, $y$ is the output of the filter and $2N+1$ denotes the span of the moving average filter. The averaged output is then compared to a threshold obtained from the training data.
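A minimal sketch of the smoothing step follows. The handling of the first and last $N$ epochs (the window is simply truncated), the threshold value 0.5 and the classifier outputs are assumptions chosen for illustration; in the paper the threshold is obtained from the training data.

```python
def moving_average(x, N):
    """Centred moving average with span 2N+1; the window is truncated at the edges."""
    out = []
    for k in range(len(x)):
        window = x[max(0, k - N):k + N + 1]
        out.append(sum(window) / len(window))
    return out


def apply_threshold(y, thr):
    """Binary decision per epoch: 1 (seizure) if the smoothed output exceeds thr."""
    return [1 if v > thr else 0 for v in y]


# Hypothetical classifier outputs: an isolated spike at epoch 2, a sustained run at 5-7.
raw = [0.0, 0.0, 1.0, 0.0, 0.0, 0.9, 1.0, 0.8, 0.0, 0.0]
smoothed = moving_average(raw, N=1)
decisions = apply_threshold(smoothed, thr=0.5)
print(decisions)
```

In this toy sequence the isolated jump at epoch 2 is averaged away, while the sustained run survives the threshold.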

2) Collar technique

To compensate for possible difficulties in detecting the pre-seizure and post-seizure parts caused by the smoothing process, the collar technique[13] is used to make up for the missed seizure decisions. In the collar technique, each seizure decision is extended by $m$ epochs on both sides in the last step of our seizure detection procedure. In this paper, $m$ is set to 3.
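The collar operation amounts to a dilation of the binary decision sequence. The sketch below assumes one decision per epoch and clips the collar at the ends of the recording; the function name `collar` is illustrative.

```python
def collar(decisions, m=3):
    """Extend every seizure decision (1) by m epochs on both sides."""
    n = len(decisions)
    out = [0] * n
    for k, d in enumerate(decisions):
        if d == 1:
            for j in range(max(0, k - m), min(n, k + m + 1)):
                out[j] = 1
    return out


# A two-epoch detection grows to eight epochs with m = 3.
before = [0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
print(collar(before, m=3))
```

With 4 s epochs and $m=3$, each detection is widened by 12 s at its start and at its end.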

3 Results

We tested the algorithm with the Freiburg datasets introduced in Section 1.1. All the experiments were carried out in the Matlab R2011a environment running on an AMD Athlon processor at 2.71 GHz.

For each patient, one hour of randomly selected seizure and non-seizure data is used for training. Firstly, the EEG signals are divided into 4 s epochs using the method described in Section 1.2. Then the relative fluctuation index is calculated from the signal filtered by the Daubechies-4 wavelet. Afterwards, the feature vector formed from three scales in six channels of the current epoch is fed into the classifier with post-processing.

Sensitivity, specificity and selectivity are employed to evaluate the performance of our method. We define $TP$ as the number of true positives, i.e. epochs identified as seizures by both our method and the EEG specialists, and $TN$ as the number of true negatives. Furthermore, $FP$ is the number of false positives, i.e. epochs identified as seizures by our method but not by the EEG specialists, and $FN$ is the number of false negatives. Then the sensitivity, specificity and selectivity are defined as

$$\text{Sensitivity}=\frac{TP}{TP+FN}\times 100\%,\quad \text{Specificity}=\frac{TN}{TN+FP}\times 100\%,\quad \text{Selectivity}=\frac{TP}{TP+FP}\times 100\%.$$

In addition, the false detection rate (number of false detections per hour) is also calculated in order to display the feasibility of the presented method. The results of our method are listed in Table 1.
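These measures can be computed from the four counts as sketched below. The formulas used are the standard definitions (sensitivity as TP/(TP+FN), specificity as TN/(TN+FP), selectivity as TP/(TP+FP)), which is an assumption about the paper's exact definitions, and the counts in the example are hypothetical rather than taken from Table 1.

```python
def detection_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and selectivity in percent."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "selectivity": 100.0 * tp / (tp + fp),
    }


def false_detection_rate(n_false_detections, hours):
    """Number of false detections per hour of analyzed EEG."""
    return n_false_detections / hours


# Hypothetical counts for one recording.
metrics = detection_metrics(tp=80, tn=180, fp=20, fn=20)
print(metrics, false_detection_rate(2, 10))
```

Sensitivity and selectivity are computed only from seizure-related counts, so they stay informative even when non-seizure epochs vastly outnumber seizure epochs.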

In this experiment, there are two to five hours of seizure data for each patient. We tested our algorithm on all of the EEG data from the Freiburg dataset. Only three hours of data from patient 10 are excluded because we could not find any seizure spikes by visual inspection. Therefore, a total of 84 seizures are used to evaluate the performance of the method.

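Table 1 Results for evaluating the performance of the seizure detection method for each patient

| Patient | Sensitivity (%) | Specificity (%) | Selectivity (%) | False detections (/h) |
|---|---|---|---|---|
| 1 | 100 | 100 | 100 | 0 |
| 2 | 96.24 | 100 | 100 | 0 |
| 3 | 83.33 | 93.41 | 80.36 | 0.2 |
| 4 | 100 | 100 | 100 | 0 |
| 5 | 89.74 | 99.84 | 94.62 | 1 |
| 6 | 100 | 100 | 100 | 0 |
| 7 | 67.08 | 99.96 | 83.33 | 0.33 |
| 8 | 88.56 | 99.68 | 96.67 | 0.5 |
| 9 | 71.36 | 99.91 | 85.53 | 0.4 |
| 10 | 100 | 98.84 | 75.17 | 0 |
| 11 | 100 | 99.42 | 87.34 | 0.5 |
| 12 | 100 | 100 | 100 | 0 |
| 13 | 100 | 100 | 100 | 0 |
| 14 | 97.66 | 99.97 | 99.72 | 0.5 |
| 15 | 94.44 | 99.85 | 98.51 | 0.25 |
| 16 | 97.95 | 99.93 | 97.46 | 0 |
| 17 | 100 | 100 | 100 | 0 |
| 18 | 100 | 100 | 100 | 0 |
| 19 | 100 | 100 | 100 | 0 |
| 20 | 100 | 100 | 100 | 0 |
| 21 | 100 | 99.80 | 91.12 | 0 |
| Mean | 94.59 | 99.55 | 94.75 | 0.18 |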

Table 1 lists four statistical measures for each patient: sensitivity, specificity, selectivity and false detection rate. The best results, namely a sensitivity of 100%, a specificity of 100%, a selectivity of 100% and a false detection rate of 0/h, are obtained for nine patients (patients 1, 4, 6, 12, 13 and 17 to 20). The means of sensitivity, specificity and selectivity are all greater than 90.00%, and the mean false detection rate is 0.18/h. More than half of the patients (patients 1, 4, 6, 10, 11, 12, 13, 17 to 21) had a sensitivity of 100%. Thirteen patients (patients 1, 2, 4, 6, 10, 12, 13, 16 to 21) had no false detections.

To date, many seizure detection methods have been developed and investigated. Khan and Gotman[3] developed a seizure detection method for intracerebral monitoring using the features of relative energy, coefficient of variation and relative amplitude. The method was evaluated on long-term EEG data from 11 patients, including 229 h and 66 seizures, and achieved a sensitivity of 87%. Compared to their system, our proposed approach yielded a higher sensitivity.

Recently, Aarabi et al.[14] developed a fuzzy rule-based system for epileptic seizure detection in intracranial EEG. The system was based on knowledge obtained from experts' reasoning. Temporal, spectral and complexity features extracted from intracranial EEG segments were spatio-temporally integrated using the fuzzy rule-based system for seizure detection. The system yielded a sensitivity of 98.7% and a false detection rate of 0.27/h on the same database as ours. In comparison to their system, the false detection rate of our algorithm (0.18/h) is lower.

Chua et al.[15] proposed an improved patient-specific seizure detection method for pre-surgical evaluation. They presented a method for adapting a subject-independent seizure detection system to subject-specific ones using feedback from the EEG technologist. The subject-specific scheme yielded a sensitivity of 78% and a false alarm rate of 0.18/h when tested on 529 h of intracranial EEG containing 63 seizures from 15 subjects in the same database as our method. Compared to this system, our system obtained a higher sensitivity at the same false detection rate.

4 Conclusion

Visual scanning of EEG recordings for spikes and seizures is very time-consuming, especially in the case of long recordings. In this paper, we propose a novel method to detect seizures from long-term EEG using gradient boosting. The relative fluctuation index is extracted as the feature of the EEG signals. Gradient boosting is used to build a classifier that discriminates between seizure and non-seizure EEG. Sophisticated optimization algorithms, like those used for SVM or for independent component analysis (ICA), are not necessary for the boosting method presented here, which makes the algorithm fast to implement. A post-processing scheme composed of a moving average filter and a collar operation is applied to improve the performance of the detector. The seizure detection method is evaluated on the Freiburg dataset with 21 patients. Experimental results indicate that the proposed method achieves an average sensitivity of 94.60% and a specificity of 99.55% with a false detection rate of 0.18/h. Additionally, the low computational cost of this detection method makes real-time application possible.

[1] Sanei S, Chambers J A. EEG signal processing. Chichester: John Wiley & Sons Ltd, 2007.

[2] Gotman J. Automatic recognition of epileptic seizures in the EEG. Electroencephalography and Clinical Neurophysiology, 1982, 54(5): 530-540.

[3] Khan Y U, Gotman J. Wavelet based automatic seizure detection in intracerebral electroencephalogram. Clinical Neurophysiology, 2003, 114(5): 898-908.

[4] Gardner A, Krieger A, Vachtsevanos G, et al. One-class novelty detection for seizure analysis from intracranial EEG. Journal of Machine Learning Research, 2006, 7: 1025-1044.

[5] Temko A, Thomas E, Boylan G, et al. An SVM-based system and its performance for detection of seizures in neonates. In: Proceedings of IEEE International Conference on Engineering in Medicine and Biology, 2009: 2643-2646.

[6] Gabor A G, Leach R R, Dowla F U. Automated seizure detection using a self-organizing neural network. Electroencephalography and Clinical Neurophysiology, 1996, 99: 257-266.

[7] Hoffmann U, Garcia G, Vesin J M, et al. A boosting approach to P300 detection with application to brain-computer interfaces. In: Proceedings of the 2nd International IEEE EMBS Conference on Neural Engineering, 2005: 97-100.

[8] EEG database. Epilepsy Center of the University Hospital of Freiburg.[2014-12-01]. https://epilepsy.uni-freiburg.de/freiburg-seizure-prediction-project/eeg-database/.

[9] Ocak H. Automatic detection of epileptic seizures in EEG using discrete wavelet transform and approximate entropy. Expert Systems with Applications, 2009, 36(2): 2027-2036.

[10] Grewal S, Gotman J. An automatic warning system for epileptic seizures recorded on intracerebral EEGs. Clinical Neurophysiology, 2005, 116(10): 2460-2472.

[11] Friedman J H. Greedy function approximation: a gradient boosting machine. Annals of Statistics, 2001, 29: 1189-1232.

[12] Friedman J H. Stochastic gradient boosting. Computational Statistics & Data Analysis, 2002, 38(4): 367-378.

[13] Temko A, Thomas E, Marnane W, et al. EEG-based neonatal seizure detection with support vector machines. Clinical Neurophysiology, 2011, 122(3): 464-473.

[14] Aarabi A, Fazel-Rezai R. A fuzzy rule-based system for epileptic seizure detection in intracranial EEG. Clinical Neurophysiology, 2009, 120(9): 1648-1657.

[15] Chua E C, Patel K, Fitzsimons M, et al. Improved patient specific seizure detection during pre-surgical evaluation. Clinical Neurophysiology, 2011, 122(4): 672-679.


CHEN Shuang-shuang, ZHOU Wei-dong, GENG Shu-juan, et al. Approach for epileptic EEG detection based on gradient boosting. Journal of Measurement Science and Instrumentation, 2015, 6(1): 96-102.

doi: 10.3969/j.issn.1674-8042.2015.01.017

Foundation items: Key Program of Natural Science Foundation of Shandong Province (No.ZR2013FZ002); The Program of Science and Technology of Suzhou (No.ZXY2013030); Independent Innovation Foundation of Shandong University (No.11170074611102)

ZHOU Wei-dong (wdzhou@sdu.edu.cn)

Article ID: 1674-8042(2015)01-0096-07

Received date: 2014-12-10

CLD number: TN911.7 Document code: A