(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 8, No. 6, September 2010
ISSN 1947-5500

Prediction of Epileptic form Activity in Brain
Electroencephalogram Waves using
Support Vector Machine

Pavithra Devi S T
M.Phil Research Scholar
PSGR Krishnammal College for Women
Coimbatore, Tamilnadu, INDIA

Vijaya M S
Assistant Professor and Head
GRG School of Applied Computer Technology
PSGR Krishnammal College for Women
Coimbatore, Tamilnadu, INDIA
ABSTRACT
The human brain is a highly complex structure composed of millions of nerve cells. Their activity is usually well organized, with mechanisms for self-regulation. The neurons are responsible for a range of functions, including consciousness, bodily functions and posture. A sudden temporary interruption in some or all of these functions is called a seizure. Epilepsy is a brain disorder that causes people to have recurring seizures. The electroencephalogram (EEG) is an important test for diagnosing epilepsy because it records the electrical activity of the brain. This paper investigates the modeling of epilepsy prediction using the Support Vector Machine, a supervised learning algorithm. The prediction model has been built by training a support vector machine with descriptive features derived from the EEG data of 324 patients, and the experimental results show that the SVM model with an RBF kernel achieves 86% accuracy in predicting epilepsy in the human brain.

Keywords
Support Vector Machine, Epilepsy, Prediction, Supervised Learning.

1. INTRODUCTION
Epilepsy is a disorder characterized by recurrent seizures of cerebral origin, presenting with episodes of sensory, motor or autonomic phenomena, with or without loss of consciousness. Epilepsy is a disorder of the central nervous system, specifically the brain [1]. The brain is one of the most vital human organs, controlling the coordination of muscles and nerves. Epileptic seizures typically lead to an assortment of temporal changes in perception and behavior. The kind of epilepsy is determined from the physiological characteristics of the seizures and the abnormality in the brain. Epilepsy is broadly classified into absence epilepsy, simple partial, complex partial and general epilepsy. Absence epilepsy is a brief episode of staring. It usually begins between ages 4 and 14, and may continue into adolescence or even adulthood. Simple partial epilepsy affects only a small region of the brain, often the hippocampus. It can also include sensory disturbances, such as smelling or hearing things that are not there, or a sudden flood of emotions. Complex partial epilepsy usually starts in a small area of the temporal lobe or frontal lobe of the brain. In general epilepsy the patient becomes unconscious and has a general tonic contraction of all muscles, followed by alternating clonic contractions; it affects the entire brain.

Various diagnostic techniques such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), the Electroencephalogram (EEG), and Positron Emission Tomography (PET) are commonly used. Electroencephalography (EEG) is the recording of electrical activity along the scalp produced by the firing of neurons within the brain. In clinical contexts, EEG refers to the recording of the brain's spontaneous electrical activity over a short period of time, usually 20-40 minutes, from multiple electrodes placed on the scalp. The EEG signal is one of the most widely used signals in the bioinformatics field for epilepsy identification, owing to the rich information it carries about brain activity and to characteristics such as frequency range, spatial distribution and peak frequency. Neurologists examine the spectral waveform of the EEG signal to identify the presence of epilepsy.

Machine learning provides methods, techniques and tools that help to learn automatically and to make accurate predictions based on past observations. Current empirical results show that the machine learning approach is well suited to analyzing medical data, and machine learning techniques have produced promising research results in medical domains.

Forrest Sheng Bao developed a neural network based model for epilepsy diagnosis using EEG [1]. Piotr Mirowski implemented a model based on classification of patterns of EEG synchronization for seizure prediction using a neural network [2]. Suleiman A.B.R. proposed a new approach for describing and classifying the EEG brain natural oscillations, such as the delta, theta, alpha and beta frequencies,

using Wigner-Ville analysis with Choi-Williams filtering and a neural network [3].

The motivation behind the research reported in this paper is to predict the presence of epilepsy in the human brain. A supervised learning technique, a kind of machine learning algorithm, is used to model the epilepsy prediction problem as a classification task to assist physicians in the accurate prediction of epilepsy in patients.

In this paper, the prospective benefits of a supervised learning algorithm, namely the support vector machine, are exploited for the computerized prediction of epilepsy. The proposed SVM based epilepsy prediction model is shown in Figure 1.

[Figure 1: pipeline - Feature Extraction using Wavelet Toolbox in MATLAB -> SVM Training -> SVM Based Prediction Model -> Prediction]

Figure 1. Proposed SVM based epilepsy prediction model

2. DATA ACQUISITION
EEGs show continuous oscillating electrical activity. The amplitude and the patterns are determined by the overall excitation of the brain, which in turn depends on the activity of the reticular activating system in the brain stem. Amplitudes on the surface of the brain can be up to 10 mV; those on the surface of the scalp range up to 100 uV. Frequencies range from 0.5 to 100 Hz. The pattern changes markedly between states of sleep and wakefulness. Distinct patterns are seen in epilepsy, and five classes of wave groups are described: alpha, beta, gamma, delta and theta.

Alpha waves contain frequencies between 8 and 13 Hz, with amplitude less than 10 uV. They are found in normal people who are awake and resting quietly, not engaged in intense mental activity. Their amplitude is highest in the occipital region. When the person is asleep, the alpha waves disappear. When the person is alert and their attention is directed to a specific activity, the alpha waves are replaced by asynchronous waves of higher frequency and lower amplitude.

Beta waves have a frequency range of 14 to 22 Hz, extending to 50 Hz under intense mental activity. They reach their maximum amplitude (less than 20 uV) on the parietal and frontal regions of the scalp. There are two types: beta I waves, of lower frequency, which disappear during mental activity, and beta II waves, of higher frequency, which appear during tension and intense mental activity.

Gamma waves have frequencies between 22 and 30 Hz, with amplitude of less than 2 uV peak-to-peak, and are found when the subject is paying attention or receiving some other sensory stimulation.

Theta waves have a frequency range of 4 to 7 Hz, with amplitude of less than 100 uV. They occur mainly in the parietal and temporal regions, in sleep, in children when awake, and during emotional stress in some adults, particularly during disappointment and frustration. Sudden removal of something causing pleasure will cause about 20 s of theta waves.

Delta waves have frequency content between 0.5 and 4 Hz, with an amplitude of less than 100 uV. They occur during deep sleep, during infancy and in serious organic brain disease. They will occur after transections of the upper brain stem separating the reticular activating system from the cerebral cortex. They are found in the central cerebrum, mostly the parietal lobes.

Five sets of images, namely Normal Epilepsy, Absence Epilepsy, Simple Partial Epilepsy, Complex Partial Epilepsy and General Epilepsy, are taken into consideration.

3. FEATURE EXTRACTION
The feature extraction process plays a very important role in classification. Fourier transform methods, discrete transform methods and continuous transform methods are normally available to extract features that characterize EEG signals. The wavelet transform (WT) provides very general techniques which can be applied to many tasks in signal processing. Wavelets are ideally suited for the analysis of sudden short-duration signal changes.

In the proposed model, EEG signal analysis and feature extraction have been performed using the Discrete Wavelet Transform (DWT). The DWT is a special case of the WT that provides a compact representation of a signal in time and frequency and can be computed efficiently.


The DWT is defined by the following equation:

    psi_(a,b)(t) = 2^(a/2) psi(2^a (t - b))                 (1)

where a and b are the scale and position of the wavelet, and psi(t) is a mother-wavelet time function with finite energy. Scales and positions are chosen based on powers of two, called dyadic scales and positions (a_j = 2^(-j); b_(j,k) = 2^(-j) k, with j and k integers). Equation (1) shows that it is possible to build a wavelet for any function by dilating a function psi(t) with a coefficient 2^j and translating the resulting function on a grid whose interval is proportional to 2^(-j).

The selection of the appropriate wavelet and the number of decomposition levels is very important in the analysis of signals using the WT. The number of decomposition levels is chosen based on the dominant frequency components of the signal. The levels are chosen such that those parts of the signal that correlate well with the frequencies required for classification of the signal are retained in the wavelet coefficients. The smoothing feature of the Daubechies wavelet of order 2 (db2) makes it suitable for detecting changes in the signals; thus, the wavelet coefficients are computed using db2, with the frequency bands corresponding to the different levels of decomposition at a sampling frequency of 256 Hz. The discrete wavelet coefficients are computed using the MATLAB wavelet toolbox.

The purpose of feature extraction is to reduce the size of the original dataset by measuring certain properties or features that distinguish one input pattern from another. Various measurements based on statistical features are extracted from the EEG. The extracted features provide the characteristics of the input type to the classifier by mapping the relevant properties of the signals into a feature space.

Statistical features of the wavelet coefficients in each subband, namely energy, entropy, minimum subband, maximum subband, mean, and standard deviation, are used to investigate their adequacy for discriminating normal and abnormal patients. These features are defined as follows.

Entropy - Characterizes the distribution of the signal values:

    Entropy = - SUM_(i,j) P(i,j) log P(i,j)                 (2)

where P(i,j) reflects the probability of occurrence of each signal value (i and j are integers).

Energy - Provides the sum of squared elements in the wavelet, also known as uniformity or the angular second moment. The energy E is computed as

    E = (SUM_(i=1..N) x_i^2) / N                            (3)

where x_i is a signal value and N is the total number of signal values in the subband.

Maximum subband - The maximum of the wavelet coefficients in each subband:

    Max = max(x_i)                                          (4)

where max(x_i) is the maximum signal value.

Mean - The average value of the distribution of the wavelet coefficients in each subband:

    mu = (SUM_(i=1..N) x_i) / N                             (5)

where x_i is a signal value and N is the total number of signal values in the subband.

Minimum subband - The minimum of the wavelet coefficients in each subband:

    Min = min(x_i)                                          (6)

where min(x_i) is the minimum signal value.

Standard deviation - The standard deviation sigma of each subband provides information about the amount of variation in the frequency distribution:

    sigma = sqrt( (SUM_(i=1..N) (x_i - mu)^2) / N )         (7)

where mu is the mean of the corresponding signal values x_i.

Thus a total of 21 statistical features are extracted from the EEG signal over the subbands for preparing the dataset.
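As an illustration, the subband statistics of Eqs. (2)-(7) can be sketched in Python. The paper computes db2 coefficients with the MATLAB wavelet toolbox; purely to keep this sketch self-contained, it substitutes a one-level Haar transform, and it follows one common wavelet-entropy convention (normalized squared coefficients as the probabilities P). All function names here are illustrative.

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT (a stand-in for db2): returns (approx, detail)."""
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def subband_features(coeffs):
    """Energy, entropy, min, max, mean and std of one subband, Eqs. (2)-(7)."""
    n = len(coeffs)
    total = sum(x * x for x in coeffs)
    energy = total / n                                           # Eq. (3)
    entropy = 0.0                                                # Eq. (2)
    if total > 0:
        probs = [x * x / total for x in coeffs]
        entropy = -sum(p * math.log(p) for p in probs if p > 0)
    mean = sum(coeffs) / n                                       # Eq. (5)
    std = math.sqrt(sum((x - mean) ** 2 for x in coeffs) / n)    # Eq. (7)
    return {"energy": energy, "entropy": entropy,
            "min": min(coeffs), "max": max(coeffs),              # Eqs. (6), (4)
            "mean": mean, "std": std}
```

In a full pipeline, the transform would be applied recursively to the approximation coefficients for each decomposition level, and the six statistics collected per subband to form the feature vector.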


4. SUPPORT VECTOR MACHINE
The Support Vector Machine (SVM) is a kind of learning machine based on statistical learning theory. The SVM is basically applied to model pattern classification tasks. The SVM first maps the input vectors into feature vectors in a feature space of higher dimension, either linearly or non-linearly. Then, within the feature space, the SVM constructs a hyperplane which separates the two classes. SVM training always seeks a globally optimized solution and avoids over-fitting, so it has the ability to deal with a large number of features. The machine is presented with a set of training examples (x_i, y_i), where the x_i are the real-world data instances and the y_i are the labels indicating which class each instance belongs to. For the two-class pattern recognition problem, y_i = +1 or y_i = -1. A training example (x_i, y_i) is called positive if y_i = +1 and negative otherwise. SVMs construct a hyperplane that separates the two classes while trying to achieve maximum separation between them. Separating the classes with a large margin minimizes a bound on the expected generalization error.

The simplest model of SVM, called the maximal margin classifier, constructs a linear separator (an optimal hyperplane) given by w^T x - gamma = 0 between two classes of examples. The free parameters are a vector of weights w, which is orthogonal to the hyperplane, and a threshold value gamma. These parameters are obtained by solving the following optimization problem using Lagrangian duality:

    Minimize (1/2) ||w||^2
    subject to D_ii (w . x_i - gamma) >= 1,  i = 1, ..., l      (8)

where D_ii corresponds to the class labels +1 and -1. The instances with non-null weights are called support vectors. In the presence of outliers and wrongly classified training examples it may be useful to allow some training errors in order to avoid over-fitting. A vector of slack variables xi_i that measure the amount of violation of the constraints is introduced, and the resulting optimization problem is referred to as the soft margin formulation. In this formulation the contributions to the objective function of margin maximization and of training errors can be balanced through the regularization parameter C.

The following decision rule is used to predict the class of a new instance with minimum error:

    f(x) = sgn(w^T x - gamma)

The advantage of the dual formulation is that it permits efficient learning of non-linear SVM separators by introducing kernel functions. Technically, a kernel function calculates a dot product between two vectors that have been (non-linearly) mapped into a high-dimensional feature space. Since there is no need to perform this mapping explicitly, training remains feasible even though the dimension of the real feature space can be very high or even infinite. The parameters are obtained by solving the following non-linear SVM formulation (in matrix form):

    Minimize L_D(u) = (1/2) u^T Q u - e^T u
    subject to d^T u = 0,  0 <= u <= C e                        (9)

where Q = DKD and K is the kernel matrix. The kernel function K(A A^T) (polynomial or Gaussian) is used to construct a hyperplane in the feature space which separates the two classes linearly, while performing the computations in the input space. The decision rule becomes

    f(x) = sgn(K(x, x_i^T) u - gamma)

where u are the Lagrange multipliers. In general, the larger the margin, the lower the generalization error of the classifier.

5. EXPERIMENTAL SETUP
The data investigation and epilepsy prediction are carried out using SVMlight, an open source machine learning tool. The five categories of feature vectors are labeled as 1 for Absence, 2 for General, 3 for Complex Partial Epilepsy, 4 for Normal and 5 for Simple Partial Epilepsy. The training dataset used for epilepsy prediction modeling consists of about 324 images.

The dataset has been trained using the SVM with linear, polynomial and RBF kernels, and with different settings for d, gamma and the regularization parameter C. The parameters d and gamma are associated with the polynomial kernel and the RBF kernel respectively.

The 10-fold cross validation method is used for evaluating the performance of the trained SVM models. The performance of the models is evaluated based on prediction accuracy and learning time.

6. RESULTS AND DISCUSSION
The cross validation outcome of the trained models based on the support vector machine with a linear kernel is shown in Table I.

Table I. SVM linear kernel

    Linear SVM      C=0.1    C=0.2    C=0.3    C=0.4
    Accuracy (%)    70       72       76       78
    Time (secs)     0.01     0.02     0.02     0.03
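To make the kernel decision rule of Section 4 concrete, the following sketch evaluates f(x) = sgn(SUM_i u_i K(x_i, x) - gamma) with a Gaussian (RBF) kernel. The support vectors, multipliers and threshold are illustrative placeholders, not values learned by SVMlight, and the signed multipliers are assumed to fold in the class labels y_i.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svm_predict(x, support_vectors, u, threshold, gamma):
    """Decision rule f(x) = sgn(sum_i u_i K(x_i, x) - threshold).
    Each u_i is a signed Lagrange multiplier (label folded in)."""
    score = sum(ui * rbf_kernel(sv, x, gamma)
                for sv, ui in zip(support_vectors, u)) - threshold
    return 1 if score >= 0 else -1
```

For the multi-class problem in this paper, such a binary rule would be combined across the five categories, e.g. in a one-versus-rest scheme.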

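The 10-fold cross validation used to evaluate the models can be sketched as follows. Here `train_fn` and `predict_fn` are caller-supplied stand-ins for the SVMlight training and prediction steps, and the fold split is a simple contiguous partition (in practice the data would typically be shuffled first); all names are illustrative.

```python
def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k nearly equal contiguous folds."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, labels, train_fn, predict_fn, k=10):
    """Average accuracy over k folds: each fold is held out once as the
    test set while a model is trained on the remaining folds."""
    folds = k_fold_indices(len(data), k)
    accs = []
    for test_idx in folds:
        test_set = set(test_idx)
        train_idx = [i for i in range(len(data)) if i not in test_set]
        model = train_fn([data[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, data[i]) == labels[i]
                      for i in test_idx)
        accs.append(correct / len(test_idx))
    return sum(accs) / len(accs)
```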

    The outcome of the model based on SVM with
polynomial kernel and with parameters d and C are shown                                                                                   Learning Time
in Table II.
                       Table II. SVM Polynomial kernel                                                         0.7
                       C=0.1                    C=0.2              C=0.3               C=0.4                   0.5
       d               1          2         1          2       1           2       1           2               0.4                                               Time(secs)
Accuracy (%)       70             80        82        80       80        81        74       75
                                                                                                               0.1     0.056
 Time(secs)        0.2           0.1        0.2       0.6      0.3       0.1       0.3      0.4
                                                                                                                     Linear    Polynomial     RBF

   The predictive accuracy of the non-linear support
                                                                                                                                   Figure 3: Learning Time
vector machine with the parameter gamma (g) of RBF
kernel and the regularization parameter C is shown in
Table III.                                                                                                   As far as the epilepsy predictions task is anxious,
                                                                                                         accuracy plays major role in determining the performance
                                                                                                         of the epilepsy trained model than considering the learning
                               Table III.SVM RBF kernel                                                  time. From the above results, it is found that the predictive
                           C=0.1                    C=0.2           C=0.3          C=0.4                 accuracy shown by SVM with RBF kernel with parameters
                                                                                                         C=0.2 and g=2 is higher than the SVM with linear and
       g                   1       2            1          2    1          2       1       2             polynomial kernel.
Accuracy (%)           80          83        83         81      83       86    85          77
                                                                                                                                      7. CONCLUSION
  Time(secs)           0.2        0.3        0.4        0.4     0.5      1.5   1.6        1.2                This paper elucidates the modeling of the epileptic
                                                                                                         seizure prediction task as multi-class classification problem
    The average and comparative performance of the SVM
based prediction models in terms of predictive accuracy and
learning time is given in Table IV and shown in Figure 1
and Figure 2.

          Table IV. Overall performance of three models

      Kernel type          Accuracy         Learning time
      Linear               84.96%           0.027 secs
      Polynomial           90.12%           0.362 secs
      RBF                  93.87%           0.787 secs

      [Figure: Prediction Accuracy (bar chart of the
      prediction accuracy, in %, of the three kernels)]

and the implementation of the supervised learning algorithm,
the support vector machine. The performance of the SVM
based epilepsy prediction models is evaluated using 10-fold
cross validation and the results are analyzed. The results
indicate that the support vector machine with the RBF kernel
provides higher prediction accuracy than the other kernels.
SVM performs better than conventional methods and shows
good performance in all the experiments; it is very flexible
and more powerful because of its robustness. It is hoped that
more interesting results will follow on further exploration of
the data.

                    8. ACKNOWLEDGMENT

   The authors would like to thank the Management and
Hospital, Coimbatore, for providing the EEG data.
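The 10-fold cross-validation comparison of the three kernels summarized in Table IV can be sketched as below. This is an illustrative scikit-learn snippet, not the authors' implementation: the synthetic data stands in for the actual EEG feature vectors, and the hyperparameters (C, gamma) are default placeholders rather than the paper's settings.

```python
# Score an SVM with linear, polynomial and RBF kernels using
# 10-fold cross validation, mirroring the evaluation protocol above.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Placeholder feature matrix and epileptic/normal labels;
# the real pipeline would supply EEG-derived feature vectors.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

scores = {}
for kernel in ("linear", "poly", "rbf"):
    clf = SVC(kernel=kernel, C=1.0, gamma="scale")
    # Mean predictive accuracy over the 10 folds.
    scores[kernel] = cross_val_score(clf, X, y, cv=10).mean()

for kernel, acc in sorted(scores.items()):
    print(f"{kernel}: {acc:.4f}")
```

With real EEG features, the relative ranking of the kernels would depend on tuning C and gamma for each kernel, e.g. via a grid search inside each training fold.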

                       9. REFERENCES

[1] Forrest Sheng Bao, Jue-Ming Gao, Jing Hu, Donald Y. C.
    Lie, Yuanlin Zhang, and K. J. Oommen, "Automated
    Epilepsy Diagnosis Using Interictal Scalp EEG", 31st
    Annual International Conference of the IEEE EMBS,
    Minneapolis, Minnesota, USA, September 2-6, 2009.
[2] Piotr Mirowski, Deepak Madhavan, Yann LeCun,
    Ruben Kuzniecky, "Classification of Patterns of EEG
    Synchronization for Seizure Prediction".
[3] A. R. Sulaiman, "Joint Time-Frequency Analysis and Its
    Application for Non-Stationary Signals", Ph.D. Thesis,
    Elect. Eng. Dept., University of Mosul, 2001.
[4] Webster, J. G., "Medical Instrumentation: Application and
    Design", 2nd ed., New York: Wiley, 1995.

[5] Nello Cristianini and John Shawe-Taylor, "An
    Introduction to Support Vector Machines and other
    kernel-based learning methods", Cambridge University
    Press, 2000.
[6] K. Crammer and Y. Singer, "On the Algorithmic
    Implementation of Multi-class SVMs", JMLR, 2001;
    Vojislav Kecman, "Learning and Soft Computing:
    Support Vector Machines, Neural Networks, Fuzzy Logic
    Systems", The MIT Press, Cambridge, MA, 2001.
[7] Chui, C. K., "Wavelets: A Tutorial in Theory and
    Applications", Academic Press, 1992.
[8] Ian H. Witten, Eibe Frank, Len Trigg, Mark Hall, Geoffrey
    Holmes, Sally.
[9] Ian H. Witten, Eibe Frank, "Data Mining: Practical
    Machine Learning Tools and Techniques", 2nd edn.,
    Elsevier, 2005.
[10] Joachims, T., "Making Large-Scale SVM Learning
    Practical", in Schölkopf, B., Burges, C., Smola, A. (eds.),
    Advances in Kernel Methods: Support Vector Learning,
    MIT Press, Cambridge, MA, USA, 1999.
[11] John Shawe-Taylor, Nello Cristianini, "Support Vector
    Machines and other kernel-based learning methods",
    Cambridge University Press, UK, 2000.
[12] Soman, K. P., Loganathan, R., Ajay, V., "Machine
    Learning with SVM and other Kernel Methods", PHI,
    India, 2009.
[13] Crammer, Koby, Yoram Singer, "On the Algorithmic
    Implementation of Multi-class Kernel-based Vector
    Machines", Journal of Machine Learning Research, MIT
    Press, Cambridge, MA, USA, 2001, Vol. 2, pp. 265-292.

                                                                                                 ISSN 1947-5500
