Statistical analysis of the EEG signals of a healthy individual and a patient with epilepsy

## Abstract

In this study, EEG data from two volunteers, a healthy individual and a patient with epilepsy, were investigated with two different methods in order to distinguish the healthy individual from the patient. The data were obtained from the healthy individual and from the patient with epilepsy, both during an epileptic seizure and during a seizure-free interval. The data, whose validity and reliability have been established, were taken from the data bank records of the University Hospital of Bonn in Germany. The statistical parameters of the data were calculated first, the same data were then analysed using the short-time Fourier transform (STFT), and the two sets of results were compared. The statistical parameters and the spectrum analysis are compatible with each other, and both can distinguish the healthy individual from the patient with epilepsy during a seizure and during a seizure-free interval. This agreement offers significant information for the diagnosis of the disease. The variance values were determined as 253.203 for the healthy individual, 806.939 for the patient in the seizure-free interval and 6985.755 for the same patient at the time of seizure. Accordingly, the variance, and with it the standard deviation, is quite distinctive across the three cases. In the STFT analysis, the frequencies of the three cases were concentrated around 10 Hz, in the range of 0–5 Hz and in the range of 0–20 Hz, respectively, which is consistent with the statistical parameters.

### Keywords

- electroencephalogram
- statistical analysis
- epilepsy
- STFT
- seizure

## 1. Introduction

Seizures are temporary clinical conditions, including loss of consciousness and sensory, autonomic and mental disturbances, that arise at certain intervals from excessive electrical discharges in the nerve cells of the brain. The condition that becomes chronic with the repetition of these seizures is called epilepsy. Epilepsy is a chronic disorder of the brain that can be encountered in people of all age groups. It is a neurological disease most commonly seen in childhood and adolescence, and in adults it is the second most common neurological disease after cerebrovascular diseases [1, 2, 3]. According to World Health Organization data, 50 million people around the world have epilepsy [4, 5]. In addition to clinical information, electroencephalography (EEG) is used as an auxiliary method in the diagnosis of epilepsy, and EEG analysis is performed on patients who are thought to have epileptic seizures [6, 7, 8]. Mathematical and spectral methods are used very effectively in the analysis of EEG data for the diagnosis of the disease [9, 10, 11].

The EEG method forms the basis of epilepsy science; its history dates back to the 1940s, and it has been in use since then. In principle, it is based on recording the fluctuations of the electrical activity of neurons in the brain. The main contributions of EEG in epileptic cases can be summarized as follows: it supports a clinically identified diagnosis; it is used as a confirmatory test; it helps to make the diagnosis correctly; together with other findings, it directly or indirectly identifies the seizure type and the epilepsy syndrome; and it indicates the location of the focus [12].

EEG waves are commonly grouped into four frequency bands.

Delta (δ) waves have frequencies of 1–4 Hz and amplitudes of 20–400 μV. They are seen when the brain has very low activity, such as in deep sleep and general anesthesia, and are associated with immune system activity and natural recovery.

Theta (θ) waves have frequencies of 5–7 Hz and amplitudes of 5–100 μV. They are seen when the brain has low activity, such as in dreaming sleep, moderate anesthesia, stress and emotional involvement.

Alpha (α) waves have frequencies of 8–13 Hz and amplitudes of 2–10 μV. They are seen when awake individuals are fully at rest physically and mentally, in a relaxed position with eyes closed and without any external stimulus. They are most prominent in recordings obtained from the occipital region.

Beta (β) waves have frequencies of 14–30 Hz and amplitudes of 1–5 μV. They are seen during focused attention, mental work, problem solving, memory tasks, sensory information processing and the rapid eye movement (REM) phase of sleep [13, 14, 15, 16].
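The band limits listed above can be turned into a small lookup, sketched here under stated assumptions. The name `eeg_band` and the table `EEG_BANDS` are hypothetical, and the handling of frequencies that fall between the quoted integer limits (for example 4.5 Hz, assigned here to the lower band) is a choice of this sketch, not something the text specifies.

```python
# Band edges follow the ranges quoted in the text (delta 1-4, theta 5-7,
# alpha 8-13, beta 14-30 Hz). Each band is taken to run up to the start of
# the next one, which is an assumption made only to avoid gaps.
EEG_BANDS = [
    ("delta", 1.0, 5.0),
    ("theta", 5.0, 8.0),
    ("alpha", 8.0, 14.0),
    ("beta", 14.0, 30.0),
]


def eeg_band(freq_hz):
    """Return the band name for freq_hz, or None outside the 1-30 Hz range."""
    for name, low, high in EEG_BANDS:
        if low <= freq_hz < high:
            return name
    # 30 Hz is the quoted upper edge of the beta band, so include it.
    return "beta" if freq_hz == EEG_BANDS[-1][2] else None


for f in (2, 6, 10, 20, 40):
    print(f, "Hz ->", eeg_band(f))
```

With these limits, a spectral peak near 10 Hz is labelled as alpha activity, while anything above 30 Hz is left unlabelled.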

The EEG signals used in this study were recorded at the University Hospital of Bonn in Germany [17]. The dataset consists of five subsets (denoted A, B, C, D and E) that were recorded with the same 128-channel amplifier system and a 12-bit analog-to-digital converter. Each subset contains 100 segments with a sampling frequency of 173.61 Hz and a duration of 23.6 s, i.e. 4096 sample points; the corresponding frequency bandwidth is 86.8 Hz. Subsets B, D and E were analysed in this study. The EEG samples in set B were obtained from five healthy volunteers via external surface electrodes with eyes closed. Set D consists of EEG segments recorded in seizure-free intervals from patients with epilepsy, using intracranial electrodes to monitor epileptic activity; these data were obtained from the epileptic area. Set E contains EEG data recorded from patients with epilepsy at the time of seizure, using strip electrodes.
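These figures can be cross-checked with elementary sampling arithmetic, reading the quoted bandwidth as the Nyquist frequency $f_s/2$ (an interpretation; the text does not say how the bandwidth was defined):

$$
T = \frac{N}{f_s} = \frac{4096}{173.61\ \text{Hz}} \approx 23.6\ \text{s}, \qquad \frac{f_s}{2} = \frac{173.61\ \text{Hz}}{2} \approx 86.8\ \text{Hz}
$$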

## 2. Statistical and mathematical background

EEG signals are not deterministic. Since EEG signals do not have a specific shape as electrocardiogram (ECG) signals do, statistical and parametric methods are used in the analysis of EEG signals [18, 19, 20]. Spectral methods are used for the classification and characterization of EEG signals [21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35].

### 2.1 Statistical analysis methods

Statistical parameters are used to extract the necessary properties in most analyses performed in the time domain. The mean and median of a signal are expected to be very close to zero when the signal is periodic and sinusoidal, whereas in non-periodic signals they can move away from zero in either the positive or the negative direction. The most basic parameters, the mean value μ and the standard deviation σ, can therefore produce distinctive results on non-periodic signals [19]. For a given data set $\{x_i\}$, these are defined as follows:
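In their standard form, and assuming the population (1/N) normalization rather than the sample (1/(N−1)) one, which the text does not specify, Eqs. (1) and (2) can be written as:

$$
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i \tag{1}
$$

$$
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^{2}} \tag{2}
$$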

where N is the number of data points.

Knowing the standard deviation of a series of numbers means knowing how widely the series is spread around its average. A larger standard deviation indicates that the data points lie further from the average; a smaller standard deviation indicates that the data points cluster more closely around it.

In practice, data often follow a normal (Gaussian) probability distribution, which is a consequence of the central limit theorem. According to the central limit theorem, the sum of independent random variables that all have the same distribution tends towards a normal distribution in the limit. Skewness (α) and kurtosis (β), two parameters that describe how the shape of a distribution compares with the Gaussian, are given in the following equations [20]:
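In their standard moment-based form, assuming a 1/N normalization and a kurtosis that is not offset by 3 (so that a normal distribution gives β = 3), Eqs. (3) and (4) can be written as:

$$
\alpha = \frac{1}{N\sigma^{3}}\sum_{i=1}^{N}\left(x_i - \mu\right)^{3} \tag{3}
$$

$$
\beta = \frac{1}{N\sigma^{4}}\sum_{i=1}^{N}\left(x_i - \mu\right)^{4} \tag{4}
$$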

Here, α equal to zero indicates a symmetric distribution, as is the case for a perfect normal distribution, and negative or positive values of α indicate that the distribution is skewed to the left or to the right. If the skewness is negative, the tail of the curve extends to the left and the distribution is concentrated on the right side of the graph. If the skewness is positive, the tail of the curve extends to the right and the distribution is concentrated on the left side of the graph. The kurtosis (β) equals 3 for the normal distribution, so values very close to 3 are expected for normally distributed data. These statistical parameters can be used to quickly check for changes in the statistical behaviour of a signal [18, 19, 20].
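The four parameters discussed in this section can be computed in a few lines. The following is a minimal sketch rather than the procedure actually used in the study: the function name `describe` is hypothetical, the 1/N normalization is an assumption, and the input is synthetic Gaussian noise instead of the Bonn recordings.

```python
import math
import random


def describe(samples):
    """Mean, standard deviation, variance, skewness and kurtosis (1/N forms)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(d * d for d in centered) / n
    std = math.sqrt(variance)  # a constant signal gives std == 0 and fails below
    skewness = sum(d ** 3 for d in centered) / (n * std ** 3)
    kurtosis = sum(d ** 4 for d in centered) / (n * std ** 4)  # 3 for a Gaussian
    return {
        "mean": mean,
        "std": std,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Synthetic stand-in for one 4096-sample segment (not real EEG data).
rng = random.Random(0)
gaussian = [rng.gauss(0.0, 15.0) for _ in range(4096)]
stats = describe(gaussian)
print({name: round(value, 3) for name, value in stats.items()})
```

For Gaussian input the skewness should come out near 0 and the kurtosis near 3, which is the baseline against which a measured segment can be compared.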

### 2.2 Fourier transform and STFT

The Fourier transform (FT) is one of the most effective methods for processing a signal in order to extract the information it contains. In the Fourier transform, a signal is expressed as the sum of fundamental cosine and sine components with different amplitudes, frequencies and phases. Tabulating each component with its frequency and amplitude makes the data convenient to process by computer. The equations for the Fourier transform are given below in Eq. (5) and Eq. (6) [23, 26, 29, 30, 33]:
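In the standard continuous-time convention with frequency f in Hz (an assumed convention; the angular-frequency form differs only by factors of 2π), the forward and inverse transforms of Eqs. (5) and (6) can be written as:

$$
X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt \tag{5}
$$

$$
x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j 2\pi f t}\, df \tag{6}
$$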

Building on the Fourier transform, the short-time Fourier transform (STFT) and the spectrogram were developed by Gabor in 1946. This method reveals time and frequency localization most clearly [17]. STFT provides very decisive results in the analysis of signals. The signal x(t) is analysed with a fixed window size and, consequently, a fixed frequency resolution. To define the STFT, consider a signal x(t) that is assumed to be stationary when viewed through a window g(t) of fixed dimension, centred at time location τ. The Fourier transform of the windowed signal yields the STFT [23, 25, 26, 33, 34, 35].
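With that definition, and writing $g^{*}$ for the complex conjugate of the window (for a real window $g^{*} = g$), the STFT of Eq. (7) takes the standard form below; the symbol name $\mathrm{STFT}_x$ is a notational assumption:

$$
\mathrm{STFT}_x(\tau, f) = \int_{-\infty}^{\infty} x(t)\, g^{*}(t - \tau)\, e^{-j 2\pi f t}\, dt \tag{7}
$$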

Similarly, for discrete-time signals, this two-dimensional time-frequency function of (t, f) is given in Eq. (8). Once the window g(t) is chosen, the STFT resolution is fixed over the entire time-frequency plane [23, 24, 25, 26, 27, 28, 29, 30]:
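A standard discrete-time counterpart, in which the sample indices n and m, the frequency bin k and the number of frequency bins K are notational assumptions not taken from the text, is:

$$
\mathrm{STFT}_x[n, k] = \sum_{m=-\infty}^{\infty} x[m]\, g^{*}[m - n]\, e^{-j 2\pi k m / K} \tag{8}
$$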

The spectrogram is given in Eq. (9):
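In its usual definition, the spectrogram is the squared magnitude of the STFT, so that Eq. (9) can be written (with $\mathrm{STFT}_x(\tau, f)$ denoting the STFT of x(t) at window position τ and frequency f) as:

$$
S_x(\tau, f) = \left|\mathrm{STFT}_x(\tau, f)\right|^{2} \tag{9}
$$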

## 3. Analysis and application

In this study, EEG data from two different volunteers were analysed. One of these individuals is healthy and the other is a patient with epilepsy. The healthy individual’s data were recorded with eyes closed (Figure 1a). The data of the patient with epilepsy were collected from the epileptic area during a seizure-free interval (Figure 1b).

When the data of the two individuals in Figure 1 are examined, the amplitude of the healthy individual is seen to vary in the range of 0–40 μV (Figure 1a), whereas the amplitude of the patient with epilepsy rises to 125 μV (Figure 1b). As this graph shows, the amplitude difference between the healthy individual and the patient with epilepsy is sufficient to diagnose the disease clearly, since the patient’s amplitude reaches these high values even in the absence of a seizure (Figure 1b). When the same individual has a seizure (Figure 1c), the amplitudes increase much further. These graphs clearly indicate whether the individual is a patient with epilepsy and whether the patient is having a seizure (Table 1).

Table 1. Statistical parameters of the EEG data for the three cases.

Condition | Mean (μ) | Standard deviation (σ) | Variance | Skewness (α) | Kurtosis (β) |
---|---|---|---|---|---|
Healthy | −1.347 | 15.912 | 253.203 | −0.027 | 3.043 |
Seizure-free | −5.24 | 28.406 | 806.939 | 1.437 | 5.738 |
Seizure | −2.521 | 83.580 | 6985.755 | −0.100 | 3.268 |

When the statistical parameters of the data from the healthy individual and from the epileptic area of the patient with epilepsy are examined, significant differences are seen both in the mean value and in the standard deviation. The mean value is −1.347 for the healthy individual, −5.24 for the patient with epilepsy in the seizure-free interval and −2.521 when the patient has a seizure. The standard deviation rises from 15.912 to 28.406 and then to 83.580 across the same three cases, so it can be said to be quite decisive in the diagnosis of the disease. However, while the healthy individual and the patient with epilepsy can be distinguished from each other in the histograms, the histogram at the time of seizure resembles a normal distribution, in line with its skewness (−0.100) and kurtosis (3.268) in Table 1.
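As a quick consistency check on Table 1, the reported variances can be compared with the squares of the reported standard deviations, and the spread of each case can be expressed relative to the healthy one. The sketch below only re-uses the numbers printed in Table 1; the dictionary `TABLE_1` and the helper `variance_ratio` are hypothetical names introduced for illustration.

```python
# Standard deviation and variance values copied from Table 1.
TABLE_1 = {
    "healthy": {"std": 15.912, "variance": 253.203},
    "seizure_free": {"std": 28.406, "variance": 806.939},
    "seizure": {"std": 83.580, "variance": 6985.755},
}


def variance_ratio(condition, reference="healthy"):
    """Variance of `condition` divided by the variance of `reference`."""
    return TABLE_1[condition]["variance"] / TABLE_1[reference]["variance"]


for name, row in TABLE_1.items():
    squared_std = row["std"] ** 2
    # The small gap comes from the standard deviations being rounded.
    relative_gap = abs(squared_std - row["variance"]) / row["variance"]
    print(
        f"{name}: sigma^2 = {squared_std:.3f}, "
        f"relative gap = {relative_gap:.1e}, "
        f"ratio vs healthy = {variance_ratio(name):.2f}"
    )
```

The squared standard deviations agree with the tabulated variances to within rounding, and the variance is roughly 3.2 times the healthy value in the seizure-free interval and roughly 27.6 times during the seizure.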

In this context, histograms can be used in the diagnosis of epilepsy but not in detecting the moment of seizure. The histograms of the healthy individual and of the patient with epilepsy are given in Figures 2 and 3. Figure 4 shows the histogram of the data of the patient with epilepsy at the time of seizure.

The spectrum analysis of the data from the individuals is given in Figures 5–7. The frequency axis was limited to 50 Hz, considering the general characteristics of EEG frequencies. In the spectral analysis of the data of the healthy individual, the frequency values were seen to be about 10 Hz, which corresponds to alpha waves. When the spectrogram of the patient with epilepsy in the seizure-free interval was examined, the frequency values were observed in the range of 0.5–4 Hz (delta waves), and when the data of the same patient at the time of seizure were examined, the frequency values were distributed over the range of 0–20 Hz.
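A reading of this kind can be sketched with a plain STFT power computation. This is an illustration under stated assumptions, not the analysis pipeline of the study: the Hann window, the window length of 128 samples, the hop of 64 samples, the exclusion of the zero-frequency bin and the function names `stft_power` and `dominant_frequency` are all choices of this sketch, and the input is a synthetic 10 Hz tone rather than a Bonn recording.

```python
import cmath
import math

FS = 173.61  # sampling frequency quoted for the recordings, in Hz


def hann(length):
    """Symmetric Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]


def stft_power(signal, window_len=128, hop=64):
    """Spectrogram frames: power at bins 0..window_len//2 for each window position."""
    window = hann(window_len)
    frames = []
    for start in range(0, len(signal) - window_len + 1, hop):
        segment = [signal[start + m] * window[m] for m in range(window_len)]
        frame = []
        for k in range(window_len // 2 + 1):
            # Direct DFT of the windowed segment; the phase is referenced to the
            # window start, which does not affect the squared magnitude.
            coeff = sum(
                segment[m] * cmath.exp(-2j * math.pi * k * m / window_len)
                for m in range(window_len)
            )
            frame.append(abs(coeff) ** 2)
        frames.append(frame)
    return frames


def dominant_frequency(signal, fs=FS, window_len=128, hop=64, f_max=50.0):
    """Frequency (Hz) of the strongest time-averaged bin below f_max, ignoring DC."""
    frames = stft_power(signal, window_len, hop)
    n_bins = len(frames[0])
    mean_power = [sum(frame[k] for frame in frames) / len(frames) for k in range(n_bins)]
    candidates = [k for k in range(1, n_bins) if k * fs / window_len <= f_max]
    best = max(candidates, key=lambda k: mean_power[k])
    return best * fs / window_len


# Synthetic 10 Hz tone standing in for alpha-dominated activity (not real EEG).
tone = [math.sin(2 * math.pi * 10.0 * n / FS) for n in range(1024)]
peak_hz = dominant_frequency(tone)
print(f"dominant frequency: {peak_hz:.2f} Hz")
```

With a 128-sample window the frequency resolution is about 1.36 Hz, so the reported peak lands on the bin nearest to 10 Hz rather than exactly on it; a longer window would sharpen the frequency estimate at the cost of time resolution.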

## 4. Discussion

First of all, it should be noted that the data used in the study meet the conditions of validity and reliability: they were obtained from the records of the database of the University Hospital of Bonn in Germany and have been used in many articles [11]. Time-frequency-based techniques have been applied to EEG data in many studies [36]. In most of these studies, anomalies in the brain are identified from differences in the high-frequency content [36, 37, 38, 39, 40, 41, 42]. In this study, the EEG data of individuals in different conditions (healthy, patient in a seizure-free interval and patient during a seizure) were analysed, and the analyses were compared. This comparison makes it easy to classify the patients. Compared with other studies, EEG analysis was approached here from a different perspective. Traditionally, basic linear analyses and statistical approaches have been used in the time and frequency domains. In this sense, it can be said that the study yields more definite and distinctive results than other analyses. In the literature, the amplitude of the signal, the distance between seizure and non-seizure intervals and the energy ratio of the EEG have been investigated and used as criteria for the evaluation of epileptic activity [43, 44, 45, 46, 47]. Today, many mathematical methods are used in the analysis of EEG data [48, 49, 50]. Data collection systems are constantly changing with developing technology, and in the future the data may well be collected by remote sensing. Furthermore, the analysis of EEG data with artificial intelligence methods can be developed into a tool. With such a tool, neurofeedback applications can be considered as the most important method in treatment and development.

## 5. Conclusions

In this study, the data from a healthy individual and from a patient with epilepsy were examined. Collecting and analysing the data of the patient with epilepsy both during a seizure and during a seizure-free interval is important for the diagnosis of the disease. Statistical analyses were performed first, and the mean and standard deviation values of the healthy individual and the patient with epilepsy gave very decisive results. Figure 8 shows the comparison of the histograms of the individuals.

Variance, one of the statistical parameters, also produces meaningful results in distinguishing the patient from the healthy individual. In this study, the variance is 253.203 for the healthy individual, 806.939 for the patient with epilepsy in the seizure-free interval and 6985.755 for the same patient at the time of seizure. Other statistical parameters also gave very clear results in distinguishing the patient from the healthy individual. Moreover, the results of the STFT analyses support the statistical parameters. The frequency values were distributed around 10 Hz for the healthy individual, in the range of 0–5 Hz for the patient with epilepsy in the seizure-free interval and in the range of 0–20 Hz for the same patient at the time of seizure. Accordingly, it can be said that the statistical and spectral analyses performed are quite decisive for the diagnosis of the disease.