


Integration, the VLSI Journal



journal homepage: www.elsevier.com/locate/vlsi

# An area and power efficient VLSI architecture for ECG feature extraction for wearable IoT healthcare applications



## Meenali Janveja\*, Gaurav Trivedi

Department of EEE, Indian Institute of Technology Guwahati, 781039, India

## ARTICLE INFO

Keywords: IoT healthcare ECG Wavelet Low power Wearable devices

## ABSTRACT

The mortality rate due to cardiac abnormalities is enormous, which makes the development of wearables that monitor the functioning of the heart of paramount importance. In this paper, we present a resource-efficient and low power architecture that uses the Integer Haar Wavelet Transform for the complete delineation of the ECG signal. The novelty of the algorithm lies in the use of single-scale wavelet coefficients to delineate the P-QRS-T features, which makes it computationally simple. The proposed architecture is implemented on the Xilinx FPGA ZedBoard Zynq<sup>TM</sup>-7000 platform and utilises only 4.38% of the available resources. It is synthesised using 180 nm CMOS technology and consumes 0.88 $\mu$W of power, making it both area- and power-efficient for wearable IoT healthcare devices.

## 1. Introduction

One of the prime applications of IoT healthcare is wearable devices that enable individuals to monitor their health statistics round the clock. The number of patients with heart abnormalities has been increasing over the past few decades due to stressful and unhealthy lifestyles. According to the World Health Organization (WHO), cardiovascular diseases (CVD) take a toll of about 17 million lives each year worldwide [1]. By the year 2020, heart diseases have become the primary cause of mortality and disability globally. WHO anticipates that this figure will increase to about 24 million each year by the next decade. The rise in the mortality rate is due to scarce medical facilities and healthcare centres, which leads to delays in the diagnosis of pathologies. Therefore, the need of the hour is to develop personalised wearable devices for cardiovascular disease monitoring that are compact, reliable and able to operate at low power. These devices should monitor patients continuously and connect them to doctors, preventing them from reaching critical conditions. They should also meet strict power budgets so that battery life is improved. This necessitates the development of area- and power-efficient VLSI architectures that can perform ECG analysis in power- and resource-constrained environments.

The electrocardiogram (ECG) is a bio-signal utilised by medical practitioners all over the world to monitor the electrical activity of the heart. A human heart contains four chambers, two atria and two ventricles, whose rhythmic depolarisation and repolarisation are captured with the help of the ECG. The ECG has three primary features, namely the P wave, the QRS complex and the T wave, as shown in Fig. 1. Atrial depolarisation and ventricular repolarisation are represented by the P and T waves, respectively. The QRS complex characterises the rapid depolarisation of the ventricles. Because the ventricles have a larger muscle mass than the atria, the QRS complex has a larger amplitude than the P wave. Any variation in the standard values of the ECG features and the derived intervals helps cardiologists to diagnose cardiac abnormalities [2]. Hence, accurate and efficient algorithms are required that can process the acquired ECG signal to extract fiducial points. The features extracted from an ECG are used individually or in groups for the analysis and classification of heart abnormalities.

Pan and Tompkins [3] propose an algorithm to extract the QRS complexes of an ECG signal with an accuracy of 99.3%. This algorithm includes several operations, such as differentiation, squaring and moving average. Despite its remarkable accuracy, its computational complexity makes it inefficient for area-optimal and low power hardware implementation. Several algorithms have been developed that utilise the discrete wavelet transform for ECG feature extraction [4-6]. These algorithms employ modulus maxima analysis and time-domain refinement to extract the QRS complex and the P and T waves. However, they use multiscale resolution coefficients, which increases the memory footprint. Tekeste et al. introduce the curve-length transform for QRS detection and the discrete wavelet transform for P and T wave extraction [7]. The architecture developed for this method was realised using 65 nm technology and consumes 642 nW of power while operating at a frequency of 7.5 kHz. However, the use of both the curve-length transform and the wavelet transform makes it computationally complex. For efficient hardware implementation of algorithms, the feature extraction module should be computationally inexpensive and utilise minimum memory. Therefore, the objective of this paper is to develop a computationally efficient algorithm and to design a memory- and power-efficient architecture for extracting ECG features. To achieve this goal, optimisation is performed at both the algorithmic and the architectural level by employing power optimisation techniques. This makes the proposed architecture suitable for resource-constrained low power wearable devices. The accuracy of the proposed architecture is validated using ECG databases from Physionet [8]. The extracted ECG features can be used further to analyse various ECG intervals for detecting cardiovascular diseases [9,10]. The rest of the paper is organised as follows. Section 2 presents the theoretical background of the wavelet transform. Section 3 elaborates the proposed ECG feature extraction algorithm, and Section 4 explains the hardware implementation of this algorithm. Results are discussed in Section 5, and the paper is concluded in Section 6.

\* Corresponding author. *E-mail addresses:* meena176102001@iitg.ac.in (M. Janveja), trivedi@iitg.ernet.in (G. Trivedi).

https://doi.org/10.1016/j.vlsi.2021.09.006

Received 24 June 2020; Received in revised form 9 May 2021; Accepted 25 September 2021 Available online 4 October 2021 0167-9260/© 2021 Elsevier B.V. All rights reserved.

Fig. 1. Normal ECG.

Fig. 2. Mallat's algorithm.

#### 2. Wavelet transform

The wavelet transform is a linear transformation employed to process non-stationary signals [11]. It performs a frequency analysis of the signal while also retaining time-domain information. It analyses the signal at different levels with a suitable resolution by breaking it into various frequency bands. Hence, high-frequency components are evaluated over short time durations, whereas long time intervals are considered to analyse low-frequency components. Inspecting signals at different frequency levels proves to be efficient for characterising and delineating non-stationary signals, such as the ECG. For a signal f(t), the wavelet transform in the continuous time domain is given by Eq. (1), where $^*$ denotes the complex conjugate and $\psi_{x,y}(t)$ is the basis function or mother wavelet. x and y are termed the scaling factor and the translation factor, respectively. Therefore, $\psi^*(\frac{t-y}{x})$ represents a shifted and scaled version of the basis function. The discrete wavelet transform (DWT) is preferable to the continuous wavelet transform (CWT) for a large range of applications due to its lower redundancy and better time complexity. The DWT of any signal x[n] is calculated by applying a pair of low pass and high pass filters with impulse responses g and h, respectively. The outputs obtained from the convolution of the signal x[n] with g and h are given by Eq. (2).

$$\begin{aligned} W(x, y) &= \int_{-\infty}^{\infty} f(t)\,\psi_{x,y}(t)\,dt \\ \psi_{x,y}(t) &= \frac{1}{\sqrt{x}}\,\psi^*\left(\frac{t-y}{x}\right) \end{aligned} \tag{1}$$

$$\begin{aligned} y_{low}[n] &= \sum_{k=-\infty}^{\infty} x[k]\,g[n-k] \\ y_{high}[n] &= \sum_{k=-\infty}^{\infty} x[k]\,h[n-k] \end{aligned} \tag{2}$$

The detail coefficients and the approximate coefficients of the signal are obtained from the outputs of the high pass filter (*h*) and the low pass filter (*g*), respectively. The two filters are called Quadrature Mirror Filters. In general, the wavelet coefficients are calculated using Mallat's algorithm [4], as shown in Fig. 2. The signal x[n] is applied to the LPF and the HPF to obtain the approximate and detail coefficients, respectively. These coefficients are downsampled by 2, thus generating the coefficients of level 1 (2<sup>1</sup>) in Mallat's algorithm. The approximate coefficients of level 1 are fed to the LPF and the HPF and are again downsampled to obtain the level 2 coefficients (2<sup>2</sup>). This process is continued until the resolution required for a specific application is achieved. A wide variety of wavelet functions is available [11] and utilised in various applications.
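The cascade described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names `convolve`, `analysis_stage` and `mallat_decompose` are hypothetical, the two-tap filters of Eq. (3) are used only as example filters, and keeping the odd-indexed convolution outputs is one assumed downsampling convention among several.

```python
import math

def convolve(x, f):
    """Full linear convolution y[n] = sum_k x[k] * f[n - k], as in Eq. (2)."""
    y = [0.0] * (len(x) + len(f) - 1)
    for k, xk in enumerate(x):
        for m, fm in enumerate(f):
            y[k + m] += xk * fm
    return y

def analysis_stage(x, g, h):
    """One stage of the filter bank: filter with g and h, then downsample by 2."""
    approx = convolve(x, g)[1::2]   # low pass branch (assumed phase: odd outputs)
    detail = convolve(x, h)[1::2]   # high pass branch
    return approx, detail

def mallat_decompose(x, g, h, levels):
    """Feed the approximation of each level into the next stage."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = analysis_stage(approx, g, h)
        details.append(detail)
    return details, approx

# Example filters (the Haar pair of Eq. (3)).
S = 1.0 / math.sqrt(2.0)
G = [S, S]
H = [S, -S]
```

With an input of eight samples, three levels yield detail vectors of lengths 4, 2 and 1, which shows how each level halves the number of coefficients.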

#### 2.1. Integer Haar transform

The simplest wavelet among all the mother wavelets is the Haar wavelet. The coefficients of the transfer function of LPF and HPF for Haar wavelet are described in Eq. (3).

$$\begin{aligned} g &= \left[\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right] \\ h &= \left[\frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}}\right] \end{aligned} \tag{3}$$

In [12], the authors elaborate on the uniqueness and simplicity of the Haar wavelet and compare it with other wavelets. It is observed that the Haar wavelet outperforms other wavelets in terms of memory requirement and computational complexity. Although the Haar wavelet is highly efficient, its hardware implementation is challenging because the filter coefficients in Eq. (3) require floating-point arithmetic. Therefore, the integer Haar transform (IHT) is adopted to avoid these floating-point calculations. The approximate and detailed coefficients (*CA* and *CD*) of the IHT are presented in Eq. (4).

$$\begin{aligned} CA[n] &= \left\lfloor \frac{1}{2}x[2n] + \frac{1}{2}x[2n+1] \right\rfloor \\ CD[n] &= x[2n] - x[2n+1] \end{aligned} \tag{4}$$

It can be observed that only basic operations, namely addition, subtraction and division by 2, are required to calculate the wavelet coefficients. Further, division by $2^n$ can be realised using a simple right shift operation. The absence of floating-point operations reduces the complexity of the hardware implementation of the IHT and makes its realisation efficient in digital logic systems.
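As an illustration of Eq. (4), the following Python sketch computes one IHT level with additions, subtractions and a one-bit right shift, and cascades it to obtain the detailed coefficients of a chosen level. The function names `iht_level` and `iht_cd` are hypothetical, and the sketch assumes integer input samples whose count is a multiple of $2^{level}$.

```python
def iht_level(x):
    """One integer Haar level, Eq. (4).

    CA[n] = floor((x[2n] + x[2n+1]) / 2), realised as a right shift by one bit.
    CD[n] = x[2n] - x[2n+1].
    """
    pairs = len(x) // 2
    ca = [(x[2 * n] + x[2 * n + 1]) >> 1 for n in range(pairs)]
    cd = [x[2 * n] - x[2 * n + 1] for n in range(pairs)]
    return ca, cd

def iht_cd(x, level=3):
    """Detailed coefficients of the requested level, obtained by cascading levels."""
    approx = list(x)
    cd = []
    for _ in range(level):
        approx, cd = iht_level(approx)
    return cd
```

Python's `>>` on negative integers rounds towards minus infinity, which matches the floor in Eq. (4); a hardware realisation would need an arithmetic (sign-preserving) shift to behave the same way.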

#### 3. ECG feature extraction algorithm

The prime application of the proposed work is wearable IoT healthcare devices, where ECG feature processing and the associated computations take place in a power- and area-constrained environment [13]. Therefore, limiting the complexity and power consumption of the algorithm is a critical engineering requirement of the proposed method. Moreover, the ECG analysis algorithm needs to fulfil the clinical requirement of producing acceptable results by introducing the least possible error in the delineation of features. These two criteria significantly constrain the choice of signal processing methods that can be physically implemented in an energy-constrained environment. Therefore, in this section, a modified and efficient algorithm aiming at optimal power dissipation and clinical accuracy is described in detail.

Algorithm 1. Pseudocode to extract QRS complex

```
Input: ECG x[n] of length N
Initial estimation of R peaks
Calculate detailed coefficients CD_3 of x[n]
Calculate abs_max = max(CD_3)
Threshold for R peak: th_R = abs_max >> 2
j ← 0
for i = 0, ..., n_coef do
    if CD_3 > th_R then
        R_pos_temp_i = CD_3 * 2^3
        i ← i + 16
        j ← j + 1
    else
        i ← i + 1
    end if
end for
R peak Delineation
for k = 0, ..., j do
    R_peak_k = max(x[n]), n ∈ (R_pos_temp_k - W1, R_pos_temp_k + W1)
    R_peak_pos = pos(max)
    k ← k + 1
end for
Q and S peak Delineation
for l = 0, ..., j do
    Q_l = min(x[n]), for n ∈ (R_peak_pos, R_peak_pos - W2)
    S_l = min(x[n]), for n ∈ (R_peak_pos, R_peak_pos + W3)
    l ← l + 1
end for
RR Estimation
for m = 1, ..., j do
    RR_{m-1} = R_m - R_{m-1}
    m ← m + 1
    if m > 1 then
        en_p = 1
        en_t = 1
    end if
end for
```

Algorithm 1 presents the proposed algorithm to detect the QRS complex. The DWT is utilised in this algorithm owing to its effectiveness. Its most considerable advantage lies in its time-scale nature, which can inherently separate artefacts such as isoelectric line wandering and associated noise from the ECG signal [14]. The signal is processed by employing the integer Haar transform (IHT). As explained in Section 2.1, the IHT is preferred for hardware implementation due to its integer nature: it avoids floating-point calculations and can therefore be implemented using simple addition, subtraction and shift operations. Although the IHT has its limitations, it is an appropriate choice for digital logic systems dedicated to low power healthcare [15]. The algorithm is divided into three main stages.

1. The algorithm begins with the detection of the R peak.
2. Once the R peaks are found, QRS onsets and offsets can be delineated, if required, before and after the R peak.
3. After the extraction of the QRS complex, delineation of the P and T waves can be performed whenever needed.

#### 3.1. QRS detection module

The proposed ECG feature extraction algorithm operates on an ECG signal sampled at 250 Hz. Both frequency analysis and time-domain refinement are employed for delineating the ECG fiducial points. Initially, a window of 800 samples (N) of the ECG signal is chosen for processing. It is observed that during certain cardiac abnormalities, such as arrhythmia, the RR interval deviates from its standard value. A minimum of two beats is required to calculate the RR interval and to delineate the P and T waves. At a sampling rate of 250 Hz, a window of 800 samples (3.2 s) enables the proposed algorithm to capture at least two P-QRS-T beats for normal as well as abnormal patients. This leads to correct delineation of all the primary features in an abnormal ECG as well. As depicted in Algorithm 1, the detailed coefficients ($CD_3$) of the third dyadic scale ($2^3$) are computed for P-QRS-T extraction. These coefficients are calculated using Mallat's algorithm with the IHT as the mother wavelet, as shown in Fig. 2. Once these coefficients are computed, the absolute maximum of the $CD_3$ coefficients is found (Eq. (5)), and the threshold used to determine the position of the R peak is calculated using Eq. (6). The value of the threshold is updated after every 800 samples, making it adaptive to variations in the ECG signal.

$$abs\_max = \max(CD_3) \tag{5}$$

$$th_R = \frac{abs\_max}{4} \tag{6}$$

After calculating the threshold value, an initial estimation of the R peaks is performed. Each $CD_3$ coefficient is compared with the threshold value. If the magnitude of the coefficient is higher than the threshold value, then it is stored as the initial estimate of an R peak. The next 16 $CD_3$ coefficients are then skipped in order to avoid detecting more than one R peak from the same QRS complex. The initial R peak positions in the original ECG signal are found by mapping the selected $CD_3$ coefficients to the signal. This mapping is accomplished by multiplying the positions of the selected $CD_3$ coefficients by $2^3$, as is evident from Fig. 2.
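The initial estimation step can be sketched in Python as follows. This is only an illustrative reading of Algorithm 1 and Eqs. (5) and (6): the function name `estimate_r_positions` is hypothetical, the comparison is assumed to use the magnitude of each coefficient, and the index is advanced by 16 after a detection exactly as written in Algorithm 1.

```python
def estimate_r_positions(cd3, skip=16, level=3):
    """Return the threshold and the initial R peak positions in signal samples.

    cd3   : list of integer level-3 detailed coefficients for one window
    skip  : number of coefficients the index jumps after a detection (Algorithm 1)
    level : decomposition level; a coefficient index maps to index * 2**level
    """
    abs_max = max(abs(c) for c in cd3)   # Eq. (5), magnitude assumed
    th_r = abs_max >> 2                  # Eq. (6): division by 4 as a right shift
    positions = []
    i = 0
    while i < len(cd3):
        if abs(cd3[i]) > th_r:
            positions.append(i << level)  # map coefficient index to sample index
            i += skip                     # avoid a second hit on the same QRS
        else:
            i += 1
    return th_r, positions
```

For a 100-coefficient window (800 samples at level 3), each detected index `i` maps to sample `8 * i`, which is then refined in the time domain.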

The standard duration of the QRS complex is 120 ms, which may increase or decrease during certain cardiac abnormalities. Therefore, in order to find the final positions of the R peaks, a dynamic window of $2 \times W1$, chosen as per the standard duration of the QRS complex, is used to mark the R peak. For a positive QRS complex, the Q and S peaks are the minima before and after the R peak, as depicted in Fig. 1. For a standard duration of the QRS complex, windows of 60–80 ms (W2 and W3) are selected to delineate the Q and S peaks. The minima within W2 and W3 are marked as the Q and S peaks, respectively. Similarly, the onset and offset of the QRS complex are delineated as the maximum value in an adaptive window of 40 ms after Q and S peaks, respectively. RR intervals are also computed in parallel from the detected R peaks.
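A minimal Python sketch of this time-domain refinement is given below. It assumes that the temporary R positions are already available and that the windows W1, W2 and W3 are expressed in samples; the function name `refine_qrs` and the dictionary layout of the result are hypothetical choices made for illustration.

```python
def refine_qrs(x, r_temp, w1, w2, w3):
    """Refine temporary R positions and locate Q and S peaks around each R peak.

    x      : list of ECG samples
    r_temp : temporary R positions (sample indices) from the wavelet stage
    w1     : half-width of the R search window, in samples
    w2, w3 : lengths of the Q (before R) and S (after R) search windows, in samples
    """
    beats = []
    for pos in r_temp:
        lo, hi = max(0, pos - w1), min(len(x), pos + w1 + 1)
        r = max(range(lo, hi), key=lambda n: x[n])              # R: maximum in 2*W1
        q = min(range(max(0, r - w2), r + 1), key=lambda n: x[n])            # Q: minimum before R
        s = min(range(r, min(len(x), r + w3 + 1)), key=lambda n: x[n])       # S: minimum after R
        beats.append({"Q": q, "R": r, "S": s})
    # RR intervals between consecutive refined R peaks, in samples
    rr = [b["R"] - a["R"] for a, b in zip(beats, beats[1:])]
    return beats, rr
```

At 250 Hz, a window of 60–80 ms corresponds to 15–20 samples, so values such as `w1 = 16` fall inside the range described in the text.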

#### 3.2. P and T wave detection

The P wave is the first wave of the heartbeat. In normal patients, its duration is less than 120 ms, but it extends up to 170 ms for abnormalities such as atrial fibrillation [16]. The PR interval is 120-200 ms [17]. Thus, the P wave is delineated considering a window (*W*4) of 100-200 ms before the QRS complex. The window should never exceed half of the RR interval. Choosing a window according to the RR interval makes the algorithm adaptive to the varying heart rate of a particular patient; therefore, detection is performed accurately even for abnormal ECGs. Algorithm 2 exhibits the flow of P wave extraction from the ECG signal.

Algorithm 2. Pseudocode for P and T wave extraction

```
Calculation of P peaks
When en_P == 1
flagP ← 0
for c = 0, ..., j do
    P1_c = min(CD_3[n]), for n ∈ (W4)
    P2_c = max(CD_3[n]), for n ∈ (W4)
    if P1_c < P2_c then
        Pt_on_c = P1_c * 2^3
        Pt_off_c = P2_c * 2^3
        flagP ← 1
        if flagP == 1 then
            P_on_c = min(x[n]), for n ∈ (Pt_on_c - W5, Pt_on_c)
            P_off_c = min(x[n]), for n ∈ (Pt_off_c, Pt_off_c + W5)
            P_peak_c = max(x[n]), for n ∈ (P_on_c, P_off_c)
            flagP ← 0
        end if
    else
        Pt_on_c = P2_c * 2^3
        Pt_off_c = P1_c * 2^3
        flagP ← 1
        if flagP == 1 then
            P_on_c = max(x[n]), for n ∈ (Pt_on_c - W5, Pt_on_c)
            P_off_c = max(x[n]), for n ∈ (Pt_off_c, Pt_off_c + W5)
            P_peak_c = min(x[n]), for n ∈ (P_on_c, P_off_c)
            flagP ← 0
        end if
    end if
    c ← c + 1
end for
T wave extraction
if en_t == 1 then
    Repeat P block with n ∈ W6
end if
```

The $CD_3$ coefficients within the chosen window are examined, and the minimum and maximum values of the coefficients are obtained. If the minimum value occurs before the maximum value, then the P wave is considered to be positive; otherwise, it is considered inverted. The positions of the minimum and maximum $CD_3$ coefficients are multiplied by a factor of $2^3$ to obtain temporary positions of the start ($P\_on$) and end ($P\_off$) of the P wave in the original ECG signal. To refine the delineation, a window (W5) of 40 ms is chosen before the initial value of $P\_on$ and after the initial value of $P\_off$. The window W5 is chosen considering the sampling frequency of 250 Hz to obtain accurate delineation results on the ECG excerpts. The minimum values within the respective windows are taken as the final boundaries of the P wave. If the P wave is inverted, the maximum values of the windows are taken instead. The P peak is extracted by finding the maximum or minimum value in the interval between $P\_on$ and $P\_off$, as per the orientation of the P wave.
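The P wave procedure can be sketched in Python as shown below. The sketch follows the textual description, so the orientation test compares the positions of the minimum and maximum coefficients; the function name `delineate_p`, its argument layout and the inclusive window bounds are assumptions made for illustration and are not taken from the hardware design.

```python
def delineate_p(x, cd3, win_start, win_end, w5, level=3):
    """Delineate one P wave from level-3 coefficients and the raw signal.

    x                  : list of ECG samples
    cd3                : list of level-3 detailed coefficients
    win_start, win_end : search window W4 as coefficient indices [start, end)
    w5                 : refinement window length in samples
    """
    idx = range(win_start, win_end)
    p1 = min(idx, key=lambda n: cd3[n])   # position of the minimum coefficient
    p2 = max(idx, key=lambda n: cd3[n])   # position of the maximum coefficient
    positive = p1 < p2                    # minimum first: positive P wave
    t_on, t_off = (p1, p2) if positive else (p2, p1)
    t_on, t_off = t_on << level, t_off << level   # map to sample positions

    pick_bound = min if positive else max  # boundaries: minima (maxima if inverted)
    pick_peak = max if positive else min   # peak: maximum (minimum if inverted)

    def value(n):
        return x[n]

    on = pick_bound(range(max(0, t_on - w5), t_on + 1), key=value)
    off = pick_bound(range(t_off, min(len(x), t_off + w5 + 1)), key=value)
    peak = pick_peak(range(on, off + 1), key=value)
    return {"on": on, "peak": peak, "off": off, "positive": positive}
```

The same routine could be reused for the T wave by passing the coefficient window that corresponds to W6 instead of W4.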

The last characteristic wave of a heartbeat is the T wave. Its duration is 125–200 ms in a normal ECG, whereas the QT interval is less than 425 ms [18]. Considering these values, a window W6 of 200–400 ms after the QRS complex is marked to extract the T wave. However, this window should never exceed half of the RR interval. The rest of the procedure to find the T wave is similar to the P wave delineation described above.
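The window rule for W4 and W6 can be written as a short Python sketch. It assumes a sampling frequency of 250 Hz as stated in the text; the helper names `ms_to_samples` and `search_window` are hypothetical.

```python
def ms_to_samples(ms, fs=250):
    """Convert a duration in milliseconds to a whole number of samples."""
    return (ms * fs) // 1000

def search_window(duration_ms, rr_samples, fs=250):
    """Window length in samples, capped at half of the current RR interval."""
    return min(ms_to_samples(duration_ms, fs), rr_samples // 2)
```

For instance, 200 ms corresponds to 50 samples and 400 ms to 100 samples at 250 Hz, while an RR interval of 150 samples limits either window to 75 samples.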

#### 4. Hardware implementation

The proposed architecture contains three main modules: the QRS module and the optional P wave and T wave modules. The P and T wave modules are subdivided into boundary detection and peak detection modules. The modules are enabled and disabled according to the medical requirements or when they are nonoperational.

Fig. 3. Block diagram of ECG feature extraction algorithm.

The hardware implementation of the proposed algorithm is realised in Verilog using the Xilinx Vivado 2016.4 development environment and the Xilinx Zynq<sup>™</sup>-7000 FPGA platform [19]. The modules are validated for the desired functionality through simulation in Vivado. These modules are then ported to the FPGA for further verification of the proposed algorithm at the hardware level. The basic architecture of the ECG delineation algorithm is illustrated in Fig. 3. The architecture blocks are controlled by a finite state machine presented in Fig. 4. Different ECG signal samples have been utilised for the analysis of the proposed algorithm. The ECG signal is fed serially as input into the system and is stored in a memory. The positions of the ECG features are obtained as output. Initially, the ECG signal is decomposed into its DWT coefficients utilising a cascaded filter bank structure, as shown in Fig. 2. The IHT block is designed to decompose a signal into detailed and approximate coefficients using the integer Haar transform, as shown in Fig. 5. Three such IHT blocks are pipelined to obtain the $CD_3$ coefficients, making a complete DWT block. The coefficients are stored in memory for further processing. The DWT and comparator blocks are pipelined to obtain the third-scale wavelet coefficients along with their absolute maximum, which reduces the delay in the calculation of the R peak threshold. Fig. 6 represents the architecture to compute the threshold of the R peak.

The present and previously calculated $CD_3$ coefficients are compared to find the maximum, and the process is repeated for every new $CD_3$ coefficient calculated in the previous step. The threshold for R peak detection is then obtained using Eq. (6). This threshold is updated after every 800 input samples, making the proposed architecture adaptive to variations in the ECG. Once the threshold is found, R peaks are detected in the $CD_3$ coefficients, as explained in the algorithm, and are subsequently mapped to the primary ECG signal by multiplying their positions by $2^3$. This multiplication by $2^3$ is implemented in hardware as a left shift by three bits. Once the temporary R peaks are calculated, the final R peaks are retrieved by searching for the maximum within 16 samples before and after the temporary R peaks using a comparator. Once the QRS complex is extracted, P and T wave extraction can be performed. In the proposed architecture, the clock gating technique is utilised to minimise power consumption [20]. Clock gating controls the activation of different blocks as per the requirement. Since all the processes do not run simultaneously, blocks can be turned off when not in use to save power. Clock gating is exhibited in Fig. 7, where the P and T modules are controlled by the clock and an enable signal. Each module operates when its enable signal ($en_P$ or $en_T$) becomes high. The P and T waves are then delineated as described in the algorithm. Both modules can operate in parallel, if required, and utilise a simple comparator circuit to find the maximum and minimum values. When the QRS module is active, the P wave and T wave modules can be disabled after completing their ongoing tasks, which reduces power consumption. The architecture utilises fixed-point computations and simple hardware blocks, making it computationally simple.

Fig. 4. Main finite state machine for ECG feature extraction.

Fig. 5. Architecture of IHT module.

Fig. 6. Comparator block.

Fig. 7. Clock gating of different blocks.
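The streaming threshold computation described above can be modelled behaviourally in Python. This is a software sketch under stated assumptions, not the Verilog design: the class name `ThresholdTracker` is hypothetical, the comparison is assumed to use coefficient magnitudes, and the default of 100 coefficients per window follows from 800 input samples at the third dyadic scale.

```python
class ThresholdTracker:
    """Running-maximum comparator that refreshes the R peak threshold per window."""

    def __init__(self, window_coeffs=100):
        self.window_coeffs = window_coeffs  # 800 samples / 2**3 coefficients (assumed)
        self.count = 0
        self.running_max = 0
        self.threshold = 0

    def push(self, cd3):
        """Consume one CD_3 coefficient; return True when the threshold is refreshed."""
        mag = -cd3 if cd3 < 0 else cd3
        if mag > self.running_max:          # compare present value with stored maximum
            self.running_max = mag
        self.count += 1
        if self.count == self.window_coeffs:
            self.threshold = self.running_max >> 2  # Eq. (6) as a two-bit right shift
            self.running_max = 0
            self.count = 0
            return True
        return False
```

Because only one stored maximum and one comparison per coefficient are needed, the model reflects why the comparator can run in a pipeline with the DWT block instead of waiting for the full window.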

#### 5. Results and discussion

In this section, experimentation of the proposed methodology and its outcome is discussed in detail.

#### 5.1. Validation of the proposed methodology

The proposed algorithm is first implemented in MATLAB to validate its correctness. It is tested on excerpts taken from the AHA database and on the QT database of Physionet [8]. Many traditional ECG processing algorithms, such as Pan-Tompkins [3], are implemented in software on a personal computer, which leads to the use of a floating-point representation of the ECG data. Since an FPGA-based implementation of a floating-point unit is complex, an integer representation is preferred for simplicity as well as to reduce power consumption. Considering these constraints, the proposed algorithm is validated using an integer representation of the ECG signal data. Usually, ECG data is recorded in double-precision format. This double-precision format is converted into integer format by multiplying each value by $10^N$, N = 1, 2, ..., and rounding the result to an integer. As explained in [15], we multiply the ECG data by a factor of 100 and then round it off to convert it to the integer format. It is worth mentioning that the variations in the interbeat intervals are negligible; therefore, a factor of 100 is used for rounding. Fig. 8 represents the delineation of the boundary and peak points for a QT database recording (sele0122m) sampled at 250 Hz. Table 1 presents the test results for the QT database [8], which consists of 105 fifteen-minute excerpts of two-lead ECG recordings.
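The conversion to integer format can be sketched in Python as follows. The function name `to_integer_samples` is hypothetical, and the rounding mode is an assumption: Python's built-in `round` resolves exact ties to the nearest even integer, which may differ from the rounding used in the MATLAB validation.

```python
def to_integer_samples(samples, scale=100):
    """Scale floating-point ECG samples by 10**N (here 100) and round to integers."""
    return [int(round(v * scale)) for v in samples]
```

A sample value of 1.0, for example, becomes the integer 100, so two decimal digits of the recorded amplitude are retained.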

$$\begin{aligned} Se\% &= \frac{TP}{TP + FN} \\ PPV\% &= \frac{TP}{TP + FP} \end{aligned} \tag{7}$$

These recordings are sampled at 250 Hz. For verification, these excerpts are manually annotated under the guidance of a medical expert. The proposed algorithm works on a single lead channel. The manually annotated and algorithmically annotated features are compared on a selected ECG signal. It is found that the algorithmic annotation delineates the same points within 150 ms of the manual annotations. The error metric considered is described by the Association for the Advancement of Medical Instrumentation (AAMI) [30]. As shown in Eq. (7), a *True Positive* (*TP*) is counted when a feature is correctly detected; the respective error is then calculated between the manually and algorithmically annotated features. If no position is detected, then it is considered a *False Negative* (*FN*). Any other

#### Integration 82 (2022) 96-103

#### Table 1

ECG feature extraction results from QT database.

| Algorithm              | Implementation<br>Platform          | Metric      | P wave         | QRS wave       | T wave         |
|------------------------|-------------------------------------|-------------|----------------|----------------|----------------|
| Kalyakulina et al. [6] | Software                            | Se%<br>PPV% | 97.49<br>97.89 | 98.42<br>98.24 | 97.2<br>96.55  |
| Sanjeev et al. [9]     | ASIC                                | Se%<br>PPV% | 98.91<br>91.07 | 100<br>100     | 99.97<br>97.76 |
| Bote et al. [21]       | TI MSP430 series<br>microcontroller | Se%<br>PPV% | 98.23<br>94.38 | 99.88<br>99.41 | 98.18<br>96.39 |
| Rincon et al. [22]     | Shimmer TM<br>embedded platform     | Se%<br>PPV% | 99.88<br>92.04 | 99.97<br>98.66 | 99.97<br>98.70 |
| Di Marco et al. [23]   | Software                            | Se%<br>PPV% | 98.15<br>91.00 | 100            | 99.72<br>97.76 |
| Proposed Work          | FPGA                                | Se%<br>PPV% | 97.93<br>94.23 | 99.46<br>99.06 | 98.12<br>96.54 |

#### Table 2

Comparison of FPGA results of R peak extraction.

| Resources | [24] Available | [24] Utilised | [25] Available | [25] Utilised | [26] Available | [26] Utilised | [27] Available | [27] Utilised | [28] Available | [28] Utilised | [29] Available | [29] Utilised | Proposed design Available | Proposed design Utilised |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LUT | 9312 | 3443 | 9312 | 3061 | 124467 | 3734 | 9312 | 2328 | 12480 | 1408 | 303600 | 88456 | 53200 | 2004 |
| Slice reg | 4656 | 2901 | 4656 | 1809 | 34240 | 1712 | 4656 | 1489 | 12480 | 1086 | 607200 | 5728 | 106400 | 503 |
| MUX | - | - | - | - | - | - | - | - | - | - | - | - | 26600 | 42 |
| FF | - | - | 9312 | 544 | 216750 | 4335 | 9312 | 651 | - | - | - | - | - | - |
| LUT-FF pair | 9312 | 2283 | - | - | - | - | - | - | 1440 | 1054 | 93996 | 188 | - | - |
| IOB/IO | 232 | 18 | 67 | 67 | - | - | 232 | 88 | 172 | 82 | 700 | 114 | 200 | 34 |
| BUFG | - | - | - | - | - | - | - | - | 32 | 2 | 32 | 1 | 32 | 1 |
| MULT | 20 | 8 | - | - | - | - | - | - | - | - | - | - | - | - |
| GCLKs | 24 | 1 | - | - | - | - | 24 | 1 | - | - | - | - | - | - |
| Total resources | 23556 | 8654 | 23347 | 5481 | 375457 | 9781 | 23536 | 4557 | 26604 | 3632 | 1005528 | 94487 | 186432 | 2584 |
| Comparative resource utilisation | | 3.34X | | 2.12X | | 3.78X | | 1.76X | | 1.40X | | 36.56X | | X |



Fig. 8. ECG delineation.

point not related to any of the annotated features is regarded as a False Positive (FP). On this basis, we use Sensitivity (Se%) and Positive Predictive Value (PPV%) as the metrics for evaluating the proposed algorithm. To validate the architecture, different test benches are created to compare the outcome of the proposed algorithm against the manual annotations. Table 1 shows that the proposed architecture achieves better or comparable sensitivity relative to state-of-the-art methods for delineating an ECG signal, which supports the efficacy of the proposed work.
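The two metrics follow the standard definitions $Se = TP/(TP+FN)$ and $PPV = TP/(TP+FP)$. The following Python sketch shows one way they could be computed from detected and annotated fiducial points. The function name `score_detections`, the matching rule, the tolerance window and the sample indices are all illustrative assumptions and are not taken from the paper.

```python
def score_detections(annotated, detected, tolerance):
    """Match detected sample indices to annotated ones and return (Se%, PPV%).

    A detection within `tolerance` samples of an unmatched annotation is a
    true positive; unmatched annotations are false negatives and unmatched
    detections are false positives.
    """
    unmatched = sorted(detected)
    tp = 0
    for ref in sorted(annotated):
        # Pick the closest still-unmatched detection to this annotation.
        best = min(unmatched, key=lambda d: abs(d - ref), default=None)
        if best is not None and abs(best - ref) <= tolerance:
            tp += 1
            unmatched.remove(best)
    fn = len(annotated) - tp
    fp = len(unmatched)
    se = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    ppv = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    return se, ppv


# Hypothetical R-peak positions (sample indices), for illustration only.
annotated = [100, 350, 600, 850]
detected = [101, 349, 700, 851, 990]
se, ppv = score_detections(annotated, detected, tolerance=5)
print(f"Se = {se:.1f}%, PPV = {ppv:.1f}%")
```

In this made-up example three of the four annotations are matched and two detections are spurious, so sensitivity and positive predictive value differ.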

#### 5.2. Experimental results

As stated earlier, the proposed method is implemented in both MATLAB and Verilog, and the execution of the algorithm is subsequently validated on an FPGA. The MATLAB-based implementation is found to be in close agreement with the FPGA-based implementation: the positions of the R, P and T peaks obtained through the MATLAB and Verilog HDL implementations match without any variation. However, a difference of just one sample is noticed in the delineation of the onset and offset of the P and T waves in a few ECG beats. This variation is still within the error range reported in [30]. Many methods in the literature focus on detecting R peaks, because many heart diseases can be diagnosed by evaluating the variability in the R peaks alone. Therefore, the R peak module is implemented first and compared with existing methods to showcase its efficient hardware utilisation. Comparison details are presented in Table 2 for completeness.
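A cross-check of this kind between a software model and a hardware simulation can be sketched in Python as below. The helper `max_deviation` and all sample indices are hypothetical and serve only to illustrate a per-feature comparison; they are not the authors' test bench or data.

```python
def max_deviation(reference, candidate):
    """Largest absolute difference, in samples, between paired fiducial points."""
    if len(reference) != len(candidate):
        raise ValueError("both implementations must report the same number of points")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0)


# Hypothetical fiducial points (sample indices) from the two implementations.
software = {"R_peak": [120, 370, 622], "P_onset": [80, 331, 583], "T_offset": [190, 441, 692]}
hardware = {"R_peak": [120, 370, 622], "P_onset": [80, 332, 583], "T_offset": [190, 441, 691]}

for feature in software:
    dev = max_deviation(software[feature], hardware[feature])
    print(f"{feature}: max deviation = {dev} sample(s)")
```

A per-feature maximum makes it easy to confirm that peaks agree exactly while onsets and offsets stay within a chosen sample tolerance.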

It can be observed from Table 2 that the proposed architecture for R peak detection utilises only 2584 of the available FPGA resources, which is the least among all the previously reported works. Table 2 also shows that the proposed design does not include any compute- and power-intensive elements, such as multiplication units. This makes the proposed design more efficient in terms of area and power than previously reported designs. It can be observed in Table 2 that our proposed design

#### Table 3

Comparison of FPGA results of ECG feature extraction architecture.

| Resources                           | [31]      |          | [32]      |          | [33]      |          | [34]      |          | Proposed design |          |
|-------------------------------------|-----------|----------|-----------|----------|-----------|----------|-----------|----------|-----------------|----------|
|                                     | Available | Utilised | Available | Utilised | Available | Utilised | Available | Utilised | Available       | Utilised |
| LUT                                 | 40960     | 29389    | 343680    | 119256   | 239616    | 131072   | 69120     | 24784    | 53200           | 6894     |
| Slice reg                           | 20480     | 15459    | 687360    | 130598   | 18752     | 3532     | 69120     | 14670    | 106400          | 1080     |
| MUX                                 | -         | -        | -         | -        | -         | -        | -         | -        | 26600           | 126      |
| FF                                  | 40960     | 4845     | -         | -        | -         | -        | -         | -        | -               | -        |
| IOB/IO                              | 565       | 30       | -         | -        | 152       | 128      | 640       | 58       | 200             | 66       |
| BUFG                                | -         | -        | -         | -        | -         | -        | -         | -        | 32              | 1        |
| MULT                                | 40        | 3        | -         | -        | 52        | 20       | -         | -        | -               | -        |
| GCLKs                               | 8         | 3        | -         | -        | -         | -        | 32        | 6        | -               | -        |
| BRAM                                | -         | -        | 2528      | 786      | -         | -        | 148       | 24       | -               | -        |
| DSP blocks                          | -         | -        | 864       | 458      | -         | -        | 64        | 7        | -               | -        |
| Total resources                     | 103013    | 49729    | 1034432   | 251098   | 258572    | 134752   | 139124    | 39549    | 186432          | 8167     |
| Comparative resource<br>Utilisation | 6.08X     |          | 30.74X    |          | 16.49X    |          | 4.84X     |          | X               |          |

#### Table 4

Comparison of synthesis results of ECG feature extraction architecture.

| Parameter           | [7]      | [9]      | [35]    | [36]    | [37]     | [38]     | Proposed work |
|---------------------|----------|----------|---------|---------|----------|----------|---------------|
| Technology          | 65 nm    | 130 nm   | 180 nm  | 180 nm  | 180 nm   | 180 nm   | 180 nm        |
| Operating frequency | 7.5 kHz  | 4 kHz    | 1 MHz   | 1 MHz   | NA       | 0.12 kHz | 1 MHz         |
| Supply voltage      | 0.6 V    | 0.9 V    | 1.8 V   | 1.2 V   | 1.0 V    | 1.2 V    | 1.62 V        |
| ECG features        | P-QRS-T  | P-QRS-T  | P-QRS-T | P-QRS-T | QRS      | QRS      | P-QRS-T       |
| Power               | 0.642 µW | 0.384 µW | 9.47 µW | 32 µW   | 0.410 µW | 5.97 µW  | 0.88 µW       |
| Energy              | 85.6 pJ  | 96 pJ    | 9.47 pJ | 32 pJ   | NA       | 49.75 nJ | 0.88 pJ       |

is 28.86% more efficient in terms of resource utilisation than the previous best known method reported in [28]. The hardware resource utilisation of the complete ECG feature extraction algorithm is presented in Table 3. The proposed architecture utilises only 4.38% of the total available resources, which is the least among all the reported methods. Our implementation of the complete ECG feature extraction algorithm also uses 79.35% fewer resources than the previous best known method reported in [34]. This is primarily because the proposed architecture is built around multiplexers, which reduces design complexity and also lowers power consumption. In Tables 2 and 3, "-" indicates that the information is not available in the reported works.
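The utilisation figures quoted above can be recomputed from the "Total resources" rows of Tables 2 and 3. The short Python sketch below does so; the variable and function names are illustrative, and the interpretation of the percentages as relative savings in utilised resources is an assumption that happens to agree with the reported numbers.

```python
# Total utilised resources, copied from Tables 2 and 3.
R_PEAK_PROPOSED = 2584
R_PEAK_BEST_PRIOR = 3632        # [28], Table 2
FULL_PROPOSED = 8167
FULL_AVAILABLE = 186432
FULL_PRIOR = {"[31]": 49729, "[32]": 251098, "[33]": 134752, "[34]": 39549}


def percent_saving(prior, proposed):
    """Reduction in utilised resources relative to a prior design, in percent."""
    return 100.0 * (prior - proposed) / prior


utilisation = 100.0 * FULL_PROPOSED / FULL_AVAILABLE
ratios = {ref: used / FULL_PROPOSED for ref, used in FULL_PRIOR.items()}

print(f"Utilisation of proposed design: {utilisation:.2f}%")
print(f"Saving vs [28] (R peak only): {percent_saving(R_PEAK_BEST_PRIOR, R_PEAK_PROPOSED):.2f}%")
print(f"Saving vs [34] (full delineation): {percent_saving(FULL_PRIOR['[34]'], FULL_PROPOSED):.2f}%")
for ref, ratio in ratios.items():
    print(f"{ref}: {ratio:.2f}X the resources of the proposed design")
```

The comparative ratios in the tables appear to be truncated rather than rounded to two decimals, so the last printed digit may differ slightly from the tabulated value.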

The entire design is synthesised with the 180 nm SCL PDK using the Synopsys DC and IC Compiler tools, and its power consumption is compared in Table 4 with the most relevant previously reported designs, to the best of our knowledge. The proposed architecture consumes $0.88 \mu W$ while operating at 1.62 V and a maximum operating frequency of 1 MHz. The architecture proposed in [7] consumes $0.642 \mu W$ when operated at 7.5 kHz, which is less than the power consumption of the proposed design. However, [7] reports that the power consumption of their design rises to $1.33 \mu W$ when it operates at 100 kHz. Our proposed work therefore consumes about 0.66X the power of the design in [7] ($0.88 \mu W$ versus $1.33 \mu W$) while operating at a 10X higher frequency (1 MHz), with the same delineation accuracy. This indicates that the proposed architecture is power efficient in comparison with the reported state-of-the-art architectures for ECG feature extraction.

Further, for a fair comparison, the proposed work is compared with the other architectures in terms of energy dissipation. The energy dissipation of the proposed design is 0.88 pJ, which is the least among all the methods listed in Table 4. The primary challenge for wearable IoT healthcare devices is to operate on a minimal power budget, providing longer battery life [13,39], while remaining compact and utilising minimal resources. Based on the experimental analysis, our design utilises minimal resources and operates at very low power with high accuracy, making it suitable for applications that require low power and area optimality. It can also be inferred that at smaller technology nodes the area and power consumption of the proposed design would be lower, making it a promising candidate for wearable healthcare applications.
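The energy row of Table 4 is consistent with dividing each design's power by its operating frequency, that is, energy per clock cycle $E = P/f$. Assuming that definition, the Python sketch below recomputes the values from the power and frequency rows. The dictionary layout and function name are illustrative, and [37] is omitted because its operating frequency is listed as NA.

```python
# (power in watts, operating frequency in hertz), copied from Table 4.
DESIGNS = {
    "[7]": (0.642e-6, 7.5e3),
    "[9]": (0.384e-6, 4e3),
    "[35]": (9.47e-6, 1e6),
    "[36]": (32e-6, 1e6),
    "[38]": (5.97e-6, 0.12e3),
    "Proposed": (0.88e-6, 1e6),
}


def energy_per_cycle(power_w, freq_hz):
    """Energy dissipated per clock cycle in joules: E = P / f."""
    return power_w / freq_hz


energies = {name: energy_per_cycle(p, f) for name, (p, f) in DESIGNS.items()}
# List the designs from lowest to highest energy per cycle.
for name, e in sorted(energies.items(), key=lambda item: item[1]):
    print(f"{name:>8}: {e * 1e12:10.2f} pJ")
```

Normalising by frequency explains why a design with lower absolute power at a kHz-range clock can still dissipate more energy per cycle than the proposed design at 1 MHz.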

#### 6. Conclusion

The proposed work presents a simple, optimised and efficient VLSI architecture to delineate different ECG features, such as the QRS complex, P wave and T wave. It has low computational complexity while attaining accuracy comparable to other state-of-the-art implementations. The proposed method uses the Discrete Wavelet Transform (DWT) with the Integer Haar function as the mother wavelet to avoid floating-point computations. The algorithm attains a sensitivity of about 99% on the QT dataset. The proposed architecture uses only single-scale wavelet coefficients, which requires less memory than previously reported architectures that employ multiple wavelet scales for complete ECG feature extraction. The designed architecture uses only 4.38% of the total available resources, which is the least among the best known methods. Incorporation of clock gating further optimises power consumption. The architecture can operate at a maximum operating frequency of 1 MHz while consuming $0.88 \mu W$ of power. A remote real-time cardiovascular disease monitoring system is employed predominantly for assessing the health condition of a patient round the clock; therefore, it must have minimal area and power utilisation. Thus, owing to its high accuracy, low resource requirement and minimal power consumption, the proposed design is well suited to real-time ECG signal analysis in wearable IoT healthcare devices. The proposed work can also be extended to extract features from multi-lead ECG and to perform cardiovascular disease detection using machine learning algorithms.

#### CRediT authorship contribution statement

Meenali Janveja: Conceptualization, Methodology, Software and Hardware implementation, Data curation, Writing – original draft. Gaurav Trivedi: Supervision, Validating algorithms, Investigation, Writing – review & editing.

#### Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

#### References

- [1] World Health Organization, World Health Statistics 2015, World Health Organization, 2015.
- [2] G.S. Wagner, Marriott's Practical Electrocardiography, Lippincott Williams & Wilkins, 2001.
- [3] J. Pan, W.J. Tompkins, A real-time QRS detection algorithm, IEEE Trans. Biomed. Eng. 32 (3) (1985) 230–236.
- [4] E.B. Mazomenos, D. Biswas, A. Acharyya, T. Chen, K. Maharatna, J. Rosengarten, J. Morgan, N. Curzen, A low-complexity ECG feature extraction algorithm for mobile healthcare applications, IEEE J. Biomed. Health Inf. 17 (2) (2013) 459–469.
- [5] N. Vemishetty, P. Patra, P.K. Jha, K.B. Chivukula, C.K. Vala, A. Jagirdar, V.Y. Gudur, A. Acharyya, A. Dutta, Low power personalized ECG based system design methodology for remote cardiac health monitoring, IEEE Access 4 (2016) 8407–8417.
- [6] A.I. Kalyakulina, I.I. Yusipov, V.A. Moskalenko, A.V. Nikolskiy, A.A. Kozlov, N.Y. Zolotykh, M.V. Ivanchenko, Finding morphology points of electrocardiographic-signal waves using wavelet analysis, Radiophys. Quantum Electron. 61 (8–9) (2019) 689–703.
- [7] T. Tekeste, H. Saleh, B. Mohammad, A. Khandoker, M. Ismail, A nano-watt ECG feature extraction engine in 65-nm technology, IEEE Trans. Circuits Syst. II 65 (8) (2017) 1099–1103.
- [8] A.L. Goldberger, L.A. Amaral, L. Glass, J.M. Hausdorff, P.C. Ivanov, R.G. Mark, J.E. Mietus, G.B. Moody, C.-K. Peng, H.E. Stanley, Physiobank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals, Circulation 101 (23) (2000) e215–e220.
- [9] S.K. Jain, B. Bhaumik, An energy efficient ECG signal processor detecting cardiovascular diseases on smartphone, IEEE Trans. Biomed. Circuits Syst. 11 (2) (2016) 314–323.
- [10] T. Tekeste, H. Saleh, B. Mohammad, A. Khandoker, M. Ismail, A biomedical SoC architecture for predicting ventricular arrhythmia, in: 2016 IEEE International Symposium on Circuits and Systems (ISCAS), IEEE, 2016, pp. 2262–2265.
- [11] R. Rao, Wavelet transforms, Encycl. Imaging Sci. Technol. (2002).
- [12] R. Stojanović, D. Karadaglić, M. Mirković, D. Milošević, A FPGA system for QRS complex detection based on integer wavelet transform, Meas. Sci. Rev. 11 (4) (2011) 131–138.
- [13] S. Selvaraj, S. Sundaravaradhan, Challenges and opportunities in IoT healthcare systems: a systematic review, SN Appl. Sci. 2 (1) (2020) 1–8.
- [14] J.P. Martínez, R. Almeida, S. Olmos, A.P. Rocha, P. Laguna, A wavelet-based ECG delineator: evaluation on standard databases, IEEE Trans. Biomed. Eng. 51 (4) (2004) 570–581.
- [15] B. Zhang, L. Sieler, Y. Morère, B. Bolmont, G. Bourhis, A modified algorithm for QRS complex detection for FPGA implementation, Circuits Systems Signal Process. 37 (7) (2018) 3070–3092.
- [16] S.A. Guidera, J.S. Steinberg, The signal-averaged P wave duration: a rapid and noninvasive marker of risk of atrial fibrillation, J. Am. Coll. Cardiol. 21 (7) (1993) 1645–1651.
- [17] C. Saritha, V. Sukanya, Y.N. Murthy, ECG signal analysis using wavelet transforms, Bulg. J. Phys. 35 (1) (2008) 68–77.
- [18] R.B. Northrop, Non-Invasive Instrumentation and Measurement in Medical Diagnosis, CRC Press, 2017.
- [19] L.H. Crockett, R.A. Elliot, M.A. Enderwitz, The Zynq Book Tutorials for Zybo and Zedboard, Strathclyde Academic Media, 2015.
- [20] T. Tekeste, H. Saleh, B. Mohammad, A. Khandoker, M. Ismail, A nano-watt ECG feature extraction engine in 65-nm technology, IEEE Trans. Circuits Syst. II 65 (8) (2017) 1099–1103.
- [21] J.M. Bote, J. Recas, F. Rincón, D. Atienza, R. Hermida, A modular low-complexity ECG delineation algorithm for real-time embedded systems, IEEE J. Biomed. Health Inf. 22 (2) (2017) 429–441.
- [22] F. Rincón, J. Recas, N. Khaled, D. Atienza, Development and evaluation of multilead wavelet-based ECG delineation algorithms for embedded wireless sensor nodes, IEEE Trans. Inf. Technol. Biomed. 15 (6) (2011) 854–863.
- [23] L.Y. Di Marco, L. Chiari, A wavelet-based ECG delineation algorithm for 32-bit integer online processing, Biomed. Eng. Online 10 (1) (2011) 23.
- [24] M. Karim, et al., Novel simple decision stage of Pan & Tompkins QRS detector and its FPGA-based implementation, in: Second International Conference on the Innovative Computing Technology (INTECH 2012), IEEE, 2012, pp. 331–336.

- [26] A. Page, A. Kulkarni, T. Mohsenin, Utilizing deep neural nets for an embedded ECG-based biometric authentication system, in: 2015 IEEE Biomedical Circuits and Systems Conference (BioCAS), IEEE, 2015, pp. 1–4.
- [27] H. Zairi, M. Kedir-Talha, S. Benouar, A. Ait-Amer, Intelligent system for detecting cardiac arrhythmia on FPGA, in: 2014 5th International Conference on Information and Communication Systems (ICICS), IEEE, 2014, pp. 1–5.
- [28] M.A. Kumar, K.M. Chari, Efficient FPGA-based VLSI architecture for detecting R-peaks in electrocardiogram signal by combining Shannon energy with Hilbert transform, IET Signal Process. 12 (6) (2018) 748–755.
- [29] D. Panigrahy, M. Rakshit, P. Sahu, FPGA implementation of heart rate monitoring system, J. Med. Syst. 40 (3) (2016) 49.
- [30] Association for the Advancement of Medical Instrumentation, et al., Testing and reporting performance results of cardiac rhythm and ST segment measurement algorithms, ANSI/AAMI EC38 (1998).
- [31] H. Chatterjee, M. Mitra, R. Gupta, Real-time detection of electrocardiogram wave features using template matching and implementation in FPGA, Int. J. Biomed. Eng. Technol. 17 (3) (2015) 290–313.
- [32] X. Wang, Y. Zhu, Y. Ha, M. Qiu, T. Huang, An FPGA-based cloud system for massive ECG data analysis, IEEE Trans. Circuits Syst. II 64 (3) (2016) 309–313.
- [33] D. Alhelal, K.A. Aboalayon, M. Daneshzand, M. Faezipour, FPGA-based denoising and beat detection of the ECG signal, in: 2015 Long Island Systems, Applications and Technology, IEEE, 2015, pp. 1–5.
- [34] G.G.C. Lee, Z.-J. Huang, C.-Y. Chen, C.-F. Chen, Implementation of Gabor feature extraction algorithm for electrocardiogram on FPGA, in: 2015 IEEE International Symposium on Circuits and Systems (ISCAS), IEEE, 2015, pp. 798–801.
- [35] N. Vemishetty, P. Patra, P.K. Jha, K.B. Chivukula, C.K. Vala, A. Jagirdar, V.Y. Gudur, A. Acharyya, A. Dutta, Low power personalized ECG based system design methodology for remote cardiac health monitoring, IEEE Access 4 (2016) 8407–8417.
- [36] H. Kim, S. Kim, N. Van Helleputte, A. Artes, M. Konijnenburg, J. Huisken, C. Van Hoof, R.F. Yazicioglu, A configurable and low-power mixed signal SoC for portable ECG monitoring applications, IEEE Trans. Biomed. Circuits Syst. 8 (2) (2013) 257–267.
- [37] P. Li, X. Zhang, M. Liu, X. Hu, B. Pang, Z. Yao, H. Jiang, H. Chen, A 410nW efficient QRS processor for mobile ECG monitoring in 0.18-µm CMOS, IEEE Trans. Biomed. Circuits Syst. 11 (6) (2017) 1356–1365.
- [38] S.-Y. Lee, J.-H. Hong, C.-H. Hsieh, M.-C. Liang, S.-Y.C. Chien, K.-H. Lin, Low-power wireless ECG acquisition and classification system for body sensor networks, IEEE J. Biomed. Health Inf. 19 (1) (2014) 236–246.
- [39] T.T. Habte, H. Saleh, B. Mohammad, M. Ismail, IoT for healthcare, in: Ultra Low Power ECG Processing System for IoT Devices, Springer, 2019, pp. 7–12.



Meenali Janveja received her B.Tech. degree in Electronics and Communication Engineering from Government Women Engineering College, Ajmer, India, in 2013. She received her M.Tech. degree in VLSI Design from Indira Gandhi Delhi Technical University for Women, India, in 2016. She worked as an assistant professor in the Department of Electronics and Communication Engineering at G.L Bajaj Institute of Technology and Management, India, from 2016 to 2017. Presently, she is a research scholar in the Department of Electronics and Electrical Engineering at Indian Institute of Technology Guwahati, India. Her research areas include digital VLSI design, computer architecture, algorithms and VLSI for biomedical signal processing.



Gaurav Trivedi received the Ph.D. degree in Electrical Engineering from Indian Institute of Technology Bombay, India, in 2007. He is an associate professor in the Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati (IIT Guwahati), India. He worked as a senior member of technical staff with Cadence Design Systems and Berkeley Design Automation (presently, Mentor-Siemens) for three years and as a postdoctoral fellow for two years at Indian Institute of Technology Bombay, before joining IIT Guwahati as a faculty member. His research interests include VLSI CAD, semiconductor devices, digital and analog circuit design, high-performance computing, and quantum computing. He is a member of IEEE.