The term “modified leads” indicates that the electrodes are placed on the torso at positions chosen so that the recorded signals closely match the limb-lead signals. Such a modification is possible because the cardiac electrical field is well approximated by a time-varying dipole, so minor changes in electrode position along the same axis produce nearly the same projections. The wearable kit described in this chapter uses Limb Lead II for ECG signal acquisition.
Several ECG datasets hosted by the American Heart Association (AHA) and PhysioNet were considered for the research study. In addition to ECG dataset samples related to cardiac arrhythmia, software libraries for querying these datasets were required in order to extract physiological parameters and clinical information. As the ECG waveform records would be examined electronically using software algorithms, digitisation information was also required to convert raw analogue signals to a digitised format compatible with the dataset.
3.3
MITDB arrhythmia dataset
Figure 3.3: The PhysioNet LightWAVE application showing ECG signals and annotations in an MITDB record. Source: https://PhysioNet.org/lightwave/
The MITDB dataset from the MIT-BIH arrhythmia database maintained by PhysioNet, which was considered for this research study, has ECG samples gathered from lead II (ML2 according to MITDB) and V2. The names ML2 and V2 refer to the electrode placements on the human body from which the ECG signals were recorded. Since the recordings in the dataset were digitised at 360 samples per second at 11-bit resolution, the real-time ECG acquisition described in subsection 4.5.2 had to sample and digitise the ECG readings at the same sampling frequency, to conform to the MITDB data conversion format for further analysis. With 11-bit resolution over a +/-5 mV range, sample values range from 0 to 2047. The signals in the MITDB dataset were originally filtered using an anti-aliasing filter with a pass-band of 0.1 to 100 Hz and a 60 Hz notch filter. The same filtering can be achieved using the digital signal processing libraries in SciPy and MATLAB, which are commonly used in signal-processing research. Since the MITDB database maintainers have already provided the sampling frequency and filter specifications, the same specifications were used for the acquisition and filtering of ECG signals from human subjects, as described in section 4.5 on signal acquisition.
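The digitisation described above can be illustrated with a minimal sketch. It assumes a simple linear mapping of the +/-5 mV range onto the 11-bit codes 0 to 2047; the function names `quantise_mv` and `sample_times` are hypothetical and are not part of MITDB or the WFDB software, and real acquisition hardware may use a different gain and baseline.

```python
# Minimal sketch: 11-bit quantisation over a +/-5 mV range at 360 samples/s.
# Assumption: a linear mapping of -5 mV..+5 mV onto the codes 0..2047.
FULL_SCALE_MV = 5.0      # half-width of the input range in millivolts
LEVELS = 2 ** 11         # 11-bit resolution gives 2048 codes (0..2047)
FS_HZ = 360.0            # MITDB sampling frequency


def quantise_mv(mv: float) -> int:
    """Map a voltage in millivolts to an 11-bit code, clipping at the rails."""
    fraction = (mv + FULL_SCALE_MV) / (2 * FULL_SCALE_MV)
    code = round(fraction * (LEVELS - 1))
    return max(0, min(LEVELS - 1, code))


def sample_times(n_samples: int, fs: float = FS_HZ) -> list:
    """Return the acquisition time in seconds of each of n_samples samples."""
    return [i / fs for i in range(n_samples)]
```

Under these assumptions `quantise_mv(-5.0)` gives the lowest code and `quantise_mv(5.0)` the highest, and consecutive entries of `sample_times` are 1/360 s apart.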
The MITDB Arrhythmia database considered for this research study has 48 patient records of ECG waveforms covering the variations of arrhythmia that could be found in a human subject suffering from abnormal heart rhythm conditions. Each record in the MITDB dataset has been manually annotated by physicians and cardiologists to identify events of abnormal heart function. Annotations are placed at the specific locations in the waveform where abnormalities were found. All the ECG recordings have annotations that indicate the time of occurrence of each normal or abnormal heartbeat, also called beat-by-beat annotations. These annotations can be inspected using tools such as LightWAVE, provided by PhysioNet. The normal beats appear as blue dots, labelled ‘atr’, in the LightWAVE plot shown in figure 3.3.
Each beat annotation type carries clinical and physiological significance, indicating normal sinus rhythm or an abnormality found in the signal at a particular location. For example, N indicates normal beats, and the V, A, L and R annotations indicate abnormalities found in an ECG signal at the corresponding locations. As shown in table 3.1, the non-beat annotation types provide structural and morphological information: the start ‘(’ and stop ‘)’ annotations mark the onset and end of each of the PR, QRS and ST segments in a heartbeat waveform, and the p and t annotations mark the peaks of the P and T waves respectively. To perform extensive data analysis on ECG samples from all the MITDB records, an adequate number of samples was required for each of the beat annotation types. As discussed in the literature review (section 2.2), premature ventricular complexes and premature atrial beats, along with left and right bundle branch blocks, are indicators of arrhythmia that may become fatal if not treated in time. To identify these abnormalities, the ECG samples in the datasets must carry these beat annotations. Examination of the records with the LightWAVE tool showed that these beat annotations were present in ample quantity in the MITDB records.
The most common beat annotations found in the ECG signals of the MITDB records are:
N    Normal beat
L    Left bundle branch block beat
R    Right bundle branch block beat
V    Premature ventricular contraction
A    Premature atrial beat
The non-beat annotation types are as follows:
(    Waveform onset
)    Waveform end
p    Peak of P-wave
t    Peak of T-wave
Table 3.1: The common beat and non-beat type annotations for ECG signals in PhysioNet MITDB records
The spacing between successive beat annotations corresponds to an RR-interval. An RR-interval typically spans roughly 250 to 360 samples (about 0.7 to 1.0 s at 360 samples per second), and a single record contains approximately 650,000 samples.
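The sample counts quoted above can be related to time and heart rate with a short worked calculation. The sketch below assumes the MITDB sampling frequency of 360 samples per second; the helper names are hypothetical and serve only as illustration.

```python
# Minimal sketch: converting sample counts to seconds, beats per minute
# and record duration, assuming a 360 samples/s sampling frequency.
FS_HZ = 360.0


def rr_seconds(rr_samples: float, fs: float = FS_HZ) -> float:
    """Duration in seconds of an RR-interval given in samples."""
    return rr_samples / fs


def heart_rate_bpm(rr_samples: float, fs: float = FS_HZ) -> float:
    """Instantaneous heart rate in beats per minute for one RR-interval."""
    return 60.0 * fs / rr_samples


def record_minutes(n_samples: int, fs: float = FS_HZ) -> float:
    """Duration in minutes of a record containing n_samples samples."""
    return n_samples / fs / 60.0
```

For example, an RR-interval of 360 samples lasts one second and corresponds to 60 beats per minute, while `record_minutes(650000)` shows that a record of 650,000 samples lasts slightly over 30 minutes.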
The data was obtained by downloading the ATR, DAT, HEA files for each record from the PhysioNet website for MITDB dataset. Source:
http://PhysioNet.org/physiobank/database/mitdb/.
Each record consists of at least three files: the ATR, DAT and HEA files. The ATR files are binary files that hold all the beat and non-beat annotations, together with their positions in the signal, for each MITDB record. The DAT files hold the binary digitised samples of the signals in the record. In Format 16, each sample is represented by a 16-bit two’s complement amplitude stored least significant byte first, and any unused high-order bits are sign-extended from the most significant bit. Format 16 is also the format that a freshly acquired signal would have to be converted to in order to conform to the MITDB record format. The HEA files are short text files that describe the contents of the associated signal files. The header information includes the sampling rate and the age, gender and medication-related information for a particular patient.
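The Format 16 byte layout described above can be sketched with Python's standard `struct` module. This is an illustration of the layout only, not the WFDB implementation; the function names `encode_format16` and `decode_format16` are hypothetical.

```python
# Minimal sketch of the Format 16 sample layout: each sample is a 16-bit
# two's complement integer stored least significant byte first.
import struct


def encode_format16(samples):
    """Pack integer samples (-32768..32767) as little-endian 16-bit words."""
    return struct.pack("<%dh" % len(samples), *samples)


def decode_format16(data):
    """Unpack a Format 16 byte string back into a list of integer samples."""
    if len(data) % 2 != 0:
        raise ValueError("Format 16 data must contain whole 16-bit samples")
    return list(struct.unpack("<%dh" % (len(data) // 2), data))
```

An 11-bit sample value such as 1024 occupies only the low-order bits of the 16-bit word, and a negative value such as -1 is sign-extended to the bytes `0xff 0xff`.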
3.3.1
PhysioNet WFDB library
PhysioNet also provides PhysioToolkit (Silva and G. B. Moody 2014; Ary L Goldberger et al. 2000), a library of software for physiologic signal processing and analysis, and for the detection of physiologically significant events within signals using statistics and quantitative analysis, digital signal processing and nonlinear dynamics. It is also used for the interactive display and characterisation of signals, the creation of new databases, and the simulation of physiologic and other signals. The focus of this research study was on the extraction of ‘hidden’ information from biomedical signals, so that information with potential diagnostic value in medicine could be obtained and transformed into the mathematical domain for further analysis. The WaveForm DataBase (WFDB) library (G. Moody 2019) provided by PhysioNet was used to extract features from the MITDB dataset.
PhysioToolkit provides the WFDB (WaveForm DataBase) software package, which is used for viewing, analysing and creating records of physiologic signals. The WFDB software package has three components:
• WFDB library which is an application program interface to access PhysioNet data sets.
• WFDB routines or applications are online tools or C/C++ subroutines/functions for signal processing and automated analysis.
• WAVE is software for viewing, annotating and analysing signals.
The following primitives were used extensively in chapters 4 and 5 for extracting features from MITDB and MIMIC Numerics datasets.
• rdsamp reads ECG signal files for the specified record and outputs the samples as decimal numbers. rdsamp starts at the beginning of the record and outputs all samples line by line containing the sample number and samples from each signal, beginning with channel 0, separated by tabs.
• wrsamp reads text input (e.g. a comma-separated file) and outputs the specified columns in WFDB signal file Format 16, either to the standard output or to a disk file. A Format 16 sample is represented by a 16-bit two’s complement amplitude stored least significant byte first. Any unused high-order bits of the sample are sign-extended from the most significant bit.
• gqrs attempts to locate QRS complexes in an ECG signal in the specified record. The output of gqrs is an annotation file (with annotator extension qrs) in which all detected peaks are labelled normal ‘N’. The fields of each annotation indicate: (a) the detection pass (0 or 1) during which the QRS complex was detected, (b) the signal number on which it was detected, and (c) the peak amplitude of the annotation detector filter during the QRS complex.
• rdann reads the annotation file specified by record and outputs it in text format, one annotation per line. The output contains (from left to right) the time of the annotation in hours, minutes, seconds and milliseconds; the sample number, where consecutive samples are 1/360 s (approximately 0.00278 s) apart for records sampled at 360 Hz, as the MITDB records are; and a mnemonic for the annotation type, e.g. (, ), p, N, V, A, L or R, corresponding to the annotations in the ECG signal.
• ecgpuwave analyses an ECG signal from the specified record, detects the QRS complexes and locates the start and end locations of the P, QRS, and T sub-waves. The output of ecgpuwave is a standard WFDB-format annotation file associated with the specified annotator e.g. qrs and epu in chapters 4 and 5. This annotation file can be converted into text format using rdann.
• ann2rr is typically used to obtain a list of RR-intervals from an ECG annotation file, e.g. a qrs file. By default, the intervals are listed in units of sample intervals (1/360 s for MITDB records), so the sampling frequency of the input record is needed to convert them to seconds.
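How the text output of rdann relates to the RR-intervals listed by ann2rr can be sketched in a few lines of Python. The sketch assumes that each line starts with the time, the sample number and the annotation mnemonic, separated by whitespace. The lines in `EXAMPLE_LINES` are invented for illustration and are not taken from any MITDB record, and the function names are hypothetical.

```python
# Minimal sketch: deriving RR-intervals from rdann-style text output.
# Assumption: each line begins with <time> <sample number> <mnemonic>.
BEAT_TYPES = {"N", "L", "R", "V", "A"}   # beat annotations from table 3.1

# Invented example lines (not real MITDB data); 'p' is a non-beat annotation.
EXAMPLE_LINES = [
    "    0:00.278      100     N    0    0    0",
    "    0:00.972      350     p    0    0    0",
    "    0:01.111      400     N    0    0    0",
    "    0:01.917      690     V    0    0    0",
]


def parse_annotation(line):
    """Return (time string, sample number, mnemonic) for one output line."""
    fields = line.split()
    return fields[0], int(fields[1]), fields[2]


def rr_intervals(lines, fs=360.0):
    """RR-intervals in seconds between successive beat annotations."""
    beats = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, sample, mnemonic = parse_annotation(line)
        if mnemonic in BEAT_TYPES:
            beats.append(sample)
    return [(b - a) / fs for a, b in zip(beats, beats[1:])]
```

Calling `rr_intervals(EXAMPLE_LINES)` skips the non-beat annotation and returns two intervals, 300 and 290 samples long, expressed in seconds.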