PCG segmentation tutorial
Setup steps
import pyPCG as pcg
import pyPCG.io as pcg_io
import pyPCG.preprocessing as preproc
import pyPCG.segment as sgm
import numpy as np
import matplotlib.pyplot as plt
%matplotlib widget
Read in data and calculate envelope
from importlib.resources import files
data, fs = pcg_io.read_signal_file(str(files('pyPCG').joinpath("data").joinpath("example.wav")),"wav")
signal = pcg.pcg_signal(data,fs)
signal = pcg.normalize(signal)
bp_signal = preproc.filter(preproc.filter(signal,6,100,"LP"),6,20,"HP")
denoise_signal = preproc.wt_denoise(bp_signal)
env_signal = preproc.homomorphic(denoise_signal)
Segmentation with peak detection
We can detect the heartsounds by detecting peaks in the envelope.
Usual find_peaks methods have problems with detecting heartsound
peaks, even after proper paramtertization. This is why we use an
adaptive and advanced method.
peakvals, peaks = sgm.adv_peak(env_signal)
print(len(peaks))
266
These detections contain both the S1 and S2 sounds.
We can diferentiate between them, with multiple ways. But the simplest is by comparing the time-differences between detections.
(Usually S1-S2 sections are shorter than S2-S1 sections)
s1,s2 = sgm.peak_sort_diff(peaks)
Then, for proper segmentation we need the starting and ending time of each heartsound. This is also possible with multiple methods.
Our implementation uses the envelope and detects a significant drop in value compared to the peak as the boundaries of the heartsound.
peak_s1_start,peak_s1_end = sgm.segment_peaks(s1,env_signal,start_drop=0.8,end_drop=0.7)
Let’s visualize the result
plt.figure()
pcg.plot(signal)
pcg.plot(env_signal)
plt.plot(peak_s1_start/signal.fs,np.ones_like(peak_s1_start)*0.5,'rx')
plt.plot(peak_s1_end/signal.fs,np.ones_like(peak_s1_end)*0.5,'r.')
plt.legend(["Signal","Envelope","S1 start","S1 end"])
plt.title("S1 segmentation with peak detection")
plt.xlim((0,2))
(0.0, 2.0)
Segmentation with LR-HSMM
Based on Springer et al., a special HMM can be used to segment the PCG signal.
For this, first load a pre-trained model with compatible parameters.
Note: model training is not included yet
hsmm = sgm.load_hsmm("pre_trained.json")
This can be used directly to get the heartcycle states. Which is a vector with the same length as the input signal; its values are 1, 2, 3, 4 which correspond with S1, systole, S2, diastole states, respectively.
states = sgm.segment_hsmm(hsmm,signal)
We can illustrate this as:
t = np.linspace(0,signal.get_timelength(),len(signal.data))
plt.figure()
pcg.plot(signal)
pcg.plot(env_signal)
plt.plot(t,states)
plt.legend(["Signal","Envelope","Heartcycle states"])
plt.title("LR-HSMM output")
plt.xlim((0,2))
(0.0, 2.0)
For proper segmentation we can get the boundaries of a given state, like the following
hsmm_s1_start,hsmm_s1_end = sgm.convert_hsmm_states(states,sgm.heart_state.S1)
Visualize the results
plt.figure()
pcg.plot(signal)
pcg.plot(env_signal)
plt.plot(hsmm_s1_start/signal.fs,np.ones_like(hsmm_s1_start)*0.5,'rx')
plt.plot(hsmm_s1_end/signal.fs,np.ones_like(hsmm_s1_end)*0.5,'r.')
plt.legend(["Signal","Envelope","S1 start","S1 end"])
plt.title("S1 segmentation with LR-HSMM")
plt.xlim((0,2))
(0.0, 2.0)
Final notes
Every later calculation on segments, such as feature extraction, reqires only the boundaries and the signal.
With this solution, we can segment for different heartsounds, systole-diastole regions, or even total heartcycles. If a given segment is not directly segmentable, it can be deduced from other boundaries. For example: a total heartcycle can be segmented by offsetting the S1-start array by one, and using it as the ending boundary.