Cardiac function state recognition model based on bimodal time–frequency representation

Mingzhi Zhang, Piding Li


School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China.


Address correspondence to: Piding Li, School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 516 Jungong Road, Yangpu District, Shanghai 200093, China. E-mail: lpdbyusst@163.com.


DOI: https://doi.org/10.61189/784716ypyhmm


Received November 28, 2025; Accepted February 27, 2026; Published June 24, 2026


Highlights

● This study combines two complementary cardiac physiological signals, the phonocardiogram and the electrocardiogram, to improve the final classification accuracy.

● This study converts phonocardiograms and electrocardiograms into time–frequency images, which helps increase the positive detection rate and enables automatic learning of modality-specific features through a neural network.

● This study modifies the baseline model to achieve a more streamlined neural network architecture and incorporates an attention mechanism to better focus on information correlations.

Abstract

Objective: This study uses dual-modality signals, the phonocardiogram (PCG) and the electrocardiogram (ECG), together with machine learning methods to distinguish cardiac function states in subjects. Methods: We developed a model based on time–frequency representations. The model comprises a data preprocessing module, a time–frequency conversion module, a feature extraction module, and a feature-fusion classifier module. Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) removes noise from the PCG, and filters reduce noise in the ECG. The system then extracts Mel-frequency cepstral coefficients from the PCG and applies the Fourier synchrosqueezed transform to the ECG. VGG16 and ResNet18 are modified to serve as feature extractors, with a variant attention mechanism inserted into the feature extraction networks. Finally, the system feeds the feature vector into a support vector machine for classification. Results: On public datasets, the dual-modality time–frequency method achieves 95.4% accuracy and 97.4% sensitivity for positive cases, demonstrating strong performance in cardiac function classification. Conclusion: The approach improves both diagnostic accuracy and sensitivity, and the system provides valuable support for the preliminary screening of cardiac dysfunction.
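
The two metrics reported in the Results can be made concrete with a minimal sketch. It assumes a binary labelling in which 1 marks a positive (abnormal) case; that encoding is an assumption, not stated in the abstract. The function name `accuracy_and_sensitivity` and the labels below are hypothetical and purely illustrative; they are not the study's data and do not reproduce its figures.

```python
def accuracy_and_sensitivity(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity) for two equal-length label lists.

    accuracy    = correct predictions / all predictions
    sensitivity = true positives / (true positives + false negatives)
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Sensitivity is undefined when there are no positive cases at all.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


# Made-up labels for illustration only (1 = positive, 0 = negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(f"accuracy={acc:.3f}, sensitivity={sens:.3f}")
```

Sensitivity is reported separately because, in a screening setting, a missed positive case (a false negative) is typically the costlier error, and accuracy alone does not reveal how many positives were missed.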

Keywords: Multi-modal, Phonocardiogram signal, Electrocardiogram signal, Feature encoding, Heart disease screening

Cite

Zhang MZ, Li PD. Cardiac function state recognition model based on bimodal time–frequency representation. Prog Med Devices. 2026 Jun; 4 (2): 124-134. doi: 10.61189/784716ypyhmm
