TransECG: Leveraging Transformers for Explainable ECG Re-identification Risk Analysis
Ziyu Wang, Elahe Khatibi, Kianoosh Kazemi, Iman Azimi, Sanaz Mousavi, Shaista Malik, Amir M. Rahmani
TL;DR
TransECG introduces a Vision Transformer-based framework for ECG re-identification risk analysis that jointly delivers high predictive accuracy and interpretable insights. By leveraging self-attention, it identifies the key ECG segments, such as the R-wave and the P-R and S-T intervals, that drive gender, age, and participant ID classifications, providing explanations aligned with clinical knowledge. On four real-world datasets comprising 87 participants, TransECG achieves accuracies of 89.9% for gender, 89.9% for age, and 88.6% for ID re-identification, with ROC-AUCs up to 0.99, while offering explicit attention-based explanations of salient ECG components. This combination of performance and interpretability supports privacy-conscious ECG data sharing and informs privacy-preserving strategies in healthcare data spaces.
Abstract
Electrocardiogram (ECG) signals are widely shared across multiple clinical applications for diagnosis, health monitoring, and biometric authentication. While valuable for healthcare, they also carry unique biometric identifiers that pose privacy risks, especially when ECG data is shared across multiple entities. These risks are amplified in shared environments, where re-identification threats can compromise patient privacy. Existing deep learning re-identification models prioritize accuracy but lack explainability, making it challenging to understand how the unique biometric characteristics encoded within ECG signals are recognized and utilized for identification. Without these insights, developing secure and trustworthy ECG data-sharing frameworks remains difficult despite high accuracy, especially in diverse, multi-source environments. In this work, we introduce TransECG, a Vision Transformer (ViT)-based method that uses attention mechanisms to pinpoint critical ECG segments associated with re-identification tasks such as gender, age, and participant ID classification. Our approach demonstrates high accuracy (89.9% for gender, 89.9% for age, and 88.6% for ID re-identification) across four real-world datasets with 87 participants. Importantly, we provide key insights into the role of ECG components such as the R-wave, QRS complex, and P-Q interval in re-identification. For example, in gender classification, the R-wave contributed 58.29% of the model's attention, while in age classification, the P-R interval contributed 46.29%. By combining high predictive performance with enhanced explainability, TransECG provides a robust solution for privacy-conscious ECG data sharing, supporting the development of secure and trusted healthcare data environments.
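To make the idea of a per-segment attention contribution concrete, the following minimal sketch shows one way such a percentage could be computed: sum the attention weights that fall inside each annotated ECG segment and divide by the total attention mass. This is an illustration under stated assumptions, not the authors' implementation. The function name `attention_share`, the toy attention values, and the segment boundaries are all hypothetical, and the sketch assumes one non-negative attention weight per input position.

```python
from typing import Dict, List, Tuple


def attention_share(
    attention: List[float],
    segments: Dict[str, Tuple[int, int]],
) -> Dict[str, float]:
    """Return the percentage of total attention mass inside each segment.

    `attention` holds one non-negative weight per input position (assumed).
    `segments` maps a segment name to a half-open index range [start, end).
    """
    total = sum(attention)
    if total <= 0:
        raise ValueError("attention must have positive total mass")
    return {
        name: 100.0 * sum(attention[start:end]) / total
        for name, (start, end) in segments.items()
    }


# Toy example: hypothetical attention weights over 8 positions of one beat.
attention = [1.0, 1.0, 2.0, 10.0, 2.0, 2.0, 1.0, 1.0]
# Hypothetical segment boundaries (indices into the attention vector).
segments = {
    "P-R interval": (0, 3),
    "R-wave": (3, 4),
    "S-T interval": (4, 8),
}
shares = attention_share(attention, segments)
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}% of attention")
```

In this toy setup the segments partition the whole input, so the shares sum to 100%. With real ViT attention maps, the weights would first need to be aggregated across heads and layers and mapped from patches back to signal positions, and those choices affect the resulting percentages.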
