anyECG-chat: A Generalist ECG-MLLM for Flexible ECG Input and Multi-Task Understanding
Haitao Li, Ziyu Li, Yiheng Mao, Ziyi Liu, Zhoujian Sun, Zhengxing Huang
TL;DR
anyECG-chat introduces a generalist ECG-MLLM capable of handling dynamic-length, reduced-lead, and multi-ECG inputs to perform diverse tasks including report generation, fine-grained localization, and multi-ECG comparison. It relies on the anyECG dataset family and a three-stage curriculum that progressively aligns ECG perception with instruction-following in a LLaMA-based model augmented by LoRA adapters. The architecture couples a pre-trained ECG encoder with a modality connector and strategic input tokens to support dynamic ECG inputs, while the evaluation demonstrates robust cross-task performance, including zero-shot and multi-turn capabilities. This work provides a practical framework for flexible, interactive ECG understanding with potential impact on clinical workflow and home-monitoring contexts.
Abstract
The advent of multimodal large language models (MLLMs) has sparked interest in their application to electrocardiogram (ECG) analysis. However, existing ECG-focused MLLMs concentrate primarily on report generation and are often limited to a single 12-lead, short-duration (10 s) ECG input, thereby underutilizing the potential of MLLMs. We therefore aim to develop an MLLM for ECG analysis that supports a broader range of tasks and more flexible ECG inputs. Because existing ECG-QA datasets often lack diversity, we first constructed the anyECG dataset, which encompasses a wide variety of tasks, including report generation, abnormal waveform localization, and open-ended question answering. In addition to standard hospital ECGs, we introduced long-duration reduced-lead ECGs for home environments and multiple-ECG comparison scenarios commonly encountered in clinical practice. Furthermore, we propose the anyECG-chat model, which supports dynamic-length ECG inputs and multiple ECG inputs. We trained the model on the anyECG dataset using a three-stage curriculum training recipe. A comprehensive evaluation demonstrates that anyECG-chat supports various practical application scenarios, including not only common report generation tasks but also abnormal waveform localization for long-duration reduced-lead ECGs in home environments and comprehensive comparative analysis of multiple ECGs. Our code and data are available at: https://github.com/CuCl-2/anyECG-chat.
