Bridging the Compression-Precision Paradox: A Hybrid Architecture for Clinical EEG Report Generation with Guaranteed Measurement Accuracy
Wuyang Zhang, Zhen Luo, Chuqiao Gu, Jianming Ma, Yebo Cao, Wangming Yuan, Yinzhi Jin
TL;DR
The paper tackles the compression-precision paradox in automated clinical EEG report generation, showing that extreme data compression required to fit EEG data into LLM context windows can erase clinically critical measurements, such as a 0.5 Hz distinction between seizure types. It proposes a measurement-first hybrid architecture that separates exact measurement extraction via signal processing from narrative generation, using hierarchical multirate sampling, a cross-modal bridge, graph-aware attention, and constrained decoding to preserve clinical values with provenance. A formal result (Theorem 1) demonstrates that end-to-end neural learning cannot preserve precise measurements under high compression, motivating the measurement-first approach and FDA-compliant traceability. Empirical evaluation on TUH, TUSZ, and CHB-MIT shows substantial gains: about 60% fewer false alarms, about 50% faster detection, and measurement accuracy within clinical tolerances, all with sub-minute latency; the approach also generalizes to other high-frequency biosignals requiring precise quantitative reporting alongside narrative text.
Abstract
Automated EEG monitoring requires clinician-level precision for seizure detection and reporting. Clinical EEG recordings exceed LLM context windows, so they must be compressed at extreme ratios (400:1 or more), which destroys fine-grained temporal precision. Yet a difference of only 0.5 Hz can distinguish absence epilepsy from Lennox-Gastaut syndrome. LLMs also lack inherent time-series comprehension and rely on statistical associations learned from compressed representations. Together, these two limitations cause systems to hallucinate clinically incorrect measurement values. We address this by separating measurement extraction from text generation. Our hybrid architecture computes exact clinical values via signal processing before compression, employs a cross-modal bridge for EEG-to-language translation, and uses parameter-efficient fine-tuning with constrained decoding around frozen slots. Multirate sampling maintains long-range context while preserving event-level precision. In evaluation on the TUH and CHB-MIT datasets, the system achieves 60% fewer false alarms, 50% faster detection, and measurement precision within clinical tolerances. This is the first system guaranteeing clinical measurement accuracy in automated EEG reports.
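
The measurement-first idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dominant_frequency`, `fill_report`), the synthetic 3 Hz test signal, the sampling rate, and the template wording are all assumptions chosen for illustration. A plain discrete Fourier transform stands in for the signal-processing stage, and a string template stands in for constrained decoding around frozen slots. The point is only that the number in the report is computed from the raw signal and copied verbatim, so a text generator never has to produce it.

```python
import math


def dominant_frequency(signal, fs, f_lo=0.5, f_hi=30.0):
    """Return the frequency (Hz) of the largest DFT magnitude in [f_lo, f_hi]."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove DC offset
    resolution = fs / n  # Hz per DFT bin
    k_lo = max(1, math.ceil(f_lo / resolution))
    k_hi = min(n // 2, math.floor(f_hi / resolution))
    best_k, best_power = k_lo, -1.0
    for k in range(k_lo, k_hi + 1):
        w = 2.0 * math.pi * k / n
        re = sum(x * math.cos(w * t) for t, x in enumerate(centered))
        im = sum(x * math.sin(w * t) for t, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_k, best_power = k, power
    return best_k * resolution


def fill_report(template, measurements):
    """Copy measured values into frozen slots; narrative text stays untouched."""
    slots = {name: f"{value:.1f}" for name, value in measurements.items()}
    return template.format(**slots)


# Hypothetical input: 8 s of a 3 Hz rhythm plus a weaker 10 Hz component,
# sampled at 128 Hz, which gives a frequency resolution of 0.125 Hz.
fs = 128
signal = [
    math.sin(2 * math.pi * 3.0 * t / fs) + 0.3 * math.sin(2 * math.pi * 10.0 * t / fs)
    for t in range(8 * fs)
]

freq = dominant_frequency(signal, fs)
report = fill_report(
    "Generalized rhythmic discharges at {discharge_hz} Hz.",
    {"discharge_hz": freq},
)
print(report)
```

With an 8-second window the bin spacing is 0.125 Hz, which is finer than the 0.5 Hz distinction the abstract cites. A shorter window or an aggressively downsampled representation would coarsen that spacing, which is the kind of loss the measurement-first design is meant to avoid.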
