TiBiX: Leveraging Temporal Information for Bidirectional X-ray and Report Generation
Santosh Sanjeev, Fadillah Adamsyah Maani, Arsen Abzhanov, Vijay Ram Papineni, Ibrahim Almakky, Bartłomiej W. Papież, Mohammad Yaqub
TL;DR
TiBiX addresses the omission of temporal context in chest X-ray (CXR) and report generation with a temporal bidirectional framework. Given a patient's prior CXR, it generates either the current report (from the prior and current CXRs) or the current CXR (from the prior CXR and the current report). It relies on a transformer with causal attention and a temporal token to fuse the three inputs (prior CXR, current CXR, and current report), and introduces MIMIC-T, a longitudinal dataset derived from MIMIC-CXR. It reports state-of-the-art results on report generation and performance on par with state-of-the-art image generation, with ablations supporting the benefit of including prior scans. The work provides a baseline for longitudinal, bidirectional CXR and report generation and opens avenues for temporally aware evaluation and knowledge-augmented radiology AI.
Abstract
With the emergence of vision language models in the medical imaging domain, numerous studies have focused on two dominant research activities: (1) report generation from chest X-rays (CXRs), and (2) synthetic scan generation from text or reports. Although some research has incorporated multi-view CXRs into the generative process, prior patient scans and reports have generally been disregarded. This can omit important medical information and thereby degrade generation quality. To address this, we propose TiBiX: Leveraging Temporal Information for Bidirectional X-ray and Report Generation. By taking previous scans into account, our approach enables bidirectional generation and addresses two challenging problems: (1) generating the current image from the previous image and the current report, and (2) generating the current report from both the previous and current images. Moreover, we extract and release a curated temporal benchmark dataset derived from the MIMIC-CXR dataset. Our comprehensive experiments and ablation studies explore the merits of incorporating prior CXRs and achieve state-of-the-art (SOTA) results on the report generation task. Furthermore, we attain on-par performance with SOTA image generation efforts, thus serving as a new baseline in longitudinal bidirectional CXR-to-report generation. The code is available at https://github.com/BioMedIA-MBZUAI/TiBiX.
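To make the two generation directions concrete, the following is a minimal illustrative sketch, not the released implementation. It assumes that images and reports have already been converted to discrete token IDs, that a single temporal token is prepended to the sequence, and that the conditioning inputs precede the target so that a causal (lower-triangular) mask lets the target attend to them. The function names, token IDs, and sequence ordering are all hypothetical.

```python
# Illustrative sketch only: names, token IDs, and ordering are assumptions,
# not taken from the TiBiX code base.

def build_sequence(prior_img, condition, target, temporal_token):
    """Concatenate [temporal token | prior image | condition | target]."""
    return [temporal_token] + list(prior_img) + list(condition) + list(target)

def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

# Toy token IDs standing in for tokenized inputs (hypothetical values).
TEMPORAL = 1
prior_img = [10, 11]
curr_img = [20, 21]
report = [30, 31, 32]

# Direction 1: current report from the prior and current images.
seq_report = build_sequence(prior_img, curr_img, report, TEMPORAL)

# Direction 2: current image from the prior image and current report.
seq_image = build_sequence(prior_img, report, curr_img, TEMPORAL)

# Both sequences have the same length, so one mask serves both directions.
mask = causal_mask(len(seq_report))
```

Under these assumptions, the same model and mask serve both tasks; only the order of the current image and current report in the sequence changes, which determines which of the two is generated last.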
