Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation
Zhusi Zhong, Jie Li, John Sollee, Scott Collins, Harrison Bai, Paul Zhang, Terrence Healey, Michael Atalay, Xinbo Gao, Zhicheng Jiao
TL;DR
This work tackles automated radiology report generation and survival prediction for COVID-19 chest X-rays (CXRs) by grounding textual descriptions in high-risk anatomical regions. It introduces MRANet, a framework that combines region detection (Faster R-CNN with a Region Completer), multi-scale region-feature encoding, survival-guided sentence embedding, image-to-text LLM alignment (GatorTron and GPT-2), and a two-stage multi-modal survival predictor. The approach yields region-grounded sentences and prognostic signals, and is validated on Brown-COVID and Penn-COVID across multiple centers, with improvements in C-index and clinical evaluation metrics. The study contributes to interpretability and trust in AI-assisted radiology by linking visual regions, descriptive text, and survival risk, and suggests directions for further enhancing clinical transparency.
Abstract
In response to the worldwide COVID-19 pandemic, advanced automated technologies have emerged as valuable tools to aid healthcare professionals in managing an increased workload by improving radiology report generation and prognostic analysis. This study proposes the Multi-modality Regional Alignment Network (MRANet), an explainable model for radiology report generation and survival prediction that focuses on high-risk regions. By learning spatial correlation in the detector, MRANet visually grounds region-specific descriptions, providing robust anatomical regions with a completion strategy. The visual features of each region are embedded using a novel survival attention mechanism, offering spatially and risk-aware features for sentence encoding while maintaining global coherence across tasks. Cross-LLM alignment is employed to enhance the image-to-text transfer process, resulting in sentences rich in clinical detail and improved explainability for radiologists. Multi-center experiments validate both MRANet's overall performance and the contribution of each module within the model, encouraging further advances in radiology report generation research that emphasize clinical interpretation and trustworthiness in AI models applied to medical studies. The code is available at https://github.com/zzs95/MRANet.
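
Survival prediction of the kind described here is commonly scored with the concordance index (C-index), the metric named in the TL;DR. As a rough illustration only, the sketch below computes Harrell's C-index in plain Python. It is not taken from the MRANet code base: the function name `concordance_index`, its arguments, and the toy data are assumptions made for this example, and pairs with tied event times are simply skipped.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times  : observed follow-up time per patient
    events : 1 if the event was observed, 0 if the patient was censored
    risks  : predicted risk score (higher means shorter expected survival)
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("times, events and risks must have the same length")

    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot be the earlier member of a comparable pair.
            continue
        for j in range(n):
            # Pair (i, j) is comparable when i had the event strictly before
            # j's observed time; tied times are skipped in this sketch.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # the higher-risk patient failed first
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to uninformative (for example, constant) risk scores, and 0.0 means the ranking is fully inverted.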
