End-to-End Breast Cancer Radiotherapy Planning via LMMs with Consistency Embedding
Kwanyoung Kim, Yujin Oh, Sangjoon Park, Hwa Kyung Byun, Joongyo Lee, Jin Sung Kim, Yong Bae Kim, Jong Chul Ye
TL;DR
This work presents RO-LMM, a large multimodal model designed to support end-to-end breast cancer radiotherapy planning, including clinical report summarization, radiotherapy strategy suggestion, and plan-guided 3D target-volume segmentation. It introduces Consistency Embedding Fine-Tuning (CEFTune) and Consistency Embedding Segmentation (CESEG) to mitigate error accumulation across sequential tasks and to improve robustness to noisy inputs. The approach demonstrates strong improvements over diverse baselines across internal and external validation cohorts, with ablations confirming the benefits of separate task-specific experts and consistency regularization. The framework offers practical potential to reduce clinical workload, preserve data privacy, and enable robust, end-to-end multimodal AI assistance in radiation oncology.
Abstract
Recent advances in AI foundation models have significant potential for lightening the clinical workload by mimicking the comprehensive and multi-faceted approaches used by medical professionals. In radiation oncology, integrating multiple modalities is of great importance, so foundation models have abundant opportunities. Inspired by this, we present RO-LMM, a multi-purpose, comprehensive large multimodal model (LMM) tailored for radiation oncology. The model handles a series of tasks within the clinical workflow, including clinical context summarization, radiation treatment plan suggestion, and plan-guided target volume segmentation, by leveraging the capabilities of LMMs. In particular, to perform consecutive clinical tasks without error accumulation, we present a novel Consistency Embedding Fine-Tuning (CEFTune) technique, which boosts the LMM's robustness to noisy inputs while preserving consistency in handling clean inputs. We further extend this concept to an LMM-driven segmentation framework, leading to a novel Consistency Embedding Segmentation (CESEG) technique. Experimental results, including multi-centre validation, confirm that RO-LMM with CEFTune and CESEG achieves promising performance on multiple clinical tasks with generalization capabilities.
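The abstract does not spell out the CEFTune objective, so the following is only a minimal sketch of generic consistency regularization under stated assumptions, not the authors' formulation. It assumes that input embeddings are perturbed with uniform noise (with a scaling borrowed from noisy-embedding fine-tuning), that the model is run on both clean and noisy embeddings, and that a penalty on the gap between the two outputs is added to the task loss. The names `add_uniform_noise`, `toy_model`, `consistency_loss`, `total_loss`, `alpha`, and `lam` are all hypothetical, and `toy_model` is a trivial stand-in for the LMM.

```python
import numpy as np


def add_uniform_noise(embeddings, alpha, rng):
    """Perturb token embeddings of shape (seq_len, dim) with uniform noise.

    Assumption: noise is drawn from U(-1, 1) and scaled by
    alpha / sqrt(seq_len * dim); alpha = 0 leaves the input unchanged.
    """
    seq_len, dim = embeddings.shape
    scale = alpha / np.sqrt(seq_len * dim)
    noise = rng.uniform(-1.0, 1.0, size=embeddings.shape)
    return embeddings + scale * noise


def toy_model(embeddings, weights):
    """Hypothetical stand-in for the LMM: mean-pool tokens, linear map, tanh."""
    return np.tanh(embeddings.mean(axis=0) @ weights)


def consistency_loss(clean_out, noisy_out):
    """Mean squared gap between the outputs for clean and noisy inputs."""
    return float(np.mean((noisy_out - clean_out) ** 2))


def total_loss(task_loss, clean_out, noisy_out, lam):
    """Task loss plus a consistency penalty weighted by lam (assumed form)."""
    return task_loss + lam * consistency_loss(clean_out, noisy_out)


# Usage with made-up shapes: 4 tokens, 8-dim embeddings, 3 outputs.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 3))
clean = toy_model(emb, W)
noisy = toy_model(add_uniform_noise(emb, alpha=5.0, rng=rng), W)
loss = total_loss(task_loss=1.0, clean_out=clean, noisy_out=noisy, lam=0.5)
```

In an actual training loop the clean branch would typically be treated as a fixed target (no gradient), so that the penalty pulls the noisy-input behaviour toward the clean-input behaviour rather than the reverse; that choice is also an assumption here.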
