Bridging Classification and Segmentation in Osteosarcoma Assessment via Foundation and Discrete Diffusion Models
Manh Duong Nguyen, Dac Thai Nguyen, Trung Viet Nguyen, Homi Yamada, Huy Hieu Pham, Phi Le Nguyen
TL;DR
This work tackles the subjectivity and variability of necrosis assessment in osteosarcoma by proposing FDDM, a two-stage framework that first performs patch-based classification with a LoRA-finetuned Vision Transformer and then refines region-level segmentation using a conditional Brownian-Bridge diffusion model. The region refiner integrates both the coarse classification masks and tissue context, and is trained with two losses (transition and segmentation) to produce accurate segmentation masks. On a newly curated osteosarcoma dataset, FDDM achieves up to a 10% improvement in mIoU and a 32.12% improvement in necrosis-rate estimation compared to state-of-the-art methods, establishing a new benchmark for computational pathology in this domain. The approach highlights the potential of combining foundation-model-based patch classification with diffusion-based refinement for complex, context-dependent histopathology tasks, and it can be extended to other cancer types to support clinical decision-making.
Abstract
Osteosarcoma, the most common primary bone cancer, often requires accurate necrosis assessment from whole slide images (WSIs) for effective treatment planning and prognosis. However, manual assessments are subjective and prone to variability. In response, we introduce FDDM, a novel framework that bridges the gap between patch classification and region-based segmentation. FDDM operates in two stages: patch-based classification, followed by region-based refinement, which enables cross-patch information integration. Evaluated on a newly curated dataset of osteosarcoma images, FDDM demonstrates superior segmentation performance, achieving up to a 10% improvement in mIoU and a 32.12% improvement in necrosis rate estimation over state-of-the-art methods. This framework sets a new benchmark in osteosarcoma assessment, highlighting the potential of foundation models and diffusion-based refinements in complex medical imaging tasks.
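To make the two reported metrics concrete, the following is a minimal sketch of how mIoU and a necrosis rate could be computed from a predicted and a reference label mask. It is not the authors' evaluation code. The label convention (`BACKGROUND`, `VIABLE`, `NECROSIS`), the function names, and the definition of necrosis rate as necrotic area divided by total tumor area (necrotic plus viable) are assumptions made for illustration; the abstract does not specify them.

```python
from typing import Sequence

# Hypothetical label convention; the abstract does not define one.
BACKGROUND, VIABLE, NECROSIS = 0, 1, 2


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes, on flattened masks.

    Classes absent from both masks are skipped so they do not distort the mean.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def necrosis_rate(mask: Sequence[int]) -> float:
    """Assumed definition: necrotic pixels / (necrotic + viable tumor pixels)."""
    necrotic = sum(1 for v in mask if v == NECROSIS)
    viable = sum(1 for v in mask if v == VIABLE)
    tumor = necrotic + viable
    return necrotic / tumor if tumor else 0.0


# Toy flattened masks of 8 pixels each.
target = [0, 0, 1, 1, 2, 2, 2, 2]
pred = [0, 1, 1, 1, 2, 2, 2, 0]

miou = mean_iou(pred, target, num_classes=3)
rate_error = abs(necrosis_rate(pred) - necrosis_rate(target))
print(f"mIoU = {miou:.4f}, necrosis-rate error = {rate_error:.4f}")
```

Under these assumptions, segmentation quality (mIoU) and necrosis-rate error are related but distinct: a mask can overlap well with the reference and still misestimate the necrotic fraction, which is why the two figures are reported separately.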
