Objective and Interpretable Breast Cosmesis Evaluation with Attention Guided Denoising Diffusion Anomaly Detection Model

Sangjoon Park; Yong Bae Kim; Jee Suk Chang; Seo Hee Choi; Hyungjin Chung; Ik Jae Lee; Hwa Kyung Byun

TL;DR

Breast cosmesis evaluation after surgery is hampered by subjective labeling. The authors introduce AG-DDAD, an unsupervised anomaly-detection framework that couples a DINO self-supervised Vision Transformer attention mask with a diffusion-model-based reconstruction, enabling targeted, high-quality modifications of discriminative regions while preserving the rest of the image. Trained on predominantly normal cosmesis data and evaluated against expert consensus labels, AG-DDAD achieves state-of-the-art anomaly-detection performance and provides objective, interpretable anomaly maps and scores, outperforming rule-based tools like BCCT.core. This approach advances unsupervised anomaly detection in medical imaging and offers a scalable, automated, and explainable framework for cosmesis assessment that could improve treatment planning and retrospective evaluation.

Abstract

As breast cancer treatment advances, the assessment of post-surgical cosmetic outcomes has become increasingly important because of its substantial impact on patients' quality of life. However, evaluating breast cosmesis is challenging because expert labeling is inherently subjective. In this study, we present a novel automated approach, Attention-Guided Denoising Diffusion Anomaly Detection (AG-DDAD), designed to assess breast cosmesis following surgery while addressing the limitations of conventional supervised learning and existing anomaly detection models. Our approach combines the attention mechanism of the distillation with no labels (DINO) self-supervised Vision Transformer (ViT) with a diffusion model to achieve high-quality image reconstruction and precise transformation of discriminative regions. By training the diffusion model on unlabeled data consisting predominantly of normal cosmesis, we adopt an unsupervised anomaly detection perspective to score cosmesis automatically. Experiments on real-world data demonstrate the effectiveness of our method, which provides visually appealing representations and quantifiable scores for cosmesis evaluation. Compared with commonly used rule-based programs, our fully automated approach eliminates the need for manual annotations and offers objective evaluation. Moreover, our anomaly detection model exhibits state-of-the-art performance, surpassing existing models in accuracy. Beyond breast cosmesis, our research represents a significant advancement in unsupervised anomaly detection within the medical domain, paving the way for future investigations.
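The two ideas named above, a soft attention mask that decides where an image may be changed and a score derived from how much the reconstruction differs from the input, can be illustrated with a minimal sketch on plain arrays. This is not the authors' implementation. In AG-DDAD the mask fuses two reverse sampling procedures inside the diffusion process, whereas this sketch blends two finished images. The function names (`fuse_with_soft_mask`, `anomaly_map`, `anomaly_score`) and the use of a mean absolute difference as the score are assumptions made purely for illustration.

```python
import numpy as np


def fuse_with_soft_mask(x_edit, x_keep, mask):
    """Blend two candidate images with a soft mask in [0, 1].

    Where mask is near 1 the edited candidate dominates; where it is
    near 0 the preserved candidate is kept. (Hypothetical helper.)
    """
    mask = np.clip(mask, 0.0, 1.0)
    return mask * x_edit + (1.0 - mask) * x_keep


def anomaly_map(original, reconstruction):
    """Per-pixel absolute difference (an assumed, simple choice)."""
    return np.abs(original - reconstruction)


def anomaly_score(original, reconstruction):
    """Mean of the anomaly map, as one scalar per image (assumed)."""
    return float(anomaly_map(original, reconstruction).mean())


# Toy 4x4 example: the mask allows changes only in the top-left 2x2 block.
original = np.zeros((4, 4))
x_edit = np.ones((4, 4))      # stand-in for a freely transformed sample
x_keep = original.copy()      # stand-in for a sample that preserves the input
mask = np.zeros((4, 4))
mask[:2, :2] = 1.0

fused = fuse_with_soft_mask(x_edit, x_keep, mask)
score = anomaly_score(original, fused)
print(fused)
print(score)
```

In this toy case only the masked block changes, so the anomaly map is non-zero only there and the score grows with the size and intensity of the modified region.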

Paper Structure (25 sections, 16 equations, 6 figures, 5 tables)

Introduction
Related works
Denoising diffusion models
Unsupervised anomaly detection
Self-supervised vision transformer
Proposed framework
Overview of the proposed AG-DDAD model
Self-supervised ViT for attention mask generation
Anomaly scoring through high-quality reconstruction
Implementation details
Details of dataset
Details of DINO self-supervised ViT for attention guidance
Details of diffusion model training and sampling
Details of evaluation
Experimental results
...and 10 more sections

Figures (6)

Figure 1: Schematic illustration of the proposed Attention-Guided Denoising Diffusion Anomaly Detection (AG-DDAD) model. The architecture uses the attention mechanism of a Vision Transformer (ViT) that has been self-trained with the Distillation with no labels (DINO) method. This attention serves as a soft mask that fuses two different reverse sampling methods.
Figure 2: Examples of the soft mask, derived from the attention weights of self-supervised Vision Transformer (ViT) models, each corresponding to different cosmesis groups.
Figure 3: Comparative analysis of anomaly scores across different cosmesis groups. The anomaly score rises with statistical significance as cosmesis deteriorates, and this trend is observed across all groups.
Figure 4: Comparison of reconstruction outcomes for normal cosmesis between the proposed methodologies and other diffusion model-based methods. Other diffusion model-based approaches either distort details of the image that should be preserved (red box) or fail to sufficiently transform areas that need modification for achieving normal cosmesis (yellow box). Conversely, the proposed framework effectively transforms only the requisite portions while preserving the rest, as illustrated.
Figure 5: Comparison between the conventional rule-based scoring method, the BCCT.core program and the proposed framework. (A) The BCCT.core program necessitates manual delineation of the breast contour and anatomical markers (red dots) to produce class results, yet it fails to provide any quantifiable scores or visualization outcomes. (B) The proposed framework utilizing the AG-DDAD model, on the other hand, generates quantifiable anomaly scores conducive to comparison, solely based on the image itself, without requiring any additional processes. This framework also facilitates the visualization of corresponding anomaly maps.
...and 1 more figures
