SSAD: Self-supervised Auxiliary Detection Framework for Panoramic X-ray based Dental Disease Diagnosis

Zijian Cai, Xinquan Yang, Xuguang Li, Xiaoling Luo, Xuechen Li, Linlin Shen, He Meng, Yongqiang Deng

TL;DR

The paper addresses the challenge of limited annotated panoramic dental X-rays for disease diagnosis by proposing SSAD, a self-supervised auxiliary detection framework that jointly trains a reconstruction branch and a detection branch with a shared encoder. It introduces a Texture Consistency Loss $L_{TC}=L_\theta+L_d$, where $L_\theta=1-D_{cos}(v_D,v_{R'})$ and $L_d=\big|\|v_D\|_2-\|v_{R'}\|_2\big|$, leveraging a strong feature extractor to align embeddings from the input and reconstructed images. The method is plug-and-play with any detector and demonstrates state-of-the-art performance on the DENTEX dataset across three tasks, while significantly reducing training time compared to SSL+FT baselines. These results suggest SSAD can effectively reduce annotation burden and accelerate deployment of dental disease diagnosis systems in clinical practice. The approach combines end-to-end optimization of texture-oriented reconstruction with detector training, enabling robust, fine-grained tooth disease detection in panoramic X-rays.
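
As a rough illustration of the Texture Consistency Loss, the sketch below evaluates $L_\theta$ and $L_d$ on two already-extracted, flattened embedding vectors. It assumes that $D_{cos}$ denotes cosine similarity and that $v_D$ and $v_{R'}$ are the feature-extractor embeddings of the input and reconstructed images; the function names are hypothetical and are not taken from the authors' code.

```python
import math


def l2_norm(v):
    """Euclidean (L2) norm of a flat sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity; eps guards against division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / max(l2_norm(a) * l2_norm(b), eps)


def texture_consistency_loss(v_d, v_r):
    """Return (L_TC, L_theta, L_d) for two embeddings of equal length.

    Assumption: v_d and v_r are flattened embeddings produced by a
    frozen feature extractor; this sketch does not run the extractor.
    """
    if len(v_d) != len(v_r):
        raise ValueError("embeddings must have the same length")
    l_theta = 1.0 - cosine_similarity(v_d, v_r)    # direction mismatch
    l_d = abs(l2_norm(v_d) - l2_norm(v_r))         # magnitude mismatch
    return l_theta + l_d, l_theta, l_d


if __name__ == "__main__":
    # Same direction, different length: only the magnitude term is non-zero.
    print(texture_consistency_loss([3.0, 4.0], [6.0, 8.0]))
```

The two terms are complementary: $L_\theta$ is insensitive to scale, so two embeddings that point the same way but differ in magnitude are penalized only through $L_d$. Both terms are symmetric in their arguments, so the sketch does not depend on which embedding is labeled $v_D$.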

Abstract

Panoramic X-ray is a simple and effective tool for diagnosing dental diseases in clinical practice. When deep learning models are developed to assist dentists in interpreting panoramic X-rays, their performance often suffers from limited annotated data, because annotation requires a dentist's expertise and considerable time. Although self-supervised learning (SSL) has been proposed to address this challenge, the two-stage process of pretraining and fine-tuning requires even more training time and computational resources. In this paper, we present a self-supervised auxiliary detection (SSAD) framework, which is plug-and-play and compatible with any detector. It consists of a reconstruction branch and a detection branch. Both branches are trained simultaneously and share the same encoder, without the need for fine-tuning. The reconstruction branch learns to restore the tooth texture of healthy or diseased teeth, while the detection branch uses these learned features for diagnosis. To enhance the encoder's ability to capture fine-grained features, we incorporate the image encoder of SAM to construct a texture consistency (TC) loss, which extracts image embeddings from the input and output of the reconstruction branch and then constrains both embeddings to lie in the same feature space. Extensive experiments on the public DENTEX dataset across three detection tasks demonstrate that the proposed SSAD framework achieves state-of-the-art performance compared to mainstream object detection methods and SSL methods. The code is available at https://github.com/Dylonsword/SSAD
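
The single-stage idea described above (one shared encoder feeding both branches, optimized with one combined objective) can be sketched as follows. This is a structural illustration only: every callable is a placeholder supplied by the caller, and the weights `w_rec` and `w_tc` are hypothetical, since the abstract does not state how the individual losses are combined.

```python
def joint_loss(image, targets, encoder, recon_head, det_head,
               recon_loss, det_loss, tc_loss, w_rec=1.0, w_tc=1.0):
    """Combined objective for one training step (illustrative sketch).

    Assumptions (not from the paper): the losses are summed with scalar
    weights w_rec and w_tc, and tc_loss compares the input image with
    its reconstruction through some fixed feature extractor.
    """
    features = encoder(image)               # shared encoder, run once
    reconstruction = recon_head(features)   # reconstruction branch
    predictions = det_head(features)        # detection branch
    return (det_loss(predictions, targets)
            + w_rec * recon_loss(reconstruction, image)
            + w_tc * tc_loss(image, reconstruction))


if __name__ == "__main__":
    # Toy scalar stand-ins for the real networks and losses.
    loss = joint_loss(
        image=3.0, targets=5.0,
        encoder=lambda x: 2.0 * x,
        recon_head=lambda f: f / 2.0,
        det_head=lambda f: f + 1.0,
        recon_loss=lambda r, x: abs(r - x),
        det_loss=lambda p, t: abs(p - t),
        tc_loss=lambda x, r: abs(x - r),
    )
    print(loss)
```

Because both heads consume the same `features`, a single backward pass through this total would update the encoder from the detection and reconstruction signals together, which is what removes the separate pretraining and fine-tuning stages.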

Paper Structure (11 sections, 9 equations, 5 figures, 4 tables)

Figures (5)

  • Figure 1: Comparison of the SSL+FT paradigm and SSAT paradigm.
  • Figure 2: The overview of the proposed self-supervised auxiliary detection framework.
  • Figure 3: Training time of different paradigms on the disease diagnosis task.
  • Figure 4: AP values of different detection methods using different feature extractors in the TC loss.
  • Figure 5: Detection results of detectors with and without SSAD.