OMNI-Dent: Towards an Accessible and Explainable AI Framework for Automated Dental Diagnosis

Leeje Jang, Yao-Yi Chiang, Angela M. Hastings, Patimaporn Pungchanchaikul, Martha B. Lucas, Emily C. Schultz, Jeffrey P. Louie, Mohamed Estai, Wen-Chen Wang, Ryan H. L. Ip, Boyen Huang

TL;DR

OMNI-Dent tackles the problem of limited access to timely dental diagnosis by introducing an explainable, data-efficient framework that uses multi-view smartphone images and a general-purpose vision-language foundation model without dental-specific fine-tuning. The approach combines a tooth-detection module, a clinically guided reasoning module, and a diagnosis integration module to deliver tooth-level diagnoses with interpretable reasoning traces. Experiments show that clinically guided reasoning substantially improves abnormality detection and category-specific diagnoses over baselines, with strong recall across conditions; in-context learning can further boost performance for some defects. The work demonstrates the practical potential of at-home, smartphone-based dental screening that preserves clinical reasoning, supports early intervention, and remains complementary to professional care, particularly for underserved populations. Future work will broaden diagnostic scope, strengthen robustness across imaging conditions, and pursue prospective deployment in real-world settings.

Abstract

Accurate dental diagnosis is essential for oral healthcare, yet many individuals lack access to timely professional evaluation. Existing AI-based methods primarily treat diagnosis as a visual pattern recognition task and do not reflect the structured clinical reasoning used by dental professionals. These approaches also require large amounts of expert-annotated data and often struggle to generalize across diverse real-world imaging conditions. To address these limitations, we present OMNI-Dent, a data-efficient and explainable diagnostic framework that incorporates clinical reasoning principles into a Vision-Language Model (VLM)-based pipeline. The framework operates on multi-view smartphone photographs, embeds diagnostic heuristics from dental experts, and guides a general-purpose VLM to perform tooth-level evaluation without dental-specific fine-tuning. By utilizing the VLM's existing visual-linguistic capabilities, OMNI-Dent aims to support diagnostic assessment in settings where curated clinical imaging is unavailable. Designed as an early-stage assistive tool, OMNI-Dent helps users identify potential abnormalities and determine when professional evaluation may be needed, offering a practical option for individuals with limited access to in-person care.

Paper Structure (34 sections, 2 figures, 3 tables)

Figures (2)

  • Figure 1: Overview of OMNI-Dent. Replicating dentists' clinical diagnostic reasoning processes, the framework processes multi-view smartphone photographs through tooth detection, clinical reasoning, and diagnosis integration modules. The output of OMNI-Dent provides tooth-level diagnostic conditions with corresponding reasoning.
  • Figure 2: Three-step diagnostic reasoning in the clinical reasoning module (left). For example, the module assesses tooth wear in three stages (right), replicating a dentist's diagnostic process: Step 1 localizes early surface changes; Step 2 examines structural and textural patterns using clinician-guided criteria (e.g., attrition, abfraction/abrasion, erosion); and Step 3 integrates these findings to produce the final diagnosis.
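
The flow described in the two captions (detected teeth per view, a three-step reasoning pass per tooth, then integration into one tooth-level result) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every name here (ToothFinding, reason_about_tooth, integrate, diagnose, fake_vlm), the prompt wording, and the "prefer an abnormal finding across views" merge rule are assumptions introduced for the example, and the VLM and tooth detector are replaced by stubs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types and names; the paper's real interfaces are not shown here.

@dataclass
class ToothFinding:
    tooth_id: str
    view: str
    abnormal: bool
    category: str
    reasoning: List[str]  # human-readable trace of each reasoning step

# Stand-in for a general-purpose VLM call: (prompt, view, tooth_id) -> answer text.
AskVLM = Callable[[str, str, str], str]

def reason_about_tooth(ask_vlm: AskVLM, view: str, tooth_id: str) -> ToothFinding:
    """Three-step pass loosely following the Figure 2 caption."""
    trace: List[str] = []
    # Step 1: check for visible surface changes.
    step1 = ask_vlm("Step 1: Are there visible surface changes? Answer yes or no.",
                    view, tooth_id)
    trace.append(f"step1: {step1}")
    if step1.strip().lower() != "yes":
        return ToothFinding(tooth_id, view, False, "none", trace)
    # Step 2: describe structural and textural patterns against listed criteria.
    step2 = ask_vlm("Step 2: Describe the structural and textural pattern "
                    "(attrition, abfraction/abrasion, erosion).", view, tooth_id)
    trace.append(f"step2: {step2}")
    # Step 3: integrate the observations into a single category.
    step3 = ask_vlm(f"Step 3: Given '{step2}', name the most likely category.",
                    view, tooth_id)
    trace.append(f"step3: {step3}")
    return ToothFinding(tooth_id, view, True, step3.strip().lower(), trace)

def integrate(findings: List[ToothFinding]) -> Dict[str, ToothFinding]:
    """Merge per-view findings into one per tooth.

    Assumed rule: keep the first finding, but let an abnormal one replace a normal one.
    """
    merged: Dict[str, ToothFinding] = {}
    for f in findings:
        current = merged.get(f.tooth_id)
        if current is None or (f.abnormal and not current.abnormal):
            merged[f.tooth_id] = f
    return merged

def diagnose(views: Dict[str, List[str]], ask_vlm: AskVLM) -> Dict[str, ToothFinding]:
    """views maps a view name to the tooth ids detected in it (detector is stubbed)."""
    findings = [reason_about_tooth(ask_vlm, view, tooth)
                for view, teeth in views.items() for tooth in teeth]
    return integrate(findings)

def fake_vlm(prompt: str, view: str, tooth_id: str) -> str:
    """Canned answers so the sketch runs without any model."""
    if prompt.startswith("Step 1"):
        return "yes" if (view, tooth_id) == ("frontal", "11") else "no"
    if prompt.startswith("Step 2"):
        return "flattened incisal edge"
    return "attrition"

result = diagnose({"frontal": ["11", "21"], "upper": ["11"]}, fake_vlm)
for tooth_id, finding in result.items():
    print(tooth_id, finding.abnormal, finding.category, finding.reasoning)
```

Passing the model in as a plain callable keeps the reasoning steps testable with canned answers; a real system would swap fake_vlm for an actual VLM client and feed diagnose the output of a tooth detector.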