DE-KAN: A Kolmogorov Arnold Network with Dual Encoder for accurate 2D Teeth Segmentation
Md Mizanur Rahman Mustakim, Jianwu Li, Sumya Bhuiyan, Mohammad Mehedi Hasan, Bing Han
TL;DR
The paper addresses tooth segmentation in panoramic radiographs with DE-KAN, a Dual Encoder Kolmogorov Arnold Network. Two encoders, a ResNet-18 for augmented inputs and a custom CNN for original inputs, extract complementary global and local features, which KAN-based bottleneck blocks then fuse. The design aims to improve feature representation and interpretability in difficult cases such as overlapping teeth and sharp edges. In experiments on the CDPR and HTL datasets, DE-KAN reports the best mIoU, Dice, accuracy, and recall among the compared baselines, and ablation analyses attribute the gains to both the dual-encoder design and the KAN components. The added components increase computational cost, but latency remains practical, which supports potential clinical deployment and extension to broader 2D/3D dental image segmentation tasks.
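As a rough illustration of what a KAN-based block computes, the sketch below implements a toy KAN-style layer in plain Python. Each input-to-output edge carries its own learnable univariate function, and each output sums those functions over the inputs. This is not the authors' implementation. The class name `KANLayer`, the Gaussian radial basis parameterization (the original KAN formulation uses B-splines), the shared evenly spaced centers, and all default values are assumptions made for brevity.

```python
import math
import random


class KANLayer:
    """Toy KAN-style layer: one learnable univariate function per edge.

    Assumption: each edge function is a weighted sum of fixed Gaussian
    bumps; the weights are the learnable parameters.
    """

    def __init__(self, in_dim, out_dim, num_basis=5, x_min=-1.0, x_max=1.0, seed=0):
        if num_basis < 2:
            raise ValueError("num_basis must be at least 2")
        rng = random.Random(seed)
        self.in_dim = in_dim
        self.out_dim = out_dim
        # Evenly spaced basis centers shared by all edges (assumption).
        step = (x_max - x_min) / (num_basis - 1)
        self.centers = [x_min + k * step for k in range(num_basis)]
        self.width = step
        # coef[j][i][k]: weight of basis k on the edge from input i to output j.
        self.coef = [
            [[rng.uniform(-0.1, 0.1) for _ in range(num_basis)] for _ in range(in_dim)]
            for _ in range(out_dim)
        ]

    def basis(self, x):
        """Evaluate every Gaussian basis function at the scalar x."""
        return [math.exp(-(((x - c) / self.width) ** 2)) for c in self.centers]

    def edge_fn(self, j, i, x):
        """Learnable univariate function on the edge input i -> output j."""
        return sum(w * b for w, b in zip(self.coef[j][i], self.basis(x)))

    def forward(self, xs):
        """Output j is the sum over inputs i of edge_fn(j, i, xs[i])."""
        if len(xs) != self.in_dim:
            raise ValueError("expected %d inputs, got %d" % (self.in_dim, len(xs)))
        return [
            sum(self.edge_fn(j, i, x) for i, x in enumerate(xs))
            for j in range(self.out_dim)
        ]


layer = KANLayer(in_dim=3, out_dim=2)
outputs = layer.forward([0.1, -0.4, 0.9])  # list of 2 floats
```

Unlike a standard dense layer, which applies one fixed nonlinearity after a weighted sum, the nonlinearity here sits on each edge and is itself learned. This is the property the summary links to interpretability.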
Abstract
Accurate segmentation of individual teeth from panoramic radiographs remains a challenging task due to anatomical variations, irregular tooth shapes, and overlapping structures. These complexities often limit the performance of conventional deep learning models. To address this, we propose DE-KAN, a novel Dual Encoder Kolmogorov Arnold Network, which enhances feature representation and segmentation precision. The framework employs a ResNet-18 encoder for augmented inputs and a customized CNN encoder for original inputs, enabling the complementary extraction of global and local spatial features. These features are fused through KAN-based bottleneck layers, which incorporate nonlinear learnable activation functions derived from the Kolmogorov Arnold representation theorem to improve learning capacity and interpretability. Extensive experiments on two benchmark dental X-ray datasets demonstrate that DE-KAN outperforms state-of-the-art segmentation models, achieving an mIoU of 94.5%, a Dice coefficient of 97.1%, an accuracy of 98.91%, and a recall of 97.36%, an improvement of up to 4.7% in Dice over existing methods.
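For readers unfamiliar with the reported metrics, the following minimal sketch computes IoU, Dice, accuracy, and recall for a single binary mask using the standard confusion-count definitions. How the paper averages these values across images or classes is not stated here, so the function names, the flat-list mask representation, and the convention of returning 1.0 when a denominator is zero are all assumptions.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two equal-length binary masks (flat 0/1 lists)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, target):
    """Return IoU, Dice, accuracy, and recall for one binary mask pair."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    total = tp + fp + fn + tn
    union = tp + fp + fn
    # Assumption: a metric with a zero denominator is reported as 1.0.
    return {
        "iou": tp / union if union else 1.0,
        "dice": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
        "accuracy": (tp + tn) / total if total else 1.0,
        "recall": tp / (tp + fn) if (tp + fn) else 1.0,
    }


pred = [1, 1, 1, 0, 0, 0, 1, 0]
target = [1, 1, 0, 0, 0, 0, 1, 1]
metrics = segmentation_metrics(pred, target)
# TP=3, FP=1, FN=1, TN=3, so IoU = 3/5 = 0.6 and Dice = 6/8 = 0.75
```

For a single mask the two overlap metrics are tied by Dice = 2·IoU / (1 + IoU); values averaged over many images need not satisfy this identity exactly.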
