DE-KAN: A Kolmogorov Arnold Network with Dual Encoder for accurate 2D Teeth Segmentation

Md Mizanur Rahman Mustakim, Jianwu Li, Sumya Bhuiyan, Mohammad Mehedi Hasan, Bing Han

TL;DR

The paper tackles accurate tooth segmentation in panoramic radiographs by proposing DE-KAN, a Dual Encoder Kolmogorov Arnold Network that fuses global and local features through two encoders (ResNet-18 for augmented inputs and a custom CNN for original inputs) and KAN-based bottleneck blocks. This architecture enhances feature representation and interpretability, addressing challenges like overlapping teeth and sharp edges. Extensive experiments on CDPR and HTL datasets show DE-KAN achieving state-of-the-art metrics (mIoU, Dice, Accuracy, Recall) and outperforming several baselines, with ablation analyses underscoring the value of the dual-encoder and KAN components. While computational cost increases, the method maintains practical latency, supporting potential clinical deployment and paving the way for broader 2D/3D dental image segmentation tasks.
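The four metrics named above have standard definitions in terms of pixel-level true/false positives and negatives. The following is a minimal sketch for a single binary mask pair, using only the Python standard library; the function names and the toy masks are illustrative assumptions, and how the paper averages IoU into mIoU (over classes or over images) is not stated in this summary.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, target, eps=1e-9):
    """Return IoU, Dice, accuracy, and recall for one binary mask pair."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return {
        "iou": tp / (tp + fp + fn + eps),           # intersection over union
        "dice": 2 * tp / (2 * tp + fp + fn + eps),  # equals 2*IoU / (1 + IoU)
        "accuracy": (tp + tn) / (tp + fp + fn + tn + eps),
        "recall": tp / (tp + fn + eps),
    }


# Hypothetical 2x4 masks, chosen only to make the arithmetic easy to follow.
pred = [[1, 1, 0, 0],
        [1, 1, 0, 0]]
target = [[1, 1, 1, 0],
          [1, 0, 0, 0]]
metrics = segmentation_metrics(pred, target)
```

For a single mask pair, Dice and IoU are monotonically related, so they rank predictions identically; they can diverge once each is averaged separately over many images or classes.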

Abstract

Accurate segmentation of individual teeth from panoramic radiographs remains a challenging task due to anatomical variations, irregular tooth shapes, and overlapping structures. These complexities often limit the performance of conventional deep learning models. To address this, we propose DE-KAN, a novel Dual Encoder Kolmogorov Arnold Network, which enhances feature representation and segmentation precision. The framework employs a ResNet-18 encoder for augmented inputs and a customized CNN encoder for original inputs, enabling the complementary extraction of global and local spatial features. These features are fused through KAN-based bottleneck layers, which incorporate nonlinear learnable activation functions derived from the Kolmogorov-Arnold representation theorem to improve learning capacity and interpretability. Extensive experiments on two benchmark dental X-ray datasets demonstrate that DE-KAN outperforms state-of-the-art segmentation models, achieving an mIoU of 94.5%, a Dice coefficient of 97.1%, an accuracy of 98.91%, and a recall of 97.36%, an improvement of up to 4.7% in Dice over existing methods.
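For reference, the representation theorem mentioned in the abstract can be stated as follows. The notation is the standard one from the KAN literature and is an assumption here; it may differ from the symbols used in the paper's own equations. Any continuous function $f : [0,1]^n \to \mathbb{R}$ can be written as

$$
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
$$

where each $\phi_{q,p} : [0,1] \to \mathbb{R}$ and each $\Phi_q : \mathbb{R} \to \mathbb{R}$ is a continuous function of one variable. A KAN layer generalizes the inner sum by placing a learnable univariate function on every input-output edge:

$$
x_{l+1,j} = \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i}), \qquad j = 1, \dots, n_{l+1}.
$$

In contrast to a standard linear layer followed by a fixed activation, the nonlinearity here sits on the edges and is itself trained, which is what the phrase "nonlinear learnable activation functions" refers to.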

Paper Structure

This paper contains 17 sections, 19 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: (a) Overlapping teeth; (b) sharp tooth edges across different tooth shapes.
  • Figure 2: A visual representation of Kolmogorov Arnold Network.
  • Figure 3: Overview of the proposed segmentation network DE-KAN, integrating a dual encoder and KAN blocks. The upper portion illustrates the overall workflow, where the ResNet-18 backbone processes strongly augmented inputs and the CNN extractor processes original inputs, followed by merged feature representations. The hybrid bottleneck incorporates two KAN blocks, including a Patch Embedding Layer, a KAN Linear Layer, and convolutional components for optimized feature extraction. The decoder reconstructs the segmentation map using hierarchical up-sampling blocks. The lower portion provides detailed views of the CNN encoder, the KAN block architecture, the KAN linear layer with non-linear mappings, and the decoder design, emphasizing the role of non-linear learnable parameters and convolutional layers in pixel-level segmentation.
  • Figure 4: A visualization of the feature extractor module.
  • Figure 5: Block diagram of KAN Block, showcasing sequential operations within KAN Linear layers interleaved with convolutional blocks (Conv2D, Batch Normalization, and ReLU activation). The input $x$ undergoes transformation through these stages, and a residual connection ensures the combination of input with the final output $y$, enhancing learning efficiency and stability.
  • ...and 2 more figures
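
The KAN block described in the Figure 5 caption (KAN Linear layers followed by a residual connection from input $x$ to output $y$) can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the class and function names are hypothetical, the interleaved Conv2D, Batch Normalization, and ReLU stages are omitted, and piecewise-linear "hat" functions (degree-1 B-splines) stand in for whatever spline basis the paper uses. Each edge from input $i$ to output $j$ carries its own learnable function $\phi_{j,i}(x) = \sum_k c_{j,i,k} B_k(x)$.

```python
import random


def hat_basis(x, centers, width):
    """Piecewise-linear (degree-1 B-spline) basis values at x on a uniform grid."""
    return [max(0.0, 1.0 - abs(x - c) / width) for c in centers]


class KANLinearSketch:
    """Hypothetical KAN layer: y_j = sum_i phi_{j,i}(x_i)."""

    def __init__(self, in_dim, out_dim, grid_size=5, lo=-1.0, hi=1.0, seed=0):
        rng = random.Random(seed)
        self.width = (hi - lo) / (grid_size - 1)
        self.centers = [lo + k * self.width for k in range(grid_size)]
        # One coefficient vector per edge (input i -> output j); these are
        # the parameters that training would update.
        self.coef = [[[rng.uniform(-0.1, 0.1) for _ in range(grid_size)]
                      for _ in range(in_dim)]
                     for _ in range(out_dim)]

    def __call__(self, x):
        # Basis values are shared across outputs; inputs outside [lo, hi]
        # fall off the grid and contribute zero in this simplified version.
        basis = [hat_basis(xi, self.centers, self.width) for xi in x]
        return [
            sum(c * b
                for coef_i, basis_i in zip(coef_j, basis)
                for c, b in zip(coef_i, basis_i))
            for coef_j in self.coef
        ]


def kan_block(x, layers):
    """Apply the layers in sequence, then add the input back (residual)."""
    h = x
    for layer in layers:
        h = layer(h)
    return [xi + hi for xi, hi in zip(x, h)]


layers = [KANLinearSketch(4, 4, seed=1), KANLinearSketch(4, 4, seed=2)]
x = [0.2, -0.5, 0.9, 0.0]
y = kan_block(x, layers)
```

Because of the residual connection, a block whose coefficients are all zero reduces to the identity map, which is one reason such connections tend to stabilize training of stacked blocks.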