Rule-Based Spatial Mixture-of-Experts U-Net for Explainable Edge Detection

Bharadwaj Dogga, Kaaustaaub Shankar, Gibin Raju, Wilhelm Louw, Kelly Cohen

TL;DR

The paper tackles the explainability gap in edge detection by proposing the sMoE U-Net, a hybrid architecture that integrates Spatially-Adaptive Mixture-of-Experts blocks with a differentiable First-Order Takagi-Sugeno-Kang (TSK) fuzzy head. This design enables per-pixel gating between context-aware smoothing and boundary-preserving sharpening, and it expresses each decision as explicit IF-THEN rules, visualized via Strategy Maps and Rule Firing Maps. On BSDS500, the model achieves an ODS F-score of 0.7628, close to HED (0.7688) and above the standard U-Net (0.7437), so interpretability comes at little cost in accuracy. This Glass-Box approach holds promise for safety-critical applications like medical imaging and aerospace, where verifiable and auditable decisions are essential.
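To make the per-pixel gating idea concrete, here is a minimal sketch for a single pixel. It is not the authors' implementation: the function name `gate_pixel`, the two gate inputs (local variance and Sobel magnitude), and the weights `w_var`, `w_sobel`, and `bias` are all assumptions chosen for illustration. The summary only states that the gate chooses between a "Context" and a "Boundary" expert.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gate_pixel(context_out, boundary_out, local_variance, sobel_mag,
               w_var=4.0, w_sobel=4.0, bias=-2.0):
    """Blend two expert outputs for one pixel.

    Assumption: the gate logit for the boundary expert grows with local
    variance and Sobel gradient magnitude. The weights here are
    hypothetical; in a trained model they would be learned.
    """
    boundary_logit = w_var * local_variance + w_sobel * sobel_mag + bias
    # The context expert's logit is fixed at 0 as a reference point.
    weights = softmax([0.0, boundary_logit])
    mixed = weights[0] * context_out + weights[1] * boundary_out
    return mixed, weights

# A flat region leans on the context expert, a strong edge on the boundary expert.
flat_value, flat_weights = gate_pixel(0.2, 0.9, local_variance=0.0, sobel_mag=0.0)
edge_value, edge_weights = gate_pixel(0.2, 0.9, local_variance=0.5, sobel_mag=0.9)
```

Collecting the gate weights for every pixel would give one plausible reading of a Strategy Map: an image showing which expert dominated at each location.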

Abstract

Deep learning models like U-Net and its variants have established state-of-the-art performance in edge detection tasks and are used by Generative AI services worldwide for their image generation models. However, their decision-making processes remain opaque, operating as "black boxes" that obscure the rationale behind specific boundary predictions. This lack of transparency is a critical barrier in safety-critical applications where verification is mandatory. To bridge the gap between high-performance deep learning and interpretable logic, we propose the Rule-Based Spatial Mixture-of-Experts U-Net (sMoE U-Net). Our architecture introduces two key innovations: (1) Spatially-Adaptive Mixture-of-Experts (sMoE) blocks integrated into the decoder skip connections, which dynamically gate between "Context" (smooth) and "Boundary" (sharp) experts based on local feature statistics; and (2) a Takagi-Sugeno-Kang (TSK) Fuzzy Head that replaces the standard classification layer. This fuzzy head fuses deep semantic features with heuristic edge signals using explicit IF-THEN rules. We evaluate our method on the BSDS500 benchmark, achieving an Optimal Dataset Scale (ODS) F-score of 0.7628, effectively matching purely deep baselines like HED (0.7688) while outperforming the standard U-Net (0.7437). Crucially, our model provides pixel-level explainability through "Rule Firing Maps" and "Strategy Maps," allowing users to visualize whether an edge was detected due to strong gradients, high semantic confidence, or specific logical rule combinations.
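The fuzzy head described above can be illustrated with a generic first-order TSK inference step for one pixel. This is a sketch under stated assumptions, not the paper's rule base: the two inputs (a semantic confidence and a normalized gradient magnitude, both in [0, 1]), the Gaussian membership functions, the four rules, and every numeric parameter are hypothetical. What is standard TSK is the structure: rule firing strengths from a product of memberships, normalization, and linear consequents.

```python
import math

def gaussian(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

# Hypothetical fuzzy sets as (center, sigma) over inputs scaled to [0, 1].
LOW = (0.0, 0.35)
HIGH = (1.0, 0.35)

# Hypothetical rule base: (description, semantic set, gradient set,
# linear consequent coefficients (p0, p_semantic, p_gradient)).
RULES = [
    ("IF semantic is LOW AND gradient is LOW", LOW, LOW, (0.0, 0.1, 0.1)),
    ("IF semantic is LOW AND gradient is HIGH", LOW, HIGH, (0.1, 0.2, 0.4)),
    ("IF semantic is HIGH AND gradient is LOW", HIGH, LOW, (0.2, 0.5, 0.1)),
    ("IF semantic is HIGH AND gradient is HIGH", HIGH, HIGH, (0.3, 0.4, 0.3)),
]

def tsk_head(semantic, gradient):
    """First-order TSK inference for one pixel.

    Returns the edge score and the normalized firing strength of each rule.
    """
    # Product t-norm combines the two antecedent memberships per rule.
    firing = [
        gaussian(semantic, *sem_set) * gaussian(gradient, *grad_set)
        for _, sem_set, grad_set, _ in RULES
    ]
    total = sum(firing)
    normalized = [f / total for f in firing]
    # Each rule's THEN part is a linear function of the inputs.
    consequents = [
        p0 + p_sem * semantic + p_grad * gradient
        for _, _, _, (p0, p_sem, p_grad) in RULES
    ]
    score = sum(w * y for w, y in zip(normalized, consequents))
    return score, normalized

edge_score, edge_firing = tsk_head(semantic=1.0, gradient=1.0)
flat_score, flat_firing = tsk_head(semantic=0.0, gradient=0.0)
dominant_rule = RULES[edge_firing.index(max(edge_firing))][0]
```

Storing the normalized firing strengths per pixel is one way an image like the Rule Firing Map could be built: each pixel can be attributed to the rule that contributed most to its score.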

Paper Structure

This paper contains 20 sections, 3 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Compact architecture of the proposed explainable sMoE U-Net with Sobel pre-processing and a TSK fuzzy head.
  • Figure 2: Architecture of the Spatially-Adaptive Mixture-of-Experts block with Sobel edge signal.
  • Figure 3: Architecture of the proposed TSK rule-based head.
  • Figure 4: Precision-recall curve.
  • Figure 5: Edge detection comparison across various methods.
  • ...and 3 more figures