Rule-Based Spatial Mixture-of-Experts U-Net for Explainable Edge Detection

Bharadwaj Dogga; Kaaustaaub Shankar; Gibin Raju; Wilhelm Louw; Kelly Cohen

TL;DR

The paper tackles the explainability gap in edge detection by proposing the sMoE U-Net, a hybrid architecture that integrates Spatially-Adaptive Mixture-of-Experts blocks with a differentiable First-Order Takagi-Sugeno-Kang (TSK) fuzzy head. This design enables per-pixel gating between context-aware smoothing and boundary-preserving sharpening, and provides explicit IF-THEN rules for decisions, visualized via Strategy Maps and Rule Firing Maps. On BSDS500, the model achieves an ODS F-score of 0.7628, competitive with HED and superior to standard U-Net, while offering interpretability without sacrificing accuracy. This Glass-Box approach holds promise for safety-critical applications like medical imaging and aerospace, where verifiability and auditable decisions are essential.
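The per-pixel gating idea can be illustrated with a small NumPy sketch. This is not the paper's sMoE block: the two "experts" here are a fixed box blur and an unsharp mask, and the gate is a sigmoid of local gradient magnitude. The names `box_blur`, `unsharp`, `spatial_gate`, `steepness`, and `threshold` are all hypothetical, chosen only for illustration.

```python
import numpy as np

def box_blur(x):
    """3x3 mean filter with edge padding (stand-in for a 'Context' expert)."""
    h, w = x.shape
    p = np.pad(x, 1, mode="edge")
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def unsharp(x):
    """Unsharp mask (stand-in for a 'Boundary' expert)."""
    return x + (x - box_blur(x))

def spatial_gate(x, steepness=10.0, threshold=0.1):
    """Blend the two experts per pixel.

    The gate is an assumed heuristic: a sigmoid of the local gradient
    magnitude, so strong gradients favour the sharpening expert.
    """
    gy, gx = np.gradient(x)                      # derivatives along rows, cols
    grad = np.hypot(gx, gy)                      # local gradient magnitude
    gate = 1.0 / (1.0 + np.exp(-steepness * (grad - threshold)))
    out = gate * unsharp(x) + (1.0 - gate) * box_blur(x)
    return out, gate                             # gate plays the role of a strategy map
```

Displaying `gate` as an image gives a rough analogue of a Strategy Map: bright pixels are those where the sharpening expert dominates.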

Abstract

Deep learning models such as U-Net and its variants have established state-of-the-art performance in edge detection tasks and are used by generative AI services worldwide in their image generation models. However, their decision-making processes remain opaque, operating as "black boxes" that obscure the rationale behind specific boundary predictions. This lack of transparency is a critical barrier in safety-critical applications where verification is mandatory. To bridge the gap between high-performance deep learning and interpretable logic, we propose the Rule-Based Spatial Mixture-of-Experts U-Net (sMoE U-Net). Our architecture introduces two key innovations: (1) Spatially-Adaptive Mixture-of-Experts (sMoE) blocks integrated into the decoder skip connections, which dynamically gate between "Context" (smooth) and "Boundary" (sharp) experts based on local feature statistics; and (2) a Takagi-Sugeno-Kang (TSK) Fuzzy Head that replaces the standard classification layer. This fuzzy head fuses deep semantic features with heuristic edge signals using explicit IF-THEN rules. We evaluate our method on the BSDS500 benchmark, achieving an Optimal Dataset Scale (ODS) F-score of 0.7628, effectively matching purely deep baselines like HED (0.7688) while outperforming the standard U-Net (0.7437). Crucially, our model provides pixel-level explainability through "Rule Firing Maps" and "Strategy Maps," allowing users to visualize whether an edge was detected due to strong gradients, high semantic confidence, or specific logical rule combinations.
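To make the fuzzy head concrete, here is a minimal sketch of generic first-order TSK inference applied per pixel. It assumes two inputs in [0, 1] (a semantic confidence map and a gradient map), Gaussian "low"/"high" memberships, product conjunction, and four rules; none of these choices, nor the names `gauss`, `tsk_head`, `consequents`, and `sigma`, come from the paper. The abstract does not specify the rule base or the membership functions.

```python
import numpy as np

def gauss(x, center, sigma):
    """Gaussian membership function."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

# Assumed rule base: IF sem is A AND grad is B THEN y = a*sem + b*grad + c
RULES = [("low", "low"), ("low", "high"), ("high", "low"), ("high", "high")]

def tsk_head(sem, grad, consequents, sigma=0.3):
    """First-order TSK inference over (H, W) maps.

    consequents: array of shape (4, 3) holding (a, b, c) per rule.
    Returns the fused output map and the normalized firing strengths,
    which can be displayed as one firing map per rule.
    """
    centers = {"low": 0.0, "high": 1.0}
    # Rule firing strength: product of the two antecedent memberships
    firing = np.stack([
        gauss(sem, centers[a], sigma) * gauss(grad, centers[b], sigma)
        for a, b in RULES
    ])                                                   # shape (4, H, W)
    firing_norm = firing / (firing.sum(axis=0, keepdims=True) + 1e-12)
    # Linear (first-order) consequent of each rule
    a, b, c = (consequents[:, k][:, None, None] for k in range(3))
    y = a * sem + b * grad + c                           # shape (4, H, W)
    out = (firing_norm * y).sum(axis=0)                  # weighted average
    return out, firing_norm
```

Taking `firing_norm.argmax(axis=0)` shows which rule dominates at each pixel, which is the kind of information a Rule Firing Map is meant to expose. In a trainable version the membership centers, widths, and consequents would be learned parameters rather than fixed constants.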
