Multifractal Recalibration of Neural Networks for Medical Imaging Segmentation
Miguel L. Martins, Miguel T. Coimbra, Francesco Renna
TL;DR
This work introduces Monofractal and Multifractal Recalibration as inductive priors embedded in channel attention to leverage local scaling exponents and multifractal spectra for medical image segmentation. Grounded in multifractal formalism, the authors develop differentiable methods to compute Hölder exponents and implement two recalibration strategies, demonstrating substantial Dice-score gains on ISIC18, Kvasir-SEG, and BUSI compared to standard SE-based baselines. The study provides insights into excitation dynamics, the role of encoder depth, and the impact of global statistics on attention effectiveness, while also discussing computational considerations and potential extensions with wavelet-based approaches. Overall, the results show that higher-order fractal statistics can meaningfully enhance end-to-end segmentation models in medical imaging.
Abstract
Multifractal analysis has revealed regularities in many self-seeding phenomena, yet its use in modern deep learning remains limited. Existing end-to-end multifractal methods rely on heavy pooling or strong feature-space decimation, which limits their use in dense prediction tasks such as semantic segmentation. Motivated by these limitations, we introduce two inductive priors: Monofractal and Multifractal Recalibration. These methods leverage relationships between the probability mass of the exponents and the multifractal spectrum to form statistical descriptions of encoder embeddings, implemented as channel-attention functions in convolutional networks. Using a U-Net-based framework, we show that multifractal recalibration yields substantial gains over baselines equipped with other channel-attention mechanisms that also use higher-order statistics. Given the proven ability of multifractal analysis to capture pathological regularities, we validate our approach on three public medical-imaging datasets: ISIC18 (dermoscopy), Kvasir-SEG (endoscopy), and BUSI (ultrasound). Our empirical analysis also provides insights into the behavior of these attention layers. We find that, because of skip connections, excitation responses in U-Net architectures do not become increasingly specialized with encoder depth, and that the effectiveness of these layers may relate to global statistics of instance variability.
