Multifractal Recalibration of Neural Networks for Medical Imaging Segmentation

Miguel L. Martins, Miguel T. Coimbra, Francesco Renna

TL;DR

This work introduces Monofractal and Multifractal Recalibration as inductive priors embedded in channel attention to leverage local scaling exponents and multifractal spectra for medical image segmentation. Grounded in multifractal formalism, the authors develop differentiable methods to compute Hölder exponents and implement two recalibration strategies, demonstrating substantial Dice-score gains on ISIC18, Kvasir-SEG, and BUSI compared to standard SE-based baselines. The study provides insights into excitation dynamics, the role of encoder depth, and the impact of global statistics on attention effectiveness, while also discussing computational considerations and potential extensions with wavelet-based approaches. Overall, the results show that higher-order fractal statistics can meaningfully enhance end-to-end segmentation models in medical imaging.

Abstract

Multifractal analysis has revealed regularities in many self-seeding phenomena, yet its use in modern deep learning remains limited. Existing end-to-end multifractal methods rely on heavy pooling or strong feature-space decimation, which constrain tasks such as semantic segmentation. Motivated by these limitations, we introduce two inductive priors: Monofractal and Multifractal Recalibration. These methods leverage relationships between the probability mass of the exponents and the multifractal spectrum to form statistical descriptions of encoder embeddings, implemented as channel-attention functions in convolutional networks. Using a U-Net-based framework, we show that multifractal recalibration yields substantial gains over a baseline equipped with other channel-attention mechanisms that also use higher-order statistics. Given the proven ability of multifractal analysis to capture pathological regularities, we validate our approach on three public medical-imaging datasets: ISIC18 (dermoscopy), Kvasir-SEG (endoscopy), and BUSI (ultrasound). Our empirical analysis also provides insights into the behavior of these attention layers. We find that excitation responses do not become increasingly specialized with encoder depth in U-Net architectures due to skip connections, and that their effectiveness may relate to global statistics of instance variability.
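To make the channel-attention setting concrete, the following is a minimal sketch of a generic squeeze-and-excitation style recalibration in which the pooled per-channel statistic is pluggable. It is not the authors' multifractal descriptor: the function name `channel_recalibration`, the random weights `w1` and `w2`, the reduction ratio `r`, and the choice of mean and standard deviation as example statistics are all assumptions made for illustration.

```python
import numpy as np

def channel_recalibration(features, w1, w2, statistic=np.mean):
    """Reweight the channels of a (C, H, W) feature map.

    `statistic` is any reduction that accepts `axis=` (hypothetical hook;
    a fractal descriptor would be plugged in here instead).
    """
    c = features.shape[0]
    # Squeeze: one scalar descriptor per channel.
    z = statistic(features.reshape(c, -1), axis=1)
    # Excitation: bottleneck linear map with ReLU, then sigmoid gates in (0, 1).
    hidden = np.maximum(w1 @ z, 0.0)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Recalibration: scale every channel by its gate.
    return features * gates[:, None, None], gates

# Toy usage with random features and random (untrained) weights.
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 2
x = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C)) * 0.1
w2 = rng.normal(size=(C, C // r)) * 0.1

y_mean, g_mean = channel_recalibration(x, w1, w2, statistic=np.mean)  # first-order pooling
y_std, g_std = channel_recalibration(x, w1, w2, statistic=np.std)     # a higher-order statistic
```

Only the squeeze step changes between the two calls. That is the point at which a descriptor derived from scaling exponents would replace a plain average.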

Paper Structure

This paper contains 27 sections, 1 theorem, 35 equations, 9 figures, 4 tables.

Key Result

Theorem 4.1

Suppose that $\mu$ is a multinomial measure as in equation eq:multinomial_pk with multifractal spectrum $f$. Then, almost surely (a.s.), for $\mathcal{R}=\{1, \ldots, k\}$ and sufficiently large $k$: where $D$ is the box dimension of the support.

Figures (9)

  • Figure 1: (a) The typical channel attention model, where a statistic is pooled directly from the encoder output. (b) Our proposed approaches use statistics derived from the scaling exponents of the feature maps.
  • Figure 2: (Top-left) A realization of the binomial measure $\mu$ with $p=2/3$ supported in 2-dimensional Euclidean space. (Top-right) The associated multifractal spectrum. Notice that $f$ is a concave parabola and $\mu$ is singular, but $f(E_{x}[\alpha(x)])$ is still associated with the dimension of the support of $\mu$, which is 2. (Bottom-left and bottom-right) 3D visualization of the distribution of $\mu$ at two different scales $2^{-k}$, $k \in \{2,4\}$. Notice how the surface of $\mu$ appears more irregular as $k$ increases.
  • Figure 3: Visualization of the joint encoding of $p_l$ at encoder $\Psi_{l}$. Each level set is characterized by a Gaussian $p_l^{(q)}$ localized around $\alpha_l^{(q)}$.
  • Figure 4: (Left-most) The input Kvasir-SEG image (from the validation set) and the target region of interest (highlighted in red). (Right) Layer depth $l \in \{1,2,3\}$ versus the normalized response of $\Psi_l$, $\Tilde{\mathbf{H}}_l$, and $|\Psi_l - \Psi^{\text{Multi}}_l|$. Note how $\Tilde{\mathbf{H}}_l$ encodes textural information complementary to $\Psi_l$ for $l=1$. For $l=2,3$, the preferred singularities relate more to luminance changes that highlight anatomical structures. This also illustrates the theory set forth by Vehel et al. vehel1994multifractal, where each level set is associated with a distinct visual primitive.
  • Figure 5: The intensity distribution of the input is highly irregular. Conversely, the target variable is very homogeneous and only conveys information about the general region of interest. Sample taken from the ISIC18 dataset.
  • ...and 4 more figures
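
The binomial measure in Figure 2 can be explored numerically. The sketch below builds a binomial cascade with $p=2/3$ and computes coarse Hölder exponents and a coarse multifractal spectrum by counting dyadic intervals that share the same exponent. It assumes a 1D cascade for brevity (the figure uses a 2D support), so the spectrum peaks near 1 rather than 2. The function names are hypothetical and the code is not taken from the paper.

```python
import math

def binomial_measure(p, k):
    """Masses of the 2**k dyadic intervals of a 1D binomial cascade."""
    masses = [1.0]
    for _ in range(k):
        # Split every interval in two, giving mass fractions p and 1 - p.
        masses = [m * w for m in masses for w in (p, 1.0 - p)]
    return masses

def coarse_holder_exponents(masses, k):
    """alpha = log(mu(I)) / log(|I|) for intervals of size |I| = 2**-k."""
    return [-math.log2(m) / k for m in masses]

def coarse_spectrum(p, k):
    """(alpha, f) pairs: f = log2(number of intervals with exponent alpha) / k."""
    pairs = []
    for j in range(k + 1):  # j = number of times the fraction p was picked
        alpha = -(j * math.log2(p) + (k - j) * math.log2(1.0 - p)) / k
        f = math.log2(math.comb(k, j)) / k
        pairs.append((alpha, f))
    return pairs

masses = binomial_measure(2 / 3, 8)
alphas = coarse_holder_exponents(masses, 8)
spectrum = coarse_spectrum(2 / 3, 64)
alpha_peak, f_peak = max(spectrum, key=lambda pair: pair[1])
```

In this sketch the peak of the spectrum sits at the exponent obtained when both mass fractions are picked equally often, which is the average exponent under uniform sampling of the support. The peak value approaches the support dimension as $k$ grows, consistent with the remark in the Figure 2 caption.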

Theorems & Definitions (4)

  • Theorem 4.1: Fractal recalibration
  • Remark 4.2: Monofractal recalibration
  • Definition 4.3: Data-dependent multinomial measure
  • Remark 4.4: Multifractal recalibration