Uncertainty-Aware Decomposed Hybrid Networks

Sina Ditzel, Achref Jaziri, Iuliia Pliushch, Visvanathan Ramesh

TL;DR

The paper tackles robustness and interpretability in image recognition under limited labeled data by introducing a decomposed, uncertainty-aware hybrid network that couples task-specific quasi-invariant operators with neural encoders. It builds a Bayesian confidence framework that propagates per-operator confidence, via noise modeling and Mahalanobis-distance-based likelihoods, into a joint latent representation produced by a VAE-based encoder. The approach is instantiated for traffic sign recognition on GTSRB using rg-color and LBP operators, with per-operator priors and normalized convolutions to weight uncertain regions. Empirical results show that LBP is highly effective for traffic signs, and that a decomposed hybrid design (notably rg+LBP with confidence propagation) delivers competitive or superior performance, especially in semi-supervised and low-data regimes, highlighting the value of integrating model-based operators with learned representations for data-constrained applications.
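
As a rough illustration of the per-pixel confidence idea summarized above, the following Python sketch maps an RGB pixel to rg chromaticity and scores its squared Mahalanobis distance from an achromatic reference point. The function names, the reference mean (1/3, 1/3), the diagonal noise covariance, and the chi-square-based mapping from distance to a confidence value are all assumptions made for illustration; the paper's own noise model and likelihoods may differ.

```python
import math

def rg_chromaticity(r, g, b, eps=1e-8):
    """Map an RGB triple to intensity-normalized (r, g) chromaticity."""
    s = r + g + b
    if s < eps:
        # Assumption: treat (near-)black pixels as achromatic.
        return (1.0 / 3.0, 1.0 / 3.0)
    return (r / s, g / s)

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point for a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Closed-form inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def confidence_from_distance(d2):
    """Chi-square CDF with 2 degrees of freedom, evaluated at d2.

    Assumption: used here as a hypothetical confidence that the pixel
    deviates from the achromatic reference; not taken from the paper.
    """
    return 1.0 - math.exp(-0.5 * d2)

# Hypothetical noise model: achromatic mean, isotropic covariance.
ACHROMATIC_MEAN = (1.0 / 3.0, 1.0 / 3.0)
NOISE_COV = ((0.01, 0.0), (0.0, 0.01))

def pixel_confidence(r, g, b):
    rg = rg_chromaticity(r, g, b)
    return confidence_from_distance(
        mahalanobis_sq(rg, ACHROMATIC_MEAN, NOISE_COV)
    )
```

Under these assumed parameters, a gray pixel such as `pixel_confidence(100, 100, 100)` sits exactly on the reference point and receives zero confidence, while a saturated red pixel such as `pixel_confidence(200, 20, 20)` lies far from it and receives a confidence close to one.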

Abstract

The robustness of image recognition algorithms remains a critical challenge, as current models often depend on large quantities of labeled data. In this paper, we propose a hybrid approach that combines the adaptability of neural networks with the interpretability, transparency, and robustness of domain-specific quasi-invariant operators. Our method decomposes the recognition into multiple task-specific operators that focus on different characteristics, supported by a novel confidence measurement tailored to these operators. This measurement enables the network to prioritize reliable features and accounts for noise. We argue that our design enhances transparency and robustness, leading to improved performance, particularly in low-data regimes. Experimental results in traffic sign detection highlight the effectiveness of the proposed method, especially in semi-supervised and unsupervised scenarios, underscoring its potential for data-constrained applications.

Paper Structure

This paper contains 31 sections, 34 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: The proposed hybrid architecture decomposes the input image $x$ using task-specific quasi-invariances $T_{i}(x)$, with computed uncertainties $conf(T_{i}(x))$. We experimented with different neural architectures that propagate the confidences through the initial layers via normalized convolution. The VAE-based architecture visualized here encodes the transformed signal into a joint latent space $z$ (with mean $\mu$ and standard deviation $\sigma$), which is then used for downstream tasks such as classification $y$.
  • Figure 2: Example of our method applied to the GTSRB dataset using the rg and LBP operators. The figure shows key intermediate steps, including operator outputs and confidence computation via Mahalanobis distance, calculated per pixel and, for visualization, also aggregated into a histogram. Both the transformed image and its confidence are propagated through the neural network for a downstream task. Confidence is shown on a scale from blue (low) to yellow (high).
  • Figure 3: Visualization of $H_0$ for a subset of signs: White pixels indicate regions classified under the null hypothesis (non-colored, homogeneous), while black pixels fall outside it. In the boundary area of the sign, half of the pixels are approximated as belonging to the null hypothesis.
  • Figure 4: Supervised classification results in the limited-sample setting on the GTSRB and clustered GTSRB datasets.
  • Figure 5: Classes of the clustered GTSRB dataset.
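
The caption of Figure 1 mentions propagating confidences through the initial layers via normalized convolution. A minimal sketch of that operation in plain Python follows: each output is a confidence-weighted average of the input, so low-confidence pixels contribute little. The function name, the "valid" border handling, the non-negative kernel, and the rule for the propagated output confidence are assumptions for illustration, not the authors' implementation.

```python
def normalized_conv2d(signal, conf, kernel, eps=1e-8):
    """Confidence-weighted 2-D filtering over nested lists ("valid" mode).

    signal, conf: equally sized 2-D lists; conf values lie in [0, 1].
    kernel: 2-D list of non-negative weights (assumption).
    Returns (filtered signal, propagated confidence).
    """
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(signal), len(signal[0])
    kernel_sum = sum(sum(row) for row in kernel)
    out, out_conf = [], []
    for i in range(h - kh + 1):
        out_row, conf_row = [], []
        for j in range(w - kw + 1):
            num = 0.0  # kernel * confidence * signal
            den = 0.0  # kernel * confidence
            for u in range(kh):
                for v in range(kw):
                    weight = kernel[u][v] * conf[i + u][j + v]
                    num += weight * signal[i + u][j + v]
                    den += weight
            # Where no confident input exists, fall back to zero.
            out_row.append(num / den if den > eps else 0.0)
            # Assumed propagation rule: fraction of kernel mass that
            # landed on confident inputs.
            conf_row.append(den / kernel_sum)
        out.append(out_row)
        out_conf.append(conf_row)
    return out, out_conf

# Usage: a 3x3 patch whose corrupted center pixel has zero confidence.
patch = [[2.0, 2.0, 2.0],
         [2.0, 100.0, 2.0],
         [2.0, 2.0, 2.0]]
patch_conf = [[1.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]]
box = [[1.0] * 3 for _ in range(3)]
filtered, filtered_conf = normalized_conv2d(patch, patch_conf, box)
```

In this example the outlier at the center is ignored because its confidence is zero, so the filtered value equals the average of the eight trusted neighbors, and the propagated confidence drops slightly below one to reflect the missing input.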