Multi-Task Learning with Additive U-Net for Image Denoising and Classification

Vikram Lakkavalli, Neelam Sinha

TL;DR

The paper addresses how concatenative skip connections in U-Net can inflate representational capacity and complicate joint optimization for denoising and recognition. It proposes AddUNet, which uses additive skip fusion with fixed dimensionality and learnable nonnegative gates to regulate encoder–decoder information flow, optimized via a joint denoising/classification objective $\mathcal{L} = \mathcal{L}_{\mathrm{den}} + \lambda \mathcal{L}_{\mathrm{cls}}$. Key findings include improved training stability and generalization in denoising-centric MTL, competitive denoising performance under constrained capacity, and a task-aware redistribution of skip weights that decouples reconstruction from discrimination. The learned gates also offer an inference-time mechanism to modulate fusion. Experiments against baselines, together with frequency analyses, support AddUNet as a principled architectural regularizer for stable, scalable multi-task learning without increasing parameter count.
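The joint objective above can be illustrated with a minimal sketch. The summary does not specify the individual loss terms, so mean squared error for $\mathcal{L}_{\mathrm{den}}$, softmax cross-entropy for $\mathcal{L}_{\mathrm{cls}}$, the default `lam=0.1`, and all function names here are assumptions made for illustration only.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length sequences of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably via log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[label]

def joint_loss(denoised, clean, logits, label, lam=0.1):
    """L = L_den + lam * L_cls. MSE, cross-entropy, and lam=0.1 are assumed choices."""
    return mse(denoised, clean) + lam * cross_entropy(logits, label)

# Toy usage: a 4-pixel "image" and a 3-class prediction.
loss = joint_loss([0.1, 0.4, 0.5, 0.9], [0.0, 0.5, 0.5, 1.0], [2.0, 0.5, -1.0], label=0)
print(loss)
```

Setting `lam=0` recovers single-task denoising, while larger values shift the optimization toward classification.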

Abstract

We investigate additive skip fusion in U-Net architectures for image denoising and denoising-centric multi-task learning (MTL). By replacing concatenative skips with gated additive fusion, the proposed Additive U-Net (AddUNet) constrains shortcut capacity while preserving fixed feature dimensionality across depth. This structural regularization induces controlled encoder-decoder information flow and stabilizes joint optimization. Across single-task denoising and joint denoising-classification settings, AddUNet achieves competitive reconstruction performance with improved training stability. In MTL, learned skip weights exhibit systematic task-aware redistribution: shallow skips favor reconstruction, while deeper features support discrimination. Notably, reconstruction remains robust even under limited classification capacity, indicating implicit task decoupling through additive fusion. These findings show that simple constraints on skip connections act as an effective architectural regularizer for stable and scalable multi-task learning without increasing model complexity.
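To make the contrast between concatenative and gated additive skips concrete, the following is a minimal NumPy sketch rather than the authors' implementation. It assumes a single scalar gate per skip and uses a softplus to keep that gate nonnegative; the softplus parameterization, the tensor shapes, and the function names are all illustrative assumptions.

```python
import numpy as np

def softplus(x):
    """Map an unconstrained scalar to a nonnegative gate value, log(1 + exp(x))."""
    return np.logaddexp(0.0, x)

def concat_skip(decoder_feat, encoder_feat):
    """Concatenative skip: the channel count grows to C_dec + C_enc."""
    return np.concatenate([decoder_feat, encoder_feat], axis=0)

def additive_skip(decoder_feat, encoder_feat, raw_gate):
    """Gated additive skip: the channel count stays fixed at C."""
    alpha = softplus(raw_gate)  # assumed parameterization that keeps alpha >= 0
    return decoder_feat + alpha * encoder_feat

rng = np.random.default_rng(0)
dec = rng.standard_normal((16, 32, 32))  # (channels, height, width)
enc = rng.standard_normal((16, 32, 32))

print(concat_skip(dec, enc).shape)         # (32, 32, 32)
print(additive_skip(dec, enc, 0.0).shape)  # (16, 32, 32)
```

Because the additive variant never widens the feature map, the layers that follow the fusion need no extra input channels, which is the sense in which the shortcut capacity is constrained.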

Paper Structure (20 sections, 3 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Additive U-Net with scalar-gated additive skip fusion. Encoder and decoder operate in a shared feature space with fixed dimensionality across depth.
  • Figure 2: Visual comparison of denoising methods on a Kodak-17 grayscale image with Gaussian noise ($\sigma=15$). The zoomed region highlights that the proposed Real-AddUNet preserves fine structural details more faithfully than AE, DnCNN, and Pseudo-AddUNet, while effectively suppressing noise.
  • Figure 3: Comparison at $\sigma=50$. Pseudo-AddUNet achieves higher PSNR (22.11 dB), whereas Real-AddUNet yields higher SSIM (0.852 vs. 0.820), suggesting better structural preservation despite lower PSNR.
  • Figure 4: Denoising on a structured high-frequency checker pattern at $\sigma = 50$. Pseudo-AddUNet achieves higher PSNR (22.11 dB) than Real-AddUNet (20.83 dB), indicating lower pixel-wise error, whereas Real-AddUNet achieves higher SSIM (0.852 vs. 0.820), reflecting improved structural consistency. Pseudo-AddUNet favors intensity-level fidelity, while the constrained residual in Real-AddUNet promotes more uniform spatial reconstruction.
  • Figure 5: Effect of skip-gate modulation at inference. We vary the scalar gate $\alpha_3$ in the range $[0,1]$ for a fixed input image while keeping all network parameters fixed. Using the model with a 9-7-5-3-1 kernel schedule, PSNR and SSIM vary smoothly and peak near the value learned during training. Here, $\alpha_3$ controls the additive contribution of the third-level skip connection, providing a direct post-training control knob over skip fusion.
  • ...and 2 more figures
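
The inference-time gate sweep described in the Figure 5 caption can be sketched as follows. This is a toy illustration under stated assumptions, not a reproduction of the figure: `toy_forward` is a hypothetical stand-in for a trained network whose third-level skip is scaled by a scalar gate, and the synthetic target is constructed so that a gate near 0.6 is best. Only the PSNR formula is standard.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def toy_forward(decoder_feat, skip_feat, alpha):
    """Hypothetical stand-in for a frozen network with one skip scaled by alpha."""
    return decoder_feat + alpha * skip_feat

rng = np.random.default_rng(0)
decoder_feat = rng.uniform(0.2, 0.6, size=(64, 64))
skip_feat = rng.uniform(0.0, 0.5, size=(64, 64))
# Synthetic target, built so that a gate near 0.6 gives the best reconstruction.
clean = decoder_feat + 0.6 * skip_feat + 0.01 * rng.standard_normal((64, 64))

# Sweep the gate over [0, 1] with all other quantities held fixed.
alphas = np.linspace(0.0, 1.0, 11)
scores = [psnr(clean, toy_forward(decoder_feat, skip_feat, a)) for a in alphas]
best_alpha = alphas[int(np.argmax(scores))]
print(f"best gate on the grid: {best_alpha:.1f}")
```

In a real model the same loop would overwrite the stored gate value before each forward pass and leave every other parameter untouched, which is what makes the gate usable as a post-training control.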