Decomposer Networks: Deep Component Analysis and Synthesis

Mohsen Joneidi

TL;DR

Decomposer Networks extend SVD-style factorization to nonlinear, semantic components by using a semantic autoencoder with N parallel branches that each model a residual input while competing through an all-but-one residual update. The model introduces per-sample scaling coefficients $\boldsymbol{\sigma}$ and a residual coordinate descent training scheme, including both NNLS-based synthesis weights and backpropagation through residuals. Empirically, linear, rank-1 instantiations recover PCA-like directions, while deeper or spatially masked variants yield interpretable, localized components and enable controllable synthesis by adjusting $\boldsymbol{\sigma}$. This framework unifies analysis and synthesis, enabling zero-shot semantic editing and interpretable generation without reliance on attention masks or implicit disentanglement alone. The approach provides a nonlinear, explainable generalization of SVD and offers practical avenues for structured decomposition and controllable generation in vision and signal domains.

Abstract

We propose Decomposer Networks (DecompNet), a semantic autoencoder that factorizes an input into multiple interpretable components. Unlike classical autoencoders, which compress an input into a single latent representation, a Decomposer Network maintains N parallel branches, each assigned a residual input defined as the original signal minus the reconstructions of all other branches. By unrolling a Gauss-Seidel-style block-coordinate descent into a differentiable network, DecompNet enforces explicit competition among components, yielding parsimonious, semantically meaningful representations. We situate our model relative to linear decomposition methods (PCA, NMF), deep unrolled optimization, and object-centric architectures (MONet, IODINE, Slot Attention), and highlight its novelty as the first semantic autoencoder to implement an all-but-one residual update rule.
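The all-but-one residual update can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the simplest possible branch (a fixed unit-norm direction, so each branch output is rank-1) and it does not train anything. The function name `residual_sweeps` and the variables `W`, `parts`, and `n_sweeps` are hypothetical. Under these assumptions the sweeps reduce to Gauss-Seidel coordinate descent on a least-squares problem, which the last lines check against `np.linalg.lstsq`.

```python
import numpy as np

def residual_sweeps(x, W, n_sweeps=200):
    """Gauss-Seidel style all-but-one residual updates (illustrative only).

    x: (d,) input signal.
    W: (d, N) matrix whose unit-norm columns play the role of N linear branches.
    Returns the gains sigma (N,) and the scaled branch outputs parts (d, N).
    """
    d, n_branches = W.shape
    sigma = np.zeros(n_branches)
    parts = np.zeros((d, n_branches))  # parts[:, i] holds sigma_i * x_hat_i
    for _ in range(n_sweeps):
        for i in range(n_branches):
            # Residual input of branch i: x minus every OTHER branch's output.
            residual = x - (parts.sum(axis=1) - parts[:, i])
            # Assumed branch model: project the residual onto direction i.
            sigma[i] = W[:, i] @ residual
            parts[:, i] = sigma[i] * W[:, i]
    return sigma, parts

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))
W /= np.linalg.norm(W, axis=0)           # unit-norm branch directions
x = rng.normal(size=8)

sigma, parts = residual_sweeps(x, W)
sigma_ls = np.linalg.lstsq(W, x, rcond=None)[0]
print(np.allclose(sigma, sigma_ls, atol=1e-6))
```

Because each branch only sees what the others fail to explain, the branches compete for the signal. With nonlinear subnetworks the same loop structure applies, but the closed-form projection would be replaced by a forward pass through each branch.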

Paper Structure

This paper contains 33 sections, 18 equations, 4 figures.

Figures (4)

  • Figure 1: Decomposer Networks (3 components). Each residual summer adds $x$ and subtracts the other branches' scaled reconstructions ($-\sigma_j\hat{x}_j$). Each color denotes one component, and the colored arrows carry that component's data; the gains $\sigma_i$ feed both the final sum and the residual feedback. Each SubNet can be as simple as a rank-1 multiplication or as deep as a multilayer autoencoder.
  • Figure 2: Experiment 1: Rank-1 linear subnetworks converge to PCA-like components on the AT&T dataset. The learned components resemble the top singular vectors of the data matrix.
  • Figure 3: Experiment 2: CNN-based subnetworks without spatial constraints. Each component contributes differently to the reconstruction but all retain global image structure.
  • Figure 4: Experiment 3: Decomposer Networks with fixed Gaussian spatial masks. Each component captures a semantically meaningful subregion of the input image.
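
Read literally, the residual summer and final sum described in the Figure 1 caption correspond to the following relations. The symbols $r_i$ (residual input of branch $i$) and $\hat{x}$ (overall reconstruction) are notation assumed here for illustration; $x$, $\sigma_i$, and $\hat{x}_i$ follow the caption.

$$
r_i = x - \sum_{j \neq i} \sigma_j \hat{x}_j, \qquad \hat{x} = \sum_{i=1}^{N} \sigma_i \hat{x}_i
$$

For the three-component case in Figure 1, the first branch would receive $r_1 = x - \sigma_2\hat{x}_2 - \sigma_3\hat{x}_3$.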