Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery

Anand Gopalakrishnan, Aleksandar Stanić, Jürgen Schmidhuber, Michael Curtis Mozer

TL;DR

A fully convolutional autoencoder that performs iterative constraint satisfaction: at each iteration, a hidden layer bottleneck encodes statistically regular configurations of features in particular phase relationships; over iterations, local constraints propagate and the model converges to a globally consistent configuration of phase assignments.

Abstract

Current state-of-the-art synchrony-based models encode object bindings with complex-valued activations and compute with real-valued weights in feedforward architectures. We argue for the computational advantages of a recurrent architecture with complex-valued weights. We propose a fully convolutional autoencoder, SynCx, that performs iterative constraint satisfaction: at each iteration, a hidden layer bottleneck encodes statistically regular configurations of features in particular phase relationships; over iterations, local constraints propagate and the model converges to a globally consistent configuration of phase assignments. Binding is achieved simply by the matrix-vector product operation between complex-valued weights and activations, without the need for additional mechanisms that have been incorporated into current synchrony-based models. SynCx outperforms or is strongly competitive with current models for unsupervised object discovery. SynCx also avoids certain systematic grouping errors of current models, such as the inability to separate similarly colored objects without additional supervision.

Paper Structure

This paper contains 32 sections, 5 equations, 10 figures, 11 tables.

Figures (10)

  • Figure 1: Local feature configurations are insufficient to determine whether features belong to the same object: the highlighted horizontal and vertical configurations sometimes belong to the same object (green) and at other times to different objects (red).
  • Figure 2: SynCx is a fully convolutional autoencoder that iteratively processes an input image. It starts with a randomly initialized phase $\bm{\phi}_x^1$ and the input image $\bm{\mu}_x$ as the magnitude component, and updates the phases in a stateful manner: the output phase at iteration 1 is fed as the input phase at iteration 2 ($\bm{\phi}_x^2 \leftarrow \bm{\phi}_z^1$), and so on. The magnitude component at the input is always clamped to the input image $\bm{\mu}_x$. SynCx is trained to reconstruct $\bm{\mu}_x$ from the output magnitude component $\bm{\mu}_z^n$ at every iteration.
  • Figure 3: Evolution of phase maps in radial and heatmap form (colors matched) across iterations in SynCx for two inputs from Tetrominoes (row 1) and dSprites (row 2).
  • Figure 4: Comparison between RF (Rotating Features) and SynCx grouping on Tetrominoes, dSprites and CLEVR. RF tends to systematically group similarly colored objects together, while SynCx is more adept at separating them, for example the blue Tetris blocks (left), the green square and heart (middle), and the yellow cylinders (right).
  • Figure 5: Reconstruction, object masks, radial phase plot and phase heatmaps (colors matched between columns 5 & 6) for SynCx without the bottleneck (row 1) and the full SynCx model (row 2).
  • ...and 5 more figures
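
The iteration described in the Figure 2 caption can be sketched in plain Python. This is only a toy illustration under several assumptions that are not taken from the paper: dense layers stand in for the convolutions, the weights are random and untrained, no activation function or normalization is applied, and the names `complex_matvec`, `random_complex_matrix` and `run_iterations` are hypothetical. What it does show is the control flow the caption states: a random initial phase, the input magnitude clamped at every iteration, a complex-weighted encoder and decoder, and the output phase fed back as the next input phase.

```python
import cmath
import math
import random


def complex_matvec(weights, vec):
    """Complex matrix-vector product: out[i] = sum_j weights[i][j] * vec[j]."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def random_complex_matrix(rows, cols, rng):
    """Hypothetical initializer: small random complex weights (untrained)."""
    scale = 1.0 / math.sqrt(cols)
    return [
        [complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale)) for _ in range(cols)]
        for _ in range(rows)
    ]


def run_iterations(magnitudes, w_enc, w_dec, n_iters, rng):
    """Run the stateful phase-update loop on a flattened toy input."""
    # Iteration 1 starts from a randomly initialized phase.
    phases = [rng.uniform(-math.pi, math.pi) for _ in magnitudes]
    history = []
    for _ in range(n_iters):
        # The input magnitude is always clamped to the input image.
        x = [cmath.rect(m, p) for m, p in zip(magnitudes, phases)]
        h = complex_matvec(w_enc, x)  # bottleneck (3 units in the toy below)
        z = complex_matvec(w_dec, h)  # complex-valued output
        out_mag = [abs(v) for v in z]  # training would push this toward `magnitudes`
        phases = [cmath.phase(v) for v in z]  # output phase becomes next input phase
        history.append((out_mag, phases))
    return history


rng = random.Random(0)
image = [0.2, 0.9, 0.9, 0.1, 0.7, 0.7]  # toy flattened "image" (assumed values)
w_enc = random_complex_matrix(3, 6, rng)
w_dec = random_complex_matrix(6, 3, rng)
history = run_iterations(image, w_enc, w_dec, n_iters=4, rng=rng)
```

In a trained model, the reconstruction loss on the output magnitude at every iteration would shape the weights; with the random weights used here the phases carry no grouping information, so the sketch should be read only as a description of the data flow.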