Rethinking Disentanglement under Dependent Factors of Variation

Antonio Almudévar; Alfonso Ortega

TL;DR

This paper gives an information-theoretic definition of disentanglement that remains valid when the factors of variation are not independent, and proposes a method, derived from that definition, for measuring the degree of disentanglement in that setting.

Abstract

Representation learning is an approach that makes it possible to discover and extract the factors of variation from data. Intuitively, a representation is said to be disentangled if it separates the different factors of variation in a way that is understandable to humans. Definitions of disentanglement and metrics to measure it usually assume that the factors of variation are independent of each other. However, this assumption generally does not hold in the real world, which limits the use of these definitions and metrics to very specific and unrealistic scenarios. In this paper we give a definition of disentanglement based on information theory that is also valid when the factors of variation are not independent. Furthermore, we relate this definition to the Information Bottleneck Method. Finally, we propose a method to measure the degree of disentanglement from the given definition that works when the factors of variation are not independent. We show through different experiments that the proposed method correctly measures disentanglement with non-independent factors of variation, while other methods fail in this scenario.

Paper Structure (57 sections, 4 theorems, 11 equations, 16 figures, 12 tables)

Introduction
Related Work
Definition of Disentanglement
Metrics for Disentanglement
Disentanglement under Dependent Factors of Variation
Information Bottleneck in Representation Learning
Desirable Properties of a Disentangled Representation
Notation.
Problem Setup.
Desirable Properties
Measuring Disentanglement via Minimality and Sufficiency
Connection between Disentanglement and Minimality and Sufficiency
Why Measuring Disentanglement via Minimality and Sufficiency?
Defining metrics for Minimality and Sufficiency
Minimality
...and 42 more sections

Key Result

Theorem 1

Let $Y=\{Y_k\}_{k=1}^n$ denote a set of factors and $Z=\{Z_j\}_{j=1}^m$ a representation. If $Z_j$ is a minimal representation of $Y_i$, it follows that $Z_j$ is factors-invariant with respect to $Y_i$ and nuisances-invariant. Equivalently, we have that:

Figures (16)

Figure 1: If the factors are dependent, then—even with a perfectly disentangled encoder—each sub-representation will contain information about other factors. For example, if $z_1$ encodes the shape of a banana, the color is most likely yellow, possibly green if unripe, and rarely red. Likewise, if $z_2$ encodes the color yellow, the shape is most likely that of a banana or a lemon, but very unlikely that of a strawberry.
Figure 2: Comparison of different metrics (y-axis) across varying values of $\alpha$ (x-axis) and $\delta$ (color). Colors range from dark blue (highest dependence, $\delta = \frac{1}{n}$) to dark red (lowest dependence, $\delta = 1$).
Figure 3: Comparison of different metrics (y-axis) for different values of $\beta$ (x-axis).
Figure 4: Time (in seconds) to compute the different metrics (y-axis) for different numbers of factors of variation (x-axis).
Figure 5: Rank correlation between disentanglement metrics and accuracy of a random forest classifier trained on 1000 samples across different levels of factor dependence.
...and 11 more figures
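
The situation described in the Figure 1 caption can be checked numerically on a toy example. The sketch below is illustrative only and is not the metric proposed in the paper: the joint distribution over (shape, color), the helper names `mutual_information` and `conditional_mutual_information`, and the choice of an encoder that simply copies the shape into $z_1$ are all assumptions made for this example. It shows that when the factors are dependent, $z_1$ carries information about color even though the encoder is perfectly disentangled, whereas the information about color that remains once shape is known is zero.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """I(A;B) in nats, from a dict mapping (a, b) to a probability."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(
        p * math.log(p / (pa[a] * pb[b]))
        for (a, b), p in joint.items()
        if p > 0
    )


def conditional_mutual_information(joint):
    """I(A;B|C) in nats, from a dict mapping (a, b, c) to a probability."""
    pc, pac, pbc = defaultdict(float), defaultdict(float), defaultdict(float)
    for (a, b, c), p in joint.items():
        pc[c] += p
        pac[(a, c)] += p
        pbc[(b, c)] += p
    return sum(
        p * math.log(p * pc[c] / (pac[(a, c)] * pbc[(b, c)]))
        for (a, b, c), p in joint.items()
        if p > 0
    )


# Hypothetical joint distribution over two dependent factors (shape, color).
p_factors = {
    ("banana", "yellow"): 0.40,
    ("banana", "green"): 0.08,
    ("banana", "red"): 0.02,
    ("strawberry", "yellow"): 0.02,
    ("strawberry", "green"): 0.08,
    ("strawberry", "red"): 0.40,
}

# Assumed "perfectly disentangled" encoder: z1 is a copy of the shape factor.
joint_z1_color = defaultdict(float)
joint_z1_color_shape = defaultdict(float)
for (shape, color), p in p_factors.items():
    z1 = shape
    joint_z1_color[(z1, color)] += p
    joint_z1_color_shape[(z1, color, shape)] += p

mi = mutual_information(joint_z1_color)                       # I(z1; color)
cmi = conditional_mutual_information(joint_z1_color_shape)    # I(z1; color | shape)

print("I(z1; color)         =", mi)
print("I(z1; color | shape) =", cmi)
```

In this toy setting the unconditional quantity is strictly positive only because shape and color are dependent, so a criterion that requires it to be zero would penalize an encoder that is doing exactly what is wanted. Conditioning on the target factor removes that effect.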

Theorems & Definitions (8)

Definition 1
Definition 2
Theorem 1
Theorem 2
Theorem 1
proof
Theorem 2
proof
