Nonlinear ICA Using Auxiliary Variables and Generalized Contrastive Learning

Aapo Hyvarinen; Hiroaki Sasaki; Richard E. Turner

TL;DR

This work introduces a general framework for nonlinear ICA in which an observed auxiliary variable modulates the distribution of the latent sources, yielding identifiability that is not available for i.i.d. data without such a variable. It unifies and extends earlier identifiability results based on temporal structure, namely time-contrastive learning (TCL) and permutation-contrastive learning (PCL), by allowing a wide range of auxiliary variables, including the time index, the history of the time series, and class labels. It also provides a practical, provably consistent learning algorithm based on contrastive logistic regression. The theoretical contributions center on the notion of conditional exponentiality and on two main identifiability theorems, which differ according to the order of the exponential family. Simulations show performance comparable to TCL without segmenting the data, and show that the framework can incorporate nonstationarity, temporal dependencies, and supervised signals. The paper thus offers a principled way to recover latent nonlinear components from auxiliary information, applicable in both self-supervised and supervised settings.

Abstract

Nonlinear ICA is a fundamental problem for unsupervised representation learning, emphasizing the capacity to recover the underlying latent variables generating the data (i.e., identifiability). Recently, the very first identifiability proofs for nonlinear ICA have been proposed, leveraging the temporal structure of the independent components. Here, we propose a general framework for nonlinear ICA, which, as a special case, can make use of temporal structure. It is based on augmenting the data by an auxiliary variable, such as the time index, the history of the time series, or any other available information. We propose to learn nonlinear ICA by discriminating between true augmented data and data in which the auxiliary variable has been randomized. This enables the framework to be implemented algorithmically through logistic regression, possibly in a neural network. We provide a comprehensive proof of the identifiability of the model as well as the consistency of our estimation method. The approach not only provides a general theoretical framework combining and generalizing previously proposed nonlinear ICA models and algorithms, but also brings practical advantages.
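
The discrimination task described in the abstract can be sketched as a small data-preparation step. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name make_contrastive_dataset, the choice to randomize the auxiliary variable by shuffling the observed values, and the toy inputs are all hypothetical. Each observation x is paired once with its true auxiliary value u (label 1) and once with a randomized value (label 0); a logistic-regression classifier, possibly a neural network, would then be trained on the resulting pairs.

```python
import random


def make_contrastive_dataset(xs, us, seed=0):
    """Build a binary classification set from observations and auxiliary values.

    xs: list of observations; us: list of auxiliary values of the same length.
    Returns (pairs, labels): label 1 marks a true (x, u) pair, label 0 marks
    a pair whose auxiliary value was randomized.

    Assumption: randomization is done by shuffling `us`, which keeps the
    marginal distribution of u but breaks its dependence on x.
    """
    if len(xs) != len(us):
        raise ValueError("xs and us must have the same length")
    rng = random.Random(seed)
    shuffled = list(us)
    rng.shuffle(shuffled)

    true_pairs = list(zip(xs, us))
    randomized_pairs = list(zip(xs, shuffled))
    pairs = true_pairs + randomized_pairs
    labels = [1] * len(true_pairs) + [0] * len(randomized_pairs)
    return pairs, labels


# Toy usage: four observations, with a hypothetical segment index as u.
observations = [(0.1, 0.2), (0.3, 0.1), (1.5, 1.7), (1.4, 1.9)]
auxiliary = [0, 0, 1, 1]
pairs, labels = make_contrastive_dataset(observations, auxiliary)
```

Shuffling is only one way to randomize the auxiliary variable; any scheme that draws it independently of the observation would serve the same purpose in this sketch.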

