State Representation Learning Using an Unbalanced Atlas
Li Meng, Morten Goodwin, Anis Yazidi, Paal Engelstad
TL;DR
This work addresses state representation learning (SRL) in self-supervised learning (SSL) by using a manifold-based representation learned via an unbalanced atlas (UA). It introduces DIM-UA, which adapts the ST-DIM framework to use dilated prediction targets and a UA loss based on maximum mean discrepancy (MMD) to encourage informative chart usage. Across AtariARI and CIFAR10, DIM-UA outperforms ST-DIM and MSimCLR when the encoding dimension is large: with 16384 hidden units, its mean F1 score on AtariARI is ~75% compared to ~70% for ST-DIM. Training is stable, and performance improves as more heads are added. The results indicate that an unbalanced atlas enables richer, more scalable manifold representations, offering a practical path to improved SRL in SSL pipelines.
Abstract
The manifold hypothesis posits that high-dimensional data often lies on a lower-dimensional manifold and that using this manifold as the target space yields more efficient representations. While numerous traditional manifold-based techniques exist for dimensionality reduction, their application in self-supervised learning has progressed slowly. The recent MSimCLR method combines manifold encoding with SimCLR but requires extremely low target encoding dimensions to outperform SimCLR, limiting its applicability. This paper introduces a novel learning paradigm using an unbalanced atlas (UA), capable of surpassing state-of-the-art self-supervised learning approaches. We investigate and engineer the DeepInfomax with an unbalanced atlas (DIM-UA) method by adapting the Spatiotemporal DeepInfomax (ST-DIM) framework to align with our proposed UA paradigm. The efficacy of DIM-UA is demonstrated through training and evaluation on the Atari Annotated RAM Interface (AtariARI) benchmark, a modified version of the Atari 2600 framework that produces annotated image samples for representation learning. The UA paradigm improves existing algorithms significantly as the number of target encoding dimensions grows. For instance, with 16384 hidden units, the mean F1 score of DIM-UA averaged over categories is ~75%, compared to ~70% for ST-DIM.
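To make the reported metric concrete, here is a minimal sketch of how a "mean F1 score averaged over categories" can be computed: probe F1 scores are first averaged within each category, and the category means are then averaged. The two-level averaging order is an assumption about the evaluation protocol. The category names and scores below are hypothetical placeholders, not results from the paper.

```python
from statistics import mean


def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_f1_over_categories(scores_by_category):
    """Average F1 within each category, then average the category means.

    Assumption: each category contributes equally to the final score,
    regardless of how many probed variables it contains.
    """
    category_means = {
        name: mean(scores) for name, scores in scores_by_category.items()
    }
    overall = mean(list(category_means.values()))
    return overall, category_means


# Hypothetical per-variable probe F1 scores, grouped by category.
example_scores = {
    "category_a": [0.5, 0.75],
    "category_b": [1.0],
}
overall, per_category = mean_f1_over_categories(example_scores)
print(per_category, overall)
```

Averaging per category first keeps a category with many probed variables from dominating the overall score; averaging all variables in a single pass would weight such a category more heavily.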
