Learning Conditionally Independent Transformations using Normal Subgroups in Group Theory

Kayato Nishitsunoi, Yoshiyuki Ohmura, Takayuki Komatsu, Yasuo Kuniyoshi

TL;DR

This work addresses unsupervised disentanglement of conditionally independent, noncommutative transformations by grounding representation learning in Galois-inspired normal-subgroup decompositions. It introduces a homomorphism-based formulation where a normal subgroup $N$ corresponds to one transformation component and the quotient captures the residual, enabling conditional independence beyond commutativity. A NeuralODE-based coordinate transformation model learns two transformations $g$ and $v$, enforces a homomorphism constraint, and uses reconstruction, self-supervision, and isometry losses to discover rotation and translation in image sequences without labels. Empirical results on geometric transformations demonstrate successful separation and consistent identification of rotation and translation, suggesting a principled extension of representation learning to structured transformation decompositions with potential applicability to broader group-theoretic learning paradigms.

Abstract

Humans develop certain cognitive abilities to recognize objects and their transformations without explicit supervision, highlighting the importance of unsupervised representation learning. A fundamental challenge in unsupervised representation learning is to separate different transformations in learned feature representations. Although algebraic approaches have been explored, a comprehensive theoretical framework remains underdeveloped. Existing methods decompose transformations based on algebraic independence, but these methods primarily focus on commutative transformations and do not extend to cases where transformations are conditionally independent but noncommutative. To extend current representation learning frameworks, we draw inspiration from Galois theory, where the decomposition of groups through normal subgroups provides an approach for the analysis of structured transformations. Normal subgroups naturally extend commutativity under certain conditions and offer a foundation for the categorization of transformations, even when they do not commute. In this paper, we propose a novel approach that leverages normal subgroups to enable the separation of conditionally independent transformations, even in the absence of commutativity. Through experiments on geometric transformations in images, we show that our method successfully categorizes conditionally independent transformations, such as rotation and translation, in an unsupervised manner, suggesting a close link between group decomposition via normal subgroups and transformation categorization in representation learning.
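The algebraic fact behind the abstract's rotation and translation example can be checked numerically. The following is a minimal sketch, not the paper's model: it assumes planar rigid motions written as 3x3 homogeneous matrices, and the helper names (`rotation`, `translation`, `is_pure_translation`, `rotation_part`) are hypothetical. It illustrates that a rotation and a translation do not commute, yet conjugating a translation by a rotation yields another translation, which is the defining property of a normal subgroup. It also shows that discarding the translation part is a homomorphism onto rotations, corresponding to the quotient.

```python
import numpy as np

def rotation(theta):
    """Homogeneous 3x3 matrix for a planar rotation about the origin (illustrative helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def translation(tx, ty):
    """Homogeneous 3x3 matrix for a planar translation (illustrative helper)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def is_pure_translation(m, tol=1e-9):
    """True if the linear part of m is the identity, i.e. m only shifts points."""
    return bool(np.allclose(m[:2, :2], np.eye(2), atol=tol)
                and np.allclose(m[2], [0.0, 0.0, 1.0], atol=tol))

def rotation_part(m):
    """Map a rigid motion to its 2x2 linear part; its kernel is the set of translations."""
    return m[:2, :2]

R = rotation(np.pi / 3)
T = translation(2.0, -1.0)

# Rotation and translation do not commute in general.
commute = bool(np.allclose(R @ T, T @ R))

# Conjugating a translation by a rotation stays inside the translations:
# R T R^{-1} shifts points by the rotated offset R t.
conjugate = R @ T @ np.linalg.inv(R)
conjugate_is_translation = is_pure_translation(conjugate)
expected_shift = R[:2, :2] @ np.array([2.0, -1.0])

# Dropping the translation part respects composition (a homomorphism).
A, B = R @ T, T @ rotation(-0.4)
homomorphism_holds = bool(np.allclose(rotation_part(A @ B),
                                      rotation_part(A) @ rotation_part(B)))

print(commute, conjugate_is_translation, homomorphism_holds)
```

In this sketch the translations play the role of the normal subgroup and the rotations the role of the quotient. Which of the learned transformations $g$ and $v$ takes which role in the paper's model is not specified in this summary.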

Paper Structure

This paper contains 17 sections, 28 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Overview of the proposed model. (a) Encoder architecture and reconstruction learning. CNNs output a feature vector $\bm{z}_{i}$ for each frame $I_{i}$, and the transformation parameters $\lambda, \bm{c}$ are computed by an LSTM followed by a linear layer. The input sequence is then reconstructed by building the geometric transformations $g$ and $v$ from the transformation parameters and the two ODEs. (b) Self-supervised learning. Elements of the computed transformation parameters $\lambda$ and $\bm{c}$ are set to 0 to generate a new sequence in which either $g$ or $v$ is the identity transformation; from this sequence, the encoder learns to predict the transformation parameters that the model itself generated.
  • Figure 2: Two transformation fields learned from the Sequence 1 dataset. (a) Transformation fields computed by setting $(\lambda, \bm{c})=(2, \bm{0})$ for $g$ and $(\lambda, \bm{c})=(-3, \bm{0})$ for $v$, based on the learned ODE parameters for each transformation. (b), (c), (d) Transformations learned for each given sequence and the reconstructed sequences generated by applying these transformations. The panels labeled $g$ and $v$ show the transformations $g(\lambda^g_{0,1}, \bm{c}^g_{0,1})$ and $v(\lambda^v_{0,1}, \bm{c}^v_{0,1})$, respectively. Only the first four of the seven frames are displayed for both the given and the reconstructed sequences.
  • Figure 7: Two transformation fields learned from the Sequence 2 dataset. (a) Transformation fields computed by setting $(\lambda, \bm{c})=(2, \bm{0})$ for $g$ and $(\lambda, \bm{c})=(4, \bm{0})$ for $v$, based on the learned ODE parameters for each transformation. (b), (c), (d) Transformations learned for each given sequence and the reconstructed sequences generated by applying these transformations. The panels labeled $g$ and $v$ show the transformations $g(\lambda^g_{0,1}, \bm{c}^g_{0,1})$ and $v(\lambda^v_{0,1}, \bm{c}^v_{0,1})$, respectively. Only the first four of the seven frames are displayed for both the given and the reconstructed sequences.