On the Identifiability of Latent Action Policies

Sébastien Lachapelle

TL;DR

The paper analyzes the identifiability of latent action representations learned from video data via latent action policy learning (LAPO). It introduces a formal data-generating process, three desiderata for the inverse dynamics mapping, and an entropy-regularized objective that promotes determinism while still permitting gradient-based optimization. Under continuity, injectivity, and connectivity conditions on the data-generating process, the paper proves that the optimal latent-action labeling is deterministic and injective, which makes the latent actions identifiable with respect to the true actions. The results help explain why discrete latent action representations are effective and clarify the structural assumptions LAPO needs to recover meaningful action factors.

Abstract

We study the identifiability of latent action policy learning (LAPO), a framework introduced recently to discover representations of actions from video data. We formally describe desiderata for such representations, their statistical benefits and potential sources of unidentifiability. Finally, we prove that an entropy-regularized LAPO objective identifies action representations satisfying our desiderata, under suitable conditions. Our analysis provides an explanation for why discrete action representations perform well in practice.
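
To make the idea of an entropy-regularized objective concrete, the following is a minimal Python sketch, not the paper's formulation: the exact objective is not reproduced in this summary. It assumes a discrete latent action set of size `k_hat`, a squared-error prediction term, and a regularization weight `beta`. The names `entropy`, `regularized_loss`, `preds`, and `beta` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def entropy(q, eps=1e-12):
    """Shannon entropy (in nats) of each row of q.

    Each row of q is assumed to be a categorical distribution
    over k_hat latent actions.
    """
    q = np.clip(q, eps, 1.0)  # avoid log(0) for deterministic rows
    return -np.sum(q * np.log(q), axis=-1)


def regularized_loss(x_next, preds, q, beta):
    """Expected squared prediction error under q, plus beta * entropy(q).

    x_next: (n, d)        observed next frames
    preds:  (n, k_hat, d) forward-model prediction for each latent action
    q:      (n, k_hat)    latent-action distribution from the inverse model
    beta:   float         entropy weight (illustrative assumption)
    """
    sq_err = np.sum((preds - x_next[:, None, :]) ** 2, axis=-1)  # (n, k_hat)
    recon = np.sum(q * sq_err, axis=-1)                           # (n,)
    return float(np.mean(recon + beta * entropy(q)))


# Toy comparison: both latent actions predict the next frame perfectly,
# so only the entropy term separates the two labelings.
x_next = np.zeros((1, 3))
preds = np.zeros((1, 2, 3))
one_hot = np.array([[1.0, 0.0]])   # deterministic labeling
uniform = np.array([[0.5, 0.5]])   # maximally random labeling

loss_det = regularized_loss(x_next, preds, one_hot, beta=0.1)
loss_rnd = regularized_loss(x_next, preds, uniform, beta=0.1)
print(loss_det < loss_rnd)
```

In this toy setting the prediction term is zero for both labelings, so the deterministic one is preferred purely because of the entropy penalty. This matches the role described above of promoting determinism while keeping the objective differentiable in `q`.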

Paper Structure

This paper contains 7 sections, 6 theorems, 28 equations, and 1 figure.

Key Result

Theorem 1

Suppose $\hat{k} \geq k$ and let $(\hat{{\bm{g}}}, \hat{q})$ be a solution of Problem prob:population with hypothesis classes ${\mathcal{G}}$ (Definition 1) and ${\mathcal{Q}}$ (Definition 2). (Footnote: under assumptions ass:continuous_g and ass:injective_g, a solution is guaranteed to exist by prop:min_existence in the appendix.)

Figures (1)

  • Figure 1: Illustration of assumptions ass:connected_p(x|a) and ass:intersect_p(x|a). Assume ${\mathcal{A}} := \{1, 2\}$

Theorems & Definitions (18)

  • Example 1
  • Definition 1: FDM hypothesis space ${\mathcal{G}}$
  • Definition 2: IDM hypothesis space ${\mathcal{Q}}$
  • Remark 1
  • Example 2: No restriction on $\hat{{\mathcal{A}}}$
  • Example 3: Deterministic $\pi(a \mid {\bm{x}})$
  • Theorem 1
  • Lemma 2
  • Proof
  • Lemma 3
  • ...and 8 more