Dual Goal Representations

Seohong Park, Deepinder Mann, Sergey Levine

TL;DR

The paper introduces dual goal representations for goal-conditioned RL (GCRL): a goal is represented by the temporal distances to it from all other states. Theoretical results establish that this representation is sufficient to recover an optimal goal-reaching policy and is invariant to exogenous noise. The paper also gives a practical learning recipe that trains a parameterized temporal distance approximator and passes the resulting goal representation to a downstream offline GCRL algorithm. On the OGBench suite, the approach improves offline goal-reaching performance across 20 state- and pixel-based tasks. Because the representation depends only on the intrinsic dynamics of the environment, not on the original state representation, it can be combined with existing GCRL algorithms.

Abstract

In this work, we introduce dual goal representations for goal-conditioned reinforcement learning (GCRL). A dual goal representation characterizes a state by "the set of temporal distances from all other states"; in other words, it encodes a state through its relations to every other state, measured by temporal distance. This representation provides several appealing theoretical properties. First, it depends only on the intrinsic dynamics of the environment and is invariant to the original state representation. Second, it contains provably sufficient information to recover an optimal goal-reaching policy, while being able to filter out exogenous noise. Based on this concept, we develop a practical goal representation learning method that can be combined with any existing GCRL algorithm. Through diverse experiments on the OGBench task suite, we empirically show that dual goal representations consistently improve offline goal-reaching performance across 20 state- and pixel-based tasks.
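To make the definition concrete, here is a minimal tabular sketch. It is not the paper's learning method: it assumes a small deterministic environment with unit step cost, given as a hypothetical `successors` dictionary, and computes exact temporal distances by breadth-first search. The names `temporal_distances_to`, `dual_representation`, `chain`, and `relabel` are illustrative only. The second half relabels the states arbitrarily to show that the distance profile, and therefore the representation, stays the same.

```python
from collections import deque


def temporal_distances_to(goal, successors):
    """Return {s: d*(s, goal)} for every state s that can reach `goal`.

    Assumes deterministic transitions with unit cost, so the temporal
    distance is a shortest-path length (BFS over reversed edges).
    """
    predecessors = {s: [] for s in successors}
    for s, nexts in successors.items():
        for n in nexts:
            predecessors[n].append(s)

    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        cur = queue.popleft()
        for p in predecessors[cur]:
            if p not in dist:
                dist[p] = dist[cur] + 1
                queue.append(p)
    return dist


def dual_representation(goal, successors, state_order):
    """Distances to `goal` from every state, listed in `state_order`."""
    dist = temporal_distances_to(goal, successors)
    return tuple(dist.get(s, float("inf")) for s in state_order)


# A five-state chain 0 - 1 - 2 - 3 - 4 with moves in both directions.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
order = [0, 1, 2, 3, 4]
rep = dual_representation(4, chain, order)

# The same environment with arbitrary state names (a different "original
# state representation"); the dynamics are untouched.
relabel = {0: "q", 1: "x", 2: "b", 3: "m", 4: "k"}
renamed_chain = {relabel[s]: [relabel[n] for n in ns] for s, ns in chain.items()}
renamed_order = [relabel[s] for s in order]
renamed_rep = dual_representation(relabel[4], renamed_chain, renamed_order)

print(rep)          # (4, 3, 2, 1, 0)
print(renamed_rep)  # (4, 3, 2, 1, 0)
```

In large or continuous state spaces the exact table is unavailable, which is why the paper relies on a learned temporal distance approximator instead.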

Paper Structure

This paper contains 22 sections, 4 theorems, 6 equations, 5 figures, 9 tables, 1 algorithm.

Key Result

Theorem 3.1

Let ${\mathcal{M}}=({\mathcal{S}}, {\mathcal{A}}, p)$ be a CMP and $\varphi^\vee$ be its dual goal representation function. Then, there exists a deterministic policy $\pi^\vee: {\mathcal{S}} \times {\mathcal{S}}^\vee \to {\mathcal{A}}$ that takes a dual goal representation as input, such that its in

Figures (5)

  • Figure 1: Dual goal representations. A dual goal representation $\varphi^\vee(g)$ is defined as the set of temporal distances $d^*(\cdot, g)$ from all other states. This representation has a number of appealing properties: it only depends on the intrinsic dynamics of the environment, contains sufficient information to express an optimal goal-reaching policy, and is able to discard exogenous noise.
  • Figure 2: The "Lights Out" puzzle.
  • Figure 3: Original vs. dual representations.
  • Figure 4: Dual representations are robust to noise.
  • Figure 5: OGBench environments. OGBench provides diverse state- and pixel-based goal-conditioned tasks across robotic navigation and manipulation. This figure is taken from fql_park2025.

Theorems & Definitions (6)

  • Theorem 3.1: Sufficiency of Dual Goal Representations
  • Theorem 3.2: Noise Invariance of Dual Goal Representations
  • Theorem A.1: Sufficiency of Dual Goal Representations
  • proof
  • Theorem A.2: Noise Invariance of Dual Goal Representations
  • proof
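
The intuition behind the sufficiency result can be sketched in a toy setting. The example below is an illustration under strong assumptions (a deterministic tabular environment with unit step cost and exact distances), not the construction used in the paper's proof. The policy `greedy_policy` never sees the goal state itself, only its dual representation `phi`, yet it reaches the goal in the minimum number of steps. The graph and all function names are hypothetical.

```python
from collections import deque


def dual_rep(goal, successors):
    """Return {s: d*(s, goal)} by BFS over reversed edges (unit step cost)."""
    predecessors = {s: [] for s in successors}
    for s, nexts in successors.items():
        for n in nexts:
            predecessors[n].append(s)

    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        cur = queue.popleft()
        for p in predecessors[cur]:
            if p not in dist:
                dist[p] = dist[cur] + 1
                queue.append(p)
    return dist


def greedy_policy(state, phi, successors):
    """Move to the successor with the smallest distance-to-goal.

    The goal enters only through `phi`, its dual representation.
    """
    return min(successors[state], key=lambda n: phi.get(n, float("inf")))


def rollout(start, phi, successors, max_steps=50):
    """Follow the greedy policy until the state whose distance is 0."""
    path = [start]
    while phi.get(path[-1]) != 0 and len(path) <= max_steps:
        path.append(greedy_policy(path[-1], phi, successors))
    return path


# A small directed graph; from A there is a short route via B and a
# longer one via C.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["E"],
    "D": ["G"],
    "E": ["F"],
    "F": ["G"],
    "G": ["A"],
}

phi = dual_rep("G", graph)
path = rollout("A", phi, graph)
print(path)  # ['A', 'B', 'D', 'G']
```

The rollout takes `len(path) - 1 == phi["A"]` steps, which is the shortest possible. With stochastic dynamics or learned, approximate distances this greedy rule carries no such guarantee.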