Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction

Riccardo De Santi, Federico Arangath Joseph, Noah Liniger, Mirco Mutti, Andreas Krause

TL;DR

This work addresses experimental design in finite Markov decision processes (MDPs), exploiting geometric invariances to overcome the scalability bottlenecks of Active Exploration (AE) and Convex RL. It introduces Geometric Active Exploration (GAE), which extends MDP homomorphisms to Convex RL and solves the AE problem on a smaller abstract MDP, using a Frank-Wolfe scheme with optimistic gradients and then lifting the resulting policy back to the original MDP. A key theoretical result is a regret bound that scales with a geometric compression factor $\Phi$, showing that higher symmetry reduces sample and computational requirements. Empirical results on diffusion-like and chemical-string environments show improved data efficiency and faster runtimes as symmetry-induced compression increases, supporting the practical value of geometric priors in AE and Convex RL. Overall, the paper provides a principled framework for leveraging geometric structure to make AE more scalable in scientific discovery settings.

Abstract

How can a scientist use a Reinforcement Learning (RL) algorithm to design experiments over a dynamical system's state space? In the case of finite and Markovian systems, an area called Active Exploration (AE) relaxes the optimization problem of experimental design into Convex RL, a generalization of RL admitting a wider notion of reward. Unfortunately, this framework is currently not scalable, and the potential of AE is hindered by the vastness of experiment spaces typical of scientific discovery applications. However, these spaces are often endowed with natural geometries, e.g., permutation invariance in molecular design, that an agent could leverage to improve the statistical and computational efficiency of AE. To achieve this, we bridge AE and MDP homomorphisms, which offer a way to exploit known geometric structures via abstraction. Towards this goal, we make two fundamental contributions: we extend the MDP homomorphism formalism to Convex RL, and we present, to the best of our knowledge, the first analysis that formally captures the benefit of abstraction via homomorphisms on sample efficiency. Ultimately, we propose the Geometric Active Exploration (GAE) algorithm, which we analyse theoretically and experimentally in environments motivated by problems in scientific discovery.

Paper Structure (25 sections, 23 theorems, 97 equations, 2 figures, 1 algorithm)

This paper contains 25 sections, 23 theorems, 97 equations, 2 figures, 1 algorithm.

Key Result

proposition 1

The geometry-aware estimation error $\bar{\xi}_n$ can be rewritten as a function of abstract states as:

Figures (2)

  • Figure 1: Radial diffusion process of a pollutant from a central point source. On the left, original MDP where each circle is an $f$-equivalence class, $L_g$ denotes a state symmetry acting on $f$, $K_g^s$ denotes a state-dependent action symmetry acting on $P$. On the right, the abstract MDP obtained via the MDP homomorphism $h = (\psi, \{\phi_s \mid s \in \mathcal{S}\})$, where $\psi$ maps $f$-equivalence classes to abstract states.
  • Figure 2: Comparison of GAE with AE. GAE shows better statistical and computational efficiency. Experiments were carried out over 15 seeds, and the confidence intervals shown are $\pm$ one standard deviation. Panels, in order: (a) the statistical advantage of GAE with compression $\Phi$ over AE for deterministic dynamics in the diffusion environment; (b) the same setting as (a), but with stochastic dynamics; (c) the (classic) estimation error taken over the abstract state space; (d) the computational advantage of GAE over AE for different degrees of compression (standardized); (e) the statistical advantage of GAE over AE in the strings environment; (f) the strings environment and the invariance of $f$ under permutation.
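
The abstraction described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the paper's implementation: the grid world, the ring-based state map `ring_index` (standing in for $\psi$), and the toy function `pollutant` (standing in for $f$) are all hypothetical choices made for illustration. The sketch only shows how states on which $f$ takes the same value collapse into a single abstract state.

```python
from collections import defaultdict


def ring_index(state, center):
    """Hypothetical state map (stand-in for psi): cell -> square ring around the source."""
    (x, y), (cx, cy) = state, center
    return max(abs(x - cx), abs(y - cy))


def build_abstraction(size, center):
    """Group every cell of a size x size grid by its abstract state (ring)."""
    classes = defaultdict(list)
    for x in range(size):
        for y in range(size):
            classes[ring_index((x, y), center)].append((x, y))
    return dict(classes)


def pollutant(state, center):
    """Toy invariant function (stand-in for f): depends on a cell only through its ring."""
    return 1.0 / (1.0 + ring_index(state, center))


size, center = 5, (2, 2)
classes = build_abstraction(size, center)

# f is constant on each class, so one value per abstract state is enough.
for ring, members in classes.items():
    assert len({pollutant(s, center) for s in members}) == 1

n_states = size * size      # states in the original grid
n_abstract = len(classes)   # abstract states after applying the map
print(f"{n_states} original states -> {n_abstract} abstract states")
```

The ratio of original to abstract states printed here is only a naive size comparison; it should not be read as the paper's geometric compression factor $\Phi$, whose definition is not reproduced on this page.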

Theorems & Definitions (39)

  • definition 1: Sample Complexity Geometric Estimation
  • proposition 1
  • proposition 2: Convex Upper Bound of $\bar{\xi}_n$
  • proposition 3: Tractable Convex Upper Bound of $\bar{\xi}_n$
  • lemma 5.0: Variance Concentration (Panaganti and Kalathil, 2022)
  • proposition 4: Gradient-Reward Invariances
  • definition 2: Geometric Compression Term
  • theorem 6.0: Regret Guarantee
  • theorem 6.0: Sample Complexity of Geometric Estimation Objective
  • proposition 5: Compression via Group Cardinality
  • ...and 29 more