Inductive Generalization in Reinforcement Learning from Specifications
Vignesh Subramanian, Rohit Kushwah, Subhajit Roy, Suguman Bansal
TL;DR
This work tackles zero-shot generalization in reinforcement learning for task families whose instances are related by an inductive structure expressed through logical specifications. It introduces a policy generator that, given an inductive task family, outputs a policy for each instance by exploiting inductive relations on the edges of an abstract graph shared by all instances. The core technical contribution is learning an inductive relation over edge policies, parameterized as an $m$-degree $\kappa$-polynomial, and composing the resulting edge policies into path policies with learned guards that navigate the task DAG. Empirically, the proposed approach, GenRL, generalizes well on long-horizon tasks under both simple and complex dynamics, including robot pick-and-place and classical control benchmarks, and outperforms baselines that learn a single policy across tasks. The approach points toward scalable, reusable policy generation in robotics and control, with rapid adaptation to unseen but structurally similar tasks; theoretical guarantees and scalability remain avenues for future work.
Abstract
We present a novel inductive generalization framework for RL from logical specifications. Many interesting tasks in RL environments have a natural inductive structure. These inductive tasks have similar overarching goals but they differ inductively in low-level predicates and distributions. We present a generalization procedure that leverages this inductive relationship to learn a higher-order function, a policy generator, that generates appropriately adapted policies for instances of an inductive task in a zero-shot manner. An evaluation of the proposed approach on a set of challenging control benchmarks demonstrates the promise of our framework in generalizing to unseen policies for long-horizon tasks.
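To make the idea of a policy generator as a higher-order function concrete, the following is a minimal sketch and not the authors' algorithm. It assumes, purely for illustration, that each task instance is indexed by an integer n, that the policy learned for instance n is summarized by a small parameter vector (here a toy goal position), and that these parameters vary polynomially with n. The names `fit_policy_generator`, `generate`, and the proportional-controller policy are hypothetical; the $\kappa$-polynomial parameterization, the abstract graph, and the learned guards are not modeled.

```python
import numpy as np


def fit_policy_generator(train_indices, train_params, degree=1):
    """Fit one polynomial in the instance index n per policy parameter.

    Illustrative assumption: policy parameters vary polynomially with n.
    """
    train_params = np.asarray(train_params, dtype=float)  # (num_instances, num_params)
    # With a 2-D y, np.polyfit fits each column independently and
    # returns coefficients of shape (degree + 1, num_params).
    coeffs = np.polyfit(train_indices, train_params, deg=degree)

    def generate(n):
        """Return a policy (and its parameters) for instance n, seen or unseen."""
        params = np.array(
            [np.polyval(coeffs[:, j], n) for j in range(coeffs.shape[1])]
        )

        def policy(state):
            # Toy proportional controller: step toward the predicted goal,
            # with each action component clipped to [-1, 1].
            return np.clip(params - np.asarray(state, dtype=float), -1.0, 1.0)

        return policy, params

    return generate


# Toy training data: the goal for instance n is (2n + 1, -n), for n = 1..4.
train_indices = [1, 2, 3, 4]
train_params = [[2 * n + 1, -n] for n in train_indices]

generate = fit_policy_generator(train_indices, train_params, degree=1)

# Zero-shot: build a policy for instance n = 10, which was never trained on.
policy, goal = generate(10)
action = policy([20.5, -10.0])
```

In this toy setting the generator extrapolates the goal for n = 10 from the four training instances and returns a policy without any further training, which is the sense of "zero-shot" used above. How well such extrapolation works in practice depends entirely on whether the assumed relation across instances actually holds.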
