In-Context Compositional Q-Learning for Offline Reinforcement Learning

Qiushui Xu; Yuhao Huang; Yushu Jiang; Lei Song; Jinyu Wang; Wenliang Zheng; Jiang Bian

Abstract

Accurate estimation of the Q-function is a central challenge in offline reinforcement learning. Existing approaches often rely on a single shared global Q-function, which is inadequate for capturing the compositional structure of tasks that consist of diverse subtasks. We propose In-Context Compositional Q-Learning (ICQL), an offline RL framework that formulates Q-learning as a contextual inference problem and uses linear Transformers to adaptively infer local Q-functions from retrieved transitions without explicit subtask labels. Theoretically, we show that, under two assumptions -- linear approximability of the local Q-function and accurate inference of weights from retrieved context -- ICQL achieves a bounded approximation error for the Q-function and enables near-optimal policy extraction. Empirically, ICQL substantially improves performance in offline settings, achieving gains of up to 16.4% on kitchen tasks and up to 8.8% and 6.3% on MuJoCo and Adroit tasks, respectively. These results highlight the underexplored potential of in-context learning for robust and compositional value estimation and establish ICQL as a principled and effective framework for offline RL.
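
The contrast between a shared global Q-function and locally inferred ones can be illustrated with a toy sketch. This is not the ICQL architecture: a closed-form least-squares fit stands in for the linear Transformer's in-context inference, nearest-neighbor lookup on a one-dimensional state stands in for the retrieval step, and the names `retrieve`, `fit_linear`, `local_q`, `global_q`, and `toy_target` are all hypothetical. The targets are synthetic values for two made-up "subtasks", not learned TD targets.

```python
import random


def retrieve(dataset, query_state, k):
    """Return the k (state, target) pairs whose state is nearest the query."""
    return sorted(dataset, key=lambda pair: abs(pair[0] - query_state))[:k]


def fit_linear(pairs):
    """Closed-form least squares for target ~ slope * state + intercept."""
    n = len(pairs)
    mean_s = sum(s for s, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var = sum((s - mean_s) ** 2 for s, _ in pairs)
    cov = sum((s - mean_s) * (y - mean_y) for s, y in pairs)
    slope = cov / var
    return slope, mean_y - slope * mean_s


def local_q(dataset, query_state, k=8):
    """Infer local linear weights from retrieved context, then evaluate."""
    slope, intercept = fit_linear(retrieve(dataset, query_state, k))
    return slope * query_state + intercept


def global_q(dataset, query_state):
    """Baseline: one linear fit shared across the whole dataset."""
    slope, intercept = fit_linear(dataset)
    return slope * query_state + intercept


def toy_target(state):
    # Assumption for illustration: two "subtasks" with different linear values.
    return 2.0 * state if state < 0 else -3.0 * state


random.seed(0)
states = [random.uniform(-1.0, 1.0) for _ in range(200)]
dataset = [(s, toy_target(s)) for s in states]

for query in (-0.8, 0.8):
    print(query, toy_target(query), local_q(dataset, query), global_q(dataset, query))
```

In this toy setting each region is exactly linear, so the local fit recovers the target from the retrieved neighbors without being told which region the query belongs to, while the single global fit averages over both regions and misses.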
