Optimal Control with Natural Images: Efficient Reinforcement Learning using Overcomplete Sparse Codes
Peter N. Loxley
TL;DR
This work addresses optimal control over sequences of natural images by treating images as potential sufficient statistics for the optimal policy, and shows that overcomplete sparse codes make reinforcement learning on such images efficient and scalable. It introduces a scalable image-patch benchmark based on target-tracking dynamics and demonstrates that an overcomplete sparse-code representation greatly expands the state space that can be handled while training with Fitted Value Iteration remains tractable. The key findings are that such representations accelerate learning, increase storage capacity, and yield exact or near-exact policies on tasks orders of magnitude larger than those solvable with complete codes, without requiring deep networks. The practical contribution is a principled route to efficient vision-based control and a testbed for comparing image representations in RL, with theoretical and computational advantages over complete codes.
Abstract
Optimal control and sequential decision making are widely used in many complex tasks. Optimal control over a sequence of natural images is a first step towards understanding the role of vision in control. Here, we formalize this problem as a reinforcement learning task, and derive general conditions under which an image includes enough information to implement an optimal policy. Reinforcement learning is shown to provide a computationally efficient method for finding optimal policies when natural images are encoded into "efficient" image representations. This is demonstrated by introducing a new reinforcement learning benchmark that easily scales to large numbers of states and long horizons. In particular, by representing each image as an overcomplete sparse code, we are able to efficiently solve an optimal control task that is orders of magnitude larger than those tasks solvable using complete codes. Theoretical justification for this behaviour is provided. This work also demonstrates that deep learning is not necessary for efficient optimal control with natural images.
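To make the pairing of reinforcement learning with a fixed image encoding concrete, the following is a minimal sketch of fitted value iteration with a linear value function over per-state feature vectors. It is not the paper's benchmark or implementation: the chain-shaped toy problem, the function name `fitted_value_iteration`, and the dense random features standing in for image codes are all assumptions made for illustration, and no sparse coding is performed. The one property it illustrates is standard: when the feature matrix has full row rank (here, more features than states), the least-squares fit reproduces the Bellman targets exactly, so the iteration coincides with tabular value iteration.

```python
import numpy as np

def fitted_value_iteration(features, next_state, cost, gamma=0.95, n_iters=200):
    """Fitted value iteration with a linear value function V(s) = features[s] @ w.

    features:   (n_states, n_features) array, one feature vector per state
    next_state: (n_states, n_actions) int array of deterministic transitions
    cost:       (n_states, n_actions) array of one-step costs
    """
    w = np.zeros(features.shape[1])
    for _ in range(n_iters):
        values = features @ w
        # Bellman backup: smallest one-step cost plus discounted cost-to-go.
        targets = np.min(cost + gamma * values[next_state], axis=1)
        # Fit the linear value function to the backed-up targets.
        w, *_ = np.linalg.lstsq(features, targets, rcond=None)
    values = features @ w
    # Greedy policy with respect to the fitted values.
    policy = np.argmin(cost + gamma * values[next_state], axis=1)
    return w, policy

# Hypothetical toy problem: a chain of states 0..n-1 with goal state 0.
# Action 0 steps toward state 0, action 1 steps away; each step costs 1
# except at the goal, where the cost is 0.
n_states, gamma = 20, 0.95
states = np.arange(n_states)
next_state = np.stack(
    [np.maximum(states - 1, 0), np.minimum(states + 1, n_states - 1)], axis=1
)
cost = np.ones((n_states, 2))
cost[0, :] = 0.0

# Assumption: dense Gaussian features with twice as many features as states,
# used only as a stand-in for an encoded image per state.
rng = np.random.default_rng(0)
features = rng.standard_normal((n_states, 2 * n_states))

w, policy = fitted_value_iteration(features, next_state, cost, gamma=gamma)
```

In this toy problem the greedy policy steps toward state 0 from every state, and the fitted values match the closed-form cost-to-go (1 - gamma^s) / (1 - gamma) for state s. With fewer features than states the fit is generally only approximate, which is the regime where the choice of representation matters.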
