Boosting Reinforcement Learning in 3D Visuospatial Tasks Through Human-Informed Curriculum Design
Markus D. Solbach, John K. Tsotsos
TL;DR
The paper evaluates modern reinforcement learning approaches on a challenging 3D Same-Different visuospatial task, replicated from human psychophysics in a Unity environment. It finds that PPO and imitation-learning methods struggle to scale across large discrete action spaces and sparse rewards, while curriculum learning, especially when informed by human performance data, can drive robust learning with up to 48 viewpoints. The resulting agents achieve high accuracy but adopt strategies that differ from those of human observers, favoring a limited set of information-rich viewpoints rather than distributed exploration. The study underscores the importance of structured learning and cognitive priors (e.g., attention and memory) for scaling RL to high-dimensional active-perception tasks with sparse rewards, and suggests avenues for cognitively inspired RL architectures.
Abstract
Reinforcement Learning (RL) is a mature technology, often suggested as a potential route towards Artificial General Intelligence, with the ambitious goal of replicating the wide range of abilities found in natural and artificial intelligence, including the complexities of human cognition. While RL has shown successes in relatively constrained environments, such as the classic Atari games and specific continuous control problems, recent years have seen efforts to expand its applicability. This work investigates the potential of RL to demonstrate intelligent behaviour and its progress in addressing more complex and less structured problem domains. We present an investigation into the capacity of modern RL frameworks to address a seemingly straightforward 3D Same-Different visuospatial task. While initial applications of state-of-the-art methods, including PPO, behavioural cloning and imitation learning, revealed challenges in directly learning optimal strategies, the successful implementation of curriculum learning offers a promising avenue. Effective learning was achieved by strategically designing the lesson plan based on the findings of a real-world human experiment.
