Learning to Abstract Visuomotor Mappings using Meta-Reinforcement Learning

Carlos A. Velazquez-Vargas, Isaac Ray Christian, Jordan A. Taylor, Sreejan Kumar

TL;DR

This paper investigates how contextual cues enable learning of multiple visuomotor mappings in a de novo skill task. By combining a grid-navigation experiment with meta-reinforcement-learning agents, it compares contexts that cue separate mappings to a context-free setup, revealing that contextual cues yield computational advantages in learning capacity and representation separation. Human and model data show two adaptive strategies: context-bound separate representations and shared representations without explicit context, with context cues allowing learning of more mappings before capacity limits bite. These findings advance our understanding of how context influences visuomotor learning and have implications for neural representations and capacity limits in both humans and artificial agents.

Abstract

We investigated the human capacity to acquire multiple visuomotor mappings for de novo skills. Using a grid navigation paradigm, we tested whether contextual cues, implemented as different "grid worlds", allow participants to learn two distinct key-mappings more efficiently. Our results indicate that when contextual information is provided, task performance is significantly better. The same held true for meta-reinforcement learning agents that differed in whether or not they received contextual information when performing the task. We evaluated their accuracy in predicting human performance in the task and analyzed their internal representations. The results indicate that contextual cues allow the formation of separate representations in space and time when using different visuomotor mappings, whereas their absence favors sharing one representation. While both strategies can allow learning of multiple visuomotor mappings, we showed that contextual cues provide a computational advantage in terms of how many mappings can be learned.

Paper Structure (11 sections, 5 figures)

This paper contains 11 sections, 5 figures.

Figures (5)

  • Figure 1: Experimental task. Subjects perform a grid navigation task using different key-mappings randomly interleaved over trials. In the context group, the key-mappings were deterministically signalled by a unique grid world, whereas in the no-context group, participants used both mappings in the same world.
  • Figure 2: Performance of humans and meta-RL agents. (A) Mean performance of humans over the episodes they completed, measured by the proportion of optimal arrivals. Shading represents 95% confidence intervals. Humans who were given context cues learned the task better. (B) Overall performance differences across experimental groups in humans. Those in the context group did significantly better. (C) Mean reward over time during agent learning for both context and no-context agents. (D) Median reward after training. Agents that were given an external contextual cue for the mapping as additional input did significantly better.
  • Figure 3: In both experimental groups, humans are split between behaving like the context vs. no-context LSTM. (A) Joint scatterplot and histograms for the likelihood of subjects' actions under the context LSTM model vs. the performance of subjects on the task. In both experimental groups (whether participants received external context input or not), there are two clusters of participants: those whose actions are explained well by the context LSTM model and those whose actions are not. (B) If we examine the mean likelihood under both types of models (context vs. no-context LSTMs), we see that participants whose actions are not as well explained by the context LSTM model have significantly higher likelihood under the no-context LSTM model.
  • Figure 4: Context agents have lower representational similarity under the different key-mappings over space and time. (A) Spatial RSA analysis. We correlated LSTM hidden state representations of the same episodes under the different key-mappings and showed the mean correlation across different spatial locations in the $9 \times 9$ grid. For context agents, this correlation is much lower, presumably because the representations of different key-mappings are more separated. (B) Temporal RSA analysis. We show how the correlation changes over time by plotting, for each timepoint, the mean correlation of LSTM hidden state representations under the different key-mappings. For context agents, this correlation goes down over time, whereas for no-context agents there is less change over time. This suggests that the representations of context agents under different key-mappings become increasingly dissimilar over time.
  • Figure 5: For context agents, scaling the LSTM capacity enables learning more mappings. For both context and no-context agents, we varied the number of hidden units and exposed them to different numbers of mappings during training. For context agents, increasing the number of LSTM units allows for learning of more mappings, up to about 5. For no-context agents, varying the number of LSTM units does not have as strong an effect.
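
The spatial and temporal RSA described in the Figure 4 caption can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes the hidden states for the same episodes under two key-mappings are stored as arrays of shape (episodes, timesteps, hidden units), that a shared `positions` array gives the integer (row, column) grid cell at each step, and that Pearson correlation is the similarity measure. The function names `rowwise_pearson`, `temporal_rsa`, and `spatial_rsa` are hypothetical.

```python
import numpy as np


def rowwise_pearson(a, b):
    """Pearson correlation between matching hidden-state vectors.

    a, b: arrays of shape (..., hidden). Returns an array of shape (...).
    """
    a = a - a.mean(axis=-1, keepdims=True)
    b = b - b.mean(axis=-1, keepdims=True)
    denom = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    return (a * b).sum(axis=-1) / denom


def temporal_rsa(h_a, h_b):
    """Mean correlation at each timepoint, averaged over episodes.

    h_a, h_b: (episodes, timesteps, hidden) hidden states recorded for the
    same episodes under mapping A and mapping B (assumed layout).
    """
    return rowwise_pearson(h_a, h_b).mean(axis=0)


def spatial_rsa(h_a, h_b, positions, grid_size=9):
    """Mean correlation at each grid cell, averaged over all visits.

    positions: (episodes, timesteps, 2) integer (row, col) cells, assumed
    to be shared by both mappings. Unvisited cells come back as NaN.
    """
    corr = rowwise_pearson(h_a, h_b).ravel()
    rows = positions[..., 0].ravel()
    cols = positions[..., 1].ravel()
    total = np.zeros((grid_size, grid_size))
    count = np.zeros((grid_size, grid_size))
    # np.add.at accumulates correctly when a cell is visited more than once.
    np.add.at(total, (rows, cols), corr)
    np.add.at(count, (rows, cols), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count
```

Under these assumptions, a lower `spatial_rsa` map or a `temporal_rsa` curve that falls over timepoints would correspond to the more separated representations the caption reports for context agents.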