
Contextual Control without Memory Growth in a Context-Switching Task

Song-Ju Kim

Abstract

Context-dependent sequential decision making is commonly addressed either by providing context explicitly as an input or by increasing recurrent memory so that contextual information can be represented internally. We study a third alternative: realizing contextual dependence by intervening on a shared recurrent latent state, without enlarging recurrent dimensionality. To this end, we introduce an intervention-based recurrent architecture in which a recurrent core first constructs a shared pre-intervention latent state, and context then acts through an additive, context-indexed operator. We evaluate this idea on a context-switching sequential decision task under partial observability. We compare three model families: a label-assisted baseline with direct context access, a memory baseline with enlarged recurrent state, and the proposed intervention model, which uses no direct context input to the recurrent core and no memory growth. On the main benchmark, the intervention model performs strongly without additional recurrent dimensions. We also evaluate the models using the conditional mutual information $I(C; O \mid S)$ as a theorem-motivated operational probe of contextual dependence at fixed latent state. For task-relevant phase-1 outcomes, the intervention model exhibits positive conditional contextual information. Together, these results suggest that intervention on a shared recurrent state provides a viable alternative to recurrent memory growth for contextual control in this setting.
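
As a concrete reading of the probe above, the sketch below computes a plug-in estimate of $I(C; O \mid S)$ from discrete samples. It is purely illustrative: the paper does not specify its estimator here, and the function name `conditional_mi` as well as any discretization of the latent state $S$ into bins are assumptions, not the paper's procedure.

```python
import numpy as np
from collections import Counter

def conditional_mi(c, o, s):
    """Plug-in estimate (in nats) of I(C; O | S) from paired discrete samples.

    c, o, s: equal-length sequences of hashable labels for context, outcome,
    and a (discretized) latent state. Illustrative only; how the latent state
    is binned is an assumption, not the paper's procedure.
    """
    n = len(s)
    n_s = Counter(s)                # counts of s
    n_cs = Counter(zip(c, s))       # joint counts of (c, s)
    n_os = Counter(zip(o, s))       # joint counts of (o, s)
    n_cos = Counter(zip(c, o, s))   # joint counts of (c, o, s)

    # I(C;O|S) = sum_{c,o,s} p(c,o,s) * log[ p(c,o,s) p(s) / (p(c,s) p(o,s)) ]
    mi = 0.0
    for (ci, oi, si), k in n_cos.items():
        mi += (k / n) * np.log(k * n_s[si] / (n_cs[(ci, si)] * n_os[(oi, si)]))
    return mi
```

For example, `conditional_mi([0, 1, 0, 1], [0, 1, 1, 0], [0, 0, 1, 1])` returns $\log 2$: within each state bin the outcome is a deterministic function of context, so contextual information remains at fixed $S$.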

Figures (3)

  • Figure 1: Problem setting of the context-switching sequential decision task. The agent acts in a $9\times 9$ maze and observes a local $3\times 3$ view together with a context token. The task contains two phases within each episode. In the AB order, the target is $G1$ in phase 0 and $G2$ in phase 1; in the BA order, the target is $G2$ in phase 0 and $G1$ in phase 1. The main benchmark comprises two conditions, AB25 and BA30. Phase-1 success and reward are counted only after phase-0 success.
  • Figure 2: Overview of the benchmark conditions and model families. The upper row provides a compact timeline view of the two representative benchmark conditions, AB25 and BA30. The lower row compares the three model families. The label-assisted model L directly receives the context token together with the spatial observation. The memory model M removes direct context input and instead enlarges the recurrent state from $d$ to $d+m$. The intervention model I also removes direct context input from the recurrent core, but applies a context-indexed residual intervention to a shared pre-intervention latent state, yielding $z_t' = z_t + \alpha D_{c_t}(z_t)$ (see the sketch after this list). This comparison isolates three distinct implementations of contextual dependence: explicit conditioning, memory expansion, and intervention on a shared pre-intervention latent state.
  • Figure 3: Main performance on AB25 and BA30. Bars show the fraction of seeds (out of 10) that solved both phases for each model family. L solves both tasks perfectly. I achieves strong performance without additional recurrent dimensions. Among the memory baselines, performance is non-monotonic in memory size: M16 is strongest on BA30 and among the strongest settings on AB25, while both smaller and larger memory settings underperform it on BA30.
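
To make the intervention mechanism of model family I concrete, here is a minimal sketch under stated assumptions: a GRU plays the role of the shared recurrent core, and each context-indexed operator $D_c$ is a single linear map. The class and parameter names (`InterventionCell`, `num_contexts`, `alpha`) are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn

class InterventionCell(nn.Module):
    """Minimal sketch of model family I (all names are illustrative).

    A shared recurrent core builds a pre-intervention latent state z_t from
    the observation alone; the context token never enters the core. Context
    acts only through an additive, context-indexed residual operator:
        z_t' = z_t + alpha * D_{c_t}(z_t)
    """

    def __init__(self, obs_dim: int, latent_dim: int, num_contexts: int,
                 alpha: float = 1.0):
        super().__init__()
        self.core = nn.GRUCell(obs_dim, latent_dim)  # shared core, width d only
        # One residual operator D_c per context index (assumed linear here).
        self.interventions = nn.ModuleList(
            [nn.Linear(latent_dim, latent_dim) for _ in range(num_contexts)]
        )
        self.alpha = alpha

    def forward(self, obs, context, z_prev):
        z = self.core(obs, z_prev)  # pre-intervention latent state z_t
        # Context-indexed additive intervention, applied per batch element.
        delta = torch.stack([self.interventions[int(c)](z[i])
                             for i, c in enumerate(context)])
        return z + self.alpha * delta  # post-intervention state z_t'
```

Because the recurrent core never receives the context token, any contextual dependence must flow through the additive term, and the recurrent dimensionality stays at $d$; this is the property contrasted with the $d+m$ memory baseline in Figure 2.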