Context is All You Need

Jean Erik Delanois, Shruti Joshi, Ryan Golden, Teresa Nick, Maxim Bazhenov

Abstract

Artificial Neural Networks (ANNs) are increasingly deployed across diverse real-world settings, where they must operate under data distributions that differ from those seen during training. This challenge is central to Domain Generalization (DG), which trains models to generalize to unseen domains without target data, and Test-Time Adaptation (TTA), which improves robustness by adapting to unlabeled test data at deployment. Existing approaches to address these challenges are often complex, resource-intensive, and difficult to scale. We introduce CONTXT (Contextual augmentatiOn for Neural feaTure X Transforms), a simple and intuitive method for contextual adaptation. CONTXT modulates internal representations using simple additive and multiplicative feature transforms. Within a TTA setting, it yields consistent gains across discriminative tasks (e.g., ANN/CNN classification) and generative models (e.g., LLMs). The method is lightweight, easy to integrate, and incurs minimal overhead, enabling robust performance under domain shift without added complexity. More broadly, CONTXT provides a compact way to steer information flow and neural processing without retraining.

Figures (8)

  • Figure 1: CONTXT: Contextual augmentation via feature transforms. (a) At a chosen layer, compare the current feature vector $\mathbf{h}$ to a precomputed contextual feature representation $\mathbf{c}$ to form an "index" (their difference) $\mathbf{d}=\mathbf{c}-\mathbf{h}$. (b) Add a scaled version of this index, $\alpha\mathbf{d}$, to the features; $\alpha>0$ injects the context while $\alpha<0$ removes it. (c) Mix multiple contexts by linearly combining indices with separate scalars, e.g. $\alpha_i\mathbf{d}_i+\alpha_j\mathbf{d}_j$.
  • Figure 2: "Cow on a beach" example. (a) A representative input image shown alongside contextual examples. (b/c) The vertical axis reports the model’s maximum softmax confidence, while the horizontal axis sweeps the strength of the farm/city index injection; each subplot corresponds to a different fixed level of beach context removal (strength annotated above each panel). For both injection and removal, $\alpha = 0$ indicates that no context is applied. Text above each curve denotes the top-1 predicted class at that setting. Correct application of CONTXT yields proper classification.
  • Figure 3: Baseline accuracy for the CCT (a) and PACS (b) models. Models were trained on a single domain (Location 38 / Photo); performance is highest on the training domain, and accuracy degrades quickly when tested on other domains.
  • Figure 4: Accuracy heatmaps for CCT (a) and PACS (b). Vertical axis: out-of-domain removal strength; horizontal axis: in-domain injection strength. Color encodes change in mean test accuracy averaged across all domains (trained and untrained) compared to baseline. CONTXT can improve performance by about 10%.
  • Figure 5: Domain-wise change in accuracy on CCT (a) and PACS (b). Source domains (Photo in PACS, Location 38 in CCT) show zero shift, while most unseen target domains exhibit substantial improvements.
  • ...and 3 more figures
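
The transform described in Figure 1 can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function name `contxt_transform` and its signature are assumptions, and it stands in for operating on feature vectors extracted at a chosen layer. It implements the additive rule from the caption, $\mathbf{h}' = \mathbf{h} + \sum_i \alpha_i (\mathbf{c}_i - \mathbf{h})$, with $\alpha_i > 0$ injecting context $i$ and $\alpha_i < 0$ removing it.

```python
import numpy as np

def contxt_transform(h, contexts, alphas):
    """Hypothetical sketch of CONTXT's additive feature transform (Figure 1).

    h        : current feature vector at the chosen layer
    contexts : precomputed contextual feature vectors c_i
    alphas   : scalars; alpha_i > 0 injects context i, alpha_i < 0 removes it
    """
    h = np.asarray(h, dtype=float)
    out = h.copy()
    for c, alpha in zip(contexts, alphas):
        d = np.asarray(c, dtype=float) - h  # the "index" d_i = c_i - h
        out += alpha * d                    # scaled injection or removal
    return out

# Example: halfway injection of a single context moves h midway toward c.
h_new = contxt_transform([1.0, 0.0], [[0.0, 1.0]], [0.5])
# h_new == [0.5, 0.5]
```

Mixing contexts (Figure 1c) falls out of the same loop: passing two contexts with separate scalars yields the linear combination $\alpha_i\mathbf{d}_i + \alpha_j\mathbf{d}_j$, and $\alpha = 0$ leaves the features unchanged, matching the "no context applied" setting in Figure 2.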