Beyond Demonstrations: Dynamic Vector Construction from Latent Representations
Wang Cai, Hsiu-Yuan Huang, Zhixiang Wang, Yunfang Wu
TL;DR
DyVec tackles the inefficiency and fragility of few-shot adaptation via In-Context Vectors with a three-pronged approach: Exhaustive Query Rotation to derive robust latent representations, Dynamic Latent Segmentation to tailor vector granularity, and a REINFORCE-based strategy to learn injection positions for inference-time intervention. The method aggregates latent representations from Multi-Head Attention and injects them into frozen LLMs to match or exceed few-shot ICL performance without updating the model's weights. Empirical results across six tasks and three models show that DyVec consistently outperforms few-shot ICL, LoRA, and prior ICV baselines while maintaining high inference efficiency. This work advances practical, data-efficient, and robust task adaptation for large language models by showing the value of structured latent signals and learned injection strategies.
Abstract
In-Context derived Vector (ICV) methods extract task-relevant representations from large language models (LLMs) and reinject them during inference, achieving performance comparable to few-shot In-Context Learning (ICL) without repeated demonstration processing. However, existing ICV methods remain sensitive to ICL-specific factors, often use coarse or semantically fragmented representations as the source of the vector, and rely on heuristic injection positions, which limits their applicability. To address these issues, we propose Dynamic Vector (DyVec), which incorporates an Exhaustive Query Rotation (EQR) strategy to extract robust, semantically aggregated latent representations by mitigating the variance introduced by ICL. It then applies Dynamic Latent Segmentation and Injection to adaptively partition representations based on task complexity, and uses REINFORCE-based optimization to learn the injection positions for each segment. Experimental results show that DyVec outperforms few-shot ICL, LoRA, and prior ICV baselines. Further analysis highlights the effectiveness of dynamically segmenting and injecting semantically aggregated latent representations. DyVec provides a lightweight and data-efficient solution for inference-time task adaptation.
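The idea of learning injection positions with REINFORCE can be illustrated with a minimal sketch. This is not the DyVec implementation: it assumes a Bernoulli policy with one logit per candidate position, a moving-average baseline, and a toy reward that stands in for the task score of a frozen LLM under a given injection mask. The names `reinforce_positions`, `toy_reward`, and `helpful` are hypothetical and do not come from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reinforce_positions(reward_fn, num_positions, steps=3000, lr=0.5, seed=0):
    """Learn per-position injection probabilities with REINFORCE.

    Assumption: each candidate position is switched on or off by an
    independent Bernoulli variable whose logit is the only trainable
    parameter. The model that produces the reward is never updated.
    """
    rng = random.Random(seed)
    logits = [0.0] * num_positions
    baseline = 0.0  # moving average of rewards, used to reduce variance
    for _ in range(steps):
        probs = [sigmoid(t) for t in logits]
        # Sample a binary injection mask from the current policy.
        mask = [1 if rng.random() < p else 0 for p in probs]
        reward = reward_fn(mask)
        advantage = reward - baseline
        # For a Bernoulli policy, d/d(logit) log pi(mask) = mask - prob.
        for i in range(num_positions):
            logits[i] += lr * advantage * (mask[i] - probs[i])
        baseline = 0.9 * baseline + 0.1 * reward
    return [sigmoid(t) for t in logits]


def toy_reward(mask, helpful=(1, 4, 6)):
    """Hypothetical stand-in for a task score: the fraction of positions
    whose on/off state agrees with a hidden set of helpful positions."""
    agree = sum(1 for i, m in enumerate(mask) if (i in helpful) == bool(m))
    return agree / len(mask)


probs = reinforce_positions(toy_reward, num_positions=8)
selected = [i for i, p in enumerate(probs) if p > 0.5]
print(selected)
```

In a real setting the reward would come from evaluating the frozen model with vectors injected at the sampled positions, so each step costs at least one forward pass; the toy reward here only makes the update rule easy to inspect.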
