Affinity and Diversity: A Unified Metric for Demonstration Selection via Internal Representations
Mariko Kato, Hakaze Cho, Yoshihiro Sakai, Naoya Inoue
TL;DR
This work tackles the instability of In-Context Learning (ICL) arising from demonstration selection by introducing a unified affinity–diversity metric derived from internal model representations. By identifying the most influential induction head $\hat{h}$ and operating in its $W_Q^{\hat{h},\top} W_K^{\hat{h}}$ subspace, the authors define affinity as the mean cosine similarity between the query and label representations, and diversity as the covariance-based variance of the label representations. Empirically, affinity correlates with accuracy and diversity yields strong explanatory power ($R^2$) across tasks. Both metrics also align with prior demonstration-selection methods, which are often inconsistent with one another, and place them within a single framework. The proposed framework clarifies why existing methods diverge and suggests a joint selection criterion that could improve ICL performance in practical settings.
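As a rough illustration of the two quantities described above, the following NumPy sketch computes an affinity and a diversity score for one set of demonstrations. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `affinity_and_diversity` is hypothetical, the label representations are mapped by $W_Q^{\top} W_K$ before comparison with the raw query representation, and "covariance-based variance" is taken to mean the trace of the covariance matrix. The exact projection and variance definition used in the paper may differ.

```python
import numpy as np

def affinity_and_diversity(query, labels, W_Q, W_K):
    """Hypothetical sketch of the affinity and diversity scores.

    query  : (d_model,) representation of the test query
    labels : (k, d_model) representations of the k demonstration labels, k >= 2
    W_Q, W_K : (d_head, d_model) query/key projections of the chosen head
    """
    # Assumption: work in the head's bilinear space M = W_Q^T W_K.
    M = W_Q.T @ W_K                          # (d_model, d_model)
    keys = labels @ M.T                      # row i is M @ labels[i]

    # Affinity: mean cosine similarity between the query and each mapped label.
    q_unit = query / np.linalg.norm(query)
    k_unit = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    affinity = float(np.mean(k_unit @ q_unit))

    # Diversity (assumption): total variance, i.e. the trace of the
    # covariance matrix of the mapped label representations.
    cov = np.atleast_2d(np.cov(keys, rowvar=False))
    diversity = float(np.trace(cov))
    return affinity, diversity

# Toy usage with identity projections and two orthogonal label vectors.
W = np.eye(2)
aff, div = affinity_and_diversity(
    np.array([1.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]]), W, W
)
print(aff, div)
```

In this toy case one label points in the same direction as the query and the other is orthogonal to it, so the affinity is the average of a cosine of 1 and a cosine of 0, while the two distinct label vectors give a nonzero diversity. Identical label vectors would give a diversity of zero.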
Abstract
The performance of In-Context Learning (ICL) is highly sensitive to the selected demonstrations. Existing approaches to demonstration selection optimize different objectives, yielding inconsistent results. To address this, we propose a unified metric, affinity and diversity, that leverages the ICL model's internal representations. Our experiments show that both affinity and diversity strongly correlate with test accuracy, indicating their effectiveness for demonstration selection. Moreover, we show that our proposed metrics align well with various previous works, reconciling their inconsistencies.
