How Far Can In-Context Alignment Go? Exploring the State of In-Context Alignment
Heyan Huang, Yinghao Li, Huashan Sun, Yu Bai, Yang Gao
TL;DR
This paper investigates how far In-Context Alignment (ICA) can go by decomposing the prompt into Format, System, and Example parts and running ablation studies across model sizes. It finds that the Example part is the most influential for ICA and that System is generally more impactful than Format. A trade-off between safety and helpfulness also appears, and it shrinks as model size grows. On knowledge and tool-use tasks, ICA can outperform fine-tuning methods, especially with larger models, but it lags behind Chat-based approaches on multi-turn dialogue and instruction following. The work presents ICA as a viable, parameter-free alignment mechanism that can rival or exceed some fine-tuning methods in specific domains, while also revealing its limitations and the need for broader evaluation and more diverse data.
Abstract
Recent studies have demonstrated that In-Context Learning (ICL), through the use of specific demonstrations, can align Large Language Models (LLMs) with human preferences, an approach known as In-Context Alignment (ICA). This indicates that models can comprehend human instructions without requiring parameter adjustments. However, exploration of the mechanism and applicability of ICA remains limited. In this paper, we begin by dividing the context text used in ICA into three parts: format, system prompt, and example. Through ablation experiments, we investigate how much each part contributes to ICA. We then examine how variations in these parts affect the model's alignment performance. Our findings indicate that the example part is crucial for enhancing the model's alignment capabilities, with changes in examples significantly affecting alignment performance. We also conduct a comprehensive evaluation of ICA's zero-shot capabilities across various alignment tasks. The results indicate that, compared to parameter fine-tuning methods, ICA demonstrates superior performance in knowledge-based and tool-use tasks. However, it still exhibits limitations in areas such as multi-turn dialogue and instruction following.
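The three-part decomposition and the ablation setup described in the abstract can be pictured with a small sketch. This is an illustration only, not the authors' implementation: the function names (render_turn, build_ica_prompt, ablation_prompts), the "# Query:" / "# Answer:" markers, and the way parts are joined are all assumptions, since the abstract does not give the actual templates.

```python
from itertools import product


def render_turn(instruction, response, use_format):
    """Render one instruction/response pair.

    The "# Query:" / "# Answer:" markers are hypothetical stand-ins for
    the format part; the abstract does not specify the real template.
    """
    if use_format:
        return f"# Query:\n{instruction}\n\n# Answer:\n{response}".rstrip()
    return f"{instruction}\n{response}".rstrip()


def build_ica_prompt(query, system=None, examples=(), use_format=True):
    """Assemble a prompt from the three parts: format, system prompt, example."""
    parts = []
    if system:
        parts.append(system)  # system prompt part
    for instruction, response in examples:
        # example part: each demonstration is one rendered turn
        parts.append(render_turn(instruction, response, use_format))
    # The final turn carries the user query with an empty response slot.
    parts.append(render_turn(query, "", use_format))
    return "\n\n".join(parts)


def ablation_prompts(query, system, examples):
    """Build one prompt per on/off combination of the three parts (8 in total)."""
    grid = {}
    for use_format, use_system, use_examples in product([True, False], repeat=3):
        grid[(use_format, use_system, use_examples)] = build_ica_prompt(
            query,
            system=system if use_system else None,
            examples=examples if use_examples else (),
            use_format=use_format,
        )
    return grid
```

Each key of the returned dictionary is a (format, system, example) triple of booleans, so the prompt with every part removed is just the bare query, and comparing model outputs across keys would isolate the contribution of a single part under these assumptions.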
