Tuning-Free Personalized Alignment via Trial-Error-Explain In-Context Learning
Hyundong Cho, Karishma Sharma, Nicolaas Jedema, Leonardo F. R. Ribeiro, Alessandro Moschitti, Ravi Krishnan, Jonathan May
TL;DR
Large language models generalize toward a generic, collective voice, which limits personalization for individual users. The paper proposes Trial-Error-Explain In-Context Learning (TICL), a tuning-free method that personalizes text generation by expanding the in-context learning prompt with model-generated negative samples and explanations, avoiding any parameter updates. Empirical results across two author-style datasets show that TICL achieves win rates of up to 91.5% against the prior state-of-the-art and outperforms competitive tuning-free baselines, with explanations contributing the largest gains. The approach front-loads computation to construct a user-specific prompt, offering a practical path to personalization in black-box LLM settings, albeit with increased test-time cost and dependence on long-context understanding.
Abstract
Language models are aligned to the collective voice of many, resulting in generic outputs that do not align with specific users' styles. In this work, we present Trial-Error-Explain In-Context Learning (TICL), a tuning-free method that personalizes language models for text generation tasks with fewer than 10 examples per user. TICL iteratively expands an in-context learning prompt via a trial-error-explain process, adding model-generated negative samples and explanations that provide fine-grained guidance towards a specific user's style. In pairwise comparisons with an LLM-as-a-judge, TICL achieves win rates of up to 91.5% against the previous state-of-the-art, and it outperforms competitive tuning-free baselines on the personalized alignment tasks of writing emails, essays, and news articles. Both lexical and qualitative analyses show that the negative samples and explanations enable language models to learn stylistic context more effectively and overcome the bias towards structural and formal phrases observed in their zero-shot outputs. By front-loading inference compute to create a user-specific in-context learning prompt that does not require extra generation steps at test time, TICL presents a novel yet simple approach for personalized alignment.
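The trial-error-explain loop described in the abstract can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the authors' implementation: the `Example` container, the prompt layout in `build_prompt`, the leave-one-out choice of context, the `rounds` parameter, and the `generate` and `explain` callables (stand-ins for calls to a black-box LLM) are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Example:
    """One user-written sample plus the feedback accumulated for it (hypothetical layout)."""
    task: str                                               # e.g. an email subject or essay prompt
    user_text: str                                          # the user's own writing (positive sample)
    negatives: List[str] = field(default_factory=list)      # model attempts that missed the style
    explanations: List[str] = field(default_factory=list)   # how each attempt differs from the user


def build_prompt(examples: List[Example], new_task: str) -> str:
    """Assemble an in-context prompt from examples, negatives, and explanations."""
    parts = []
    for ex in examples:
        parts.append(f"Task: {ex.task}")
        for attempt, why in zip(ex.negatives, ex.explanations):
            parts.append(f"Attempt (not in the user's style): {attempt}")
            parts.append(f"Explanation: {why}")
        parts.append(f"User's text: {ex.user_text}")
    parts.append(f"Task: {new_task}")
    return "\n".join(parts)


def ticl_expand(
    examples: List[Example],
    generate: Callable[[str], str],                 # assumed: prompt -> model output
    explain: Callable[[str, str], Optional[str]],   # assumed: (attempt, user_text) -> explanation or None
    rounds: int = 2,
) -> List[Example]:
    """Front-loaded prompt construction: trial, error, explain."""
    for _ in range(rounds):
        for i, ex in enumerate(examples):
            # Trial: draft text for this task using the other examples as context
            # (leave-one-out is an assumption of this sketch).
            context = examples[:i] + examples[i + 1:]
            attempt = generate(build_prompt(context, ex.task))
            # Error + Explain: ask how the attempt deviates from the user's text.
            why = explain(attempt, ex.user_text)
            if why is None:  # attempt judged close enough; nothing to record
                continue
            ex.negatives.append(attempt)
            ex.explanations.append(why)
    return examples
```

Once the expanded examples are built, a test-time request under this sketch is a single call such as `generate(build_prompt(examples, new_task))`, which mirrors the abstract's point that no extra generation steps are needed at test time, although the prompt itself grows longer.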
