Modeling Ranking Properties with In-Context Learning

Nilanjan Sinhababu, Andrew Parry, Debasis Ganguly, Pabitra Mitra

TL;DR

This work addresses multi-objective information retrieval: balancing relevance with auxiliary objectives such as fairness and topical diversity. It introduces an in-context learning (ICL) framework that conditions a language model on demonstrations of desired ranking properties drawn from similar queries, eliminating the need for task-specific training. The method defines target distributions over metadata attributes and uses a greedy KL-divergence-based induction to reorder top-ranked documents so that the final list aligns with the specified objectives. Empirical results across MS MARCO, TREC DL, TREC Fairness, and Touché show that demonstration-based ICL improves diversity and fairness while maintaining relevance, outperforming prompt-based baselines and several post-hoc methods. The work presents demonstration-guided model adaptation as a practical, training-free approach to dynamic, multi-objective ranking in real-world IR systems, while acknowledging ethical considerations and limitations such as reliance on existing query logs and model capacity.
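To make the greedy KL-divergence step concrete, here is a minimal sketch rather than the authors' implementation. It assumes each document carries a single metadata label (for example `pro` or `con`), that the objective is KL(target || observed prefix distribution), and that ties go to the more relevant document. The function names and the smoothing constant `eps` are hypothetical.

```python
import math
from collections import Counter


def kl_to_target(target, counts, total, eps=1e-9):
    """KL(target || observed) for smoothed group counts (assumed direction)."""
    kl = 0.0
    for group, p in target.items():
        if p > 0:
            # Smooth so that a group missing from the prefix does not divide by zero.
            q = (counts.get(group, 0) + eps) / (total + eps * len(target))
            kl += p * math.log(p / q)
    return kl


def greedy_kl_rerank(docs, target):
    """Reorder (doc_id, group) pairs, given in relevance order, toward `target`.

    At each rank, pick the remaining document whose addition yields the
    smallest KL divergence between `target` and the group distribution of
    the list built so far. Ties keep the earlier (more relevant) document.
    """
    remaining = list(docs)
    ranked, counts = [], Counter()
    while remaining:
        def cost(i):
            trial = counts.copy()
            trial[remaining[i][1]] += 1
            return kl_to_target(target, trial, len(ranked) + 1)

        best = min(range(len(remaining)), key=cost)  # first minimum wins ties
        doc = remaining.pop(best)
        counts[doc[1]] += 1
        ranked.append(doc)
    return ranked


if __name__ == "__main__":
    docs = [("d1", "pro"), ("d2", "pro"), ("d3", "pro"),
            ("d4", "con"), ("d5", "con"), ("d6", "con")]
    for doc_id, group in greedy_kl_rerank(docs, {"pro": 0.5, "con": 0.5}):
        print(doc_id, group)
```

With a uniform target over two stances and three documents of each, every even-length prefix of the output contains equally many `pro` and `con` documents. A list built this way could then serve as the demonstration output for a similar query.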

Abstract

While standard IR models are mainly designed to optimize relevance, real-world search often needs to balance additional objectives such as diversity and fairness. These objectives depend on inter-document interactions and are commonly addressed using post-hoc heuristics or supervised learning methods, which require task-specific training for each ranking scenario and dataset. In this work, we propose an in-context learning (ICL) approach that eliminates the need for such training. Instead, our method relies on a small number of example rankings that demonstrate the desired trade-offs between objectives for past queries similar to the current input. We evaluate our approach on four IR test collections to investigate multiple auxiliary objectives: group fairness (TREC Fairness), polarity diversity (Touché), and topical diversity (TREC Deep Learning 2019/2020). We empirically validate that our method enables control over ranking behavior through demonstration engineering, allowing nuanced behavioral adjustments without explicit optimization.
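As an illustration of how one demonstration ranking might be placed in front of the current query, the sketch below assembles a listwise reranking prompt. The instruction wording, the `[i]` passage labels, and the `[2] > [1]` ranking notation are assumptions made for this example and do not reproduce the paper's actual template. All function names are hypothetical.

```python
from typing import Sequence


def format_passages(passages: Sequence[str]) -> str:
    """Number passages as [1], [2], ... one per line."""
    return "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))


def format_ranking(order: Sequence[int]) -> str:
    """Render a permutation of passage numbers as '[2] > [1] > [3]'."""
    return " > ".join(f"[{i}]" for i in order)


def build_icl_prompt(demo_query: str,
                     demo_passages: Sequence[str],
                     demo_order: Sequence[int],
                     query: str,
                     passages: Sequence[str]) -> str:
    """Build a prompt with one demonstration followed by the current query.

    `demo_order` is the target ranking for the demonstration query, i.e. the
    ordering that exhibits the desired trade-off (relevance, fairness, ...).
    """
    if sorted(demo_order) != list(range(1, len(demo_passages) + 1)):
        raise ValueError("demo_order must be a permutation of the demonstration passages")
    parts = [
        # Illustrative instruction text, not taken from the paper.
        "Rank the passages for the query, following the ordering style of the example.",
        "",
        f"Example query: {demo_query}",
        format_passages(demo_passages),
        f"Example ranking: {format_ranking(demo_order)}",
        "",
        f"Query: {query}",
        format_passages(passages),
        "Ranking:",
    ]
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_icl_prompt(
        "should schools require uniforms",
        ["Uniforms reduce bullying.", "Uniforms limit expression.", "Uniforms cut costs."],
        [1, 2, 3],
        "is homework beneficial",
        ["Homework builds discipline.", "Homework causes stress."],
    ))
```

Changing only `demo_order` (or swapping in a different demonstration query) changes the behavior being demonstrated while the rest of the prompt stays fixed, which is the sense in which the abstract speaks of control through demonstration engineering.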

Paper Structure

This paper contains 38 sections, 3 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Proposed ICL method for reranking a set of top-retrieved documents. An example consists of a localized query along with its top-retrieved documents, arranged to satisfy a desired ranking property such as relevance, fairness, or diversity.
  • Figure 2: ICL example for a Touché query. The target objective here is a uniform distribution of the pro and con arguments retrieved from the Touché collection. The figure shows the MS MARCO (train set) query $Q$ that is most similar to the current input query $Q^c$, and how the documents retrieved for $Q$ from the Touché collection are reranked to balance the pro:con ratio. This reranked list is added to the prompt as the example output.
  • Figure 3: A sample input query from the Touché dataset. The ICL example of a related query from MS MARCO and its example output (balancing both relevance and pro:con parity, as shown in Figure 2) is used to control the current query's reranking.
  • Figure 4: The prompt template used in our work, with a header identical to that of Sun et al. (2023). Unlike Sun et al. (2023), our prompt can include a target ranking for a similar query. In the figure, $Q^c$ denotes the current input query, and $D^c_i$ denotes the document at position $i$ of the input ranked list, which is to be reranked.
  • Figure 5: An example showing five localized queries that are retrieved for a test query in each test collection.

Theorems & Definitions (3)

  • Example 1
  • Example 2
  • Example 3