Multi-Conditional Ranking with Large Language Models

Pouya Pezeshkpour; Estevam Hruschka

TL;DR

This paper defines and explores the task of multi-conditional ranking by introducing MCRank, a benchmark tailored for assessing multi-conditional ranking across various item types and conditions, and proposes a novel decomposed reasoning method, consisting of EXtracting and Sorting the conditions, and then Iteratively Ranking the items.

Abstract

Utilizing large language models (LLMs) to rank a set of items has become a common approach in recommendation and retrieval systems. Typically, these systems focus on ordering a substantial number of documents in a monotonic order based on a given query. However, real-world scenarios often present a different challenge: ranking a comparatively smaller set of items, but according to a variety of diverse and occasionally conflicting conditions. In this paper, we define and explore the task of multi-conditional ranking (MCR) by introducing MCRank, a benchmark tailored for assessing multi-conditional ranking across various item types and conditions. Our analysis of LLMs using MCRank indicates a significant decrease in performance as the number and complexity of items and conditions grow. To overcome this limitation, we propose a novel decomposed reasoning method, consisting of EXtracting and Sorting the conditions, and then Iteratively Ranking the items (EXSIR). Our extensive experiments show that this decomposed reasoning method enhances LLMs' performance significantly, achieving up to a 14.4% improvement over existing LLMs. We also provide a detailed analysis of LLMs' performance across various condition categories, and examine the effectiveness of the decomposition step. Furthermore, we compare our method with existing approaches such as Chain-of-Thought and existing ranking models, demonstrating the superiority of our approach and the complexity of the MCR task. We released our dataset and code.
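The decomposition described above (extract the conditions, sort them by priority, then apply them to the item list one at a time) can be illustrated with a minimal sketch. This is not the released implementation: in the paper each step is carried out by prompting an LLM, whereas here the conditions are already "extracted" as hand-written Python objects. The `Condition` class, its `priority` convention (1 is highest), the function names, and the choice to apply conditions from lowest to highest priority using a stable sort are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Condition:
    """Hypothetical stand-in for a condition extracted from a prompt."""
    name: str
    priority: int                 # assumed convention: 1 = highest priority
    key: Callable[[str], Any]     # smaller key means the item ranks earlier


def sort_conditions(conditions: List[Condition]) -> List[Condition]:
    """Sorting step: order conditions from highest to lowest priority."""
    return sorted(conditions, key=lambda c: c.priority)


def iterative_rank(items: List[str], conditions: List[Condition]) -> List[str]:
    """Iterative ranking step (assumed scheme, not taken from the paper).

    Conditions are applied one per pass, lowest priority first. Because
    Python's sort is stable, each later, higher-priority pass dominates
    while earlier passes only break ties.
    """
    ranked = list(items)
    for cond in reversed(sort_conditions(conditions)):
        ranked = sorted(ranked, key=cond.key)
    return ranked


items = ["pear", "banana", "kiwi", "fig", "apple"]
conditions = [
    Condition("alphabetical order", priority=2, key=lambda s: s),
    Condition("shorter names first", priority=1, key=len),
]

ranked = iterative_rank(items, conditions)
print(ranked)  # ['fig', 'kiwi', 'pear', 'apple', 'banana']
```

In this toy run the higher-priority length condition decides the overall order, and the lower-priority alphabetical condition only separates "kiwi" from "pear", which have the same length.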

Paper Structure (28 sections, 8 figures, 13 tables)

Introduction
Multi-Conditional Ranking
MCRank Benchmark
Extracting, Sorting, and Iteratively Ranking (EXSIR)
Experimental Details
Experiments
Ranking on MCRank Benchmark
Per-Category Breakdown
Accuracy of Decomposition
Zero-shot CoT vs Decomposed Reasoning
Existing Rankers and o1-mini
Existing Rankers
Related Work
Conclusion
Limitations
...and 13 more sections

Figures (8)

Figure 1: Overview of multi-conditional ranking. Instead of directly prompting LLMs to rank items based on the given conditions, we first extract and sort the conditions based on their priority. Then, we iteratively apply these sorted conditions to the item list.
Figure 2: LLMs' performance on MCRank for token-level items.
Figure 3: LLMs' performance on MCRank for paragraph-level items.
Figure 4: Evaluating the impact of EXSIR against zero-shot CoT prompting for token-level items. We additionally report the performance of SFR and RankGPT as representatives of existing rankers.
Figure 5: Evaluating the impact of EXSIR against zero-shot CoT prompting for paragraph-level items. We additionally report the performance of SFR and RankGPT as representatives of existing rankers.
...and 3 more figures
