Learning Compact Representations of LLM Abilities via Item Response Theory

Jianhao Chen, Chenxu Wang, Gengrui Zhang, Peng Ye, Lei Bai, Wei Hu, Yuzhong Qu, Shuyue Hu

TL;DR

This work addresses the challenge of managing a rapidly expanding landscape of LLMs by learning compact, interpretable representations of model abilities for downstream tasks. It introduces IrtNet, an item response theory–inspired framework that jointly learns a $d$-dimensional model embedding $\theta_m$ and query parameters $\alpha_q$ (discrimination) and $\beta_q$ (difficulty) through a Mixture-of-Experts architecture, producing $f_{\theta}(m,q)=\sigma(\alpha_q^\top\theta_m-\beta_q)$. The approach achieves state-of-the-art model routing accuracy and data-efficient benchmark prediction, while revealing interpretable structure: $\alpha_q$ encodes distinct query demands, $\beta_q$ correlates with empirical difficulty (Pearson $r = -0.9721$), and model embeddings form meaningful clusters by family and specialization. Overall, IrtNet provides a scalable, interpretable tool for evaluation, selection, and management of large LLM ecosystems in real-world settings.
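The response function above is small enough to write out directly. The following is a minimal sketch in NumPy, not the authors' implementation; the function names and the numeric values of `theta_m`, `alpha_q`, and `beta_q` are made up for illustration.

```python
import numpy as np

def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def response_probability(theta_m, alpha_q, beta_q):
    """Probability that model m answers query q correctly:
    sigma(alpha_q^T theta_m - beta_q)."""
    return sigmoid(np.dot(alpha_q, theta_m) - beta_q)

# Illustrative (hypothetical) values with d = 3 skill dimensions.
theta_m = np.array([1.0, 0.5, -0.5])   # model ability embedding
alpha_q = np.array([2.0, 0.0, 1.0])    # query discrimination vector

# Same model and discrimination vector, two different difficulty scalars.
p_easy = response_probability(theta_m, alpha_q, beta_q=-1.0)
p_hard = response_probability(theta_m, alpha_q, beta_q=3.0)
print(p_easy > p_hard)  # a larger beta_q lowers the predicted probability
```

With the other parameters held fixed, raising $\beta_q$ can only lower the predicted probability, which is why it plays the role of a difficulty parameter.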

Abstract

Recent years have witnessed a surge in the number of large language models (LLMs), yet efficiently managing and utilizing these vast resources remains a significant challenge. In this work, we explore how to learn compact representations of LLM abilities that can facilitate downstream tasks, such as model routing and performance prediction on new benchmarks. We frame this problem as estimating the probability that a given model will correctly answer a specific query. Inspired by item response theory (IRT) in psychometrics, we model this probability as a function of three key factors: (1) the model's multi-skill ability vector, (2) the query's discrimination vector that separates models of differing skills, and (3) the query's difficulty scalar. To learn these parameters jointly, we introduce a Mixture-of-Experts (MoE) network that couples model- and query-level embeddings. Extensive experiments demonstrate that our approach leads to state-of-the-art performance in both model routing and benchmark accuracy prediction. Moreover, analysis validates that the learned parameters encode meaningful, interpretable information about model capabilities and query characteristics.
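One plausible way to use such per-query probabilities for the two downstream tasks is sketched below. It rests on assumptions the abstract does not confirm: routing is taken to be an argmax over models for each query, and a benchmark score is taken to be the mean predicted probability over that benchmark's queries. All names and numbers are hypothetical.

```python
import numpy as np

def predict_matrix(thetas, alphas, betas):
    """Predicted correctness probabilities for every (model, query) pair.

    thetas: (M, d) model ability vectors
    alphas: (Q, d) query discrimination vectors
    betas:  (Q,)   query difficulty scalars
    returns (M, Q) matrix of sigma(alpha_q^T theta_m - beta_q)
    """
    logits = thetas @ alphas.T - betas  # betas broadcasts across the model rows
    return 1.0 / (1.0 + np.exp(-logits))

def route(probs):
    """Assumed routing rule: for each query, pick the model with the highest probability."""
    return probs.argmax(axis=0)

def predicted_benchmark_accuracy(probs):
    """Assumed benchmark estimate: mean predicted probability per model."""
    return probs.mean(axis=1)

# Two hypothetical models with complementary skills, three hypothetical queries.
thetas = np.array([[2.0, 0.0],
                   [0.0, 3.0]])
alphas = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
betas = np.array([0.0, 0.0, 1.0])

probs = predict_matrix(thetas, alphas, betas)
choices = route(probs)                       # index of the chosen model per query
scores = predicted_benchmark_accuracy(probs)  # one estimated accuracy per model
```

In this toy setup the first query is routed to model 0 and the other two to model 1, and model 1 receives the higher estimated accuracy.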

Paper Structure

This paper contains 21 sections, 6 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Overview of the IrtNet framework for learning LLM representations. IrtNet learns model embeddings from the models' past query-answering performance and outputs the probability that a model answers a query correctly. This probability can be applied directly to downstream tasks such as model routing and benchmark prediction.
  • Figure 2: The architecture of IrtNet. A query embedding is processed through a dense MoE layer and subsequent linear layers to generate the query's discrimination $\alpha_q$ and difficulty $\beta_q$ parameters. These parameters are then combined with an LLM embedding $\theta_m$ via the response function to compute the final output probability.
  • Figure 3: Predicted vs. true benchmark scores (in $[0, 1]$) on three OOD benchmarks. The scatter plots show the LLM scores predicted by IrtNet and EmbedLLM. IrtNet's predictions (blue dots) lie closer to the perfect-prediction diagonal on MMLU and PIQA, indicating lower prediction error. IrtNet and EmbedLLM are tied on MedMCQA, where the predicted scores almost coincide with the true scores.
  • Figure 4: t-SNE visualization of learned query discrimination vectors $\alpha_q$.
  • Figure 5: Comparison of intra-community and inter-community L2 distances for LLM embeddings.
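
The forward pass described in the Figure 2 caption can be sketched as follows. This is an illustrative NumPy mock-up under stated assumptions, not the paper's architecture: the class name, layer sizes, tanh activation, softmax gating, single linear experts, and random weights are all hypothetical. Only the overall flow (query embedding, dense MoE layer, linear heads for $\alpha_q$ and $\beta_q$, response function with $\theta_m$) follows the caption.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

class DenseMoEQueryEncoder:
    """Hypothetical query encoder: dense MoE layer followed by two linear heads."""

    def __init__(self, in_dim, hidden_dim, ability_dim, num_experts):
        scale = 0.1
        self.W_gate = rng.normal(0, scale, (num_experts, in_dim))
        self.W_experts = rng.normal(0, scale, (num_experts, hidden_dim, in_dim))
        self.W_alpha = rng.normal(0, scale, (ability_dim, hidden_dim))
        self.w_beta = rng.normal(0, scale, hidden_dim)

    def __call__(self, e_q):
        gate = softmax(self.W_gate @ e_q)           # (K,) weights over all experts
        expert_out = np.tanh(self.W_experts @ e_q)  # (K, hidden) one output per expert
        h = gate @ expert_out                       # dense mixture: every expert contributes
        alpha_q = self.W_alpha @ h                  # discrimination vector, shape (d,)
        beta_q = float(self.w_beta @ h)             # difficulty scalar
        return alpha_q, beta_q

# Hypothetical sizes: 16-dim query embedding, 4 experts, d = 8 ability dimensions.
encoder = DenseMoEQueryEncoder(in_dim=16, hidden_dim=32, ability_dim=8, num_experts=4)
e_q = rng.normal(size=16)      # stand-in for a real query embedding
theta_m = rng.normal(size=8)   # stand-in for a learned LLM embedding

alpha_q, beta_q = encoder(e_q)
p_correct = 1.0 / (1.0 + np.exp(-(alpha_q @ theta_m - beta_q)))
```

"Dense" is read here as meaning that all experts are evaluated and mixed by the gate rather than only a selected top-k; that reading is an assumption about the caption's wording.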