HITSnDIFFs: From Truth Discovery to Ability Discovery by Recovering Matrices with the Consecutive Ones Property

Zixuan Chen; Subhodeep Mitra; R Ravi; Wolfgang Gatterbauer

TL;DR

Ability discovery ranks users by their latent ability to choose correct labels across heterogeneous items, and is framed as the dual of truth discovery. The authors introduce HITSnDIFFs (HnD), a spectral method that draws on the Consecutive Ones Property (C1P) and Item Response Theory (IRT) to recover the correct user ordering in the ideal case and to handle non-ideal data robustly. They show that HnD reconstructs the C1P ordering via the second-largest eigenvector of a row-normalized update matrix and that it outperforms ABH in scalability and stability, and they provide entropy-based symmetry breaking to resolve the direction of the ordering. Experiments on synthetic IRT data and real-world MCQ datasets show higher accuracy than competing methods and linear scalability in the number of users and questions, with practical implications for crowd-sourced ability assessment and ranking.

Abstract

We analyze a general problem in a crowd-sourced setting where one user asks a question (also called an item) and other users return answers (also called labels) for this question. Unlike existing crowdsourcing work, which focuses on finding the most appropriate label for the question (the "truth"), our problem is to determine a ranking of the users based on their ability to answer questions. We call this problem "ability discovery" to emphasize the connection to, and duality with, the better-studied problem of "truth discovery". To model items and their labels in a principled way, we draw upon Item Response Theory (IRT), which is the widely accepted theory behind standardized tests such as the SAT and GRE. We start from an idealized setting where the relative performance of users is consistent across items and better users choose better-fitting labels for each item. We posit that a principled algorithmic solution to our more general problem should solve this ideal setting correctly, and we observe that the response matrices in this setting obey the Consecutive Ones Property (C1P). While C1P is well understood algorithmically, with various discrete algorithms, we devise a novel variant of the HITS algorithm which we call "HITSnDIFFs" (or HnD), and prove that it recovers the ideal C1P permutation whenever it exists. Unlike fast combinatorial algorithms for finding the consecutive ones permutation (if it exists), HnD also returns an ordering when such a permutation does not exist. Thus it provides a principled heuristic for our problem that is guaranteed to return the correct answer in the ideal setting. Our experiments show that HnD produces user rankings with robustly high accuracy compared to state-of-the-art truth discovery methods. We also show that our novel variant of HITS scales better in the number of users than ABH, the only prior spectral C1P reconstruction algorithm.
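To make the ideal setting concrete, the following is a minimal sketch of a consistent response matrix and a check that its columns have consecutive ones. The toy data and the helper names `one_hot_responses` and `has_consecutive_ones` are illustrative assumptions, not taken from the paper. The check only tests one given row order; it does not search for a C1P permutation.

```python
import numpy as np

def one_hot_responses(choices, k):
    """Flatten an (m x n) matrix of chosen option indices into an
    (m x k*n) binary matrix with one block of k columns per item."""
    m, n = choices.shape
    C = np.zeros((m, k * n))
    rows = np.repeat(np.arange(m), n)
    cols = (np.arange(n) * k + choices).ravel()
    C[rows, cols] = 1.0
    return C

def has_consecutive_ones(C):
    """True if, in the given row order, the ones in every column are contiguous."""
    for col in C.T:
        idx = np.flatnonzero(col)
        if idx.size and idx[-1] - idx[0] + 1 != idx.size:
            return False
    return True

# Hypothetical toy data: 4 users sorted by increasing ability, 3 items,
# options indexed from worst (0) to best (2). Better users pick better options.
choices = np.array([[0, 0, 0],
                    [1, 0, 1],
                    [2, 1, 1],
                    [2, 2, 2]])
C = one_hot_responses(choices, k=3)

sorted_ok = has_consecutive_ones(C)                  # users in ability order
shuffled_ok = has_consecutive_ones(C[[2, 0, 3, 1]])  # users in a scrambled order
```

In this toy case the ability-sorted rows pass the check and the scrambled rows do not, which is the sense in which recovering the user ranking amounts to recovering a consecutive-ones row permutation.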

Paper Structure

This paper contains 35 sections, 9 theorems, 25 equations, 15 figures, 1 table, 2 algorithms.

Introduction
Formal setup
Ability discovery problem formulation
The ideal case with consistent responses
Relation to Consecutive Ones Property (C1P)
Relation to Item Response Theory (IRT)
A family of HITS algorithms
"HITS" and its variants for truth discovery
"avgHITS"
Our algorithm "HITSnDIFFs" (HnD)
Decile entropy-based symmetry breaking
Why HnD works better than ABH
Complexity Comparison
Experiments
Experimental setup
...and 20 more sections

Key Result

Lemma 1

$\mathbf{x}$ is the second-largest eigenvector of $\mathbf{U}$ iff $\mathbf{y} = \mathbf{S} \mathbf{x}$ is the largest eigenvector of $\mathbf{U}^{\textrm{diff}}$.
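The correspondence can be checked numerically on a small example. The sketch below rests on assumptions that this summary does not spell out: $\mathbf{U}$ is taken to be the product of the row-normalized response matrix and its row-normalized transpose, $\mathbf{S}$ is taken to be the $(m-1) \times m$ matrix of differences between neighboring entries, $\mathbf{T}$ is taken to be the $m \times (m-1)$ cumulative-sum matrix, and $\mathbf{U}^{\textrm{diff}} = \mathbf{S}\mathbf{U}\mathbf{T}$. The toy data are hypothetical.

```python
import numpy as np

# Hypothetical toy data: 4 users, 3 items, 3 options per item (indices 0..2).
choices = np.array([[0, 0, 0],
                    [1, 0, 1],
                    [2, 1, 1],
                    [2, 2, 2]])
m, k = choices.shape[0], 3
C = np.eye(k)[choices].reshape(m, -1)  # (m x k*n) binary response matrix

# Assumed update matrix: row-normalized C times row-normalized C^T.
C_row = C / C.sum(axis=1, keepdims=True)
Ct_row = C.T / C.T.sum(axis=1, keepdims=True)
U = C_row @ Ct_row                     # (m x m), every row sums to 1

# Assumed re-shaping matrices.
S = np.eye(m, k=1)[:-1] - np.eye(m)[:-1]  # (S x)[i] = x[i+1] - x[i]
T = np.tril(np.ones((m, m - 1)), k=-1)    # (T y)[i] = y[0] + ... + y[i-1]
U_diff = S @ U @ T                        # ((m-1) x (m-1))

# Second-largest eigenpair of U (the largest has eigenvalue 1 and a constant vector).
vals, vecs = np.linalg.eig(U)
order = np.argsort(-vals.real)
lam2 = vals.real[order[1]]
x = vecs[:, order[1]].real

# Under the assumptions above, y = S x is an eigenvector of U_diff with eigenvalue lam2,
# and lam2 is the largest eigenvalue of U_diff.
y = S @ x
top_diff = np.max(np.linalg.eigvals(U_diff).real)
```

The reason this works under these assumptions is that $\mathbf{T}\mathbf{S}\mathbf{x}$ equals $\mathbf{x}$ shifted by a constant vector, $\mathbf{U}$ maps constant vectors to themselves, and $\mathbf{S}$ maps constant vectors to zero, so $\mathbf{S}\mathbf{U}\mathbf{T}\mathbf{S}\mathbf{x} = \mathbf{S}\mathbf{U}\mathbf{x}$. Working with differences removes the uninformative constant eigenvector, so a plain power iteration on $\mathbf{U}^{\textrm{diff}}$ can target the ordering-bearing eigenvector directly.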

Figures (15)

Figure 1: (a) Ability discovery problem: $m = 4$ users each choose one of $k = 3$ labels A, B, C for each of $n = 3$ items. (b) Input: the $(m \times n)$ response matrix $\mathbf{C}'$, or equivalently its flattened $(m \times kn)$ binary response matrix $\mathbf{C}$. (c) Model: the probability of picking the correct answer as a function of user ability for Items 1, 2, 3. The abilities of all 4 users are marked on the horizontal axis.
Figure 2: Correspondences between the discussed IRT models. Orange numbers show the number of free parameters per question. Arrows mean "specializes into." Dashed arrows indicate that the specialization requires additional assumptions: Bock to GRM holds only approximately after fixing $a^{\textrm{Bock}}_{h} = h \cdot a^{\textrm{GRM}}$; Samejima to 3PL holds for $k = 2$ when $c = 1/k$.
Figure 3: HITSnDIFFs uses a 3-partite graph of option weights, user scores, and user diffs. Contrast this graph with the introductory figure. The update equations (see the HnD algorithm) use two re-shaping matrices $\mathbf{S}$ and $\mathbf{T}$.
Figure 4: Results of the accuracy experiments (the legend is in the first figure).
Figure 5: Scalability experiments with $n=100$ items and increasing numbers of users $m$ in (a), or $m=100$ users and increasing numbers of items $n$ in (b). The experiments confirm that our method (HnD) scales linearly in the number of items and users, whereas ABH (even when trying various alternative methods) has an unavoidable quadratic scalability in the number of users.
...and 10 more figures

Theorems & Definitions (22)

Example 1: Student ranking
Example 2: Crowd workers ranking
Definition 1: Ability discovery
Example 3
Definition 2: Consistent Responses
Definition 3: C1P, P-matrix & pre-P-matrix (ABH)
Lemma 1: Eigenvector correspondence
Theorem 1: 2nd eigenvector of avgHITS recovers C1P
Theorem 2
Definition 4: R-matrix (ABH)
...and 12 more
