Efficient Swap Multicalibration of Elicitable Properties

Lunjia Hu, Haipeng Luo, Spandan Senapati, Vatsal Sharan

TL;DR

This work introduces an oracle-efficient framework for online swap multicalibration of elicitable properties, extending multicalibration beyond means to arbitrary bounded properties with Lipschitz identification functions. By leveraging an online agnostic learner and a careful decomposition into interval-based gains, the authors establish high-probability bounds of the form $\mathsf{SMCal}_{\Gamma, r}(\mathcal{F}) = \tilde{O}(T^{1/(r+1)})$ for any fixed $r \ge 2$ and for hypothesis classes with bounded sequential Rademacher complexity; the case $r = 2$ gives $\mathsf{SMCal}_{\Gamma, 2}(\mathcal{F}) = \tilde{O}(T^{1/3})$. They present an inefficient baseline algorithm and then achieve oracle efficiency by running $2N$ copies of online agnostic learners, which improves on prior online multicalibration results and resolves an open question about efficient algorithms with strong guarantees. The framework covers mean, quantile, and other elicitable properties via an identification function $V$, giving swap multicalibration guarantees that scale favorably with the time horizon and the complexity of the hypothesis class. The result is a theoretically grounded route to fair and calibrated online predictions in adversarial settings.
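For a rough sense of scale, the rates can be evaluated at an illustrative horizon. The choice $T = 10^6$ is an assumption made here for arithmetic convenience, not a figure from the paper, and the logarithmic factors and constants hidden in $\tilde{O}$ are ignored. The $T^{1/2}$ entry is the $\sqrt{T}$ rate of earlier work cited in the abstract.

$$
T = 10^{6}: \qquad T^{1/2} = 10^{3}, \qquad T^{1/3} = 10^{2} \ (r = 2), \qquad T^{1/4} \approx 31.6 \ (r = 3), \qquad T^{1/5} \approx 15.8 \ (r = 4).
$$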

Abstract

Multicalibration [HJKRR18] is an algorithmic fairness perspective that demands that the predictions of a predictor are correct conditional on themselves and membership in a collection of potentially overlapping subgroups of a population. The work of [NR23] established a surprising connection between multicalibration for an arbitrary property $Γ$ (e.g., mean or median) and property elicitation: a property $Γ$ can be multicalibrated if and only if it is elicitable, where elicitability is the notion that the true property value of a distribution can be obtained by solving a regression problem over the distribution. In the online setting, [NR23] proposed an inefficient algorithm that achieves $\sqrt T$ $\ell_2$-multicalibration error for a hypothesis class of group membership functions and an elicitable property $Γ$, after $T$ rounds of interaction between a forecaster and adversary. In this paper, we generalize multicalibration for an elicitable property $Γ$ from group membership functions to arbitrary bounded hypothesis classes and introduce a stronger notion -- swap multicalibration, following [GKR23]. Subsequently, we propose an oracle-efficient algorithm which, when given access to an online agnostic learner, achieves $T^{1/(r+1)}$ $\ell_r$-swap multicalibration error with high probability (for $r\ge2$) for a hypothesis class with bounded sequential Rademacher complexity and an elicitable property $Γ$. For the special case of $r=2$, this implies an oracle-efficient algorithm that achieves $T^{1/3}$ $\ell_2$-swap multicalibration error, which significantly improves on the previously established bounds for the problem [NR23, GMS25, LSS25a], and completely resolves an open question raised in [GJRR24] on the possibility of an oracle-efficient algorithm that achieves $\sqrt{T}$ $\ell_2$-mean multicalibration error by answering it in a strongly affirmative sense.
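The notion of elicitability used above can be illustrated with a minimal sketch. It assumes two textbook examples that are not taken from the paper's text: the mean, elicited by the squared loss, and a $q$-quantile, elicited by the pinball loss. The identification functions below follow one common sign convention, which may differ from the paper's. All function names, the grid search, and the synthetic uniform data are hypothetical choices made for illustration only.

```python
import numpy as np

def mean_identification(gamma, y):
    # One common convention: V(gamma, y) = gamma - y,
    # whose average over y is zero exactly at the mean.
    return gamma - y

def quantile_identification(gamma, y, q):
    # One common convention: V(gamma, y) = 1[y <= gamma] - q,
    # whose average over y is (approximately) zero at a q-quantile.
    return (y <= gamma).astype(float) - q

def squared_loss(gamma, y):
    # Strictly consistent loss for the mean.
    return (gamma - y) ** 2

def pinball_loss(gamma, y, q):
    # Consistent loss for the q-quantile.
    diff = y - gamma
    return np.maximum(q * diff, (q - 1.0) * diff)

def elicit(loss, y, grid):
    # "Solve a regression problem": return the grid point
    # that minimizes the average loss over the sample.
    avg = np.array([loss(g, y).mean() for g in grid])
    return grid[int(np.argmin(avg))]

# Synthetic outcomes in [0, 1] (an assumption of this sketch).
rng = np.random.default_rng(0)
y = rng.uniform(0.0, 1.0, size=5000)
grid = np.linspace(0.0, 1.0, 1001)
q = 0.9

mean_hat = elicit(squared_loss, y, grid)
quant_hat = elicit(lambda g, s: pinball_loss(g, s, q), y, grid)

# The identification functions should average to roughly zero
# at the elicited values.
mean_residual = mean_identification(mean_hat, y).mean()
quant_residual = quantile_identification(quant_hat, y, q).mean()

print(mean_hat, quant_hat, mean_residual, quant_residual)
```

In this sketch the loss minimizers agree with the sample mean and sample quantile up to the grid resolution, and the averaged identification functions are close to zero at those values. That zero-average condition, imposed conditionally on predictions and hypotheses, is the kind of quantity a multicalibration error measures.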

Paper Structure

This paper contains 30 sections, 14 theorems, 88 equations, 1 table, 4 algorithms.

Key Result

Theorem 1

Fix an $r \ge 1$ and an elicitable property $\Gamma$ with a $\rho$-Lipschitz identification function, and assume that there exists an $\mathsf{OAL}$ for which $\mathsf{Reg}(\mathcal{F}, n) = \tilde{\mathcal{O}}(\sqrt{n}\,\mathsf{Comp}(\mathcal{F}))$, where $\mathsf{Comp}(\mathcal{F})$ is a complexity m... with probability at least $1 - \delta$. Consequently, for $r \in [1, 2)$, alg:hp_algorithm_efficien...

Theorems & Definitions (25)

  • Definition 1: Online Agnostic Learning [ben2009agnostic, beygelzimer2015optimal]
  • Theorem 1
  • Definition 2: Strictly consistent loss function, Property elicitation
  • Definition 3: Identification function
  • Definition 4: Sequential Rademacher Complexity
  • Proposition 1
  • proof
  • Lemma 1
  • Lemma 2
  • proof
  • ...and 15 more