An Axiomatic Approach to Loss Aggregation and an Adapted Aggregating Algorithm

Armando J. Cabrera Pacheco; Rabanus Derr; Robert C. Williamson

TL;DR

This work advances online learning under expert advice by showing that the loss aggregations satisfying a small set of reasonable axioms are exactly the quasi-sums $\mathbf{Q}^u_n(x_1,\dots,x_n)=u^{-1}\big(\sum_{i=1}^n u(x_i)\big)$. It then develops a variant of the Aggregating Algorithm, built on an aggregating pseudo-algorithm for quasi-sums (APA-QS), that handles these quasi-sums through a weighting profile $f$ and a transformed loss $u$, preserving Bayes-like updating and a time-independent regret bound via the change of variables $u(x) = -\ln f(x)$. The authors prove optimality results for quasi-sum aggregations, extend the framework to non-mixable losses, and interpret aggregation as encoding the forecaster's attitude toward losses, supported by a weather-forecasting experiment that demonstrates how different generators shape predictions and tail behavior. Overall, the paper unifies axiomatic aggregation with online learning guarantees and offers practical guidance on selecting aggregations to control extreme losses.
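To make the quasi-sum formula concrete, here is a minimal Python sketch. The helper names `quasi_sum` and `power_generator` are illustrative and not taken from the paper, and the generator $u(x) = x^p$ is chosen only because the paper lists $p$-norms as an example of a quasi-sum. The sketch compares the linear generator $u(x) = x$ with the squared generator $u(x) = x^2$ on two loss sequences with the same ordinary sum.

```python
from typing import Callable, Sequence, Tuple


def quasi_sum(losses: Sequence[float],
              u: Callable[[float], float],
              u_inv: Callable[[float], float]) -> float:
    """Quasi-sum Q^u_n(x_1, ..., x_n) = u^{-1}(sum_i u(x_i))."""
    return u_inv(sum(u(x) for x in losses))


def power_generator(p: float) -> Tuple[Callable[[float], float],
                                       Callable[[float], float]]:
    """Illustrative generator u(x) = x**p on [0, inf) and its inverse (assumes p > 0)."""
    return (lambda x: x ** p), (lambda y: y ** (1.0 / p))


# Two loss sequences with the same ordinary sum (4.0):
losses_even = [1.0, 1.0, 1.0, 1.0]    # losses spread evenly
losses_spiky = [0.0, 0.0, 0.0, 4.0]   # one extreme loss

u_lin, u_lin_inv = power_generator(1.0)   # u(x) = x: ordinary sum
u_sq, u_sq_inv = power_generator(2.0)     # u(x) = x^2: Euclidean norm

sum_even = quasi_sum(losses_even, u_lin, u_lin_inv)
sum_spiky = quasi_sum(losses_spiky, u_lin, u_lin_inv)
sq_even = quasi_sum(losses_even, u_sq, u_sq_inv)
sq_spiky = quasi_sum(losses_spiky, u_sq, u_sq_inv)

# Associativity: aggregating partial aggregates matches aggregating everything.
nested = quasi_sum(
    [quasi_sum(losses_spiky[:2], u_sq, u_sq_inv),
     quasi_sum(losses_spiky[2:], u_sq, u_sq_inv)],
    u_sq, u_sq_inv)
```

Under the linear generator the two sequences are indistinguishable, while the squared generator rates the sequence with a single extreme loss as worse than the evenly spread one. This is one way to read the claim that the choice of generator expresses an attitude toward large losses.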

Abstract

Supervised learning has gone beyond the expected risk minimization framework. Central to most of these developments is the introduction of more general aggregation functions for losses incurred by the learner. In this paper, we turn towards online learning under expert advice. Via easily justified assumptions we characterize a set of reasonable loss aggregation functions as quasi-sums. Based upon this insight, we suggest a variant of the Aggregating Algorithm tailored to these more general aggregation functions. This variant inherits most of the nice theoretical properties of the AA, such as recovery of Bayes' updating and a time-independent bound on quasi-sum regret. Finally, we argue that generalized aggregations express the attitude of the learner towards losses.

Paper Structure (18 sections, 11 theorems, 80 equations, 2 figures, 3 tables)

Introduction
A simple introduction to aggregating algorithm
Vovk's aggregating algorithm
Bayes' updating for weights
Generalizing the loss aggregation in learning under expert advice
The aggregating algorithm for quasi-sums
An aggregating pseudo-algorithm for quasi-sums
Using the APA-QS to make predictions
A change of variables for the AA
AA-optimality for quasi-sums
Aggregation algorithm for non-mixable losses
AA-optimality
How aggregation changes prediction
Aggregation and utility of losses
Weighting profiles describe the updating step
...and 3 more sections

Key Result

Lemma 4.2

Let $\mathbf{A} \colon \bigcup_{n \in \mathbb{N}} [0,\infty)^n \longrightarrow [0, \infty)$ be an aggregation function. Suppose that $\mathbf{A}$ is continuous, strictly increasing, associative and loss compatible, i.e., it satisfies the axioms from continuity through loss compatibility. If furthermore $\mathbf{A}$ is positively homogeneous, then

Figures (2)

Figure 1: Graphical Summary of the Steps in the Aggregating Algorithm. Experts $\theta_1$ and $\theta_2$ provide predictions $\xi(\theta_1)$ and $\xi(\theta_2)$, respectively, which are placed in the simplex (top-left) as $x_1 \coloneqq (\xi(\theta_1), 1- \xi(\theta_1))$ and $x_2 \coloneqq (\xi(\theta_2), 1- \xi(\theta_2))$ via $s \mapsto (s,1-s)$. The log-loss embeds the simplex as a curve in $\mathbb{R}^2$ (top-right), i.e., $s \mapsto -\ln s$ is applied coordinate-wise and maps $x_1$ and $x_2$ to $x_1'$ and $x_2'$. Then, the exponential mapping projects them into $[0,1]^2$. The aggregating algorithm forms a convex combination $\psi$ of the projected predictions $x_1''$ and $x_2''$, called a pseudo-prediction, based on weights updated by a Bayesian-type formula (orange-brown). The pseudo-prediction is then mapped back to the simplex via a substitution function $\Sigma$ (dark green).
Figure 2: Comparative Example of Linear and Squared Utility. The horizontal axis denotes the loss value; the vertical axis denotes the negative utility of the loss. We compare the negative utility function $u(x) = x$ to $u(x)=x^2$, in particular at two values: a low value, highlighted by a dark-green arrow, and a high value, highlighted by an orange-brown arrow.
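The steps summarized in Figure 1 can be sketched in Python for the classical special case of binary outcomes, log-loss and learning rate 1. This is an assumption-laden illustration rather than the paper's APA-QS: in this special case the substitution step is known to reduce to the weighted average of the experts' probabilities, and the weight update coincides with Bayes' rule. All function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def log_loss(p: float, y: int) -> float:
    """Log-loss of forecasting probability p for outcome 1 when y in {0, 1} occurs."""
    return -math.log(p if y == 1 else 1.0 - p)


def aa_predict(weights: Sequence[float], probs: Sequence[float]) -> float:
    """Substitution step for log-loss with learning rate 1: the weighted mixture."""
    return sum(w * p for w, p in zip(weights, probs))


def aa_update(weights: Sequence[float], probs: Sequence[float], y: int) -> List[float]:
    """Multiply each weight by exp(-loss) and renormalize (Bayes' rule for log-loss)."""
    unnormalized = [w * math.exp(-log_loss(p, y)) for w, p in zip(weights, probs)]
    total = sum(unnormalized)
    return [v / total for v in unnormalized]


def run_aa(rounds: Sequence[Sequence[float]],
           outcomes: Sequence[int]) -> Tuple[float, List[float], List[float]]:
    """Run the sketch; return learner loss, per-expert losses and final weights."""
    n = len(rounds[0])
    weights = [1.0 / n] * n              # uniform prior over experts
    learner_loss = 0.0
    expert_losses = [0.0] * n
    for probs, y in zip(rounds, outcomes):
        prediction = aa_predict(weights, probs)
        learner_loss += log_loss(prediction, y)
        expert_losses = [L + log_loss(p, y) for L, p in zip(expert_losses, probs)]
        weights = aa_update(weights, probs, y)
    return learner_loss, expert_losses, weights


# Toy data: two experts, three rounds (made up for illustration).
rounds = [[0.9, 0.2], [0.8, 0.3], [0.7, 0.4]]
outcomes = [1, 1, 0]
learner_loss, expert_losses, final_weights = run_aa(rounds, outcomes)
```

In this setting the learner's cumulative loss exceeds the best expert's cumulative loss by at most $\ln n$ for $n$ experts, independently of the number of rounds. That is the kind of time-independent bound the abstract refers to, here in its standard sum-aggregated form.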

Theorems & Definitions (39)

Remark 3.1
Example 3.2
Definition 4.1: Aggregation Functions
Lemma 4.2: Axiomatical Characterization of Loss-Aggregations
proof
Definition 4.3: Aggregation as Quasi-Sum
Example 4.4: $p$-Norms
Lemma 4.5
proof
Example 4.6
...and 29 more
