Cost and Reward Infused Metric Elicitation

Chethan Bhateja; Joseph O'Brien; Afnaan Hashmi; Eva Prakash

TL;DR

This paper addresses the limitation of metric elicitation methods that rely solely on confusion matrices by incorporating bounded costs and rewards into the elicitation framework. It extends the multiclass metric elicitation approach DLPME, introducing a cost-reward augmented metric $\psi(\mathbf{d}, \mathbf{r}, \mathbf{c}) = \langle \mathbf{a^d}, \mathbf{d} \rangle + \langle \mathbf{a^c}, \mathbf{c} \rangle + \langle \mathbf{a^r}, \mathbf{r} \rangle$, and an algorithm that learns weight ratios by comparing classifiers and querying an oracle, using RBO for accuracies and Pareto-frontier-based decisions for costs/rewards. Experiments on synthetic data show rapid, logarithmic convergence toward the true metric with scalable queries, and the method provides a practical path to deploying metrics that reflect multi-objective trade-offs such as monetary cost and latency. The work also outlines future directions for real-data validation, non-linear utilities, and group-aware considerations, while addressing ethical and governance concerns in deployment.
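To make the linear form of $\psi$ concrete, the following is a minimal sketch of evaluating the metric and simulating a pairwise oracle comparison. The weight values, the two example classifiers, the function names, and the sign convention (costs entered as positive magnitudes with a negative weight) are all assumptions chosen for illustration; none of them comes from the paper.

```python
import numpy as np

def psi(a_d, a_c, a_r, d, c, r):
    """Linear cost-reward metric: <a_d, d> + <a_c, c> + <a_r, r>."""
    return float(np.dot(a_d, d) + np.dot(a_c, c) + np.dot(a_r, r))

def oracle_prefers_first(weights, first, second):
    """Simulated oracle: True if `first` scores at least as high as `second`.

    `weights` is a tuple (a_d, a_c, a_r); each classifier is a tuple (d, c, r)
    of diagonal confusion entries, cost values, and reward values.
    """
    a_d, a_c, a_r = weights
    return psi(a_d, a_c, a_r, *first) >= psi(a_d, a_c, a_r, *second)

# Hypothetical hidden weights: two classes, one cost, one reward.
# Assumption: costs are positive magnitudes, so their weight is negative.
hidden = (np.array([0.5, 0.3]), np.array([-0.1]), np.array([0.1]))

# Two hypothetical classifiers: A is more accurate but much more expensive.
clf_a = (np.array([0.4, 0.3]), np.array([0.8]), np.array([0.2]))
clf_b = (np.array([0.3, 0.3]), np.array([0.1]), np.array([0.2]))

a_is_preferred = oracle_prefers_first(hidden, clf_a, clf_b)
```

Under these made-up weights the cheaper classifier B wins despite its lower accuracy, which is the kind of trade-off a confusion-matrix-only metric cannot express.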

Abstract

In machine learning, metric elicitation refers to the selection of performance metrics that best reflect an individual's implicit preferences for a given application. Existing metric elicitation methods consider only metrics that depend on the accuracy values encoded in a model's confusion matrix. However, focusing solely on confusion matrices ignores other feasibility considerations, such as differing monetary costs or latencies. In our work, we build upon the multiclass metric elicitation framework of Hiranandani et al., extending their proposed Diagonal Linear Performance Metric Elicitation (DLPME) algorithm to account for additional bounded costs and rewards. Our experimental results on synthetic data demonstrate that our approach converges quickly to the true metric.

Paper Structure (13 sections, 2 theorems, 8 equations, 4 figures, 1 table, 1 algorithm)

Introduction
Preliminaries
Geometry of the Space of Cost and Reward Infused Classifiers
Method
Choosing Hypothesis Weights
Determining the Optimal Classifier
Optimal Weighted Accuracy Classifier
Cost-Accuracy Classifier
Algorithm
Experiments and Analysis
Future Work
Ethics and Society Review Statement
Conclusion

Key Result

Proposition 1

The space of diagonal confusions $\mathcal{D}$ is strictly convex, closed, and contained in the box $[0, \zeta_1] \times [0, \zeta_2] \times \dots \times [0, \zeta_k]$. The vertices of the space of diagonal confusions are given by $\zeta_i \mathbf{e}_i$ for all $i \in [k]$, where $\mathbf{e}_i$ denote

Figures (4)

Figure 1: Cost and reward infused metric elicitation framework, adapted from Hiranandani et al. (2019).
Figure 2: Confusion space of classifiers for the $k=3$ multiclass case.
Figure 3: We choose the Pareto frontier to be a quarter ellipse so that it covers all metric slopes. Note that "Attribute" above can correspond to either a cost or a reward. If "Attribute" is a reward, the vertical y axis is positive; if it is a cost, the vertical y axis is negative.
Figure 4: Plot showing the L1 error with each passing iteration (left). 3D representation of the algorithm learning with two classes and a single additional cost (right).
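The quarter-ellipse frontier of Figure 3 lends itself to a small worked example. The sketch below is not the paper's algorithm: it swaps in a plain ternary search over the ellipse angle, driven only by pairwise oracle answers, to recover the ratio between two positive weights. The semi-axes `a` and `b`, the hidden weights, and all function names are hypothetical, and only the reward case (positive vertical axis) is shown.

```python
import math

def frontier_point(t, a=1.0, b=1.0):
    """Point on a quarter ellipse with semi-axes a, b for t in [0, pi/2]."""
    return (a * math.cos(t), b * math.sin(t))

def elicit_ratio(prefers, a=1.0, b=1.0, iters=60):
    """Estimate w_y / w_x using only pairwise comparisons.

    For positive weights, w_x * a*cos(t) + w_y * b*sin(t) is concave in t,
    so a ternary search over t finds its maximiser t*. Setting the derivative
    to zero gives tan(t*) = (w_y * b) / (w_x * a), hence the returned ratio.
    """
    lo, hi = 0.0, math.pi / 2
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if prefers(frontier_point(m1, a, b), frontier_point(m2, a, b)):
            hi = m2  # the maximiser cannot lie to the right of m2
        else:
            lo = m1  # the maximiser cannot lie to the left of m1
    t_star = (lo + hi) / 2
    return a * math.tan(t_star) / b

# Hypothetical oracle with hidden weights w_x = 2, w_y = 1 (true ratio 0.5).
def hidden_oracle(p, q):
    score = lambda pt: 2.0 * pt[0] + 1.0 * pt[1]
    return score(p) >= score(q)

estimated_ratio = elicit_ratio(hidden_oracle)
```

Each iteration shrinks the search interval by a constant factor, so the number of oracle queries grows only logarithmically with the desired precision, which is consistent with the convergence behaviour described in the TL;DR.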

Theorems & Definitions (2)

Proposition 1: Confusion Space
Proposition 2: RBO Classifier
