Contributions to the Decision Theoretic Foundations of Machine Learning and Robust Statistics under Weakly Structured Information

Christoph Jansen

TL;DR

This habilitation addresses the challenge of building decision-theoretic foundations for machine learning and robust statistics under weak and non-standard information. It develops a unifying framework around preference systems $\mathcal{A}=[A,R_1,R_2]$ and generalized stochastic dominance (GSD) $R_{(\mathcal{A},\mathcal{M})}$ to integrate ordinal and partial cardinal information with credal uncertainty via $\mathcal{M}$. The work comprises ten Contributions across three parts: A (Decision-Theoretic Foundations) introduces elicitation, state-dependent preferences, and multi-target decision rules; B (Machine Learning under Weakly Structured Information) transfers these ideas to ML, enabling mixed-scale benchmarking and robust pseudo-label selection; C (Robust Statistics under Non-Standard Scales of Measurement) develops scale-robust orders and permutation-based tests for poset-valued data. Methodologically, it leverages linear programs to compute GSD-based comparisons, permutation tests with regularization, and depth-based models for posets, enabling information-efficient inference and robust benchmarking under non-standard data and uncertainty. The practical impact lies in providing principled, scalable tools for robust decision-making and reliable ML benchmarking when data do not conform to standard numerical scales or precise probabilistic assumptions.

Abstract

This habilitation thesis is cumulative and therefore collects and connects research that I (together with several co-authors) have conducted over the last few years. The core of the work is thus formed by the ten publications listed on page 5 under the name Contributions 1 to 10. The references to the complete versions of these articles are also found in this list, making them as easily accessible as possible for readers wishing to dive deep into the different research projects. The chapters following this thesis, namely Parts A to C and the concluding remarks, serve to place the articles in a larger scientific context, to (briefly) explain their respective content on a less formal level, and to highlight some interesting perspectives for future research in their respective contexts. Naturally, therefore, the following presentation has neither the level of detail nor the formal rigor that can (hopefully) be found in the papers. The purpose of the following text is to give the reader easy, high-level access to this interesting and important research field as a whole, thereby advertising it to a broader audience.

Paper Structure (23 sections, 16 equations, 6 figures)

Figures (6)

  • Figure 1: Organization of the contributions to this thesis. The paper highlighted in gray is not part of this habilitation thesis. However, it may be advisable to read it, as it discusses in detail the foundations of many concepts that are important for this thesis, in particular decision making based on preference systems. Apart from this, the graph is to be understood as follows: If a one-sided arrow leads from Contribution A to Contribution B, then B builds (in part) on A. If, on the other hand, A and B are connected by a double arrow, they are related in content without one building (even in part) mathematically on the other.
  • Figure 2: The image shows the Hasse graph of the relation $R_1$ of an exemplary preference system $\mathcal{A}$ on the finite consequence set $A = \{a_1, \dots , a_8\}$. The edges of the Hasse graph are sorted by intensity according to the color spectrum below. This symbolizes the relation $R_2$ of the exemplary preference system. For example, every representation $u \in \mathcal{U}_{\mathcal{A}}$ must satisfy $u(a_7)-u(a_3)> u(a_4)-u(a_1)$, as the blue preference is more intense than the red preference.
  • Figure 3: The figure shows an exemplary credal set on a finite set of states $S=\{s_1,s_2,s_3\}$. You can see directly that this can be identified (for finite $S$) with a convex polyhedron. We repeatedly take advantage of this fact in some of the publications to solve various problems using (mixed-integer) linear programming.
  • Figure 4: Label elicitation with three labels (yellow $= 3 >$ red $= 2 >$ blue $= 1$). Consider a decision problem with $A=\{a_1, \dots , a_8\}$, $S=\{s_1 , \dots , s_4\}$, $\mathcal{G}=\{X_1, X_2\}$, $X_1(S)=(a_8,a_5,a_2,a_3)$, $X_2(S)=(a_7,a_6,a_4,a_1)$, and $\mathcal{M}=\{\pi: \pi(\{s_1\}) \geq \pi(\{s_2\}) \geq\pi(\{s_4\}) \geq \pi(\{s_3\})\}$. After asking four ranking questions (see the individual elicitation steps above), we can conclude $4\cdot(\mathbb{E}_{\pi}(u \circ X_1)-\mathbb{E}_{\pi}(u \circ X_2)) = (u_8-u_7)-(u_6-u_5)+(u_3-u_1)-(u_4-u_2) >0$ for every $u \in \mathcal{U}_{\mathcal{A}^*}$. Hence, $X_1$ is optimal after asking only four simple ranking questions. (The example is similar to Example 1 in Contribution 1.)
  • Figure 5: The diagram shows a schematic comparison of the two different types of regularization in preference systems. Note that the identical diagram can be found in pmlr-v216-jansen23a.
  • ...and 1 more figure
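
The captions of Figures 3 and 4 rest on the fact that a credal set on finitely many states is a convex polyhedron, so lower and upper expectations become linear optimization problems. The following is a minimal Python sketch of that idea for the credal set $\mathcal{M}$ and the acts $X_1$, $X_2$ of Figure 4. Two things in it are assumptions and not taken from the thesis: the utility values in `u` are hypothetical (chosen only for illustration, not derived from the preference system of Figure 2 or the elicited system $\mathcal{A}^*$), and instead of calling an LP solver the sketch enumerates the extreme points of this particular ordinal credal set, which are the uniform distributions on the top-$k$ states of the ranking $s_1, s_2, s_4, s_3$. A linear objective attains its minimum and maximum over a polytope at an extreme point, so the enumeration yields the same values an LP would.

```python
from fractions import Fraction as F

# States are indexed in the order (s1, s2, s3, s4).
# Credal set of Figure 4: pi(s1) >= pi(s2) >= pi(s4) >= pi(s3).
RANKING = [0, 1, 3, 2]  # indices of s1, s2, s4, s3, most to least probable


def extreme_points(ranking):
    """Extreme points of the ordinal credal set: uniform on the top-k states."""
    n = len(ranking)
    points = []
    for k in range(1, n + 1):
        pi = [F(0)] * n
        for idx in ranking[:k]:
            pi[idx] = F(1, k)
        points.append(pi)
    return points


def expectation(pi, values):
    """Expected value of a state-wise payoff vector under pi."""
    return sum(p * v for p, v in zip(pi, values))


def lower_upper_expectation(values, vertices):
    """Min and max expectation over the polytope spanned by the vertices."""
    exps = [expectation(pi, values) for pi in vertices]
    return min(exps), max(exps)


# HYPOTHETICAL utility values u(a_1), ..., u(a_8); an assumption for illustration.
u = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 8}

# Acts from Figure 4, given as consequence indices per state (s1, ..., s4).
X1 = (8, 5, 2, 3)
X2 = (7, 6, 4, 1)

# State-wise utility difference u(X1(s)) - u(X2(s)).
diff = [u[a] - u[b] for a, b in zip(X1, X2)]

vertices = extreme_points(RANKING)
lower, upper = lower_upper_expectation(diff, vertices)

# Identity from the caption of Figure 4, evaluated at the uniform distribution.
uniform = [F(1, 4)] * 4
caption_rhs = (u[8] - u[7]) - (u[6] - u[5]) + (u[3] - u[1]) - (u[4] - u[2])

print("lower expectation of the difference:", lower)
print("upper expectation of the difference:", upper)
print("4 * uniform expectation:", 4 * expectation(uniform, diff), "=", caption_rhs)
```

For this hypothetical `u`, the lower expectation of the difference is positive, so $X_1$ has a higher expected utility than $X_2$ under every $\pi \in \mathcal{M}$. Establishing this for every $u \in \mathcal{U}_{\mathcal{A}^*}$ simultaneously, as the caption states, needs the elicited constraints on $u$ as well; these are not reproduced here.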

Theorems & Definitions (5)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5