Conformal Predictions for Probabilistically Robust Scalable Machine Learning Classification

Alberto Carlevaro; Teodoro Alamo Cantarero; Fabrizio Dabbene; Maurizio Mongelli

TL;DR

The paper addresses the need for probabilistic guarantees in binary classification by combining conformal prediction (CP) with scalable classifiers (SCs) to produce conformal safety regions (CSRs): regions of the input space in which the probability of observing the wrong label is bounded by a user-specified $\varepsilon$. It derives a score function directly from the classifier, $s(\boldsymbol{x},\hat{y}) = -\hat{y}\bar{\rho}(\boldsymbol{x})$, where $\bar{\rho}(\boldsymbol{x})$ solves $f_\theta(\boldsymbol{x},\bar{\rho}(\boldsymbol{x}))=0$, and defines the CSR $\Sigma_\varepsilon$ as the set of inputs whose conformal set contains only the safe label. The analytical link between the CSR and a level set of the SC is established via $\rho_\varepsilon = |s_\varepsilon|$ and $\mathcal{S}_\varepsilon = \{\boldsymbol{x}: f_\theta(\boldsymbol{x},\rho_\varepsilon)<0\}$, with $\mathcal{S}_\varepsilon \subseteq \Sigma_\varepsilon$ and equality when $s_\varepsilon\le 0$. The approach is validated on a DNS tunneling detection task, where it yields reliable and efficient conformal predictions. The framework thus offers region-specific probabilistic guarantees, improved interpretability, and potential regulatory advantages; future work extends to multi-class scenarios and broader domains.
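
As a rough illustration of the quantities named in the TL;DR, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical linear scalable classifier $f_\theta(\boldsymbol{x},\rho) = \rho - (\boldsymbol{w}^\top\boldsymbol{x} + b)$ that predicts $+1$ when $f_\theta < 0$, takes $+1$ as the safe label, and takes $s_\varepsilon$ to be the usual split-conformal quantile of the calibration scores. The parameters `w`, `b`, the synthetic data, and all function names are made up for the example.

```python
import numpy as np

# Hypothetical scalable classifier (an assumption, not taken from the paper):
# f(x, rho) = rho - (w @ x + b), increasing in rho; label +1 when f(x, rho) < 0.
def f_theta(X, rho, w, b):
    return rho - (X @ w + b)

def rho_bar(X, w, b):
    # Root of f(x, rho) = 0 in rho; closed form for this linear choice.
    return X @ w + b

def score(X, y_hat, w, b):
    # s(x, y_hat) = -y_hat * rho_bar(x)
    return -y_hat * rho_bar(X, w, b)

def conformal_quantile(cal_scores, eps):
    # Split-conformal threshold: the ceil((n + 1)(1 - eps))-th smallest score.
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - eps)))
    if k > n:
        return np.inf
    return np.sort(cal_scores)[k - 1]

def in_scattered_csr(X, s_eps, w, b):
    # The conformal set is exactly {+1}: label +1 conforms, label -1 does not.
    return (score(X, +1, w, b) <= s_eps) & (score(X, -1, w, b) > s_eps)

def in_analytical_csr(X, s_eps, w, b):
    # S_eps = {x : f(x, rho_eps) < 0} with rho_eps = |s_eps|.
    return f_theta(X, np.abs(s_eps), w, b) < 0

rng = np.random.default_rng(0)
w, b = np.array([1.0, -0.5]), 0.2  # made-up classifier parameters

# Synthetic calibration set with noisy labels (illustrative only).
X_cal = rng.normal(size=(200, 2))
y_cal = np.where(rho_bar(X_cal, w, b) + 0.3 * rng.normal(size=200) > 0, 1, -1)
s_eps = conformal_quantile(score(X_cal, y_cal, w, b), eps=0.1)

# Membership of new points in the scattered and analytical regions.
X_new = rng.normal(size=(500, 2))
sigma = in_scattered_csr(X_new, s_eps, w, b)
S = in_analytical_csr(X_new, s_eps, w, b)
```

Under these assumptions, every point flagged by `in_analytical_csr` is also flagged by `in_scattered_csr`, mirroring the inclusion $\mathcal{S}_\varepsilon \subseteq \Sigma_\varepsilon$; a nonlinear $f_\theta$ would need a numerical root-finder in place of the closed-form `rho_bar`.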

Abstract

Conformal predictions make it possible to define reliable and robust learning algorithms. But they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of scalable classifier was introduced to generalize the concept of classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined $\varepsilon$ error level. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.

Paper Structure (12 sections, 5 theorems, 32 equations, 6 figures)

Introduction
Context
Contribution
Background: Scalable Classifiers and Conformal Prediction
Scalable Classifiers
Conformal Prediction
Notion of Score Function for Scalable Classifiers and Conformal Safety Sets
Natural definition of Score Function for Scalable Classifiers
Conformal Safety Regions
Analytical Form of Conformal Safety Regions for Scalable Classifiers
A real world application: detection of SSH-DNS tunnelling
Conclusions

Key Result

Proposition 1

Figures (6)

Figure 1: Relationship between the SVDD classifier and the corresponding score function: the absolute value of the score function assigns to a sample its distance to the circumference boundary. The color bar on the right helps to understand the behavior of the score function: darker colors indicate regions with less conformity with the target class, warmer the opposite. The zero value of the score function is obtained exactly on the boundary.
Figure 2: Scatter plots of the conformal set for varying $\varepsilon$, for cubic LR. Green and red points correspond to singleton conformal sets ($C_\varepsilon(\boldsymbol{x})=\{+1\}$ and $C_\varepsilon(\boldsymbol{x})=\{-1\}$, respectively), yellow points to double predictions ($C_\varepsilon(\boldsymbol{x})=\{+1,-1\}$), and black points to empty predictions ($C_\varepsilon(\boldsymbol{x})=\varnothing$).
Figure 3: CSR computed with a Gaussian SVM at $\varepsilon = 0.05$. The scattered CSR $\Sigma_\varepsilon$ (a) coincides with the analytical CSR $\mathcal{S}_\varepsilon$ (b), which coincides with the level set $z = \rho_\varepsilon$ of the score function (c). Panel (d) is the planar representation of the score function on the $x_1$-$x_2$ plane.
Figure 4: Trend of the average error as $\varepsilon$ varies in $[0.05, 0.5]$ for different classifiers. The errors vary in $[0,0.6]$ for SVM, $[0,0.8]$ for SVDD and $[0,0.6]$ for LR.
Figure 5: Trend of the average size of conformal sets as $\varepsilon$ varies in $[0.05, 0.5]$ for different classifiers. The size varies from 0 (empty) to 1 (full).
...and 1 more figures

Theorems & Definitions (17)

Definition 1: Score Function for Scalable Classifier
Example 1
Definition 2: Conformal Safety Region
Example 2
Proposition 1
proof
Corollary 1
proof
Proposition 2
proof
...and 7 more
