Functional worst risk minimization

Philip Kennerberg, Ernst C. Wit

TL;DR

This work extends worst-risk minimization to the functional domain by embedding the target and covariates in $L^2$ spaces and using a (potentially unbounded) transfer operator $\mathcal{T}$ for which $(I-\mathcal{T})^{-1}$ is bounded. It derives a functional worst-risk decomposition that expresses the worst-case out-of-sample risk over shifts in an out-of-sample set $C^\gamma_{\mathcal{A}}(A)$ as a linear combination of the pooled risk and the risk difference between environments, mirroring the non-functional case: $\sup_{A'\in C^\gamma_{\mathcal{A}}(A)} R_{A'}(\beta) = \tfrac{1}{2}R_+(\beta) + (\gamma-\tfrac{1}{2})R_\Delta(\beta)$. The authors establish a necessary and sufficient condition for the existence of minimizers in $(L^2([T_1,T_2]^2))^p$, provide representations in arbitrary ON-bases that avoid eigenfunction estimation, and prove consistency of a family of estimators for practical implementation. Through theoretical development and finite-sample illustrations, the paper demonstrates robust out-of-sample prediction under distributional shifts in functional regression. Overall, the framework offers a rigorous, operator-based route to robust functional learning beyond score-space or RKHS approaches.
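
As a reading aid, the decomposition can be unpacked under an assumption that this summary does not state explicitly: that, as in the non-functional setting, the pooled risk is $R_+ = R_A + R_O$ and the risk difference is $R_\Delta = R_A - R_O$, where $R_A$ and $R_O$ (names assumed here) denote the risks in the shifted and observational environments. Under that assumption the right-hand side collapses to a weighted combination of the two environment risks:

```latex
\begin{align*}
\tfrac{1}{2}R_+(\beta) + \left(\gamma-\tfrac{1}{2}\right)R_\Delta(\beta)
  &= \tfrac{1}{2}\bigl(R_A(\beta)+R_O(\beta)\bigr)
   + \left(\gamma-\tfrac{1}{2}\right)\bigl(R_A(\beta)-R_O(\beta)\bigr) \\
  &= \gamma\,R_A(\beta) + (1-\gamma)\,R_O(\beta).
\end{align*}
```

Read this way, $\gamma = \tfrac{1}{2}$ gives half the pooled risk, and larger $\gamma$ places more weight on the shifted environment. The paper's own definitions of $R_+$ and $R_\Delta$ should be checked before relying on this form.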

Abstract

The aim of this paper is to extend worst risk minimization, also called worst average loss minimization, to the functional realm. This means finding a functional regression representation that will be robust to future distribution shifts on the basis of data from two environments. In the classical non-functional realm, structural equations are based on a transfer matrix $B$. In section~\ref{sec:sfr}, we generalize this to consider a linear operator $\mathcal{T}$ on square integrable processes that plays the part of $B$. Requiring that $(I-\mathcal{T})^{-1}$ is bounded, rather than $\mathcal{T}$ itself, allows a large class of unbounded operators to be considered. Section~\ref{sec:worstrisk} considers two separate cases that both lead to the same worst-risk decomposition. Remarkably, this decomposition has the same structure as in the non-functional case. We consider any operator $\mathcal{T}$ that makes $(I-\mathcal{T})^{-1}$ bounded and define the future shift set in terms of the covariance functions of the shifts. In section~\ref{sec:minimizer}, we prove a necessary and sufficient condition for existence of a minimizer to this worst risk in the space of square integrable kernels. Previously, such minimizers were expressed in terms of the unknown eigenfunctions of the target and covariate integral operators (see for instance \cite{HeMullerWang} and \cite{YaoAOS}). This means that in order to estimate the minimizer, one must first estimate these unknown eigenfunctions. In contrast, the solution provided here is expressed in an arbitrary ON-basis, which removes any need to estimate eigenfunctions. This pays dividends in section~\ref{sec:estimation}, where we provide a family of consistent estimators with a large-sample bound. Proofs of all the results are provided in the appendix.
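
To make the point about arbitrary ON-bases concrete, the following sketch represents a discretized curve in a fixed, known orthonormal basis, with no eigenfunctions estimated. It is an illustration only and not the paper's estimator: the cosine basis, the trapezoidal quadrature, the grid, and all function names (`cosine_basis`, `basis_scores`, `reconstruct`) are assumptions chosen for this example.

```python
import numpy as np


def cosine_basis(t, k, t1=0.0, t2=1.0):
    """k-th element of the orthonormal cosine basis of L^2([t1, t2])."""
    length = t2 - t1
    if k == 0:
        return np.full_like(t, 1.0 / np.sqrt(length))
    return np.sqrt(2.0 / length) * np.cos(k * np.pi * (t - t1) / length)


def trapezoid(values, t):
    """Trapezoidal approximation of the integral of `values` over the grid `t`."""
    dt = np.diff(t)
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * dt))


def basis_scores(x, t, n_basis):
    """Approximate inner products <x, e_k> for k = 0, ..., n_basis - 1."""
    return np.array(
        [trapezoid(x * cosine_basis(t, k, t[0], t[-1]), t) for k in range(n_basis)]
    )


def reconstruct(scores, t):
    """Rebuild the curve from its scores in the same cosine basis."""
    return sum(c * cosine_basis(t, k, t[0], t[-1]) for k, c in enumerate(scores))


# Toy curve on [0, 1] that lies in the span of the first few basis functions.
t = np.linspace(0.0, 1.0, 2001)
x = 1.0 + 0.5 * np.cos(3.0 * np.pi * t)

scores = basis_scores(x, t, n_basis=6)
x_hat = reconstruct(scores, t)
max_error = float(np.max(np.abs(x_hat - x)))
```

Because the basis is fixed in advance, the only quantities computed from data are the scores. An eigenfunction-based representation would instead require estimating the basis itself from the covariance structure of the sample.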

Paper Structure

This paper contains 26 sections, 11 theorems, 316 equations, 4 figures.

Key Result

Lemma 3.1

If $A'_n\xrightarrow{\mathcal{V}}A'$ then $R_{A'_n}(\beta)\to R_{A'}(\beta)$ for any $\beta\in (L^2([T_1,T_2]^2))^p$.

Figures (4)

  • Figure 1: Observational environment: a functional system that serves as an illustration of a structural system throughout the manuscript, in which $X(1)$ is the cause of $Y$ and $Y$ is the cause of $X(2)$. Our aim is to minimize the out-of-distribution prediction error of $Y$ using both $X(1)$ and $X(2)$.
  • Figure 2: Interventional environment: the structural functional system is also observed under slightly intervened conditions. In particular, the scores $\xi_1$ of $X_t(1)$ and $\xi_2$ of $X_t(2)$ are affected by shifts $A_1$ and $A_2$, respectively.
  • Figure 3: Sample $(Y^A,X^A(1),X^A(2))$ from the shifted environment. Note that both $X(1)$ and $X(2)$ seem quite predictive for $Y$, but only $X(1)$ is causal --- and therefore $X(1)$ has the most robust out-of-sample risk behaviour, if $Y$ is not intervened, as in this example.
  • Figure 4: Functional regression coefficients $\beta_{x_1y}$ and $\beta_{x_2y}$ shown on the same x-y-z scale: (a) true causal parameters; (b) population values obtained by pooling the two data environments, which "mistakenly" find that $X(2)$ affects $Y$; (c) population values minimizing the out-of-sample risk in $\mathcal{C}_{\gamma=500}$, largely recovering the causal parameters; (d) empirical estimates in a data setting with $n=1000$ samples in an observational and a slightly perturbed environment, using regularization parameter $\gamma=10$.

Theorems & Definitions (38)

  • Example 2.1
  • Example 2.2
  • Remark 2.3
  • Example 2.4
  • Example 2.5
  • Remark 2.6
  • Lemma 3.1
  • Definition 3.2
  • Proposition 3.3
  • Proposition 3.4
  • ...and 28 more