Functional worst risk minimization
Philip Kennerberg, Ernst C. Wit
TL;DR
This work extends worst-risk minimization to the functional domain by embedding target and covariates in $L^2$ spaces and using a (potentially unbounded) transfer operator $\mathcal{T}$ with $(I-\mathcal{T})^{-1}$ bounded. It derives a functional worst-risk decomposition that upper-bounds the worst-case out-of-sample risk over shifts in an out-of-sample set $C^\gamma_{\mathcal{A}}(A)$ by a linear combination of the pooled risk and the risk difference between environments, mirroring the non-functional case: $\sup_{A'\in C^\gamma_{\mathcal{A}}(A)} R_{A'}(\beta) = \tfrac{1}{2}R_+(\beta) + (\gamma-\tfrac{1}{2})R_\Delta(\beta)$. The authors establish necessary and sufficient conditions for the existence and uniqueness of minimizers in $L^2([T_1,T_2]^2)^p$, provide representations in arbitrary ON-bases that avoid eigenfunction estimation, and prove consistency of estimators for practical implementation. Through theoretical development and finite-sample illustrations, the paper demonstrates robust out-of-sample prediction under distributional shifts in functional regression, with potential impact on fields requiring reliable performance under changing conditions. Overall, the framework offers a rigorous, operator-based route to robust functional learning beyond score-space or RKHS approaches.
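To see how the decomposition weighs the two environments, it can be rewritten as a combination of the individual environment risks. The sketch below assumes, as in the non-functional setting, that $R_+ = R_A + R_O$ and $R_\Delta = R_A - R_O$, where $R_A$ and $R_O$ are labels introduced here for the risks in the shifted and the observational environment. These definitions are not stated in this summary and should be checked against the paper.
\[
\tfrac{1}{2}R_+(\beta) + (\gamma-\tfrac{1}{2})R_\Delta(\beta)
= \tfrac{1}{2}\bigl(R_A(\beta)+R_O(\beta)\bigr) + (\gamma-\tfrac{1}{2})\bigl(R_A(\beta)-R_O(\beta)\bigr)
= \gamma R_A(\beta) + (1-\gamma) R_O(\beta).
\]
Under this assumption, $\gamma=\tfrac{1}{2}$ gives half the pooled risk, $\gamma=1$ gives $R_A(\beta)$, and $\gamma>1$ extrapolates beyond the observed shift by placing negative weight on $R_O(\beta)$.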
Abstract
The aim of this paper is to extend worst risk minimization, also called worst average loss minimization, to the functional realm. This means finding a functional regression representation that is robust to future distribution shifts on the basis of data from two environments. In the classical non-functional realm, structural equations are based on a transfer matrix $B$. In section~\ref{sec:sfr}, we generalize this by considering a linear operator $\mathcal{T}$ on square integrable processes that plays the part of $B$. Requiring that $(I-\mathcal{T})^{-1}$, rather than $\mathcal{T}$ itself, is bounded allows a large class of unbounded operators to be considered. Section~\ref{sec:worstrisk} considers two separate cases that both lead to the same worst-risk decomposition. Remarkably, this decomposition has the same structure as in the non-functional case. We consider any operator $\mathcal{T}$ that makes $(I-\mathcal{T})^{-1}$ bounded and define the future shift set in terms of the covariance functions of the shifts. In section~\ref{sec:minimizer}, we prove a necessary and sufficient condition for the existence of a minimizer of this worst risk in the space of square integrable kernels. Previously, such minimizers were expressed in terms of the unknown eigenfunctions of the target and covariate integral operators (see for instance \cite{HeMullerWang} and \cite{YaoAOS}), so that estimating the minimizer required first estimating these eigenfunctions. In contrast, the solution provided here can be expressed in an arbitrary ON-basis, which removes the need to estimate eigenfunctions. This pays dividends in section~\ref{sec:estimation}, where we provide a family of estimators that are consistent, together with a large sample bound. Proofs of all the results are provided in the appendix.
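The boundedness condition on $(I-\mathcal{T})^{-1}$ can be illustrated with a standard one-dimensional example. It is an illustration chosen here, not one taken from the paper, and it acts on functions in $L^2([0,1])$ instead of on square integrable processes. Let $V$ be the Volterra integration operator and define $\mathcal{T} = I - V^{-1}$ on the range of $V$:
\[
(Vf)(t) = \int_0^t f(s)\,ds, \qquad \|Vf\|_{L^2} \le \|f\|_{L^2}, \qquad (I-\mathcal{T})^{-1} = V .
\]
Here $V^{-1}$ is differentiation on absolutely continuous functions $g$ with $g(0)=0$ and $g'\in L^2([0,1])$. It is unbounded: for $g_n(t)=\sin(n\pi t)$,
\[
\|g_n\|_{L^2} = \tfrac{1}{\sqrt{2}}, \qquad \|g_n'\|_{L^2} = \tfrac{n\pi}{\sqrt{2}} \to \infty .
\]
So $\mathcal{T}$ is unbounded while $(I-\mathcal{T})^{-1}=V$ is bounded, which is the type of operator the condition is meant to admit.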
