Approximately optimal domain adaptation with Fisher's Linear Discriminant
Hayden S. Helm, Ashwin De Silva, Joshua T. Vogelstein, Carey E. Priebe, Weiwei Yang
TL;DR
This work introduces a Fisher's Linear Discriminant (FLD)–based, data-adaptive domain adaptation framework that convexly combines a source-task average classifier with a target-task classifier to improve performance under limited target data. The authors derive an analytically tractable risk expression under a generative model and provide a computable approximation for selecting the optimal convex weight $\alpha^*$ based on asymptotic distributions of projection vectors. Through simulations and three physiological prediction tasks (EEG/ECG), they demonstrate that the approximately optimal classifier often outperforms both the average-source and target classifiers, particularly in low-data regimes, and they discuss practical considerations such as privacy, computational cost, and visualization of projection vectors. Limitations include the two-class setting and a single global $\alpha$, with future directions pointing toward multi-class extensions and multimodal source modeling to capture more complex task distributions.
Abstract
We propose a class of models based on Fisher's Linear Discriminant (FLD) in the context of domain adaptation. The class is the convex combination of two hypotheses: i) an average hypothesis representing previously seen source tasks and ii) a hypothesis trained on a new target task. For a particular generative setting we derive the optimal convex combination of the two models under 0-1 loss, propose a computable approximation, and study the effect of various parameter settings on the relative risks between the optimal hypothesis, hypothesis i), and hypothesis ii). We demonstrate the effectiveness of the proposed optimal classifier in the context of EEG- and ECG-based classification settings and argue that the optimal classifier can be computed without access to direct information from any of the individual source tasks. We conclude by discussing further applications, limitations, and possible future directions.
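To make the convex combination concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes binary labels in {0, 1}, a pooled within-class covariance estimate, unit-norm projection vectors, and data centered so that a decision threshold of zero is reasonable. The function names (`fld_direction`, `combine`, `predict`) and the convention that `alpha` weights the target hypothesis are illustrative choices. The weight `alpha` is taken as an input here; the approximation used to select it is not reproduced.

```python
import numpy as np


def fld_direction(X, y):
    """Unit-norm FLD projection vector for binary labels y in {0, 1}."""
    X0, X1 = X[y == 0], X[y == 1]
    mean_diff = X1.mean(axis=0) - X0.mean(axis=0)
    # Pooled within-class covariance (assumed estimator for this sketch).
    centered = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
    cov = centered.T @ centered / (len(X) - 2)
    w = np.linalg.solve(cov, mean_diff)
    return w / np.linalg.norm(w)


def combine(w_source_avg, w_target, alpha):
    """Convex combination of the two projection vectors, renormalized.

    Assumed convention: alpha = 1 recovers the target hypothesis and
    alpha = 0 recovers the average-source hypothesis.
    """
    w = alpha * w_target + (1.0 - alpha) * w_source_avg
    return w / np.linalg.norm(w)


def predict(w, X):
    """Classify by the sign of the projection (threshold 0 is an assumption)."""
    return (X @ w > 0).astype(int)
```

In this sketch the average-source vector would be formed from the per-task outputs of `fld_direction`, for example their normalized mean, so only the projection vectors, not the source data, are needed at combination time.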
