Learning from Complementary Features
Kosuke Sugiyama, Masato Uchida
TL;DR
Complementary Feature Learning (CFL) addresses learning when some inputs are complementary features (CFs), which indicate what a feature value is not rather than what it is. The paper derives an information-theoretic objective that upper-bounds the standard supervised loss through the terms $J_{KL}$ and $J_{MI}$. The method first estimates the exact values behind the CFs with a graph-based confidence propagation scheme (analyzed under a hypothetical self-referenced setting) and then trains the label predictor on these estimates. Practical approximations, such as margin-based confidences and $k$-nearest-neighbor weight estimation, keep the procedure tractable. Empirical results on the Bank Marketing and Adult datasets show that the proposed approach improves CF-value estimation and downstream prediction in many cases, especially for CFs with few unique values. The results also give guidance on when to use soft versus hard CF estimates and on how to select CFs. The work targets learning under privacy or cost constraints, with implications for interpretable decision-making and robust predictive modeling in domains where feature observability is restricted.
Abstract
While precise data observation is essential for the learning processes of predictive models, it can be challenging owing to factors such as insufficient observation accuracy, high collection costs, and privacy constraints. In this paper, we examine cases where some qualitative features are available not as precise information indicating "what it is," but only as complementary information indicating "what it is not." We refer to features defined by precise information as ordinary features (OFs) and those defined by complementary information as complementary features (CFs). We then formulate a new learning scenario termed Complementary Feature Learning (CFL), in which predictive models are constructed from instances consisting of OFs and CFs. The simplest formalization of CFL applies conventional supervised learning directly to the observed values of CFs. However, this approach does not resolve the ambiguity associated with CFs, which makes learning difficult and complicates the interpretation of the model's individual predictions. Therefore, we derive an objective function from an information-theoretic perspective to estimate the OF values corresponding to CFs and to predict output labels based on these estimates. Based on this objective function, we propose a theoretically guaranteed graph-based method, along with a practical approximation of it, for estimating the OF values corresponding to CFs. Numerical experiments on real-world data demonstrate that the proposed method effectively estimates the OF values corresponding to CFs and predicts output labels.
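To make the setting concrete, the following is a minimal sketch of how a CF observation ("the value is not $v$") might be turned into a confidence vector over the remaining candidate values, and how such confidences could be smoothed over a $k$-nearest-neighbor graph built on the OFs. It is only an illustration under stated assumptions, not the estimation method proposed in the paper: the function names `cf_to_confidence` and `propagate`, the uniform initialization, the 50/50 blending rule, and the toy data are all hypothetical choices made for this example.

```python
import numpy as np

def cf_to_confidence(not_values, n_values):
    """Uniform confidence over every value except the one each CF rules out."""
    conf = np.ones((len(not_values), n_values))
    conf[np.arange(len(not_values)), not_values] = 0.0
    return conf / conf.sum(axis=1, keepdims=True)

def propagate(conf, ofs, k=2, n_iter=10):
    """Smooth confidences over a k-NN graph on the OFs (illustrative rule only).

    Each round blends an instance's confidence with the mean confidence of its
    k nearest neighbors, then re-applies the CF constraint (ruled-out values
    stay at zero) and renormalizes.
    """
    mask = conf > 0                              # values still allowed by the CF
    dists = np.linalg.norm(ofs[:, None, :] - ofs[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)              # an instance is not its own neighbor
    nbrs = np.argsort(dists, axis=1)[:, :k]      # indices of the k nearest neighbors
    cur = conf.copy()
    for _ in range(n_iter):
        blended = 0.5 * cur + 0.5 * cur[nbrs].mean(axis=1)
        blended = blended * mask
        cur = blended / blended.sum(axis=1, keepdims=True)
    return cur

# Toy data (hypothetical): one numeric OF, one 3-valued feature seen only as a CF.
ofs = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
not_values = np.array([1, 2, 1, 0, 2, 0])        # "the value is not ..."

soft = propagate(cf_to_confidence(not_values, n_values=3), ofs)  # soft estimates
hard = soft.argmax(axis=1)                                       # hard estimates
```

Here `soft` keeps a full distribution over candidate values for each instance, while `hard` commits to a single value; both could then be fed to a label predictor in place of the raw CF observation.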
