Conditional Temporal Neural Processes with Covariance Loss

Boseon Yoo, Jiwoo Lee, Janghoon Ju, Seijun Chung, Soyeon Kim, Jaesik Choi

TL;DR

The paper addresses regression under noisy or partially informative observations by introducing Covariance Loss, a regularization that aligns learned basis-function covariances with empirical target covariances to capture dependencies akin to Gaussian processes and conditional neural processes. By combining an MSE term with a covariance-matching regularizer, the approach guides neural networks to reflect target-variable dependencies in the learned representations. The authors analyze the induced constraints, connect Covariance Loss to CNPs, and validate the method across classification and spatio-temporal regression tasks using STGCN and GWNET on traffic datasets, showing improved robustness and accuracy. This work offers a practical, architecture-agnostic path to incorporate dependency structure into neural models, with potential impact on robust time-series forecasting and structured prediction.

Abstract

We introduce a novel loss function, Covariance Loss, which is conceptually equivalent to conditional neural processes and takes the form of a regularizer, so it is applicable to many kinds of neural networks. With the proposed loss, mappings from input variables to target variables are strongly influenced by the dependencies among target variables, as well as by the mean activation and mean dependencies of the input and target variables. This property makes the resulting neural networks more robust to noisy observations and lets them recapture missing dependencies from prior information. To show the validity of the proposed loss, we conduct extensive experiments on real-world datasets with state-of-the-art models and discuss the benefits and drawbacks of the proposed Covariance Loss.
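As a rough illustration of the idea summarized above (an MSE term plus a regularizer that pulls the covariance of the learned basis functions toward the empirical covariance of the targets), the following NumPy sketch may help. It is not the authors' formulation: the centering, the normalization, the squared-difference penalty, and the names `covariance_loss`, `basis`, and `weight` are all assumptions chosen for illustration.

```python
import numpy as np


def covariance_loss(basis, targets, preds, weight=1.0):
    """MSE plus a covariance-matching penalty (illustrative form only).

    basis:   (N, D) basis-function activations for N samples,
             e.g. the last hidden layer (assumed choice)
    targets: (N, K) target values
    preds:   (N, K) model predictions
    weight:  hypothetical trade-off coefficient for the regularizer
    """
    mse = np.mean((preds - targets) ** 2)

    # Center across the batch so the Gram matrices below behave like
    # empirical sample-by-sample covariances.
    b = basis - basis.mean(axis=0, keepdims=True)
    t = targets - targets.mean(axis=0, keepdims=True)

    cov_basis = b @ b.T / b.shape[1]    # (N, N) covariance between samples
    cov_target = t @ t.T / t.shape[1]   # (N, N) covariance between samples

    # Penalize any mismatch between the two covariance structures.
    penalty = np.mean((cov_basis - cov_target) ** 2)
    return mse + weight * penalty
```

Under these assumptions the penalty vanishes when the basis functions reproduce the dependency structure of the targets exactly, and setting `weight=0.0` recovers plain MSE, which is what lets the term be attached to an existing model as a regularizer.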

Paper Structure

This paper contains 20 sections, 10 equations, 18 figures, 3 tables.

Figures (18)

  • Figure 1: Covariance Loss
  • Figure 2: The distribution of the effect of cross-terms for prediction (i.e. contribution) in log-scale, measured by Equation \ref{eq:cross-term-contribution} on PeMSD7(M) dataset. The fraction of cross-terms that have zero effect: 33% (STGCN), 87% (STGCN-Cov)
  • Figure 3: Covariance matrices of (a) the input X, (b) the basis functions of DNN, (c) the basis functions of DNN-Cov, and (d) the one-hot-encoded labels. Brighter color indicates higher covariance. (b) is similar to (a), and some basis functions have high covariance with basis functions that belong to other classes. In contrast, (c) is similar to (d), and the basis functions of each class are mutually exclusive.
  • Figure 4: An ambiguous sample in MNIST dataset. The sample (left) is similar to samples of class '0' and the prediction with mean activation of input variable is incorrect (middle). In contrast, the optimization with Covariance Loss shows correct prediction for the ambiguous sample (right).
  • Figure 5: While variance of basis function of STGCN is irrelevant to that of target variables (left), basis function of STGCN-Cov has variance of target variables (right).
  • ...and 13 more figures

Theorems & Definitions (3)

  • Definition 1
  • Definition 2
  • Definition 3