
Neural Network Models for Contextual Regression

Seksan Kiatsupaibul, Pakawan Chansiripas

Abstract

We propose a neural network model for contextual regression in which the regression model depends on contextual features that determine the active submodel and an algorithm to fit the model. The proposed simple contextual neural network (SCtxtNN) separates context identification from context-specific regression, resulting in a structured and interpretable architecture with fewer parameters than a fully connected feed-forward network. We show mathematically that the proposed architecture is sufficient to represent contextual linear regression models using only standard neural network components. Numerical experiments are provided to support the theoretical result, showing that the proposed model achieves lower excess mean squared error and more stable performance than feed-forward neural networks with comparable numbers of parameters, while larger networks improve accuracy only at the cost of increased complexity. The results suggest that incorporating contextual structure can improve model efficiency while preserving interpretability.
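To make the separation of context identification from context-specific regression concrete, the following is a minimal sketch of the idea described in the abstract, not the authors' implementation: a hypothetical `SCtxtNNSketch` in which a softmax gate over the contextual features selects among `c` linear regression heads, and the output is the gated combination of the heads. All names and dimensions here are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax along the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class SCtxtNNSketch:
    """Hypothetical sketch of a simple contextual network:
    a gate network reads the contextual features and softly selects
    one of c context-specific linear regression heads."""

    def __init__(self, d_ctx, d_reg, c, seed=0):
        rng = np.random.default_rng(seed)
        self.Wg = 0.1 * rng.normal(size=(d_ctx, c))  # gate weights
        self.bg = np.zeros(c)                        # gate biases
        self.W = 0.1 * rng.normal(size=(c, d_reg))   # per-context regression weights
        self.b = np.zeros(c)                         # per-context intercepts

    def forward(self, x_ctx, x_reg):
        gate = softmax(x_ctx @ self.Wg + self.bg)    # (n, c) soft context assignment
        heads = x_reg @ self.W.T + self.b            # (n, c) per-context predictions
        return (gate * heads).sum(axis=1)            # (n,) gated combination
```

Because each head is a plain linear model and only the gate is shared, the parameter count grows as `c * (d_reg + 1)` plus the gate, which is the structural economy the abstract contrasts with a fully connected feed-forward network.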

Paper Structure

This paper contains 7 sections, 1 theorem, 34 equations, 3 figures, and 1 table.

Key Result

Proposition 1

Let $S\subset\hat{\mathcal{X}}$ be a compact subset of the regressor space and let $T$ be a bounded interval in $\mathbb R$ such that $T\cap I_j \neq \emptyset$ for $j=1,\ldots,c$. For a simple contextual linear regression model, there exists an SCtxtNN with output $y$ such that

Figures (3)

  • Figure 1: Simple contextual neural network (SCtxtNN) architecture.
  • Figure 2: Training and validation MSE over epochs for SCtxtNN, Small FF, and Large FF models.
  • Figure 3: Excess test MSE over 50 simulations for SCtxtNN, Small FF, and Large FF. Excess MSE is defined as test MSE minus the noise variance, so that zero corresponds to the optimal achievable error.

Theorems & Definitions (2)

  • Proposition 1
  • Proof of Proposition 1