DCIts -- Deep Convolutional Interpreter for time series

Davor Horvatic, Domjan Baric

TL;DR

DCIts tackles the challenge of interpretable forecasting for multivariate time series by learning a transition operator α via a two-stage Focuser–Modeler architecture that factorizes α into α = C ∘ F. The approach uses a sliding window Q_t and a rich convolutional backbone to produce interpretable, per-sample coefficients that reveal which series and lags drive predictions, and extends to higher-order interactions with bias terms. Quantitative tests on Barić et al. benchmarks show DCIts matching or beating a strong baseline (IMV-LSTM) with superior interpretability, and qualitative analyses demonstrate faithful recovery of autoregressive, cross-correlated, and switching dynamics, including nonlinear extensions. The method provides a practical pathway to mechanistic insight in dynamic systems, enabling reconstruction of generating equations and identification of key drivers while maintaining forecasting accuracy; future work will address computation for very large or high-frequency data and validation on real-world domains.
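
The factorization above can be made concrete with a minimal NumPy sketch. It reads ∘ as an element-wise product and assumes the window Q_t has shape (N, L) while the Focuser output F and Modeler output C both have shape (N, N, L); these layouts and the name `dcits_step` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def dcits_step(Q, F, C):
    """One-step forecast from a window Q of shape (N, L).

    F: Focuser output with values in [0, 1], shape (N, N, L) (assumed layout).
    C: Modeler coefficients of either sign, shape (N, N, L) (assumed layout).
    """
    # Element-wise factorization alpha = C ∘ F (assumed reading of ∘).
    alpha = C * F
    # x_next[i] = sum over source series j and lags l of alpha[i, j, l] * Q[j, l]
    x_next = np.einsum("ijl,jl->i", alpha, Q)
    return x_next, alpha

rng = np.random.default_rng(0)
N, L = 3, 4
Q = rng.normal(size=(N, L))

# Hypothetical coefficients: series 0 is driven only by the most recent
# value of series 1, with a negative (anti-correlated) weight.
F = np.zeros((N, N, L))
C = np.zeros((N, N, L))
F[0, 1, -1] = 1.0
C[0, 1, -1] = -0.5

x_next, alpha = dcits_step(Q, F, C)
```

In this toy case `alpha` is nonzero in a single entry, so reading off `alpha[0]` shows directly which series and lag drive the forecast of series 0. That is the kind of per-sample interpretation the TL;DR describes.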

Abstract

We introduce an interpretable deep learning model for multivariate time series forecasting that prioritizes both predictive performance and interpretability, two key requirements for understanding complex physical phenomena. Our model not only matches but often surpasses existing interpretability methods, achieving this without compromising accuracy. Through extensive experiments, we demonstrate its ability to identify the most relevant time series and lags that contribute to forecasting future values, providing intuitive and transparent explanations for its predictions. To minimize the need for manual supervision, the model is designed so one can robustly determine the optimal window size that captures all necessary interactions within the smallest possible time frame. Additionally, it effectively identifies the optimal model order, balancing complexity when incorporating higher-order terms. These advancements hold significant implications for modeling and understanding dynamic systems, making the model a valuable tool for applied and computational physicists.

Paper Structure (17 sections, 25 equations, 15 figures, 1 table, 1 algorithm)


Figures (15)

  • Figure 1: High-level architecture of DCIts, which consists of two modules, the Focuser and the linear Modeler. The Focuser uses the input to identify the most important data points, i.e. the parts of the input that impact the output. The linear Modeler calculates linear coefficients for each time series and lag, which are used to compute the next step in the time series. The final result $\bm{X}_t$ is calculated as the product of the input, the Focuser output, and the Modeler output. Section \ref{higher-order} shows how to extend the architecture to include nonlinear parts.
  • Figure 2: The left figure illustrates the architecture of the Focuser, while the right figure depicts that of the Modeler. The two share the same foundational design: each begins with a sequence of convolutional layers whose outputs are concatenated and passed into a succession of fully connected linear layers with $\tanh$ activations. They differ in the final layer. The Focuser applies a sigmoid function to suppress inputs of lesser importance, effectively distinguishing signal from noise. The Modeler, by contrast, has no final activation function, since its role is to calculate a coefficient for every input value. This aligns the Modeler closely with linear regression, with the inputs treated as regression variables. The absence of an activation function is deliberate, ensuring that the coefficients remain unrestricted. This is crucial, for instance, in modeling anti-correlation, which requires negative coefficients and would not be feasible with $\mathrm{ReLU}$ or similar non-negative activation functions.
  • Figure 3: The architecture of the extended DCIts model incorporates bias and higher-order terms by adding new, parallel Focuser and Modeler modules. All Modeler modules, including all multiplications, utilize the Focuser's input and output.
  • Figure 4: The stability of prediction performance by dataset is assessed by plotting the standard deviation of the Mean Squared Error (MSE) divided by the mean MSE value on the y-axis. The x-axis represents the index of each dataset.
  • Figure 5: The left figure illustrates how the stability of prediction performance for a specific model varies with noise frequency $f$ for Dataset 2. DCIts is more resilient to changes in noise frequency, with instability increasing only at the highest noise frequencies. The right figure examines the stability of prediction performance as the number of time series $N$ increases in Dataset 2. Here, DCIts again demonstrates greater stability than IMV-LSTM across all values of $N$.
  • ...and 10 more figures
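
The stability measure described in the Figure 4 caption, the standard deviation of the MSE divided by the mean MSE, can be sketched as follows. The function name `mse_stability`, the sample values, and the use of the population standard deviation (`ddof=0`) are assumptions for illustration; the caption does not say which estimator is used or how many runs are aggregated.

```python
import numpy as np

def mse_stability(mse_runs, ddof=0):
    """Ratio std(MSE) / mean(MSE) over repeated runs on one dataset.

    ddof=0 (population standard deviation) is an assumption.
    """
    mse_runs = np.asarray(mse_runs, dtype=float)
    return mse_runs.std(ddof=ddof) / mse_runs.mean()

# Hypothetical MSE values from four training runs on the same dataset.
runs = [0.010, 0.012, 0.008, 0.010]
ratio = mse_stability(runs)
```

A smaller ratio means the prediction error varies less across runs relative to its typical size, which is the sense in which the captions of Figures 4 and 5 speak of stability.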