FocusLearn: Fully-Interpretable, High-Performance Modular Neural Networks for Time Series

Qiqi Su, Christos Kloukinas, Artur d'Avila Garcez

TL;DR

FocusLearn addresses the interpretability gap in time-series modeling by combining an RNN-based temporal encoder, an attention-driven feature selector, and a bank of modular neural networks. The architecture yields additive, feature-wise explanations while achieving predictive performance comparable to top non-interpretable methods such as LSTM and XGBoost, and it outperforms the interpretable baselines NAM and SPAM. Key innovations include Attention-Based Feature Selection (AFS) and Attention-Based Node Bootstrapping (ANB), which guide and weight the modular units to produce NAM-like explanations. In practice, the approach delivers transparent predictions and feature-level explanations for multivariate time-series tasks, in both regression and classification.

Abstract

Multivariate time series have many applications, from healthcare and meteorology to life science. Although deep learning models have shown excellent predictive performance for time series, they have been criticised for being "black-boxes" or non-interpretable. This paper proposes a novel modular neural network model for multivariate time series prediction that is interpretable by construction. A recurrent neural network learns the temporal dependencies in the data while an attention-based feature selection component selects the most relevant features and suppresses redundant features used in the learning of the temporal dependencies. A modular deep network is trained from the selected features independently to show the users how features influence outcomes, making the model interpretable. Experimental results show that this approach can outperform state-of-the-art interpretable Neural Additive Models (NAM) and variations thereof in both regression and classification of time series tasks, achieving a predictive performance that is comparable to the top non-interpretable methods for time series, LSTM and XGBoost.
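
The pipeline described in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the plain tanh RNN, the softmax attention, the one-hidden-layer modules, and every name below (`rnn_encode`, `attention_weights`, `top_n_mask`, `module_forward`, `predict`) are assumptions chosen for illustration, and the weights are random rather than trained. It only shows how a recurrent encoder, an attention-based top-$n$ feature selection, and per-feature modules with attention-weighted first layers could combine into an additive prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def rnn_encode(x, W_in, W_h):
    """Plain tanh RNN over a (T, F) window; returns the final hidden state."""
    h = np.zeros(W_h.shape[0])
    for x_t in x:
        h = np.tanh(W_in @ x_t + W_h @ h)
    return h

def attention_weights(h, W_att):
    """Map the hidden state to one weight per feature (weights sum to 1)."""
    return softmax(W_att @ h)

def top_n_mask(weights, n):
    """Boolean mask keeping the n features with the largest attention weight."""
    mask = np.zeros(weights.shape[0], dtype=bool)
    mask[np.argsort(weights)[-n:]] = True
    return mask

def module_forward(x_j, a_j, params):
    """One-feature MLP; its first layer is scaled by the feature's attention weight."""
    w1, b1, w2 = params
    hidden = np.maximum(0.0, a_j * w1 * x_j + b1)  # ReLU hidden layer
    return float(w2 @ hidden)

def predict(x_last, weights, mask, modules):
    """Additive prediction: sum of per-feature contributions of selected features."""
    contributions = {
        int(j): module_forward(x_last[j], weights[j], modules[j])
        for j in np.flatnonzero(mask)
    }
    return sum(contributions.values()), contributions

# Hypothetical sizes: 5 time steps, 6 features, 8 hidden units, keep top 3 features.
T, F, H, n = 5, 6, 8, 3
x = rng.normal(size=(T, F))
W_in = rng.normal(size=(H, F)) * 0.1
W_h = rng.normal(size=(H, H)) * 0.1
W_att = rng.normal(size=(F, H))
modules = [tuple(rng.normal(size=4) for _ in range(3)) for _ in range(F)]

h = rnn_encode(x, W_in, W_h)
a = attention_weights(h, W_att)
mask = top_n_mask(a, n)
y_hat, contributions = predict(x[-1], a, mask, modules)
```

Because the prediction is a plain sum, each entry of `contributions` can be plotted against its feature value to obtain a NAM-style shape function, which is the sense in which such a model is interpretable by construction.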

Paper Structure (25 sections, 4 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Attention Modular Networks (FocusLearn) architecture. FocusLearn consists of two main paths (training and inference) with five main components: (1) a Recurrent Neural Network (RNN) (see Section \ref{sec: rnn}), (2) Attention-based Feature Selection (AFS) with (3) AFS masking (see Section \ref{sec: afs}), and (4) a group of interpretable Modular Neural Networks (MNN) that learn from the top $n$ input features selected by the AFS (see Section \ref{sec: mod_main}). The (5) Attention-based Node Bootstrapping (ANB) in each module's first layer is weighted by the AFS attention weights (see Section \ref{sec: mod_main}). The MNN outputs are then aggregated. All components are required for training, but to maintain the interpretability of FocusLearn, only the components in the inference path box are used after training (see Section \ref{sec: paths}).
  • Figure 2: Prediction visualisations. The x-axis represents predicted time-steps and the y-axis represents feature values. Note: data are pre-processed differently for XGBoost, which supports missing values by default, whereas missing values are interpolated for training with FocusLearn and LSTM.
  • Figure 3: NAM-style explanations for selected features learned by MNN. Network outputs are shown on the y-axis. Feature values are shown on the x-axis (normalised values above and actual values below). The blue line represents the learned shape function. Normalised data densities are shown as red bars: the darker the red, the more data is available in that region.
  • Figure 4: OtiReal. Graphs learned by FocusLearn when predicting future hearing aid usage (regression) on the OtiReal dataset. The plots show the top 10 features selected by the AFS component of FocusLearn. Feature values (normalised and original) are on the x-axis, and each feature's contribution to the predicted daily future hearing aid usage is on the y-axis. In OtiReal, two timesteps of data are transformed and used as inputs, so that data at $t_i$ and $t_{i+1}$ are used to predict hearing aid usage at $t_{i+2}$.
  • Figure 5: Air. Graphs learned by FocusLearn when predicting future PM$_{2.5}$ values (regression) on the Air dataset. The plots show the top 10 features selected by the AFS component of FocusLearn. Feature values (normalised and original) are on the x-axis, and each feature's contribution to the predicted future PM$_{2.5}$ value is on the y-axis.
  • ...and 2 more figures
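
The input construction mentioned in the Figure 4 caption (data at $t_i$ and $t_{i+1}$ used to predict the target at $t_{i+2}$) can be sketched as a sliding window. The caption does not say how the two timesteps are transformed, so the flattening used here, the function name `make_windows`, and the toy arrays are all assumptions made for illustration.

```python
import numpy as np

def make_windows(features, target, lookback=2):
    """Pair each block of `lookback` consecutive rows with the next target value.

    features: array of shape (T, F); target: array of shape (T,).
    Returns X of shape (T - lookback, lookback * F) and y of shape (T - lookback,).
    Flattening the window is an assumption; the source does not specify the transform.
    """
    T = features.shape[0]
    X = np.stack([features[i:i + lookback].ravel() for i in range(T - lookback)])
    y = target[lookback:]
    return X, y

# Toy data: 5 time steps, 2 features (values are placeholders, not from any dataset).
features = np.arange(10).reshape(5, 2)
target = np.arange(5) * 10
X, y = make_windows(features, target)
```

Each row of `X` holds the features at $t_i$ followed by those at $t_{i+1}$, and the matching entry of `y` is the target at $t_{i+2}$.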