NIMO: a Nonlinear Interpretable MOdel
Shijian Xu, Marcello Massimo Negri, Volker Roth
TL;DR
NIMO addresses the interpretability–accuracy tension by combining a linear backbone with per-instance nonlinear corrections learned through a shared neural network. By construction, the marginal effects at the mean (MEM) equal the global linear coefficients, so the model gives faithful, globally interpretable feature effects alongside local, instance-specific explanations. Training relies on parameter elimination, which expresses the linear coefficients as a function of the network parameters, and on an adaptive ridge framework that imposes sparsity; generalized linear models are handled via iteratively reweighted least squares (IRLS). Empirical results on synthetic and real datasets show competitive predictive performance together with interpretable MEMs and meaningful per-feature nonlinear interactions. The approach offers a practical, scalable path to intrinsically interpretable neural models with robust explanations for high-stakes applications.
Abstract
Deep learning has achieved remarkable success across many domains, but it has also created a growing demand for interpretability in model predictions. Although many explainable machine learning methods have been proposed, post-hoc explanations lack guaranteed fidelity and are sensitive to hyperparameter choices, which highlights the appeal of inherently interpretable models. For example, linear regression provides clear feature effects through its coefficients. However, such models are often outperformed by more complex neural networks (NNs), which usually lack inherent interpretability. To address this dilemma, we introduce NIMO, a framework that combines inherent interpretability with the expressive power of neural networks. Building on simple linear regression, NIMO provides flexible and intelligible feature effects. We also develop an optimization method based on parameter elimination that allows the NN parameters and the linear coefficients to be optimized effectively and efficiently. By relying on adaptive ridge regression, we can easily incorporate sparsity as well. We show empirically that our model provides faithful and intelligible feature effects while maintaining good predictive performance.
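To make the parameter-elimination idea and the MEM property more concrete, here is a minimal NumPy sketch. It rests on assumptions that are not stated in this section: the model form f(x) = sum_j beta_j * x_j * (1 + g_j(x)), a correction network shifted so that g(0) = 0, standardized features (so the mean instance is the zero vector), and a plain ridge penalty in place of the adaptive ridge scheme. The names `g`, `design`, `solve_beta`, and `predict` are hypothetical, and the network weights are random stand-ins rather than trained values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 200, 3, 8

# Standardized features, so the mean instance is the zero vector.
X = rng.normal(size=(n, d))
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = 2.0 * X[:, 0] - X[:, 1] * (1.0 + np.tanh(X[:, 2])) + 0.1 * rng.normal(size=n)
y = y - y.mean()  # centre the target so no intercept is needed

# Hypothetical shared network with fixed random weights (stand-in for trained ones).
W1 = rng.normal(size=(d, h))
b1 = rng.normal(size=h)
W2 = rng.normal(size=(h, d))

def g(X):
    """Per-feature nonlinear corrections, shifted so that g(0) = 0 (assumption)."""
    raw = np.tanh(X @ W1 + b1) @ W2
    at_zero = np.tanh(b1) @ W2
    return raw - at_zero

def design(X):
    """Modulated design matrix: each feature is scaled by its own correction."""
    return X * (1.0 + g(X))

def solve_beta(X, y, lam=1e-2):
    """Eliminate beta: for fixed network weights it has a closed-form ridge solution."""
    B = design(X)
    return np.linalg.solve(B.T @ B + lam * np.eye(B.shape[1]), B.T @ y)

def predict(X, beta):
    return design(X) @ beta

beta = solve_beta(X, y)

# Central finite differences of the prediction at the mean instance (x = 0).
eps = 1e-5
mem = np.array([
    (predict(eps * e[None, :], beta) - predict(-eps * e[None, :], beta))[0] / (2 * eps)
    for e in np.eye(d)
])
print("beta:", beta)
print("MEM :", mem)
```

Under these assumptions the finite-difference marginal effects at the mean agree with `beta` up to numerical error, because every correction term vanishes at x = 0. In a full training loop, `solve_beta` would be re-evaluated as the network weights change, so that only the network parameters remain as free optimization variables.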
