Genetic Generalized Additive Models
Kaaustaaub Shankar, Kelly Cohen
TL;DR
This work addresses the need for interpretable yet accurate models by applying NSGA-II to optimize Generalized Additive Models (GAMs) for two objectives at once: accuracy and interpretability. The method defines a complexity penalty $C = 0.70\,U + 0.30\,S$, where $U$ is the mean width of the 95% confidence intervals and $S$ measures sparsity, and evaluates accuracy via cross-validated RMSE on the California Housing dataset. Results show that NSGA-II discovers GAMs that either outperform a baseline LinearGAM in predictive accuracy or match its accuracy with substantially lower complexity, smoother shape functions, and narrower confidence intervals. The result is a general, automated framework for designing transparent, high-performing models, with an implementation available at https://github.com/KaaustaaubShankar/GeneticAdditiveModels.
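As a rough illustration of the penalty $C = 0.70\,U + 0.30\,S$, the sketch below computes a mean interval width and combines it with a sparsity score. The function names `mean_interval_width` and `complexity_penalty` are hypothetical, and the sketch assumes that $U$ and $S$ have already been brought to comparable scales; how the authors compute or normalize either term is not stated in this summary.

```python
from statistics import mean


def mean_interval_width(lower, upper):
    """Average width of pointwise intervals, given matching lower/upper bounds."""
    if len(lower) != len(upper) or len(lower) == 0:
        raise ValueError("lower and upper must be non-empty and of equal length")
    return mean(hi - lo for lo, hi in zip(lower, upper))


def complexity_penalty(u, s, w_u=0.70, w_s=0.30):
    """Weighted sum C = w_u * U + w_s * S with the weights quoted in the text.

    Assumption: u and s are already on comparable scales (for example,
    both normalized to [0, 1]); the normalization itself is not shown here.
    """
    return w_u * u + w_s * s


# Illustrative inputs only, not values from the paper.
u = mean_interval_width([0.0, 1.0], [1.0, 3.0])  # widths 1.0 and 2.0
c = complexity_penalty(0.5, 0.2)
```

With the fixed weights, uncertainty dominates the penalty: a unit change in $U$ moves $C$ more than twice as much as a unit change in $S$.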
Abstract
Generalized Additive Models (GAMs) balance predictive accuracy and interpretability, but manually configuring their structure is challenging. We propose using the multi-objective genetic algorithm NSGA-II to automatically optimize GAMs, jointly minimizing prediction error (RMSE) and a Complexity Penalty that captures sparsity, smoothness, and uncertainty. Experiments on the California Housing dataset show that NSGA-II discovers GAMs that either outperform baseline LinearGAMs in accuracy or match their accuracy with substantially lower complexity. The resulting models are simpler, smoother, and exhibit narrower confidence intervals, enhancing interpretability. This framework provides a general approach for automated optimization of transparent, high-performing models. The code is available at https://github.com/KaaustaaubShankar/GeneticAdditiveModels.
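The "outperform or match with lower complexity" outcome corresponds to the non-dominated set that a multi-objective search such as NSGA-II aims to approximate. The sketch below shows only the Pareto-dominance test over (RMSE, complexity) pairs, both minimized; it is not the full NSGA-II procedure (no crowding distance, selection, crossover, or mutation), and the helper names and candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one.

    Both objectives are minimized, e.g. a = (rmse, complexity).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up (rmse, complexity) pairs for illustration; not results from the paper.
candidates = [(0.60, 0.80), (0.65, 0.40), (0.70, 0.45), (0.62, 0.90)]
front = pareto_front(candidates)
```

Here `(0.70, 0.45)` is dominated by `(0.65, 0.40)` and `(0.62, 0.90)` by `(0.60, 0.80)`, so the front keeps one more accurate model and one simpler model, leaving the final trade-off to the practitioner.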
