Genetic Generalized Additive Models

Kaaustaaub Shankar, Kelly Cohen

TL;DR

This work addresses the need for interpretable yet accurate models by applying NSGA-II to optimize Generalized Additive Models (GAMs) across accuracy and interpretability objectives. The method defines a complexity penalty $C = 0.70 U + 0.30 S$, where $U$ is the mean width of the 95% confidence intervals and $S$ measures sparsity, and evaluates models via cross-validated RMSE on the California Housing dataset. Results show NSGA-II discovers GAMs that outperform a baseline LinearGAM in predictive accuracy or match its performance with substantially lower complexity and smoother, more honest confidence intervals. This demonstrates a general, automated framework for designing transparent, high-performing models, with an implementation available at the project repository.

Abstract

Generalized Additive Models (GAMs) balance predictive accuracy and interpretability, but manually configuring their structure is challenging. We propose using the multi-objective genetic algorithm NSGA-II to automatically optimize GAMs, jointly minimizing prediction error (RMSE) and a Complexity Penalty that captures sparsity, smoothness, and uncertainty. Experiments on the California Housing dataset show that NSGA-II discovers GAMs that outperform baseline LinearGAMs in accuracy or match performance with substantially lower complexity. The resulting models are simpler, smoother, and exhibit narrower confidence intervals, enhancing interpretability. This framework provides a general approach for automated optimization of transparent, high-performing models. The code can be found at https://github.com/KaaustaaubShankar/GeneticAdditiveModels.
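
To make the two objectives concrete, the following is a minimal sketch, not the authors' implementation. It assumes each candidate GAM has already been scored with a cross-validated RMSE, an uncertainty term $U$, and a sparsity term $S$; how $U$ and $S$ are computed and scaled is not specified here, so the numeric values below are hypothetical. The function names `complexity_penalty`, `dominates`, and `pareto_front` are illustrative. The sketch covers only the weighted penalty $C = 0.70 U + 0.30 S$ and the non-dominated filtering that underlies a Pareto front; it omits the rest of NSGA-II (ranking into successive fronts, crowding distance, selection, crossover, and mutation).

```python
from typing import List, Tuple

Objectives = Tuple[float, float]  # (rmse, complexity), both minimized


def complexity_penalty(u: float, s: float,
                       w_u: float = 0.70, w_s: float = 0.30) -> float:
    """Weighted penalty C = w_u * U + w_s * S (weights taken from the TL;DR)."""
    return w_u * u + w_s * s


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (rmse, U, S) scores for four candidate models.
candidates = [(0.50, 1.20, 0.20), (0.60, 0.40, 0.40),
              (0.70, 0.50, 0.50), (0.80, 0.20, 0.20)]
scored = [(rmse, complexity_penalty(u, s)) for rmse, u, s in candidates]
front = pareto_front(scored)
```

In this toy data the third candidate is worse than the second on both RMSE and complexity, so it is dropped; the remaining three each represent a different accuracy versus complexity trade-off.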

Paper Structure (12 sections, 4 equations, 7 figures, 3 tables, 1 algorithm)

Figures (7)

  • Figure 1: Pareto Front for seed 7
  • Figure 2: Pareto Front for seed 42
  • Figure 3: Pareto Front for seed 123
  • Figure 4: Pareto Front for seed 225
  • Figure 5: Pareto Front for seed 729
  • ...and 2 more figures