Genetic Generalized Additive Models

Kaaustaaub Shankar; Kelly Cohen

TL;DR

This work addresses the need for interpretable yet accurate models by applying NSGA-II to optimize Generalized Additive Models (GAMs) across accuracy and interpretability objectives. The method defines a complexity penalty $C = 0.70 U + 0.30 S$, where $U$ is the mean width of the 95% confidence intervals and $S$ measures sparsity, and evaluates models via cross-validated RMSE on the California Housing dataset. Results show that NSGA-II discovers GAMs that either outperform a baseline LinearGAM in predictive accuracy or match its performance with substantially lower complexity and smoother, more honest confidence intervals. This demonstrates a general, automated framework for designing transparent, high-performing models, with an implementation available at the project repository.
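As a rough illustration of the penalty $C = 0.70 U + 0.30 S$, the following minimal sketch computes it from per-point confidence-interval bounds and a precomputed sparsity score. The function name `complexity_penalty`, its arguments, and the assumption that $U$ and $S$ are already on comparable scales are hypothetical; the summary above does not specify how $S$ is computed or how either term is normalized.

```python
from statistics import fmean


def complexity_penalty(ci_lower, ci_upper, sparsity, w_u=0.70, w_s=0.30):
    """Return C = w_u * U + w_s * S.

    U is the mean width of the confidence intervals given by the
    (ci_lower, ci_upper) bounds; `sparsity` is taken as an already
    computed score S (its definition is an assumption left to the caller).
    """
    if len(ci_lower) != len(ci_upper) or not ci_lower:
        raise ValueError("CI bounds must be non-empty and of equal length")
    # Width of each interval, then the mean width U.
    widths = [hi - lo for lo, hi in zip(ci_lower, ci_upper)]
    u = fmean(widths)
    return w_u * u + w_s * sparsity
```

For example, two intervals of widths 1.0 and 2.0 give $U = 1.5$; with a sparsity score of 0.5 the penalty is $0.70 \cdot 1.5 + 0.30 \cdot 0.5 = 1.2$.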

Abstract

Generalized Additive Models (GAMs) balance predictive accuracy and interpretability, but manually configuring their structure is challenging. We propose using the multi-objective genetic algorithm NSGA-II to automatically optimize GAMs, jointly minimizing prediction error (RMSE) and a Complexity Penalty that captures sparsity, smoothness, and uncertainty. Experiments on the California Housing dataset show that NSGA-II discovers GAMs that outperform baseline LinearGAMs in accuracy or match performance with substantially lower complexity. The resulting models are simpler, smoother, and exhibit narrower confidence intervals, enhancing interpretability. This framework provides a general approach for automated optimization of transparent, high-performing models. The code can be found at https://github.com/KaaustaaubShankar/GeneticAdditiveModels.
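The two-objective trade-off described above can be sketched as a simple Pareto filter over (RMSE, complexity) pairs, both minimized. This is only the non-dominated filtering step, not NSGA-II itself, which additionally ranks successive fronts and uses crowding distance for selection. The function names and the sample values are hypothetical and are not taken from the paper's results.

```python
def dominates(a, b):
    """True if candidate a is no worse than b on every objective
    and strictly better on at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (RMSE, complexity penalty) pairs for four candidate GAMs.
candidates = [(0.5, 0.9), (0.6, 0.4), (0.7, 0.5), (0.8, 0.2)]
front = pareto_front(candidates)
```

Here `(0.7, 0.5)` is dropped because `(0.6, 0.4)` is better on both objectives; the remaining three candidates each represent a different accuracy/complexity trade-off.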

Paper Structure (12 sections, 4 equations, 7 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Generalized Additive Models (GAMs)
Genetic Algorithms
Pareto Front
Methodology
NSGA-II
California Housing Dataset
Baseline Models
Results
Discussion
Future Work

Figures (7)

Figure 1: Pareto Front for seed 7
Figure 2: Pareto Front for seed 42
Figure 3: Pareto Front for seed 123
Figure 4: Pareto Front for seed 225
Figure 5: Pareto Front for seed 729
...and 2 more figures
