Kryptonite-N: Machine Learning Strikes Back

Albus Li, Nathan Bailey, Will Sumerfield, Kira Kim

TL;DR

This work investigates whether universal function approximation fails on the high-dimensional, XOR-like Kryptonite-N datasets. The authors demonstrate that data standardization combined with either neural networks or logistic regression on a polynomial-basis expansion achieves strong performance for N = 9 to 18, with L1 regularization and feature selection exploiting sparsity in the high-dimensional expanded space. The work reveals the dataset's hidden structure (one third of the features are irrelevant, two thirds are informative) and provides practical guidance on model choice, hyperparameter tuning, and sustainability considerations for deployment. Overall, the results support universal function approximation and show that, on structured high-dimensional tasks, simple linear models with careful feature engineering can outperform more complex architectures.

Abstract

Quinn et al. propose a set of challenge datasets called "Kryptonite-N". These datasets aim to counter the universal function approximation argument of machine learning, breaking the notion that machine learning can "approximate any continuous function" \cite{original_paper}. Our work refutes this claim and shows that universal function approximation can be applied successfully: the Kryptonite datasets are constructed predictably, allowing logistic regression with sufficient polynomial expansion and L1 regularization to solve them for any dimension N.
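
To illustrate why polynomial expansion plus L1 regularization suits XOR-like labels, here is a minimal sketch in plain Python. It is not the authors' implementation and does not use the real Kryptonite-N data. The toy dataset (three binary features, two informative and one irrelevant, with the label being the XOR of the informative pair), the helper names (`monomials`, `expand`, `fit_l1_logreg`), and the hyperparameters are all assumptions chosen for illustration. After standardizing bits to -1/+1, the XOR label depends only on the product of the two informative features, so a single degree-2 monomial separates the classes and the L1 penalty drives every other weight to zero.

```python
import itertools
import math

def monomials(n_features, degree):
    """Index tuples for every product of 1..degree distinct features."""
    return [c for d in range(1, degree + 1)
            for c in itertools.combinations(range(n_features), d)]

def expand(x, terms):
    """Polynomial-basis expansion of one standardized sample x."""
    return [math.prod(x[i] for i in term) for term in terms]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(w, t):
    """Proximal step for the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_l1_logreg(X, y, lam=0.01, lr=0.5, steps=2000):
    """L1-regularized logistic regression via proximal gradient descent.

    Hypothetical helper; lam, lr and steps are illustrative values.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        # Residuals p_i - y_i under the current weights.
        err = [sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - yi
               for row, yi in zip(X, y)]
        grad_w = [sum(e * row[j] for e, row in zip(err, X)) / n
                  for j in range(d)]
        grad_b = sum(err) / n
        # Gradient step on the loss, then soft-threshold (bias is unpenalized).
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Toy data (assumption): all 8 points of {0,1}^3, label = XOR of features 0 and 1.
raw = list(itertools.product([0, 1], repeat=3))
y = [bits[0] ^ bits[1] for bits in raw]

# Standardize: each bit has mean 0.5 and std 0.5, so (v - 0.5) / 0.5 = 2v - 1.
X_std = [[2 * v - 1 for v in bits] for bits in raw]

# Degree-2 expansion: x0, x1, x2, x0*x1, x0*x2, x1*x2.
terms = monomials(3, 2)
Phi = [expand(x, terms) for x in X_std]

w, b = fit_l1_logreg(Phi, y)
selected = [terms[j] for j, wj in enumerate(w) if abs(wj) > 1e-9]
preds = [int(sum(wj * pj for wj, pj in zip(w, row)) + b > 0) for row in Phi]
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)

print("monomials with nonzero weight:", selected)
print("training accuracy:", accuracy)
```

In this toy setting the only surviving monomial is the product of the two informative features, and the irrelevant feature never enters the model. The number of monomials grows quickly with the feature count and degree, which is why sparsity from the L1 penalty matters at larger N.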

Paper Structure (41 sections, 8 equations, 19 figures, 5 tables)

Figures (19)

  • Figure 1: PMF Across Dimensions
  • Figure 2: Correlation Matrix of Dimensions and Label
  • Figure 3: Correlation Matrix of Dimensions and Label
  • Figure 4: Dimensions 1 and 2
  • Figure 5: Cross Validation Accuracy Score Distribution of Logistic Regression Model with varying degrees of Polynomial basis
  • ...and 14 more figures