Chaotic Map based Compression Approach to Classification

Harikrishnan N B, Anuja Vats, Nithin Nagaraj, Marius Pedersen

TL;DR

The paper reframes learning as an information-theoretic encoding problem under the MDL principle, proposing a chaotic GLS coding classifier that maps data to intervals of initial conditions in a skew tent map. By using backward iteration and a second-return map driven by class-specific symbol-pair probabilities, the method achieves a compact parameter budget of $4 \times \text{number of classes}$ and a single learning hyperparameter, while maintaining interpretability. It demonstrates competitive results on multiple datasets (e.g., $92.98\%$ accuracy on Breast Cancer Wisconsin, close to Naive Bayes at $94.74\%$) and establishes a theoretical link to MDL optimality via the description-length formulation $L_{total} = -\log_2 (U - L) - \log_2 (P(h))$. The approach highlights the potential of combining information theory and chaotic dynamics to obtain simple, efficient, and transparent classifiers with practical impact. This work suggests promising directions for sustainable AI by prioritizing compressibility and interpretability over ever-increasing model complexity.

Abstract

Modern machine learning approaches often prioritize performance at the cost of increased complexity, computational demands, and reduced interpretability. This paper introduces a novel framework that challenges this trend by reinterpreting learning from an information-theoretic perspective, viewing it as a search for encoding schemes that capture intrinsic data structures through compact representations. Rather than following the conventional approach of fitting data to complex models, we propose a fundamentally different method that maps data to intervals of initial conditions in a dynamical system. Our GLS (Generalized Lüroth Series) coding compression classifier employs skew tent maps, a class of chaotic maps, both for encoding data into initial conditions and for subsequent recovery. The effectiveness of this simple framework is noteworthy, with performance closely approaching that of well-established machine learning methods. On the breast cancer dataset, our approach achieves 92.98% accuracy, comparable to Naive Bayes at 94.74%. While these results do not exceed state-of-the-art performance, the significance of our contribution lies not in outperforming existing methods but in demonstrating that a fundamentally simpler, more interpretable approach can achieve competitive results.
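
To make the idea of "mapping data to intervals of initial conditions" concrete, here is a minimal sketch of GLS-style interval coding on a plain skew tent map. It is not the paper's classifier: it assumes a single skew parameter `b` (the probability of symbol 0) in place of the class-specific symbol-pair probabilities and second-return map, and it ignores the hypothesis term $-\log_2 (P(h))$. The function names `gls_encode`, `gls_decode`, and `description_length` are hypothetical.

```python
import math

def skew_tent(x, b):
    """One forward step of the skew tent map on [0, 1] with skew b."""
    return x / b if x < b else (1.0 - x) / (1.0 - b)

def gls_encode(symbols, b):
    """Backward iteration: return the interval (L, U) of initial conditions
    whose symbolic sequence under the skew tent map starts with `symbols`.
    Assumption: symbol 0 means x < b, symbol 1 means x >= b."""
    lo, hi = 0.0, 1.0
    for s in reversed(symbols):
        if s == 0:
            # Inverse of the left branch: y -> b * y (orientation preserving).
            lo, hi = b * lo, b * hi
        else:
            # Inverse of the right branch: y -> 1 - (1 - b) * y (orientation reversing).
            lo, hi = 1.0 - (1.0 - b) * hi, 1.0 - (1.0 - b) * lo
    return lo, hi

def gls_decode(x0, n, b):
    """Forward iteration: recover the first n symbols from an initial condition."""
    out, x = [], x0
    for _ in range(n):
        out.append(0 if x < b else 1)
        x = skew_tent(x, b)
    return out

def description_length(lo, hi):
    """Bits needed to single out the interval: -log2(U - L)."""
    return -math.log2(hi - lo)

symbols = [0, 1, 1, 0, 1, 0, 0, 1]
lo, hi = gls_encode(symbols, b=0.4)
recovered = gls_decode((lo + hi) / 2, len(symbols), b=0.4)
```

Under these assumptions the interval width is the product of the symbol probabilities, so $-\log_2 (U - L)$ equals the sum of $-\log_2$ of each symbol's probability. Floating-point precision limits this sketch to short sequences.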

Paper Structure

This paper contains 11 sections, 24 equations, 4 figures, 2 tables, 4 algorithms.

Figures (4)

  • Figure 1: Plot of the skew tent map with $b=0.3$.
  • Figure 2: Trajectory of the skew tent map for the first 100 values after removing the initial 500 transients. The parameter $b$ and initial value $x_0$ are 0.34 and 0.26, respectively.
  • Figure 3: Plot of the average compressed file size as a function of the skew parameter $b$. The red dashed line highlights the optimal value $b = 0.45$, corresponding to the empirical probability of zeros in the symbolic sequence.
  • Figure 4: Plot of the second return skew tent map with symbolic sequence regions labeled as $00, 01, 11, 10$. The arrows indicate the corresponding intervals for each sequence. The skewness parameters used are $p_{00} = \frac{2}{5}$, $p_{01} = \frac{1}{5}$, $p_{11} = \frac{1}{5}$, and $p_{10} = \frac{1}{5}$.
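
The maps described in the captions above can be sketched as follows. This is an illustrative reconstruction, not the authors' code. `skew_tent` uses the standard skew tent definition, and `trajectory` follows the Figure 2 setup (discard 500 transients, keep 100 values). `second_return` assumes that the map in Figure 4 has four linear branches, each mapping its interval onto $[0, 1]$, laid out in the order $00, 01, 11, 10$ with alternating orientation, as happens when a skew tent map is composed with itself. All function names are hypothetical.

```python
from fractions import Fraction

def skew_tent(x, b):
    """Standard skew tent map on [0, 1] with skew parameter b."""
    return x / b if x < b else (1 - x) / (1 - b)

def trajectory(x0, b, n_keep=100, n_transient=500):
    """Iterate the map, drop the transients, and return the next n_keep values."""
    x = x0
    for _ in range(n_transient):
        x = skew_tent(x, b)
    out = []
    for _ in range(n_keep):
        out.append(x)
        x = skew_tent(x, b)
    return out

def second_return(x, p00, p01, p11, p10):
    """Assumed form of the second-return map: four full linear branches of
    widths p00, p01, p11, p10 (in that order), increasing on 00 and 11,
    decreasing on 01 and 10."""
    edges = [0, p00, p00 + p01, p00 + p01 + p11, 1]
    for k in range(4):
        left, right = edges[k], edges[k + 1]
        if x < right or k == 3:
            t = (x - left) / (right - left)  # position within the branch
            return t if k % 2 == 0 else 1 - t

# Figure 2 parameters: b = 0.34, x0 = 0.26.
values = trajectory(0.26, 0.34)

# Figure 4 skewness parameters, evaluated exactly with rational arithmetic.
p = (Fraction(2, 5), Fraction(1, 5), Fraction(1, 5), Fraction(1, 5))
y = second_return(Fraction(9, 20), *p)
```

When the four widths are $b^2$, $b(1-b)$, $(1-b)^2$, and $(1-b)b$, this assumed form coincides with applying `skew_tent` twice. The Figure 4 parameters do not factor this way, so the generalised second-return map can represent pair statistics that a single skew parameter cannot.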