Wavelet-Based Feature Extraction and Unsupervised Clustering for Parity Detection: A Feature Engineering Perspective

Ertugrul Mutlu

TL;DR

The paper investigates parity detection as a testbed for feature engineering using wavelet transforms and unsupervised clustering. It introduces a six-step pipeline that converts integers to binary signals, applies a level-3 Haar discrete wavelet transform (DWT), extracts statistics such as the energy $E$, the norm $\|\cdot\|_2$, and the mean absolute value (MAV), and clusters the features with $k$-means to estimate oddness probabilities $P_{n,l,f}$ for integer $n$, decomposition level $l$, and feature $f$. An oddness score $S_n = \frac{1}{Z}\sum_{l=1}^L w_l \sum_{f=1}^F P_{n,l,f}$, where $w_l$ are per-level weights and $Z$ is a normalization constant, drives the final decision: $n$ is labeled odd when $S_n$ exceeds a threshold of $0.5$. On the 0–10,000 range, the method achieves $69.67\%$ accuracy, illustrating that traditional signal-processing tools can reveal latent structure in discrete domains while also exposing their limitations in fully capturing symbolic parity.
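The front half of the pipeline and the score aggregation can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the 16-bit signal width, the function names, and the choice $Z = F \sum_l w_l$ (which keeps $S_n$ in $[0, 1]$) are all hypothetical, since the summary does not specify them.

```python
import math

def to_binary_signal(n, width=16):
    """Fixed-width binary expansion, most significant bit first.
    The width of 16 is an assumption (it covers 0..10,000 and allows 3 levels)."""
    return [float((n >> i) & 1) for i in range(width - 1, -1, -1)]

def haar_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_dwt(signal, levels=3):
    """Multi-level Haar DWT: final approximation plus details, finest level first."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def band_features(coeffs):
    """Energy, L2 norm, and mean absolute value of one coefficient band."""
    energy = sum(c * c for c in coeffs)
    mav = sum(abs(c) for c in coeffs) / len(coeffs)
    return {"energy": energy, "l2": math.sqrt(energy), "mav": mav}

def oddness_score(P, w):
    """S_n = (1/Z) * sum_l w_l * sum_f P[l][f].
    Z = F * sum(w) is an assumed normalization, not taken from the paper."""
    F = len(P[0])
    Z = F * sum(w)
    return sum(w_l * sum(P_l) for w_l, P_l in zip(w, P)) / Z

if __name__ == "__main__":
    approx, details = haar_dwt(to_binary_signal(5))
    for level, band in enumerate(details, start=1):
        print(level, band_features(band))
    # Hypothetical probabilities for L = 2 levels and F = 2 features.
    score = oddness_score([[0.8, 0.6], [0.4, 0.2]], [2.0, 1.0])
    print("odd" if score > 0.5 else "even", score)
```

Because this Haar transform is orthonormal, the band energies sum to the energy of the input signal, which for a binary signal is simply the number of set bits. The probabilities passed to `oddness_score` would come from the clustering step, which this sketch leaves out.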

Abstract

This paper explores a deliberately over-engineered approach to the classical problem of parity detection -- determining whether a number is odd or even -- by combining wavelet-based feature extraction with unsupervised clustering. Instead of relying on modular arithmetic, integers are transformed into wavelet-domain representations, from which multi-scale statistical features are extracted and clustered using the k-means algorithm. The resulting feature space reveals meaningful structural differences between odd and even numbers, achieving a classification accuracy of approximately 69.67% without any label supervision. These results suggest that classical signal-processing techniques, originally designed for continuous data, can uncover latent structure even in purely discrete symbolic domains. Beyond parity detection, the study provides an illustrative perspective on how feature engineering and clustering may be repurposed for unconventional machine learning problems, potentially bridging symbolic reasoning and feature-based learning.
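For readers unfamiliar with the clustering step, the following is a minimal sketch of two-cluster $k$-means (Lloyd's algorithm) on a single scalar feature. The function name, the min/max initialization, and the toy values are assumptions made for illustration; how the paper turns cluster membership into oddness probabilities is not described in this summary, so the sketch stops at centroids and labels.

```python
def kmeans_1d(values, k=2, iters=50):
    """Lloyd's algorithm on scalar values (requires k >= 2).
    Centroids start evenly spaced between min and max, an assumed
    deterministic initialization chosen so the result is reproducible."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids, labels

if __name__ == "__main__":
    # Hypothetical feature values (e.g. one statistic of one wavelet band).
    feature = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
    centroids, labels = kmeans_1d(feature)
    print(centroids, labels)
```

Note that $k$-means alone yields anonymous clusters; deciding which cluster corresponds to "odd" requires an additional rule that lies outside this sketch.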

Paper Structure

This paper contains 18 sections, 3 equations, 2 figures.

Figures (2)

  • Figure 1: Overall pipeline of the proposed method. The system transforms integer inputs into binary signals, applies multi-level wavelet decomposition and feature extraction, and uses k-means clustering and probability aggregation to predict parity with approximately 69.67% accuracy.
  • Figure 2: Scatter plot of predicted oddness scores for integers $0$--$1000$. Red points represent odd numbers and blue points represent even numbers. The majority of odd numbers are positioned above the $0.5$ decision threshold, while even numbers cluster below it. This visualization highlights the feature space separation achieved by the proposed wavelet-based clustering method.