Wavelet-Based Feature Extraction and Unsupervised Clustering for Parity Detection: A Feature Engineering Perspective
Ertugrul Mutlu
TL;DR
The paper uses parity detection as a testbed for feature engineering with wavelet transforms and unsupervised clustering. It introduces a six-step pipeline that converts integers to binary signals, applies a level-3 Haar discrete wavelet transform (DWT), extracts statistics such as the energy $E$, the norm $\|\cdot\|_2$, and the mean absolute value (MAV), and clusters the features with $k$-means to estimate oddness probabilities $P_{n,l,f}$. An oddness score $S_n = \frac{1}{Z}\sum_{l=1}^L w_l \sum_{f=1}^F P_{n,l,f}$ is then thresholded at $0.5$ to produce the final decision. On the integers from 0 to 10,000, the method achieves $69.67\%$ accuracy, illustrating that traditional signal-processing tools can reveal latent structure in discrete domains while also exposing their limitations in fully capturing symbolic parity.
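The scoring step can be illustrated with a minimal sketch. The text above does not define the normalizer $Z$ or state which side of the threshold counts as odd, so two assumptions are made here: $Z = F \sum_l w_l$, which keeps $S_n$ in $[0, 1]$ when every $P_{n,l,f}$ is in $[0, 1]$, and a score strictly above $0.5$ is read as odd. The function names `oddness_score` and `is_odd` are hypothetical.

```python
def oddness_score(P, w):
    """Weighted oddness score S_n for a single integer n.

    P[l][f] is the oddness probability for index l and feature f;
    w[l] is the weight attached to index l.
    """
    F = len(P[0])
    # Assumed normalizer (not given in the text): Z = F * sum of weights,
    # so that the score stays in [0, 1] for probabilities in [0, 1].
    Z = F * sum(w)
    return sum(w_l * sum(P_l) for w_l, P_l in zip(w, P)) / Z


def is_odd(P, w, threshold=0.5):
    # Assumed convention: scores strictly above the threshold are called odd.
    return oddness_score(P, w) > threshold
```

With this choice of $Z$, the score is a weighted average of the probabilities, so larger weights $w_l$ let some values of $l$ dominate the decision.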
Abstract
This paper explores a deliberately over-engineered approach to the classical problem of parity detection (determining whether a number is odd or even) by combining wavelet-based feature extraction with unsupervised clustering. Instead of relying on modular arithmetic, integers are transformed into wavelet-domain representations, from which multi-scale statistical features are extracted and clustered using the k-means algorithm. The resulting feature space reveals meaningful structural differences between odd and even numbers, achieving a classification accuracy of 69.67% without any label supervision. These results suggest that classical signal-processing techniques, originally designed for continuous data, can uncover latent structure even in purely discrete symbolic domains. Beyond parity detection, the study offers an illustrative perspective on how feature engineering and clustering may be repurposed for unconventional machine learning problems, potentially bridging symbolic reasoning and feature-based learning.
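The transformation from an integer to multi-scale wavelet features can be sketched as follows, as one possible reading rather than the paper's implementation. The sketch assumes a fixed 16-bit, most-significant-bit-first encoding, a hand-written orthonormal Haar transform instead of a wavelet library, and three statistics per sub-band (energy, $\ell_2$ norm, MAV). All function names and the `A3`/`D1`..`D3` sub-band labels are hypothetical, and the clustering step is omitted.

```python
import math


def to_bits(n, width=16):
    """Fixed-width binary signal, most significant bit first (assumed convention)."""
    return [(n >> i) & 1 for i in range(width - 1, -1, -1)]


def haar_step(x):
    """One level of the orthonormal Haar transform on an even-length signal."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_dwt(x, levels=3):
    """Multi-level Haar DWT; len(x) must be divisible by 2**levels."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)  # details[0] is the finest scale
    return approx, details


def band_features(c):
    """Energy, l2 norm, and mean absolute value of one coefficient band."""
    energy = sum(v * v for v in c)
    return {
        "energy": energy,
        "l2": math.sqrt(energy),
        "mav": sum(abs(v) for v in c) / len(c),
    }


def extract_features(n, width=16, levels=3):
    """Per-sub-band statistics for integer n (clustering is not shown here)."""
    approx, details = haar_dwt(to_bits(n, width), levels)
    feats = {"A%d" % levels: band_features(approx)}
    for lvl, d in enumerate(details, start=1):
        feats["D%d" % lvl] = band_features(d)
    return feats
```

Calling `extract_features(n)` for each integer in the range of interest yields one feature dictionary per number, which could then be flattened into vectors for k-means. Because this Haar transform is orthonormal, the sub-band energies of a bit signal sum to the number of set bits in `n`.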
