Unsupervised Dynamic Feature Selection for Robust Latent Spaces in Vision Tasks
Bruno Corcuera, Carlos Eiras-Franco, Brais Cancela
TL;DR
This work tackles the degradation of latent representations in vision models caused by noisy or irrelevant features. It introduces Dynamic Feature Selection (DFS), an unsupervised module that masks each input sample so that at most $M$ features remain before downstream processing. DFS uses a differentiable hard-concrete gate to generate a per-sample top-$M$ mask, which preserves the 2-D structure of the input and allows integration with existing architectures for unsupervised tasks such as clustering and world-model latent learning. The authors report improved clustering performance across multiple datasets with fewer input features, and better reconstruction fidelity and agent performance in world-model RL settings, with competitive or lower parameter counts. Because it is label-free and architecture-agnostic, DFS improves the robustness and interpretability of latent spaces in vision tasks and may apply more broadly to unsupervised learning and generative modeling.
Abstract
Latent representations are critical for the performance and robustness of machine learning models, as they encode the essential features of data in a compact and informative manner. However, in vision tasks, these representations are often affected by noisy or irrelevant features, which can degrade the model's performance and generalization capabilities. This paper presents a novel approach for enhancing latent representations using unsupervised Dynamic Feature Selection (DFS). For each instance, the proposed method identifies and removes misleading or redundant information in the image, ensuring that only the most relevant features contribute to the latent space. Because the framework is unsupervised, our approach does not rely on labeled data, making it broadly applicable across domains and datasets. Experiments conducted on image datasets demonstrate that models equipped with unsupervised DFS achieve significant improvements in generalization performance across various tasks, including clustering and image generation, while incurring only a minimal increase in computational cost.
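To make the per-sample masking described in the TL;DR concrete, the following is a minimal NumPy sketch of a hard-concrete gate followed by a top-$M$ selection. It is an illustration under stated assumptions, not the authors' implementation: the function names `hard_concrete_gate` and `top_m_mask`, the stretch parameters `beta`, `gamma`, and `zeta`, and the random `log_alpha` (standing in for the output of a learned selector network) are all hypothetical, and the hard top-$M$ step shown here is only one possible way to enforce the budget.

```python
import numpy as np

def hard_concrete_gate(log_alpha, rng, beta=2 / 3, gamma=-0.1, zeta=1.1):
    """Sample stretched-and-clipped (hard-concrete) gates in [0, 1].

    log_alpha: per-feature gate logits, any shape (assumed to come from
    a selector network; here it is just an input array).
    """
    u = rng.uniform(1e-6, 1 - 1e-6, size=log_alpha.shape)
    # Concrete (relaxed Bernoulli) sample in (0, 1).
    s = 1.0 / (1.0 + np.exp(-(np.log(u) - np.log(1 - u) + log_alpha) / beta))
    # Stretch to (gamma, zeta), then clip so exact 0s and 1s can occur.
    return np.clip(s * (zeta - gamma) + gamma, 0.0, 1.0)

def top_m_mask(gates, m):
    """Keep the m largest gates of each sample and zero out the rest.

    gates: array of shape (batch, height, width); the 2-D layout is
    restored after the selection. Requires 1 <= m <= height * width.
    """
    b, h, w = gates.shape
    flat = gates.reshape(b, -1)
    # Indices of the m largest gate values per sample (unordered).
    idx = np.argpartition(-flat, m - 1, axis=1)[:, :m]
    keep = np.zeros_like(flat, dtype=bool)
    np.put_along_axis(keep, idx, True, axis=1)
    return np.where(keep, flat, 0.0).reshape(b, h, w)

# Toy usage: 4 single-channel 8x8 "images", budget of M = 16 features each.
rng = np.random.default_rng(0)
images = rng.random((4, 8, 8))
log_alpha = rng.normal(size=(4, 8, 8))  # hypothetical selector output
gates = hard_concrete_gate(log_alpha, rng)
masked = images * top_m_mask(gates, m=16)
```

Because the gates can be clipped to exactly zero, each sample keeps at most $M$ nonzero features rather than exactly $M$. The hard top-$M$ step is not differentiable on its own, so a trainable version would need a relaxation or a straight-through estimator; that choice is left open here.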
