Multi-label feature selection based on binary hashing learning and dynamic graph constraints
Cong Guo, Changqin Huang, Wenhua Zhou, Xiaodi Huang
TL;DR
High-dimensional multi-label data make it hard to extract reliable supervisory signals and are prone to label noise. The authors introduce BHDG, a method that uses binary hashing to generate low-dimensional pseudo-labels and couples them with dynamic-graph and label-graph constraints to guide feature selection. The objective, which balances regression, hashing, sparsity, and manifold regularization terms, is optimized jointly over W, P, and the binary variable B under an augmented Lagrangian framework. Experiments on ten benchmark datasets and six evaluation metrics show that BHDG outperforms ten state-of-the-art baselines in multi-label feature selection.
Abstract
Multi-label learning poses significant challenges in extracting reliable supervisory signals from the label space. Existing approaches often employ continuous pseudo-labels to replace binary labels, improving supervisory information representation. However, these methods can introduce noise from irrelevant labels and lead to unreliable graph structures. To overcome these limitations, this study introduces a novel multi-label feature selection method called Binary Hashing and Dynamic Graph Constraint (BHDG), the first method to integrate binary hashing into multi-label learning. BHDG utilizes low-dimensional binary hashing codes as pseudo-labels to reduce noise and improve representation robustness. A dynamically constrained sample projection space is constructed based on the graph structure of these binary pseudo-labels, enhancing the reliability of the dynamic graph. To further enhance pseudo-label quality, BHDG incorporates label graph constraints and inner product minimization within the sample space. Additionally, an $l_{2,1}$-norm regularization term is added to the objective function to facilitate the feature selection process. The augmented Lagrangian multiplier (ALM) method is employed to optimize binary variables effectively. Comprehensive experiments on 10 benchmark datasets demonstrate that BHDG outperforms ten state-of-the-art methods across six evaluation metrics. BHDG achieves the highest overall performance ranking, surpassing the next-best method by an average of at least 2.7 ranks per metric, underscoring its effectiveness and robustness in multi-label feature selection.
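To illustrate how an $l_{2,1}$-norm term can support feature selection, the following minimal sketch computes the $l_{2,1}$ norm of a weight matrix (the sum of the Euclidean norms of its rows) and ranks features by row norm. It is not the BHDG algorithm: the function names `l21_norm` and `rank_features`, the toy matrix, and the assumption that each row of `W` corresponds to one feature are illustrative choices, not details taken from the paper.

```python
import numpy as np


def l21_norm(W):
    """Return the l2,1 norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def rank_features(W, k):
    """Return the indices of the k rows of W with the largest Euclidean norm.

    Assumes (hypothetically) that row i of W holds the weights of feature i,
    so a near-zero row marks a feature that contributes little.
    """
    row_norms = np.linalg.norm(W, axis=1)
    # Stable sort on negated norms gives descending order with ties kept in index order.
    order = np.argsort(-row_norms, kind="stable")
    return order[:k]


# Toy weight matrix: 4 features (rows) by 2 outputs (columns).
W = np.array([
    [3.0, 4.0],   # row norm 5
    [0.0, 0.0],   # row norm 0, feature effectively discarded
    [1.0, 0.0],   # row norm 1
    [0.0, 2.0],   # row norm 2
])

total = l21_norm(W)            # 5 + 0 + 1 + 2
selected = rank_features(W, 2)  # the two rows with the largest norms
print(total, selected)
```

Because the penalty sums row norms rather than squared entries, minimizing it tends to drive entire rows toward zero, which is why the surviving rows can be read as the selected features.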
