Attribute-Efficient PAC Learning of Sparse Halfspaces with Constant Malicious Noise Rate

Shiwei Zeng, Jie Shen

TL;DR

The paper studies robust, attribute-efficient PAC learning of an $s$-sparse halfspace in $\mathbb{R}^d$ under a constant malicious noise rate. It integrates an $L_\infty$-norm filter, soft outlier removal via SDP relaxation, and hinge-loss minimization over an $L_1$/$L_2$ sparsity-constrained set to achieve robustness under a margin-and-concentration framework with a mixture of logconcave marginals. The main technical contribution is a gradient analysis under the joint sparsity constraints that ensures the learned halfspace remains close to the ground truth despite adversarial perturbations, yielding $\text{poly}(s,\log d)$ sample complexity and tolerance to constant malicious noise. This work advances practical, provably robust sparse classification in high-dimensional, noisy environments by combining ideas from robust statistics, compressed sensing, and hinge-loss optimization.
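
The pipeline described above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' algorithm: the threshold `tau`, the margin parameter `gamma`, the constraint set $\{w : \|w\|_2 \le 1,\ \|w\|_1 \le \sqrt{s}\}$, the step-size schedule, and all function names are illustrative choices that this summary does not specify. The SDP-based soft outlier removal is not implemented; the `weights` argument only marks where its output would enter. The map `to_feasible` is a simple heuristic that returns a feasible point, not the exact projection onto the intersection of the two balls.

```python
import numpy as np


def linf_filter(X, y, tau):
    """Keep only samples whose largest coordinate magnitude is at most tau.

    tau is a hypothetical threshold; the summary does not state its value.
    """
    keep = np.max(np.abs(X), axis=1) <= tau
    return X[keep], y[keep]


def project_l1_ball(v, radius):
    """Euclidean projection of v onto {u : ||u||_1 <= radius} (sort-based)."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)


def to_feasible(v, s):
    """Map v into {w : ||w||_2 <= 1, ||w||_1 <= sqrt(s)} (assumed set).

    Heuristic: L1 projection followed by radial shrinking. Shrinking can only
    decrease the L1 norm, so both constraints hold afterwards, but this is
    not the exact Euclidean projection onto the intersection.
    """
    w = project_l1_ball(v, np.sqrt(s))
    norm = np.linalg.norm(w)
    return w / norm if norm > 1.0 else w


def hinge_loss(w, X, y, gamma, weights):
    """Weighted average of max(0, 1 - y * <w, x> / gamma)."""
    margins = y * (X @ w) / gamma
    return float(np.sum(weights * np.maximum(0.0, 1.0 - margins)) / np.sum(weights))


def minimize_hinge(X, y, s, gamma, weights=None, steps=500, lr=0.05):
    """Subgradient descent on the weighted hinge loss, kept feasible each step.

    weights stands in for the output of soft outlier removal (uniform here).
    Returns the iterate with the lowest hinge loss seen.
    """
    n, d = X.shape
    weights = np.ones(n) if weights is None else weights
    w = np.zeros(d)
    best_w, best_loss = w, hinge_loss(w, X, y, gamma, weights)
    for t in range(1, steps + 1):
        active = (y * (X @ w) / gamma < 1.0).astype(float)
        grad = -((weights * active * y) @ X) / (gamma * weights.sum())
        w = to_feasible(w - lr / np.sqrt(t) * grad, s)
        loss = hinge_loss(w, X, y, gamma, weights)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w
```

A typical call would first apply `linf_filter` to discard points with an abnormally large coordinate and then pass the surviving samples to `minimize_hinge`. The L1 constraint is what ties the solution to sparsity: an $s$-sparse unit vector always satisfies $\|w\|_1 \le \sqrt{s}$, so the assumed set is a convex relaxation that contains every such vector.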

Abstract

Attribute-efficient learning of sparse halfspaces has been a fundamental problem in machine learning theory. In recent years, machine learning algorithms have faced prevalent data corruptions and even adversarial attacks, so it is of central interest to design efficient algorithms that are robust to such noise. In this paper, we consider the setting where a constant fraction of the data is corrupted by malicious noise, and the goal is to learn an underlying $s$-sparse halfspace $w^* \in \mathbb{R}^d$ with $\text{poly}(s,\log d)$ samples. Specifically, we follow a recent line of work and assume that the underlying distribution satisfies a certain concentration condition and a margin condition at the same time. Under such conditions, we show that attribute-efficiency can be achieved by simple variants of existing hinge loss minimization programs. Our key contributions include: 1) an attribute-efficient PAC learning algorithm that works under a constant malicious noise rate; 2) a new gradient analysis that carefully handles the sparsity constraint in hinge loss minimization.
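
To make the noise model concrete, the following is a small simulation of a malicious-noise sample oracle in the standard sense: each example is clean with probability $1-\eta$ and is replaced by an arbitrary adversarial pair $(x,y)$ with probability $\eta$. It is an illustrative sketch only. The standard Gaussian marginal is a stand-in (it does not satisfy the margin condition assumed in the paper), and the function names and the particular adversary `flip_and_push` are hypothetical; a real malicious adversary may choose its points far more cleverly.

```python
import numpy as np


def draw_malicious_sample(n, w_star, eta, rng, adversary):
    """Draw n labeled examples, each corrupted independently with prob. eta.

    Clean examples: x from a standard Gaussian (stand-in marginal, an
    assumption) and y = sign(<w_star, x>). Corrupted examples are whatever
    the adversary callback returns.
    """
    d = w_star.size
    X = rng.standard_normal((n, d))
    y = np.sign(X @ w_star)
    y[y == 0] = 1.0  # break ties so every label is +1 or -1
    corrupted = rng.random(n) < eta
    for i in np.nonzero(corrupted)[0]:
        X[i], y[i] = adversary(X, y, i)
    return X, y, corrupted


def flip_and_push(X, y, i):
    """Hypothetical adversary: flip the label and scale the point by 5."""
    return 5.0 * X[i], -y[i]
```

Returning the `corrupted` mask is only for inspection in experiments; a learner in this model never sees which examples were replaced.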

Paper Structure

This paper contains 24 sections, 37 theorems, 89 equations, 3 algorithms.

Key Result

Theorem 2

Assume that Assumptions ass:dataset-margin and ass:distribution-marginal hold. Let $S$ be a set of samples drawn from $\text{EX}(\mathcal{D},w^*,\eta)$ with $\left\lvert S \right\rvert \geq \Omega\left(\frac{s^2}{\bar{\gamma}^2}\cdot\log^5\frac{d}{\delta\epsilon}\right)$ and $\eta \leq \eta_0 \leq \frac{1}{2^{32}}$. For any $\epsilon\in(0,\frac{1}{2})$, $\delta$ ...

Theorems & Definitions (69)

  • Definition 1: Learning sparse halfspaces with malicious noise
  • Theorem 2: Main result
  • Remark 3: Comparison to existing works
  • Remark 4: Noise rate bound
  • Remark 5: Adversarial label noise
  • Definition 6: Dense pancake condition
  • Definition 7: Gradient norm
  • Lemma 8
  • proof
  • Theorem 9: Correctness of $\hat{w}$ on good $(x,y)$
  • ...and 59 more