Adaptive Sparse Möbius Transforms for Learning Polynomials

Yigit Efe Erginbas, Justin Singh Kang, Elizabeth Polito, Kannan Ramchandran

TL;DR

This work tackles the problem of exactly learning $s$-sparse real-valued Boolean polynomials of degree $d$ on the Boolean cube in the $\mathsf{AND}$ (Möbius) basis, where coherence among the basis vectors makes standard compressed sensing ineffective. It introduces two adaptive algorithms, FASMT and PASMT, that use group testing designs to recover the nonzero Möbius coefficients with provable query complexity: FASMT uses $O\big(sd\log(n/d)\big)$ adaptive queries and $O\big((sd+n)\,sd\log(n/d)\big)$ time, while PASMT uses $O\big(sd^2\log n\big)$ queries in $O(d^2\log n)$ adaptive rounds. An information-theoretic lower bound of $\Omega\big(\frac{sd\log(n/d)}{\log s}\big)$ shows that FASMT is near-optimal in query complexity, up to the $\log s$ factor. Applied to hypergraph reconstruction from induced-subgraph (edge-count) queries, the methods need $O(sd\log n)$ additive queries and are demonstrated on real and synthetic hypergraphs. The work provides a practical, adaptive framework linking Möbius inversion, group testing, and sparse learning, with potential impact on graph and network inference tasks.

Abstract

We consider the problem of exactly learning an $s$-sparse real-valued Boolean polynomial of degree $d$ of the form $f:\{ 0,1\}^n \rightarrow \mathbb{R}$. This problem corresponds to decomposing functions in the AND basis and is known as taking a Möbius transform. While the analogous problem for the parity basis (Fourier transform) $f: \{-1,1 \}^n \rightarrow \mathbb{R}$ is well-understood, the AND basis presents a unique challenge: the basis vectors are coherent, precluding standard compressed sensing methods. We overcome this challenge by identifying that we can exploit adaptive group testing to provide a constructive, query-efficient implementation of the Möbius transform (also known as Möbius inversion) for sparse functions. We present two algorithms based on this insight. The Fully-Adaptive Sparse Möbius Transform (FASMT) uses $O(sd \log(n/d))$ adaptive queries in $O((sd + n) sd \log(n/d))$ time, which we show is near-optimal in query complexity. Furthermore, we also present the Partially-Adaptive Sparse Möbius Transform (PASMT), which uses $O(sd^2\log(n/d))$ queries, trading a factor of $d$ to reduce the number of adaptive rounds to $O(d^2\log(n/d))$, with no dependence on $s$. When applied to hypergraph reconstruction from edge-count queries, our results improve upon baselines by avoiding the combinatorial explosion in the rank $d$. We demonstrate the practical utility of our method for hypergraph reconstruction by applying it to learning real hypergraphs in simulations.
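
To make the AND-basis decomposition concrete, the following is a minimal sketch of a dense (brute-force) Möbius transform on a toy edge-count function. It is not FASMT or PASMT: it queries all $2^n$ inputs and only illustrates the expansion $f(\mathbf x) = \sum_{\mathbf k \le \mathbf x} F(\mathbf k)$ and its standard inverse. The symbol $F$ for the Möbius coefficients, the helper names, and the toy hypergraph are assumptions made for illustration.

```python
from itertools import product

def leq(a, b):
    """Component-wise a <= b for 0/1 tuples."""
    return all(ai <= bi for ai, bi in zip(a, b))

def evaluate(coeffs, x):
    """AND-basis expansion: f(x) = sum of F(k) over all k <= x."""
    return sum(v for k, v in coeffs.items() if leq(k, x))

def mobius_transform(f, n):
    """Dense Mobius inversion: F(k) = sum_{x <= k} (-1)^(|k|-|x|) f(x).

    Uses all 2^n queries, so it is only feasible for tiny n.
    """
    coeffs = {}
    for k in product((0, 1), repeat=n):
        total = 0.0
        for x in product((0, 1), repeat=n):
            if leq(x, k):
                total += (-1) ** (sum(k) - sum(x)) * f(x)
        if abs(total) > 1e-9:  # keep only the nonzero coefficients
            coeffs[k] = total
    return coeffs

# Hypothetical toy hypergraph on n = 4 vertices with hyperedges {0,1} and {1,2,3}.
# Each hyperedge is the support of one Mobius coefficient (s = 2, d = 3).
edges = {(1, 1, 0, 0): 1.0, (0, 1, 1, 1): 1.0}

def edge_count(x):
    """Edge-count query: number of hyperedges contained in vertex set x."""
    return evaluate(edges, x)

recovered = mobius_transform(edge_count, 4)
```

In this toy case `recovered` equals `edges`, so the nonzero coefficients are exactly the hyperedges. The sparse algorithms described in the abstract aim to find the same support with far fewer than $2^n$ queries.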

Paper Structure (45 sections, 14 theorems, 29 equations, 3 figures, 2 tables)

Key Result

Lemma 1

Consider $\mathbf H \in \{0,1\}^{n \times t}$ and $\boldsymbol{\ell} \in \{0,1\}^t$. Define the query vector $\mathbf x = \neg(\mathbf H \cdot \neg\boldsymbol{\ell})$. Then where the inequality is component-wise over the Boolean semiring.
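
As a sanity check on how the query vector interacts with the component-wise inequality, here is a minimal sketch under an assumed reading of the lemma: that querying $\mathbf x = \neg(\mathbf H \cdot \neg\boldsymbol{\ell})$ selects exactly those $\mathbf k$ with $\mathbf H^\top \mathbf k \le \boldsymbol{\ell}$ over the Boolean semiring, so that $f(\mathbf x)$ sums the Möbius coefficients at those $\mathbf k$. The code verifies only the set equivalence $\mathbf k \le \mathbf x \iff \mathbf H^\top \mathbf k \le \boldsymbol{\ell}$ by brute force on small random instances; the function names and test sizes are hypothetical.

```python
import random
from itertools import product

def query_vector(H, ell):
    """x = NOT(H . NOT(ell)), with the matrix product over the Boolean semiring."""
    n, t = len(H), len(ell)
    return tuple(
        int(not any(H[i][j] and not ell[j] for j in range(t)))
        for i in range(n)
    )

def bool_HT_k(H, k):
    """(H^T k)_j = OR over i of (H[i][j] AND k[i])."""
    n, t = len(H), len(H[0])
    return tuple(
        int(any(H[i][j] and k[i] for i in range(n)))
        for j in range(t)
    )

def check_subsampling(n=4, t=3, trials=20, seed=0):
    """Brute-force check that k <= x holds exactly when H^T k <= ell."""
    rng = random.Random(seed)
    for _ in range(trials):
        H = [[rng.randint(0, 1) for _ in range(t)] for _ in range(n)]
        for ell in product((0, 1), repeat=t):
            x = query_vector(H, ell)
            for k in product((0, 1), repeat=n):
                in_query = all(ki <= xi for ki, xi in zip(k, x))
                in_bin = all(a <= b for a, b in zip(bool_HT_k(H, k), ell))
                if in_query != in_bin:
                    return False
    return True

ok = check_subsampling()
```

Under this reading, a single query at $\mathbf x$ returns the sum of all coefficients whose test pattern $\mathbf H^\top \mathbf k$ is dominated by $\boldsymbol{\ell}$, which is the kind of aggregate a group-testing design can then refine.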

Figures (3)

  • Figure 1: Illustration of PASMT (left) and FASMT (right). Each node represents a bin $\mathcal{B}(\boldsymbol{\ell})$ with bin sum $V$. Dashed nodes indicate pruned bins with $V=0$. Circled numbers in the top-left corner of each node indicate the processing order. (a) In PASMT, nodes at the same depth are processed jointly using shared test vectors $\mathbf h_1 = 0011$, $\mathbf h_2 = 1010$, $\mathbf h_3 = 1100$. At the leaf nodes, the coefficient location $\mathbf k$ is recovered via $\textsc{Decode}(\mathbf H, \boldsymbol{\ell})$. (b) In FASMT, test vectors $\mathbf h_{\boldsymbol{\epsilon}} = 1011$, $\mathbf h_0 = 0100$, $\mathbf h_1 = 1010$, $\mathbf h_{10} = 0100$ are chosen adaptively based on the node location $\boldsymbol{\ell}$. Bins are processed sequentially in lexicographic order, with coefficients recovered via GBSA.
  • Figure 2: Optimality Ratio for FASMT on sparse hypergraph reconstruction. Each bar corresponds to a distinct hypergraph. We define the Optimality Ratio as $\frac{q \log(s)}{sd\log(n/d)}$. We evaluate performance on two datasets: the BiGG metabolic pathways ($n \in [485, 1877], d \in [75,331], s\in[29,73]$), modeling compounds as vertices and reactions as hyperedges; and the ISCAS-85 benchmark ($n \in [243, 3719], d \in [9, 17], s\in[211, 3611]$), modeling logic gates as vertices and wires as hyperedges. Further details are provided in Appendix \ref{app:experiment_details}. Standard compressed sensing methods are computationally intractable for hypergraphs of these dimensions.
  • Figure 3: Number of queries required as a function of $n$, $s$, and $d$. The top row shows the scaling of the PASMT algorithm; the bottom row shows the FASMT algorithm alongside the information-theoretic lower bound. Each data point averages $10$ samples, and the shaded region shows the standard deviation. Note the difference in scale on the $y$-axes between the top and bottom rows.
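
The Optimality Ratio defined in the Figure 2 caption can be computed directly from a query count $q$ and the parameters $n$, $s$, $d$. The sketch below assumes $n > d$ and $s > 1$; because the formula is a ratio of logarithms, the choice of log base does not matter. The function name and the numbers in the usage line are hypothetical and are not taken from the paper's experiments.

```python
import math

def optimality_ratio(q, n, s, d):
    """Optimality Ratio from the Figure 2 caption: q*log(s) / (s*d*log(n/d)).

    q: number of queries used, n: number of variables (vertices),
    s: sparsity (number of hyperedges), d: degree (rank).
    """
    if not (n > d > 0 and s > 1):
        raise ValueError("expected n > d > 0 and s > 1")
    return q * math.log(s) / (s * d * math.log(n / d))

# Hypothetical inputs chosen so the arithmetic is easy to follow:
# log(16) / log(1024 / 4) = 1/2, so the ratio is 2048 * 0.5 / (16 * 4) = 16.
example = optimality_ratio(q=2048, n=1024, s=16, d=4)
```

A ratio near a small constant means the query count is within a constant factor of the $\frac{sd\log(n/d)}{\log s}$ scaling of the lower bound.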

Theorems & Definitions (33)

  • Definition 1: Sparsity and Degree
  • Definition 2: Bins and Bin Sums
  • Lemma 1: Subsampling
  • proof
  • Lemma 1: Bin Refinement
  • proof
  • Theorem 2: Correctness and Complexity
  • proof
  • Definition 3: Residual Function
  • Lemma 2: Bin Refinement
  • ...and 23 more