Shaving Logs via Large Sieve Inequality: Faster Algorithms for Sparse Convolution and More

Ce Jin; Yinzhan Xu

TL;DR

The paper introduces a novel application of the large sieve inequality to mod-prime hashing, enabling faster sparse-convolution-based algorithms while addressing collision and derandomization drawbacks. It delivers a spectrum of results: a Monte Carlo $O(t\log t)$-time Sparse General Convolution in small universes, a Las Vegas $O(t\log t)$-time Sparse Nonnegative Convolution with high-probability guarantees, and deterministic improvements for Text-to-Pattern Hamming Distances together with a deterministic modulus-selection framework. The techniques combine Prony's method (in several variants), FFT-based bucketing, universe-reduction, and conditional-expectation-based derandomization, achieving near-linear scaling in the output sparsity and constructive results under word-RAM assumptions. The findings yield faster algorithms for sparse convolution, pattern-matching subproblems, and the Constellation problem, with practical implications for efficient sparse-additive computations in the standard RAM model. The work also opens directions for relaxing universes, extending to general convolution, and further derandomization using analytic-number-theory tools.

Abstract

In sparse convolution-type problems, a common technique is to hash the input integers modulo a random prime $p\in [Q/2,Q]$ for some parameter $Q$, which reduces the range of the input integers while preserving their additive structure. However, this hash family suffers from two drawbacks, which led to bottlenecks in many state-of-the-art algorithms: (1) The collision probability of two elements from $[N]$ is $O(\frac{\log N}{Q})$ rather than $O(\frac{1}{Q})$; (2) It is difficult to derandomize the choice of $p$; known derandomization techniques lead to super-logarithmic overhead [Chan, Lewenstein STOC'15]. In this paper, we partially overcome these drawbacks in certain scenarios, via novel applications of the large sieve inequality from analytic number theory. Consequently, we obtain the following improved algorithms for various problems (in the standard word RAM model):

Sparse Nonnegative Convolution: We obtain an $O(t\log t)$-time Las Vegas algorithm that computes the convolution $A\star B$ of two nonnegative integer vectors $A,B$, where $t$ is the output sparsity $\|A\star B\|_0$. Moreover, our algorithm terminates in $O(t\log t)$ time with $1-1/\mathrm{poly}(t)$ probability.

Text-to-Pattern Hamming Distances: Given a length-$m$ pattern $P$ and a length-$n$ text $T$, we obtain a deterministic $O(n\sqrt{m\log \log m})$-time algorithm that exactly computes the Hamming distance between $P$ and every length-$m$ substring of $T$.

Sparse General Convolution: We also give a Monte Carlo $O(t\log t)$-time algorithm for sparse convolution with possibly negative input in the restricted case where the length $N$ of the input vectors satisfies $N\le t^{1.99}$.
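As a rough illustration of the mod-prime hashing the abstract describes, the Python sketch below hashes indices with $h(x) = x \bmod p$ for a random prime $p\in[Q/2,Q]$ and checks that hashing commutes with convolution, which is what "preserving additive structure" means here. It is not the paper's algorithm: the helper names (`random_prime`, `fold`, `cyclic_convolve`, `sparse_convolve`), the trial-division primality test, and the quadratic-time convolutions are all assumptions chosen for readability.

```python
import random


def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for the small Q used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def random_prime(Q: int, rng: random.Random) -> int:
    """Sample a uniformly random prime p with Q/2 <= p <= Q."""
    primes = [p for p in range((Q + 1) // 2, Q + 1) if is_prime(p)]
    return rng.choice(primes)


def sparse_convolve(A: dict, B: dict) -> dict:
    """Naive convolution of sparse vectors stored as {index: value}."""
    out = {}
    for i, a in A.items():
        for j, b in B.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out


def fold(V: dict, p: int) -> dict:
    """Hash indices with h(x) = x mod p, summing entries that collide."""
    out = {}
    for i, v in V.items():
        out[i % p] = out.get(i % p, 0) + v
    return out


def cyclic_convolve(A: dict, B: dict, p: int) -> dict:
    """Convolution with indices taken modulo p (length-p cyclic convolution)."""
    out = {}
    for i, a in A.items():
        for j, b in B.items():
            k = (i + j) % p
            out[k] = out.get(k, 0) + a * b
    return out


# Hypothetical sparse inputs over a large universe, with positive entries.
A = {3: 2, 1000: 1, 123456: 5}
B = {0: 1, 7: 4, 99999: 3}
p = random_prime(64, random.Random(0))

# Additive structure is preserved: hashing the output equals convolving the
# hashed inputs, because h(i + j) = (h(i) + h(j)) mod p.
assert fold(sparse_convolve(A, B), p) == cyclic_convolve(fold(A, p), fold(B, p), p)
```

The hashed vectors have length at most $p \le Q$, so a dense FFT could replace `cyclic_convolve` in a real implementation. Distinct indices that collide under $h$ are merged by `fold`, which is the source of the collision-probability drawback (1).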

Paper Structure (34 sections, 68 theorems, 101 equations, 6 algorithms)


Introduction
Technical overview
Further related works
Open questions
Preliminaries
Notations
Machine model
Large sieve inequality and its consequences
Basic algebraic tools
Prony's method
Sparsity test
Known lemmas for sparse convolution
Sparse General Convolution in Small Universe
The algorithm
Las Vegas Sparse Nonnegative Convolution
...and 19 more sections

Key Result

Theorem 1.1

Given two vectors $A,B\in \mathbb{N}^N$, one can compute $A\star B$ by a Las Vegas algorithm that terminates in $O(t\log t)$ time with at least $1-1/t$ probability, where $t=\|A\star B\|_0$.
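To make the quantities in Theorem 1.1 concrete, the Python sketch below computes $A\star B$ for two small nonnegative vectors and reports the output sparsity $t=\|A\star B\|_0$. It is only a quadratic-time baseline for illustrating the definitions, not the $O(t\log t)$ Las Vegas algorithm of the theorem; the function name `convolve_nonneg` and the example vectors are assumptions.

```python
def convolve_nonneg(A: list, B: list) -> dict:
    """Return the nonzero entries of A * B as {index: value}.

    A and B are dense lists of nonnegative integers. Only nonzero input
    entries are paired, so the cost is (nonzeros in A) * (nonzeros in B).
    """
    nonzero_a = [(i, a) for i, a in enumerate(A) if a != 0]
    nonzero_b = [(j, b) for j, b in enumerate(B) if b != 0]
    out = {}
    for i, a in nonzero_a:
        for j, b in nonzero_b:
            out[i + j] = out.get(i + j, 0) + a * b
    return out


# Hypothetical example vectors.
A = [1, 0, 0, 2]
B = [0, 3, 0, 0, 1]
C = convolve_nonneg(A, B)
t = len(C)  # output sparsity ||A * B||_0

print(C)  # {1: 3, 4: 7, 7: 2}
print(t)  # 3
```

Because the inputs are nonnegative, no cancellation can occur, so every index $i+j$ with $A_i, B_j > 0$ appears in the output; this is the structural property that distinguishes the nonnegative setting from Sparse General Convolution.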

Theorems & Definitions (134)

Theorem 1.1: Las Vegas Sparse Nonnegative Convolution w.h.p.
Theorem 1.2: Constellation
Theorem 1.3: Sparse General Convolution in small universe
Theorem 1.4
Theorem 1.5: Deterministic $X+Y$ lemma
Theorem 2.1: see e.g., Iwaniec and Kowalski
Theorem 2.2: Wolke
Corollary 2.3
Lemma 2.4
proof
...and 124 more
