Concentration Tail-Bound Analysis of Coevolutionary and Bandit Learning Algorithms

Per Kristian Lehre; Shishen Lin

TL;DR

This work develops a novel recurrence-based drift theorem that yields exponential tail bounds for first hitting times under a broad range of drift regimes, including positive, weak, zero, and negative drift, by leveraging variance properties and the extended Optional Stopping Theorem. The framework is then applied to diverse algorithms, producing strong high-probability guarantees: (i) the regret of RWAB concentrates in non-stationary two-armed bandits, and (ii) RLS-PD finds Nash equilibria in Bilinear maximin benchmarks with an $O(n^{1.5})$ runtime that concentrates, while also forgetting the Nash equilibrium with high probability (w.h.p.). The authors also derive tail bounds for classical problems such as Random 2-SAT and Graph Colouring, providing polynomial-time tails $O(n^{4})$. Empirical studies corroborate the theory, showing exponentially decaying tails for runtimes and regrets, and highlight practical implications for algorithm reliability and stability. Overall, the paper offers a general toolkit for sharp runtime and regret concentration in stochastic algorithms via drift recurrences and optional stopping, with clear avenues for future work on stabilising coevolutionary dynamics and refining bandit strategies.

Abstract

Runtime analysis, as a branch of the theory of AI, studies how the number of iterations an algorithm takes before finding a solution (its runtime) depends on the design of the algorithm and the problem structure. Drift analysis is a state-of-the-art tool for estimating the runtime of randomised algorithms, such as evolutionary and bandit algorithms. Drift refers roughly to the expected progress towards the optimum per iteration. This paper considers the problem of deriving concentration tail-bounds on the runtime/regret of algorithms. It provides a novel drift theorem that gives precise exponential tail-bounds given positive, weak, zero and even negative drift. Previously, such exponential tail bounds were missing in the case of weak, zero, or negative drift. Our drift theorem can be used to prove a strong concentration of the runtime/regret of algorithms in AI. For example, we prove that the regret of the RWAB bandit algorithm is highly concentrated, while previous analyses only considered the expected regret. This means that the algorithm obtains the optimum within a given time frame with high probability, i.e. a form of algorithm reliability. Moreover, our theorem implies that the time needed by the co-evolutionary algorithm RLS-PD to obtain a Nash equilibrium in a Bilinear max-min benchmark problem is highly concentrated. However, we also prove that the algorithm forgets the Nash equilibrium, and the time until this occurs is highly concentrated. This highlights a weakness in RLS-PD which should be addressed by future work.
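
To make the notion of a concentrated hitting time under zero drift concrete, the following is a minimal sketch and not the paper's theorem or any of its algorithms. It simulates a toy fair random walk on $\{0,\dots,n\}$ with a reflecting barrier at $n$, so the drift is zero away from the barrier, and reports how quickly the empirical tail of the hitting time of $0$ shrinks at multiples of the sample mean. The names `hitting_time`, `empirical_tail` and `main`, and all parameter values, are hypothetical choices made for illustration.

```python
import random
import statistics


def hitting_time(n, rng, max_steps=10**7):
    """Steps until a fair +/-1 walk started at n first reaches 0.

    Toy model (an assumption, not taken from the paper): the walk lives on
    {0, ..., n}; an upward step at n is clipped, so the barrier reflects.
    Away from the barrier the expected change per step is zero.
    """
    x, t = n, 0
    while x > 0 and t < max_steps:
        step = 1 if rng.random() < 0.5 else -1
        x = min(n, x + step)  # clip at the reflecting barrier
        t += 1
    return t


def empirical_tail(samples, threshold):
    """Fraction of samples strictly larger than the threshold."""
    return sum(s > threshold for s in samples) / len(samples)


def main(n=20, runs=2000, seed=0):
    """Return the sample mean and the tail at 1x, 2x, ..., 5x the mean."""
    rng = random.Random(seed)
    samples = [hitting_time(n, rng) for _ in range(runs)]
    mean = statistics.fmean(samples)
    tails = [(k, empirical_tail(samples, k * mean)) for k in range(1, 6)]
    return mean, tails


if __name__ == "__main__":
    mean, tails = main()
    print(f"sample mean hitting time: {mean:.1f}")
    for k, p in tails:
        print(f"empirical Pr(T > {k} x mean) = {p:.4f}")
```

In this toy model the expected hitting time from $n$ is $n(n+1)$, and the printed tail fractions should fall off rapidly as the multiple of the mean grows. That is the qualitative behaviour an exponential tail bound describes; the simulation does not verify any specific constant from the paper.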

Paper Structure (29 sections, 29 theorems, 50 equations, 5 figures, 2 tables, 5 algorithms)

Introduction
Related Works
Our Contributions
Preliminaries
Previous Works and Discussion
A Recurrent Method in Upper Tail Bound
Variance Overcomes Negative Drift w.h.p.
Standard Variance Drift
Standard Drift
Applications to Random 2-SAT and Graph Colouring
Applications to Random 2-SAT
Applications to Graph Colouring
Applications to Coevolutionary Algorithms
The Bilinear Problem
RLS-PD solves Bilinear efficiently w.h.p.
...and 14 more sections

Key Result

Lemma 1

Let $(X_{t})_{t\geq 0}$ be random variables over $\mathbb{R}_{\geq 0}$, each with finite expectation. Let $T$ be any stopping time of $X_{t}$. If there exist constants $r,\eta>0$ with respect to $j,t$ such that for any $j\geq 0$, $\mathrm{E}\left( \mathds{1}_{\{T > t\}} \mathds{1}_{\{|X_{t} \ldots\}} \ldots \right)$ …

Figures (5)

Figure 1: Runtime distribution for RLS-PD for various $\alpha$ and $\beta$.
Figure 2: Regret distribution for various values of $T$ and $L$.
Figure 3: Classes of accumulated Regret along time horizon.
Figure 4: Runtime distribution for RLS-PD for various values of $\alpha$ and $\beta$, $n=1000$.
Figure 5: Regret distribution for various values of $T$ and $L$.

Theorems & Definitions (55)

Definition 1
Definition 2
Definition 3
Lemma 1
Theorem 1
Theorem 2
Corollary 2
Theorem 3
Theorem 4
Theorem 5
...and 45 more
