Seed Selection in the Heterogeneous Moran Process

Petros Petsinis, Andreas Pavlogiannis, Josef Tkadlec, Panagiotis Karras

TL;DR

This work studies seed selection in the Heterogeneous Moran process: given a budget k, which k agents should initiate the mutant invasion to maximize the fixation probability? It shows that the problem is strongly inapproximable: it is NP-hard to distinguish between maximum fixation probability 0 and 1.

Abstract

The Moran process is a classic stochastic process that models the rise and takeover of novel traits in network-structured populations. In biological terms, a set of mutants, each with fitness $m\in(0,\infty)$, invades a population of residents with fitness $1$. Each agent reproduces at a rate proportional to its fitness, and each offspring replaces a random network neighbor. The process ends when the mutants either fixate (take over the whole population) or go extinct. The fixation probability measures the success of the invasion. To account for environmental heterogeneity, we study a generalization of the standard process, called the Heterogeneous Moran process. Here, the fitness of each agent is determined both by its type (resident/mutant) and the node it occupies. We study the natural optimization problem of seed selection: given a budget $k$, which $k$ agents should initiate the mutant invasion to maximize the fixation probability? We show that the problem is strongly inapproximable: it is $\mathbf{NP}$-hard to distinguish between maximum fixation probability 0 and 1. We then focus on mutant-biased networks, where each node exhibits at least as large mutant fitness as resident fitness. We show that the problem remains $\mathbf{NP}$-hard, but the fixation probability becomes submodular, and thus the optimization problem admits a greedy $(1-1/e)$-approximation. An experimental evaluation of the greedy algorithm along with various heuristics on real-world data sets corroborates our results.
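
The process and the greedy strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`simulate_fixation`, `estimate_fixation`, `greedy_seeds`), the adjacency-list representation, and the use of Monte Carlo estimates in place of exact fixation probabilities are all assumptions made for illustration. The sketch also assumes strictly positive fitness values and a connected graph in which every node has at least one neighbor.

```python
import random


def simulate_fixation(adj, mutant_fit, resident_fit, seeds, rng, max_steps=1_000_000):
    """Run one trajectory of the process; return True if the mutants fixate.

    adj[u] lists the neighbors of node u; mutant_fit[u] and resident_fit[u]
    give the fitness of a mutant or resident occupying node u (assumed > 0).
    """
    n = len(adj)
    is_mutant = [False] * n
    for s in seeds:
        is_mutant[s] = True
    count = sum(is_mutant)
    for _ in range(max_steps):
        if count == 0:
            return False  # extinction
        if count == n:
            return True  # fixation
        # Fitness depends on both the agent's type and the node it occupies.
        weights = [mutant_fit[u] if is_mutant[u] else resident_fit[u] for u in range(n)]
        u = rng.choices(range(n), weights=weights)[0]  # reproducing agent
        v = rng.choice(adj[u])  # neighbor replaced by the offspring
        if is_mutant[v] != is_mutant[u]:
            count += 1 if is_mutant[u] else -1
            is_mutant[v] = is_mutant[u]
    raise RuntimeError("no absorption within max_steps")


def estimate_fixation(adj, mutant_fit, resident_fit, seeds, trials=2000, seed=0):
    """Monte Carlo estimate of the fixation probability for a seed set."""
    rng = random.Random(seed)
    wins = sum(
        simulate_fixation(adj, mutant_fit, resident_fit, seeds, rng)
        for _ in range(trials)
    )
    return wins / trials


def greedy_seeds(adj, mutant_fit, resident_fit, k, trials=2000):
    """Greedily add the node with the largest estimated fixation probability."""
    chosen = []
    for _ in range(k):
        candidates = [u for u in range(len(adj)) if u not in chosen]
        best = max(
            candidates,
            key=lambda u: estimate_fixation(
                adj, mutant_fit, resident_fit, chosen + [u], trials
            ),
        )
        chosen.append(best)
    return chosen
```

For example, `greedy_seeds([[1, 2], [0, 2], [0, 1]], [2.0] * 3, [1.0] * 3, k=1)` picks one seed on a triangle where every mutant has fitness 2 and every resident fitness 1. Because the sketch works with sampled estimates, the $(1-1/e)$ guarantee stated for mutant-biased networks would hold only up to estimation error.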

Paper Structure (8 sections, 19 theorems, 35 equations, 8 figures, 1 table)

Key Result

Lemma 1

Given an undirected and mutant-biased fitness graph $\mathcal{G}$ and a seed set $S\subseteq V$, the expected time to convergence $T(\mathcal{G},S)$ satisfies $T(\mathcal{G},S)\leq(n^2\cdot\frac{m_{\max}}{r_{\min}})^3$.
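
To get a feel for the magnitude of this bound, the small helper below evaluates it exactly. The name `convergence_time_bound` is hypothetical, and the reading of $m_{\max}$ as the largest mutant fitness and $r_{\min}$ as the smallest resident fitness over all nodes is an assumption, since this excerpt does not define them.

```python
from fractions import Fraction


def convergence_time_bound(n, m_max, r_min):
    """Evaluate the Lemma 1 upper bound (n^2 * m_max / r_min)^3 exactly.

    n is the number of nodes; m_max and r_min are assumed to be the largest
    mutant fitness and the smallest resident fitness, respectively.
    """
    return (Fraction(n) ** 2 * Fraction(m_max) / Fraction(r_min)) ** 3
```

With $n = 10$, $m_{\max} = 2$ and $r_{\min} = 1$, the bound evaluates to $(100 \cdot 2)^3 = 8 \cdot 10^6$ steps, so it is polynomial in $n$ whenever the fitness ratio is.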

Figures (8)

  • Figure 1: Moran processes (with and without environmental heterogeneity) and the complexity of seed selection.
  • Figure 2: Two steps in the Heterogeneous Moran process; mutants/residents are marked in red/blue; the numbers indicate type-dependent mutant/resident fitness (top/bottom).
  • Figure 3: Optimal seed set $S^*$ (in red) while varying the mutant fitness and seed size $k$; all residents have fitness 1.
  • Figure 4: (Left): Graph $G$ for a Set Cover instance with $\mathcal{U}=\{1,2,3,4,5\}$ and $\mathcal{S}=\{ \{1,4\}, \{1,2,4\}, \{3,5\} \}$. (Right): For $k=2$, the optimal seed set forms a Set Cover.
  • Figure 5: The Markov chain for the process of \ref{lem:lower_mutant}.
  • ...and 3 more figures

Theorems & Definitions (33)

  • Lemma 1
  • proof
  • Corollary 1
  • Lemma 2
  • Theorem 1
  • proof : Proof sketch
  • Theorem 2
  • proof : Proof sketch
  • Lemma 3
  • proof : Proof sketch of \ref{lem:hardness_main}, Item 1
  • ...and 23 more