Deciding if a DAG is Interesting is Hard

Jean-Lou De Carufel, Anil Maheshwari, Saeed Odak, Bodhayan Roy, Michiel Smid, Marc Vicuna

TL;DR

This work analyzes Mapper-graph-inspired optimization problems on edge-weighted directed acyclic graphs, focusing on the interestingness score $\texttt{score}(\Pi) = \sum_{i=1}^{\ell} w(e_i) \cdot \log_2(i+1)$. It gives polynomial-time reductions from $3$-SAT to the IP problem and from $(3,2)$-set-cover to the $k$-IP problem, proving that IP is NP-hard (even with only two distinct weights) and that $k$-IP is NP-hard for every fixed $k \ge 3$ (even with unit weights). The reductions use elaborate variable/clause gadgets and set-system constructions. The paper also discusses why NP-membership is difficult to establish, given the transcendental nature of logarithmic sums. The results motivate exploring approximation strategies, with a straightforward greedy $1/k$-approximation highlighted as a baseline, and leave several avenues open for future research on hardness and algorithm design for Mapper-related optimization problems.

Abstract

The interestingness score of a directed path $\Pi = e_1, e_2, e_3, \dots, e_\ell$ in an edge-weighted directed graph $G$ is defined as $\texttt{score}(\Pi) := \sum_{i=1}^\ell w(e_i) \cdot \log{(i+1)}$, where $w(e_i)$ is the weight of the edge $e_i$. We consider two optimization problems that arise in the analysis of Mapper graphs, which are a powerful tool in topological data analysis. In the IP problem, the objective is to find a collection $\mathcal{P}$ of edge-disjoint paths in $G$ with the maximum total interestingness score. For $k \in \mathbb{N}$, the $k$-IP problem is a variant of the IP problem with the extra constraint that each path in $\mathcal{P}$ must have exactly $k$ edges. Kalyanaraman, Kamruzzaman, and Krishnamoorthy (Journal of Computational Geometry, 2019) claim that both IP and $k$-IP (for $k \geq 3$) are NP-complete. We point out some inaccuracies in their proofs. Furthermore, we show that both problems are NP-hard in directed acyclic graphs.
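To make the score definition concrete, the following is a minimal sketch of how it could be evaluated for a single path and for a collection of edge-disjoint paths. The function names `path_score` and `total_score`, the edge representation as `(tail, head)` tuples, and the default logarithm base of 2 (taken from the TL;DR; the abstract leaves the base unspecified) are illustrative assumptions, not part of the paper.

```python
import math


def path_score(weights, base=2.0):
    """Score of one path, given its edge weights in path order.

    The i-th edge (1-indexed) contributes w(e_i) * log(i + 1).
    The base is an assumption; 2 matches the TL;DR formula.
    """
    return sum(w * math.log(i + 1, base) for i, w in enumerate(weights, start=1))


def total_score(paths, weight, base=2.0):
    """Total score of a collection of paths.

    paths:  list of paths, each a list of edges (u, v) in order (hypothetical format).
    weight: dict mapping each edge (u, v) to its weight.
    Raises ValueError if a path is not contiguous or an edge is reused.
    """
    used = set()
    total = 0.0
    for path in paths:
        # Consecutive edges must share an endpoint: head of one = tail of next.
        for (_, v), (x, _) in zip(path, path[1:]):
            if v != x:
                raise ValueError("edges do not form a directed path")
        # Edge-disjointness across the whole collection.
        for e in path:
            if e in used:
                raise ValueError(f"edge {e} is used more than once")
            used.add(e)
        total += path_score([weight[e] for e in path], base)
    return total


if __name__ == "__main__":
    w = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0}
    one_long_path = [[("a", "b"), ("b", "c"), ("c", "d")]]
    three_short_paths = [[("a", "b")], [("b", "c")], [("c", "d")]]
    print(total_score(one_long_path, w))
    print(total_score(three_short_paths, w))
```

Because the factor $\log(i+1)$ grows with the position $i$, three unit-weight edges score more as one path than as three single-edge paths. This position dependence is why the choice of path decomposition matters in the IP problem.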

Paper Structure

This paper contains 4 sections, 2 theorems, 13 equations, 2 figures.

Key Result

Theorem 1

The IP problem is NP-hard.

Figures (2)

  • Figure 1: Let $f=C_1\land C_2\land\dots\land C_{\alpha_2} \land \dots \land C_{\alpha_3} \land \dots \land C_m$, where $C_{\alpha_2} = x_i \lor x_j \lor \overline{x}_k$ and $C_{\alpha_3} = \overline{x}_i \lor x_k \lor \overline{x}_{\ell}$. In this figure, we show the variable gadget for $x_i$ and the clause gadgets for $C_{\alpha_2}$ and $C_{\alpha_3}$. The vertices $\overline{y}_{i,\alpha_4}$ on the top and the bottom of the variable gadget are the same vertex (they are identified). The red and blue edges in the figure are type-$\mathrm{U}$ edges. They indicate the positive and negative appearances of the variable $x_i$, respectively.
  • Figure 2: Construction of the graph $G$ for a given instance $(U, \mathcal{S},\tau)$ of the $(3,2)$-set-cover problem. The red, blue, and green edges in $G$ represent the elements of $U$, the elements of the sets in $\mathcal{S}$, and the membership of elements of $U$ in the sets of $\mathcal{S}$, respectively.

Theorems & Definitions (9)

  • Remark 1
  • Theorem 1
  • proof
  • Claim 1
  • proof
  • Theorem 2
  • proof
  • Claim 2
  • proof