Resolvability of Hamming Graphs

Lucas Laird, Richard C. Tillquist, Stephen Becker, Manuel E. Lladser

TL;DR

This work addresses the resolvability (metric dimension) of Hamming graphs $\mathbb{H}_{k,a}$ by recasting the problem as a constrained linear system on one-hot encodings and then extending it to a polynomial-root formulation. The authors derive a central theorem that characterizes resolving sets through a constrained linear system of the form $Az=0$, and they specialize it to hypercubes, where the kernel criteria simplify. They propose two complementary computational techniques: Gröbner-basis certificates, which are exact but costly, and a novel, fast Integer Linear Programming (ILP) formulation, which is probabilistic but scalable. They demonstrate both on large symbolic spaces, notably octapeptides in $\mathbb{H}_{8,20}$. As a proof of concept, they identify a resolving set of size $77$ for $\mathbb{H}_{8,20}$, so every octamer can be embedded as a $77$-dimensional real vector, which may benefit machine-learning-style analyses of symbolic sequences. Altogether, the work provides practical tools for certifying resolvability, reduces the embedding dimension for symbolic data, and offers a path toward tighter bounds on the metric dimension of Hamming graphs in high-dimensional regimes.

Abstract

A subset of vertices in a graph is called resolving when the geodesic distances to those vertices uniquely distinguish every vertex in the graph. Here, we characterize the resolvability of Hamming graphs in terms of a constrained linear system and deduce a novel but straightforward characterization of resolvability for hypercubes. We propose an integer linear programming method to assess resolvability rapidly, and provide a more costly but definite method based on Gröbner bases to determine whether or not a set of vertices resolves an arbitrary Hamming graph. As proof of concept, we identify a resolving set of size 77 in the metric space of all octapeptides (i.e., proteins composed of eight amino acids) with respect to the Hamming distance; in particular, any octamer may be readily represented as a 77-dimensional real vector. Representing $k$-mers as low-dimensional numerical vectors may enable new applications of machine learning algorithms to symbolic sequences.
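The definition of a resolving set can be illustrated with a brute-force check on small Hamming graphs. The sketch below is not the ILP or Gröbner-basis method from the paper; it simply enumerates all $a^k$ vertices, which is only feasible for tiny $k$ and $a$. The function names (`hamming`, `embed`, `is_resolving`) are hypothetical and chosen for illustration.

```python
from itertools import product


def hamming(u, v):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(u, v))


def embed(v, R):
    """Represent vertex v by its vector of Hamming distances to the set R."""
    return tuple(hamming(v, r) for r in R)


def is_resolving(R, k, alphabet):
    """Brute-force test: R resolves the Hamming graph on k-mers over
    `alphabet` iff the distance vectors to R are pairwise distinct."""
    seen = set()
    for v in product(alphabet, repeat=k):
        signature = embed(v, R)
        if signature in seen:
            return False  # two vertices share the same distance vector
        seen.add(signature)
    return True


# Small examples: the hypercube on 2 bits and the graph of 2-mers over 3 letters.
print(is_resolving(["00", "01"], 2, "01"))
print(is_resolving(["00"], 2, "01"))
print(is_resolving(["aa", "ab", "bc"], 2, "abc"))
```

When a set `R` is resolving, `embed(v, R)` is an injective map from $k$-mers to vectors of length `len(R)`, which is the sense in which a resolving set of size 77 yields a 77-dimensional representation of octamers.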

Paper Structure

This paper contains 17 sections, 13 theorems, 31 equations, 2 figures, 1 table, 2 algorithms.

Key Result

Lemma 2.1

If $u,v$ are $k$-mers with one-hot encodings $U,V$, respectively, then $d(u,v)=k-\mathrm{Tr}(U'V)$; in particular, $d(u,v)=\mathrm{Tr}(U'\bar{V})$.
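A small numerical check of the lemma may help. The sketch below makes two assumptions that this excerpt does not state: the one-hot encoding of a $k$-mer is taken to be an $a \times k$ 0/1 matrix with exactly one 1 per column, and $\bar{V}$ is taken to be the entrywise complement of $V$. Under those assumptions, $\mathrm{Tr}(U'V)$ counts the positions where $u$ and $v$ agree, and $\mathrm{Tr}(U'\bar{V}) = k - \mathrm{Tr}(U'V)$ because every column of $U$ sums to 1. The helper names are hypothetical.

```python
def one_hot(word, alphabet):
    """a-by-k 0/1 matrix (list of rows); column j marks the symbol at position j.
    Assumed orientation; the source may use the transpose."""
    return [[1 if ch == s else 0 for ch in word] for s in alphabet]


def complement(V):
    """Entrywise complement of a 0/1 matrix (assumed meaning of V-bar)."""
    return [[1 - x for x in row] for row in V]


def trace_of_product(U, V):
    """Tr(U'V), computed as the sum of entrywise products of U and V."""
    return sum(x * y for row_u, row_v in zip(U, V) for x, y in zip(row_u, row_v))


def hamming(u, v):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(u, v))


alphabet = "ACD"
u, v = "ACDA", "ACCA"
k = len(u)
U, V = one_hot(u, alphabet), one_hot(v, alphabet)

# Both expressions from the lemma should equal the Hamming distance d(u, v).
print(hamming(u, v), k - trace_of_product(U, V), trace_of_product(U, complement(V)))
```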

Figures (2)

  • Figure 1: Visual representation of $\mathbb{H}_{1,3}$, $\mathbb{H}_{2,3}$, and $\mathbb{H}_{3,3}$. Blue-colored vertices form minimal resolving sets in their corresponding Hamming graph.
  • Figure 2: Data from Table 1 with lines of best fit (on log-transformed data) for each method.

Theorems & Definitions (26)

  • Lemma 2.1
  • Proof 1
  • Theorem 2.2
  • Proof 2
  • Corollary 2.3
  • Proof 3
  • Corollary 2.4
  • Proof 4
  • Lemma 3.1
  • Proof 5
  • ...and 16 more