Metric geometry for ranking-based voting: Tools for learning electoral structure

Moon Duchin; Kristopher Tapp

TL;DR

This work develops a metric geometry for ranking-based voting by extending Kendall tau and Spearman footrule to incomplete ballots through two coordinate embeddings, the Borda embedding and the head-to-head embedding, producing distances $d_B$ and $d_H$ that capture global ranking structure. It introduces ballot graphs and generalized ballot graphs to realize these distances as path metrics, enabling both coordinate- and graph-based analyses for partial ballots and slate-level preferences. The authors show how to identify voter blocs and candidate slates via clustering in the induced spaces, provide synthetic validation and real-world Scottish election results demonstrating robust, interpretable structure, and connect polarization and proportionality metrics to the learned clusters. The framework supports practical analysis and visualization of electoral structure, with robust performance across embeddings and methods, and is supported by open data and code for replication.

Abstract

In this paper, we develop the metric geometry of ranking statistics, proving that the two major permutation distances in the statistics literature -- Kendall tau and Spearman footrule -- extend naturally to incomplete rankings with both coordinate embeddings and graph realizations. This gives us a unifying framework that allows us to connect popular topics in computational social choice: metric preferences (and metric distortion), polarization, and proportionality. As an important application, the metric structure enables efficient identification of blocs of voters and slates of their preferred candidates. Since the definitions work for partial ballots, we can execute the methods not only on synthetic elections, but on a suite of real-world elections. This gives us robust clustering methods that often produce an identical grouping of voters -- even though one family of methods is based on a Condorcet-consistent ranking rule while the other is not.

Paper Structure (19 sections, 10 theorems, 17 equations, 12 figures, 1 table)

Introduction
Related work
Ballot graphs and coordinate embeddings
Coordinate embeddings
Ballot graphs
Generalized ballot graphs
Relating the ballot graphs to coordinate embeddings
Relating the generalized ballot graph to coordinate embeddings
Finding blocs and slates; synthetic validation
Grouping the ballots
Grouping the candidates
Synthetic elections
Results on Scottish elections
Learning voter blocs in Pentland Hills
Learning candidate slates in Pentland Hills
...and 4 more sections

Key Result

Proposition 2.5

In either $\mathcal{G}_m$ or $\mathcal{G}_m^+$, the size of the vertex set is $O(m!)$ and the degree of each vertex is $O(m^2)$.
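The vertex-count half of this bound can be checked numerically. The sketch below assumes that $\Omega_m$ (the symbol used in the Figure 3 caption) is the set of all nonempty partial rankings of $m$ candidates; that reading is not stated on this page, but it is consistent with the recursion $|\Omega_m|=m\cdot|\Omega_{m-1}|+m$ quoted there. The function names are illustrative and do not come from the paper's code. The degree bound is not checked, since it depends on the edge definition, which is not reproduced here.

```python
from itertools import permutations
from math import e, factorial


def count_ballots(m):
    """Count nonempty partial rankings of m candidates by brute force.

    Assumption: a ballot is an ordered list of k distinct candidates
    for some 1 <= k <= m.
    """
    return sum(1 for k in range(1, m + 1) for _ in permutations(range(m), k))


def count_by_recursion(m):
    """Count ballots with the recursion |Omega_m| = m*|Omega_{m-1}| + m."""
    size = 1  # one candidate admits exactly one ballot
    for j in range(2, m + 1):
        size = j * size + j
    return size


for m in range(1, 7):
    direct = count_ballots(m)
    # Under this assumption the count equals m! * (1/0! + ... + 1/(m-1)!),
    # which stays below e * m!, in line with the O(m!) claim.
    print(m, direct, count_by_recursion(m), direct < e * factorial(m))
```

Under this assumption the two counts agree for every $m$ tried, and the ratio to $m!$ stays below $e$.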

Figures (12)

Figure 1: Under the pessimistic convention, the total shifts to convert $(A,B,D,F)$ to $(B,C,A)$ sum to 12, while they sum to 10 in the averaged convention; the Borda distances would be 6 and 5, respectively.
Figure 2: The basic ballot graph $\mathcal{G}_3$ and shortcut ballot graph $\mathcal{G}_3^+$. Swaps of the first and last place correspond to new edges of length 2, providing shortcuts between nodes that would otherwise be 3 apart.
Figure 3: An illustrative portion of the shortcut ballot graph $\mathcal{G}_4^+$, showing the connections between ballots headed by candidates 4 and 1. The construction of this picture shows the recursion $|\Omega_m|=m\cdot|\Omega_{m-1}|+m$.
Figure 4: For ballots $\sigma=(A,C,B)$ and $\tau=(D,C,A,E,B)$, we first complete $\sigma$ to $\sigma'$ and then make swaps as indicated (in red) for a total Borda distance of $d_B(\sigma',\tau')=1+2+1=4$. This correctly matches half the $L^1$ difference of $\mathfrak{b}(\sigma')=(4,2,3,1,0)$ and $\mathfrak{b}(\tau')=(2,0,3,4,1)$.
Figure 5: MDS plot of the ballots in a synthetic election $\mathcal{E}$. Ballots marked green were hit from both centers.
...and 7 more figures
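
The numbers in the captions of Figures 1 and 4 can be reproduced with a short sketch of the Borda embedding. Several things here are assumptions inferred from the captions rather than quoted from Definition 2.1: the candidate set in Figure 1 is taken to be A through F; a candidate ranked in position $i$ (counting from 0) scores $m-1-i$; under the pessimistic convention every unranked candidate scores 0; under the averaged convention the unranked candidates share the mean of the leftover scores; and the completed ballot $\sigma'$ in Figure 4 is read off from the vector $\mathfrak{b}(\sigma')$ as $(A,C,B,D,E)$. The function names are hypothetical.

```python
from fractions import Fraction


def borda_vector(ballot, candidates, convention="pessimistic"):
    """Borda scores in the order given by `candidates` (assumed conventions)."""
    m = len(candidates)
    scores = {c: Fraction(m - 1 - i) for i, c in enumerate(ballot)}
    unranked = [c for c in candidates if c not in scores]
    if unranked:
        if convention == "pessimistic":
            fill = Fraction(0)  # every unranked candidate is treated as last
        elif convention == "averaged":
            # leftover scores are 0, 1, ..., len(unranked) - 1; use their mean
            fill = Fraction(len(unranked) - 1, 2)
        else:
            raise ValueError(f"unknown convention: {convention}")
        for c in unranked:
            scores[c] = fill
    return [scores[c] for c in candidates]


def total_shift(sigma, tau, candidates, convention="pessimistic"):
    """L^1 difference of the two Borda vectors."""
    u = borda_vector(sigma, candidates, convention)
    v = borda_vector(tau, candidates, convention)
    return sum(abs(a - b) for a, b in zip(u, v))


def borda_distance(sigma, tau, candidates, convention="pessimistic"):
    """Half the total shift, matching the captions of Figures 1 and 4."""
    return total_shift(sigma, tau, candidates, convention) / 2


# Figure 1: (A,B,D,F) versus (B,C,A) over the assumed candidates A..F.
six = list("ABCDEF")
for conv in ("pessimistic", "averaged"):
    print(conv,
          total_shift("ABDF", "BCA", six, conv),
          borda_distance("ABDF", "BCA", six, conv))

# Figure 4: completed ballots over A..E.
print(borda_distance("ACBDE", "DCAEB", list("ABCDE")))
```

With these assumptions the Figure 1 totals come out to 12 and 10 (distances 6 and 5), and the Figure 4 distance comes out to 4, in agreement with the captions.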

Theorems & Definitions (28)

Definition 2.1: Borda and head-to-head distance
Example 2.2
Definition 2.3: Ballot graphs
Example 2.4: Illustrating ballot graph construction
Proposition 2.5
proof
Example 2.6
Theorem 2.7: Graphs match $L^1$ distances
proof
Remark 2.8
...and 18 more
