A Comparison of Precinct and District Voting Data Using Persistent Homology to Identify Gerrymandering in North Carolina

Ananya Shah

TL;DR

This paper tackles gerrymandering detection by moving beyond geometry-based compactness metrics to a topological approach. It constructs level-set–based filtered simplicial complexes from precinct- and district-level voting data and analyzes their persistent homology, comparing precinct and district barcodes via the bottleneck distance to detect cracking and packing. Applying the method to North Carolina's US House elections (2012–2024) shows that precinct-level patterns remain relatively stable from one two-year election cycle to the next, while district-level patterns fluctuate with redistricting, which suggests partisan manipulation rather than shifts in voter behavior. The work demonstrates a novel application of topological data analysis to evaluating gerrymandering and provides a framework that complements traditional measures, with potential for broader adoption in political geography.

Abstract

Gerrymandering is one of the biggest threats to American democracy. By manipulating district lines, politicians effectively choose their voters rather than the other way around. Current gerrymandering identification methods (namely the Polsby-Popper and Reock scores) focus on the compactness of congressional districts, which makes them extremely sensitive to physical geography. To address this limitation, we extend Feng and Porter's 2021 paper, which used the level-set method to turn geographic shapefiles into filtered simplicial complexes, in order to compare precinct-level voting data with district-level voting data. Because precincts are regarded as too small to be gerrymandered, discrepancies between precinct-level and district-level voting data can be used to quantify gerrymandering in the United States. By comparing the persistent homologies of Democratic voting regions at the precinct and district levels, we detect when areas have been "cracked" (split across multiple districts) or "packed" (compressed into one district) for partisan gain. This analysis was conducted for North Carolina's U.S. House of Representatives elections (2012–2024). North Carolina has been redistricted four times in the past ten years, which is unusually frequent since most states redistrict once per decade, making it a valuable case study. By comparing persistence barcodes at the precinct and district levels (using the bottleneck distance), we show that precinct-level voting patterns do not fluctuate significantly from one biennial election to the next, while district-level patterns do. This suggests that the shifts are likely a result of redistricting rather than voter behavior, providing strong evidence of gerrymandering. This research presents a novel application of topological data analysis to evaluating gerrymandering and shows that persistent homology can be useful in discerning gerrymandered districts.
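
To make the barcode comparison concrete, the following is a minimal sketch of the bottleneck distance between two small persistence barcodes. It is not the paper's implementation: the function name `bottleneck_distance`, the brute-force search over matchings, and the example bars are all assumptions chosen for illustration, and only finite bars are handled. Each bar is a `(birth, death)` pair; a bar may be matched to a bar in the other barcode (cost is the larger of the birth and death differences) or to the diagonal (cost is half its length).

```python
from itertools import permutations


def bottleneck_distance(barcode_a, barcode_b):
    """Brute-force bottleneck distance between two finite barcodes.

    Illustrative helper (hypothetical name). Each barcode is a list of
    (birth, death) pairs. Runtime is factorial in the total number of
    bars, so this is only suitable for very small inputs.
    """
    n, m = len(barcode_a), len(barcode_b)
    size = n + m
    if size == 0:
        return 0.0

    def cost(i, j):
        # Rows 0..n-1 are bars of A; rows n..size-1 are diagonal slots.
        # Columns 0..m-1 are bars of B; columns m..size-1 are diagonal slots.
        a_is_bar = i < n
        b_is_bar = j < m
        if a_is_bar and b_is_bar:
            (b1, d1), (b2, d2) = barcode_a[i], barcode_b[j]
            return max(abs(b1 - b2), abs(d1 - d2))
        if a_is_bar:
            b1, d1 = barcode_a[i]
            return (d1 - b1) / 2  # bar of A matched to the diagonal
        if b_is_bar:
            b2, d2 = barcode_b[j]
            return (d2 - b2) / 2  # bar of B matched to the diagonal
        return 0.0  # diagonal matched to diagonal

    best = float("inf")
    for perm in permutations(range(size)):
        worst = max(cost(i, perm[i]) for i in range(size))
        best = min(best, worst)
    return best


# Made-up barcodes, not data from the paper.
precinct_bars = [(0.0, 4.0), (1.0, 2.0)]
district_bars = [(0.0, 3.0)]
print(bottleneck_distance(precinct_bars, district_bars))  # 1.0
```

In this toy example the long bars are matched to each other at cost 1.0 and the short bar is matched to the diagonal at cost 0.5, so the distance is 1.0. For barcodes of realistic size, an established topological data analysis library would be the practical choice over this exhaustive search.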

Paper Structure

This paper contains 23 sections, 5 equations, 103 figures, 3 tables.

Figures (103)

  • Figure 1: Point Cloud of a Torus
  • Figure 2: An image of a point cloud filtration
  • Figure 3: A $0$-simplex
  • Figure 4: A $1$-simplex
  • Figure 5: A $2$-simplex
  • ...and 98 more figures