Efficient Weighting Schemes for Auditing Instant-Runoff Voting Elections

Alexander Ek, Philip B. Stark, Peter J. Stuckey, Damjan Vukcevic

TL;DR

The paper addresses efficient auditing of IRV elections without cast vote records by extending the AWAIRE framework with a broad suite of adaptive weighting schemes. It examines how test supermartingales and intersection martingales can be combined under various weighting rules, and how ALPHA tuning parameters influence audit efficiency across margins. Through simulations on real NSW data, it identifies practical defaults (notably Largest and Quadratic$+$) and clarifies trade-offs across election margins, while proposing lazy, computationally lighter implementations that still preserve statistical power. The work advances practical, CVR-free RLAs for IRV by offering new schemes, portfolio-inspired ideas, software, and guidance on default settings, with implications for broader adoption in jurisdictions lacking full ballot digitisation.

Abstract

Various risk-limiting audit (RLA) methods have been developed for instant-runoff voting (IRV) elections. A recent method, AWAIRE, is the first efficient approach that can take advantage of but does not require cast vote records (CVRs). AWAIRE involves adaptively weighted averages of test statistics, essentially "learning" an effective set of hypotheses to test. However, the initial paper on AWAIRE only examined a few weighting schemes and parameter settings. We explore schemes and settings more extensively, to identify and recommend efficient choices for practice. We focus on the case where CVRs are not available, assessing performance using simulations based on real election data. The most effective schemes are often those that place most or all of the weight on the apparent "best" hypotheses based on already seen data. Conversely, the optimal tuning parameters tended to vary based on the election margin. Nonetheless, we quantify the performance trade-offs for different choices across varying election margins, aiding in selecting the most desirable trade-off if a default option is needed. A limitation of the current AWAIRE implementation is its restriction to a small number of candidates -- up to six in previous implementations. One path to a more computationally efficient implementation would be to use lazy evaluation and avoid considering all possible hypotheses. Our findings suggest that such an approach could be done without substantially compromising statistical performance.
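The idea of an adaptively weighted average of test statistics can be illustrated with a small sketch. It is a toy under stated assumptions, not the AWAIRE implementation: each base statistic is assumed to grow by a multiplicative increment per sampled ballot, the weights for the next ballot depend only on the base statistics' values so far, and the combined statistic multiplies up the weighted average increments. The function names (`linear`, `quadratic`, `largest`, `combine`) and the increment values are hypothetical, and the three rules are generic illustrations that need not match the paper's exact definitions of Linear, Largest or Quadratic$+$.

```python
from typing import Callable, List, Sequence

# Hypothetical weighting rules: each maps the base statistics' values
# *before* the next ballot to non-negative weights.

def linear(prev: Sequence[float]) -> List[float]:
    # Weight proportional to the current value of each base statistic.
    return list(prev)

def quadratic(prev: Sequence[float]) -> List[float]:
    # Weight proportional to the squared value, favouring leaders more strongly.
    return [m * m for m in prev]

def largest(prev: Sequence[float]) -> List[float]:
    # All weight on the single largest base statistic (first one wins ties).
    best = list(prev).index(max(prev))
    return [1.0 if i == best else 0.0 for i in range(len(prev))]

def combine(increments: Sequence[Sequence[float]],
            scheme: Callable[[Sequence[float]], List[float]]) -> float:
    """Combine base statistics given their per-ballot multiplicative increments.

    increments[t][r] is the (assumed) factor by which base statistic r
    changes when ballot t is observed.
    """
    base = [1.0] * len(increments[0])  # every base statistic starts at 1
    combined = 1.0
    for step in increments:
        weights = scheme(base)         # uses past data only
        total = sum(weights)
        combined *= sum(w * inc for w, inc in zip(weights, step)) / total
        base = [b * inc for b, inc in zip(base, step)]
    return combined

# Toy data: statistic 0 doubles on each ballot, statistic 1 halves.
toy = [[2.0, 0.5], [2.0, 0.5]]
for scheme in (linear, quadratic, largest):
    print(scheme.__name__, combine(toy, scheme))
```

On this toy input the rule that puts all weight on the leading statistic grows fastest, which mirrors the abstract's observation that concentrating weight on the apparent "best" hypotheses tends to be effective. With value-proportional weights, the combined statistic reduces to the plain average of the base statistics.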

Paper Structure

This paper contains 20 sections, 1 theorem, 7 equations, 4 figures.

Key Result

Theorem

Linear is an $F$-weighted portfolio algorithm.

Figures (4)

  • Figure 1: Mean sample size (as a percentage of the total ballots in each contest; $\pm 2$ standard errors) across all simulated audits in each of the margin categories (rows). The vertical gridlines in panels (a)--(d) correspond respectively to approximately 500, 150, 25 and 10 ballots. The dashed lines show the best mean sample size achieved within each panel.
  • Figure 2: Mean sample size (as a percentage of the total ballots in each contest; $\pm 2$ standard errors) across all simulated audits in each of the margin categories (rows) and settings for shrinkTrunc() ($\eta_0$ across columns and $d$ on the x-axis). Three contests were selected to represent each category; see the section on tuning parameters for details. The dashed lines show the best mean sample size achieved within each panel.
  • Figure 3: Average reduction in mean sample size for two default choices compared to the previous default choice. Each point represents a single contest (averaged over 500 simulated audits). The margin (x-axis) is shown as a proportion of the total ballots in each contest.
  • Figure 4: Average reduction in mean sample size for our two default choices; now $\pm 1$ standard error in both directions. Each point represents a single contest (averaged over 500 simulated audits). The majority of points are on the right-hand side of the diagonal, indicating a larger average reduction when using Largest as compared to Quadratic$+$.

Theorems & Definitions (2)

  • Theorem
  • proof