Guide to Numerical Experiments on Elections in Computational Social Choice

Niclas Boehmer, Piotr Faliszewski, Łukasz Janeczko, Andrzej Kaczmarczyk, Grzegorz Lisowski, Grzegorz Pierczyński, Simon Rey, Dariusz Stolicki, Stanisław Szufa, Tomasz Wąs

TL;DR

This Guide analyzes how numerical experiments on elections have been conducted in computational social choice, aggregating 2010–2023 conference literature to uncover data-generation practices, election sizes, and data sources. It documents the dominant statistical cultures for ordinal and approval elections (notably Impartial Culture, Mallows, Urn, and Euclidean models) and offers practical recommendations on model selection, data sources, and experimental design to improve comparability and realism. The work provides a Python sampling package and a public paper database to facilitate replication and broader experimentation, and it highlights trends such as growing use of real-life data and higher-dimensional Euclidean models. Overall, the Guide serves as both a landscape map and a set of concrete best-practice guidelines for conducting robust election-focused simulations in the field.

Abstract

We analyze how numerical experiments regarding elections were conducted within the computational social choice literature (focusing on papers published in the IJCAI, AAAI, and AAMAS conferences). We analyze the sizes of the studied elections and the methods used for generating preference data, thereby making previously hidden standards and practices explicit. In particular, we survey a number of statistical cultures for generating elections and their commonly used parameters.

Paper Structure (37 sections, 8 figures, 1 table)

Figures (8)

  • Figure 1: Statistics regarding the numbers of papers in the Guide.
  • Figure 2: Histograms of the numbers of candidates and voters of synthetic elections used in the papers from the Guide (top), and in Preflib (middle) and Pabulib (bottom).
  • Figure 3: Heatmaps of the sizes of synthetic elections used in the papers from the Guide (left), real-life elections from Preflib (middle), and real-life elections from Pabulib (right). The Preflib plot omits the elections provided by boe-sch:c:real-world-ranking-data (including them would create an overwhelming spike in the area for $8$-$31$ voters and $100$-$499$ candidates). Darker cells mean more papers with elections of a given size.
  • Figure 4: Numbers of data sources used in the papers that consider ordinal elections. "Neither IC nor R.-L." means papers that used neither impartial culture (IC) nor real-life data, "Real-Life" means papers that used real-life data but not IC, "IC + Real-Life" means papers that used both IC and real-life data, and "IC" means papers that used IC but not real-life data.
  • Figure 5: Numbers of data sources used in the papers from the Guide that consider either ordinal (top) or approval (bottom) elections in particular years.
  • ...and 3 more figures

Theorems & Definitions (5)

  • Remark 2.1
  • Remark 3.1
  • Remark B.1
  • Remark B.2
  • Remark B.3