Learning Aggregation Rules in Participatory Budgeting: A Data-Driven Approach

Roy Fairstein, Dan Vilenchik, Kobi Gal

TL;DR

The paper tackles how PB organizers can derive aggregation rules that balance welfare and representation without manually specifying objective functions. It introduces a data-driven framework that embeds aggregation rules inside neural networks, using Set Transformer architectures (ST and ST+PMA) trained on PB instances to learn AV, CC, PAV, and mixtures. The authors demonstrate strong generalization from small synthetic PB problems to large real-world instances, and show the model can learn compromise rules (e.g., AV-CC-p) that approximate known trade-offs like PAV. This approach offers a practical, scalable tool for tailoring PB outcomes to evolving objectives, with future work on human-in-the-loop evaluation and explainability.

Abstract

Participatory Budgeting (PB) offers a democratic process for communities to allocate public funds across various projects through voting. In practice, PB organizers face challenges in selecting aggregation rules, either because they are not familiar with the literature and the exact details of every existing rule or because no existing rule matches their expectations. This paper presents a novel data-driven approach that uses machine learning to address this challenge. By training neural networks on PB instances, our approach learns aggregation rules that balance social welfare, representation, and other societally beneficial goals. It generalizes from small-scale synthetic PB examples to large, real-world PB instances. It can learn existing aggregation rules and also generate new rules that adapt to diverse objectives, providing a more nuanced, compromise-driven solution for PB processes. We demonstrate the effectiveness of our approach through extensive experiments with synthetic and real-world PB data; the approach can expand the use and deployment of PB solutions.
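For readers unfamiliar with the rules named in the TL;DR, the following is a minimal sketch of how AV, CC, and PAV score a funded bundle. It assumes approval ballots and the standard multiwinner definitions (AV counts approvals, CC counts voters with at least one funded approved project, PAV uses harmonic weights); the paper's exact PB variants, for instance cost-weighted satisfaction, may differ. The function names, the brute-force `best_bundle` search, and the toy instance are illustrative only and do not come from the paper.

```python
from itertools import combinations


def av_score(bundle, ballots):
    """AV: total number of approvals received by the funded projects."""
    return sum(len(bundle & ballot) for ballot in ballots)


def cc_score(bundle, ballots):
    """CC: number of voters with at least one approved project funded."""
    return sum(1 for ballot in ballots if bundle & ballot)


def pav_score(bundle, ballots):
    """PAV: a voter with k funded approved projects adds 1 + 1/2 + ... + 1/k."""
    return sum(
        sum(1.0 / j for j in range(1, len(bundle & ballot) + 1))
        for ballot in ballots
    )


def best_bundle(costs, ballots, budget, score):
    """Exhaustively search budget-feasible bundles (toy sizes only).

    Ties are broken in favor of the first bundle encountered, which is an
    arbitrary choice made for this sketch.
    """
    projects = sorted(costs)
    best, best_val = frozenset(), float("-inf")
    for r in range(len(projects) + 1):
        for combo in combinations(projects, r):
            if sum(costs[p] for p in combo) <= budget:
                val = score(frozenset(combo), ballots)
                if val > best_val:
                    best, best_val = frozenset(combo), val
    return best, best_val


# Hypothetical toy instance: three projects, five voters, budget of 3.
costs = {"a": 2, "b": 1, "c": 1}
ballots = [frozenset("ab")] * 3 + [frozenset("a"), frozenset("c")]

for name, rule in [("AV", av_score), ("CC", cc_score), ("PAV", pav_score)]:
    bundle, value = best_bundle(costs, ballots, 3, rule)
    print(name, sorted(bundle), value)
```

On this instance AV funds projects a and b (maximizing total approvals), while CC funds a and c so that the lone supporter of c is also represented. This is the welfare versus representation tension that the learned compromise rules are meant to navigate.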

Paper Structure

This paper contains 24 sections, 1 equation, 3 figures, 5 tables.

Figures (3)

  • Figure 1: The network architecture.
  • Figure 2: Welfare and representation ratios on the UNIQUE datasets for the AV-CC-p models (blue dots) and the weighted AV-CC model (orange dots). The PAV value is shown as well (red square). The first four points from the left correspond to $p=0,0.01,0.04,0.8$, and the dashed line at $y=0.99$ marks a favorable AV-CC tradeoff for $p\le 0.1$. The remaining blue and orange dots are in steps of 0.1.
  • Figure 3: The Jaccard similarity obtained for each of the aggregation methods: AV (top), PAV (middle), CC (bottom). Missing values for ST are due to GPU out-of-memory exceptions. UNIQUE datasets are to the left of the dashed line and TIED datasets to the right.
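The Jaccard similarity reported in Figure 3 can be sketched as follows, assuming it compares the set of projects selected by a learned model with the set selected by the target rule on the same instance. The function name and the convention for two empty sets are choices made for this sketch, not details taken from the paper.

```python
def jaccard(selected_a, selected_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two sets of project ids."""
    a, b = set(selected_a), set(selected_b)
    if not a and not b:
        # Assumed convention: two empty selections count as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical outcomes: the model and the rule agree on one of three projects.
print(jaccard({"a", "b"}, {"a", "c"}))
```

A value of 1 means the two outcomes fund exactly the same projects, and 0 means they share none.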