Privacy-Aware Predictions in Participatory Budgeting
Juan Zambrano, Clément Contet, Jairo Gudiño-Rosero, Felipe Garrido-Lucero, Umberto Grandi, César Hidalgo
TL;DR
This paper addresses the problem of predicting voter support for participatory budgeting (PB) proposals at the submission stage, using only the proposals' textual descriptions and anonymous historical voting records and no voter demographics. The authors construct two multi-year city datasets (Toulouse and Wroclaw) and compare classical predictors (ElasticNet, XGBoost) with LLM-based prompting, including a retrieval-augmented generation (RAG) setup. They report that LLMs can meaningfully predict proposal rankings: ranking correlations improve notably under RAG and when proposal text is used, whereas No-Text baselines perform worse. The approach offers PB organizers a practical, privacy-preserving tool for managing large proposal volumes. Its limitations, including dependence on access to commercial LLMs, reliance on RAG, and ethical concerns about algorithmic influence, call for careful deployment and further study.
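To make "ranking correlation" concrete, the sketch below computes a Spearman rank correlation between predicted proposal scores and observed vote counts. It is an illustration only: the summary does not say which correlation coefficient the authors use, so Spearman is an assumption here, and the function names and the toy numbers are invented for the example.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest) to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Toy data (invented): model scores and vote counts for four proposals.
predicted_scores = [0.9, 0.2, 0.5, 0.7]
actual_votes = [120, 30, 80, 60]
rho = spearman(predicted_scores, actual_votes)
```

A value near 1 means the predicted ordering of proposals closely matches the ordering by votes; a value near 0 means the prediction carries little ranking information.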
Abstract
Participatory budgeting is a democratic innovation that empowers citizens to propose and vote on public investment projects. While researchers in computer science have focused on improving the voting phase of this process, in this work we aim to help organizers of participatory budgeting campaigns manage large volumes of project proposals at the submission stage. We propose a privacy-preserving approach to predict which proposals are likely to be funded, using only projects' textual descriptions and anonymous historical voting records, without relying on voter demographics or personally identifiable information.
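As a rough illustration of how textual descriptions and anonymous historical records could be combined without any voter-level data, the sketch below retrieves the past proposals most similar to a new one and places them, with their aggregate vote counts, into a prompt. This is not the authors' pipeline: the bag-of-words cosine similarity, the function names (`retrieve_similar`, `build_prompt`), the prompt wording, and the toy records are all assumptions made for the example, and no LLM is actually called.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase bag-of-words term counts (a stand-in for a real text embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_similar(query, history, k=2):
    """Return the k past proposals most similar to the query text.

    `history` holds (description, total_votes) pairs: aggregate counts only,
    with no information about individual voters.
    """
    q = tokenize(query)
    scored = [(cosine(q, tokenize(text)), text, votes) for text, votes in history]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def build_prompt(query, neighbours):
    """Assemble a hypothetical prompt from the retrieved examples."""
    lines = ["Past proposals and the votes they received:"]
    for _, text, votes in neighbours:
        lines.append(f"- {text} ({votes} votes)")
    lines.append(f"New proposal: {query}")
    lines.append("Estimate how much support the new proposal would receive.")
    return "\n".join(lines)


# Toy records (invented for illustration).
history = [
    ("Plant trees along the river bank", 412),
    ("New bike lanes near the university", 655),
    ("Renovate the playground in the central park", 230),
]
query = "Add bike parking and bike lanes downtown"
neighbours = retrieve_similar(query, history, k=2)
prompt = build_prompt(query, neighbours)
```

The only inputs are proposal text and per-proposal vote totals, which is the sense in which such a setup avoids demographics and personally identifiable information.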
