Preference Construction: A Bayesian Interactive Preference Elicitation Framework Based on Monte Carlo Tree Search

Yan Wang, Jiapeng Liu, Milosz Kadziński, Xiuwu Liao

TL;DR

This paper addresses efficient interactive preference elicitation under a limited number of interaction rounds by integrating a variational Bayesian framework with a questioning policy based on Monte Carlo Tree Search (MCTS). It models participant preferences with an additive value function, as used in Multiple Criteria Decision Aiding (MCDA), and approximates the posterior $p(U\mid Q^{(t)})$ with a variational distribution $q(\mathbf{u}\mid\boldsymbol{\theta})$, using the reparameterization trick (RT) to reduce gradient variance. The MCTS-based policy selects the next pair of alternatives to maximize expected variance reduction, which enables long-horizon planning in a finite decision process. Computational experiments on real-world and synthetic MCDA datasets show that RT-enhanced variational Bayesian inference yields superior accuracy and uncertainty reduction, while the MCTS questioning policy consistently outperforms baselines, with advantages growing as the number of interaction rounds increases.
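
As a rough illustration of the model class named above, an additive value function over $m$ criteria and the way it represents a pairwise comparison can be written as follows. The symbols $m$, $g_j$, $u_j$, $a$, and $b$ are notation assumed here for illustration and may differ from the paper's own.

$$
U(a) = \sum_{j=1}^{m} u_j\bigl(g_j(a)\bigr), \qquad a \succ b \;\text{ is represented by }\; U(a) > U(b),
$$

where $g_j(a)$ is the performance of alternative $a$ on criterion $j$ and $u_j$ is the marginal value function of that criterion. In the noise-free case, each answered comparison therefore constrains the sign of $U(a) - U(b)$.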

Abstract

We present a novel preference learning framework to capture participant preferences efficiently within limited interaction rounds. It involves three main contributions. First, we develop a variational Bayesian approach to infer the participant's preference model by estimating posterior distributions and managing uncertainty from limited information. Second, we propose an adaptive questioning policy that maximizes cumulative uncertainty reduction, formulating questioning as a finite Markov decision process and using Monte Carlo Tree Search to prioritize promising question trajectories. By considering long-term effects and leveraging the efficiency of the Bayesian approach, the policy avoids shortsightedness. Third, we apply the framework to Multiple Criteria Decision Aiding, with pairwise comparison as the preference information and an additive value function as the preference model. We integrate the reparameterization trick to address high-variance issues, enhancing robustness and efficiency. Computational studies on real-world and synthetic datasets demonstrate the framework's practical usability, outperforming baselines in capturing preferences and achieving superior uncertainty reduction within limited interactions.
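
The high-variance issue that the reparameterization trick addresses can be seen on a toy problem. The sketch below is not the paper's model: it assumes a one-dimensional Gaussian $q(u\mid\mu,\sigma)$ and a hypothetical objective $f(u)=u^2$, and compares single-sample estimates of $\partial_\mu \mathbb{E}_q[f(u)]$ from the score-function estimator and from the reparameterized (pathwise) estimator. The function name `gradient_samples` and all parameter values are illustrative assumptions.

```python
import random
import statistics


def gradient_samples(mu, sigma, n, seed=0):
    """Return n single-sample estimates of d/dmu E[f(u)] for f(u) = u**2,
    u ~ N(mu, sigma^2), from two estimators (toy setting, assumed here)."""
    rng = random.Random(seed)
    pathwise, score = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        u = mu + sigma * eps  # reparameterized draw: u = mu + sigma * eps
        # Pathwise estimator: f'(u) * du/dmu, with du/dmu = 1.
        pathwise.append(2.0 * u)
        # Score-function estimator: f(u) * d log q(u) / d mu.
        score.append(u * u * (u - mu) / sigma ** 2)
    return pathwise, score


rt, sf = gradient_samples(mu=1.0, sigma=1.0, n=20000)
rt_mean, rt_var = statistics.fmean(rt), statistics.variance(rt)
sf_mean, sf_var = statistics.fmean(sf), statistics.variance(sf)

print(f"pathwise:       mean={rt_mean:.3f}  variance={rt_var:.2f}")
print(f"score-function: mean={sf_mean:.3f}  variance={sf_var:.2f}")
```

Both estimators target the same gradient ($2\mu = 2$ in this setting), but the reparameterized samples are far less dispersed: analytically, the single-sample variances in this toy case are 4 for the pathwise estimator and 30 for the score-function estimator.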

Paper Structure

This paper contains 33 sections, 30 equations, 11 figures, 5 tables.

Figures (11)

  • Figure 1: Preference elicitation process.
  • Figure 2: Average ASP for the variants of the proposed variational Bayesian inference approach with/without RT and SOR, evaluated on the CE dataset, considering various shapes of marginal value functions, different numbers of pairwise comparisons, and varying proportions of biased preference information.
  • Figure 3: Average values of different metrics characterizing the uncertainty of the DM's preferences with 7 questioning policies.
  • Figure 4: The proposed preference elicitation and model construction framework.
  • Figure 5: Average ASP for the variants of the proposed variational Bayesian inference approach with/without RT and SOR, considering different shapes of marginal value functions across various datasets.
  • ...and 6 more figures