Solving Stochastic Orienteering Problems with Chance Constraints Using Monte Carlo Tree Search
Stefano Carpin
TL;DR
The paper addresses planning under uncertainty for stochastic orienteering with a probabilistic budget bound. It introduces an online, anytime Monte Carlo Tree Search method (MCTS-SOPCC) that explicitly tracks both path value and the probability of violating the budget constraint, using a UCT-inspired policy with failures (UCTF) together with rollouts and backups informed by sample-based approximations. The approach avoids discretizing time and handles continuous residual budgets, producing adaptive policies that perform close to the optimal MILP solutions at substantially lower computational cost, particularly on larger graphs. This work advances risk-aware robotic routing by enabling online policy construction that respects chance constraints and scales to realistic problem instances.
Abstract
We present a new Monte Carlo Tree Search (MCTS) algorithm to solve the stochastic orienteering problem with chance constraints, i.e., a version of the problem where travel costs are random and a bound is given on the tolerable probability of exceeding the budget. The algorithm is online and anytime, i.e., it alternates planning and execution, and the quality of the solution it produces increases with the allowed computational time. Unlike most previous MCTS algorithms, for each action available in a state it maintains estimates of both the action's value and the probability that executing it will eventually violate the budget constraint. At action selection time, the algorithm then prunes trajectories whose estimated failure probability exceeds the tolerated bound. Extensive simulation results show that this approach can quickly produce high-quality solutions and is competitive with the optimal but time-consuming solution.
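
To make the action selection idea concrete, the following is a minimal Python sketch of choosing among actions when each one carries both a value estimate and a failure probability estimate. It is not the UCTF rule from the paper: it substitutes the standard UCB1 score restricted to actions whose estimated failure probability is within the bound. The function name `select_action`, the layout of `stats`, the `exploration` constant, and the fallback to the least risky action when nothing is feasible are all assumptions made for illustration.

```python
import math


def select_action(stats, failure_bound, exploration=1.0):
    """Pick an action using UCB1 over the actions estimated to be safe.

    stats maps each action to a tuple (visits, mean_value, failure_estimate).
    This layout is hypothetical and not taken from the paper.
    """
    # Try every action at least once before comparing estimates.
    for action, (visits, _, _) in stats.items():
        if visits == 0:
            return action

    total_visits = sum(visits for visits, _, _ in stats.values())

    # Prune actions whose estimated failure probability exceeds the bound.
    feasible = {a: s for a, s in stats.items() if s[2] <= failure_bound}
    if not feasible:
        # Assumed fallback: no action looks safe, so take the least risky one.
        return min(stats, key=lambda a: stats[a][2])

    def ucb1(action):
        visits, mean_value, _ = feasible[action]
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        return mean_value + bonus

    return max(feasible, key=ucb1)


if __name__ == "__main__":
    example = {
        "v1": (10, 5.0, 0.30),  # highest value, but estimated too risky
        "v2": (10, 4.0, 0.05),
        "v3": (10, 3.0, 0.08),
    }
    print(select_action(example, failure_bound=0.1))
```

With a failure bound of 0.1, the sketch discards `v1` despite its higher value estimate and chooses between `v2` and `v3`; relaxing the bound to 0.5 makes `v1` eligible again.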
