Large Scale Constrained Clustering With Reinforcement Learning
Benedikt Schesch, Marco Caserta
TL;DR
The paper tackles large-scale constrained clustering: forming fully connected clusters with a diameter bound $D$ while minimizing intra-cluster travel times and maximizing the number of clustered nodes. It proposes a reinforcement learning framework in which a graph neural network guides edge selection, trained with PPO in an environment that enforces triangle-inequality and connectivity constraints. The approach finds near-optimal solutions on large instances with runtimes far shorter than those of exact solvers, highlighting distribution-aware heuristics and scalable learning-based planning. This enables scalable, cluster-based resource allocation in large networks such as logistics or field-service operations, where rapid, high-quality clustering decisions are critical.
Abstract
Given a network, allocating resources at the cluster level, rather than at each node, improves the efficiency of resource allocation and usage. In this paper, we study the problem of finding fully connected, disjoint clusters that minimize the intra-cluster distances and maximize the number of nodes assigned to clusters, while ensuring that no two nodes within a cluster are farther apart than a threshold distance. While the problem can easily be formulated as a binary linear model, traditional combinatorial optimization solvers struggle with large-scale instances. We propose an approach that solves this constrained clustering problem via reinforcement learning. Our method trains an agent to generate solutions that are both feasible and (near-)optimal. The agent learns problem-specific heuristics tailored to the instances encountered in this task. Our results show that the algorithm finds near-optimal solutions, even for large-scale instances.
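To make the constraints stated in the abstract concrete, the following minimal Python sketch checks whether a candidate clustering is feasible and computes the two quantities being traded off. It is an illustration only, not the authors' implementation: the function names `is_feasible` and `evaluate`, the symmetric distance matrix `dist`, and the threshold `max_dist` are all assumptions introduced here, and the sketch assumes every pair of nodes has a known distance.

```python
from itertools import combinations


def is_feasible(clusters, dist, max_dist):
    """Check disjointness and the pairwise threshold (hypothetical helper).

    clusters: list of lists of node indices
    dist:     symmetric matrix, dist[u][v] is the distance between u and v
    max_dist: threshold that no intra-cluster pair may exceed
    """
    seen = set()
    for cluster in clusters:
        for node in cluster:
            if node in seen:  # a node may belong to at most one cluster
                return False
            seen.add(node)
        for u, v in combinations(cluster, 2):
            if dist[u][v] > max_dist:  # pair is too far apart
                return False
    return True


def evaluate(clusters, dist):
    """Return (total intra-cluster distance, number of assigned nodes)."""
    total = sum(dist[u][v] for c in clusters for u, v in combinations(c, 2))
    assigned = sum(len(c) for c in clusters)
    return total, assigned


# Toy instance with four nodes (made-up distances).
dist = [
    [0, 2, 9, 4],
    [2, 0, 3, 8],
    [9, 3, 0, 1],
    [4, 8, 1, 0],
]
good = [[0, 1], [2, 3]]
too_wide = [[0, 1, 2]]       # nodes 0 and 2 are 9 apart
overlapping = [[0, 1], [1, 2]]

print(is_feasible(good, dist, max_dist=3))
print(is_feasible(too_wide, dist, max_dist=3))
print(is_feasible(overlapping, dist, max_dist=3))
print(evaluate(good, dist))
```

A checker of this kind only validates a given solution. The difficulty described in the abstract lies in searching over the very large number of possible clusterings, which is where exact solvers struggle and where the learned agent is meant to help.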
