Dependent randomized rounding for clustering and partition systems with knapsack constraints
David G. Harris, Thomas Pensyl, Aravind Srinivasan, Khoa Trinh
TL;DR
This work develops a framework for clustering under knapsack constraints, motivated by fairness, built on Knapsack-Partition Rounding (KPR): a dependent-rounding technique that preserves multiple knapsack budgets and a partition constraint while maintaining strong negative-correlation-like properties. The authors prove a Samuels–Feige-type concentration bound for sums of negatively associated random variables with unbounded variances, enabling additive pseudo-approximation guarantees that complement exact knapsack feasibility. They apply KPR to obtain new pseudo-approximation results for knapsack median and knapsack center, in both single- and multi-knapsack settings, with additive and multiplicative guarantees and near-fair distance behavior. The resulting guarantees improve on prior theoretical bounds and apply to fair clustering and facility location, where efficiency, constraint satisfaction, and equitable representation must be balanced.
Abstract
Clustering problems are fundamental to unsupervised learning. There is an increased emphasis on fairness in machine learning and AI; one representative notion of fairness is that no single demographic group should be over-represented among the cluster centers. This problem, and much more general clustering problems, can be formulated with "knapsack" and "partition" constraints. We develop new randomized algorithms targeting such problems, and study two in particular: multi-knapsack median and multi-knapsack center. Our rounding algorithms give new approximation and pseudo-approximation algorithms for these problems. One key technical tool, which may be of independent interest, is a new tail bound, analogous to that of Feige (2006), for sums of random variables with unbounded variances. Such bounds can be useful in inferring properties of large networks using few samples.
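
As background for the rounding algorithms mentioned above, the following is a minimal sketch of classical pairwise dependent rounding under a single cardinality constraint (a knapsack with unit weights). It is not the authors' Knapsack-Partition Rounding, which handles several weighted knapsacks and a partition constraint. The function name `pairwise_dependent_round`, the tolerance `eps`, and the example vector are assumptions chosen for illustration. The sketch assumes the input lies in [0, 1]^n and sums to an integer.

```python
import random


def pairwise_dependent_round(x, rng=None, eps=1e-9):
    """Round a fractional vector to 0/1 values (illustrative sketch only).

    Assumes every entry of x lies in [0, 1] and sum(x) is an integer.
    Each step shifts mass between two fractional entries, so the sum is
    preserved exactly and each entry keeps its expectation.
    """
    rng = rng or random.Random()
    x = list(x)
    while True:
        # Indices whose values are still strictly fractional.
        frac = [i for i, v in enumerate(x) if eps < v < 1 - eps]
        if len(frac) < 2:
            break
        i, j = frac[0], frac[1]
        a = min(1 - x[i], x[j])  # largest shift from j to i
        b = min(x[i], 1 - x[j])  # largest shift from i to j
        # Choose the direction so that the expected change of x[i] is zero:
        # a * b/(a+b) - b * a/(a+b) = 0.
        if rng.random() < b / (a + b):
            x[i] += a
            x[j] -= a
        else:
            x[i] -= b
            x[j] += b
        # At least one of x[i], x[j] is now integral, so the loop terminates.
    return [int(round(v)) for v in x]


if __name__ == "__main__":
    fractional = [0.5, 0.5, 0.25, 0.75, 0.6, 0.4]  # sums to 3
    print(pairwise_dependent_round(fractional, random.Random(0)))
```

Every output of this sketch selects exactly `sum(x)` items, and item `i` is selected with probability `x[i]`. Extending these two properties to several weighted budgets at once is the harder setting that the paper addresses.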
