Interpretable Fair Clustering
Mudi Jiang, Jiahui Zhou, Xinying Liu, Zengyou He, Zhikui Chen
TL;DR
This work addresses the need for interpretable clustering that also enforces group fairness. It proposes IFCT, a decision-tree-based framework that jointly optimizes intra-cluster compactness and fairness via $\mathcal{L}(\mathcal{T}) = \mathcal{L}_C(\mathcal{T}) + \lambda \mathcal{L}_F(\mathcal{T})$, where $\mathcal{L}_C$ is the compactness term, $\mathcal{L}_F$ is the fairness term, and $\lambda$ weights the trade-off between them. It also proposes IFCT-P, a variant that needs no fairness hyperparameter tuning because it post-prunes a tree built without fairness constraints. The method supports mixed-type features and multiple sensitive attributes. In experiments on real-world and synthetic datasets, IFCT generally outperforms baselines on fairness while keeping clustering performance competitive, and IFCT-P performs robustly without parameter tuning. The work offers a practical path toward transparent, fair clustering suitable for high-stakes applications.
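To make the combined objective $\mathcal{L}(\mathcal{T}) = \mathcal{L}_C(\mathcal{T}) + \lambda \mathcal{L}_F(\mathcal{T})$ concrete, here is a minimal sketch of how such a loss could be evaluated for a given cluster assignment. The summary does not define the two terms, so both are assumptions made for illustration: `compactness_loss` is taken to be the within-cluster sum of squared distances, and `fairness_loss` is taken to be the size-weighted gap between each cluster's group mix and the dataset's overall mix. The function names and the toy data are hypothetical and are not drawn from the paper.

```python
from collections import Counter

def compactness_loss(points, labels):
    """Assumed L_C: sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total

def fairness_loss(groups, labels):
    """Assumed L_F: size-weighted total variation distance between each
    cluster's protected-group proportions and the overall proportions."""
    n = len(groups)
    overall = Counter(groups)
    loss = 0.0
    for c in set(labels):
        in_cluster = [g for g, l in zip(groups, labels) if l == c]
        counts = Counter(in_cluster)
        gap = sum(abs(counts[g] / len(in_cluster) - overall[g] / n) for g in overall)
        loss += (len(in_cluster) / n) * 0.5 * gap
    return loss

def total_loss(points, groups, labels, lam):
    """L = L_C + lam * L_F, mirroring the form of the objective in the summary."""
    return compactness_loss(points, labels) + lam * fairness_loss(groups, labels)

# Toy data: two tight pairs of points, each pair drawn from a single group.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
groups = ["a", "a", "b", "b"]
compact_labels = [0, 0, 1, 1]  # tight clusters, but each holds only one group
mixed_labels = [0, 1, 0, 1]    # spread-out clusters, but each is group-balanced

for lam in (0.0, 1000.0):
    print(lam,
          total_loss(points, groups, compact_labels, lam),
          total_loss(points, groups, mixed_labels, lam))
```

With `lam = 0` the compact assignment has the lower loss, while a large `lam` makes the group-balanced assignment preferable. Under these assumed definitions, this is the trade-off that the weight $\lambda$ controls.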
Abstract
Fair clustering has gained increasing attention in recent years, especially in applications involving socially sensitive attributes. However, existing fair clustering methods often lack interpretability, limiting their applicability in high-stakes scenarios where understanding the rationale behind clustering decisions is essential. In this work, we address this limitation by proposing an interpretable and fair clustering framework, which integrates fairness constraints into the structure of decision trees. Our approach constructs interpretable decision trees that partition the data while ensuring fair treatment across protected groups. To further enhance the practicality of our framework, we also introduce a variant that requires no fairness hyperparameter tuning, achieved through post-pruning a tree constructed without fairness constraints. Extensive experiments on both real-world and synthetic datasets demonstrate that our method not only delivers competitive clustering performance and improved fairness, but also offers additional advantages such as interpretability and the ability to handle multiple sensitive attributes. These strengths enable our method to perform robustly under complex fairness constraints, opening new possibilities for equitable and transparent clustering.
