CLIMB: Class-imbalanced Learning Benchmark on Tabular Data
Zhining Liu, Zihao Li, Ze Yang, Tianxin Wei, Jian Kang, Yada Zhu, Hendrik Hamann, Jingrui He, Hanghang Tong
TL;DR
CLIMB is a benchmark for class-imbalanced learning (CIL) on tabular data, comprising 73 real-world datasets and 29 representative CIL algorithms implemented behind a unified API. Its evaluation protocol uses standardized preprocessing, 5-fold stratified splits, 100 hyperparameter trials per method, and three metrics ($AUPRC$, macro-$F_1$, and $BAC$), which enables fair, large-scale comparisons. The key findings are that naive rebalancing often hurts performance; that ensemble approaches, especially undersampling ensembles, yield robust gains; and that data quality factors such as label noise and missing values can affect performance more than the imbalance itself. The study also shows that the choice of metric changes how results are interpreted, analyzes runtime trade-offs, and finds that safe data cleaning can matter as much as, or more than, rebalancing, which gives practical guidance for deploying CIL methods in industry. The open-source CLIMB package, the datasets, and the empirical findings are intended to guide future research and real-world applications in imbalanced tabular tasks.
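To make the three metrics concrete, here is a minimal pure-Python sketch. It is not CLIMB's implementation: the function names are illustrative, $BAC$ is read as balanced accuracy (the mean of per-class recall), and $AUPRC$ is assumed to be computed as step-wise average precision, which is one common convention. The toy labels are made up to show why plain accuracy misleads under imbalance.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """BAC: unweighted mean of per-class recall."""
    recalls = []
    for c in sorted(set(y_true)):
        n_c = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / n_c)
    return sum(recalls) / len(recalls)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Macro-F1: unweighted mean of per-class F1 (0 when a class has no TP/FP/FN)."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def auprc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision for positive class 1: sum_k (R_k - R_{k-1}) * P_k."""
    n_pos = sum(1 for t in y_true if t == 1)
    if n_pos == 0:
        raise ValueError("y_true contains no positive samples")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    tp = fp = 0
    prev_recall = 0.0
    ap = 0.0
    i = 0
    while i < len(ranked):
        # Consume all samples tied at this score as a single threshold.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        precision = tp / (tp + fp)
        recall = tp / n_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Toy data (made up): 8 majority samples, 2 minority samples.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_majority = [0] * 10  # always predicts the majority class: 80% accuracy
y_score = [0.1, 0.2, 0.1, 0.3, 0.2, 0.6, 0.1, 0.4, 0.9, 0.5]

print(round(balanced_accuracy(y_true, y_majority), 3))  # 0.5
print(round(macro_f1(y_true, y_majority), 3))           # 0.444
print(round(auprc(y_true, y_score), 3))                 # 0.833
```

The always-majority predictor reaches 80% plain accuracy on this toy data but only 0.5 balanced accuracy and about 0.444 macro-$F_1$, which is why imbalance-aware metrics are used for the comparison.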
Abstract
Class-imbalanced learning (CIL) on tabular data is important in many real-world applications where the minority class holds the critical but rare outcomes. In this paper, we present CLIMB, a comprehensive benchmark for class-imbalanced learning on tabular data. CLIMB includes 73 real-world datasets across diverse domains and imbalance levels, along with unified implementations of 29 representative CIL algorithms. Built on a high-quality open-source Python package with unified API designs, detailed documentation, and rigorous code quality controls, CLIMB supports easy implementation and comparison between different CIL algorithms. Through extensive experiments, we provide practical insights on method accuracy and efficiency, highlighting the limitations of naive rebalancing, the effectiveness of ensembles, and the importance of data quality. Our code, documentation, and examples are available at https://github.com/ZhiningLiu1998/imbalanced-ensemble.
