MRCpy: A Library for Minimax Risk Classifiers
Kartheek Bondugula, Verónica Álvarez, José I. Segovia-Martín, Aritz Pérez, Santiago Mazuelas
TL;DR
MRCpy is a Python library, compatible with scikit-learn, that implements minimax risk classifiers based on robust risk minimization and provides worst-case performance guarantees, including under distribution shift. The framework defines uncertainty sets via moment constraints on a feature map $\boldsymbol{\Phi}$ and supports $0$-$1$ and log losses; learning is posed as a convex optimization problem, and $L_1$ regularization yields sparse solutions in high dimensions. It extends to concept drift via AMRC and to general covariate shift via DW-GCS, offering efficient solvers (subgradient, CG, SGD/Adam, constraint generation) and a modular, extensible architecture. Empirical results show faster hyper-parameter tuning using the performance upper bounds, competitive accuracy on high-dimensional biological data, and effective adaptation to concept drift and covariate shift.
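To make the idea of moment constraints on $\boldsymbol{\Phi}$ concrete, the sketch below estimates a mean vector and an interval width from training data using plain NumPy. It is an illustration under stated assumptions, not MRCpy's code: the one-hot feature map, the interval form $|\mathbb{E}_p[\boldsymbol{\Phi}] - \boldsymbol{\tau}| \preceq \boldsymbol{\lambda}$, the scaling parameter `s`, and all function names are hypothetical choices made for this example.

```python
import numpy as np


def one_hot_feature_map(X, y, n_classes):
    """Hypothetical feature map: Phi(x, y) = e_y (x) [1, x].

    Places the augmented instance [1, x] in the block indexed by label y
    and zeros elsewhere. Assumed here for illustration only.
    """
    n, d = X.shape
    X1 = np.hstack([np.ones((n, 1)), X])      # add an intercept column
    phi = np.zeros((n, n_classes, d + 1))
    phi[np.arange(n), y] = X1                 # fill the block of each label
    return phi.reshape(n, -1)


def moment_constraints(X, y, n_classes, s=0.3):
    """Estimate the centre tau and half-width lam of the moment constraints.

    tau is the sample mean of Phi; lam scales the sample standard
    deviation by s / sqrt(n). Both choices are assumptions of this sketch.
    """
    phi = one_hot_feature_map(X, y, n_classes)
    tau = phi.mean(axis=0)
    lam = s * phi.std(axis=0) / np.sqrt(len(y))
    return tau, lam


def in_uncertainty_set(expectation, tau, lam):
    """Check |E_p[Phi] - tau| <= lam componentwise for a candidate p."""
    return bool(np.all(np.abs(expectation - tau) <= lam))
```

With these definitions, any distribution whose expected feature vector stays within `lam` of `tau` in every component belongs to the uncertainty set; the empirical distribution, whose expectation equals `tau`, always does.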
Abstract
Libraries for supervised classification have enabled the widespread use of machine learning methods. Existing libraries, such as scikit-learn, caret, and mlpack, implement techniques based on the classical empirical risk minimization (ERM) approach. We present a Python library, MRCpy, that implements minimax risk classifiers (MRCs) based on the robust risk minimization (RRM) approach. The library offers multiple variants of MRCs that can provide performance guarantees, enable efficient learning in high dimensions, and adapt to distribution shifts. MRCpy follows an object-oriented approach and adheres to the standards of popular Python libraries, such as scikit-learn, which facilitates readability, ease of use, and seamless integration with other libraries. The source code is available under the GPL-3.0 license at https://github.com/MachineLearningBCAM/MRCpy.
