Embracing Contradiction: Theoretical Inconsistency Will Not Impede the Road of Building Responsible AI Systems
Gordon Dai, Yunze Xiao
TL;DR
The paper reframes Responsible AI evaluation by treating metric inconsistency as a feature rather than a flaw, arguing that intra-concept inconsistencies (e.g., among different fairness metrics) and inter-concept trade-offs (e.g., accuracy versus privacy) preserve diverse stakeholder values, enrich conceptual understanding, and act as practical regularizers. It formalizes two forms of inconsistency, reviews canonical results and mechanisms (gradient conflicts, Pareto/Rashomon sets, and cross-metric regularization) that can yield robust, generalizable behavior under real-world uncertainty, and introduces a protocol for transparent documentation of inconsistency via Metric Provenance Sheets and SPHERE-aligned reporting. The authors advocate practice-driven theories, empirical study of human–metric interactions, and profile-based governance to navigate pluralistic evaluation without sacrificing accountability. Overall, the work argues for embracing contradiction to enhance normative representation, epistemic fidelity, and operational resilience in responsible AI systems.
Abstract
This position paper argues that the theoretical inconsistency often observed among Responsible AI (RAI) metrics, such as differing fairness definitions or trade-offs between accuracy and privacy, should be embraced as a valuable feature rather than a flaw to be eliminated. We contend that navigating these inconsistencies, by treating metrics as divergent objectives, yields three key benefits: (1) Normative Pluralism: Maintaining a full suite of potentially contradictory metrics ensures that the diverse moral stances and stakeholder values inherent in RAI are adequately represented. (2) Epistemological Completeness: The use of multiple, sometimes conflicting, metrics allows for a more comprehensive capture of multifaceted ethical concepts, thereby preserving greater informational fidelity about these concepts than any single, simplified definition. (3) Implicit Regularization: Jointly optimizing for theoretically conflicting objectives discourages overfitting to one specific metric, steering models towards solutions with enhanced generalization and robustness under real-world complexities. In contrast, efforts to enforce theoretical consistency by simplifying or pruning metrics risk narrowing this value diversity, losing conceptual depth, and degrading model performance. We therefore advocate for a shift in RAI theory and practice: from getting trapped in inconsistency to characterizing acceptable inconsistency thresholds and elucidating the mechanisms that permit robust, approximate consistency in practice.
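As a minimal illustration of the kind of inconsistency among fairness definitions mentioned above, the following sketch compares two common criteria, demographic parity (equal selection rates across groups) and equal opportunity (equal true positive rates across groups), on a toy dataset where the two groups have different base rates. The data, the group names, and the helper functions (`selection_rate`, `true_positive_rate`, `gaps`) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from fractions import Fraction


def selection_rate(preds):
    """Fraction of individuals who receive a positive prediction."""
    return Fraction(sum(preds), len(preds))


def true_positive_rate(labels, preds):
    """Fraction of truly positive individuals who are predicted positive."""
    preds_for_positives = [p for y, p in zip(labels, preds) if y == 1]
    return Fraction(sum(preds_for_positives), len(preds_for_positives))


def gaps(labels, preds):
    """Return (demographic parity gap, equal opportunity gap) between groups A and B."""
    dp_gap = abs(selection_rate(preds["A"]) - selection_rate(preds["B"]))
    eo_gap = abs(
        true_positive_rate(labels["A"], preds["A"])
        - true_positive_rate(labels["B"], preds["B"])
    )
    return dp_gap, eo_gap


# Hypothetical toy data: group A has a 60% base rate, group B a 30% base rate.
labels = {
    "A": [1] * 6 + [0] * 4,
    "B": [1] * 3 + [0] * 7,
}

# Predictor 1: perfectly accurate, so true positive rates are equal (both 1),
# but selection rates inherit the unequal base rates.
perfect = {group: list(ys) for group, ys in labels.items()}

# Predictor 2: selects exactly half of each group, so selection rates match,
# but it must miss a positive in A and accept negatives in B.
parity = {
    "A": [1] * 5 + [0] * 5,  # 5 of the 6 positives selected
    "B": [1] * 5 + [0] * 5,  # all 3 positives plus 2 negatives selected
}

for name, preds in [("perfect", perfect), ("parity", parity)]:
    dp_gap, eo_gap = gaps(labels, preds)
    print(f"{name}: demographic parity gap = {dp_gap}, equal opportunity gap = {eo_gap}")
```

In this toy setting, driving one gap to zero leaves the other nonzero, so reporting both metrics conveys information that either one alone would hide. This is only a two-metric, two-group sketch under the stated assumptions, not a general proof.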
