ANML: Attribution-Native Machine Learning with Guaranteed Robustness
Oliver Zahn, Matt Beton, Simran Chana
TL;DR
ANML (Attribution-Native Machine Learning) weights each training sample's contribution by four quality factors: gradient consistency $q$, verification $v$, reputation $r$, and freshness $T$. The paper analyzes multiplicative and robust strategies for combining these signals and proposes Two-Stage Adaptive Gating and Softmax Blend to preserve baseline performance and resist data poisoning. Across five datasets, ANML reduces error by 33-72% relative to gradient-only baselines, and 20% high-quality data outperforms 100% uniformly weighted data by 47%. The method is also robust under adversarial conditions and yields meaningful gains from contributor-level attribution. For training frontier AI on expert data, the framework offers mechanisms to credit contributors and sustain data-sharing incentives while keeping training dynamics safe, including a safety fallback when external signals are compromised.
Abstract
Frontier AI systems increasingly train on specialized expert data, from clinical records to proprietary research to curated datasets, yet current training pipelines treat all samples identically: a Nobel laureate's contribution receives the same weight as an unverified submission. We introduce ANML (Attribution-Native Machine Learning), a framework that weights training samples by four quality factors: gradient-based consistency (q), verification status (v), contributor reputation (r), and temporal relevance (T). By combining what the model observes (gradient signals) with what the system knows about data provenance (external signals), ANML produces per-contributor quality weights that simultaneously improve model performance and enable downstream attribution. Across five datasets (178 to 32,561 samples), ANML achieves 33-72% error reduction over gradient-only baselines. Quality-weighted training is data-efficient: 20% high-quality data outperforms 100% uniformly weighted data by 47%. A Two-Stage Adaptive Gating mechanism guarantees that ANML never underperforms the best available baseline, including under strategic joint attacks that combine credential faking with gradient alignment. When per-sample detection fails against subtle corruption, contributor-level attribution provides 1.3-5.3x greater improvement than sample-level methods, with the advantage growing as corruption becomes harder to detect.
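
To make the idea of quality-weighted training concrete, the following is a minimal sketch of a multiplicative combination of the four factors. It is an illustration under stated assumptions, not the authors' implementation: the names `Sample`, `multiplicative_weight`, and `weighted_loss` are hypothetical, each factor is assumed to be already normalized to [0, 1], and the uniform fallback when all weights vanish is an assumption of this sketch rather than the paper's Two-Stage Adaptive Gating.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One training sample with its loss and four quality factors.

    All factors are assumed (for this sketch) to lie in [0, 1].
    """
    loss: float  # per-sample training loss
    q: float     # gradient-based consistency
    v: float     # verification status
    r: float     # contributor reputation
    T: float     # temporal relevance


def multiplicative_weight(s: Sample) -> float:
    """Combine the four factors by simple multiplication (hypothetical form)."""
    return s.q * s.v * s.r * s.T


def weighted_loss(samples: List[Sample]) -> float:
    """Weight-normalized average loss over a batch.

    If every weight is zero, fall back to the uniform average so the
    batch still produces a usable training signal (sketch assumption).
    """
    weights = [multiplicative_weight(s) for s in samples]
    total = sum(weights)
    if total == 0.0:
        return sum(s.loss for s in samples) / len(samples)
    return sum(w * s.loss for w, s in zip(weights, samples)) / total


batch = [
    Sample(loss=1.0, q=1.0, v=1.0, r=1.0, T=1.0),  # fully trusted sample
    Sample(loss=3.0, q=0.5, v=1.0, r=1.0, T=1.0),  # less consistent sample
]
print(weighted_loss(batch))
```

In this sketch a single factor near zero suppresses a sample entirely, which is one reason a purely multiplicative rule can be fragile when an external signal is faked or missing.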
