Predictive Red Teaming: Breaking Policies Without Breaking Robots
Anirudha Majumdar, Mohit Sharma, Dmitry Kalashnikov, Sumeet Singh, Pierre Sermanet, Vikas Sindhwani
TL;DR
The paper addresses the challenge of predicting how visuomotor policies fail under unseen environmental factors without hardware testing in those conditions. It formalizes predictive red teaming and introduces RoboART, a pipeline that edits nominal observations via generative image editing and then runs a policy-specific anomaly detector on the edited observations to forecast performance degradation. This supports both a rank-based vulnerability assessment and an estimate of absolute performance. Empirical results across 12 off-nominal factors and 500+ hardware trials show that RoboART predicts success rates with an average error $< 0.19$, and that targeted data collection guided by its predictions boosts performance by $2$–$7\times$ and improves cross-domain generalization by $2$–$5\times$. The work highlights the practical value of anticipating a policy's limits before deployment: it can guide data collection, inform policy comparison, and support safer, more robust real-world robot operation. The paper also discusses limitations, such as the gap between edited and real observations and the use of single-time-step anomaly estimates, and points to future work on 3D scene editing and iterative exploration of environmental factor spaces.
Abstract
Visuomotor policies trained via imitation learning are capable of performing challenging manipulation tasks, but are often extremely brittle to lighting, visual distractors, and object locations. These vulnerabilities can depend unpredictably on the specifics of training, and are challenging to expose without time-consuming and expensive hardware evaluations. We propose the problem of predictive red teaming: discovering vulnerabilities of a policy with respect to environmental factors, and predicting the corresponding performance degradation without hardware evaluations in off-nominal scenarios. In order to achieve this, we develop RoboART: an automated red teaming (ART) pipeline that (1) modifies nominal observations using generative image editing to vary different environmental factors, and (2) predicts performance under each variation using a policy-specific anomaly detector executed on edited observations. Experiments across 500+ hardware trials in twelve off-nominal conditions for visuomotor diffusion policies demonstrate that RoboART predicts performance degradation with high accuracy (less than 0.19 average difference between predicted and real success rates). We also demonstrate how predictive red teaming enables targeted data collection: fine-tuning with data collected under conditions predicted to be adverse boosts baseline performance by 2-7x.
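The second stage of the pipeline described above (scoring edited observations with an anomaly detector and turning those scores into a predicted success rate) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the anomaly score is a simple k-nearest-neighbor distance in an embedding space, the threshold is a quantile of scores on held-out nominal data, and the predicted success rate is the nominal success rate scaled by the fraction of edited observations that look nominal. The embeddings are assumed to come from the policy's own visual encoder applied to nominal and edited images; the generative image editing step is not shown.

```python
import numpy as np


def knn_anomaly_scores(reference, queries, k=5):
    """Mean Euclidean distance from each query to its k nearest reference embeddings."""
    # Pairwise distances, shape (num_queries, num_reference).
    dists = np.linalg.norm(queries[:, None, :] - reference[None, :, :], axis=-1)
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)


def predict_success_rates(reference, calibration, edited_by_factor,
                          nominal_success, k=5, quantile=0.95):
    """Predict a success rate per environmental factor (illustrative assumption).

    reference:        nominal embeddings used as the kNN reference set
    calibration:      held-out nominal embeddings used to set the threshold
    edited_by_factor: dict mapping factor name -> embeddings of edited observations
    nominal_success:  success rate measured under nominal conditions
    """
    # Threshold: a quantile of anomaly scores on held-out nominal data.
    threshold = np.quantile(knn_anomaly_scores(reference, calibration, k), quantile)
    predictions = {}
    for factor, embeddings in edited_by_factor.items():
        scores = knn_anomaly_scores(reference, embeddings, k)
        frac_nominal = float(np.mean(scores <= threshold))
        # Assumed model: anomalous observations lead to failure, and
        # non-anomalous ones succeed at the nominal rate.
        predictions[factor] = nominal_success * frac_nominal
    return predictions


def rank_factors(predictions):
    """Order factors from most to least harmful (lowest predicted success first)."""
    return sorted(predictions, key=predictions.get)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(200, 8))    # stand-in for nominal embeddings
    calibration = rng.normal(size=(100, 8))  # held-out nominal embeddings
    edited = {
        "small_change": rng.normal(size=(50, 8)),         # hypothetical factor
        "large_change": rng.normal(size=(50, 8)) + 10.0,  # hypothetical factor
    }
    preds = predict_success_rates(reference, calibration, edited, nominal_success=0.8)
    print(preds)
    print(rank_factors(preds))
```

The ranking returned by `rank_factors` corresponds to the rank-based view of vulnerabilities, while the dictionary of predicted rates corresponds to the absolute view, which can be compared against success rates measured on hardware when those are available.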
