Antagonistic AI
Alice Cai, Ian Arawjo, Elena L. Glassman
TL;DR
Motivated by the concern that current AI often favors sycophancy and safety at the cost of user resilience and authentic agency, the paper proposes antagonistic AI as a design space. Through prompting experiments and a speculative design workshop, the authors explore how opposition, conflict, and humor might benefit users. They contribute a taxonomy of antagonism (three types), seven benefit categories, eight design techniques, and a seven-dimensional design space, along with three dimensions for responsible design: consent, context, and framing. They argue that responsible exploration of this space can broaden AI safety and ethics discourse, while acknowledging methodological limits and the need for empirical validation.
Abstract
The vast majority of discourse around AI development assumes that subservient, "moral" models aligned with "human values" are universally beneficial -- in short, that good AI is sycophantic AI. We explore the shadow of the sycophantic paradigm, a design space we term antagonistic AI: AI systems that are disagreeable, rude, interrupting, confrontational, challenging, etc. -- embedding opposite behaviors or values. Far from being "bad" or "immoral," we consider whether antagonistic AI systems may sometimes have benefits to users, such as forcing users to confront their assumptions, build resilience, or develop healthier relational boundaries. Drawing from formative explorations and a speculative design workshop where participants designed fictional AI technologies that employ antagonism, we lay out a design space for antagonistic AI, articulating potential benefits, design techniques, and methods of embedding antagonistic elements into user experience. Finally, we discuss the many ethical challenges of this space and identify three dimensions for the responsible design of antagonistic AI -- consent, context, and framing.
