Enhancing Trust in LLM-Based AI Automation Agents: New Considerations and Future Challenges
Sivan Schwartz, Avi Yaeli, Segev Shlomov
TL;DR
This paper examines how to establish and evaluate trust in LLM-based AI automation agents, an emerging class of autonomous digital agents embedded in no-code automation workflows. It surveys trust concepts from research on human-human trust and translates them into dimensions for AI agents: reliability, openness, tangibility, immediacy behaviors, task characteristics, and trust trajectory. The authors propose concrete mediations and grounding mechanisms to improve reliability, advocate transparency about goals and algorithms, and discuss the value of avatars and empathetic interactions. A comparative analysis of nascent products (ChatGPT+Plugins, MS Copilot, AgentGPT, and Adept.AI) highlights gaps and informs a roadmap for metrics, testing, and governance. The work argues that trustworthy AI automation is essential for safe deployment in business processes and calls for interdisciplinary approaches to defining metrics and certification.
Abstract
Trust in AI agents has been extensively studied in the literature, resulting in significant advances in our understanding of the field. However, rapid progress in Large Language Models (LLMs) and the emergence of LLM-based AI agent frameworks pose new challenges and opportunities for further research. In the field of process automation, a new generation of AI-based agents has emerged, enabling the execution of complex tasks. At the same time, building automation has become more accessible to business users through user-friendly no-code tools and training mechanisms. This paper explores these new challenges and opportunities, analyzes the main aspects of trust in AI agents discussed in the existing literature, and identifies specific considerations and challenges relevant to this new generation of automation agents. We also evaluate how nascent products in this category address these considerations. Finally, we highlight several challenges that the research community should address in this evolving landscape.
