"I know it's not right, but that's what it said to do": Investigating Trust in AI Chatbots for Cybersecurity Policy
Brandon Lit, Edward Crowder, Daniel Vogel, Hassan Khan
TL;DR
The study investigates how trust in AI chatbots affects susceptibility to malicious guidance in cybersecurity policy tasks. Using an in situ deception design (N=15), participants interact with a benign chatbot for three tasks and a manipulated adversarial chatbot for the remaining two, while their actions on a Windows VM are logged. Results show that many participants follow bad advice, with trust influenced by task familiarity and by participants' confidence in their own judgment, highlighting how hard it is to recognize a compromised AI. The work offers an ecologically valid methodology, behavioral insights into human–AI trust in security contexts, and design recommendations to foster critical evaluation of chatbot guidance.
Abstract
AI chatbots are an emerging security attack vector, vulnerable to threats such as prompt injection and rogue chatbot creation. When deployed in domains such as corporate security policy, they could be weaponized to deliver guidance that intentionally undermines system defenses. We investigate whether users can be tricked by a compromised AI chatbot in this scenario. A controlled study (N=15) asked participants to use a chatbot to complete security-related tasks. Without their knowledge, the chatbot was manipulated to give incorrect advice for some tasks. The results show how trust in AI chatbots is related to task familiarity and to users' confidence in their own judgment. Additionally, we discuss possible reasons why people do or do not trust AI chatbots in different scenarios.
