Engineering Trustworthy Automation: Design Principles and Evaluation for AutoML Tools for Novices
Jarne Thys, Davy Vanacken, Gustavo Rovelo Ruiz
TL;DR
The paper addresses the challenge of making AutoML accessible to novices by proposing an abstract end-to-end pipeline that covers data intake, guided configuration, training, evaluation, and inference. It introduces NovaClass, a novice-friendly prototype for transformer-based text classification that emphasizes one-click training, cascade classification, and metadata-driven inference, paired with a context-aware assistant. In a 24-participant study, all participants built their own models and rated the user experience positively, but experienced users reported higher trust and understanding than novices, which highlights gaps in mental models and transparency. From these findings, the authors derive four design principles for novice AutoML tools: support first-model success, provide explanations, offer abstractions with context-aware assistance, and ensure predictability and safeguards. These principles are intended to guide future development toward usable, trustworthy end-to-end AutoML for non-experts.
Abstract
AutoML systems targeting novices often prioritize algorithmic automation over usability, leaving gaps in users' understanding, trust, and end-to-end workflow support. To address these issues, we propose an abstract pipeline that covers data intake, guided configuration, training, evaluation, and inference. To examine the abstract pipeline, we report a user study that assesses trust, understandability, and UX of a prototype implementation. In this 24-participant study, all participants successfully built their own models and UEQ ratings were positive, yet experienced users reported higher trust and understanding than novices. Based on this study, we propose four design principles to improve the design of AutoML systems targeting novices: (P1) support first-model success to enhance user self-efficacy, (P2) provide explanations to help users form correct mental models and develop appropriate levels of reliance, (P3) provide abstractions and context-aware assistance to keep users in their zone of proximal development, and (P4) ensure predictability and safeguards to strengthen users' sense of control.
