IDA: Breaking Barriers in No-code UI Automation Through Large Language Models and Human-Centric Design

Segev Shlomov, Avi Yaeli, Sami Marreed, Sivan Schwartz, Netanel Eder, Offer Akrabi, Sergey Zeltyn

TL;DR

The paper addresses the complexity barrier that keeps non-technical business users from adopting UI automation by introducing IDA, a no-code tool that combines guided programming by demonstration, a semantic programming model, and a teacher-student learning metaphor. Leveraging large language models, IDA supports multiple demonstrations, semantic element understanding, and robust program synthesis to automate web tasks across enterprise applications. A prototype implements a Define–Teach–Validate workflow and was evaluated on real-world tasks such as HR candidate screening in a user study with eight participants, who perceived it as user-friendly and trustworthy. The work contributes a human-centric design, novel LLM-powered semantic detection, and demonstration-guided program synthesis, highlighting IDA’s potential to boost business productivity by letting business users self-serve web UI automation without writing code.

Abstract

Business users dedicate significant amounts of time to repetitive tasks within enterprise digital platforms, highlighting a critical need for automation. Despite advancements in low-code tools for UI automation, their complexity remains a significant barrier to adoption among non-technical business users. However, recent advancements in large language models (LLMs) have created new opportunities to overcome this barrier by offering more powerful, yet simpler and more human-centric programming environments. This paper presents IDA (Intelligent Digital Apprentice), a novel no-code Web UI automation tool designed specifically to empower business users with no technical background. IDA incorporates human-centric design principles, including guided programming by demonstration, a semantic programming model, and a teacher-student learning metaphor tailored to the skill set of business users. By leveraging LLMs, IDA overcomes some of the key technical barriers that have traditionally limited the feasibility of no-code solutions. We have developed a prototype of IDA and conducted a user study involving real-world business users and enterprise applications. The promising results indicate that users could effectively use IDA to create automations. The qualitative feedback indicates that IDA is perceived as user-friendly and trustworthy. This study contributes to unlocking the potential of AI assistants to enhance the productivity of business users through no-code user interface automation.

Paper Structure (32 sections, 14 figures, 3 tables)

Figures (14)

  • Figure 1: Decision-making flowchart in OrangeHRM: Actions (white), observed states (purple), and decision outcomes (green).
  • Figure 2: UI interaction steps in OrangeHRM for candidate decision-making. Includes candidate lookup, search results, and resume status checks.
  • Figure 3: IDA Integration for Seamless Workflow: Showcases the side-by-side setup of the IDA client and HR application, enabling Cassie to interact with both simultaneously for a seamless demonstration and feedback experience.
  • Figure 4: IDA's Automation Lifecycle: Illustrates IDA's step-by-step guidance from defining ('I'm Learning' state) to validation and deployment readiness ('Ready to Deploy' state), ensuring a user-validated development process.
  • Figure 5: IDA's in-context Semantic Guidance: In the manual-review1 scenario, (a) IDA displays captured steps, (b) Cassie identifies the search table by selecting the header, and (c) IDA suggests actions for Cassie to choose verbally or by click. Semantic conditions are highlighted in purple, decisions in green.
  • ...and 9 more figures