Bridging the gap between natural user expression with complex automation programming in smart homes
Yingtian Shi, Xiaoyi Liu, Chun Yu, Tianao Yang, Cheng Gao, Chen Liang, Yuanchun Shi
TL;DR
This work tackles the challenge of letting end-users configure complex smart-home automation through natural expression while keeping the resulting rules executable. It introduces AwareAuto, a system that combines context-aware multimodal sensing with two-step LLM inference (intent and feasibility) and a TA-pair grounding mechanism to generate deployable automation rules. Key contributions include standardized multimodal inputs, a structured prompt framework for both rule inference and grounding, a TA-pair representation of automation, and an interactive loop for controllability. On realistic, complex rule data, the system reaches 91.7% accuracy in matching user intentions and feasibility. The approach shows how grounding LLMs in real-world smart-home contexts can balance naturalness and expressiveness in end-user programming of dynamic, multimodal automation.
Abstract
A long-standing challenge in end-user programming (EUP) is the trade-off between natural user expression and the complexity of programming tasks. Although large language models (LLMs) can now handle semantic inference and natural language understanding, it remains under-explored how these capabilities can help end-users configure complex automation more naturally and easily. We propose AwareAuto, an EUP system that standardizes user expression and performs two-step inference with LLMs to generate automation. AwareAuto supports contextual, multimodal, and flexible user expression for configuring complex automation tasks (e.g., dynamic parameters, multiple conditional branches, and temporal constraints) that traditional EUP solutions cannot handle. Evaluated on realistic, complex rule data, AwareAuto achieves 91.7% accuracy in matching user intentions and feasibility. We introduce user interaction to ensure system controllability and usability. We discuss the opportunities and challenges of incorporating LLMs into end-user programming techniques and of grounding them in complex smart home contexts.
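
To make the idea of a rule with a temporal constraint and a feasibility check more concrete, the following is a minimal illustrative sketch, not AwareAuto's actual implementation. It assumes that a "TA pair" can be read as a trigger paired with an action, which the text above does not spell out. Every name here (Trigger, Action, TAPair, is_feasible, fires, the device identifiers, and the capability table) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Trigger:
    device: str            # hypothetical device identifier
    event: str             # hypothetical event name
    after_hour: int = 0    # temporal constraint: earliest hour (inclusive)
    before_hour: int = 24  # temporal constraint: latest hour (exclusive)


@dataclass
class Action:
    device: str
    command: str
    params: dict = field(default_factory=dict)  # e.g. a brightness level


@dataclass
class TAPair:
    # Assumption: "TA pair" is modeled as one trigger paired with one action.
    trigger: Trigger
    action: Action


def is_feasible(rule, capabilities):
    """Return True if the home exposes the rule's trigger event and action command."""
    trigger_ok = rule.trigger.event in capabilities.get(rule.trigger.device, set())
    action_ok = rule.action.command in capabilities.get(rule.action.device, set())
    return trigger_ok and action_ok


def fires(rule, device, event, hour):
    """Return True if an observed event matches the trigger inside its time window."""
    t = rule.trigger
    return t.device == device and t.event == event and t.after_hour <= hour < t.before_hour


# Hypothetical capability table: device -> supported events or commands.
CAPABILITIES = {
    "hall_motion": {"motion_detected"},
    "hall_lamp": {"turn_on", "turn_off"},
}

# "Turn the hall lamp on dimly when someone walks by between midnight and 6 am."
night_light = TAPair(
    Trigger("hall_motion", "motion_detected", after_hour=0, before_hour=6),
    Action("hall_lamp", "turn_on", {"brightness": 20}),
)

# A rule that refers to a device the home does not have.
fan_rule = TAPair(
    Trigger("hall_motion", "motion_detected"),
    Action("hall_fan", "turn_on"),
)
```

In this sketch, is_feasible(night_light, CAPABILITIES) accepts the first rule, while fan_rule is rejected because no "hall_fan" device is listed. A feasibility step of this kind is one simple way a generated rule could be checked against the devices actually present before deployment.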
