"What If Smart Homes Could See Our Homes?": Exploring DIY Smart Home Building Experiences with VLM-Based Camera Sensors
Sojeong Yun, Youn-kyung Lim
TL;DR
This work investigates how Vision-Language Model (VLM) camera sensors could transform DIY smart homes by enabling autonomous understanding of household contexts. Through a three-week diary-based experience prototyping study with 12 participants, the authors report three key outcomes: the roles participants envisioned for VLM-based features (auto-monitoring, assistant, advisory); the distinctive sensor characteristics that reshape the DIY process (comprehensive sensing, inference, perspective-embodied sensing, unbounded values, interpretive capabilities); and user concerns (privacy, replacement of family interactions, over-dependence, and AI control). The authors offer design implications across the DIY workflow to support feature construction with VLM sensors and discuss implications for living with intelligent homes. The findings highlight both the potential to simplify DIY smart-home building and the need to address trust, privacy, and social-psychological dynamics to preserve user autonomy and well-being. Overall, the work provides a user-centered foundation for developing VLM-based DIY smart-home systems and identifies critical directions for real-world deployment and collaborative use.
Abstract
The advancement of Vision-Language Model (VLM) camera sensors, which enable autonomous understanding of household situations without user intervention, has the potential to transform the DIY smart home building experience. Will these sensors simplify or complicate the DIY smart home process? And what features do users want to create with them? To explore these questions, we conducted a three-week diary-based experience prototyping study with 12 participants. Participants recorded their daily activities, used GPT to analyze the images, and manually customized and tested smart home features based on the analysis. The study yielded three key findings: (1) participants' expectations for VLM camera-based smart homes, (2) the impact of VLM camera sensor characteristics on the DIY process, and (3) users' concerns. Based on these findings, we propose design implications to support the DIY smart home building process with VLM camera sensors, and discuss living with intelligence.
