IRIS: Wireless Ring for Vision-based Smart Home Interaction
Maruchi Kim, Antonio Glenn, Bandhav Veluri, Yunseo Lee, Eyoel Gebre, Aditya Bagaria, Shwetak Patel, Shyamnath Gollakota
TL;DR
IRIS tackles vision-based smart home control with a fully wireless ring that integrates a camera, BLE radio, IMU, and battery in a SWaP-constrained form factor. It pairs the hardware design with a cross-layer ML pipeline (YOLO+CODA for center-object detection and DINOv2-based semantic embeddings) that uses context-rich scene semantics to recognize devices at the instance level. The system delivers real-time interaction (end-to-end latency of about 1 s or less) and 16–24 h of battery life. In a user study, participants often preferred IRIS over voice for toggling, granular control, and social acceptability. The work demonstrates practical, unobtrusive smart home control with broad potential impact on ring-based HCI, and outlines future directions in gesture expansion, on-device inference, and privacy-preserving deployment.
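The two pipeline stages named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nearest-to-image-center rule stands in for YOLO+CODA, the three-element vectors stand in for DINOv2 embeddings, and the function names, box coordinates, and registry entries are all hypothetical.

```python
import math

def center_object(detections, width, height):
    """Pick the detection whose box center is nearest the image center.

    Assumption: a simple stand-in for the paper's YOLO+CODA stage.
    """
    cx, cy = width / 2, height / 2

    def dist(det):
        x1, y1, x2, y2 = det["box"]
        return math.hypot((x1 + x2) / 2 - cx, (y1 + y2) / 2 - cy)

    return min(detections, key=dist)

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def match_instance(query, registry):
    """Return the registered instance whose embedding is most similar to the query."""
    return max(registry, key=lambda name: cosine(query, registry[name]))

# Hypothetical detections in a 320x240 frame (x1, y1, x2, y2 boxes).
detections = [
    {"label": "lamp", "box": (10, 10, 60, 80)},
    {"label": "speaker", "box": (140, 100, 200, 160)},
]
target = center_object(detections, width=320, height=240)

# Hypothetical toy embeddings; a real system would use a vision encoder's output.
registry = {
    "kitchen speaker": [0.9, 0.1, 0.0],
    "bedroom speaker": [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.1]
instance = match_instance(query, registry)
```

Here the first stage decides which object the ring is pointing at, and the second distinguishes between devices of the same type by comparing scene embeddings against those stored at registration.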
Abstract
Integrating cameras into wireless smart rings has been challenging due to size and power constraints. We introduce IRIS, the first wireless vision-enabled smart ring system for smart home interactions. Equipped with a camera, Bluetooth radio, inertial measurement unit (IMU), and an onboard battery, IRIS meets the small size, weight, and power (SWaP) requirements for ring devices. IRIS is context-aware, adapting its gesture set to the detected device, and can last for 16-24 hours on a single charge. IRIS leverages scene semantics to achieve instance-level device recognition. In a study involving 23 participants, IRIS consistently outpaced voice commands, and a higher proportion of participants preferred IRIS over voice commands for toggling a device's state, granular control, and social acceptability. Our work pushes the boundary of what is possible with ring form-factor devices, addressing system challenges and opening up novel interaction capabilities.
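The context-aware behavior described above, in which the available gestures depend on the detected device, could be sketched as a lookup keyed by device type. The gesture names, action names, and device types below are hypothetical illustrations and are not taken from the paper.

```python
# Assumption: hypothetical gesture and action names, chosen only for illustration.
GESTURES_BY_DEVICE = {
    "light": {"click": "toggle", "rotate": "set_brightness"},
    "speaker": {"click": "play_pause", "rotate": "set_volume"},
}

def resolve_action(device_type, gesture):
    """Map a gesture to an action for the detected device, or None if unsupported."""
    return GESTURES_BY_DEVICE.get(device_type, {}).get(gesture)

action = resolve_action("speaker", "rotate")
unsupported = resolve_action("light", "swipe")
```

With this structure the same physical gesture resolves to different actions depending on the device in view, and gestures that a device does not support are ignored rather than misrouted.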
