Intelligence of Things: A Spatial Context-Aware Control System for Smart Devices
Sukanth Kalivarathan, Muhmmad Abrar Raja Mohamed, Aswathy Ravikumar, S Harini
TL;DR
This work addresses the friction of device-name–based smart home control by introducing INOT, a spatial context-aware system that fuses vision-language reasoning with IoT control. The approach combines an Onboarding Inference Engine, OWL-ViT 2 zero-shot detection, GPT-4o spatial topology inference, and Gemini Flash-driven command synthesis, all connected to Tuya's API through modular adapters. In a 15-participant user study, INOT significantly reduced cognitive workload (NASA-TLX) and was preferred over Google Home for spatially grounded interactions, indicating improved ease of use and accessibility across languages. INOT enables natural, context-aware commands such as "turn on the light near the window", and the work outlines extensions to smart glasses, static cameras, and elder-care environments, with privacy addressed through potential on-device processing and encrypted pipelines.
Abstract
This paper introduces Intelligence of Things (INOT), a novel spatial context-aware control system that enhances smart home automation through intuitive spatial reasoning. Current smart home systems largely rely on device-specific identifiers, which restricts users to explicit naming conventions rather than natural spatial references. INOT addresses this limitation with a modular architecture that integrates Vision Language Models with IoT control systems to enable natural language commands with spatial context (e.g., "turn on the light near the window"). Its key components include an Onboarding Inference Engine, Zero-Shot Device Detection, Spatial Topology Inference, and Intent-Based Command Synthesis. A user study with 15 participants showed significant advantages for INOT over conventional systems like Google Home Assistant: users reported reduced cognitive workload (NASA-TLX scores decreased by an average of 13.17 points), higher ease-of-use ratings, and a stronger preference for INOT (14 out of 15 participants). By eliminating the need to memorize device identifiers and enabling context-aware spatial commands, INOT makes smart home control more intuitive and accessible.
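To make the idea of a spatially grounded command concrete, the following is a minimal sketch of how "the light near the window" could be resolved to a single device once objects have been detected in an image. It is not the authors' method: INOT infers spatial topology with a vision-language model, whereas this sketch substitutes a plain nearest-centroid heuristic over bounding boxes. The `Detection` class, the `resolve_near` function, the device identifiers, and the coordinates are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple


@dataclass
class Detection:
    """One detected object: a label plus a bounding box in pixel coordinates."""
    label: str
    device_id: Optional[str]  # hypothetical IoT identifier; None for landmarks
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

    @property
    def center(self) -> Tuple[float, float]:
        x0, y0, x1, y1 = self.box
        return ((x0 + x1) / 2, (y0 + y1) / 2)


def resolve_near(
    detections: List[Detection], device_label: str, landmark_label: str
) -> Optional[Detection]:
    """Return the controllable device of type `device_label` whose center is
    closest to any object labeled `landmark_label`, or None if either is missing.

    Assumption: 2D image-plane distance stands in for real spatial proximity.
    """
    landmarks = [d for d in detections if d.label == landmark_label]
    candidates = [
        d for d in detections if d.label == device_label and d.device_id is not None
    ]
    if not landmarks or not candidates:
        return None

    def distance_to_nearest_landmark(device: Detection) -> float:
        dx, dy = device.center
        return min(hypot(dx - lm.center[0], dy - lm.center[1]) for lm in landmarks)

    return min(candidates, key=distance_to_nearest_landmark)


# Hypothetical scene: one window and two lights with made-up coordinates.
scene = [
    Detection("window", None, (500, 50, 620, 300)),
    Detection("light", "lamp_a", (40, 60, 100, 160)),
    Detection("light", "lamp_b", (430, 80, 490, 180)),
]

target = resolve_near(scene, device_label="light", landmark_label="window")
print(target.device_id if target else "no match")  # prints "lamp_b"
```

In this sketch the user never names `lamp_b`; the spatial reference alone selects it, which is the interaction style the abstract describes. A real pipeline would also need to handle relations other than "near", depth ambiguity in a 2D image, and the mapping from a detected object to a device identifier, none of which this heuristic covers.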
