Full System Architecture Modeling for Wearable Egocentric Contextual AI

Vincent T. Lee, Tanfer Alan, Sung Kim, Ecenur Ustun, Amr Suleiman, Ajit Krisshna, Tim Balbekov, Armin Alaghi, Richard Newcombe

TL;DR

Wearable egocentric contextual AI faces stringent power constraints while integrating rich personal context signals. The authors present the first end-to-end full-system power and performance model (PnPSim) for Aria2 and use it to study on-device vs off-device computation across egocentric primitives. Key contributions include defining egocentric primitives, detailing a full-system modeling approach, and showing that holistic, cross-stack optimizations are required due to distributed power bottlenecks (Amdahl-like effects). The findings guide design choices and reveal a roadmap for achieving all-day operation through coordinated hardware-software co-design and continuous full-system evaluation.

Abstract

The next generation of human-oriented computing will require always-on, spatially-aware wearable devices to capture egocentric vision and functional primitives (e.g., Where am I? What am I looking at?). These devices will sense an egocentric view of the world around us to observe all human-relevant signals across space and time to construct and maintain a user's personal context. This personal context, combined with advanced generative AI, will unlock a powerful new generation of contextual AI personal assistants and applications. However, designing a wearable system to support contextual AI is a daunting task because of the system's complexity and the stringent power constraints imposed by weight and battery restrictions. To understand how to guide design for such systems, this work provides the first complete system architecture view of one such wearable contextual AI system (Aria2), along with the lessons we have learned through the system modeling and design space exploration process. We show that an end-to-end full system model view of such systems is vitally important, as no single component or category overwhelmingly dominates system power. This means long-range design decisions and power optimizations need to be made in the full system context to avoid running into limits caused by other system bottlenecks (i.e., Amdahl's law as applied to power) or by bottlenecks that shift over time. Finally, we reflect on lessons and insights for the road ahead, which will be important toward eventually enabling all-day, wearable, contextual AI systems.
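
The "Amdahl's law as applied to power" argument can be made concrete with a short sketch. This is not the authors' PnPSim model; it is a minimal illustration, assuming total system power is a simple sum of per-category shares. The function names and the four-way power breakdown are hypothetical and chosen only to show the shape of the effect.

```python
def system_power_saving(fraction: float, component_reduction: float) -> float:
    """Share of total system power saved when a component drawing `fraction`
    of total power has its own power cut by `component_reduction` (0..1)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    if not 0.0 <= component_reduction <= 1.0:
        raise ValueError("component_reduction must be in [0, 1]")
    return fraction * component_reduction


def battery_life_gain(fraction: float, component_reduction: float) -> float:
    """Battery-life multiplier at fixed battery energy: 1 / (1 - f * r)."""
    remaining = 1.0 - system_power_saving(fraction, component_reduction)
    return float("inf") if remaining == 0.0 else 1.0 / remaining


# Hypothetical breakdown (NOT taken from the paper): four categories,
# none of which dominates system power.
breakdown = {"compute": 0.25, "sensors": 0.25, "wireless": 0.25, "other": 0.25}

for name, share in breakdown.items():
    # Even removing a category entirely (r = 1.0) caps the gain at 1 / (1 - share).
    print(f"{name}: halve -> {battery_life_gain(share, 0.5):.3f}x, "
          f"eliminate -> {battery_life_gain(share, 1.0):.3f}x")
```

Under this assumed breakdown, eliminating any single category entirely extends battery life by only about a third, which is why optimizations confined to one component run into limits set by the rest of the system.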

Paper Structure

This paper contains 26 sections, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Personal context is built by first sensing the world around us, then computing over sensor data to extract egocentric signals. Personal context is combined with an AI agent to drive personalized contextual AI services.
  • Figure 2: Aria2 high-level architecture diagram [aria2-technical-ref]. Compared to existing wearable systems and mobile phones, contextual AI devices like Aria2 have significantly more sensors, which better capture an egocentric view of the world. These sensor signals need to be processed efficiently on device prior to offload, under a battery constraint much smaller than that of existing mobile phones.
  • Figure 3: Power composition by device category (normalized out of 100%). On-device computation is 16% lower power relative to the full offload configuration.
  • Figure 4: Power composition for different subsets of egocentric signals computed on device. Egocentric signals not computed on device upload sensor data and compute signals off device on a backend server. On-device compute trades off compute power for reduced communication power. Actual values will vary with algorithm variant, system architecture, and maturity.
  • Figure 5: Impact of technology scaling for on-device compute case. Digital devices scale better than analog components, which makes analog device bottlenecks more acute over time.
  • ...and 1 more figures