Imaging for All-Day Wearable Smart Glasses

Michael Goesele, Daniel Andersen, Yujia Chen, Simon Green, Eddy Ilg, Chao Li, Johnson Liu, Grace Kuo, Logan Wan, Richard Newcombe

TL;DR

The paper tackles the challenge of imaging with all-day wearable smart glasses by proposing a distributed camera array that splits high-resolution content across multiple tiny sensors. It develops a complete end-to-end pipeline (preprocessing, OFW, RSR, fusion, depth, SLAM) to synthesize photorealistic output from guide and detail views, and demonstrates performance approaching smartphone quality in experiments with synthetic and real data. Key contributions include a physics-driven analysis of optical limits, a three-pronged distributed camera design, and a robust reconstruction framework validated against commercial devices and QR-reading tasks. The work highlights practical implications for form factor, power, and privacy, offering a viable path toward glasses that support photography and AI-driven perception with wearable comfort and social acceptability.

Abstract

In recent years, smart glasses technology has advanced rapidly, opening up entirely new areas for mobile computing. We expect future smart glasses will need to be all-day wearable, adopting a small form factor to meet the requirements of volume, weight, fashionability and social acceptability, which puts significant constraints on the space of possible solutions. Additional challenges arise because smart glasses are worn in arbitrary environments while their wearer moves and performs everyday activities. In this paper, we systematically analyze the space of imaging from smart glasses and derive several fundamental limits that govern this imaging domain. We discuss the impact of these limits on achievable image quality and camera module size, comparing in particular to related devices such as mobile phones. We then propose a novel distributed imaging approach that minimizes the size of the individual camera modules compared to a standard monolithic camera design. Finally, we demonstrate the properties of this novel approach in a series of experiments using synthetic data as well as images captured with two different prototype implementations.

Paper Structure

This paper contains 51 sections, 16 equations, 22 figures, 2 tables.

Figures (22)

  • Figure 1: Trade-off for a fixed-focus camera between angular resolution, entrance pupil diameter (which sets the minimum lens size), and depth of field (DOF), represented by the hyperfocal distance. Lenses within the gray region on the lower left are not physically possible due to diffraction, so achieving 1 arcmin resolution (red diamond) requires an entrance pupil of at least 2 mm. In addition, the long hyperfocal distance of such a lens means that only distant content, beyond about 2 m, is in focus. Auto-focus is therefore necessary, which adds size, weight and power consumption. Scaling back the resolution to 2 arcmin (red star) gives a long enough DOF that auto-focus is not needed. This design can also be achieved with a smaller lens diameter, making the system more compact. Given current technology, this is our recommended trade-off for imaging on smart glasses.
  • Figure 2: Percentage of egocentric image data with minimal motion blur (movement < 1 px) as a function of exposure time, for a camera with $\delta\theta$ = 1 arcmin (dashed lines) and $\delta\theta$ = 2 arcmin (solid lines). Normal user movements put strict limitations on the achievable exposure time (blue). Users are, however, able to consciously hold their head still if desired, allowing for significantly longer exposure times (red).
  • Figure 3: Illuminance levels required to achieve SNR=10 for a hypothetical camera module with f-number 1.8 and a sensor pixel pitch of 1 $\mu$m. The black lines relate this to the exposure time required to avoid motion blur as shown in Fig. 2. E.g., reaching SNR=10 and 50% non-blurred frames under the Aria pilot dataset motion distribution using a camera with an IFOV of 2 arcmin requires an exposure time of 2.6 ms and an illuminance of approx. 800 lux.
  • Figure 4: Distributed camera layout design visualized in 2D. Left: three wide FOV cameras pointing at the same scene. Middle: three narrow FOV cameras tiled together to cover the full scene. Right: one wide FOV camera covering the full FOV and three narrow FOV cameras tiled together to capture details in the scene. Note that close to the detail cameras, the fields of view do not overlap and $d$ represents the closest distance at which objects are captured by all detail cameras.
  • Figure 5: Overview of the image processing pipeline. The input is either a single set of guide and detail images or respective bursts of images. After preprocessing, we present two orthogonal reconstruction approaches that achieve sharpness and robustness, respectively, and fuse the results of both together to obtain a high-quality photograph.
  • ...and 17 more figures
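
The captions for Figures 1 and 2 rest on two simple relations that can be sketched numerically. The following is a rough illustration only, not the paper's model. It assumes the Rayleigh criterion ($\theta \approx 1.22\,\lambda / D$) at a wavelength of 550 nm for the diffraction limit, and a constant angular head speed for motion blur. The function names, the wavelength, and the example speed of 10 deg/s are all hypothetical choices made for this sketch.

```python
import math

# One arcminute expressed in radians.
ARCMIN_RAD = math.pi / (180 * 60)


def min_pupil_diameter_mm(resolution_arcmin, wavelength_nm=550.0):
    """Smallest entrance pupil (mm) that resolves the given angle.

    Assumption: Rayleigh criterion, theta = 1.22 * lambda / D.
    The paper may use a different criterion or wavelength.
    """
    theta_rad = resolution_arcmin * ARCMIN_RAD
    diameter_m = 1.22 * wavelength_nm * 1e-9 / theta_rad
    return diameter_m * 1e3


def max_exposure_ms(ifov_arcmin, angular_speed_deg_per_s, max_blur_px=1.0):
    """Longest exposure (ms) keeping blur under max_blur_px.

    Assumption: the camera rotates at a constant angular speed,
    so blur in pixels = speed * exposure / IFOV.
    """
    ifov_deg = ifov_arcmin / 60.0
    return max_blur_px * ifov_deg / angular_speed_deg_per_s * 1e3


if __name__ == "__main__":
    for res in (1.0, 2.0):
        d = min_pupil_diameter_mm(res)
        t = max_exposure_ms(res, angular_speed_deg_per_s=10.0)
        print(f"{res:.0f} arcmin: pupil >= {d:.2f} mm, exposure <= {t:.2f} ms")
```

Under these assumptions, 1 arcmin needs a pupil of about 2.3 mm, the same order as the roughly 2 mm quoted in Figure 1. Relaxing the resolution to 2 arcmin halves the required pupil and doubles the exposure time available at a given head speed, which is the trade-off the captions describe.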