L-VITeX: Light-weight Visual Intuition for Terrain Exploration

Antar Mazumder, Zarin Anjum Madhiha

TL;DR

The paper evaluates L-VITeX's performance across various terrains and demonstrates its application to 3D mapping with Gaussian Splats on a small mobile robot driven by an ESP32-Cam, showcasing its potential to enhance exploration efficiency and decision-making.

Abstract

This paper presents L-VITeX, a lightweight visual intuition system for terrain exploration designed for resource-constrained robots and swarms. L-VITeX aims to provide hints of Regions of Interest (RoIs) without computationally expensive processing. Using the Faster Objects, More Objects (FOMO) tinyML architecture, the system achieves high accuracy (>99%) in RoI detection while operating on minimal hardware resources (peak RAM usage < 50 KB) with near real-time inference (<200 ms). The paper evaluates L-VITeX's performance across various terrains, including mountainous areas, underwater shipwreck debris regions, and Martian rocky surfaces. It also demonstrates the system's application to 3D mapping with Gaussian Splats (GS) on a small mobile robot driven by an ESP32-Cam, showcasing its potential to enhance exploration efficiency and decision-making.

Paper Structure

This paper contains 16 sections, 4 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: Overview of the visual heuristic process building on anthropomorphic focus.
  • Figure 2: Proof of concept system design and Emphasis Function action.
  • Figure 3: FOMO detection performance across various datasets with three input sizes.
  • Figure 4: Greater emphasis yields more points for the target object, and RoI selection with L-VITeX yields a more useful 3D reconstruction through Gaussian Splatting.