AnyThermal: Towards Learning Universal Representations for Thermal Perception
Parv Maheshwari, Jay Karhade, Yogesh Chawla, Isaiah Adu, Florian Heisen, Andrew Porco, Andrew Jong, Yifei Liu, Santosh Pitla, Sebastian Scherer, Wenshan Wang
TL;DR
This work tackles the scarcity and limited diversity of thermal data by introducing AnyThermal, a task-agnostic thermal encoder distilled from RGB foundation models. By distilling RGB features into a thermal encoder using data from multiple environments, AnyThermal achieves state-of-the-art results on thermal segmentation, cross-modal place recognition, and monocular thermal depth estimation, without task-specific finetuning. The authors also present the TartanRGBT platform and dataset to systematically expand multi-domain RGB-T data, enabling scalable improvement and open community contribution. Results across diverse environments and tasks indicate that diverse training data is key to robust generalization, with improvements of up to 36% over baselines, and highlight the platform’s potential for broad adoption in thermal perception pipelines.
Abstract
We present AnyThermal, a thermal backbone that captures robust task-agnostic thermal features suitable for a variety of tasks such as cross-modal place recognition, thermal segmentation, and monocular depth estimation using thermal images. Existing thermal backbones are trained for a specific task on small-scale data, which limits their utility to a specific environment and task. Unlike prior methods, AnyThermal can be used for a wide range of environments (indoor, aerial, off-road, urban) and tasks, all without task-specific training. Our key insight is to distill the feature representations from visual foundation models such as DINOv2 into a thermal encoder using thermal data from these multiple environments. To bridge the diversity gap of the existing RGB-Thermal datasets, we introduce the TartanRGBT platform, the first open-source data collection platform with synced RGB-Thermal image acquisition. We use this platform to collect the TartanRGBT dataset, a diverse and balanced dataset collected in 4 environments. We demonstrate the efficacy of AnyThermal and TartanRGBT, achieving state-of-the-art results with improvements of up to 36% across diverse environments and downstream tasks on existing datasets.
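To make the distillation idea concrete, here is a minimal sketch of a feature-matching objective between a frozen RGB teacher and a thermal student on a paired RGB-thermal image. The abstract does not state which loss AnyThermal uses, so the cosine-distance form below is an assumption, and the names `cosine_similarity` and `distillation_loss` are hypothetical. Features are plain lists of floats so the example runs without any deep learning library.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector, eps: float = 1e-8) -> float:
    """Cosine similarity between two feature vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_a * norm_b, eps)


def distillation_loss(
    teacher_feats: Sequence[Vector],
    student_feats: Sequence[Vector],
) -> float:
    """Mean cosine distance between teacher and student patch features.

    teacher_feats: features from the frozen RGB teacher (e.g. DINOv2) on
        the RGB image of a pair. Assumed precomputed here.
    student_feats: features from the thermal encoder on the aligned
        thermal image, one vector per matching patch.
    The cosine-distance form is an assumption, not the paper's stated loss.
    """
    if not teacher_feats or len(teacher_feats) != len(student_feats):
        raise ValueError("need the same non-zero number of patches")
    distances = [
        1.0 - cosine_similarity(t, s)
        for t, s in zip(teacher_feats, student_feats)
    ]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    teacher = [[1.0, 0.0], [0.0, 1.0]]
    student = [[0.9, 0.1], [0.2, 0.8]]
    print(distillation_loss(teacher, student))
```

In a training loop, only the thermal student would receive gradients from this loss while the RGB teacher stays frozen, so the student learns to reproduce the teacher's feature space from thermal input alone.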
