Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection
Taeheon Kim, Sebin Shin, Youngjoon Yu, Hak Gu Kim, Yong Man Ro
TL;DR
This work tackles modality bias in multispectral pedestrian detection by explicitly modeling the causal relations between the RGB and thermal inputs and the prediction. The proposed Causal Mode Multiplexer (CMM) learns data-type-specific causal graphs: a common mode captures the total effect on day data, while a differential mode extracts the total indirect effect on ROTX data (pedestrians visible in RGB but not in thermal), thereby pruning the thermal direct effect that skews predictions. A new ROTX-MP dataset is introduced to stress-test ROTX scenarios, and extensive experiments show improved generalization across KAIST, CVC-14, FLIR, and ROTX-MP, with AP gains up to 70.4 on ROTX-MP and notable improvements on existing benchmarks. The approach offers a principled, counterfactual-based debiasing framework aimed at improving the reliability of multispectral detectors in safety-critical applications. The authors plan to release ROTX-MP publicly to support further research on modality bias and causal reasoning in multimodal perception.
Abstract
RGBT multispectral pedestrian detection has emerged as a promising solution for safety-critical applications that require day/night operation. However, the modality bias problem remains unsolved, as multispectral pedestrian detectors learn the statistical bias in datasets. Specifically, datasets in multispectral pedestrian detection are mainly distributed between ROTO (day) and RXTO (night) data, so the majority of the pedestrian labels statistically co-occur with their thermal features. As a result, multispectral pedestrian detectors show poor generalization ability on examples beyond this statistical correlation, such as ROTX data. To address this problem, we propose a novel Causal Mode Multiplexer (CMM) framework that effectively learns the causalities between multispectral inputs and predictions. Moreover, we construct a new dataset (ROTX-MP) to evaluate modality bias in multispectral pedestrian detection. ROTX-MP mainly includes ROTX examples not present in previous datasets. Extensive experiments demonstrate that our proposed CMM framework generalizes well to existing datasets (KAIST, CVC-14, FLIR) and the new ROTX-MP. We will release our new dataset to the public for future research.
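The total effect and total indirect effect named in the TL;DR can be written in standard counterfactual (mediation) notation. The block below is a minimal sketch under assumed notation, not the paper's own formulation: $t$ and $r$ are the observed thermal and RGB inputs, $t^{*}$ and $r^{*}$ are reference (no-treatment) inputs, $m = M_{r,t}$ is a hypothetical fused multimodal feature acting as mediator, and $Y_{t,m}$ is the detector's prediction score.

```latex
% Assumed notation (illustrative only, not taken from the paper):
%   m  = M_{r,t}        fused feature under the observed inputs
%   m* = M_{r*,t*}      fused feature under the reference inputs
\begin{align}
  \mathrm{TE}  &= Y_{t,m} - Y_{t^{*},m^{*}}
     && \text{total effect of the inputs on the prediction} \\
  \mathrm{NDE} &= Y_{t,m^{*}} - Y_{t^{*},m^{*}}
     && \text{direct effect of thermal alone (mediator held at reference)} \\
  \mathrm{TIE} &= \mathrm{TE} - \mathrm{NDE} = Y_{t,m} - Y_{t,m^{*}}
     && \text{total indirect effect (thermal direct effect removed)}
\end{align}
```

Under this reading, predicting from the total effect keeps every pathway from the inputs to the prediction, whereas predicting from the total indirect effect subtracts the thermal-only shortcut, which corresponds to the pruning of the thermal direct effect described in the TL;DR.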
