Deep Learning in Concealed Dense Prediction
Pancheng Zhao, Deng-Ping Fan, Shupeng Cheng, Salman Khan, Fahad Shahbaz Khan, David Clifton, Peng Xu, Jufeng Yang
TL;DR
This paper defines Concealed Dense Prediction (CDP) as a class of dense vision tasks in which targets are concealed in their surroundings, demanding fine-grained representations and reasoning. It provides a taxonomy of concealment mechanisms (biological, optical, artificial) and surveys 25 state-of-the-art methods across 12 concealed datasets, evaluating them on segmentation, detection, and edge estimation tasks. The authors propose a unified, multimodal direction with CvpINST and CvpAgent to enable instruction-tuned, cross-task concealed perception, and outline six research directions to advance data, models, and evaluation. The work highlights practical applications in industry, agriculture, medicine, and safety, and argues for integrated large-model frameworks to drive progress toward general concealed perception. Overall, the paper offers a structured landscape and an actionable roadmap for advancing CDP in the era of large multimodal models.
Abstract
Deep learning is developing rapidly and now handles common computer vision tasks well. As model size, knowledge, and reasoning capabilities continue to improve, it is time to turn to more complex vision tasks. In this paper, we introduce and review one such family of tasks, termed Concealed Dense Prediction (CDP), which has considerable value in fields such as agriculture and industry. The intrinsic trait of CDP is that the targets are concealed in their surroundings, so perceiving them fully requires fine-grained representations, prior knowledge, and auxiliary reasoning, among other capabilities. The contributions of this review are three-fold: (i) We introduce the scope, characteristics, and challenges specific to CDP tasks and emphasize their essential differences from generic vision tasks. (ii) We develop a taxonomy based on how methods counteract concealment to summarize deep learning efforts in CDP, and we compare 25 state-of-the-art methods across 12 widely used concealed datasets in experiments on three tasks. (iii) We discuss the potential applications of CDP in the large model era and summarize six potential research directions. Finally, we offer perspectives on the future development of CDP by constructing a large-scale multimodal instruction fine-tuning dataset, CvpINST, and a concealed visual perception agent, CvpAgent.
