Spatial-temporal Hierarchical Reinforcement Learning for Interpretable Pathology Image Super-Resolution
Wenting Chen, Jie Liu, Tommy W. S. Chow, Yixuan Yuan
TL;DR
STAR-RL reformulates pathology image super-resolution as a Markov decision process and solves it with a spatial-temporal hierarchical reinforcement learning framework. By introducing outer-loop patch selection via a spatial manager, inner-loop patch recovery via a patch worker, and a temporal manager to regulate termination, it achieves interpretable, resource-aware SR that preserves fine biological details. Quantitative and qualitative results on HistoSR demonstrate superior local-structure metrics (FSIM, GMSD) and competitive global metrics, with ablations validating each component. The method generalizes across degradations and image sizes and shows promise for improving diagnostic performance in tasks like tumor recognition, including on gigapixel whole-slide images.
Abstract
Pathology images are essential for accurately interpreting lesion cells in cytopathology screening, but acquiring high-resolution digital slides requires specialized equipment and long scanning times. Although super-resolution (SR) techniques can alleviate this problem, existing deep learning models recover pathology images in a black-box manner, which can produce untruthful biological details and lead to misdiagnosis. Additionally, current methods allocate the same computational resources to every pixel of a pathology image, which leads to sub-optimal recovery because of the large variation within pathology images. In this paper, we propose the first hierarchical reinforcement learning framework, named Spatial-Temporal hierARchical Reinforcement Learning (STAR-RL), mainly to address these issues in pathology image super-resolution. We reformulate the SR problem as a Markov decision process of interpretable operations and adopt a hierarchical recovery mechanism at the patch level to avoid sub-optimal recovery. Specifically, a higher-level spatial manager picks out the most corrupted patch for the lower-level patch worker. Moreover, a higher-level temporal manager evaluates the selected patch and determines whether the optimization should be stopped early, thereby avoiding over-processing. Under the guidance of the spatial and temporal managers, the lower-level patch worker processes the selected patch with pixel-wise interpretable actions at each time step. Experimental results on medical images degraded by different kernels show the effectiveness of STAR-RL. Furthermore, STAR-RL improves tumor diagnosis by a large margin and generalizes across various degradations. The source code is available at https://github.com/CUHK-AIM-Group/STAR-RL.
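The control flow described in the abstract (outer-loop patch selection, inner-loop pixel-wise recovery, early termination) can be illustrated with a toy sketch. This is not the authors' implementation: the three learned policies are replaced here by simple greedy heuristics that consult a ground-truth reference image, which would not be available at inference time. The action set, the function names, and all parameters (`size`, `outer_steps`, `inner_steps`, `tol`) are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical pixel-wise action set; the abstract does not list the real actions.
ACTIONS = (
    lambda p: p,                        # no-op
    lambda p: np.clip(p + 0.05, 0, 1),  # brighten
    lambda p: np.clip(p - 0.05, 0, 1),  # darken
)

def patch_errors(img, ref, size):
    """MSE of each non-overlapping size x size patch (height and width must be multiples of size)."""
    h, w = img.shape
    sq = (img - ref) ** 2
    return sq.reshape(h // size, size, w // size, size).mean(axis=(1, 3))

def spatial_manager(img, ref, size):
    """Stand-in for the learned spatial manager: pick the patch with the largest error."""
    err = patch_errors(img, ref, size)
    r, c = np.unravel_index(np.argmax(err), err.shape)
    return int(r), int(c)

def patch_worker(patch, ref_patch):
    """Stand-in for the learned patch worker: per pixel, apply the action that lowers the error most."""
    candidates = np.stack([a(patch) for a in ACTIONS])       # shape (A, s, s)
    best = np.argmin((candidates - ref_patch) ** 2, axis=0)  # shape (s, s)
    return np.take_along_axis(candidates, best[None], axis=0)[0]

def temporal_manager(prev_err, new_err, tol=1e-6):
    """Stand-in for the learned temporal manager: stop once the gain falls below tol."""
    return prev_err - new_err < tol

def star_rl_sketch(img, ref, size=8, outer_steps=10, inner_steps=5):
    img = img.copy()
    for _ in range(outer_steps):                  # outer loop: patch selection
        r, c = spatial_manager(img, ref, size)
        sl = (slice(r * size, (r + 1) * size), slice(c * size, (c + 1) * size))
        for _ in range(inner_steps):              # inner loop: patch recovery
            prev = np.mean((img[sl] - ref[sl]) ** 2)
            img[sl] = patch_worker(img[sl], ref[sl])
            new = np.mean((img[sl] - ref[sl]) ** 2)
            if temporal_manager(prev, new):       # early termination
                break
    return img
```

Because the no-op is always among the candidate actions, this greedy stand-in never increases the error of a patch. In the actual framework each of the three roles is a learned policy, so the sketch conveys only the loop structure, not the recovery quality.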
