Explaining Away Results in Accurate and Tolerant Template Matching
M. W. Spratling
TL;DR
This paper introduces a template matching approach in which templates compete to explain the image evidence. The competition, a form of explaining away implemented with the Divisive Input Modulation (DIM) algorithm, produces sparse, robust matches. By pre-processing images to emphasize relative intensity and performing inference competitively and iteratively, the method tolerates appearance changes better than traditional template matching and several recent alternatives. On multiple benchmarks, including the Best Buddies and Oxford VGG datasets, DIM with additional non-target templates is consistently more accurate than the baselines, and it can be tuned through its parameter settings. The work demonstrates practical gains for patch localization and correspondence, and identifies two directions for future work: extending tolerance to viewpoint changes and applying the method in CNN-based feature spaces.
Abstract
Recognising and locating image patches or sets of image features is an important task underlying much work in computer vision. Traditionally this has been accomplished using template matching. However, template matching is notoriously brittle in the face of changes in appearance caused by, for example, variations in viewpoint, partial occlusion, and non-rigid deformations. This article tests a method of template matching that is more tolerant to such changes in appearance and that can, therefore, more accurately identify image patches. In traditional template matching, the comparison between a template and the image is independent of the other templates. In contrast, the method advocated here takes into account the evidence provided by the image for the template at each location and the full range of alternative explanations represented by the same template at other locations and by other templates. Specifically, the proposed method of template matching is performed using a form of probabilistic inference known as "explaining away". The algorithm used to implement explaining away has previously been used to simulate several neurobiological mechanisms, and has been applied to image contour detection and pattern recognition tasks. Here it is applied for the first time to image patch matching, and is shown to produce superior results in comparison to the current state-of-the-art methods.
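The contrast between independent and competitive matching can be illustrated with a toy example. The sketch below is not the paper's implementation. It assumes a divisive update of the form commonly given for DIM (reconstruct the input from the current responses, divide the input by that reconstruction, then scale each response by its weighted share of the result), and it works on flat vectors rather than images. The function names, the cosine-similarity baseline, the normalisation of the two weight matrices, and the values of `eps1`, `eps2`, and `iterations` are all illustrative assumptions.

```python
import numpy as np


def cosine_scores(templates, x):
    """Independent matching: each template is scored against x on its own."""
    t = templates / np.linalg.norm(templates, axis=1, keepdims=True)
    return t @ (x / np.linalg.norm(x))


def dim_explaining_away(templates, x, iterations=200, eps1=1e-6, eps2=1e-2):
    """Competitive matching with divisive updates (assumed DIM-style form)."""
    # Assumed normalisation: feedforward rows sum to 1, feedback rows peak at 1.
    W = templates / templates.sum(axis=1, keepdims=True)
    V = templates / templates.max(axis=1, keepdims=True)
    y = np.zeros(templates.shape[0])
    for _ in range(iterations):
        r = V.T @ y                        # input reconstructed from current responses
        e = x / np.maximum(eps2, r)        # divisive residual: > 1 where under-explained
        y = np.maximum(eps1, y) * (W @ e)  # responses grow or shrink with their evidence
    return y


# Hypothetical data: the first template is a sub-pattern of the second,
# and the input is exactly the second template.
templates = np.array([[1, 1, 0, 0],
                      [1, 1, 1, 1]], dtype=float)
x = np.ones(4)

print("independent scores:", cosine_scores(templates, x))
print("competitive responses:", dim_explaining_away(templates, x))
```

With independent scoring, the partial template still receives a substantial score (about 0.71 against 1.0 for the full template), because nothing tells it that the evidence it relies on is already accounted for. Under the competitive update, the full template explains the whole input, so the response of the partial template decays towards zero over the iterations, which is the sparse outcome that explaining away is meant to produce.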
