Enhancing Complex Formula Recognition with Hierarchical Detail-Focused Network
Jiale Wang, Junhui Yu, Huanyong Liu, Chenanran Kong
TL;DR
MER suffers from multiple valid interpretations of complex formulas, causing parsing ambiguity. This work presents HDR, a large-scale MER dataset with HDR-100M training data and HDR-Test, and introduces HDNet, a Transformer-based encoder-decoder with a hierarchical sub-formula module that crops high-resolution sub-formulas and fuses their features via $Z = alpha Z_{main} + (1 - alpha) (1/n) sum_i Z_i$. The training objective combines main and sub-formula losses, $L_{total} = alpha L_{main} + (1 - alpha) (1/n) sum_i L_i$, including $L_{main} = - sum_{t=1}^{T} log p(y_t | y_{<t}, Z)$. A fair evaluation protocol maps predictions to functionally equivalent expressions and uses metrics such as $CR = 1 - EditDistance / NumberOfCharacters$, $AED$, and BLEU, with HDNet achieving state-of-the-art results on HDR-Test and public MER datasets. Overall, the work provides large-scale data, a detail-focused architecture, and fair evaluation practices that advance reliable MER for complex hierarchical formulas.
Abstract
Hierarchical and complex Mathematical Expression Recognition (MER) is challenging due to multiple possible interpretations of a formula, complicating both parsing and evaluation. In this paper, we introduce the Hierarchical Detail-Focused Recognition dataset (HDR), the first dataset specifically designed to address these issues. It consists of a large-scale training set, HDR-100M, offering an unprecedented scale and diversity with one hundred million training instances. And the test set, HDR-Test, includes multiple interpretations of complex hierarchical formulas for comprehensive model performance evaluation. Additionally, the parsing of complex formulas often suffers from errors in fine-grained details. To address this, we propose the Hierarchical Detail-Focused Recognition Network (HDNet), an innovative framework that incorporates a hierarchical sub-formula module, focusing on the precise handling of formula details, thereby significantly enhancing MER performance. Experimental results demonstrate that HDNet outperforms existing MER models across various datasets.
