Optimizing Decomposition for Optimal Claim Verification
Yining Lu, Noah Ziems, Hy Dang, Meng Jiang
TL;DR
This work addresses misalignment between decomposition and verification in long-form factuality evaluation by introducing a verifier-aware, dynamic decomposition framework. By formulating the problem as a bilevel optimization and solving it with an on-policy RL approach (DyDecomp), the method learns a decomposition policy that tunes subclaim atomicity to each verifier’s preferred information density. Empirical results show DyDecomp improves verification confidence by about 0.07 and accuracy by about 0.12 (on a 0-1 scale) across multiple verifiers and datasets, while requiring only 4.73M parameters. The study also demonstrates that verification confidence correlates strongly with accuracy and that optimal atomicity varies across verifiers, underscoring the value of adapting decomposition to the downstream verifier for robust long-form factuality evaluation.
Abstract
Current research on the \textit{Decompose-Then-Verify} paradigm for evaluating the factuality of long-form text typically treats decomposition and verification in isolation, overlooking their interactions and potential misalignment. We find that existing decomposition policies, typically hand-crafted demonstrations, do not align well with downstream verifiers in terms of atomicity -- a novel metric quantifying information density -- leading to suboptimal verification results. We formulate finding the optimal decomposition policy for optimal verification as a bilevel optimization problem. To approximate a solution for this strongly NP-hard problem, we propose dynamic decomposition, a reinforcement learning framework that leverages verifier feedback to learn a policy for dynamically decomposing claims to verifier-preferred atomicity. Experimental results show that dynamic decomposition outperforms existing decomposition policies, improving verification confidence by 0.07 and accuracy by 0.12 (on a 0-1 scale) on average across varying verifiers, datasets, and atomicities of input claims.
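
As a rough illustration of the bilevel structure mentioned in the abstract, a generic bilevel program nests one optimization inside another. The symbols below are assumed here for illustration only and are not taken from the paper's own notation: $\pi$ stands for the decomposition policy, $v$ for the verifier's decision on the resulting subclaims, and $F$ and $f$ for the upper- and lower-level objectives.

\[
\max_{\pi} \; F\bigl(\pi,\, v^{*}(\pi)\bigr)
\quad \text{s.t.} \quad
v^{*}(\pi) \in \arg\max_{v} \; f(\pi, v)
\]

Under this reading, the outer problem selects the decomposition policy, and the inner problem is the verifier's best response to the subclaims that policy produces. The exact objectives and constraints used in the paper may differ from this sketch.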
