A Progressive Visual-Logic-Aligned Framework for Ride-Hailing Adjudication

Weiming Wu, Zi-Jian Cheng, Jie Meng, Peng Zhen, Shan Huang, Qun Li, Guobin Wu, Lan-Zhe Guo

Abstract

The efficient adjudication of responsibility disputes is pivotal for maintaining marketplace fairness. However, the exponential surge in ride-hailing volume renders manual review intractable, while conventional automated methods lack the reasoning transparency required for quasi-judicial decisions. Although Multimodal LLMs offer a promising paradigm, they fundamentally struggle to bridge the gap between general visual semantics and rigorous evidentiary protocols, often leading to perceptual hallucinations and logical looseness. To address these systemic misalignments, we introduce RideJudge, a Progressive Visual-Logic-Aligned Framework. Instead of relying on generic pre-training, we bridge the semantic gap via SynTraj, a synthesis engine that grounds abstract liability concepts into concrete trajectory patterns. To resolve the conflict between massive regulation volume and limited context windows, we propose an Adaptive Context Optimization strategy that distills expert knowledge, coupled with a Chain-of-Adjudication mechanism to enforce active evidentiary inquiry. Furthermore, addressing the inadequacy of sparse binary feedback for complex liability assessment, we implement a novel Ordinal-Sensitive Reinforcement Learning mechanism that calibrates decision boundaries against hierarchical severity. Extensive experiments show that our RideJudge-8B achieves 88.41% accuracy, surpassing 32B-scale baselines and establishing a new standard for interpretable adjudication.

Paper Structure (38 sections, 13 equations, 3 figures, 5 tables)

Figures (3)

  • Figure 1: The pipeline consists of three pivotal phases: (1) Automated Data Synthesis, which bridges domain gaps via two specialized modules: SynTraj Construction for visual-linguistic alignment and Chain-of-Adjudication Synthesis for logical reasoning reconstruction; (2) Knowledge-Aware Context Refinement, which performs dynamic rule pruning and expert precedent extraction; and (3) Progressive Juridical Alignment, a multi-stage training paradigm culminating in OS-rewarded reinforcement learning for precise decision boundary alignment.
  • Figure 2: Qualitative case studies on the Appeal benchmark. To preserve privacy, sensitive textual regions in the images, as well as specific numerical values and location names within the reasoning chains, have been masked. We highlight the Information Analysis process in green, the Visual Evidence Integration process in red, and the Rule Grounding process in blue.
  • Figure 3: Left: Performance scaling with our CoA synthetic data used in Stage 2. Right: Stability analysis of RideJudge-8B across ten major cities.
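
The abstract contrasts sparse binary feedback with an ordinal-sensitive reward that accounts for hierarchical severity. The exact reward is not given in this excerpt, so the following is only a minimal sketch of the general idea, assuming liability outcomes are encoded as integer severity levels from 0 to num_levels - 1 and that the reward decays linearly with ordinal distance. The function names, the integer encoding, and the linear decay are all illustrative assumptions, not the authors' formulation.

```python
def binary_reward(predicted: int, target: int) -> float:
    """Sparse baseline: full credit only for an exact match."""
    return 1.0 if predicted == target else 0.0


def ordinal_reward(predicted: int, target: int, num_levels: int) -> float:
    """Hypothetical ordinal-sensitive reward in [0, 1].

    Assumes severity levels are integers in [0, num_levels - 1].
    The reward is 1.0 for an exact match and decreases linearly with
    the ordinal distance between prediction and target, reaching 0.0
    at the maximum possible distance.
    """
    if num_levels < 2:
        raise ValueError("num_levels must be at least 2")
    for name, level in (("predicted", predicted), ("target", target)):
        if not 0 <= level < num_levels:
            raise ValueError(f"{name} must be in [0, {num_levels - 1}]")
    return 1.0 - abs(predicted - target) / (num_levels - 1)


if __name__ == "__main__":
    # With four assumed severity levels and a true level of 3, a near
    # miss (2) earns more than a far miss (0), while the binary reward
    # treats both errors identically.
    for guess in (3, 2, 0):
        print(guess, binary_reward(guess, 3), ordinal_reward(guess, 3, 4))
```

Under these assumptions, a policy that misjudges severity by one level still receives a graded signal, whereas the binary reward gives no gradient of preference between near and far errors.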