ClauseLens: Clause-Grounded, CVaR-Constrained Reinforcement Learning for Trustworthy Reinsurance Pricing

Stella C. Dong, James R. Finlay

TL;DR

ClauseLens tackles the challenge of opacity in reinsurance pricing by embedding retrieved regulatory clauses into a clause-grounded, risk-aware reinforcement learning framework. By formulating quoting as a clause-enhanced, risk-constrained MDP and employing a dual-projected PPO with CVaR tail-risk control and real-time clause-based action masking, the approach delivers regulation-compliant decisions with interpretable clause-grounded explanations. Empirical results in a calibrated treaty simulator show substantial improvements in tail-risk performance (around 27.9% for CVaR at 0.10) and solvency feasibility (approximately a 51% reduction in violations), while achieving high explainability and retrieval fidelity (88.2% entailment, 87.4% precision, 91.1% recall). The work demonstrates that integrating legal context into both decision-making and justification pathways can produce auditable, governance-aligned AI for high-stakes financial pricing, with broad implications for domains requiring regulatory compliance and transparency.

Abstract

Reinsurance treaty pricing must satisfy stringent regulatory standards, yet current quoting practices remain opaque and difficult to audit. We introduce ClauseLens, a clause-grounded reinforcement learning framework that produces transparent, regulation-compliant, and risk-aware treaty quotes. ClauseLens models the quoting task as a Risk-Aware Constrained Markov Decision Process (RA-CMDP). Statutory and policy clauses are retrieved from legal and underwriting corpora, embedded into the agent's observations, and used both to constrain feasible actions and to generate clause-grounded natural language justifications. Evaluated in a multi-agent treaty simulator calibrated to industry data, ClauseLens reduces solvency violations by 51%, improves tail-risk performance by 27.9% (CVaR_0.10), and achieves 88.2% accuracy in clause-grounded explanations with retrieval precision of 87.4% and recall of 91.1%. These findings demonstrate that embedding legal context into both decision and explanation pathways yields interpretable, auditable, and regulation-aligned quoting behavior consistent with Solvency II, NAIC RBC, and the EU AI Act.

Paper Structure

This paper contains 48 sections, 4 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: ClauseLens architecture. Structured cedent features and retrieved legal clauses are embedded into an augmented state. A quoting policy $\pi(a|s)$ is trained using dual-projected PPO with CVaR-based advantage weighting and Lagrangian penalties. Clause-derived masks enforce feasibility, while justifications are generated from the same retrieved context. Dashed arrows denote semantic grounding links between retrieved clauses, filtered actions, and generated explanations.
  • Figure 2: ClauseLens training loop. The quoting agent observes a clause-augmented state $s$, selects an action filtered by clause-derived feasibility masks, and receives reward and violation signals from the environment. The policy is updated using CVaR-weighted PPO and Lagrangian-based constraint projection. The lower feedback loop encodes dual supervision (hard constraint masking and soft Lagrangian adjustment), ensuring continuous regulatory alignment.
  • Figure 3: ClauseLens evaluation pipeline. Legal clauses guide quoting actions and post-hoc explanations, with multi-axis evaluation across return, risk, and legal fidelity.
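
The three mechanisms named in the captions above (clause-derived action masking, a CVaR tail-risk measure, and a Lagrangian penalty kept non-negative by projection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the discrete action set, the step size, and the violation budget are all hypothetical, and the PPO update itself is omitted. The CVaR here follows one common convention, the mean of the worst alpha-fraction of sampled returns, which may differ from the paper's exact definition.

```python
import math
from typing import List, Sequence


def masked_softmax(logits: Sequence[float], feasible: Sequence[bool]) -> List[float]:
    """Turn policy logits into probabilities over feasible actions only.

    `feasible` stands in for a clause-derived mask (hypothetical): actions
    flagged False receive probability exactly 0.
    """
    if not any(feasible):
        raise ValueError("clause mask excludes every action")
    # Subtract the largest feasible logit for numerical stability.
    peak = max(l for l, ok in zip(logits, feasible) if ok)
    weights = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, feasible)]
    total = sum(weights)
    return [w / total for w in weights]


def lower_tail_cvar(returns: Sequence[float], alpha: float = 0.10) -> float:
    """Empirical lower-tail CVaR: mean of the worst ceil(alpha * n) returns."""
    if not returns:
        raise ValueError("need at least one return sample")
    k = max(1, math.ceil(alpha * len(returns)))
    worst = sorted(returns)[:k]
    return sum(worst) / k


def dual_ascent_step(lam: float, violation_rate: float, budget: float,
                     step_size: float = 0.05) -> float:
    """One projected gradient-ascent step on a Lagrange multiplier.

    The multiplier grows when observed violations exceed the budget and
    shrinks otherwise; max(0, ...) projects it back onto lam >= 0.
    """
    return max(0.0, lam + step_size * (violation_rate - budget))


if __name__ == "__main__":
    probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
    print("masked policy:", probs)
    print("CVaR_0.2:", lower_tail_cvar([5, -3, 2, -1, 4, 0, 1, 3, 6, 7], alpha=0.2))
    print("next multiplier:", dual_ascent_step(1.0, 0.15, 0.05, step_size=0.5))
```

In this sketch the mask acts as a hard constraint (an infeasible quote can never be sampled), while the multiplier acts as a soft one: it would scale a violation penalty in the policy objective and only tightens when the measured violation rate exceeds the budget.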