
AI-Generated Prior Authorization Letters: Strong Clinical Content, Weak Administrative Scaffolding

Moiz Sadiq Awan, Maryam Raza

Abstract

Prior authorization remains one of the most burdensome administrative processes in U.S. healthcare, consuming billions of dollars and thousands of physician hours each year. While large language models have shown promise across clinical text tasks, their ability to produce submission-ready prior authorization letters has received only limited attention, with existing work confined to single-case demonstrations rather than structured multi-scenario evaluation. We assessed three commercially available LLMs (GPT-4o, Claude Sonnet 4.5, and Gemini 2.5 Pro) across 45 physician-validated synthetic scenarios spanning rheumatology, psychiatry, oncology, cardiology, and orthopedics. All three models generated letters with strong clinical content: accurate diagnoses, well-structured medical necessity arguments, and thorough step therapy documentation. However, a secondary analysis of real-world administrative requirements revealed consistent gaps that clinical scoring alone did not capture, including absent billing codes, missing authorization duration requests, and inadequate follow-up plans. These findings reframe the question: the challenge for clinical deployment is not whether LLMs can write clinically adequate letters, but whether the systems built around them can supply the administrative precision that payer workflows require.
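
The administrative gaps named above lend themselves to automated screening. As a minimal sketch of how such a secondary check could be implemented, the following Python snippet flags whether a generated letter appears to contain a billing code, an authorization duration request, and a follow-up plan. The element names and patterns are illustrative assumptions, not the instrument used in the paper's analysis.

```python
# A minimal sketch, assuming a keyword/pattern heuristic; the paper does not
# publish its checker. Element names and regexes are illustrative, not the
# authors' instrument.
import re

ADMIN_PATTERNS = {
    # CPT codes are five digits; ICD-10-CM codes look like "M05.79".
    "billing_code": re.compile(r"\b(\d{5}|[A-Z]\d{2}\.\d{1,4})\b"),
    # e.g. "authorization for 6 months", "duration of 12 weeks"
    "authorization_duration": re.compile(
        r"\b(authoriz\w+|duration)\b[^.]*\b\d+\s*(day|week|month|year)s?\b",
        re.IGNORECASE,
    ),
    # e.g. "follow-up in 3 months", "monitoring plan", "reassess response"
    "follow_up_plan": re.compile(
        r"\b(follow[- ]up|monitor\w*|reassess\w*)\b", re.IGNORECASE
    ),
}

def audit_letter(letter_text: str) -> dict[str, bool]:
    """Return which administrative elements a letter appears to contain."""
    return {name: bool(pat.search(letter_text)) for name, pat in ADMIN_PATTERNS.items()}
```

A checker of this kind would complement, not replace, clinical rubric scoring: it targets exactly the elements that Figure 5 shows are most often missing.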



Figures (5)

  • Figure 1: Mean rubric scores by criterion and model (scale: 0 to 2). C4 (denial anticipation) shows the greatest cross-model divergence. C1, C3, and C6 are at or near ceiling for all models evaluated.
  • Figure 2: C4 (denial anticipation) full credit rates stratified by PA challenge type. GPT-4o shows a marked deficit on step therapy challenges relative to Claude Sonnet 4.5 and Gemini 2.5 Pro.
  • Figure 3: Mean total score by model and clinical specialty ($n = 9$ per cell). All cells exceed 11.5/12. Perfect scores (12.00) are bolded.
  • Figure 4: Word-count distributions across the 45 letters per model. All pairwise differences are significant ($p < .001$, Bonferroni-corrected; a sketch of this comparison follows the list). Group means are shown in parentheses.
  • Figure 5: Prevalence of eight secondary PA letter elements by model. Billing codes and authorization duration represent the largest universal gaps. The dotted line marks 50% prevalence.
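
As referenced in the Figure 4 caption, the sketch below reconstructs the pairwise word-count comparison. It is a minimal illustration under stated assumptions: the paper does not name the specific test, so Welch's $t$-test stands in, and the variable names are hypothetical.

```python
# A minimal sketch of three pairwise word-count comparisons with a
# Bonferroni correction. Welch's t-test is an assumption; the paper
# reports only that pairwise differences are Bonferroni-corrected.
from itertools import combinations
from scipy.stats import ttest_ind

def pairwise_word_count_tests(word_counts: dict[str, list[int]]):
    """word_counts maps model name -> word counts of its 45 letters."""
    pairs = list(combinations(word_counts, 2))
    m = len(pairs)  # 3 comparisons for 3 models
    results = {}
    for a, b in pairs:
        stat, p = ttest_ind(word_counts[a], word_counts[b], equal_var=False)
        # Bonferroni: inflate each raw p-value by the number of comparisons.
        results[(a, b)] = (stat, min(p * m, 1.0))
    return results
```

With three models there are three pairwise comparisons, so each raw $p$-value is multiplied by 3 (capped at 1.0) before being compared against $\alpha = .05$.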