Delegated Authorization for Agents Constrained to Semantic Task-to-Scope Matching
Majed El Helou, Chiara Troiani, Benjamin Ryder, Jean Diaconu, Hervé Muyal, Marcelo Yannuzzi
TL;DR
The paper tackles the risk of over-privileged access in LLM-driven agents by proposing a semantic task-to-scope matching framework, operating via a trusted proxy, that binds agent permissions tightly to task intent. It introduces ASTRA, a data-generation pipeline and dataset for benchmarking task-to-scope alignment. It also introduces two semantic matchers, a Semantic Similarity Matcher and an LLM Reasoning Matcher, implemented in an authorization server to issue just-in-time, minimal scopes under Task-Based Access Control (TBAC). Experiments on synthetic data and the Toucan dataset show that LLM-based reasoning improves precision and recall for single-tool tasks, and reveal scalability and recall challenges as multi-tool task complexity grows. The results underscore the potential of intent-aware authorization for agent-based tool use while signaling the need for further research into scalable semantic matching, multi-turn workflows, and integration with existing OAuth/OIDC-style protocols. Overall, the work offers a concrete path toward fine-grained, context-driven delegation that mitigates privilege escalation risks in multi-agent, tool-augmented systems.
Abstract
Authorizing Large Language Model (LLM)-driven agents to dynamically invoke tools and access protected resources introduces significant risks, since current methods for delegating authorization grant overly broad permissions and give access to tools, allowing agents to operate beyond the intended task scope. We introduce and assess a delegated authorization model that enables authorization servers to semantically inspect access requests to protected resources and to issue access tokens constrained to the minimal set of scopes necessary for the agents' assigned tasks. Given the unavailability of datasets centered on delegated authorization flows, particularly ones that include both semantically appropriate and inappropriate scope requests for a given task, we introduce ASTRA, a dataset and data generation pipeline for benchmarking semantic matching between tasks and scopes. Our experiments show both the potential and current limitations of model-based matching, particularly as the number of scopes needed for task completion increases. Our results highlight the need for further research into semantic matching techniques enabling intent-aware authorization for multi-agent and tool-augmented applications, including fine-grained control, such as Task-Based Access Control (TBAC).
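
To make the idea of issuing only the scopes that semantically match a task more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a bag-of-words cosine similarity as a stand-in for whatever similarity model the matcher would actually use. The scope names, scope descriptions, the `match_scopes` helper, and the 0.2 threshold are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_scopes(task, requested, scope_descriptions, threshold=0.2):
    """Return the requested scopes whose description is similar enough to the task.

    Hypothetical helper: unknown scopes are denied, and the threshold is an
    arbitrary illustrative value, not one taken from the paper.
    """
    task_vec = bag_of_words(task)
    granted = []
    for scope in requested:
        description = scope_descriptions.get(scope)
        if description is None:
            continue  # deny scopes the server has no description for
        if cosine(task_vec, bag_of_words(description)) >= threshold:
            granted.append(scope)
    return granted


# Hypothetical scope catalogue held by the authorization server.
SCOPES = {
    "calendar.read": "read calendar events and meetings",
    "mail.send": "send email messages on behalf of the user",
    "files.delete": "delete files from cloud storage",
}

task = "Find my calendar meetings for tomorrow"
requested = ["calendar.read", "mail.send", "files.delete"]
granted = match_scopes(task, requested, SCOPES)
print(granted)
```

In this toy run the agent requests three scopes, but only the calendar scope shares vocabulary with the task, so the other two are withheld. A lexical measure like this would miss paraphrases (for example, "appointments" versus "meetings"), which is one reason a learned similarity model or an LLM-based reasoner is the more natural fit for the matching problem the abstract describes.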
