STRIDE: A Systematic Framework for Selecting AI Modalities -- Agentic AI, AI Assistants, or LLM Calls

Shubhi Asthana, Bing Zhang, Chad DeLuca, Ruchi Mahindru, Hima Patel

TL;DR

STRIDE tackles the problem of when autonomous, agentic AI is truly necessary in enterprise contexts. It introduces a five-stage design-time framework that decomposes tasks, scores dynamic reasoning and tool needs, attributes dynamism, assesses self-reflection needs, and yields an Agentic Suitability Score to select among LLM calls, AI assistants, or agentic AI. The approach is validated on 30 real-world tasks across SRE, compliance, and automation, achieving high accuracy, reducing unnecessary agent deployments, and cutting compute and API costs. The work provides a practical guardrail for responsible AI deployment, enabling cost-effective and governance-friendly decisions about autonomy in complex workflows.

Abstract

The rapid shift from stateless large language models (LLMs) to autonomous, goal-driven agents raises a central question: When is agentic AI truly necessary? While agents enable multi-step reasoning, persistent memory, and tool orchestration, deploying them indiscriminately leads to higher cost, complexity, and risk. We present STRIDE (Systematic Task Reasoning Intelligence Deployment Evaluator), a framework that provides principled recommendations for selecting between three modalities: (i) direct LLM calls, (ii) guided AI assistants, and (iii) fully autonomous agentic AI. STRIDE integrates structured task decomposition, dynamism attribution, and self-reflection requirement analysis to produce an Agentic Suitability Score, ensuring that full agentic autonomy is reserved for tasks with inherent dynamism or evolving context. Evaluated across 30 real-world tasks spanning SRE, compliance, and enterprise automation, STRIDE achieved 92% accuracy in modality selection, reduced unnecessary agent deployments by 45%, and cut resource costs by 37%. Expert validation over six months in SRE and compliance domains confirmed its practical utility, with domain specialists agreeing that STRIDE effectively distinguishes between tasks requiring simple LLM calls, guided assistants, or full agentic autonomy. This work reframes agent adoption as a necessity-driven design decision, ensuring autonomy is applied only when its benefits justify the costs.

Paper Structure

This paper contains 24 sections, 4 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: Overview of STRIDE, a five-stage framework for determining the necessity of Agentic AI, AI assistants, or LLM calls. Stage 1: Task decomposition into subtasks with dependency graph construction. Stage 2: Dynamic reasoning and tool-interaction scoring. Stage 3: Dynamism attribution (model/tool/workflow). Stage 4: Self-reflection requirement analysis. Stage 5: Aggregated suitability inference with persona-aware recommendations.
  • Figure 2: Toy decomposition DAG for "Plan 5-day travel itinerary." Each subtask is scored separately and orchestrated by STRIDE.
  • Figure 3: Domain-wise accuracy of STRIDE across 30 tasks.
  • Figure 4: Expert agreement with STRIDE recommendations.
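
The five stages named in the Figure 1 caption can be pictured with a small sketch. This is only an illustration under stated assumptions, not the paper's scoring method: the `Subtask` fields, the `WEIGHTS` values, the mean aggregation, and the `low`/`high` thresholds are all hypothetical, since this section does not give the actual formula for the Agentic Suitability Score.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # Python 3.9+
from statistics import mean


@dataclass
class Subtask:
    """One node of the Stage 1 decomposition DAG (all scores assumed to lie in [0, 1])."""
    name: str
    reasoning: float   # Stage 2: dynamic reasoning need
    tool_use: float    # Stage 2: tool-interaction need
    dynamism: float    # Stage 3: model/tool/workflow dynamism
    reflection: float  # Stage 4: self-reflection need
    depends_on: tuple = ()


# Hypothetical weights chosen for illustration only; they sum to 1.
WEIGHTS = {"reasoning": 0.3, "tool_use": 0.2, "dynamism": 0.3, "reflection": 0.2}


def subtask_score(task: Subtask) -> float:
    """Weighted sum of the per-subtask signals (an assumed scoring rule)."""
    return sum(weight * getattr(task, key) for key, weight in WEIGHTS.items())


def agentic_suitability(subtasks: list) -> float:
    """Stage 5 stand-in: average the subtask scores over the dependency graph.

    static_order() raises graphlib.CycleError if the dependencies are not a DAG.
    """
    by_name = {t.name: t for t in subtasks}
    graph = {t.name: set(t.depends_on) for t in subtasks}
    order = TopologicalSorter(graph).static_order()
    return mean(subtask_score(by_name[n]) for n in order if n in by_name)


def recommend(score: float, low: float = 0.35, high: float = 0.65) -> str:
    """Map a score to one of the three modalities using assumed thresholds."""
    if score < low:
        return "LLM call"
    if score < high:
        return "AI assistant"
    return "agentic AI"


# Hypothetical decomposition of the Figure 2 toy task; the subtasks and
# their scores are invented for illustration.
itinerary = [
    Subtask("collect preferences", 0.2, 0.1, 0.1, 0.1),
    Subtask("search flights", 0.5, 0.9, 0.7, 0.4, depends_on=("collect preferences",)),
    Subtask("draft day plan", 0.6, 0.4, 0.5, 0.6, depends_on=("search flights",)),
]
print(recommend(agentic_suitability(itinerary)))
```

Scoring each subtask separately, as the Figure 2 caption describes, keeps a single highly dynamic step from being hidden inside an otherwise static workflow; a different aggregator (for example the maximum rather than the mean) would make the recommendation more sensitive to such a step.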