Focus-LIME: Surgical Interpretation of Long-Context Large Language Models via Proxy-Based Neighborhood Selection

Junhao Liu, Haonan Yu, Zhenyu Yan, Xin Zhang

TL;DR

This work tackles the challenge of faithful, word-level explanations for long-context LLMs under realistic query budgets. It introduces Focus-LIME, a two-phase, proxy-guided framework that first scouts an active neighborhood with a cheaper proxy model and then performs fine-grained attribution on the target model within that subspace. The approach formalizes the cost-fidelity trade-off, defines an active neighborhood S, and uses a Phase I/Phase II pipeline to preserve fidelity while reducing the perturbation space. Empirical results on CUAD and Qasper show that Focus-LIME achieves higher faithfulness (AOPC) than standard LIME and proxy baselines, while remaining efficient and robust to proxy choice; alignment with human evidence is demonstrated via Recall@k on expert annotations. Overall, Focus-LIME enables practical, surgical explanations for long documents in domains like law and science, improving trust and verifiability in high-stakes applications.

Abstract

As Large Language Models (LLMs) scale to handle massive context windows, achieving surgical feature-level interpretation is essential for high-stakes tasks like legal auditing and code debugging. However, existing local model-agnostic explanation methods face a critical dilemma in these scenarios: feature-based methods suffer from attribution dilution due to high feature dimensionality, thus failing to provide faithful explanations. In this paper, we propose Focus-LIME, a coarse-to-fine framework designed to restore the tractability of surgical interpretation. Focus-LIME utilizes a proxy model to curate the perturbation neighborhood, allowing the target model to perform fine-grained attribution exclusively within the optimized context. Empirical evaluations on long-context benchmarks demonstrate that our method makes surgical explanations practicable and provides faithful explanations to users.
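The coarse-to-fine idea can be illustrated with a toy sketch. This is not the authors' implementation: the halving schedule, the ridge surrogate without a proximity kernel, the sample counts, and the names `lime_weights`, `focus_lime`, `proxy`, and `target` are all assumptions made for illustration. The two "models" are simple functions of a binary keep/drop mask rather than real LLMs.

```python
import numpy as np

def lime_weights(model, n_features, active, n_samples, rng):
    """Fit a ridge surrogate on random binary masks.

    Only the features in `active` are perturbed; all others stay present (1).
    Simplification: no proximity kernel is applied to the samples.
    """
    active = np.asarray(active)
    masks = np.ones((n_samples, n_features))
    masks[:, active] = rng.integers(0, 2, size=(n_samples, active.size))
    y = np.array([model(m) for m in masks])
    X = masks[:, active]
    Xc = X - X.mean(axis=0)          # center so no intercept term is needed
    yc = y - y.mean()
    gram = Xc.T @ Xc + 1e-3 * np.eye(active.size)
    return np.linalg.solve(gram, Xc.T @ yc)

def focus_lime(proxy, target, n_features, keep=8,
               n_proxy=400, n_target=200, seed=0):
    """Two-phase attribution: proxy narrows the neighborhood, target attributes."""
    rng = np.random.default_rng(seed)
    active = np.arange(n_features)
    # Phase I (assumed schedule): halve the active set using the cheap proxy.
    while active.size > keep:
        w = lime_weights(proxy, n_features, active, n_proxy, rng)
        k = max(keep, active.size // 2)
        active = active[np.argsort(-np.abs(w))[:k]]
    # Phase II: query the expensive target only inside the curated neighborhood.
    w = lime_weights(target, n_features, active, n_target, rng)
    scores = np.zeros(n_features)
    scores[active] = w
    return scores

# Hypothetical toy models: the target depends on features 3 and 17;
# the proxy roughly agrees but is slightly distracted by feature 5.
target = lambda m: 2.0 * m[3] + 1.0 * m[17]
proxy = lambda m: 1.5 * m[3] + 0.8 * m[17] + 0.05 * m[5]

scores = focus_lime(proxy, target, n_features=64)
top2 = sorted(int(i) for i in np.argsort(-np.abs(scores))[:2])
print(top2)
```

In this sketch the target model is queried only in Phase II and only over the `keep` surviving features, which is where the query savings would come from if the proxy ranks features similarly to the target.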

Paper Structure

This paper contains 29 sections, 6 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: Comparison of explanations generated by different methods when querying LLMs for a "Governing Law" clause in a lengthy contract. (A) Concept-based Explanation: Attributes importance to high-level concepts. (B) Original LIME: Suffers from attribution dilution, where significance scores are distributed noisily across all tokens. (C) Ours (Neighborhood Reduced): Generates a sparse and precise explanation.
  • Figure 2: The pipeline of Focus-LIME, which consists of two main steps: 1) Neighborhood Curation with Proxy Models: we first use a smaller proxy model to iteratively narrow the neighborhood by fixing unimportant features, and identify the most faithful neighborhood; 2) Fine-grained Attribution with Target Model: we then generate the final explanations by using the original LLM, but only in the curated neighborhood.
  • Figure 3: The deletion fidelity (AOPC$\uparrow$) of explanations on the original neighborhood and optimal neighborhood.
  • Figure 4: The fidelity of explanations on the original neighborhood and optimal neighborhood.
  • Figure 5: The important words identified by Focus-LIME and the ground-truth evidence highlighted by expert lawyers for the "Governing Law" clause checking task.
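
The deletion fidelity (AOPC) named in the Figure 3 caption can be sketched as follows. This uses one common formulation, the mean drop in model output as the top-ranked features are removed one at a time; the paper's exact definition may differ, and `deletion_aopc` and the toy `model` are hypothetical names introduced here.

```python
import numpy as np

def deletion_aopc(model, scores, max_k):
    """Mean output drop after deleting the top-1, top-2, ..., top-max_k features.

    `model` maps a binary keep/drop mask to a scalar output.
    Higher values mean the explanation ranked truly influential features first.
    """
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores)           # most important feature first
    mask = np.ones(scores.size)
    full = model(mask)                    # output on the unperturbed input
    drops = []
    for idx in order[:max_k]:
        mask[idx] = 0                     # cumulative deletion
        drops.append(full - model(mask))
    return float(np.mean(drops))

# Hypothetical toy model that depends only on features 0 and 1.
model = lambda m: 2.0 * m[0] + 1.0 * m[1]

good = deletion_aopc(model, [2.0, 1.0, 0.0, 0.0], max_k=2)  # ranks 0, 1 first
bad = deletion_aopc(model, [0.0, 0.0, 1.0, 2.0], max_k=2)   # ranks 3, 2 first
print(good, bad)  # 2.5 0.0
```

A faithful ranking removes the influential features early and so produces a larger average drop, which is why the caption marks AOPC as higher-is-better.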