Exploring How LLMs Capture and Represent Domain-Specific Knowledge

Mirian Hipolito Garcia, Camille Couturier, Daniel Madrigal Diaz, Ankur Mallick, Anastasios Kyrillidis, Robert Sim, Victor Ruhle, Saravan Rajmohan

TL;DR

The paper investigates whether hidden states in large language models inherently encode domain-specific knowledge that can be used for domain-aware routing and model selection. By analyzing hidden-state activity during the prefill phase across multiple autoregressive LLMs and a DeBERTa encoder, the authors identify latent domain-related trajectories that consistently separate queries from Maths, Biomedical, Law, and Humanities domains, even under prompt variations. They demonstrate that a Hidden States Classifier, trained on these activations, can outperform semantic routing and domain-finetuned baselines, with robust performance on open-ended tasks and cross-domain generalization. The work highlights deeper-layer representations as robust signals for domain context, offering a path toward unsupervised model selection and improved interpretability in cross-domain generation scenarios. Limitations include the focus on smaller models and potential domain-trace mixing, suggesting future work to extend to larger models and broader domains.

Abstract

We study whether Large Language Models (LLMs) inherently capture domain-specific nuances in natural language. Our experiments probe the domain sensitivity of LLMs by examining their ability to distinguish queries from different domains using hidden states generated during the prefill phase. We reveal latent domain-related trajectories that indicate the model's internal recognition of query domains. We also study the robustness of these domain representations to variations in prompt styles and sources. Our approach leverages these representations for model selection, mapping each input query to the LLM that best matches its domain trace (i.e., the model with the highest performance on similar traces). Our findings show that LLMs can differentiate queries for related domains, and that the fine-tuned model is not always the most accurate. Unlike previous work, our interpretations apply to both closed and open-ended generative tasks.
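The model-selection idea in the abstract can be illustrated with a minimal sketch. It assumes each query is already summarized as a per-layer trace (a list of floats) and swaps in a nearest-centroid match for simplicity; this is not the authors' classifier. The names `nearest_domain`, `select_model`, `centroids`, and `accuracy`, along with all numbers, are hypothetical.

```python
import math

def nearest_domain(trace, centroids):
    """Return the domain whose centroid trace is closest (Euclidean) to `trace`."""
    return min(centroids, key=lambda d: math.dist(trace, centroids[d]))

def select_model(trace, centroids, accuracy):
    """Match the query trace to a domain, then pick the model with the
    highest recorded accuracy on that domain."""
    domain = nearest_domain(trace, centroids)
    scores = accuracy[domain]
    return domain, max(scores, key=scores.get)

# Made-up values for illustration only (not results from the paper).
centroids = {
    "maths": [0.10, 0.40, 0.90],
    "law": [0.20, 0.80, 1.60],
}
accuracy = {
    "maths": {"general_model": 0.61, "maths_finetuned": 0.58},
    "law": {"general_model": 0.40, "law_finetuned": 0.55},
}

query_trace = [0.12, 0.45, 1.00]
domain, model = select_model(query_trace, centroids, accuracy)
print(domain, model)
```

In this toy setup the selected model for the maths-like query is the general one, mirroring the abstract's observation that the fine-tuned model is not always the most accurate.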

Paper Structure

This paper contains 19 sections, 10 figures, 5 tables.

Figures (10)

  • Figure 1: Activation summary produced by Phi-3-mini-3.8B on the MMLU benchmark. The left side shows the mean activation per domain subset (a) and per sample (b), while the right side presents the variance across domains (c) and samples (d).
  • Figure 2: Standard deviation traces per dataset and sample across four different domains. Each subplot shows the behavior across layers $l$ of a different LLM architecture for the MMLU, GSM8K, MEDMCQA, CaseHOLD, and PLATO datasets. Across all subplots, the standard deviation generally increases in deeper layers, suggesting that the hidden states become more sensitive to the specific characteristics of each dataset as depth increases. Further results for Llama-2B model are reported in Appendix \ref{app:llama_analysis}.
  • Figure 3: Standard deviation of the hidden state traces of Phi-3-mini-3.8B across 12 data sources and different prompt instructions for the domains of Maths, Biomedical, and Law. Each subplot contains the traces from 3-4 different dataset distributions belonging to the same domain. The legend in each subplot identifies the datasets used for evaluation. Appendix \ref{app:prompt_perturbance} provides the traces across the same datasets for the Gemma-2B and Mistral-7B models, showing that this behavior is reproducible across other LLM families.
  • Figure 4: Zero-shot accuracy as the number of layers used by the MLP discriminator is reduced, for open-ended (GSM8k) and multichoice (MEDMCQA, CaseHOLD) tasks. Each point in the subplots is cumulative, incorporating signals from layers 1 to $X$.
  • Figure 5: Standard deviation traces per dataset and sample across four different domains, extracted from the Llama2-7B model. The law domain datasets, in particular, stand out with their higher variability, indicating that the model's hidden states are more sensitive to the specific characteristics of legal texts, similar to the behavior of Phi-3-mini-3.8B in Figure \ref{fig:hs_models_3}. This nuanced encoding could be a result of the model's training data.
  • ...and 5 more figures
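
The captions above refer to per-layer standard deviation traces and to cumulative features from layers 1 to $X$. The following is a minimal sketch of how such a trace might be computed. It assumes the standard deviation is taken over all token activations within a layer, which the captions do not specify; `std_trace`, `cumulative_features`, and the toy `hidden_states` values are hypothetical.

```python
from statistics import pstdev

def std_trace(hidden_states):
    """Compute one standard deviation per layer.

    hidden_states[l][t] is the activation vector of token t at layer l.
    Assumption: activations are pooled over tokens and hidden dimensions.
    """
    trace = []
    for layer in hidden_states:
        flat = [x for token in layer for x in token]
        trace.append(pstdev(flat))
    return trace

def cumulative_features(trace, x):
    """Keep the signal from layers 1 to x (inclusive), as in a cumulative plot."""
    return trace[:x]

# Toy input: 3 layers, 2 tokens, hidden size 2 (illustrative values only).
hidden_states = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, -1.0], [1.0, -1.0]],
    [[2.0, -2.0], [2.0, -2.0]],
]

trace = std_trace(hidden_states)
first_two = cumulative_features(trace, 2)
print(trace, first_two)
```

A trace of this form could serve as the input features for a domain classifier, with `cumulative_features` restricting the classifier to the first $X$ layers.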