
Enhancing LLM's Cognition via Structurization

Kai Liu, Zhihang Fu, Chao Chen, Wei Zhang, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye

TL;DR

A novel concept of context structurization is presented, which transforms the plain, unordered contextual sentences into well-ordered and hierarchically structurized elements so that large language models can better grasp intricate and extended contexts through precise attention and information-seeking along the organized structures.

Abstract

When reading long-form text, human cognition is complex and structurized. Large language models (LLMs), in contrast, process input contexts from a causal and sequential perspective, which can limit their ability to handle intricate and complex inputs effectively. To enhance LLMs' cognition capability, this paper presents a novel concept of context structurization. Specifically, we transform the plain, unordered contextual sentences into well-ordered and hierarchically structurized elements. By doing so, LLMs can better grasp intricate and extended contexts through precise attention and information-seeking along the organized structures. Extensive evaluations are conducted across various model architectures and sizes (including a series of auto-regressive LLMs as well as BERT-like masking models) on a diverse set of NLP tasks (e.g., context-based question-answering, exhaustive hallucination evaluation, and passage-level dense retrieval). Empirical results show consistent and significant performance gains afforded by a single round of structurization. In particular, we boost the open-source LLaMA2-70B model to achieve performance comparable to GPT-3.5-Turbo as a hallucination evaluator. In addition, we show the feasibility of distilling advanced LLMs' language processing abilities into a smaller yet effective StruXGPT-7B to execute structurization, addressing the practicality of our approach. Code is available at https://github.com/alibaba/struxgpt.

Paper Structure (28 sections, 24 figures, 8 tables)

Figures (24)

  • Figure 1: Structured cognition on sequential contexts. Humans may easily identify a given passage's topic/scope, break down the text sentences into several aspect points with detailed descriptions, and form a tree-like knowledge structure.
  • Figure 2: Framework overview. When instructed to generate responses based on vanilla long-form and sophisticated contexts, LLMs often lose their focus and give unreliable answers due to their limited cognition capability. In contrast, we structurize the vanilla context with our StruXGPT to identify its main scope and aspect points, helping the original LLMs comprehend the context and generate accurate responses.
  • Figure 3: Prompt template for structurization.
  • Figure 4: Left: templates to transform structurization results into natural language, with special linguistic markers to preserve and highlight the extracted knowledge structure. Right: transformed context examples with clear information structure for long-form reading comprehension (upper) and hallucination detection (lower) tasks.
  • Figure 5: Attention maps on vanilla and structurized contexts for the same LLaMA2-7B model. The sample comes from the QAsper subset.
  • ...and 19 more figures
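
The captions for Figures 1, 2, and 4 describe a three-level structure (a scope, several aspect points, and detailed descriptions under each aspect) that is rendered back into natural language with linguistic markers. The following is a minimal sketch of that idea, not the authors' implementation: the class names, the `render` function, and the template wording are all hypothetical and chosen only for illustration. The actual prompt and templates are the ones shown in Figures 3 and 4 of the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Aspect:
    """One aspect point with its detailed descriptions (hypothetical type)."""
    name: str
    descriptions: List[str] = field(default_factory=list)


@dataclass
class StructuredContext:
    """Scope -> aspects -> descriptions tree (hypothetical type)."""
    scope: str
    aspects: List[Aspect] = field(default_factory=list)


def render(ctx: StructuredContext) -> str:
    """Flatten the tree into text, using numbering as structural markers.

    The wording of the header line is an assumed template, not the paper's.
    """
    lines = [f"This passage is about {ctx.scope}, covering {len(ctx.aspects)} aspect(s):"]
    for i, aspect in enumerate(ctx.aspects, start=1):
        lines.append(f"{i}. {aspect.name}")
        for j, desc in enumerate(aspect.descriptions, start=1):
            lines.append(f"    {i}.{j}. {desc}")
    return "\n".join(lines)


# Illustrative input; the content paraphrases the abstract above.
example = StructuredContext(
    scope="context structurization",
    aspects=[
        Aspect("Motivation", ["LLMs read contexts causally and sequentially."]),
        Aspect("Method", [
            "Plain sentences are reorganized into a hierarchy.",
            "A distilled 7B model performs the structurization.",
        ]),
    ],
)
text = render(example)
print(text)
```

In this sketch the numbering (`1.`, `1.1.`) plays the role of the linguistic markers: it keeps the hierarchy visible after the tree has been flattened into a plain string that can be passed to a downstream model.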