Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4

Sondos Mahmoud Bsharat; Aidar Myrzakhan; Zhiqiang Shen

TL;DR

Prompt quality critically shapes LLM outputs, and this work proposes 26 principled instructions to guide prompting across model scales and tasks. The authors validate the approach on the ATLAS benchmark using LLaMA-1/2 variants and GPT-3.5/4, reporting significant boosts in response quality and correctness, especially for larger models. The principles emphasize audience-tailored prompts, incremental prompting, and example-driven design to reduce bias and improve accuracy. The work offers actionable guidance for researchers and developers crafting prompts and argues for integrating principled prompts into standard LLM workflows.
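The three themes named above (audience-tailored prompts, incremental prompting, example-driven design) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `PromptSpec`, `build_prompt`, and `incremental_prompts` are hypothetical names, and the prompt wording is not the paper's own principle text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PromptSpec:
    """Hypothetical container for the parts of a prompt (not from the paper)."""
    task: str
    audience: str = ""
    # Each example is a (question, answer) pair used for few-shot prompting.
    examples: List[Tuple[str, str]] = field(default_factory=list)


def build_prompt(spec: PromptSpec) -> str:
    """Assemble one prompt: audience line, then examples, then the task."""
    parts = []
    if spec.audience:
        # Audience-tailored prompting: state who the answer is for.
        parts.append(f"The audience is {spec.audience}.")
    for question, answer in spec.examples:
        # Example-driven prompting: show worked question/answer pairs.
        parts.append(f"Example question: {question}\nExample answer: {answer}")
    parts.append(spec.task)
    return "\n\n".join(parts)


def incremental_prompts(steps: List[str]) -> List[str]:
    """Incremental prompting: turn a complex task into a sequence of simpler prompts."""
    total = len(steps)
    return [f"Step {i} of {total}: {step}" for i, step in enumerate(steps, start=1)]


if __name__ == "__main__":
    spec = PromptSpec(
        task="Explain what overfitting is.",
        audience="a beginner in machine learning",
        examples=[("What is a training set?", "The data a model learns from.")],
    )
    print(build_prompt(spec))
    print(incremental_prompts(["Define the problem.", "Propose a solution."]))
```

The resulting strings would be sent to a model through whatever client is in use; that call is left out because the summary above does not specify one.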

Abstract

This paper introduces 26 guiding principles designed to streamline the process of querying and prompting large language models. Our goal is to simplify the underlying concepts of formulating questions for various scales of large language models, examine their abilities, and enhance user comprehension of how models of different scales behave when given different prompts. Extensive experiments are conducted on LLaMA-1/2 (7B, 13B and 70B) and GPT-3.5/4 to verify the effectiveness of the proposed principles for instruction and prompt design. We hope that this work can provide a better guide for researchers working on the prompting of large language models. The project page is available at https://github.com/VILA-Lab/ATLAS.

Paper Structure (15 sections, 14 figures, 2 tables)

This paper contains 15 sections, 14 figures, 2 tables.

Introduction
Related Work
Principles
Motivation
Overview
Design Principles
Experiments
Setup and Implementation Details
Models and Metrics
Results
Results on small, medium and large-scale LLMs
Results on individual LLMs
More examples on various scales of LLMs
Conclusion
Limitations and Discussion

Figures (14)

Figure 1: Illustrative example of prompts and corresponding responses before and after applying principles. Left: the original prompts and their responses from GPT-4. Right: the principled prompts and the associated responses. Principles 5 and 6 are used.
Figure 2: Boosting example of LLM response after using the principle 13 on prompts.
Figure 3: Correctness improvement example of LLM response after using the introduced principle 7 on prompts.
Figure 4: Boosting of LLM response quality after employing the introduced principles on prompts. Small-scale indicates the 7B models, medium-scale indicates the 13B models, and large-scale indicates the 70B and GPT-3.5/4 models.
Figure 6: Relative correctness improvement of LLM response quality after employing the introduced principles on prompts. Small-scale indicates the 7B models, medium-scale indicates the 13B models, and large-scale indicates the 70B and GPT-3.5/4 models.
...and 9 more figures
