CRCL at SemEval-2024 Task 2: Simple prompt optimizations

Clément Brutti-Mairesse; Loïc Verlingue

TL;DR

This paper tackles the SemEval-2024 Task 2 NLI problem in a clinical-trial context by applying hard prompt optimization and three prompting strategies (OPRO, zero-shot CoT, dynamic one-shot CoT) to infer entailment/contradiction between statements and CTR sections. It evaluates multiple LLMs (including Mixtral-8x7B-Instruct) and uses a vector-embedding workflow to support exemplar retrieval in the dynamic prompting setup. Faithfulness and Consistency are defined as key robustness metrics, with explicit formulas governing their computation. The results show that zero-shot CoT prompts offer the best improvement in F1, while dynamic one-shot CoT yields the strongest faithfulness and consistency, demonstrating the practical value of advanced prompting techniques in medical NLI without fine-tuning.
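The exemplar-retrieval step of the dynamic one-shot CoT setup can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the exemplar pool, the prompt wording, and the helper names `retrieve_exemplar` and `build_prompt` are all hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve_exemplar(statement, exemplars):
    """Return the stored exemplar whose statement is most similar to the query."""
    query = embed(statement)
    return max(exemplars, key=lambda ex: cosine(query, embed(ex["statement"])))

def build_prompt(statement, section, exemplar):
    """Assemble a one-shot CoT prompt (hypothetical wording)."""
    return (
        "Example:\n"
        f"Statement: {exemplar['statement']}\n"
        f"Reasoning: {exemplar['reasoning']}\n"
        f"Answer: {exemplar['label']}\n\n"
        "Now solve:\n"
        f"Section: {section}\n"
        f"Statement: {statement}\n"
        "Reasoning:"
    )

# Made-up exemplar pool, for illustration only.
EXEMPLARS = [
    {
        "statement": "Patients in the primary trial received a higher dose of the drug",
        "reasoning": "The intervention section lists the same dose for both cohorts.",
        "label": "Contradiction",
    },
    {
        "statement": "More adverse events were recorded in the primary trial",
        "reasoning": "The adverse events section reports a higher count for the primary trial.",
        "label": "Entailment",
    },
]

query = "Adverse events were more frequent in the primary trial"
best = retrieve_exemplar(query, EXEMPLARS)
prompt = build_prompt(query, "Adverse Events: (made-up section text)", best)
```

In a real pipeline the count vectors would be replaced by dense embeddings stored in a vector index, but the selection logic (embed the query, pick the nearest exemplar, prepend it to the prompt) stays the same.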

Abstract

We present a baseline for the SemEval-2024 Task 2 challenge, whose objective is to determine the inference relationship between pairs of clinical trial report sections and statements. We apply prompt optimization techniques to instruction-tuned LLMs provided as a Language Model-as-a-Service (LMaaS). In line with recent findings, we observed that synthetic CoT prompts significantly improve on manually crafted ones.

Paper Structure (16 sections, 2 equations, 3 figures, 2 tables, 3 algorithms)

Introduction
Methods
Tasks
Prompting
OPRO optimization
Zero-shot Chain-of-Thought prompt
Dynamic one-shot Chain-of-Thought prompt
Language models
Evaluation metrics
Results
Main results
Other evaluations
Conclusion
Acknowledgments
Prompt instructions
...and 1 more sections

Figures (3)

Figure 1: SemEval 2024 dataset data model
Figure 2: Dynamic one-shot prompting workflow
Figure 3: Zero-shot CoT prompting sample pipeline
