CRCL at SemEval-2024 Task 2: Simple prompt optimizations
Clément Brutti-Mairesse, Loïc Verlingue
TL;DR
This paper tackles the SemEval-2024 Task 2 natural language inference (NLI) problem in a clinical-trial context by applying hard prompt optimization and three prompting strategies (OPRO, zero-shot CoT, and dynamic one-shot CoT) to infer entailment or contradiction between statements and clinical trial report (CTR) sections. It evaluates multiple LLMs (including Mixtral-8x7B-Instruct) and uses a vector-embedding workflow to retrieve exemplars in the dynamic prompting setup. Faithfulness and Consistency are defined as the key robustness metrics, each computed from an explicit formula. The results show that zero-shot CoT prompts yield the largest improvement in F1, while dynamic one-shot CoT yields the strongest faithfulness and consistency, demonstrating the practical value of advanced prompting techniques for medical NLI without fine-tuning.
Abstract
We present a baseline for SemEval-2024 Task 2, whose objective is to determine the inference relationship between pairs of clinical trial report sections and statements. We apply prompt optimization techniques to LLM Instruct models provided as Language Model-as-a-Service (LMaaS). We observe, in line with recent findings, that synthetic CoT prompts significantly improve on manually crafted ones.
