Integrating Randomness in Large Language Models: A Linear Congruential Generator Approach for Generating Clinically Relevant Content

Andrew Bouras

TL;DR

The paper addresses the difficulty of obtaining varied, non-repeating outputs from large language models used for educational content generation. It proposes a framework that combines a linear congruential generator (LCG) with GPT-4o: the LCG selects unique gastrointestinal physiology and pathology facts from a pool of 100, and GPT-4o turns each round's selection into seven vignette-style MCQs. The LCG recurrence is $X_{n+1} = (a X_n + c) \bmod m$, with parameters $X_0=12345$, $a=1103515245$, $c=12345$, $m=2^{31}$. Over 14 rounds, 98 MCQs were produced with no fact overlap between rounds, demonstrating diverse coverage and high clinical relevance. The approach offers a scalable, reproducible method for automated generation of medical assessment materials and could extend to other domains and question types.
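The recurrence and parameters above can be illustrated with a minimal Python sketch. The summary does not say how LCG outputs are mapped to fact indices, so the mapping used here (taking each output modulo the pool size and skipping indices that were already used) is an assumption made for illustration, as are the names `lcg` and `select_rounds`. It is not presented as the author's implementation.

```python
from typing import Iterator, List, Set

# Parameters as stated in the TL;DR.
A = 1103515245
C = 12345
M = 2 ** 31
SEED = 12345


def lcg(seed: int = SEED, a: int = A, c: int = C, m: int = M) -> Iterator[int]:
    """Yield X_1, X_2, ... from X_{n+1} = (a * X_n + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def select_rounds(pool_size: int = 100, per_round: int = 7,
                  rounds: int = 14, seed: int = SEED) -> List[List[int]]:
    """Pick `per_round` fact indices per round with no repeats across rounds.

    Assumption (not from the paper): an LCG output is mapped to a fact
    index with `% pool_size`, and already-used indices are skipped.
    """
    if per_round * rounds > pool_size:
        raise ValueError("not enough facts for non-overlapping rounds")
    gen = lcg(seed)
    used: Set[int] = set()
    selection: List[List[int]] = []
    for _ in range(rounds):
        batch: List[int] = []
        while len(batch) < per_round:
            idx = next(gen) % pool_size
            if idx not in used:
                used.add(idx)
                batch.append(idx)
        selection.append(batch)
    return selection


if __name__ == "__main__":
    for number, batch in enumerate(select_rounds(), start=1):
        print(f"Round {number}: facts {batch}")
```

Because the generator is deterministic, rerunning the sketch with the same seed reproduces the same 14 rounds, which is the reproducibility property the summary emphasizes. Each round's indices would then be used to look up the facts that are placed into the GPT-4o prompt.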

Abstract

Generating diverse, high-quality outputs from language models is crucial for applications in education and content creation. Achieving true randomness and avoiding repetition remains a significant challenge. This study uses the Linear Congruential Generator method for systematic fact selection, combined with AI-powered content generation. We ensured unique combinations of gastrointestinal physiology and pathology facts across multiple rounds, integrating these facts into prompts for GPT-4o to create clinically relevant, vignette-style outputs. Over 14 rounds, 98 unique outputs were generated, demonstrating LCG's effectiveness in producing diverse and high-quality content. This method addresses key issues of randomness and repetition, enhancing the quality and efficiency of language model-generated content for various applications.

Paper Structure (18 sections, 1 equation)

This paper contains 18 sections, 1 equation.

Introduction
Background
Objective
Methodology
Linear Congruential Generator
Fact Selection Process
MCQ Generation
Experiment Setup
Rounds of MCQ Generation
Evaluation Metrics
Results
Overview of the MCQs Generated in Each Round
Analysis of Fact Overlap and Significance of Findings
Discussion
Interpretation of Results
...and 3 more sections
