SMILES-Prompting: A Novel Approach to LLM Jailbreak Attacks in Chemical Synthesis

Aidan Wong; He Cao; Zijing Liu; Yu Li

TL;DR

This paper explores the security vulnerabilities of LLMs in chemistry, particularly their capacity to provide instructions for synthesizing hazardous substances. It introduces a novel attack technique, SMILES-prompting, which uses the Simplified Molecular-Input Line-Entry System (SMILES) to reference chemical substances.

Abstract

The increasing integration of large language models (LLMs) across various fields has heightened concerns about their potential to propagate dangerous information. This paper explores the security vulnerabilities of LLMs within the field of chemistry, particularly their capacity to provide instructions for synthesizing hazardous substances. We evaluate the effectiveness of several prompt injection attack methods, including red-teaming, explicit prompting, and implicit prompting. Additionally, we introduce a novel attack technique named SMILES-prompting, which uses the Simplified Molecular-Input Line-Entry System (SMILES) to reference chemical substances. Our findings reveal that SMILES-prompting can effectively bypass current safety mechanisms. These results highlight the urgent need for enhanced domain-specific safeguards in LLMs to prevent misuse and improve their potential for positive social impact.
