SOK: Exploring Hallucinations and Security Risks in AI-Assisted Software Development with Insights for LLM Deployment
Ariful Haque, Sunzida Siddique, Md. Mahfuzur Rahman, Ahmed Rafi Hasan, Laxmi Rani Das, Marufa Kamal, Tasnim Masura, Kishor Datta Gupta
TL;DR
The paper investigates the benefits and risks of AI-assisted software development tools (Copilot, ChatGPT, Cursor AI, Codeium), focusing on productivity gains, security vulnerabilities, data leakage, IP concerns, and hallucinations. It combines user feedback analysis, security analyses, and case studies to map tool-specific strengths and weaknesses and to propose a framework for secure, effective deployment. Key contributions include a user-centric multi-tool evaluation, a security-risk taxonomy, and practical guidance for mitigating data leaks, prompt-based attacks, and insecure code generation. The work highlights the need for robust governance, encryption, access controls, and human oversight to harness AI-enabled coding safely in real-world workflows.
Abstract
The integration of tools based on Large Language Models (LLMs), such as GitHub Copilot, ChatGPT, Cursor AI, and Codeium AI, into software development has reshaped how code is written, offering significant productivity gains, automation, and enhanced debugging capabilities. These tools have proven valuable for generating code snippets, refactoring existing code, and providing real-time support to developers. However, their widespread adoption also presents notable challenges, particularly in terms of security vulnerabilities, code quality, and ethical concerns. This paper provides a comprehensive analysis of the benefits and risks associated with AI-powered coding tools, drawing on user feedback, security analyses, and practical use cases. We explore the potential for these tools to replicate insecure coding practices, introduce biases, and generate incorrect or nonsensical code (hallucinations). In addition, we discuss the risks of data leaks and intellectual property violations, and the need for robust security measures to mitigate these threats. By comparing the features and performance of these tools, we aim to guide developers in making informed decisions about their use, so that the benefits of AI-assisted coding are maximized and the associated risks are minimized.
