AI-assisted Coding with Cody: Lessons from Context Retrieval and Evaluation for Code Recommendations

Jan Hartman, Rishabh Mehrotra, Hitesh Sagtani, Dominic Cooney, Rafal Gajdulewicz, Beyang Liu, Julie Tibshirani, Quinn Slack

TL;DR

This paper analyzes how to make LLM-based coding assistants practical by injecting codebase context through a context engine. It advocates a two-stage pipeline—retrieval to generate candidate context and a ranker to select items for the LLM prompt—across diverse, potentially non-indexed sources. It discusses significant evaluation challenges, including online-offline discrepancies and scarce labeled data, and highlights offline datasets and guardrails developed for Cody to assess completions, edits, tests, and chat. The work provides practical insights for improving precision and reliability of AI-assisted coding in real-world development environments.
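The two-stage shape described above (a recall-oriented retrieval step followed by a ranker that picks what fits in the prompt) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `ContextItem`, `retrieve`, `rank`, and `build_prompt` are hypothetical, and token overlap and Jaccard similarity are simple stand-ins, not the retrievers or ranker Cody actually uses.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    """A candidate piece of context (hypothetical type for this sketch)."""
    source: str  # e.g. a file path or document name
    text: str


def tokenize(text: str) -> set:
    # Lowercase and split into alphabetic runs, so `parse_config` yields
    # the tokens {"parse", "config"}.
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, corpus: list, k: int = 10) -> list:
    """Stage 1: cheap and recall-oriented. Keep up to k items that share
    at least one token with the query, ordered by overlap count."""
    q = tokenize(query)
    scored = [(len(q & tokenize(item.text)), item) for item in corpus]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in hits[:k]]


def rank(query: str, candidates: list, budget_chars: int) -> list:
    """Stage 2: precision-oriented. Order candidates by Jaccard similarity
    (a stand-in scorer) and greedily pack them into a character budget."""
    q = tokenize(query)

    def jaccard(item: ContextItem) -> float:
        t = tokenize(item.text)
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    selected, used = [], 0
    for item in sorted(candidates, key=jaccard, reverse=True):
        if used + len(item.text) <= budget_chars:
            selected.append(item)
            used += len(item.text)
    return selected


def build_prompt(query: str, items: list) -> str:
    """Assemble the selected context and the user request into one prompt."""
    context = "\n\n".join(f"# {item.source}\n{item.text}" for item in items)
    return f"{context}\n\n# Request\n{query}"


corpus = [
    ContextItem("a.py", "def parse_config(path): return load(path)"),
    ContextItem("b.py", "def render_html(template): return template"),
    ContextItem("c.md", "config defaults for the path resolver"),
]
query = "parse config path"
candidates = retrieve(query, corpus)
prompt = build_prompt(query, rank(query, candidates, budget_chars=50))
```

Splitting the work this way lets the first stage stay cheap enough to run over many sources, while the second stage spends more effort on a short candidate list and enforces the prompt budget.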

Abstract

In this work, we discuss a recently popular type of recommender system: an LLM-based coding assistant. Connecting the task of providing code recommendations in multiple formats to traditional RecSys challenges, we outline several similarities and differences due to domain specifics. We emphasize the importance of providing relevant context to an LLM for this use case and discuss lessons learned from context enhancements and from offline and online evaluation of such AI-assisted coding systems.

Paper Structure (3 sections)

This paper contains 3 sections.