Open-Source AI-Powered Optimization in Scalene: Advancing Python Performance Profiling with DeepSeek-R1 and LLaMA 3.2

Saem Hasan, Sanju Basak

TL;DR

The paper addresses Python performance inefficiencies and the need for actionable profiling guidance by replacing the proprietary OpenAI API with open-source LLMs. It proposes integrating Ollama-hosted DeepSeek-R1 and LLaMA 3.2 into SCALENE to generate optimization suggestions directly from profiling data. Experimental results on 15 code snippets reveal that DeepSeek-R1 provides concise, hardware-aware optimizations and effectively leverages GPU acceleration, while LLaMA 3.2 can introduce unnecessary computations. The findings demonstrate that open-source AI-driven code optimization is viable, cost-effective, and privacy-preserving, expanding accessibility for Python performance analysis tools.

Abstract

Python's flexibility and ease of use come at the cost of performance inefficiencies, requiring developers to rely on profilers to optimize execution. SCALENE, a high-performance CPU, GPU, and memory profiler, provides fine-grained insights into Python applications while running significantly faster than traditional profilers. Originally, SCALENE integrated OpenAI's API to generate AI-powered optimization suggestions, but its reliance on a proprietary API limited accessibility. This study explores the feasibility of using open-source large language models (LLMs), such as DeepSeek-R1 and LLaMA 3.2, to generate optimization recommendations within SCALENE. Our evaluation reveals that DeepSeek-R1 provides effective code optimizations comparable to those of proprietary models. We integrate DeepSeek-R1 into SCALENE to automatically analyze performance bottlenecks and suggest improvements, enhancing SCALENE's utility while maintaining its open-source nature. This study demonstrates that open-source LLMs can be viable alternatives for AI-driven code optimization, paving the way for more accessible and cost-effective performance analysis tools.
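
To make the idea of a locally hosted model producing suggestions from profiling data more concrete, the following is a minimal sketch, not the integration described in the paper. It assumes an Ollama server listening on its default local port and uses Ollama's `/api/generate` endpoint. The prompt template, the helper names (`build_prompt`, `build_request`, `suggest`), and the model tag `deepseek-r1` are illustrative assumptions; SCALENE's actual prompt and plumbing are not reproduced here.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumption: default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_prompt(source: str, cpu_percent: float) -> str:
    """Hypothetical prompt template; the paper's real prompt is not shown here."""
    return (
        "Optimize the following Python code for performance. "
        f"A profiler attributes {cpu_percent:.1f}% of CPU time to it.\n\n"
        f"{source}\n"
    )


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def suggest(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object whose "response"
    # field holds the full completion.
    return body["response"]
```

With a server running and the model pulled, `suggest("deepseek-r1", build_prompt("total = sum([x * x for x in data])", 87.5))` would return the model's suggestion as a string. Because the request never leaves the local machine, the profiled source is not sent to a third-party service, which is the privacy benefit noted in the TL;DR.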

Paper Structure

This paper contains 4 sections and 1 figure.

Figures (1)

  • Figure 1: Some code optimizations suggested by LLaMA 3.2 and DeepSeek-R1