Chart2Code-MoLA: Efficient Multi-Modal Code Generation via Adaptive Expert Routing

Yifei Wang, Jacky Keung, Zhenyu Mao, Jingyu Zhang, Yuchen Cao

TL;DR

The paper tackles chart-to-code generation by proposing C2C-MoLA, a multimodal framework that combines expert routing (MoE) with parameter-efficient fine-tuning (LoRA) to handle structurally diverse charts under resource constraints. It introduces a structure-aware routing mechanism driven by a learnable complexity metric and couples this with LoRA-based updates to attention modules, enabling adaptive specialization and memory-efficient training. Empirical results on Chart2Code-160k show substantial gains in generation accuracy, especially for complex charts, along with notable reductions in peak memory and faster convergence; ablations validate the effectiveness of the eight-expert, 32-token configuration and rank-8 LoRA targeting attention and output. The work demonstrates meaningful improvements in modularity, scalability, and efficiency for multimodal code generation and outlines avenues for expert-centric pretraining, robustness to noisy inputs, and interactive routing.

Abstract

Chart-to-code generation is a critical task in automated data visualization, translating complex chart structures into executable programs. While recent Multimodal Large Language Models (MLLMs) improve chart representation, existing approaches still struggle to achieve cross-type generalization, memory efficiency, and modular design. To address these challenges, this paper proposes C2C-MoLA, a multimodal framework that synergizes Mixture of Experts (MoE) with Low-Rank Adaptation (LoRA). The MoE component uses a complexity-aware routing mechanism with domain-specialized experts and load-balanced sparse gating, dynamically allocating inputs based on learnable structural metrics like element count and chart complexity. LoRA enables parameter-efficient updates for resource-conscious tuning, further supported by a tailored training strategy that aligns routing stability with semantic accuracy. Experiments on Chart2Code-160k show that, compared to standard fine-tuning and LoRA-only baselines, the proposed model improves generation accuracy by up to 17%, reduces peak GPU memory by 18%, and accelerates convergence by 20%, particularly on complex charts. Ablation studies validate optimal designs, such as 8 experts and rank-8 LoRA, and confirm scalability for real-world multimodal code generation.
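
To make the two ingredients named in the abstract concrete, the following is a minimal, generic sketch of sparse top-k gating and a LoRA-style low-rank update. It is not the authors' implementation: the function names, the choice of k = 2, the scaling factor `alpha`, and the toy router logits are all assumptions made for illustration, and the paper's complexity-aware routing and load-balancing terms are not reproduced here.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(logits, k=2):
    """Sparse gating: keep the k largest router logits and renormalize them.

    Returns a dict mapping expert index -> mixing weight. The value of k
    is an assumption; the abstract only states that gating is sparse.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in top])
    return dict(zip(top, weights))


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def lora_linear(x, W, A, B, alpha=16.0):
    """LoRA-style layer: y = W x + (alpha / r) * B (A x).

    W is the frozen base weight; only the low-rank factors A (r x d_in)
    and B (d_out x r) would be trained. alpha is a hypothetical scaling value.
    """
    r = len(A)
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, delta)]


# Toy router logits for 8 experts (the expert count reported in the ablations).
gate = top_k_gate([0.1, 2.0, -1.0, 1.5, 0.0, -0.5, 0.3, 0.2], k=2)

# With B initialized to zeros, the LoRA layer reproduces the frozen base output.
y = lora_linear([3.0, 4.0], W=[[1, 0], [0, 1]], A=[[1, 1]], B=[[0], [0]])
```

In this sketch only the selected experts receive non-zero weight, and the low-rank factors add r * (d_in + d_out) trainable parameters per adapted matrix instead of d_in * d_out, which is the general reason LoRA-style tuning reduces memory.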

Paper Structure

This paper contains 35 sections, 20 equations, 4 figures, and 8 tables.

Figures (4)

  • Figure 1: C2C-MoLA: Chart-to-Code Generation Framework
  • Figure 2: Expert Routing Mechanism in C2C-MoLA
  • Figure 3: Training Convergence Curves
  • Figure 4: Expert selection before vs. after regularization