PEARL: Peer-Enhanced Adaptive Radio via On-Device LLM

Ju-Hyung Lee, Yanqing Lu, Klaus Doppler

TL;DR

PEARL tackles cooperative cross-layer optimization for device-to-device wireless links by deploying an on-device LLM agent that leverages both publisher and subscriber context to select Wi-Fi Aware parameters. It introduces two practical PEFT variants (PEARL with LoRA and PEARL-Lite with a head-only classifier) and a context-aware reward that normalizes latency by application tolerances and scales energy by battery state, enabling KL-based fine-tuning with soft targets. Empirical results show that peer-aware, reward-aligned training improves joint latency-energy performance over heuristic baselines, with PEARL-Lite achieving sub-20 ms inference and PEARL delivering the best overall objective score; energy savings of up to ~16% are observed in cooperative low-battery scenarios. The work demonstrates that on-device, peer-aware LLMs can deliver robust, efficient cross-layer control for always-on wireless operation and provides a path toward real-world demonstrations and broader protocol support. All code, data, and demos are available at the authors’ repository.

Abstract

We present PEARL (Peer-Enhanced Adaptive Radio via On-Device LLM), a framework for cooperative cross-layer optimization in device-to-device (D2D) communication. Building on our previous work on single-device on-device LLMs, PEARL extends the paradigm by leveraging both publisher and subscriber states to guide Wi-Fi Aware (WA) parameter selection. A context-aware reward, which normalizes latency by application tolerances and modulates energy by device battery states, provides richer supervision for KL-based fine-tuning. We study two lightweight variants: PEARL (Head + Low-Rank Adaptation (LoRA)) achieves the best overall performance, while PEARL-Lite (Head-only) delivers sub-20 ms inference at near-identical objective scores. Across synthetic scenarios grounded in real measurements, PEARL improves objective scores over heuristic and compact model baselines and reduces energy by up to 16% in cooperative low-battery cases. These results demonstrate that peer-aware context, reward-aligned training, and head-based efficiency make LLMs practical for always-on, on-device cross-layer control. Code, real-world demo, and dataset are available at https://github.com/abman23/pearl
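To make the reward idea in the abstract concrete, the following is a minimal sketch of how a context-aware reward could be turned into soft targets for KL-based fine-tuning. The abstract does not give the formula, so everything here is an assumption for illustration: the function names, the weights `w_latency` and `w_energy`, the linear battery scaling, the softmax temperature, and the use of an energy value already normalized to a reference. It is not the authors' implementation.

```python
import math


def context_aware_reward(latency_ms, energy_norm, tolerance_ms, battery_frac,
                         w_latency=0.5, w_energy=0.5):
    """Hypothetical reward (higher is better); not the paper's exact formula.

    latency_ms   : measured link latency for one candidate configuration
    energy_norm  : energy cost, assumed pre-normalized to roughly [0, 1]
    tolerance_ms : latency the running application can tolerate
    battery_frac : remaining battery in [0, 1]
    """
    # Normalize latency by the application's tolerance.
    latency_term = latency_ms / tolerance_ms
    # Assumed scaling: energy counts up to twice as much as the battery drains.
    battery_scale = 1.0 + (1.0 - battery_frac)
    energy_term = battery_scale * energy_norm
    return -(w_latency * latency_term + w_energy * energy_term)


def soft_targets(rewards, temperature=1.0):
    """Softmax over per-action rewards, giving a soft target distribution."""
    peak = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp((r - peak) / temperature) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted) if t > 0)


if __name__ == "__main__":
    # Made-up (latency_ms, energy_norm) pairs for three candidate actions.
    candidates = [(20.0, 0.9), (60.0, 0.4), (120.0, 0.2)]
    rewards = [context_aware_reward(lat, en, tolerance_ms=100.0, battery_frac=0.15)
               for lat, en in candidates]
    targets = soft_targets(rewards, temperature=0.5)
    uniform = [1.0 / len(targets)] * len(targets)
    print(targets, kl_divergence(targets, uniform))
```

Under these assumptions, a low battery raises the energy penalty, so the soft targets shift toward energy-saving configurations while still assigning some probability to the alternatives, which is the kind of richer supervision a one-hot label cannot express.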

Paper Structure

This paper contains 22 sections, 3 equations, 12 figures, 9 tables.

Figures (12)

  • Figure 1: System overview of PEARL. The agent runs on the publisher ($D_1$), combining local context (e.g., battery level, time, running applications) with peer-side context shared by the subscriber ($D_2$, e.g., battery level, device type). Based on these inputs, PEARL selects WA parameters $(\texttt{PerformanceMode}, \texttt{AccessCategory})$, which configure the WA stack to minimize latency and energy on the D2D link.
  • Figure 2: Architecture of PEARL. Context features are encoded into a structured prompt and passed through a frozen LLM encoder. In PEARL-Lite, a classification head directly predicts one of 8 WA parameter tuples. In PEARL, LoRA adapters are added to the encoder and trained jointly with the head. In both cases, the head produces a single-token decision, avoiding autoregressive decoding.
  • Figure 3: Objective score, latency, and energy comparison across PEARL variants and baselines.
  • Figure 4: Objective scores with context-aware vs. naive reward.
  • Figure 5: Objective scores for different training strategies. KL-based fine-tuning consistently outperforms cross-entropy (CE) and direct preference optimization (DPO) training, both in-distribution and out-of-distribution (OOD).
  • ...and 7 more figures
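
The single-step decision described in the Figure 2 caption (a classification head choosing one of 8 WA parameter tuples, with no autoregressive decoding) can be sketched as follows. The captions do not list the parameter values, so the split into 2 performance modes and 4 access categories is an assumption. The mode names `LOW_LATENCY` and `LOW_POWER` are hypothetical, and the access categories are the four standard 802.11 EDCA classes. The linear head and the helper names are likewise illustrative, not the authors' code.

```python
from itertools import product

# Hypothetical PerformanceMode values; the source does not enumerate them.
PERFORMANCE_MODES = ("LOW_LATENCY", "LOW_POWER")
# Standard 802.11 EDCA access categories: background, best effort, video, voice.
ACCESS_CATEGORIES = ("AC_BK", "AC_BE", "AC_VI", "AC_VO")

# Assumed action space: 2 x 4 = 8 (PerformanceMode, AccessCategory) tuples.
ACTIONS = list(product(PERFORMANCE_MODES, ACCESS_CATEGORIES))


def head_logits(embedding, weights, bias):
    """Linear classification head: one logit per action."""
    return [sum(w * x for w, x in zip(row, embedding)) + b
            for row, b in zip(weights, bias)]


def select_action(embedding, weights, bias):
    """Pick the highest-scoring tuple in a single forward step."""
    logits = head_logits(embedding, weights, bias)
    best = max(range(len(logits)), key=logits.__getitem__)
    return ACTIONS[best]


if __name__ == "__main__":
    dim = 8
    # Toy identity weights so the example is easy to follow by hand.
    weights = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(len(ACTIONS))]
    bias = [0.0] * len(ACTIONS)
    embedding = [0.0] * dim
    embedding[5] = 1.0  # stands in for a pooled encoder output
    print(select_action(embedding, weights, bias))
```

In this sketch the encoder output is a plain list of floats. In the PEARL-Lite setting described above, only the head parameters (`weights`, `bias`) would be trained while the encoder stays frozen, and in PEARL the LoRA adapters in the encoder would be trained jointly with the head.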