Memory-Augmented Log Analysis with Phi-4-mini: Enhancing Threat Detection in Structured Security Logs
Anbi Guo, Mahfuza Farooque
TL;DR
The paper tackles log anomaly detection for multistage APTs in structured security logs, where LLMs suffer from context limits and domain mismatch. It introduces DM-RAG, a dual-memory retrieval-augmented generation framework that combines a rolling short-term memory with a FAISS-indexed long-term memory and uses an instruction-tuned Phi-4-mini with Bayesian fusion. On UNSW-NB15, DM-RAG achieves 98.70% recall and 69.59% F1, surpassing LoRA-fine-tuned and MITRE-style RAG baselines in recall while maintaining competitive precision. The method is lightweight, interpretable, and suitable for real-time threat monitoring without external corpora, with potential applicability to other structured temporal data.
Abstract
Structured security logs are critical for detecting advanced persistent threats (APTs). Large language models (LLMs) struggle in this domain due to limited context and domain mismatch. We propose DM-RAG, a dual-memory retrieval-augmented generation framework for structured log analysis. It integrates a short-term memory buffer that holds recent summaries with a long-term, FAISS-indexed memory that holds historical patterns. An instruction-tuned Phi-4-mini processes the combined context and outputs structured predictions. Bayesian fusion is used so that only reliable predictions persist into memory. On the UNSW-NB15 dataset, DM-RAG achieves 53.64% accuracy and 98.70% recall, surpassing fine-tuned and RAG baselines in recall. The architecture is lightweight, interpretable, and scalable, enabling real-time threat monitoring without extra corpora or heavy tuning.
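The dual-memory idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `DualMemory`, the function `fuse`, the persistence threshold, and the toy two-dimensional embeddings are all hypothetical. A brute-force cosine search stands in for the FAISS index, the Phi-4-mini call is omitted, and the fusion rule shown (a naive-Bayes combination of two independent probability estimates) is an assumption, since the abstract does not specify the rule.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fuse(p_model, p_memory, prior=0.5):
    """Assumed fusion rule: naive-Bayes combination of two independent
    attack-probability estimates that share the same prior."""
    num = p_model * p_memory / prior
    den = num + (1 - p_model) * (1 - p_memory) / (1 - prior)
    return num / den if den else prior


class DualMemory:
    """Hypothetical dual memory: rolling short-term buffer plus a long-term store."""

    def __init__(self, short_size=3, persist_threshold=0.9):
        # Rolling window: the oldest summary is dropped once the buffer is full.
        self.short_term = deque(maxlen=short_size)
        # (embedding, summary) pairs; a brute-force stand-in for a FAISS index.
        self.long_term = []
        self.persist_threshold = persist_threshold

    def retrieve(self, query_vec, k=2):
        """Return the k stored summaries most similar to the query embedding."""
        ranked = sorted(
            self.long_term, key=lambda e: cosine(query_vec, e[0]), reverse=True
        )
        return [summary for _, summary in ranked[:k]]

    def build_context(self, query_vec, k=2):
        """Combine recent and historical summaries into the context for the model."""
        return {
            "recent": list(self.short_term),
            "historical": self.retrieve(query_vec, k),
        }

    def update(self, vec, summary, p_model, p_memory):
        """Always record the summary short-term; persist it long-term only
        when the fused confidence clears the threshold."""
        self.short_term.append(summary)
        posterior = fuse(p_model, p_memory)
        if posterior >= self.persist_threshold:
            self.long_term.append((vec, summary))
        return posterior


# Toy usage with made-up embeddings and confidence values.
memory = DualMemory(short_size=2, persist_threshold=0.9)
memory.update([1.0, 0.0], "port scan from host A", p_model=0.9, p_memory=0.8)
memory.update([0.0, 1.0], "routine DNS lookup", p_model=0.2, p_memory=0.3)
memory.update([0.9, 0.1], "repeated login failures", p_model=0.95, p_memory=0.7)
context = memory.build_context([1.0, 0.1])
```

In this toy run the low-confidence "routine DNS lookup" summary stays only in the short-term buffer, while the two high-confidence summaries are also written to the long-term store and become retrievable for later windows. A real deployment would replace the list scan with a FAISS index and the hand-written vectors with learned embeddings.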
