Temperature in SLMs: Impact on Incident Categorization in On-Premises Environments

Marcio Pohlmann, Alex Severo, Gefté Almeida, Diego Kreutz, Tiago Heinrich, Lourenço Pereira

TL;DR

The paper investigates whether locally executed Small Language Models (SLMs) can automate incident categorization in on-premises settings, avoiding the costs and confidentiality risks of cloud-based models. It reports a systematic empirical study of 21 SLMs ranging from 1B to 20B parameters across two hardware architectures, varying the temperature hyperparameter and measuring precision and processing time on a six-class CSIRT incident dataset. The results show that the temperature setting has limited impact on precision or processing time, while model size and hardware capacity are the primary determinants of performance, with medium-sized models offering favorable cost–accuracy trade-offs. The work shows that private, on-premises security automation is feasible and gives practitioners guidance on selecting SLMs and hardware for real-time incident categorization.

Abstract

SOCs and CSIRTs face increasing pressure to automate incident categorization, yet the use of cloud-based LLMs introduces costs, latency, and confidentiality risks. We investigate whether locally executed SLMs can meet this challenge. We evaluated 21 models ranging from 1B to 20B parameters, varying the temperature hyperparameter and measuring execution time and precision across two distinct architectures. The results indicate that temperature has little influence on performance, whereas the number of parameters and GPU capacity are decisive factors.

Paper Structure

This paper contains 5 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Experiment Stages
  • Figure 2: Processing Time per Architecture
  • Figure 3: Precision per Architecture