Shedding Light on CVSS Scoring Inconsistencies: A User-Centric Study on Evaluating Widespread Security Vulnerabilities

Julia Wunder, Andreas Kurtz, Christian Eichenmüller, Freya Gassmann, Zinaida Benenson

TL;DR

This paper investigates the reliability of CVSSv3.1 scoring by analyzing how 196 CVSS users rate a set of widespread vulnerabilities, complemented by a follow-up survey nine months later with 59 participants. It systematically identifies metric-related inconsistencies, particularly in Attack Vector, User Interaction, and Scope, and shows that personal factors explain only a small portion of the variance in scoring. The study also examines attitudes toward CVSS, the treatment of security deficiencies, and the behavior of evaluators over time. It finds that many users consider CVSS useful while recognizing its limitations and its potential impact on vulnerability management decisions. The authors provide a public data set and offer concrete recommendations to improve the CVSS ecosystem (centralizing documentation, clarifying problematic metrics, and refining scoring guidance) in order to enhance consistency and the practical utility of CVSS for vulnerability prioritization and management.

Abstract

The Common Vulnerability Scoring System (CVSS) is a popular method for evaluating the severity of vulnerabilities in vulnerability management. In the evaluation process, a numeric score between 0 and 10 is calculated, 10 being the most severe (critical) value. The goal of CVSS is to provide comparable scores across different evaluators. However, previous works indicate that CVSS might not reach this goal: If a vulnerability is evaluated by several analysts, their scores often differ. This raises the following questions: Are CVSS evaluations consistent? Which factors influence CVSS assessments? We systematically investigate these questions in an online survey with 196 CVSS users. We show that specific CVSS metrics are inconsistently evaluated for widespread vulnerability types, including Top 3 vulnerabilities from the "2022 CWE Top 25 Most Dangerous Software Weaknesses" list. In a follow-up survey with 59 participants, we found that for the same vulnerabilities from the main study, 68% of these users gave different severity ratings. Our study reveals that most evaluators are aware of the problematic aspects of CVSS, but they still see CVSS as a useful tool for vulnerability assessment. Finally, we discuss possible reasons for inconsistent evaluations and provide recommendations on improving the consistency of scoring.
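
To illustrate why disagreement on a single metric matters, the following is a minimal sketch of the CVSS v3.1 base score calculation as published in the FIRST specification. It is not code from the paper; the function names (`base_score`, `severity`, `roundup`) and the example vectors are assumptions chosen for illustration. It shows how two evaluators who differ only in User Interaction or Scope arrive at different scores for the same vulnerability.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required has different weights when Scope is Changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a vector such as 'CVSS:3.1/AV:N/AC:L/...'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"

    # Impact sub score
    iss = 1 - (1 - WEIGHTS["C"][m["C"]]) * (1 - WEIGHTS["I"][m["I"]]) * (1 - WEIGHTS["A"][m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    # Exploitability sub score
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = 8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]

    if impact <= 0:
        return 0.0
    if changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


def severity(score: float) -> str:
    """Map a score to the qualitative severity rating scale."""
    if score == 0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```

For example, `base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")` yields 9.8 (Critical), while the same vector with `UI:R` yields 8.8 (High). Likewise, a vector with `UI:R/S:C/C:L/I:L/A:N` scores 6.1, but 5.4 if the evaluator selects `S:U` instead. A one-metric disagreement can therefore move a vulnerability across a severity boundary.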

Paper Structure (42 sections, 7 figures, 11 tables)

Figures (7)

  • Figure 1: Overview of study design ($N$ = number of participants).
  • Figure 2: Structure of the main study.
  • Figure 3: Evaluations of the Attack Vector and User Interaction metric.
  • Figure 4: Evaluations of the Scope metric.
  • Figure 5: RQ2: Severity distributions of the vulnerabilities. Security deficiencies Banner Disclosure and HTTPOnly are more frequently rated with None than other vulnerabilities.
  • ...and 2 more figures