Predicting Vulnerable Components: Software Metrics vs Text Mining

James Walden, Jeffrey Stuckman, Riccardo Scandariato

December 2014

Abstract

Building secure software is difficult, time-consuming, and expensive. Prediction models that identify vulnerability-prone software components can be used to focus security efforts, thus helping to reduce the time and effort required to secure software. Several kinds of vulnerability prediction models have been proposed over the course of the past decade. However, these models were evaluated with differing methodologies and datasets, making it difficult to determine the relative strengths and weaknesses of different modeling techniques. In this paper, we provide a high-quality, public dataset, containing 223 vulnerabilities found in three web applications, to help address this issue. We used this dataset to compare vulnerability prediction models based on text mining with models using software metrics as predictors. We found that text mining models had higher recall than software-metrics-based models for all three applications.

Type

Conference paper

Publication

IEEE 25th International Symposium on Software Reliability Engineering

Vulnerabilities Security
