JavaScript malware detection using locality sensitive hashing


In this paper, we explore the idea of using locality sensitive hashes as input features to a feed-forward neural network with the goal of detecting JavaScript malware through static analysis. An experiment is conducted using a dataset containing 1.5M evenly distributed benign and malicious samples provided by the anti-malware company Cyren. Four different locality sensitive hashing algorithms are tested and evaluated: Nilsimsa, ssdeep, TLSH, and SDHASH. The results show a high prediction accuracy, as well as low false positive and negative rates. These results show that LSH based neural networks are a competitive option against other state-of-the-art JavaScript malware classification solutions.

35th International Conference on ICT Systems Security and Privacy Protection
Riccardo Scandariato

Software security, Privacy, Machine learning for secure development