Living off the Analyst: Harvesting Features from Yara Rules for Malware Detection

Siddhant Gupta; Fred Lu; Andrew Barlow; Edward Raff; Francis Ferraro; Cynthia Matuszek; Charles Nicholas; James Holt

TL;DR

Sub-signatures extracted from publicly available YARA rules are assembled into a set of features that more effectively discriminates malicious samples from benign ones, and experiments on the EMBER 2018 dataset demonstrate that these features add value beyond traditional features.

Abstract

A strategy used by malicious actors is to "live off the land," where benign systems and tools already available on a victim's systems are repurposed for the malicious actor's intent. In this work, we ask whether anti-virus developers can similarly repurpose existing work to improve their malware detection capability. We show that this is plausible via YARA rules, which use human-written signatures to detect specific malware families, functionalities, or other markers of interest. By extracting sub-signatures from publicly available YARA rules, we assembled a set of features that can more effectively discriminate malicious samples from benign ones. Our experiments demonstrate that these features add value beyond traditional features on the EMBER 2018 dataset. Manual analysis of the added sub-signatures shows power-law behavior, with a combination of features that are specific and unique and features that occur often. A prior expectation may be that the features would be limited by being overly specific to unique malware families. We do observe this behavior, and it appears to be useful in practice. In addition, we find sub-signatures that are dual-purpose (e.g., detecting virtual machine environments) or broadly generic (e.g., DLL imports).
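
To make the idea of harvesting sub-signatures concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a hypothetical toy rule (`TOY_RULE`, not taken from any real rule set), pulls each plain text or plain hex string definition out of the rule's `strings:` section with regular expressions, and turns each one into a binary "does this pattern occur in the sample" feature. The helper names `extract_subsignatures` and `featurize` are illustrative, and the sketch ignores escape sequences, wildcards, regular-expression strings, and every modifier other than `nocase`.

```python
import re

# Hypothetical toy rule for illustration only; not from the paper or a real rule set.
TOY_RULE = r'''
rule Toy_Example
{
    strings:
        $a = "VirtualAllocEx"
        $b = "vmware" nocase
        $c = { 4D 5A 90 00 }
    condition:
        $a and ($b or $c)
}
'''

# Simplified patterns: plain quoted text strings and plain hex strings only.
TEXT_STRING = re.compile(r'^\s*(\$\w*)\s*=\s*"((?:[^"\\]|\\.)*)"\s*(.*)$')
HEX_STRING = re.compile(r'^\s*(\$\w*)\s*=\s*\{([0-9A-Fa-f\s]+)\}')


def extract_subsignatures(rule_text):
    """Return a list of (name, byte pattern, nocase flag), one per string definition."""
    subs = []
    for line in rule_text.splitlines():
        m = TEXT_STRING.match(line)
        if m:
            name, body, modifiers = m.groups()
            # Escape sequences inside the string are not decoded in this sketch.
            subs.append((name, body.encode("latin-1"), "nocase" in modifiers.split()))
            continue
        m = HEX_STRING.match(line)
        if m:
            hex_digits = "".join(m.group(2).split())
            subs.append((m.group(1), bytes.fromhex(hex_digits), False))
    return subs


def featurize(sample, subs):
    """Map raw sample bytes to one binary feature per sub-signature."""
    lowered = sample.lower()
    feats = []
    for _name, pattern, nocase in subs:
        if nocase:
            feats.append(int(pattern.lower() in lowered))
        else:
            feats.append(int(pattern in sample))
    return feats


subs = extract_subsignatures(TOY_RULE)
features = featurize(b"MZ\x90\x00 padding VMware Tools", subs)
```

Each sub-signature becomes its own feature regardless of the rule's `condition:`, so a downstream classifier can weigh the individual patterns on their own. In this sketch the resulting vector would simply be concatenated with whatever other features a model already uses.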
