December 13, 2018

Machine Learning: A Belt and Suspenders Approach

By: Doug Levin, Member, Board of Directors, ReversingLabs

Definition: Belt and suspenders is a term used to mean conservatism and safety. More generally — as the use of both a belt and suspenders to hold up one's pants implies — having redundant safety procedures in place to eliminate all risk.

Investopedia

“Wearing a belt and suspenders together is far too busy and makes you look like you're trying to set the world record for wearing the heaviest pants.”

Business Insider

A belt and suspenders strategy won't win you any fashion awards, but it can save you from the embarrassment of being caught with your pants down by a successful cyber-attack. Malware authors are not too proud to use any tactic to gain access to your data, so you must bring an arsenal of effective technologies to undress these new attacks, even if the approach is a fashion faux pas. Unlike two devices that merely hold up your pants, however, overlapping cybersecurity technologies can be combined to provide synergies that exceed the sum of their parts. That makes a belt and suspenders approach a sound strategy for cybersecurity.

Several customers and prospects have asked how ReversingLabs compares with companies touting machine learning in their solutions. The answer is that we embrace machine learning and use it extensively in our products and services. If our file decomposition and static analysis technology is the "belt," then machine learning is the "suspenders" in our version of the strategy.

Our belt, decomposition and static analysis, unpacks files to expose all internal objects, stripping away obfuscation, archiving, encryption and compression. This gives security analysts, threat hunters, and forensic investigators a clean, in-depth view into any file and the malware hidden inside. Security teams gain powerful visibility into any malware in these files, and because the approach does not require any previous knowledge of the malware (think signatures), it is especially effective at uncovering previously unknown malware.

However, companies deal with massive numbers of files, many of which hide malware. No security team can look at all the files tied to events, let alone at files that look perfectly fine but are, in fact, zero-day attacks. That is where machine learning comes in: it offers a practical way to search across massive numbers of files for hidden malware.

Our suspenders, machine learning and functional similarity analysis algorithms (let's call them the right and left suspenders), classify files and malware on a massive scale, giving the security team a powerful assist in finding and prioritizing "files of interest."

Our left suspender is the ReversingLabs Functional Hashing Algorithms (RHA). RHA intelligently hashes a file's features rather than its bits, creating a massively scalable way to identify functionally similar malware. One RHA hash can identify thousands of functionally similar malware files even though each has a unique SHA-1 hash. A single RHA hash can also identify unknown malware variants based on their similarity to known malware.
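To make the idea of hashing features rather than bits concrete, here is a minimal Python sketch. It is not the RHA algorithm, whose details this post does not describe. The feature names (`imports`, `sections`), the sample byte strings, and the helper `functional_hash` are all hypothetical, chosen only to show why two files with different SHA-1 hashes can still share one feature-based hash.

```python
import hashlib


def bit_hash(data: bytes) -> str:
    """Conventional hash over the raw bytes (SHA-1, as mentioned above)."""
    return hashlib.sha1(data).hexdigest()


def functional_hash(features: dict) -> str:
    """Toy feature-based hash: digest of a canonical, sorted feature listing.

    Hypothetical stand-in for a functional hash; the real RHA feature set
    and precision levels are not modeled here.
    """
    canonical = "|".join(
        f"{name}={','.join(sorted(features[name]))}" for name in sorted(features)
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two made-up variants: the bytes differ, the extracted features do not.
variant_a_bytes = b"header|payload|padding-1111"
variant_b_bytes = b"header|payload|padding-2222"
variant_a_features = {
    "imports": {"CreateFileW", "CryptEncrypt"},
    "sections": {".text", ".rsrc"},
}
variant_b_features = {
    "sections": {".rsrc", ".text"},
    "imports": {"CryptEncrypt", "CreateFileW"},
}
# A made-up unrelated file with a different feature set.
unrelated_features = {"imports": {"MessageBoxW"}, "sections": {".text"}}

same_bits = bit_hash(variant_a_bytes) == bit_hash(variant_b_bytes)
same_function = functional_hash(variant_a_features) == functional_hash(variant_b_features)
unrelated_match = functional_hash(variant_a_features) == functional_hash(unrelated_features)

print("bit hashes match:", same_bits)
print("functional hashes match:", same_function)
print("unrelated file matches:", unrelated_match)
```

Sorting the feature names and values before hashing is the design choice that matters in this sketch: it makes the digest depend on what the file contains, not on byte layout or ordering.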

Our right suspender is machine learning. Our machine learning models evaluate thousands of characteristics to classify new, unknown files by malware type (e.g., ransomware) at very high speed. Our approach to ML differs from that of other vendors in that we train on malware objects from payloads that have been unpacked, extracted and de-obfuscated. That means ReversingLabs trains models that learn to identify new attacks from a payload's internal components, which are not visible to other vendors' ML implementations. The result is excellent coverage, higher accuracy and more detections of new, sophisticated malware attacks.
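The point about training on unpacked, de-obfuscated payloads can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: single-byte XOR stands in for real-world packing and obfuscation, the marker strings are hypothetical, and a simple presence check stands in for the thousands of characteristics a production model would evaluate. It shows only that features extracted from the outer, obfuscated bytes can be empty while the same features extracted from the recovered payload are not.

```python
# Hypothetical marker strings; a real model would use far richer features.
SUSPICIOUS_MARKERS = (
    b"CreateRemoteThread",
    b"VirtualAllocEx",
    b"vssadmin delete shadows",
)


def xor_bytes(data: bytes, key: int) -> bytes:
    """Single-byte XOR, used here as a stand-in for obfuscation."""
    return bytes(b ^ key for b in data)


def marker_features(data: bytes) -> list:
    """Binary feature vector: 1 if a marker appears in the bytes, else 0."""
    return [1 if marker in data else 0 for marker in SUSPICIOUS_MARKERS]


# A made-up payload and its obfuscated form (hypothetical key 0x5A).
payload = b"call VirtualAllocEx; call CreateRemoteThread; run vssadmin delete shadows"
obfuscated = xor_bytes(payload, 0x5A)

# Features seen without unpacking versus after de-obfuscation.
features_outer = marker_features(obfuscated)
features_inner = marker_features(xor_bytes(obfuscated, 0x5A))

print("features from obfuscated bytes:", features_outer)
print("features from recovered payload:", features_inner)
```

A model trained only on the first kind of vector has nothing to learn from; one trained on the second sees the behavior the obfuscation was hiding.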

Expanding on the "belt" part of our analogy, just as many people connect their cell phones to their belts, we also connect a few useful tools to our belt. Examples include dynamic analysis, global file intelligence, and YARA rules. In concert with static analysis and these added tools, ML plays a valuable role, multiplying the eyes of the security analyst by thousands. Files classified by RHA and ML enable security teams to identify the riskiest files. Our "belt" technologies then provide detailed information for analysis, surfaced through static analysis, dynamic analysis and global file intelligence. Using this wealth of information, analysts can not only determine the intent and source of the attack but also pivot to similar malware for analysis and develop new defenses against like attacks via YARA rules. With the ReversingLabs "belt and suspenders" approach, security teams can identify and respond to undetected malware in their infrastructures through integrated threat intelligence, hunting and response capabilities at enterprise scale.

ReversingLabs also keeps up with the latest malware styles on the cyber-attack "fashion runway." We've implemented ML to aid classification of the more than eight million files processed daily by our TitaniumCloud File Intelligence Service. Our models constantly train on the very latest malware that appears in the wild, continuously reducing the risk of both false positives and false negatives.

Security analytics and machine learning are presently over-hyped in the market as the latest answer to all bad things. Beyond the hype, however, these technologies, as part of a sound "belt and suspenders" strategy, are powerful tools to enhance and support the efforts of security teams. ReversingLabs uses these tools and more to provide cutting-edge malware analysis products and tools, which you will undoubtedly hear more about in the coming months. So much for winning any fashion awards.