Heuristics is the modern anti-malware technology everyone’s talking about and every product claims to have. To some extent this is true. Any antivirus solution worth having uses at least some level of heuristic technology. So why are heuristic techniques important for the future of antivirus protection?
Simply put (and avoiding the etymological discussion), heuristic analysis can be described as a method of estimating the probability that a program that hasn’t been identified as previously known malware is, nevertheless, malicious.
In the modern "threatscape," the sheer volume of newly emergent malware demands at least some level of generic and heuristic detection. For instance, only 25 percent of what we see reported every day (some 110,000+ new samples) can be identified with a "static signature"; the rest is detected using either generic methods (60 percent) or emulator-based heuristics (15 percent).
Malware identification is a balance between two imperatives: the avoidance of false negatives (the scanner fails to detect an infection) and false positives (the scanner detects a virus where none exists). Accuracy in heuristic analysis depends on how aggressively the scoring criteria are set.
Indeed, one approach may be more appropriate than another depending on context. It is arguably less damaging to the customer to have a file falsely blocked or deleted while they are downloading it from a website than to have a critical system file on their desktop operating system falsely detected.
In the future, vendors must take account of such contextual knowledge about the objects under test and adjust their responses accordingly. Heuristic sensitivity is not just a technical issue related to the accuracy of diagnosing the presence of a previously unknown virus, but also one of impact on the customer.
Some of the most persistent myths in computing relate to virus and antivirus (more accurately termed anti-malware) technology. One such myth is that anti-malware software can only detect specific known malware objects by using so-called "signatures" to uniquely identify them. This wasn’t true at the beginning of the industry, as some of the first antivirus programs weren’t intended to detect specific viruses, but rather to detect or block virus-like behavior (technology that is now far more advanced) or suspicious changes in files. And it definitely is not true now.
Commercial anti-malware systems supplement signature scanning with a variety of more generic approaches, often grouped together under the banner of heuristic analysis. Unfortunately, "heuristics" has become a bucket term to describe everything from the application of simple generic rules (e.g. block anything that arrives by email with the double extension .doc.exe) to complex full emulation-based detection systems that search for behavior common to malware.
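To make the "simple generic rule" end of that spectrum concrete, here is a minimal sketch of a double-extension check in Python. The function name and the two extension lists are assumptions chosen for illustration; they are not taken from any particular product.

```python
from pathlib import PurePosixPath

# Hypothetical extension lists, for illustration only.
DOCUMENT_EXTENSIONS = {".doc", ".pdf", ".txt", ".jpg"}
EXECUTABLE_EXTENSIONS = {".exe", ".scr", ".com", ".bat"}


def has_deceptive_double_extension(filename: str) -> bool:
    """Return True if the name ends in a document extension followed by
    an executable one, e.g. 'invoice.doc.exe'."""
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    if len(suffixes) < 2:
        return False
    # Only the last two suffixes matter: the "disguise" and the real type.
    return (suffixes[-2] in DOCUMENT_EXTENSIONS
            and suffixes[-1] in EXECUTABLE_EXTENSIONS)
```

A rule like this knows nothing about any specific piece of malware; it flags a naming trick that malware often uses, which is why it counts as generic rather than signature-based.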
Heuristic programming is usually regarded as an application of artificial intelligence to problem solving. As it is used in the management of malware (or indeed, spam and related nuisances), heuristic analysis also has a more restricted meaning as a rule-based approach to diagnosing a potentially offending file (or message in the case of spam).
As the analyzer checks against criteria that indicate possible malware, it assigns score points when it locates a rule match. If the score meets or exceeds a threshold, the file is flagged as suspicious (or potentially malicious or spammy) and processed accordingly.
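The score-and-threshold scheme just described can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the rule names, the point values, the threshold of 70, and the idea that a file arrives as a dictionary of pre-extracted boolean features. A real analyzer would derive such features from the file itself or from emulation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Features = Dict[str, bool]


@dataclass(frozen=True)
class Rule:
    name: str
    points: int
    matches: Callable[[Features], bool]


# Hypothetical rules and weights, for illustration only.
RULES: List[Rule] = [
    Rule("double_extension", 40, lambda f: f.get("double_extension", False)),
    Rule("packed_executable", 30, lambda f: f.get("packed", False)),
    Rule("writes_to_startup", 50, lambda f: f.get("writes_to_startup", False)),
]

SUSPICION_THRESHOLD = 70  # assumed value; lower means more aggressive


def score_file(features: Features) -> Tuple[int, List[str]]:
    """Add up the points of every rule that matches the file."""
    matched = [rule for rule in RULES if rule.matches(features)]
    return sum(rule.points for rule in matched), [rule.name for rule in matched]


def is_suspicious(features: Features,
                  threshold: int = SUSPICION_THRESHOLD) -> bool:
    """Flag the file if its score meets or exceeds the threshold."""
    score, _ = score_file(features)
    return score >= threshold
```

Lowering the threshold catches more unknown malware at the cost of more false positives, and raising it does the reverse, which is the trade-off between the two imperatives discussed earlier.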
In a sense, the goal of heuristics is to apply human-like analysis to an object. In the same way that a human malware analyst would try to determine the process of a given program and its actions, heuristic analysis performs the same intelligent decision-making process, effectively acting as a virtual malware researcher. As more is learned about emerging threats, that knowledge can be applied to the heuristic analyzer through programming, improving future detection rates.
— Andrew Lee, CRO ESET
This blog is part of Internet Evolution’s IT Clan, which addresses the continuing impact of the Internet on enterprise networks, applications, and management.