The web is experiencing explosive growth. New technologies are introduced at a very fast pace with the aim of narrowing the gap between web-based applications and traditional desktop applications. However, these advancements come at a price: the same technologies can also be used to implement web malware that evades common detection techniques. In our article we present some obfuscation techniques based on HTML5 that can be used to deceive popular malware detection systems. The proposed techniques were tested on a reference set of obfuscated malware. Our results show that malware rewritten using our obfuscation techniques goes undetected when analyzed by a large number of detection systems, whereas the same detection systems correctly identify the same malware in its original, unobfuscated form. We also provide some hints about how existing malware detection systems can be enhanced to cope with these new threats.
Our obfuscation techniques leverage some HTML5 APIs to deliver and reassemble malicious code in a web page. As a preliminary phase, the malicious code is split into a series of chunks. The chunks can be arbitrarily small and are individually undetectable. When the victim visits the infected web page, an arbitrarily complex procedure based on HTML5 is executed to reassemble and execute the original malware. This approach avoids the typical (de)obfuscation patterns detected by static and dynamic analysis. In particular, we show three techniques.
The first technique avoids, wholly or in part, the activities related to the deobfuscation phase by delegating them to the web browser internals through the WebSQL API or the IndexedDB API. The idea is to split the malicious code into a series of chunks and to recompose it at runtime, as typically occurs in simple (de)obfuscation routines. The difference here is that each chunk is stored in a table entry in the local browser database. Then, when the attack has to take place, the retrieval and preparation of the malicious code is delegated to the database engine through a properly crafted selection query. The same result can also be achieved by means of the FileReader and Blob APIs.
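The split-and-recompose idea can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `ChunkRow`, `splitIntoChunks`, and `ChunkTable` are hypothetical, and the table is a plain in-memory array standing in for a local browser database, not a real WebSQL or IndexedDB handle. The string being split is a harmless placeholder, and the recomposed text is only compared with the original, never executed.

```typescript
// Hypothetical row shape: one chunk per table entry, keyed by its position.
interface ChunkRow {
  position: number;
  text: string;
}

// Split a string into fixed-size chunks (the last one may be shorter).
function splitIntoChunks(source: string, size: number): ChunkRow[] {
  if (size <= 0 || Math.floor(size) !== size) {
    throw new RangeError("size must be a positive integer");
  }
  const rows: ChunkRow[] = [];
  for (let i = 0; i < source.length; i += size) {
    rows.push({ position: rows.length, text: source.slice(i, i + size) });
  }
  return rows;
}

// In-memory stand-in for a database table (an assumption, not a browser API).
class ChunkTable {
  private rows: ChunkRow[] = [];

  insert(row: ChunkRow): void {
    this.rows.push(row);
  }

  // Plays the role of a selection query that orders rows by position and
  // concatenates them, so the caller contains no reassembly loop of its own.
  selectConcatenated(): string {
    return this.rows
      .slice()
      .sort((a, b) => a.position - b.position)
      .map((row) => row.text)
      .join("");
  }
}

const original = "console.log('harmless placeholder');";
const table = new ChunkTable();

// Insert in reverse order to show that storage order does not matter.
for (const row of splitIntoChunks(original, 4).reverse()) {
  table.insert(row);
}

const recomposed = table.selectConcatenated();
console.log(recomposed === original);
```

In this sketch the ordering and concatenation live inside `selectConcatenated`, which mirrors how the text describes handing the preparation work to the database engine rather than to page-level script.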
Typically, the operations that drive the deobfuscation and execution of malware look harmless individually but harmful when considered as a whole. The distributed preparation technique aims to deceive detection systems by breaking up the execution of the malware code into several simpler pieces that are executed separately in different contexts. Each piece of code executes its part of the attack and then makes the result available to the next piece. This can be achieved by executing independent malware activities in different threads through web workers. Moreover, to further confuse detection systems, the communication pattern followed during the attack is not established statically but decided at runtime: at the end of each step, a function is evaluated to choose which web worker will receive the next message.
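The runtime-decided communication pattern can be sketched as a small simulation. This is not the authors' implementation: the `Stage` objects below are hypothetical stand-ins for web workers, the `run` loop stands in for message passing (which in a browser would go through `postMessage`), and the work done by each stage is a harmless string append chosen only for illustration.

```typescript
// Hypothetical message passed between stages.
interface Message {
  payload: string;
  hops: number;
}

// A stage stands in for one web worker: it does a small piece of work,
// then computes at runtime which stage should receive the result.
interface Stage {
  work(msg: Message): Message;
  next(msg: Message): string | null; // null means the chain is finished
}

const stages: Record<string, Stage> = {
  a: {
    work: (m) => ({ payload: m.payload + "he", hops: m.hops + 1 }),
    // The target depends on the data, not on hard-coded wiring.
    next: (m) => (m.payload.length % 2 === 0 ? "c" : "b"),
  },
  b: {
    work: (m) => ({ payload: m.payload + "o", hops: m.hops + 1 }),
    next: () => null,
  },
  c: {
    work: (m) => ({ payload: m.payload + "ll", hops: m.hops + 1 }),
    next: (m) => (m.hops >= 2 ? "b" : "a"),
  },
};

// Dispatcher loop: forwards each result to whichever stage was chosen.
function run(
  start: string,
  initial: Message,
  maxHops = 10
): { result: Message; route: string[] } {
  let current: string | null = start;
  let msg = initial;
  const route: string[] = [];
  while (current !== null && route.length < maxHops) {
    const stage = stages[current];
    if (stage === undefined) {
      throw new Error(`unknown stage: ${current}`);
    }
    route.push(current);
    msg = stage.work(msg);
    current = stage.next(msg);
  }
  return { result: msg, route };
}

const { result, route } = run("a", { payload: "", hops: 0 });
console.log(route.join(" -> "), result.payload);
```

The point of the sketch is that the route is not visible in the wiring of the stages: it only emerges once each `next` function is evaluated against the data produced so far.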
The user-driven technique is a variant of the distributed preparation technique. Here, the activities related to the preparation and execution of the malware are spread across the time that a victim spends visiting a single page or a collection of pages, rather than being concentrated in a few milliseconds. Moreover, to make the sequence unpredictable, the individual activities are not executed automatically but are triggered by the (unaware) user. The content of the page can be organized so that the victim has to perform an exact sequence of steps in order to enjoy it (e.g., playing a game). By following this sequence, the victim unintentionally drives the execution of the malware.
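The idea of steps that advance only when the user performs a given sequence of actions can be modeled as a small state machine. The sketch below is an assumption-laden illustration: `SequenceGate`, the trigger labels, and the choice to ignore out-of-order events are all hypothetical, the actions only append to a log, and `handle` is called directly, whereas in a real page it would be invoked from event listeners.

```typescript
// One step of the sequence: the user event that unlocks it and a harmless action.
interface Step {
  trigger: string; // hypothetical event label, e.g. "click:start"
  action: () => void;
}

class SequenceGate {
  private index = 0;
  private steps: Step[];

  constructor(steps: Step[]) {
    this.steps = steps;
  }

  // Returns true if the event matched the next expected step and advanced the sequence.
  handle(trigger: string): boolean {
    if (this.index >= this.steps.length) {
      return false;
    }
    const step = this.steps[this.index];
    if (step.trigger !== trigger) {
      return false; // out-of-order events are ignored (a design choice of this sketch)
    }
    step.action();
    this.index += 1;
    return true;
  }

  isDone(): boolean {
    return this.index === this.steps.length;
  }
}

const log: string[] = [];
const gate = new SequenceGate([
  { trigger: "click:start", action: () => log.push("step 1") },
  { trigger: "key:move", action: () => log.push("step 2") },
  { trigger: "click:finish", action: () => log.push("step 3") },
]);

// Simulated user input: the first event arrives too early and is ignored.
const events = ["key:move", "click:start", "key:move", "click:finish"];
for (const e of events) {
  gate.handle(e);
}

console.log(log.join(", "), gate.isDone());
```

Because nothing runs until an event arrives, the time between steps is set by the user rather than by the script, which is the property the paragraph above relies on.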
Using HTML5 to prevent detection of drive-by-download web malware (full paper, paywalled)
Using HTML5 to prevent detection of drive-by-download web malware (preliminary version, free)
HTML5 Obfuscation Examples (a copy of the source code used in our experiments)
Alfredo De Santis
University of Salerno, Italy