On April 14, 2015, Microsoft released security bulletin MS15-033. The release included a fix for a memory corruption error in the SmartTag parsing library of MS Office 2007/2010/2013. This vulnerability was assigned CVE-2015-1641.
I won't rehash the technicalities here, as others have already done a thorough job documenting the specifics (see here and here). The important part for the purpose of this discussion is the makeup of the exploit document itself and, perhaps more so, that of the encoded binary data used to hide the intended malware payload.
Note: if you want to follow along, here is a sample (hash: aafc9a94b1676172dfc55ae5660b5263888c603821b73b6d2540a7578b67c431) which exhibits the described behaviour.
I examined dozens of documents which exploit this vulnerability and they appear to have consistent structures, suggesting that either they are being built with a kit or builder of some sort or that they are all being constructed via modification of an initial in-the-wild example or proof of concept.
The documents are RTF files containing four objects: a reference to an OLE control (otkloadr.dll - see here for why) and three Microsoft Word document objects. In addition, a large chunk of binary data is appended to the end of the file, after the valid RTF structure ends.
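To pull that appended chunk out for inspection, something like the following rough sketch can help. It assumes the valid RTF structure ends at the closing brace of the outermost group, and it only accounts for backslash-escaped braces; it does not handle \bin data, and a deliberately malformed document may need a manual offset instead. The function name is my own, for illustration only.

```python
def find_rtf_overlay(data: bytes) -> bytes:
    """Return the bytes that follow the outermost RTF group, or b"" if none are found.

    Assumption: the RTF structure ends where the brace depth returns to zero.
    """
    depth = 0
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0x5C:        # backslash: skip the next byte (covers \{, \} and \\)
            i += 2
            continue
        if byte == 0x7B:        # '{' opens a group
            depth += 1
        elif byte == 0x7D:      # '}' closes a group
            depth -= 1
            if depth == 0:
                return data[i + 1:]
        i += 1
    return b""
```

Reading the sample in binary mode and passing its contents to find_rtf_overlay() should return whatever trails the document body, which can then be examined on its own.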
One of the Word documents is responsible for triggering the vulnerability, and another contains a large binary component which is used in a heap spray. This heap spray data contains a ROP chain and a first stage shellcode. The first stage shellcode locates and executes the second stage shellcode from the large binary chunk appended to the file, and this second stage is responsible for decrypting itself.
|Second stage shellcode, pre-decryption|
|Second stage shellcode, decrypted, showing ROR7 hash based API lookups|
(resolution of hashes to API calls using FLARE team ShellCode Hashes ida-python plugin)
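For readers who have not met hash-based API lookups before, here is a minimal sketch of the general idea: the shellcode stores a 32-bit hash instead of an API name, and the analyst recovers the name by hashing candidate export names until one matches. The exact variant used in these samples (order of rotate and add, case folding, whether the terminating null is included) is not something I have confirmed here, so treat the rotate-right-by-7-then-add form below as an assumption. The function names are mine.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror7_hash(name: bytes) -> int:
    """Hash an export name: rotate the accumulator right by 7, then add each byte.

    Assumption: this is one common ROR7 variant, not necessarily the exact
    routine found in the samples.
    """
    acc = 0
    for ch in name:
        acc = (ror32(acc, 7) + ch) & 0xFFFFFFFF
    return acc


def build_hash_table(names):
    """Map hash value -> API name for a list of candidate export names."""
    return {ror7_hash(n.encode("ascii")): n for n in names}
```

Given a table built from the export names of kernel32.dll and friends, each constant seen in the disassembly can be looked up directly, which is essentially what the plugin automates.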
Once decrypted, this shellcode is responsible for some key actions:
- Locate, decrypt, and execute the malware binary payload.
- Patch some key bytes in the registry to mask the MS Word crash caused by the exploit.
- Locate, decrypt and display the decoy document.
The malware payload and decoy document are both contained inside the large binary segment appended to the end of the RTF file. The steps for locating and decrypting both the malware and the decoy are also found inside this second stage shellcode:
|Second stage shellcode, locating and decrypting malware payload|
Finally, using similar techniques, the shellcode extracts everything between the 0xBB markers and a run of 0xBC bytes. This block of data is the encrypted decoy document, and it is likewise decrypted by XORing each DWORD, this time with 0xBAADF00D as the key.
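The DWORD-wise XOR can be reproduced offline with a few lines of Python. This is a sketch under a couple of assumptions: DWORDs are read little-endian (as x86 shellcode would), every DWORD is XORed unconditionally, and any trailing bytes that do not fill a whole DWORD are left untouched. If the real routine skips certain values, the output will differ in those spots.

```python
import struct

DECOY_XOR_KEY = 0xBAADF00D  # key for the decoy block, as described above


def xor_dwords(blob: bytes, key: int) -> bytes:
    """XOR each little-endian 32-bit word of blob with key.

    Assumption: leftover bytes at the end (fewer than 4) are passed through
    unchanged.
    """
    usable = len(blob) - (len(blob) % 4)
    count = usable // 4
    words = struct.unpack("<%dI" % count, blob[:usable])
    decoded = struct.pack("<%dI" % count, *(w ^ key for w in words))
    return decoded + blob[usable:]
```

Since XOR is its own inverse, calling xor_dwords() a second time with the same key restores the original bytes, which makes for a handy sanity check.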
In summary, the overall high-level structure of this binary data segment is:
Stage2_SC | 0xBA's | (enc) Payload | 0xBB's | (enc) Decoy | 0xBC's
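Carving a segment along those markers can be sketched as below. The minimum run length of eight bytes is a placeholder of mine, not a value taken from the samples, and the approach will misfire if an encrypted block happens to contain (or begin or end with) a long enough run of a marker byte. The pieces come back still encrypted.

```python
import re

MIN_RUN = 8  # hypothetical minimum marker run length; tune per sample


def split_segment(segment: bytes, min_run: int = MIN_RUN) -> dict:
    """Split the segment into stage 2 shellcode, encrypted payload and encrypted decoy.

    Assumption: each marker is a run of at least min_run identical bytes,
    appearing in the order 0xBA, 0xBB, 0xBC.
    """
    parts = []
    pos = 0
    for marker in (0xBA, 0xBB, 0xBC):
        pattern = re.compile(re.escape(bytes([marker])) + b"{%d,}" % min_run)
        match = pattern.search(segment, pos)
        if match is None:
            raise ValueError("marker run 0x%02X not found" % marker)
        parts.append(segment[pos:match.start()])
        pos = match.end()
    return {"stage2": parts[0], "payload": parts[1], "decoy": parts[2]}
```

Each returned block can then be written to disk and decrypted separately.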
An important fact to note here is that while the CVE-2015-1641 vulnerability itself is new, there have been (and likely will continue to be) numerous vulnerabilities in MS Office handling of documents (in particular RTF files). During the analysis of samples for this post, it became apparent that the second stage shellcode found in the examined CVE-2015-1641 exploit documents has been reused frequently, and simply shoe-horned into the RTF exploit du jour.
Numerous samples for CVE-2014-1761 and CVE-2012-0158, for example, use the same format of Stage2_SC + Marker + MalwarePayload + Marker + Decoy + Marker in their malicious document payloads. As such, the script referenced above should work just as well against other RTF-based exploit documents which happen to use this same binary data segment format.