Sunday, 24 January 2016

NanoLocker - Ransomware analysis

I recently came across a ransomware variant that was new to me, which goes by the name 'NanoLocker'.  Other than some very general posts such as this one by Symantec, there doesn't seem to be much published analysis of this ransomware, so I decided to reverse-engineer a few samples in order to understand how it works.

The detailed analysis is presented below, but I will mention upfront that I was able to build a decryption tool for files encrypted with this ransomware which will work in certain specific circumstances.  The details of these required preconditions are outlined below.

Note on Samples

The samples I examined for this post were of two different (yet functionally similar) versions of NanoLocker:


NanoLocker takes a similar functional approach to that of other common ransomware variants.  This malware leverages the Windows CryptAPI for its crypto routines. The cryptographic design is mostly solid: files are encrypted using an AES-256 key generated at run time, although this key is initialized with a default (null) IV.  After use, the AES key is obfuscated slightly and then encrypted using an RSA public key which is imported from a base64 string hardcoded into the binary. The encrypted key is then base64 encoded and displayed to the user so that it can be transmitted to the malware author via a public note attached to the Bitcoin payment.  We will refer to this final base64 encoded key material as MK.

The main design flaw in this ransomware is similar to that of early TeslaCrypt/AlphaCrypt: the AES key is written to a file on disk, and this key is left untouched until the encryption process is complete.  The implication is that any interruption of the encryption process (e.g. by entering hibernation, powering down, etc.) can leave the symmetric encryption key available on disk.

The decryption tool I have written capitalizes on this design choice. It offers a method of decryption for victims who either captured a copy of the key file while NanoLocker was in the midst of encrypting their data, or interrupted the process via shutdown/hibernate and then managed to acquire the key file from disk.

Execution Details of Interest

State Tracking

NanoLocker creates a file on disk which is used as a tracking mechanism for its state, key information, and file target list.  In the analyzed samples the filename used for this is %LOCALAPPDATA%\lansrv.ini - and this file is created with the hidden attribute set.

The first byte of this file is used to hold an integer which represents the current execution state.  This state tracking value is increased as the malware progresses with its various functions.  There are three possible states for the tracking byte:

  1. Initialization & Enumeration - symmetric key has been created and is stored in plaintext format [44 bytes] in the tracking file starting at byte offset 0x02.  All locally mapped drives are enumerated for filenames matching the targeted extensions, and these filenames are written to the tracking file.
  2. Encryption - Encryption of the files listed in the tracking file begins once this state is reached.
  3. Encryption Complete - all targeted files have been encrypted. Once in state 3, the raw symmetric key is replaced with the bitcoin payment address followed by MK. At this point the NanoLocker splash screen is displayed to the user.

During initialization the AES key is generated and written to the file along with the state 1 marker as shown in the following pseudocode from the unpacked binary:

Once in state 2, and until all discovered files are encrypted, the tracking file (lansrv.ini) holds the exported key data which was written in the above call to WriteFile.  Capturing the tracking file at this point will reveal the state flag, key data, and a list of targeted files:
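As a rough illustration of the layout described above, the following Python sketch reads the state byte and carves the 44 bytes of key material from offset 0x02. It is not taken from the decryption tool. The helper name is hypothetical, and the split of the 44 bytes into a 12-byte header plus a 32-byte key is an assumption based on the size of a CryptoAPI PLAINTEXTKEYBLOB for AES-256; the post itself only states the offset and length.

```python
import struct

KEY_OFFSET = 0x02      # from the post: key material starts at byte offset 0x02
KEY_BLOB_LEN = 44      # from the post: 44 bytes of key material
# ASSUMPTION: the 44 bytes are a PLAINTEXTKEYBLOB, i.e. an 8-byte BLOBHEADER
# (type, version, reserved, algorithm id) and a 4-byte key length, then the key.
BLOB_HEADER = struct.Struct("<BBHII")


def parse_tracking_file(data: bytes) -> dict:
    """Return the state and, for states 1 and 2, the raw AES key (hypothetical helper)."""
    if len(data) < KEY_OFFSET + KEY_BLOB_LEN:
        raise ValueError("tracking file is too short to hold key material")

    state = data[0]
    if state in b"123":            # ASSUMPTION: tolerate an ASCII digit as well as a raw value
        state -= ord("0")
    if state not in (1, 2, 3):
        raise ValueError(f"unexpected state byte: {data[0]:#04x}")
    if state == 3:
        # In state 3 the raw key has been replaced by the Bitcoin address and MK.
        return {"state": 3, "key": None}

    blob = data[KEY_OFFSET:KEY_OFFSET + KEY_BLOB_LEN]
    _btype, _version, _reserved, _alg_id, key_len = BLOB_HEADER.unpack_from(blob)
    key = blob[BLOB_HEADER.size:]
    if key_len != len(key):
        raise ValueError("key material does not match the assumed blob layout")
    return {"state": state, "key": key}
```

If the assumed blob layout does not match a given sample, the length check fails loudly rather than returning a wrong key.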

Finally, once the targeted files have been encrypted the state 3 flag is written to the tracking file.  At this point the key data is replaced with the bitcoin address followed by the base64 encoded, public key encrypted symmetric key (MK):


A common action for ransomware threats is to transmit the key material over the network to an attacker controlled command server.  NanoLocker on the other hand takes a minimalistic approach to network communication - it transmits only two ICMP packets out to the C2 server.  

Once the malware unpacks itself in memory it carries out some initialization steps (dropping a copy of itself to disk, setting a persistence mechanism, etc).  The next step is to submit a ping to the command and control server.  These ping packets might appear at first glance to be typical echo request packets, but as the code below reveals, there is something else being sent:

As we can see above, the call to IcmpSendEcho uses the ransomware bitcoin address as the data to submit with the echo request.  We can see this in action if we capture the packets going out from the infected system:

The second ICMP packet sent by NanoLocker occurs once the encryption process has completed, immediately after state 3 is reached.  This packet, similar to the first, uses the data bytes of the ICMP message to send the bitcoin address, and it also appends to this the total number of files that were encrypted during state 2:
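Because both packets carry the Bitcoin address in the ICMP data bytes, one possible detection idea is to flag echo requests whose payload begins with something shaped like a Bitcoin address. The Python sketch below only checks the shape of the leading bytes. The function name is hypothetical, the pattern covers legacy Base58 addresses only, and how the file count is delimited in the second packet is not assumed here.

```python
import re

# Legacy Base58 Bitcoin addresses are 26-35 characters long, start with 1 or 3,
# and use the Base58 alphabet (which omits 0, O, I and l).
BTC_ADDRESS_PREFIX = re.compile(rb"[13][1-9A-HJ-NP-Za-km-z]{25,34}")


def payload_starts_with_btc_address(payload: bytes) -> bool:
    """True if the ICMP data bytes begin with a Bitcoin-address-shaped string.

    This is a shape check only: it does not verify the Base58Check checksum,
    and any digits appended after the address (such as a file count) may be
    absorbed by the pattern, so it should not be used to extract the address.
    """
    return BTC_ADDRESS_PREFIX.match(payload) is not None
```

For comparison, the default payload of the Windows ping utility is the alphabet string `abcdefghijklmnopqrstuvwabcdefghi`, which this check rejects.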

Decryption Tool

As described above, if the tracking file can be captured during either state 1 or 2 (through interruption of the encryption process or otherwise), the symmetric key can be extracted and used to decrypt any files that may have already been encrypted.  

Admittedly, for most standalone Windows systems, capturing this tracking file in either of its first two states may be operationally infeasible.  This is simply a function of the relatively small number of files present and the speed of encryption on such a system.  However, in larger environments with huge distributed file systems, such as those found in modern enterprise networks, it may be possible to detect NanoLocker-encrypted files prior to the completion of the encryption phase (state 2).  Such a scenario is more likely due to the increased time required to encrypt tens or hundreds of thousands of files.  Encryption at these scales can take several hours or possibly even days for larger file systems.

The usage of the tool is as follows:
NanoLocker_Decryptor.exe <encrypted_file> <output_file> <tracking_file>
This decryption tool will examine the encrypted source file and the provided tracking file to validate two required conditions:
  1. that the tracking file is in either state 1 or state 2, and 
  2. that the encrypted source file was encrypted using the key found in the provided tracking file
The second check is possible thanks to another design choice made by the creator of NanoLocker.  Each encrypted file has a header prefixed to the actual encrypted bytes of the original file.  This header contains a checksum value which can be used to validate that the key we are trying to decrypt with is the same one that was used to encrypt the file initially.

The source code is available on GitHub here.  A precompiled version is available here.  For the precompiled version, you will need the 32-bit Visual C++ 2013 redistributable if you don't already have it installed.  It is available from Microsoft here.

Final Comments

I avoided describing the details of unpacking and disassembling this ransomware.  In a follow-up post I will document the steps required to carry out this unpack in a tutorial fashion.  However it is worth pointing out that an unpacked copy of this particular malware can be easily dumped from memory using Volatility. 

First determine the process id using pslist :

Then dump the process to disk, using the fix-up option (-x):

Then load into your favourite disassembler:

Monday, 26 October 2015

RTF Exploit Document Extraction


On April 21, 2015, Microsoft released MS15-033.  In this patch release was a fix for a memory corruption error present in the SmartTag parsing library of MS Office 2007/2010/2013.  This vulnerability was assigned CVE-2015-1641.

I won't rehash the technicalities here, as others have already done a thorough job documenting the specifics (see here and here).  The important part for the purpose of this discussion is the makeup of the exploit document itself, and perhaps more so that of the encoded binary data used to hide the intended malware payload.

Note: if you want to follow along, here is a sample (hash: aafc9a94b1676172dfc55ae5660b5263888c603821b73b6d2540a7578b67c431) which exhibits the described behaviour.

RTF Document

I examined dozens of documents which exploit this vulnerability and they appear to have consistent structures, suggesting that either they are being built with a kit or builder of some sort or that they are all being constructed via modification of an initial in-the-wild example or proof of concept.

The documents are RTF format containing 4 objects: a reference to an OLE control (otkloadr.dll - see here for why) and 3 Microsoft Word document objects. As well, there is a large binary data chunk appended to the end of the file, after the valid RTF structure ends.

One of the Word documents is responsible for triggering the vulnerability, and another contains a large binary component which is used in a heap spray.  This binary data contains a ROP chain and a first stage shellcode. The first stage shellcode locates and executes the second stage shellcode from the large binary chunk, and this second stage is responsible for decrypting itself.

Second stage shellcode, pre-decryption
Second stage shellcode, decrypted, showing ROR7 hash based API lookups
(resolution of hashes to API calls using FLARE team ShellCode Hashes ida-python plugin)
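For readers unfamiliar with hash-based API lookups, the Python sketch below shows one common rotate-right-by-7 formulation: rotate the running 32-bit hash right by seven bits, then add the next byte of the export name. Whether this shellcode uses exactly this variant (rotate before or after the add, case folding, inclusion of the terminator) is an assumption, so the function names and the resulting values are illustrative only.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror7_hash(name: bytes) -> int:
    """One common ROR7 additive hash over an export name (ASSUMED variant)."""
    h = 0
    for byte in name:
        h = (ror32(h, 7) + byte) & 0xFFFFFFFF
    return h


def build_lookup(export_names):
    """Map hash -> name so hashes seen in shellcode can be resolved back to APIs."""
    return {ror7_hash(n): n for n in export_names}
```

A table built from the export names of a DLL can then be used to label the constants that appear in the disassembly, which is essentially what the IDA plugin automates.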

Once decrypted, this shellcode is responsible for some key actions:

  1. Locate, decrypt, and execute the malware binary payload.
  2. Patch some key bytes in the registry to mask the MS Word crash (pursuant to the exploit)
  3. Locate, decrypt and display the decoy document. 

The malware payload and decoy document are both contained inside the large binary segment appended to the end of the RTF file.  The steps for locating and decrypting both the malware and the decoy are also found inside this second stage shellcode:

Second stage shellcode, locating and decrypting malware payload
The shellcode searches for a sequence of 0xBA bytes, and empirical evidence has shown that the binary chunk typically contains 7 consecutive 0xBA bytes to mark the start of the encrypted payload (although the shellcode does not explicitly require all seven).  The shellcode then decrypts each double word of data by XOR'ing it with the value 0xCAFEBABE and moves it to a buffer.  The end marker of 0xBBBBBBBB is sought, and again in practice a consecutive run of seven 0xBB bytes is typically seen in the samples.

Finally, using similar techniques, the shellcode extracts everything between the 0xBB markers and a run of 0xBC bytes.  This block of data is the encrypted decoy document, and it is similarly decrypted using an XOR operation against each DWORD, this time using 0xBAADF00D as the XOR key.

In summary then we can see that the overall high level structure of this binary data segment is:

Stage2_SC | 0xBA's | (enc) Payload | 0xBB's| (enc) Decoy | 0xBC's
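The layout above can be carved with a few lines of Python. The following is a minimal sketch and not the script linked from this post: the function names are hypothetical, a run of four marker bytes is assumed to be enough to identify each marker, and the XOR keys are assumed to be applied to little-endian DWORDs (as they would be on x86). A naive marker search like this can misfire if the encrypted data happens to contain the same byte runs.

```python
import struct

PAYLOAD_KEY = 0xCAFEBABE   # from the post
DECOY_KEY = 0xBAADF00D     # from the post


def xor_dwords(data: bytes, key: int) -> bytes:
    """XOR each little-endian DWORD with `key`; a trailing partial DWORD is left as-is."""
    usable = len(data) - len(data) % 4
    words = struct.unpack(f"<{usable // 4}I", data[:usable])
    return struct.pack(f"<{usable // 4}I", *(w ^ key for w in words)) + data[usable:]


def _find_marker(blob: bytes, value: int, start: int, min_run: int) -> int:
    pos = blob.find(bytes([value]) * min_run, start)
    if pos < 0:
        raise ValueError(f"marker run of {value:#04x} not found")
    return pos


def _skip_run(blob: bytes, pos: int, value: int) -> int:
    while pos < len(blob) and blob[pos] == value:
        pos += 1
    return pos


def carve(blob: bytes, min_run: int = 4):
    """Return (payload, decoy) from Stage2_SC | 0xBA's | payload | 0xBB's | decoy | 0xBC's."""
    payload_start = _skip_run(blob, _find_marker(blob, 0xBA, 0, min_run), 0xBA)
    bb = _find_marker(blob, 0xBB, payload_start, min_run)
    decoy_start = _skip_run(blob, bb, 0xBB)
    bc = _find_marker(blob, 0xBC, decoy_start, min_run)
    payload = xor_dwords(blob[payload_start:bb], PAYLOAD_KEY)
    decoy = xor_dwords(blob[decoy_start:bc], DECOY_KEY)
    return payload, decoy
```

A quick sanity check on real samples is that the carved payload starts with the `MZ` magic of a PE file.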

Obviously this process is far too tedious for manual repetition, so I constructed a rudimentary script to seek, extract, and dump both the malware payload and decoy documents to disk.  It can be grabbed from github here.

Concluding Remarks

In summary, an important fact to note here is that while the CVE-2015-1641 vulnerability itself is new, there have been (and likely will continue to be) numerous vulnerabilities in MS Office handling of documents (in particular RTF files).  During the analysis of samples for this post, it became apparent that the second stage shellcode found in the examined CVE-2015-1641 exploit documents has been reused frequently, and simply shoe-horned into the RTF exploit du jour.

Numerous samples for CVE-2014-1761 and CVE-2012-0158 for example are using the same format of Stage2_SC + Marker + MalwarePayload + Marker + Decoy + Marker in their malicious document payloads.  As such, the script referenced above is going to work perfectly well against other RTF based exploit documents which happen to utilize this same binary data segment format.

Friday, 11 September 2015

How to fail at ransomware


Ransomware sucks. It is annoying and persistent, and apparently very lucrative. For this reason alone we can expect, and indeed find, that even the unskilled will try their hand at developing a ransomware payload. What better low-cost-of-entry method than PowerShell?  The only problem is (at least for the samples described below) that using deterministic symmetric encryption sorta, well, renders the entire encryption process trivially reversible...


I had originally been taking a look at this CHM dropper [md5: 21e1a87bd0418381a1bc8df088ee5c91] which carries out a download and execute of a powershell based "ransomware" script [md5: be03eb109cab04a1a70b5bbc7b22949e].

The download and execute from the CHM:

CHM dropper referencing

Colorful domain names notwithstanding, we can see the simplicity here. Looking at the actual ransomware script reveals a very basic .Net approach to enumerating all drives on the system, and then systematically recursing for a laundry list of various file extensions.  Each matching file is then sent through the encryption routine.

I have cleaned up some of the variable names for legibility here, but obfuscation can't really mask the issues with this one.

In a more legitimate attempt to effect a crypto based ransomware ploy, one would need to encrypt the victim data asymmetrically.  There are a few reasonable ways to go about this, but unfortunately the author of this script decided to go another way entirely - the way which uses symmetric encryption with a completely deterministic key and IV.

I had originally written a simple decryption script for the corresponding PowerShell encryption script, but after re-checking the download site a couple of days later it appears that the script was/is being modified (reference md5: 9fe45fc4c402932248cd2c26b65f883d). The revision has not improved the overall sophistication of the encryption, however; it has merely changed the salt value used to generate the Rijndael encryption key. I updated my decryption script (link at bottom) to accept a user-supplied salt value, which can be extracted from line 7 of the original encryption script [see update below]:

Another interesting change is that the 'decryption instructions' HTML file that gets written to the victims' hard drive has been completely redesigned and is an obvious style ripoff of the Cryptolocker ransom instruction page:

Original decryption instructions from be03eb109cab04a1a70b5bbc7b22949e

New version from 9fe45fc4c402932248cd2c26b65f883d

These instruction files also reference an ID# which is obviously supposed to lend support to the idea that there is some asymmetric encryption taking place which requires a specific private key to decrypt.  It is possible that this ID is somehow deterministic, although it is definitely not machine based - I tried downloading the script from multiple different IP addresses using different operating systems on a single day and received an identical copy of the script having the same ID# each time.  It may be date based, and I will update this post with any additional information that comes out of future checks.

Why you never pay the ransom

Further examination of the encryption script shows that any files less than 42871 bytes in size will be encrypted in their entirety, but those above or equal to this size will have only the first 42871 bytes encrypted.  

I would almost pay the ransom just to know why the author chose this arbitrary-seeming value of 42871 bytes.  The net result of this choice, combined with the 128-bit block size and the choice of zero padding, is that for any file of 42871 or more bytes, the original file will have 9 bytes of unrecoverable data.

Whenever the file length is 42871 or more bytes, the script reads in the first 42871 bytes, encrypts them, and pads the result out to the next 16-byte (128-bit) boundary, which is 42880 bytes. The destructive flaw in the script is that the original file is overwritten in place, so the nine bytes at offsets 42871 through 42879 (counting from zero) are overwritten by the tail of the final padded block.  Upon decryption these bytes cannot be recovered, as they were never encrypted but simply overwritten.
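The arithmetic is easy to check. The short Python snippet below rounds the 42871-byte read up to the 16-byte block size and lists which part of the original file the extra ciphertext lands on, counting offsets from zero. The variable names are mine; only the two constants come from the script.

```python
CHUNK = 42871   # bytes read and encrypted by the script
BLOCK = 16      # 128-bit block size


def padded_length(n: int, block: int = BLOCK) -> int:
    """Round n up to the next multiple of the block size."""
    return -(-n // block) * block


cipher_len = padded_length(CHUNK)
clobbered = range(CHUNK, cipher_len)   # offsets overwritten but never encrypted

print(cipher_len)                        # 42880
print(len(clobbered))                    # 9
print(clobbered.start, clobbered[-1])    # 42871 42879
```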

Thus - even if you actually had to pay the ransom - it would only recover files which were less than 42871 bytes in size.

In case you were infected with this, I wrote a simple decryption script to recover the files subject to the restrictions described above.  It can be found here.

Update 09/17/2015

New variants of this crypto script are now attempting to delete any existing volume shadow copies prior to executing the encryption routine:

$hdGhncrThhsHhjs = Get-WmiObject Win32_ShadowCopy
ForEach($QhdThscGhsjdR in $hdGhncrThhsHhjs) {

Regarding the salt used in the decryption script, these recent variations have modified the location of the salt from where it had previously existed at line 7 of the crypto script. The salt required by the decryption script is set by the call to GetBytes and stored in a variable used as the second argument to Security.Cryptography.Rfc2898DeriveBytes.
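To see why a fixed password and salt make the whole scheme reversible, note that Rfc2898DeriveBytes is PBKDF2 with HMAC-SHA1, so anyone holding the same inputs derives the same key material. The Python sketch below illustrates this; it is not the decryption script linked above. The password and salt values are placeholders, the iteration count of 1000 is the .NET default rather than something confirmed from the samples, and the split into a 32-byte key followed by a 16-byte IV assumes the script makes two consecutive GetBytes calls in that order.

```python
import hashlib


def derive_key_iv(password: bytes, salt: bytes, iterations: int = 1000):
    """Derive (key, iv) the way consecutive GetBytes(32) / GetBytes(16) calls would.

    ASSUMPTIONS: HMAC-SHA1 PBKDF2 (what Rfc2898DeriveBytes implements by default),
    the default iteration count, and a 32-byte key followed by a 16-byte IV.
    """
    stream = hashlib.pbkdf2_hmac("sha1", password, salt, iterations, dklen=48)
    return stream[:32], stream[32:]


# Placeholder inputs: the real values must be read out of the encryption script.
key_a, iv_a = derive_key_iv(b"hypothetical-password", b"hypothetical-salt")
key_b, iv_b = derive_key_iv(b"hypothetical-password", b"hypothetical-salt")
print(key_a == key_b and iv_a == iv_b)   # True: same inputs, same key and IV
```

Changing only the salt, as the newer variants do, changes the derived key but not the fact that it can be recomputed from values sitting in the script.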

Friday, 17 July 2015

Kovter analysis [part 1]


Fileless malware has been gaining increased attention in the malware forensics community as of late. Accordingly, I have been paying particular attention to indicators and forensic analysis of threats such as Poweliks. These malware variants typically leverage the Windows registry to maintain persistence, and they avoid leaving executable files on disk.  I recently had an encounter with one such malware family - Kovter.

Kovter was originally discovered as a particularly nasty type of ransomware, but has recently been adapted to instead cash in via ad/click fraud.

In the sections below I will walk through some basic static analysis of one such sample.  Additional analysis of later stages of this malware will follow in another writeup.

In case you want to follow along, the sample being analysed in this discussion is: 6ca41538ae9c25b259e6fcfce565b89b (many thanks to Kafeine for the sample).

Initial Infection

Upon execution of the initial infector, several checks are performed to look for security tools such as Fiddler and Wireshark, as well as for standard indicators of virtual environments (registry keys common to VirtualBox, Sandboxie, and VMware). If none are found, the malware proceeds to create a random looking registry key (using hex chars) under HKCU\Software and HKLM\Software.  Within this parent key are six values which are also named using apparently random hex characters. These registry values and their parent key appear to be named in a deterministic manner, likely generated on characteristics of the victim system (such as product id, guid, etc).  One of these registry values contains obfuscated javascript which ultimately unpacks and executes shellcode.  This shellcode is responsible for decrypting the data stored in a second registry value and executing it.  Interestingly, the encryption key for this activity appears to be generated uniquely upon execution of the initial infection binary.

Fileless Persistence

After initial infection, the run key shown below in Figure 1 will be present.

Figure 1: Run key

This run key makes use of the mshta.exe application to execute a few simple javascript commands. Deobfuscated, this became (in the case under examination):
y=new ActiveXObject("WScript.Shell"); 

*Note that this particular malware will write to both HKLM and HKCU if it is able.  The content written is identical in either case.

 Taking a look at the HKCU\Software\2efd7e07 key revealed the following:

Figure  2: HKCU\Software\2efd7e07 key content
The run key ultimately ran a javascript eval() command on the contents of registry value 7b5fa1aa. A closer look at this particular value revealed the block shown in Figure 3:

Figure 3: Obfuscated javascript block
Some quick JSDetox'ing and variable renaming yielded the following:

Figure 4: Deobfuscated javascript
The content of the encodedBlock variable was snipped to aid readability. The ultimate goal of this javascript code is to XOR decode the content of this variable and then eval() it, and the actual decoded version of encodedBlock looks like this:

Figure 5: Script to execute
Again I have truncated some content to ease reading.  In this case there is a large base64 block which is being converted and then executed via powershell.exe. Naturally this base64 decoded text is just a powershell script. Figure 6 below depicts the version I cleaned up to a readable state.

Figure 6: Powershell script to launch shellcode
Variable $sc32 in the powershell script contains (unsurprisingly) shellcode to be executed from memory. The method of invocation follows a pretty standard execution flow including calls to VirtualAlloc using PAGE_EXECUTE_READWRITE permissions and then CreateThread invoking the shellcode as a function from a (now) executable memory page.  Note also that it passes the memory address of the shellcode to itself as a parameter.

Shellcode Analysis


Once the shellcode is loaded, the overall actions are pretty standard, with a couple of interesting exceptions.

Overall steps:
  1. Locate Kernel32.dll by hashing the BaseDllName member of each entry in the InMemoryOrderModuleList of the current process.
  2. Locate the offsets for LoadLibraryA, GetProcAddress, VirtualAlloc, and ExitProcess via ordinal lookups
  3. Load advapi32.dll and then do an ordinal lookup for RegOpenKeyExA, RegQueryValueExA
Things get interesting at this point. With each execution of the initial infector, a unique registry key name is chosen to store the various values shown above in Figure 2. In the excerpt below you can see that the shellcode we have extracted (from the $sc32 variable in Figure 6) is actually referencing a location at offset 0xAD8 from its base address, and this location holds a string equal to the name of the registry key in which the shellcode is ultimately stored:

Figure 7: Loading the string variable matching the registry key
This suggests that the shellcode is modified during initial infection to correctly reference the registry values it will eventually be stored in.

Figure 8: Reference to randomly generated key
Before continuing from step 3 above, we have to locate a particular (also randomly named) registry value located within this key, and this one is also explicitly named within the shellcode:

Figure 9: Registry value which contains next stage, encrypted shellcode
  4. Load the content of this registry value (HKCU\Software\2efd7e07\fecae03a in our example) into memory.
Again, we find more evidence that the shellcode has been patched during initial infection, as this registry data is actually an RC4 encrypted executable. The length of the RC4 key, as well as the key itself, are pulled directly from the shellcode, and these values are unique for each execution of the initial infector*:

Figure 10: RC4 key length

Figure 11: RC4 key

This key is then copied (using an included 'memcpy' function) to a local variable for later use:

Figure 12: copy RC4 key to local variable
Continuing on, we see the shellcode decrypt the data from the fecae03a registry value and unpack it for execution.  To round out the shellcode execution steps:
  5. Decrypt the executable and perform required memory mapping (headers, copying sections to correct addresses, applying any relocs, etc.)
  6. Call the newly mapped file.
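For anyone wanting to reproduce the decryption step offline, the registry data can be exported and run through a standard RC4 routine using the key bytes recovered from the shellcode. The Python sketch below is a plain textbook RC4 implementation, not code lifted from the sample. The helper that checks for an `MZ` header is hypothetical, and it assumes the decrypted data begins directly with the PE image.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling followed by keystream XOR (encrypt == decrypt)."""
    if not key:
        raise ValueError("RC4 key must not be empty")

    # Key-scheduling algorithm (KSA)
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation algorithm (PRGA)
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def looks_like_pe(buf: bytes) -> bool:
    """Hypothetical sanity check: ASSUMES the decrypted blob starts with a PE image."""
    return buf[:2] == b"MZ"
```

Since the key and its length change with every run of the initial infector, both have to be pulled from the shellcode that accompanies the specific registry data being decrypted.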
In a follow-up post I will walk through analysis of the running payload. Until then, feel free to contact me with any questions or comments.

* Note: not only do the encryption key and length change each time the initial infector is run, but if the registry key is deleted, a watchdog process reinserts the values immediately - and each time the data is reinserted, the encryption key is modified.