Recently, the MalwareHunterTeam announced the discovery of new ransomware. This ransomware, Spora, is one of the most sophisticated examples seen to date. According to BleepingComputer, Spora has “top notch encryption”, “has the most sophisticated payment site as of yet” and a “professional decryption service”. Even its infection vector demonstrates a level of sophistication above and beyond its predecessors.
One of Spora’s infection vectors is via an HTA email attachment. During the infection, no additional network communication is needed – this malware dropper is completely self-contained. The HTA contains an embedded copy of Spora that is decrypted and installed on the victim machine as the attachment is opened. If you have the attachment, you already have Spora, network connection or not.
This is the perfect opportunity for us to learn about HTA-based malware droppers. Let’s take a look at the Spora infection vector to learn why the authors chose HTA…
Most of us are already familiar with typical infection vectors from droppers that leverage Microsoft Office files, PDF documents, Flash files, etc. Spora has chosen to use HTA as the malware dropper. Interesting choice… So, what is HTA?
HTA stands for HTML Application. It is a technology that Microsoft created to allow graphical user interfaces to be bundled directly with script logic. Scripting languages such as VBScript do not have any native GUI capabilities. HTA leverages Internet Explorer to provide scripts with a graphical interface. Even though they leverage Internet Explorer, they run outside of the browser via MSHTA.exe. This means that they are free from the normal security restrictions placed on scripts that run inside a web browser.
To illustrate, an existing HTML file can be renamed with an .hta extension to yield a working example of an HTA. The only difference is that the document would open via MSHTA.exe rather than a web browser.
“HTAs, by contrast, are not bound by the same security restrictions as Internet Explorer; that’s because HTAs run in a different process than Internet Explorer. (HTAs run in the Mshta.exe process rather than the Iexplore.exe process.) Unlike HTML pages, HTAs can run client-side scripts and they have access to the file system. Among other things, this means that HTAs can run your system administration scripts, including those that use WMI and ADSI. Your scripts will run just fine, and you won’t receive any warnings about items that might be unsafe.” – Microsoft TechNet
For a malware author targeting Microsoft Windows victims, HTA seems like a reasonable choice. It’s simple, it can (“among other things”) access the file system, and it is a pre-installed capability in Windows.
Now, let’s take a deep dive into how the authors of Spora leveraged HTA for their infection…
ANALYZING THE INFECTION VECTOR
The Spora infection begins as an HTA file.
Like many other malware droppers, it is reported that the Spora HTA file is initially delivered to its target via an email attachment. The delivery mechanism doesn’t matter so much, as long as the HTA file reaches a victim and the victim is likely to click on it.
Spora disguises the HTA file name such that it includes two file extensions (e.g., <filename>.doc.hta). This is most likely an attempt by the malware author to hide the actual .hta file extension from the victim. On Windows machines that are configured to hide file extensions, the actual file extension (.hta) would be hidden from the user, leaving only the fake file extension displayed (e.g., <filename>.doc appears rather than <filename>.doc.hta). The intent is to deceive users into clicking on the file, thinking that they are opening a Word document rather than the HTA file.
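To make the double-extension trick concrete, here is a minimal sketch of how a defender might flag such names. The function names and both extension lists are assumptions chosen for illustration; they are not taken from Spora or from any particular product.

```javascript
// Hypothetical helper lists: extensions that execute script or code when opened,
// and harmless-looking extensions an attacker might use as a decoy.
const RISKY_FINAL_EXTENSIONS = new Set(["hta", "js", "jse", "vbs", "wsf", "exe", "scr"]);
const DECOY_EXTENSIONS = new Set(["doc", "docx", "pdf", "xls", "xlsx", "txt", "jpg"]);

// Returns true for names such as "invoice.doc.hta": a decoy extension
// immediately followed by a risky real extension.
function hasDeceptiveDoubleExtension(fileName) {
  const parts = fileName.toLowerCase().split(".");
  if (parts.length < 3) return false; // need name + decoy + real extension
  const finalExt = parts[parts.length - 1];
  const decoyExt = parts[parts.length - 2];
  return RISKY_FINAL_EXTENSIONS.has(finalExt) && DECOY_EXTENSIONS.has(decoyExt);
}

// Approximates what a user sees when Windows hides the final extension.
function nameShownWhenExtensionsHidden(fileName) {
  const lastDot = fileName.lastIndexOf(".");
  return lastDot > 0 ? fileName.slice(0, lastDot) : fileName;
}

console.log(hasDeceptiveDoubleExtension("invoice.doc.hta"));   // true
console.log(nameShownWhenExtensionsHidden("invoice.doc.hta")); // invoice.doc
```

A check like this is only a heuristic; a real mail filter would combine it with content inspection rather than rely on the file name alone.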
Once the HTA file is opened, the infection occurs through a series of five distinct stages as shown below. In stage one, the HTA dropper writes an embedded payload to a JScript file (%temp%\close.js) and runs it. This JScript file contains the entire payload necessary for each of the subsequent stages.
The Word document serves no purpose other than to distract the victim from noticing that Spora was installed. When the victim sees a Word document opened from an email attachment with a .doc extension, he/she is more likely to think that nothing is wrong. Actually though… the infection is done. Spora is now encrypting precious files on the victim’s hard drive and holding them for ransom.
Let’s walk through this logic in detail to see how each of these stages works…
Stage 1: Create close.js and Run It
The next step is to write the infection payload to this file. As highlighted below, a payload is written to the file using the Scripting.FileSystemObject object. The highlighted payload was abbreviated in the screen shot for conciseness and readability. The actual payload is much larger than what is shown.
Once the payload has been written to the file, the file is closed and then run using the WScript.Shell component. Highlighted below, we can see that %temp%\close.js is being executed.
In stage two, let’s see what was in the payload and step through what it does…
The author has clearly formatted this file to be as unreadable as possible. Reading and understanding it in its current form is a challenge. We will need to reformat it (add whitespace, etc.) before we can make any sense of it.
So, let’s abbreviate some of the payloads to get a better overall picture of the logic without being confused by the payload values themselves. Shown below, we can start to see that many payloads are passed into the suspected Gr() decode function.
Based on our discovery above, let’s replace all qP() references with Function() to make the code more readable. Shown below is the resulting cleansed code.
So, is the Gr() function really a decode loop?
Looking at the highlighted logic shown below, we can see that Gr() iterates over the variable x that is passed in, alters each value with some basic math formulas to produce a new value (variable b), joins the resulting array of values back into a string, and returns it. Long story short, this is definitely a decode loop.
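For readers who have not seen one before, the general shape of such a decode loop can be sketched as follows. The arithmetic here (subtract 3, then XOR with 5) is made up for illustration and is not Spora’s actual formula; only the names x and b mirror the description above, and encode() is a hypothetical inverse added so the sketch can be exercised.

```javascript
// Illustrative decode loop: iterate over x, alter each value with simple math,
// collect the results, and join them back into a string.
function decode(x) {
  const out = [];
  for (let i = 0; i < x.length; i++) {
    const b = (x.charCodeAt(i) - 3) ^ 0x05; // hypothetical math, not Spora's
    out.push(String.fromCharCode(b));
  }
  return out.join("");
}

// Hypothetical inverse, used only to build test input for decode().
function encode(plain) {
  const out = [];
  for (let i = 0; i < plain.length; i++) {
    out.push(String.fromCharCode((plain.charCodeAt(i) ^ 0x05) + 3));
  }
  return out.join("");
}

const hidden = encode("WScript.Shell");
console.log(hidden !== "WScript.Shell"); // true: the encoded form is unreadable
console.log(decode(hidden));             // WScript.Shell
```

The point is the pattern, not the math: any loop that walks a string, transforms each value, and joins the output is a strong hint that you are looking at a decoder.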
Based on our analysis above, let’s rename all the Gr() references to Decode() to make things more readable as shown below.
Now that we know how to identify an anonymous self-invoking function, we can easily spot a second one as highlighted below. As close.js is started, the outermost self-invoking function shown above will automatically run. As it runs, this innermost self-invoking function shown below will automatically start.
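As a refresher on the execution order of nested self-invoking functions, here is a small stand-alone sketch. It is not Spora’s code; the function names and log messages are placeholders.

```javascript
// Records the order in which the nested self-invoking functions execute.
const order = [];

(function outer() {
  order.push("outer starts");
  // The inner function is defined and invoked in one expression,
  // so it runs to completion before the outer function continues.
  (function inner() {
    order.push("inner runs");
  })();
  order.push("outer finishes");
})();

console.log(order.join(" -> ")); // outer starts -> inner runs -> outer finishes
```

The outer function starts first, but the inner one finishes first, which is why both descriptions of the ordering are accurate.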
The syntax for defining and invoking a function constructor is Function(arg1, arg2, argn, codeblock). The last argument passed into the Function constructor is the code block for the function itself. This means that the highlighted section of code below is the code block for this innermost self-invoking function on line 28.
So far, we know that as close.js starts, it self-invokes the outermost function on line 2. It then self-invokes the innermost function on line 28. The innermost function is a function constructor, which means that the highlighted section of code is ultimately the code that automatically runs at startup time. After the highlighted section of code finishes running, its output becomes the code block argument to the innermost function as it starts.
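The Function constructor behavior described here can be tried in isolation. In this sketch, buildBody() is a hypothetical stand-in for the decode step; it simply returns the source text that becomes the new function’s code block.

```javascript
// Function(arg1, ..., codeblock): every string except the last names a parameter,
// and the last string is the body of the new function.
const add = Function("a", "b", "return a + b;");
console.log(add(2, 3)); // 5

// Hypothetical stand-in for a decoder: it returns source text at run time.
function buildBody() {
  return "return 'hello from ' + who;";
}

// The argument expression buildBody() is evaluated first. Its output becomes
// the code block, and the trailing (...) invokes the new function immediately.
const greeting = Function("who", buildBody())("a generated function");
console.log(greeting); // hello from a generated function
```

This is why static inspection of the script shows only encoded strings: the real code block does not exist until the argument expression has been evaluated.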
We’re getting closer…
Looking at the code highlighted above, we can see, similar to line 28, it is a function constructor. This means that the highlighted section of code below represents its code block (since it is the last argument passed into the function constructor).
If the above highlighted line of code is the code block for the function constructor, then what is this similar looking section of code highlighted below? These are the arguments passed into the code block highlighted above. If we decode the code block above, we should see logic that references the arguments highlighted below.
Time to decode the code block and see what this stage of infection actually does…
If we use the Decode()/Gr() logic shown earlier to decode the first payload, we can see the code block as shown below. As we suspected, we can see references to inbound arguments in lines 11 and 12. The logic appears to iterate over each of the bytes in each of the arguments, alter them with some basic math formulas and join the results back together.
Sound familiar? Yes, it’s another decode loop.
Based on our understanding of the logic above, it appears to decode each argument passed into it. Looking at the highlighted line below, we can see there are twenty-one arguments passed into this decode loop. These arguments are first decoded via the Decode()/Gr() function discussed earlier and then further decoded with this new decode loop shown above.
Ultimately, this dual-decoded array of payloads shown below is the infection code that is run at the end of this stage.
This stage of the infection began with a series of nested self-invoking functions, which automatically ran at startup time (innermost first, followed by the outermost). The innermost self-invoking function dynamically built and ran a function constructor. The code block for this function constructor was the result of decoding an embedded payload from close.js. Ironically, this code block turned out to be a second decoding loop.
As this function constructor ran, the second decode loop was passed twenty-one decoded payload chunks from close.js as arguments. The second decode loop decoded them once more, resulting in the final payload to be run by the innermost self-invoking function.
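The two-layer flow summarized above can be modeled with harmless data. Everything in this sketch is an assumption made for illustration: the layer math (add 1, XOR 2), the three chunks instead of twenty-one, and all names. The final "payload" is an inert string that is never executed.

```javascript
// Layer 1: stand-in for the Decode()/Gr() style function (hypothetical math).
function decodeLayer1(x) {
  return Array.from(x, (ch) => String.fromCharCode(ch.charCodeAt(0) - 1)).join("");
}
function encodeLayer1(x) {
  return Array.from(x, (ch) => String.fromCharCode(ch.charCodeAt(0) + 1)).join("");
}
// Layer 2 encoder (XOR is its own inverse), used only to prepare the demo data.
function encodeLayer2(x) {
  return Array.from(x, (ch) => String.fromCharCode(ch.charCodeAt(0) ^ 2)).join("");
}

// Source text of the second decode loop. It walks over every argument it receives.
const secondLoopSource =
  "var out = [];" +
  "for (var i = 0; i < arguments.length; i++) {" +
  "  var chunk = arguments[i], chars = [];" +
  "  for (var j = 0; j < chunk.length; j++) {" +
  "    chars.push(String.fromCharCode(chunk.charCodeAt(j) ^ 2));" +
  "  }" +
  "  out.push(chars.join(''));" +
  "}" +
  "return out.join('');";

// What the script would carry: the loop and the chunks, both encoded.
const embeddedLoop = encodeLayer1(secondLoopSource);
const embeddedChunks = ["var msg = ", "'harmless ", "demo';"].map((c) =>
  encodeLayer1(encodeLayer2(c))
);

// Decode the loop source, build it with the Function constructor, then pass
// the layer-1-decoded chunks in as arguments for the second pass.
const secondLoop = Function(decodeLayer1(embeddedLoop));
const finalPayload = secondLoop.apply(null, embeddedChunks.map(decodeLayer1));
console.log(finalPayload); // var msg = 'harmless demo';
```

In the real dropper the resulting string is run rather than printed; for analysis, printing it is exactly what you want.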
So, why do we think this is a decode loop?
It’s time to decode this payload to see if our analysis is correct so far…
Stage 4: Decode Final Payload and Run It
Could CryptoJS be used to decode the payload rather than our previously seen decode loop techniques?
Let’s dig deeper to find out…
Our suspected payload is highlighted in the screen shot below. This payload was abbreviated for conciseness and readability.
Expanding the payload display as shown below gives us a better idea of the size of the payload contained in this infection stage. The payload is actually an array of hex encoded values (many separate payloads in a comma separated list). Since it is an array of payloads, each payload element in the list may be used for different purposes at different times during this infection stage.
Please note that this is still only a representative clip of the overall payload.
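Hex-encoded string arrays like this one are straightforward to make readable. The sketch below assumes the values appear in the script text as \xNN escape sequences, which is common for this style of obfuscation; the two sample entries are constructed for illustration rather than copied from the Spora payload.

```javascript
// Converts every \xNN escape sequence in a piece of source text into the
// character it represents. Text without escapes is returned unchanged.
function decodeHexEscapes(text) {
  return text.replace(/\\x([0-9a-fA-F]{2})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// Sample entries as they would appear in the raw script text. The backslashes
// are doubled here so that JavaScript keeps them as literal characters.
const sample = [
  "\\x41\\x45\\x53",
  "\\x57\\x53\\x63\\x72\\x69\\x70\\x74\\x2E\\x73\\x68\\x65\\x6C\\x6C",
];

console.log(sample.map(decodeHexEscapes)); // [ 'AES', 'WScript.shell' ]
```

Running every element of the array through a helper like this produces a readable string table that can be scanned for telltale names.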
Looking at the results above, we can see references to cryptography (e.g., AES, BlockCipher, encryptBlock, decryptBlock, OpenSSL, etc.). We can see references to ADODB.stream, which is typically used by droppers to write malware to disk. We can see references to WScript.shell, which is typically used by droppers to run malware after it has been installed on disk. We can also see references to what appears to be a Word document name (e.g., MP%\doc_6d518e.docx).
Using the list of decoded hexadecimal values above, we can layer the actual values into the original code as shown below to gain more insight into the intent of this infection stage. In the code below, references to the original payload (e.g., _0xee6f) were replaced with their hexadecimal decoded values. In some cases, lines of code that were performing simple string concatenations were collapsed together to simplify the code shown.
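This kind of substitution can be automated with a simple text rewrite. The sketch below is a hypothetical helper, not the tooling used for this analysis: it assumes decimal indexes such as _0xee6f[1], and the sample source line and decoded values are invented for illustration. It only rewrites text; nothing in the sample is executed.

```javascript
// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replaces every arrayName[index] reference in the source text with the
// quoted decoded value. Unknown indexes are left untouched.
function inlineArrayReferences(source, arrayName, values) {
  const pattern = new RegExp(escapeRegExp(arrayName) + "\\[(\\d+)\\]", "g");
  return source.replace(pattern, (match, index) => {
    const value = values[Number(index)];
    return value === undefined ? match : JSON.stringify(value);
  });
}

// Invented example input: an obfuscated line and its decoded string table.
const decodedValues = ["Write", "ADODB.stream"];
const obfuscated = "var s = new ActiveXObject(_0xee6f[1]); s[_0xee6f[0]](data);";

console.log(inlineArrayReferences(obfuscated, "_0xee6f", decodedValues));
// var s = new ActiveXObject("ADODB.stream"); s["Write"](data);
```

A rewrite like this handles only the simplest reference style; indexes computed at run time still need to be resolved by hand.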
We can now see a clearer picture of the infection. Analyzing the code, we see an obvious payload that appears to be decrypted via CryptoJS using AES in line 7 below. We can also see the crypto key as shown (1C1614D7). Further analyzing the code, we can now see references in lines 13 and 14 to a Word document (%TEMP%\doc_6d518e.docx) and an EXE file (%TEMP%\81063163ded.exe).
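For analysts who want to reproduce this kind of decryption outside of CryptoJS, here is a hedged sketch using Node’s built-in crypto module. It assumes the key is used as a passphrase in CryptoJS’s default OpenSSL-compatible mode (a "Salted__" header, MD5-based key derivation, AES-256-CBC); the analysis above does not confirm which mode Spora uses. The function names are hypothetical, and the encrypt helper exists only to create harmless test input.

```javascript
const crypto = require("crypto");

// OpenSSL-style key derivation (EVP_BytesToKey with MD5, one iteration).
function evpBytesToKey(passphrase, salt, keyLen = 32, ivLen = 16) {
  let derived = Buffer.alloc(0);
  let block = Buffer.alloc(0);
  while (derived.length < keyLen + ivLen) {
    block = crypto
      .createHash("md5")
      .update(Buffer.concat([block, Buffer.from(passphrase, "utf8"), salt]))
      .digest();
    derived = Buffer.concat([derived, block]);
  }
  return { key: derived.subarray(0, keyLen), iv: derived.subarray(keyLen, keyLen + ivLen) };
}

// Test-data helper: produces Base64 text in the "Salted__" + salt + ciphertext layout.
function encryptOpenSslStyle(plainText, passphrase) {
  const salt = crypto.randomBytes(8);
  const { key, iv } = evpBytesToKey(passphrase, salt);
  const cipher = crypto.createCipheriv("aes-256-cbc", key, iv);
  const body = Buffer.concat([cipher.update(plainText, "utf8"), cipher.final()]);
  return Buffer.concat([Buffer.from("Salted__"), salt, body]).toString("base64");
}

// Reverses the layout above and returns the decrypted text as UTF-8.
function decryptOpenSslStyle(base64Text, passphrase) {
  const raw = Buffer.from(base64Text, "base64");
  if (raw.subarray(0, 8).toString("latin1") !== "Salted__") {
    throw new Error("missing OpenSSL salt header");
  }
  const { key, iv } = evpBytesToKey(passphrase, raw.subarray(8, 16));
  const decipher = crypto.createDecipheriv("aes-256-cbc", key, iv);
  return Buffer.concat([decipher.update(raw.subarray(16)), decipher.final()]).toString("utf8");
}

const sealed = encryptOpenSslStyle("harmless demo text", "1C1614D7");
console.log(decryptOpenSslStyle(sealed, "1C1614D7")); // harmless demo text
```

If the real payload does not start with the "Salted__" header after Base64 decoding, the passphrase-mode assumption is wrong and the key and IV handling would need to be read from the script instead.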
We are getting extremely close to the final stage of the infection. Let’s push onward…
The payload shown above is UTF8 encoded and stored in the variable v998c9 as shown in line 15 below.
This AES decrypted payload is then run in line 40 shown below using the eval() function. This is the payload we need to see decoded. It appears to be the final stage of the infection.
In this stage, we see a payload with sections of it that decode to reveal references to AES cryptography. We also see signs of well-known techniques used by droppers to write to disk and run/start files. We see references to a Word document and we see a reference to an EXE. We see a payload that is AES decrypted and ultimately run via an eval() function.
Let’s decode the payload that is run in line 40 above and reveal the mystery…
Stage 5: Spora infection completes
Let’s take a look and see if we are correct…
In line 4 shown below, the contents of the variable v09bc6 are written to disk. If we reference the cleansed code from stage four, we can see that v09bc6 contains the data bytes for the EXE file (%TEMP%\81063163ded.exe).
The v3462e variable holds a reference to Adodb.Stream. This means that the highlighted line below is equivalent to Adodb.Stream.Write(v09bc6). As we suspected in stage four, Adodb.Stream is used by this infection to access and write to the local file system.
Once the data has been written, the file is saved to disk as shown below. Referencing the cleansed code in stage four, we can see that the v759c1 variable is set to %TEMP%\81063163ded.exe in line 14 of the code. As a result, this line of code is used to create an EXE on the local file system.
This is the code that actually dropped the Spora malware to disk!! Without this line of code, no infection would occur.
Now that Spora exists on disk, it needs to be started. The code shown below is the code that actually runs Spora.
The vefa30 variable is set to the same value as v759c1 in line 16 in the cleansed code from stage four. Both variables point to the file name %TEMP%\81063163ded.exe. It may be confusing that the variables are different in the above and below code fragments. Most likely, this is due either to poor programming style or to intentional obfuscation.
Similar to the steps above, another file is created as shown below. This time, it is the Word document being created and saved.
Shown below is where the Word document is actually saved to disk. Referencing the cleansed code from stage four, we can see that the v365ec variable is set to %TEMP%\doc_6d518e.docx in line 13 of the code.
The Word document is opened/run via the highlighted code shown below. Unlike the EXE example above, the same variable v365ec was used in the SaveToFile and the Run logic.
The Spora infection vector using an HTA-based email attachment was more sophisticated and complex than what is seen in the typical malware dropper. Typically, a dropper may leverage several stages of payload decodes prior to ultimately dropping and running the malware. Each of these stages would typically rely on simple hex encoding of the payload combined with some lightweight custom decoding logic.
Other droppers have used many of these techniques in the past. However, the combination of these techniques, together with the degree to which the Spora dropper leveraged them, definitely sets a new standard of sophistication and obfuscation.
Thanks for reading…
You can also connect with me on Twitter at @kd_cybersec.