Last Updated: 2020-12-15 07:16:52 UTC
by Didier Stevens (Version: 1)
This rule here (Methodology_OLE_CHARENCODING_2) detects OLE files (.doc, .xls, ...) that contains sequences of decimal numbers. Converted to ASCII, these numbers reveal short strings: "echo off", "MZ", "PK".
That indicates to me that maldocs created with FireEye's tool embed a .BAT file, a .EXE and/or a .ZIP file.
The maldoc sample mentioned in the rule is available on VirusTotal: MD5 41b70737fa8dda75d5e95c82699c2e9b.
I analyze this maldoc as follows:
First I run my oledump tool:
The macro indicators (M and m) tell me that there is VBA code in this maldoc. But my attention is first drawn to the streams that end with /o (stream 10 and 20). Hiding payloads, scripts, ... inside VBA user form values is a well-known technique used by malware authors. I have a plugin to help with the analysis of maldocs that use this technique: plugin_stream_o.
This is the command:
So stream 10 contains a value that looks like a message to be displayed by this maldoc.
And stream 20 contains the payload we are looking for: a long sequence of decimal numbers. It starts with 80;75;3;4: that's the YARA rule's detection string for a ZIP record.
Remark also the "Found: 2" message from the plugin: this is new since the last version. This means there are 2 values inside this stream (if there is only one value, this Found message is not displayed, just like older versions of the plugin do).
The next step now is to convert this sequence of decimal numbers to bytes. I have a tool for that: numbers-to-string.py.
Since there are 2 values inside stream 20, I want to take a closer look first. I use option -S of numbers-to-string.py to produce statistics for each line of text with numbers:
So there are 2 values inside stream 20 that are long sequences of decimal numbers. Line 25: 66124 values between 0 and 255, Line 26: 66191 values between 0 and 255. So it looks like we have 2 embedded files in here, probably 2 ZIP files.
I select the first value (line 25), decode it as binary data (-b) and analyze it with my tool zipdump.py.
So that is indeed a ZIP file, and it contains a .exe file.
I do a quick check to see if the second value (line 26) also decodes to a ZIP file:
And indeed, that one too is a .exe file.
With zipdump's option -e I get extra info, like the hash to look the file up on VirusTotal:
So this FireEye maldoc is not hard to analyze.
Remark that in the YARA rule, there are strings with separator : and x beside ;. It looks like there can be variations in the encoding, but that has no effect on the decoding of the decimal numbers by my tool.
I also checked if VBA stomping or purging was performed on this maldoc, but that doesn't seem to be the case:
There is compiled code and VBA code inside the module streams. So the compiled VBA code has not been purged, and neither has the source code been stomped, since I can find VBA source code with Shell statements and CreateObject calls:
I recorded a video of this analysis, where I also take a look at the VBA code: