Finding Property Values in Office Documents
Last Updated: 2019-02-16 23:18:58 UTC
by Didier Stevens (Version: 1)
In diary entry "Maldoc Analysis of the Weekend", I use the strings method explained in diary entry "Quickie: String Analysis is Still Useful" to quickly locate the PowerShell command hidden in a malicious Word document.
A comment was posted for this diary entry, asking the question: how can one find a property value with my tools, instead of the strings command? I gave one example in diary entry "Word maldoc: yet another place to hide a command", and I'll give another example in this diary entry.
With oledump.py, I get an overview of streams in the document: stream 8 contains macros (indicator M):
From the previous analysis, I know that a Shell statement is used to execute a PowerShell command. I grep for string "shell" to find this command in the VBA source code in stream 8:
The PowerShell command is hidden in an property AlternativeText, but where can this be found inside the Word document? Variable DwchQqabF points to an object with property AlternativeText.
I grep for this variable in the VBA source code:
The object is a shape with name "qg1batoc21p". This object, with its name, can be found in another stream (not in the VBA stream). I use option -y to create and use an ad-hoc YARA rule to search for this name.
Option -y takes an argument, usually it's the name of a file that contains YARA rules. oledump.py will use the provided YARA rules to search through each stream, and report matching rules.
To avoid the overhead of creating a YARA rule for a single string-search, oledump.py supports ah-hoc YARA rules. An ad-hoc YARA rule, is a simple YARA rule that is generated automatically by oledump.py, with the argument provided via option -y. When this argument starts with #s#, oledump.py will create an ad-hoc YARA rule for a string (s). This string is to be provided after #s#, like this: #s#qg1batoc21p. This will create a YARA rule searching for string "qg1batoc21p" in ASCII (ascii), UNICODE (wide) and regardless of case (nocase).
Here is the result:
Stream 4 and stream 8 contain string "qg1batoc21p". Stream 8 contains the VBA code, so it's to be expected that it contains the string. Hence I take a closer look at stream 4 (1Table), where I expect to find property AlternativeText value.
I use option --yarastrings to know at which position(s) the string "qg1batoc21p" was found:
It was found at position 0x16C7, and it is a UNICODE string (notice the 00 bytes).
I can now use option -C (cut) to cut-out (select) the part of the stream that interests me. I want the part that starts at 0x16C7, thus the cut-expression begins with 0x16C7. And to avoid data scrolling off the screen, I select 256 bytes: 0x100l means to select a part with length (l) 0x100 (256 decimal):
We can see the start of the PowerShell command in UNICODE. With a bit of trial and error, I figure out that I need to select 0x14C8 bytes to get the complete command:
Tomorrow, I will post a video showing this method and a second, slightly different method.