Quickly Finding Encoded Payloads in Office Documents
Last Updated: 2023-05-07 11:30:49 UTC
by Didier Stevens (Version: 1)
Malicious documents like this RevengeRAT ppam file found on MalwareBazaar contain VBA code that you can analyze with oledump.py.
Some shortcuts can save time compared with doing a lengthy VBA code analysis.
One of them is to look for strings of encoded payloads, like BASE64.
But if you just run base64dump.py on this sample, you won't get results, because this is a ppam file, and thus a ZIP container: all data is compressed.
You can unzip all contained files, and then analyze them one-by-one with base64dump.py.
But there is a quicker method: let zipdump.py produce JSON output that contains the decompressed content of each file, and then let base64dump.py consume this JSON output.
To do this, use option --jsonoutput with zipdump.py and option --jsoninput with base64dump.py. Option -e all searches for all encodings supported by base64dump.py (not just BASE64, but also hexadecimal, base85, NetBIOS name encoding, ...), and I also threw in option -n 30 to select payloads at least 30 bytes long (just to keep all the output on a single screenshot).
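The idea behind this pipeline can be sketched in plain Python. This is not how zipdump.py or base64dump.py are implemented, only an illustration under simplifying assumptions: the function name find_base64_in_zip and its min_decoded parameter are made up for this sketch, only BASE64 is considered, and a candidate is any run of BASE64 characters whose length is a multiple of 4.

```python
import base64
import io
import re
import zipfile

# Runs of BASE64 alphabet characters, optionally followed by padding.
BASE64_RUN = re.compile(rb"[A-Za-z0-9+/]+={0,2}")


def find_base64_in_zip(data, min_decoded=30):
    """Yield (member name, encoded bytes, decoded bytes) for each BASE64 hit.

    data is the raw content of a ZIP container (for example a ppam file).
    Hypothetical helper for illustration, not part of any existing tool.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as container:
        for name in container.namelist():
            content = container.read(name)  # decompressed member content
            for match in BASE64_RUN.finditer(content):
                encoded = match.group(0)
                if len(encoded) % 4 != 0:
                    continue  # not a whole number of 4-character groups
                try:
                    decoded = base64.b64decode(encoded, validate=True)
                except ValueError:
                    continue  # looked like BASE64 but does not decode
                if len(decoded) >= min_decoded:
                    yield name, encoded, decoded
```

Calling list(find_base64_in_zip(open(path, "rb").read())) on a sample, where path is whatever file you are analyzing, would list the members that contain long BASE64 strings. The search runs on the decompressed content, which is exactly what is missing when you scan the ZIP container directly.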
There is a string, 808 characters long, that is a valid BASE64 string (b64) and also a valid BASE85 string (b85). The BASE64-decoded string shows something familiar: I E X ...
And it's found in the vbaProject.bin file.
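That one string can be valid under two encodings is not unusual: letters and digits belong to both alphabets. The short sketch below illustrates this with Python's standard base64 module, used as a stand-in here (base64dump.py's own base85 handling may differ). The candidate string is a made-up example, not taken from the sample.

```python
import base64

# Made-up example: "IEX" encoded as UTF-16 little-endian, then as BASE64.
candidate = base64.b64encode("IEX".encode("utf-16-le")).decode("ascii")

# The same characters decode without error under both encodings...
as_b64 = base64.b64decode(candidate, validate=True)
as_b85 = base64.b85decode(candidate)

# ...but the two results differ, and only one of them is meaningful.
print(candidate)
print("b64:", as_b64)
print("b85:", as_b85)
```

This is why it helps to look at the start of each decoding rather than trust the first encoding that validates.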
The next step is to take a look at this decoded data (option -s L selects the largest decoding).
This looks like UTF-16, so it can be decoded with option -t utf16.
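For readers who want to reproduce this last step outside the tools, here is a minimal Python sketch. It assumes the payload is UTF-16 little-endian without a byte order mark, which the "I E X" pattern (each character followed by a null byte) suggests. The command string is a harmless placeholder, not the content of the sample.

```python
import base64

# Harmless placeholder standing in for the real payload.
command = "IEX (Get-Date)"
encoded = base64.b64encode(command.encode("utf-16-le"))

# Step 1: BASE64-decode. Every ASCII character is followed by a null
# byte, which shows up as "I E X ..." in a dump.
raw = base64.b64decode(encoded)
print(raw[:6])

# Step 2: decode the bytes as UTF-16 little-endian to get readable text.
text = raw.decode("utf-16-le")
print(text)
```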