Last Updated: 2016-01-02 11:54:55 UTC
by Didier Stevens (Version: 1)
A few days ago a reader contacted me about a problem he had analyzing a maldoc MIME file. When he used emldump to analyze his sample (f67aa5a3ede3d31c5a68494c0678e2ee), it was not reported as a multipart message:
$ ./emldump.py f67aa5a3ede3d31c5a68494c0678e2ee.vir
1: 18649 text/plain
But when I looked at the content of the file, I saw that the first line was just gibberish:
$ head f67aa5a3ede3d31c5a68494c0678e2ee.vir
Content-Type: multipart/related; boundary="----=_NextPart_Jm9Ovypy.uUh6MCk"
Content-Type: text/html; charset="us-ascii"
You can make emldump skip this first line with option -H:
$ ./emldump.py -H f67aa5a3ede3d31c5a68494c0678e2ee.vir
1: M multipart/related
2: 3547 text/html
3: 10480 application/x-mso
4: 155 text/xml
Now you can see that the third part is an MSO file, which you can extract and pipe into oledump, as I explained in my diary entry.
Should you find MIME files starting with more than one line of gibberish, you can use the tail command to remove them. For example, to skip 2 lines:
$ tail -n +3 sample.vir | ./emldump.py
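If you do not know in advance how many lines of gibberish precede the headers, the counting can be automated. The following is only a sketch: the function name skip_to_mime_header is hypothetical, and it assumes that the real message starts at the first line beginning with a MIME-Version: or Content-Type: header, which may not hold for every sample.

```shell
# Hypothetical helper (not part of emldump): print a file starting at the
# first line that looks like a MIME header, dropping any junk before it.
# Assumption: the message proper begins with MIME-Version: or Content-Type:.
skip_to_mime_header() {
    # Once a matching header line has been seen, "found" stays set and
    # every remaining line is printed unchanged.
    awk 'found || tolower($0) ~ /^(mime-version|content-type):/ { found = 1; print }' "$1"
}
```

It can then be used in place of the tail command:
$ skip_to_mime_header sample.vir | ./emldump.py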