Last Updated: 2023-11-02 15:51:36 UTC
by Didier Stevens (Version: 1)
In his diary entry "Size Matters for Many Security Controls", Xavier talks about a PE file that has been artificially inflated in size (to 1GB) by appending NUL bytes (0x00) to its end.
Many security tools will not process a file this big, and thus one needs to strip the extra NUL bytes at the end, to reduce the size. And as Xavier warns us ("be sure not to remove all of them - keep a safe amount"), removing NUL bytes can be tricky.
The problem is the following: if you are dealing with a PE file that has a section at its end (so no digital signature, no overlay), then there's a high probability that this section is padded with NUL bytes. And you can't remove these bytes. How can you identify these bytes? By reading the PE file's metadata for sections.
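To make the idea concrete, here is a minimal sketch of reading the section metadata by hand, using only Python's standard library. It is not how pecheck.py does it; the function name `end_of_section_data` is made up for this illustration, and it deliberately ignores digital signatures, which are located through the data directories rather than the section table. The offsets come from the PE/COFF header layout.

```python
import struct

def end_of_section_data(data: bytes) -> int:
    """Return the file offset where the raw data of the last section ends.

    Simplified sketch: only the section table is consulted, so a digital
    signature (certificate table) would wrongly be counted as overlay.
    """
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C in the DOS header) points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF header fields, relative to the PE signature:
    #   +6  NumberOfSections (2 bytes), +20 SizeOfOptionalHeader (2 bytes)
    (number_of_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (size_of_optional_header,) = struct.unpack_from("<H", data, pe_offset + 20)
    section_table = pe_offset + 24 + size_of_optional_header
    end = 0
    for index in range(number_of_sections):
        entry = section_table + index * 40  # each section header is 40 bytes
        # +16 SizeOfRawData, +20 PointerToRawData
        size_of_raw, pointer_to_raw = struct.unpack_from("<II", data, entry + 16)
        if size_of_raw:
            end = max(end, pointer_to_raw + size_of_raw)
    return end
```

Everything before the returned offset belongs to the PE file proper, including any NUL padding inside the last section. Anything after it is a candidate for removal.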
This is something that I have automated in my pecheck.py tool. It's a tool to analyze PE files; here I run it on Xavier's sample:
There's a lot of output, and it ends with this information:
pecheck.py detected an "overlay". An overlay is data appended to the end of a PE file that is not included in the PE file's metadata.
The overlay is 953 MB in size, starts with NUL bytes (MAGIC 00000000), has an entropy of zero and contains only one byte value. With this statistical info, we can conclude that the overlay is just a long sequence of NUL bytes. Thus we can remove it. We use option -g to get the stripped (s) version of the PE file, i.e., the PE file without its overlay.
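The two statistics used in that reasoning are easy to reproduce. The following is a small sketch, not pecheck.py's own code; `byte_stats` is a name chosen for this example. It computes the Shannon entropy (in bits per byte) and the number of distinct byte values of a buffer.

```python
import math
from collections import Counter

def byte_stats(data: bytes):
    """Return (Shannon entropy in bits per byte, number of distinct byte values)."""
    counts = Counter(data)
    total = len(data)
    if total == 0:
        return 0.0, 0
    # Entropy = sum over byte values of p * log2(1/p), with p = count / total.
    entropy = sum((n / total) * math.log2(total / n) for n in counts.values())
    return entropy, len(counts)
```

A buffer made of a single repeated byte value gives an entropy of 0 and exactly one distinct value, which is the signature of pure padding. For an overlay of several hundred megabytes, feeding the data to the counter in chunks would be kinder to memory than loading it all at once.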
By default, the output is a hex/ASCII dump:
One can see that the PE file still ends with NUL bytes: that's the padding of the last section, which we want to preserve.
To extract the stripped PE file, we use option -D to perform a binary dump and then we write that binary data to disk with file redirection:
The stripped PE file has a normal size now, and can be submitted for analysis.
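For readers who want to script the same step without pecheck.py, here is a hedged sketch of the stripping operation itself. The function `strip_nul_overlay` and its parameters are hypothetical. It assumes the overlay start offset is already known (for example from the section metadata), and it refuses to strip an overlay that contains anything other than NUL bytes.

```python
from pathlib import Path

def strip_nul_overlay(src, dst, overlay_offset):
    """Write src without its overlay to dst; return the number of bytes removed.

    overlay_offset is the file offset where the overlay starts (assumed known).
    Raises ValueError if the overlay is not made of NUL bytes only.
    """
    data = Path(src).read_bytes()  # simple, but loads the whole file in memory
    overlay = data[overlay_offset:]
    if overlay.count(0) != len(overlay):
        raise ValueError("overlay contains non-NUL bytes; refusing to strip")
    # Keep everything up to the overlay, including the last section's padding.
    Path(dst).write_bytes(data[:overlay_offset])
    return len(overlay)
```

Cutting at the overlay offset, rather than trimming every trailing NUL byte, is what keeps the padding of the last section intact.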