Last Updated: 2010-04-23 11:58:02 UTC
by John Bambenek (Version: 1)
PDF files are a common way to distribute documents on the Internet and are even used for distributing documents with redacted (removed) content. However, when you distribute redacted documents, make sure that the data you don't want out there isn't, in fact, still in the file.
Case in point, take the upcoming trial of former Governor Rod Blagojevich. He just submitted a motion to force President Obama to testify during his criminal trial. As you can imagine, there is sensitive information in the motion. You can read the motion here. The areas that are redacted are pretty obvious. Now, hit Control-A, then Control-C. Open a text editor or Microsoft Word (or the like). Hit Control-V.
Hello, Mr. Face. Meet Mr. Palm. This particular mistake isn't new. There was a well-publicized SNAFU in which the US Department of Defense published a redacted document containing classified information, which was happily leaked on the Internet using the same method.
If the data is important enough to redact, it is probably important enough to verify that the data is actually gone. Of course, this is a problem for more than just PDF documents. An amusing HR trick is to take a look at Microsoft Word resumes, particularly the "Track Changes" history.
The takeaway is to use commercial tools (or tools specifically designed for the task) that delete, not just mask, redacted information, and to check that the redacted information is not easily retrievable... especially with something as trivial as "Copy-Paste". If you are too stingy for a commercial software package, just print the document with the redacted portions and re-scan it as a PDF to ensure the text is gone.
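If you want to make that check repeatable, here is a minimal sketch in Python. It assumes you have already pulled the text out of the "redacted" file (via the copy-paste trick above or any text extraction tool) and that you know which phrases were supposed to be removed. The function name and the sample strings are made up for illustration; this only catches text that is still extractable, not data hidden elsewhere in the file.

```python
import re


def find_leaked_phrases(extracted_text, redacted_phrases):
    """Return the supposedly redacted phrases that still appear in the extracted text."""

    def normalize(s):
        # Collapse whitespace runs and ignore case, since text extraction
        # often reflows lines and changes spacing.
        return re.sub(r"\s+", " ", s).strip().lower()

    haystack = normalize(extracted_text)
    leaked = []
    for phrase in redacted_phrases:
        needle = normalize(phrase)
        # Skip empty phrases; an empty string would match everything.
        if needle and needle in haystack:
            leaked.append(phrase)
    return leaked


if __name__ == "__main__":
    # Hypothetical text pasted from a "redacted" document.
    pasted = "Motion filed by  the defendant.\nSecret Witness A met with counsel."
    print(find_leaked_phrases(pasted, ["secret witness a", "Account 12345"]))
```

An empty list means none of the listed phrases survived in the extractable text. Anything else means the redaction only masked the content.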
(You can read about the issue in this article, which is heavy on the facts of the particular trial in question.)
bambenek at gmail /dot/ com