Showing posts with label PDF. Show all posts

Thursday, 29 October 2009

SWISS red-faced over metadata information left in press release

Whatever your view on where we are on the economic road to recovery (or not), no business can afford to have its external image tarnished. As reported in the Guardian this week, Swiss International Air Lines Ltd has a red face and a tarnished image, in Canada at least, due to an inadvertent leak of metadata.

SWISS, as they refer to themselves in the press release, included review comments in the document that they sent out. Although the press release might be 'boring,' as reported by the Guardian, it provides a salutary lesson on how features that are useful in the review stage of a document can be a danger if they are not managed correctly when completing the final version that will be sent out.

The file, comments and all, can be found on the Guardian website.

Companies need to remember that converting a document to PDF alone does not protect them from leakage of confidential or embarrassing information via metadata. Although I was not personally sent the press release, and it is not obvious from the posting on the Guardian site, I would say that the release was sent in PDF. Take a look at the other metadata in the PDF file and see what you think (PDF Producer: produced on a Mac, author: initials in this instance, and so on).
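To make "take a look at the other metadata" concrete, here is a minimal Python sketch of that kind of inspection. It assumes the PDF stores its document information dictionary uncompressed and as plain literal strings; hex strings, object streams and XMP packets are not handled, so a real check would need a proper PDF library. The function name and the sample bytes are made up for illustration and are not taken from the SWISS file.

```python
import re

# Standard keys of the PDF document information dictionary.
INFO_KEYS = ("Title", "Author", "Subject", "Keywords",
             "Creator", "Producer", "CreationDate", "ModDate")


def find_info_entries(pdf_bytes: bytes) -> dict:
    """Return Info entries stored as literal strings in raw PDF bytes.

    Crude by design: it only sees uncompressed `/Key (value)` pairs
    and does not handle nested parentheses inside a value.
    """
    found = {}
    for key in INFO_KEYS:
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(((?:\\.|[^\\()])*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found


# Hypothetical fragment of a PDF file, invented for this example.
sample = (b"1 0 obj\n<< /Title (Press release) /Author (jd) "
          b"/Producer (Example Producer 1.0) >>\nendobj\n")
print(find_info_entries(sample))
```

Even a check this simple would show the author initials and the producing application before the file leaves the building.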

This is the perfect example of why it is so important to have a system in place that automatically removes metadata from a document. While the data contained in this file wasn’t damaging to the company, it was definitely embarrassing. Had the data been confidential, this could have been a very different situation for them. Make sure your company and your data are protected.
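One small piece of such a system could be a gate that refuses to release a file while review data is still detectable in it. The Python sketch below is only an illustration under stated assumptions: the pattern list is hypothetical and far from complete, and it only works when the relevant PDF objects are stored uncompressed. It flags problems; it does not remove anything.

```python
import re

# Byte patterns that suggest leftover review data or metadata.
# Assumption: the objects are stored uncompressed; content packed into
# object streams would be invisible to this scan.
SUSPECT_PATTERNS = {
    "author entry": rb"/Author\s*[(<]",
    "review comment annotation": rb"/Subtype\s*/(?:Text|Popup|Highlight|FreeText)\b",
    "XMP metadata packet": rb"<\?xpacket\s+begin",
}


def release_blockers(pdf_bytes: bytes) -> list:
    """Return the labels of every suspect pattern found in the file."""
    return [label for label, pattern in SUSPECT_PATTERNS.items()
            if re.search(pattern, pdf_bytes)]


# Hypothetical fragments, invented for this example.
clean = b"%PDF-1.4\n1 0 obj << /Type /Font /Subtype /Type1 >> endobj"
dirty = (b"<< /Author (jd) >>\n"
         b"<< /Type /Annot /Subtype /Text /Contents (check this figure) >>")

print(release_blockers(clean))
print(release_blockers(dirty))
```

An empty list means nothing was spotted, which is weaker than saying the file is clean; a flagged file should go through a dedicated metadata removal tool before it is sent.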

Saturday, 28 February 2009

PDF documents and metadata - some examples

Before I take a deeper dive into what metadata a PDF document contains, let's look at what must have been the main headline-hitting example in 2008 of sensitive information being discovered within PDF metadata.

I am referring to the situation Google found themselves in with a submission they made, supposedly anonymously, to the Australian Competition and Consumer Commission regarding eBay and their proposal to force their users to use PayPal. After speculation on many blogs about the author of the anonymous submission, one Dave Bromage took a look at the metadata in the PDF document and let the world know who it was. Although the submission was replaced with a new version without the revealing metadata, the word was out. I won’t comment on the reasons why this was at least embarrassing to Google (this is one report that gives the details as well as showing the metadata contents), but I will add that there was an additional chuckle in the techie community, The Register being one example, because the metadata also showed that the document had not been created using Google’s own word processing app. My main comment is that this unintentional leakage of information involved a regulator, and caused embarrassment at the very least to the originator (both author and company).


The submission had also masked what would have been visible text about the submitter within the document. However, the PDF did not have any security applied to it, so it was very easy to copy that area of the document and paste it into another text processor to see the underlying information. Facebook/ConnectU have fallen foul of the same problem just this month. There are numerous other examples in this area, GE and the US Justice Department being a couple from 2008. If you want to mask visible text, at the very least add security settings to the PDFs that you generate to disallow copying and pasting of text. Also look at redaction software, which fully removes the text and masks the area whilst retaining the layout of the PDF document.
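Why masking alone fails can be seen in a few lines of Python. The page content below is a hypothetical, uncompressed content stream invented for illustration: the text is drawn first and a filled black rectangle is then painted over it. The rectangle hides the words on screen, but the string is still in the file for anyone who reads the stream or copies the area.

```python
import re

# Hypothetical page content stream: show some text, then paint a
# filled black rectangle over the same area.
content_stream = b"""
BT /F1 12 Tf 72 700 Td (Submitted by: Example Corp) Tj ET
0 0 0 rg 70 695 200 16 re f
"""


def shown_text(stream: bytes) -> list:
    """Collect literal strings passed to the Tj (show text) operator."""
    matches = re.findall(rb"\(((?:\\.|[^\\()])*)\)\s*Tj", stream)
    return [m.decode("latin-1") for m in matches]


print(shown_text(content_stream))  # ['Submitted by: Example Corp']
```

Real content streams are usually compressed and can use other text operators, so this is only a demonstration of the principle: a redaction tool has to delete the text operation itself, not merely cover it.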

I am sure it is pure coincidence that one of the other headlines in 2008 around information garnered from PDF metadata also involved Google, but from the other side of the fence. As reported here, metadata in a PDF version of a lobbying letter from the Corn Farmers to Congress linked the author, albeit tentatively, back to some of Google’s political adversaries.

The lesson from these examples is that you should not assume that converting and sending/publishing a PDF removes metadata that could contain sensitive information.