Using Redaction to Secure Personal Information

by Conor Smith | May 31, 2018

To offer a full range of PDF solutions, our product and sales teams work closely together to make sure customer requirements are met in as many cases as possible. One conversation both teams have regularly concerns features that are less popular with customers than others. One such feature is redaction. When our salespeople talk to our product managers about the low demand for such a critical annotation feature, it becomes clear that customers may misunderstand what redaction actually is and why it can be so crucial for their business and data protection strategies.

What is Redaction

Part of the annotation features in our PDF SDK, the redaction module applies annotations that block out sensitive information in a document. All that remains in its place is a dark bar showing that information was once there. Anyone who looks behind the page, into the document's underlying code, will find no sensitive data beneath the bar. Our technology deletes the redacted material and turns the blacked-out (redacted) areas into images, so no one can access the sensitive information that used to be underneath.
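The difference between covering text and truly redacting it can be sketched with a toy model. The Python below is an illustration only: `Page`, `cover_only` and `redact` are hypothetical names invented for this example and are not taken from any real PDF SDK. It assumes a page is nothing more than a list of text runs plus a list of drawn overlays.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy stand-in for a PDF page: text runs plus drawn overlays (hypothetical)."""
    text_runs: list[str] = field(default_factory=list)
    overlays: list[str] = field(default_factory=list)

    def extractable_text(self) -> str:
        # Text extraction reads the text runs and ignores anything drawn on top.
        return " ".join(self.text_runs)


def cover_only(page: Page, secret: str) -> Page:
    """Draw a black box over the secret but leave the underlying text in place."""
    return Page(list(page.text_runs), page.overlays + [f"black box over {secret!r}"])


def redact(page: Page, secret: str) -> Page:
    """Delete the secret from the text runs and leave only a black box behind."""
    cleaned = [run.replace(secret, "") for run in page.text_runs]
    return Page(cleaned, page.overlays + ["black box"])


page = Page(["Account holder: Jane Doe", "Policy number: 0000"])
print("Jane Doe" in cover_only(page, "Jane Doe").extractable_text())  # True
print("Jane Doe" in redact(page, "Jane Doe").extractable_text())      # False
```

A box drawn on top leaves the name recoverable by anyone who extracts the text, while redaction removes it from the page content altogether.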

Why use Redaction

There are many reasons to use redaction. In the legal and financial services industries, it may be essential for concealing sensitive information within a legal document. In insurance, once the period for storing a former customer’s data has passed, it is of paramount importance to redact personal information from contracts if you intend to keep those contracts for archiving purposes. A more recent and even more important reason for wanting these features comes with the introduction of GDPR in 2018.

GDPR and Redaction

GDPR is a regulation that applies across all EU countries. It protects EU citizens from businesses using their data irresponsibly, and it puts the data subject in charge of what personal information is shared, where and how. It includes the ‘right to be forgotten’, whereby any individual may contact your business and ask for their personal information to be deleted from all your systems within a certain time frame.

If your company has no legal reason to continue holding this person’s information in your database, you must comply and provide evidence that the information no longer remains in any of your systems. This means that every contract, email or other document referring to the individual must first be identified, and the Personally Identifiable Information (PII) within it must then be deleted. Redaction provides the most straightforward and easy-to-implement way to remove personal details from a document without suppressing relevant information or losing the whole record in the process.
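The two steps described above, identifying PII and then deleting it, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the patterns in `PII_PATTERNS` are deliberately simple and hypothetical, the function names are invented for this example, and the input is plain text rather than a PDF. Real PII detection needs far broader rules and human review.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def find_pii(text):
    """Step 1: return (start, end, kind) for every match, sorted by position."""
    hits = [
        (m.start(), m.end(), kind)
        for kind, pattern in PII_PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(hits)


def redact_text(text, hits):
    """Step 2: overwrite each matched span so the original characters are gone."""
    chars = list(text)
    for start, end, _kind in hits:
        chars[start:end] = "#" * (end - start)
    return "".join(chars)


record = "Contact Jane at jane.doe@example.com or +44 20 7946 0000 about the renewal."
print(redact_text(record, find_pii(record)))
```

Because the matched characters are overwritten rather than hidden, the rest of the record stays readable while the personal details are no longer present in the output.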

OCR and Redaction

Most organizations hold enormous numbers of non-searchable, image-based PDFs, usually originating as scans, faxes or output from design packages. To make these image or vector PDFs searchable, they need to be run through OCR. OCR, or Optical Character Recognition, detects text within scanned documents and makes it available for systems and internet search engines to search. OCR therefore allows you not only to search text but also to redact it, even if your file started out as a paper document that you scanned. The best part? This can be done without any fear of invisible text or metadata remaining in the file or the code behind it.
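How OCR makes a scanned page redactable can be sketched as follows. The example assumes an OCR step has already produced a list of recognised words with bounding boxes. `OcrWord` and `boxes_to_redact` are hypothetical names used only for this illustration, and the coordinates are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """One recognised word and its bounding box (hypothetical OCR output)."""
    text: str
    x: int
    y: int
    width: int
    height: int


def boxes_to_redact(words, term):
    """Return (x, y, width, height) for every word matching the search term."""
    needle = term.casefold()
    return [
        (w.x, w.y, w.width, w.height)
        for w in words
        # Ignore case and trailing punctuation when comparing.
        if w.text.casefold().strip(".,;:") == needle
    ]


words = [
    OcrWord("Insured:", 40, 100, 80, 14),
    OcrWord("Doe,", 130, 100, 40, 14),
    OcrWord("Jane", 175, 100, 42, 14),
]
print(boxes_to_redact(words, "doe"))  # [(130, 100, 40, 14)]
```

Each returned rectangle marks an area of the scanned image where a redaction would be applied, which is what lets a search term drive redaction on a document that began as paper.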


Are you interested in trying out redaction as part of our annotation module? Download our free PDF SDK trial or contact our sales team to learn the full extent of our annotation technology.