Data Classification For the Masses

Published: 2016-08-19
Last Updated: 2016-08-19 04:51:19 UTC
by Xavier Mertens (Version: 1)
14 comment(s)

Data classification isn’t a brand new topic. For a long time, international organizations or military are doing "data classification”. It can be defined as:

A set of processes and tools to help the organization to know what data are used, how they are protected and what access levels are implemented

Military’s levels are well known: Top Secret, Secret, Confidential, Restricted, Unclassified.

But organizations are free to implement their own scheme and they are deviations. NATO is using: Cosmic Top Secret (CTS), NATO Secret (NS), NATO Confidential (NC) and NATO Restricted (NR). EU institutions are using: EU Top Secret, EU Secret, EU Confidential, EU Restricted. The most important is to have the right classification depending on your business!

Data classification is not only used by IT teams but also by all data, applications or process owners in the organization. The implementation of data classification is definitively not an easy process but will more and more become mandatory, especially in Europe. EU adopted a new regulation called “General Data Protection Regulation” (GDPR) [1] that will be effective by May 2018. Its goal is to protect users data. To resume the new rules regarding data:

  • Organizations will need the user’s consent before collection PI (“Personal Information”).
  • Data must be wiped after a predetermined period
  • In case of a data breach, users and authorities must be notified within 72 hours.

The last point is critical because according to a study [2], most companies take over six months to detect data breaches! And data classification help you to better protect your data. The process is based on the following steps:

  1. Identify (assets, data)
  2. Create your protection profiles
  3. Deploy the protection profiles and enforce them
  4. Review
  5. Reclassify data & improve

Don't be fooled, this is a very complex process. Even the first step can be very difficult for many organizations but,  once it's done, it's easy to label any new type of data. We see that more and more products and tools started to take care of privacy and data classification. Two examples:  Microsoft launched the “Windows Information Protection”[3] (for Windows 10 Anniversary Update & Office 365 Pro) which includes features to identify different types of information, determine which apps have access to it, and provide the basic controls (example: Copy and Paste restrictions). The open source world also embraces data classification. The latest LibreOffice release provides document classification according to the TSCP standard[5].

You can also implement a basic data classification at the operating systems level. Modern OS can apply “tags” on files and directories. On my Macbook, I defined the TLP standards and apply them to some of my files:

Some Linux distributions implement also a system to tag files. Via the GUI (here Nautilus):

But you can perform the same from the command line (read: on servers too). Here is an example based on Debian. If not available by default, install the required tools[6]:

# apt-get install tracker-utils

Then, enable the indexing services:

# tracker-control -s

You can now add tags to files:

# touch super-secret.txt
# tracker-add -a TLP:RED super-secret.txt
Tag was added successfully
  Tagged: file:///root/super-secret.txt

Now you can search for tagged files:

# tracker-tag -t -s TLP:RED
Tags (shown by name):

To conclude this diary, my advice is to keep in mind that data classification will get more and more focus in the near future. Be ready to kick off such project inside your organization. And you? Did you already implement data classification? Do you have plans? Please share your tips. 


Xavier Mertens (@xme)
ISC Handler - Freelance Security Consultant

14 comment(s)


Diary Archives