MHC Tech Law: Anonymisation techniques and data protection obligations

17 Oct 2016

Mason Hayes & Curran takes a closer look at some of the anonymisation techniques referenced in the Data Protection Commissioner’s guidance.

The Data Protection Commissioner (DPC) recently published guidance on the use of data anonymisation and pseudonymisation techniques. In our last blog, we examined these concepts and some of the key points in the DPC’s guidance.

We focused, in particular, on the difficulties in implementing these techniques and the scope of what is considered to be personal data.

This time, we will examine the techniques in the guidance, and consider data protection law obligations arising for organisations wishing to anonymise or pseudonymise certain data sets.

Anonymisation techniques

The Data Protection Acts 1988 and 2003 do not explicitly recognise the concepts of data anonymisation and pseudonymisation, meaning there is no prescriptive standard of anonymisation under Irish law. Consequently, an organisation hoping to employ anonymisation techniques will have to decide, on a case-by-case basis, what techniques to use to sufficiently anonymise data sets.

In this regard, the DPC’s guidance note is useful for organisations wanting to assess what techniques (or combination of techniques) they should use.

The DPC’s guidance note discusses the two main forms of anonymisation, namely randomisation and generalisation.

Randomisation

Randomisation techniques involve altering personal data in order to remove the link between the data and the individual. A host of randomisation techniques is available, including noise addition and permutation.

Noise addition (or noise injection) involves the addition of random variables to personal data to reduce the risk that an individual can be identified from the data. For example, in a database that records the height of individuals, each individual’s height could be increased, or decreased, by a small amount. The resulting values can then be described as accurate only within a certain range, such as plus or minus 10cm.
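
To make the height example concrete, the following is a minimal sketch of noise addition in Python. The function name `add_noise`, the sample heights and the choice of a uniform random shift are our own assumptions for illustration; neither the DPC’s guidance nor this article prescribes a particular noise distribution.

```python
import random

def add_noise(heights_cm, max_shift_cm=10.0, seed=None):
    """Return a copy of heights_cm with each value shifted by a random
    amount between -max_shift_cm and +max_shift_cm (illustrative only)."""
    rng = random.Random(seed)  # a seed is used here only to make the demo repeatable
    return [h + rng.uniform(-max_shift_cm, max_shift_cm) for h in heights_cm]

# Hypothetical sample data: heights in centimetres.
heights = [158.0, 171.5, 182.0, 165.0]
noisy_heights = add_noise(heights, max_shift_cm=10.0, seed=42)

for original, noisy in zip(heights, noisy_heights):
    print(f"{original:.1f} cm recorded as {noisy:.1f} cm")
```

Each published value is then accurate only to within plus or minus 10cm of the true height. In practice, the amount of noise would need to be weighed against how useful the data must remain.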

Permutation, on the other hand, involves the swapping or shuffling of data between the records of individuals, making it harder to identify a particular individual. For example, a data set containing the height of individuals could be randomised by shuffling the height values so that they are no longer connected to other information about the individual. These techniques are useful in reducing the risk of inference and the matching up of data between data sets.
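
Permutation can be sketched in a few lines of Python. The helper `permute_column`, the field names and the sample records below are hypothetical and are included only to illustrate the idea of shuffling one attribute across records.

```python
import random

def permute_column(records, column, seed=None):
    """Return new records in which the values of `column` have been
    shuffled between records; all other fields are left untouched."""
    rng = random.Random(seed)
    values = [record[column] for record in records]
    rng.shuffle(values)  # shuffles the copied list in place
    return [{**record, column: value} for record, value in zip(records, values)]

# Hypothetical sample data.
records = [
    {"county": "Dublin", "height_cm": 158},
    {"county": "Cork", "height_cm": 171},
    {"county": "Galway", "height_cm": 182},
    {"county": "Kerry", "height_cm": 165},
]
shuffled = permute_column(records, "height_cm", seed=1)
```

The set of height values in the data set is unchanged, so statistics such as the average height are preserved, but a given height is no longer reliably attached to the record it came from. Note that a random shuffle can leave some values in their original position, particularly in small data sets.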

Generalisation

Generalisation involves the dilution of identifiers attributable to data subjects so that individuals cannot be singled out. This can be done by modifying the scale of data attributable to an individual.

For example, a data set containing the dates of birth of individuals could be diluted by using the year, as opposed to the individuals’ day and month of birth.
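
As a rough illustration of the date of birth example, the Python sketch below replaces a full date of birth with the year alone. The function `generalise_birth_dates` and the field names `date_of_birth` and `birth_year` are assumptions made for this example.

```python
from datetime import date

def generalise_birth_dates(records):
    """Return new records in which the full date of birth is replaced
    by the year of birth only."""
    generalised = []
    for record in records:
        # Copy every field except the precise date of birth.
        reduced = {k: v for k, v in record.items() if k != "date_of_birth"}
        reduced["birth_year"] = record["date_of_birth"].year
        generalised.append(reduced)
    return generalised

# Hypothetical sample data.
people = [
    {"county": "Dublin", "date_of_birth": date(1980, 3, 14)},
    {"county": "Cork", "date_of_birth": date(1980, 11, 2)},
]
print(generalise_birth_dates(people))
```

After generalisation, both sample individuals share the same birth year, so that attribute alone no longer distinguishes between them.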

There is a wide range of generalisation techniques, including k-anonymity, aggregation, l-diversity and t-closeness.
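
This article does not define k-anonymity. As commonly understood, a data set is k-anonymous if every record shares its combination of indirectly identifying attributes (often called quasi-identifiers) with at least k-1 other records. The Python sketch below checks that property; the function names, the choice of quasi-identifiers and the sample records are our own assumptions.

```python
from collections import Counter

def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for all quasi-identifiers (0 for an empty data set)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    return smallest_group_size(records, quasi_identifiers) >= k

# Hypothetical sample data.
patients = [
    {"birth_year": 1980, "county": "Dublin", "condition": "asthma"},
    {"birth_year": 1980, "county": "Dublin", "condition": "diabetes"},
    {"birth_year": 1975, "county": "Cork", "condition": "asthma"},
    {"birth_year": 1975, "county": "Cork", "condition": "asthma"},
    {"birth_year": 1990, "county": "Galway", "condition": "none"},
]
quasi = ["birth_year", "county"]
print(smallest_group_size(patients, quasi))  # 1: the Galway record is unique
print(is_k_anonymous(patients, quasi, k=2))  # False
```

In this sample, the single Galway record can be singled out, so the data set would need further generalisation (or suppression of that record) before it met a threshold of k = 2. The two Cork records also share the same condition, which is the kind of weakness that l-diversity is intended to address.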

The DPC briefly addresses other techniques, such as masking and pseudonymisation, and observes that these techniques, while useful, merely assist in reducing the risk of identification, but are not sufficient on their own in anonymising data.
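
To illustrate why pseudonymisation on its own falls short of anonymisation, the sketch below replaces an identifier with a keyed hash using Python’s standard `hmac` module. Keyed hashing is only one possible way of generating pseudonyms and is our assumption here; the function name `pseudonymise`, the key and the sample identifiers are hypothetical.

```python
import hashlib
import hmac

def pseudonymise(identifier, secret_key):
    """Replace an identifier with a keyed SHA-256 hash (a pseudonym)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key; in practice it would be generated securely and stored
# separately from the pseudonymised data.
key = b"example-key-held-separately"

token_a = pseudonymise("jane.doe@example.com", key)
token_b = pseudonymise("jane.doe@example.com", key)
token_c = pseudonymise("john.roe@example.com", key)

print(token_a == token_b)  # True: the same person always receives the same pseudonym
print(token_a == token_c)  # False
```

Because the same individual always maps to the same pseudonym, that individual’s records remain linkable to one another, and anyone holding the key can recompute the pseudonym for a known identifier. The risk of identification is reduced, not removed.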

Legal obligations

Data that has been irreversibly anonymised ceases to be ‘personal data’ and so falls outside the scope of the Data Protection Acts. However, in order to anonymise personal data, the starting assumption is that the personal data must first have been collected and processed by organisations in accordance with the acts.

According to the DPC, the process of anonymisation itself constitutes the further processing of personal data. Accordingly, if organisations wish to render personal data anonymous, this should be done in accordance with the acts. Therefore, organisations should ensure that personal data is obtained and processed fairly; kept only for specified, explicit and lawful purposes; and used only in a manner that is compatible with these purposes.

The DPC highlights that if an organisation collects personal data with the intention of anonymising the data for future use, it must inform individuals of this purpose when collecting their personal data, such as via a privacy policy.

Organisations should consider whether any amendments are required to their privacy policies or notices to ensure that individuals are appropriately informed. However, if anonymisation is an ancillary purpose, as opposed to a distinct purpose, the DPC states that this might not be viewed as further processing beyond the purposes for which the data was originally obtained. There is also an exemption available under the acts for the processing of data required for statistical, research or other scientific purposes.

The DPC also warns that an organisation that extracts personal data from an anonymised dataset must do so fairly and in compliance with the acts.

If an organisation can identify an individual from the personal data, the organisation may become a data controller, depending on whether it meets the other criteria of a data controller under the acts.

Organisations must be aware that extracting personal data from anonymised sources may result in the organisation becoming subject to the obligations of the acts.

Tips on using anonymisation and pseudonymisation techniques

Use a combination of techniques to ensure that data is sufficiently de-identified. There are inherent limitations in some anonymisation and pseudonymisation techniques. Careful consideration is required in devising appropriate anonymisation techniques.
Take into account all means reasonably likely to be used to identify an individual – both within the organisation and held by third parties. Consideration should be given to additional data sets that an organisation may obtain and which could allow it to identify an individual.
Test the effectiveness of anonymisation techniques regularly to ensure that they are sufficiently robust to avoid the identification of individuals. Consideration should be given to developments in re-identification technologies that may result in re-identification.
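
One simple way to approach the second and third tips is to check how many released records can be matched to exactly one person in another data set that the organisation or a third party might hold. The Python sketch below is an illustration under our own assumptions; the function `uniquely_matched`, the field names and both sample data sets are hypothetical, and a real assessment would need to consider far more than exact matches.

```python
from collections import Counter

def uniquely_matched(released, auxiliary, shared_fields):
    """Return the released records that match exactly one auxiliary
    record on all of the shared fields."""
    def key(record):
        return tuple(record[field] for field in shared_fields)

    auxiliary_counts = Counter(key(record) for record in auxiliary)
    # Counter returns 0 for combinations that never appear in `auxiliary`.
    return [record for record in released if auxiliary_counts[key(record)] == 1]

# Hypothetical "anonymised" release with names removed.
released = [
    {"birth_year": 1980, "county": "Dublin", "condition": "asthma"},
    {"birth_year": 1990, "county": "Galway", "condition": "diabetes"},
]
# Hypothetical auxiliary data set that still contains names.
auxiliary = [
    {"name": "Person A", "birth_year": 1980, "county": "Dublin"},
    {"name": "Person B", "birth_year": 1980, "county": "Dublin"},
    {"name": "Person C", "birth_year": 1990, "county": "Galway"},
]

at_risk = uniquely_matched(released, auxiliary, ["birth_year", "county"])
print(len(at_risk))  # 1: the Galway record matches only Person C
```

Re-running a check of this kind whenever new data sets become available is one way of testing whether earlier anonymisation remains robust.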

The content of this article is provided for information purposes only and does not constitute legal or other advice.

Tech Law is a weekly series brought to you by Irish law firm Mason Hayes & Curran, whose legal tech team advises the world’s top social media organisations and emerging start-ups. Contact a member of the MHC Technology team or visit www.mhc.ie for more information.