Published in Asian-mena Counsel: Cyber Crime & Data Protection Special Report 2019
By Anita Lam, Brian Harley and William Wong of Clifford Chance
E: Anita.Lam@CliffordChance.com
E: Brian.Harley@CliffordChance.com
E: William.Wong@CliffordChance.com
Practical ways to tackle laws regarding the collection, use, ownership and deletion of data.
Successful businesses understand data and how to extract value from data. What businesses do with data is not only central to enhancing customer satisfaction, optimising the efficiency of their operations and improving employee relations, but also to winning trust and remaining competitive. Yet as the technologies around data grow in complexity, myths emerge around the legal rules that apply to the collection, use, ownership and deletion of data. In this article we debunk some of those myths and provide practical ways to tackle the issues.
Myth 1: You own your data
Data is a type of intangible asset, similar to literary works, inventions or trademarks, so surely it must be protected by some kind of intellectual property, right? Unfortunately not. Certain types of data may be protected by intellectual property rights if quite specific requirements are met: copyright may sometimes apply, although automatically generated data will rarely meet the threshold of an “original work of authorship”; certain jurisdictions recognise database rights, but these are limited in the scope of protection they afford and inconsistent across different legal systems. So, as a general proposition, machine-generated data is not straightforwardly subject to ownership by way of any intellectual property right.
If that were not enough of a hurdle for businesses seeking to realise the value of the data they generate or collect, certain types of data are also heavily regulated — especially personal data. Where the European General Data Protection Regulation (GDPR) applies, for example, businesses that collect personal data from individuals are tightly restricted in how they can process an individual’s personal data, to whom and to which countries the data can be transferred, and they may be under an obligation to delete an individual’s data on request (the so-called “right to be forgotten” — see Myth 3). So, far from enjoying some kind of free title of ownership in data, controllers of personal data have only a heavily qualified and temporary right to use that data.
Myth 2: Data is easily anonymised
If personal data is heavily regulated, one way to avoid that regulatory burden is to ensure that the data is anonymised: if the identity of the individual whom the data concerns cannot be established, then the data is not personal data. Anonymisation can therefore provide a way for businesses to use and share data collected about individuals without infringing those individuals’ privacy rights. However, in the age of data analytics, data anonymisation is far from a straightforward proposition — merely deleting names and replacing them with an “anonymous” identifier (such as a number or random string of characters) will in most cases fall well short of the mark.
Is total anonymisation possible?
Under the GDPR, for example, a clear distinction is drawn between “pseudonymisation”, which involves the replacement of names with identifiers, and “anonymisation”, which involves making it impossible to extract personal data from a given data set. The dividing line between the two lies in whether the process by which the individual’s identity is obscured can be reversed. If the names of individuals have been replaced with numbers, all you need is a list matching those numbers to the original names to connect the data with the individuals it concerns. If the data has been encrypted, anyone who has access to the decryption process (or who can hack through the encryption) can de-anonymise the data set.
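To make the distinction concrete, the following minimal Python sketch (purely illustrative, with invented names and values) shows why pseudonymisation is reversible: the lookup table created during pseudonymisation is all that is needed to re-identify the data.

```python
import secrets

# A toy customer table containing clear personal data.
records = [
    {"name": "Alice Wong", "purchases": 12},
    {"name": "Bob Chan", "purchases": 3},
]

# Pseudonymisation: replace each name with a random identifier,
# keeping a lookup table that maps identifiers back to names.
lookup = {}
pseudonymised = []
for record in records:
    token = secrets.token_hex(8)
    lookup[token] = record["name"]
    pseudonymised.append({"id": token, "purchases": record["purchases"]})

# The pseudonymised set looks anonymous on its own...
print(pseudonymised)

# ...but anyone holding the lookup table can trivially reverse it,
# which is why the GDPR still treats this as personal data.
for row in pseudonymised:
    print(lookup[row["id"]], row["purchases"])
```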
Simply eliminating identifiers from the data set entirely won’t necessarily anonymise it: multiple studies have shown the surprising ease with which a determined researcher can match data to individuals without the need for direct identifiers. One study, for example, was able to identify specific Netflix users from an apparently anonymised data set including their ratings of movies by comparing those ratings with publicly available ratings on the online movie database IMDB and matching them to the relevant IMDB users.
Because individuals leave unique “fingerprints” on their data in so many ways — such as their daily commute embedded in mobile location data; their specific interests and preferred websites in their browsing history; their particular combination of medical conditions in their health records — truly anonymising personal data requires an understanding of the power of data analytics to reidentify apparently anonymous data.
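By way of illustration, the toy Python sketch below (our own invented data, not drawn from the study itself) mimics the Netflix re-identification technique: an “anonymised” ratings table is linked back to named public profiles simply by matching overlapping ratings.

```python
# "Anonymised" ratings: direct identifiers removed, one row per user.
anonymised = [
    {"id": "u1", "ratings": {"Movie A": 5, "Movie B": 1, "Movie C": 4}},
    {"id": "u2", "ratings": {"Movie A": 2, "Movie B": 5, "Movie C": 3}},
]

# Public profiles from a review site, with real names attached.
public = [
    {"name": "Alice", "ratings": {"Movie A": 5, "Movie B": 1, "Movie C": 4}},
    {"name": "Bob", "ratings": {"Movie A": 2, "Movie B": 5}},
]

def overlap_score(a, b):
    """Count how many titles two users rated identically."""
    return sum(1 for title, r in a.items() if b.get(title) == r)

# Link each anonymised row to the best-matching public profile.
for row in anonymised:
    best = max(public, key=lambda p: overlap_score(row["ratings"], p["ratings"]))
    print(row["id"], "is probably", best["name"])
```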
Myth 3: The right to be “forgotten” works globally
With the advancement of digital communication tools and social media, people often regret revealing personal data or other embarrassing information about themselves or others online. Deleting the relevant post or webpage may not fully undo the damage, as references to such data or information may still be seen in the results provided by search engines.
The data privacy laws in many jurisdictions provide for, to different degrees, an individual’s right to request that their personal data be deleted by the data user/controller (the “right to be forgotten” or the “right to erasure”). In the context of online search engines, the right is manifested by requesting the search engine to remove a particular link from the list of results displayed following a search conducted on the basis of a person’s name, which is effectively a “de-referencing” exercise (as opposed to the removal of personal data from the underlying material, which is not controlled by the search engine).
On September 24, 2019, the European Court of Justice (ECJ) decided in Google LLC that, under current EU law, search engines need only carry out the requested de-referencing on their European sites (eg, google.fr, google.co.uk), and not on all sites globally (such as google.com).
This decision arose from an order imposed by the French privacy regulator (CNIL) in 2015, following an earlier ECJ decision in Google Spain in May 2014 which suggested that the right to be forgotten under the relevant EU Directive applied outside of the EU as well. The CNIL required Google to apply the removal of links to all of its search engine’s domain name extensions. Google refused, and only removed the results displayed following searches conducted on EU domain names. The matter was eventually referred to the ECJ for a preliminary ruling.
Since the Google Spain decision, Google says it has been removing search results from its European sites (it has received almost 850,000 requests to remove more than 3.3 million web addresses, with about 45 percent of the addresses eventually being delisted), and restricting results from its other sites if it detects that a search originates from within the EU.
Had the ECJ decided against Google in the most recent ruling, it would have been interesting to see how companies in other parts of the world (most notably US and Chinese tech giants) would have responded to the purported application of European regulations beyond the borders of the EU.
Myth 4: Your personal data is safe on a blockchain
In typical implementations of blockchain technology, transactions are recorded chronologically on each copy of the blockchain ledger to form an immutable chain, recording the entire history of relevant transactions. Each record on the blockchain is technically visible to anyone with access to the system — and for open blockchain systems, that means literally anyone. As the key principle of blockchain is that copies of the ledger are distributed across many participants in the network, no single party controls the data on the blockchain.
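The immutability described above comes from each block committing to a cryptographic hash of its predecessor. The following minimal Python sketch (a toy chain, not any production design) shows how tampering with one historic record invalidates every later link.

```python
import hashlib
import json

def block_hash(block):
    """Hash a block's contents, including the previous block's hash."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

# Build a toy chain: each block commits to its predecessor's hash.
chain = []
prev = "0" * 64  # genesis placeholder
for tx in ["Alice pays Bob 5", "Bob pays Carol 2"]:
    block = {"tx": tx, "prev": prev}
    prev = block_hash(block)
    chain.append(block)

def verify(chain):
    """Recompute each hash and check every block points at its predecessor."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev:
            return False
        prev = block_hash(block)
    return True

print(verify(chain))                    # True
chain[0]["tx"] = "Alice pays Bob 500"   # tamper with history...
print(verify(chain))                    # False: every later link now breaks
```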
Since data recorded on blockchains is intended to be immutable, there is an inherent clash between blockchain data structures and the right to delete personal data under Hong Kong’s Personal Data (Privacy) Ordinance, the right to erasure (the “right to be forgotten”) under the GDPR, and similar legislation around the world.
How can the right to delete data be addressed in blockchain systems?
There are at least two possible ways to address this issue:
- Pseudonymisation — One way to help comply with data privacy laws is to record only pseudonymous data on the blockchain and not to process any clear personal data. For example, identifiers, names or personal details of users can be “hashed” — a one-way cryptographic operation on a set of data that makes it impossible to reverse-engineer the original data from the hashed data, but easy to verify whether the “hash” recorded on the blockchain matches the original set of data (see the first sketch after this list; on the limitations of pseudonymisation vs anonymisation, see Myth 2 above).
- Off-chain transactions — Another way to avoid data deletion issues is to take all the clear personal data off the blockchain, so there is nothing that needs to be deleted from the publicly available blockchain. Some blockchain implementations enable certain transactions to be taken “off” the blockchain and processed separately. Once an off-chain transaction is completed, the fact of the transaction can then be written back to the blockchain, without the details ever needing to appear on the public blockchain (see the second sketch after this list).
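As a rough illustration of the first approach, the Python sketch below (hypothetical names and salt, purely for illustration) records only a salted hash on-chain; anyone holding the original data can later verify the match, but the hash alone does not reveal the underlying identity.

```python
import hashlib

def hash_identity(name, salt):
    """One-way hash of a user's identity; only the hash goes on-chain."""
    return hashlib.sha256((salt + name).encode()).hexdigest()

# The salt is random per user and stored off-chain; without it,
# a dictionary attack against common names becomes much harder.
salt = "per-user-random-salt"
on_chain_value = hash_identity("Alice Wong", salt)

# Anyone holding the original data (and salt) can verify the match...
print(hash_identity("Alice Wong", salt) == on_chain_value)  # True
# ...but the hash recorded on-chain does not by itself reveal the name.
```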
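And as a rough illustration of the second approach, this sketch (again hypothetical) keeps the personal details in an ordinary, deletable off-chain store and writes only a hash commitment to the chain, so a deletion request can be honoured without altering the chain itself.

```python
import hashlib

# Off-chain store: personal data lives in an ordinary, deletable database.
off_chain_store = {}

def record_transaction(tx_id, personal_details):
    """Keep the details off-chain; write only a hash commitment on-chain."""
    off_chain_store[tx_id] = personal_details
    commitment = hashlib.sha256(personal_details.encode()).hexdigest()
    return {"tx_id": tx_id, "commitment": commitment}  # the on-chain record

on_chain = [record_transaction("tx-001", "Alice Wong bought policy P-42")]

# "Right to erasure": delete the off-chain record; the immutable
# on-chain commitment no longer points to any personal data.
del off_chain_store["tx-001"]
```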
Given that immutability is fundamental to blockchain designs, it will not usually be possible for users to change their privacy preferences once they have allowed their personal data to be written to a blockchain, so making this clear to users at the outset is particularly important. How teenagers express their privacy preferences today may be very different from how they express them 10 years from now. Blockchain developers who design their implementations around privacy-friendly principles, offering users more control over and a better understanding of how their data will be used, are likely to win more trust from customers and employees.
Myth 5: Facial recognition technology breaches your privacy rights
As a start, there is no general right to privacy in Hong Kong.
Under current Hong Kong law, no personal data is taken to be collected unless the data either relates to an identified person or to a person whom the data user intends to identify.
Unlike traditional CCTV, which merely captures facial images, facial recognition software reads the geometry of faces to distinguish between different individuals and gives each individual a personal identifier, or facial signature. If that personal identifier is then mapped to a database of known faces, it can be used to verify a person’s identity, and a collection of personal data then takes place. However, if the personal identifier is not mapped to any database that can identify the individual, there is technically no collection of personal data under Hong Kong data privacy laws.
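To illustrate where that line is drawn, the toy Python sketch below (invented vectors standing in for real face embeddings) computes a facial-signature match: the signature alone identifies no one; identification happens only when the signature is mapped against a database of known, named faces.

```python
import math

def cosine_similarity(a, b):
    """Compare two signature vectors; closer to 1 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# A "facial signature": a numeric vector derived from face geometry
# (made-up numbers here, standing in for a real embedding).
captured_signature = [0.12, 0.87, 0.44, 0.30]

# Identification only happens at this step: mapping the signature
# against a database of known, named faces.
known_faces = {
    "Alice Wong": [0.11, 0.88, 0.45, 0.29],
    "Bob Chan": [0.70, 0.10, 0.22, 0.95],
}

best = max(known_faces,
           key=lambda name: cosine_similarity(captured_signature, known_faces[name]))
print("Best match:", best)
```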
Interestingly, in contrast to the Hong Kong position, a number of cities in the US have now banned the use of facial recognition technology. The EU is also exploring ways to impose stricter limits on use of such technology.
What recourse is available against facial recognition that breaches data privacy laws?
Apart from the ability of data subjects to file complaints with the Hong Kong Privacy Commissioner, it is also an offence under section 64 of the Personal Data (Privacy) Ordinance for a person to disclose personal data obtained without consent, either for gain or where the disclosure causes monetary loss. The maximum penalty is a fine of HK$1 million (US$128,000) and imprisonment for five years — sanctions not to be taken lightly.
How can “ownership” of data be protected?
These limitations on ownership have not prevented businesses from generating value from data and, indeed, in their contracting terms, asserting that they do in fact own the relevant data. How do they do it, if no such property right in data exists? In short, they create one by contract. If you set out in your contracts with others what it means for you to own the data (what you can do with it, how long you can keep it, to whom you can transfer it, etc), and you ensure that anyone who receives data from you, directly or indirectly, has to sign up to those same terms, you can then create something that looks like an ownership right for that data — similar to how businesses protect their “ownership” of their trade secrets and confidential information.