Has India’s Privacy Bill Considered the Dangers of Unrestricted Processing of ‘Anonymised’ Data?

If global lessons are anything to go by, it is hard to imagine an ‘irreversibility standard’ which will ensure complete anonymity.

Anonymisation of data is neither a corollary of privacy protection nor antithetical to the idea of privacy. Instead, it is more likely a gateway to a possible privacy breach, one which has not been addressed in the government’s Personal Data Protection Bill, 2019.

The Bill, heralded as a much-needed safeguard to rein in the digital Wild West, embodies the constitutional spirit of privacy evoked by the Puttaswamy-I case. It seeks to protect the personal data of individuals collected by companies and the state by laying down a comprehensive framework for processing such data. Accordingly, it also outlines what forms of data processing are exempt from this framework.

One such exemption is the processing of anonymised data. The Bill excludes anonymised data from its coverage, except under Section 91, which covers the use of anonymised and non-personal data by the central government for targeted delivery of services and formulation of evidence-based policies.

In view of the sweeping exemption of anonymised data and its effects on privacy, it is important to understand its meaning, historical treatment and the requisite safeguards necessary for its protection. 

What is anonymised data?

Anonymisation is a technique applied to personal data to completely strip it of the characteristics, traits and identifiers that could be used to identify individuals. On account of this, it is usually not covered by legislation protecting personal data, as it is considered not to affect the privacy of an individual.

The Bill treats ‘anonymisation’ as distinct from ‘de-identification’. Broadly, anonymisation is subject to a regulatory standard of irreversibility, while de-identification is carried out to mask identifying data in accordance with the code of practice prescribed by the Data Protection Authority (Authority). Since de-identification can be reversed, its reversal without adequate consent and transparency to the data principal is now a punishable offence. Anonymisation, however, is still premised on irreversibility and the impossibility of identification.
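The reversibility of de-identification can be illustrated with a toy sketch. The example below is hypothetical and not drawn from the Bill or any prescribed code of practice; it shows why masking a low-entropy identifier (here, a ten-digit phone number) with a one-way hash is reversible in practice: an attacker can simply hash every plausible value and compare digests.

```python
import hashlib

def deidentify(phone):
    """Mask a phone number with a one-way hash, a common
    de-identification technique."""
    return hashlib.sha256(phone.encode()).hexdigest()

def reidentify(digest, candidates):
    """Reverse the masking by exhaustively hashing candidate
    numbers -- feasible because the identifier space is small."""
    for phone in candidates:
        if deidentify(phone) == digest:
            return phone
    return None

masked = deidentify("9876543210")

# An attacker who can enumerate plausible numbers recovers the
# original; a tiny candidate list keeps this sketch fast, but the
# full ten-digit space is well within brute-force reach.
print(reidentify(masked, ["9876543209", "9876543210"]))  # → 9876543210
```

The same logic applies to any masking scheme whose input space can be enumerated, which is one reason the Bill treats de-identified data as reversible and keeps it within the regulatory perimeter.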

Generally, the identifiability of an individual is considered a spectrum, with identification on one end and perfect anonymity on the other. Legal protections correspond with this understanding: personal data and its privacy are protected by law, while no equivalent protection is granted to anonymised data. However, the efficacy and very possibility of anonymisation are considered suspect by many. Some researchers argue that data can be truly anonymised only by deletion, while others argue that various technological tools can be used to achieve a practical degree of anonymity. In the same vein, there is also a growing recognition of the trade-off between the utility and privacy of a dataset.

Also read: Does the Data Protection Bill Solve the Dilemma Posed by Dominance of ‘Foreign’ Apps?

Is it possible to truly anonymise data?

This article seeks to question the assumption of irreversibility and perfect anonymity attached to such data. Over the last decade, substantial research has pointed towards the shaky standing of anonymised data. As early as the 1990s, for example, an MIT graduate student named Latanya Sweeney identified the governor of Massachusetts from three data points in an anonymised database.

The European General Data Protection Regulation (GDPR) assesses anonymisation on a standards-based approach, examining whether a dataset permits singling out, linkability or inference. These refer, respectively, to the ability to isolate an individual within the dataset, to link it with other datasets to identify an individual, or to draw inferences about an individual from it. The Article 29 Working Party, established under the erstwhile data protection regime in Europe, had highlighted the potential of every known anonymisation technique to fall short of this standard in one situation or another.

It assessed various techniques used to anonymise data, such as randomisation, generalisation and aggregation, and concluded that depending on the technique used, the data may be subject to re-identification when processed and combined with other datasets. The risk of re-identification arises when certain data points can indirectly identify an individual, either in isolation or in combination with more data. Thus, effective anonymity may be hard to ensure in practice.
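The linkability risk the Working Party describes can be sketched in a few lines. The records below are entirely hypothetical, but they mirror the classic linkage attack: an ‘anonymised’ dataset retains quasi-identifiers (here, PIN code, date of birth and sex) that a separate public register also carries, and a simple join on those fields restores the identity.

```python
# Hypothetical 'anonymised' records: names removed, quasi-identifiers kept.
health = [
    {"pin": "560001", "dob": "1962-01-15", "sex": "M", "diagnosis": "X"},
    {"pin": "560002", "dob": "1975-06-30", "sex": "F", "diagnosis": "Y"},
]
# Hypothetical public register that still carries names.
register = [
    {"name": "A. Kumar", "pin": "560001", "dob": "1962-01-15", "sex": "M"},
    {"name": "B. Rao",   "pin": "560099", "dob": "1980-03-02", "sex": "F"},
]

KEYS = ("pin", "dob", "sex")

def link(anonymised, public):
    """Re-identify records by joining the two datasets on their
    shared quasi-identifiers."""
    index = {tuple(p[k] for k in KEYS): p["name"] for p in public}
    return [
        {**rec, "name": index[tuple(rec[k] for k in KEYS)]}
        for rec in anonymised
        if tuple(rec[k] for k in KEYS) in index
    ]

# The first 'anonymous' patient is re-identified, diagnosis and all.
print(link(health, register))
```

Generalising the quasi-identifiers, for instance truncating the PIN code or keeping only the birth year, reduces but does not eliminate this linkability, which is why the Working Party treated residual re-identification risk as unavoidable.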

In this light, the article subsequently explores whether anonymised data retains some value for privacy and whether individuals should continue to have a right in it.

Reasonable expectations of privacy in anonymised data

In view of the idea of the residual risk and traits of personhood that anonymised or non-personal data always retains, its coverage in the Bill through Section 91 has the potential to lay down interesting jurisprudence regarding the contours of reasonable expectations of privacy. While the right to privacy is attached to personal data, the aim of this article is to suggest that a residual privacy right also exists in anonymised data.

The right to privacy has been declared a fundamental right, but further steps need to be taken to prevent financial losses or other misuse of data. Photo: Reuters

This is because personal data, under the Bill, includes derivatives of personal data, or ‘data about or relating to a natural person who is directly or indirectly identifiable’, which may also arise in ‘combination with other information’. Anonymised data used for targeted delivery of services, coupled with the risk of de-anonymisation, invariably renders such data an extension of personal data. It is therefore important to examine whether reasonable expectations of privacy arise in such data, especially in its use.

This enquiry is important because the right to privacy extends only as far as it can reasonably be expected to extend. For example, subject to a warrant, there is no right to privacy in the investigation of the personal diary of a criminal in which he or she has confessed to a crime. On the other hand, a right to privacy and bodily autonomy extends to my face and movements as they are currently recorded by CCTV cameras in public spaces.

Also read: Privacy Bill Will Allow Government Access to ‘Non-Personal’ Data

This is a standard and test derived from American jurisprudence, which suggests that constitutional privacy protection for an individual is derived by balancing an objective component of privacy against the subjective expectations of that person. While Justice Nariman rejected this test in Puttaswamy-I, it was endorsed in Puttaswamy-II and currently represents the dominant strand of interpretation in Indian privacy law.

The use or processing of anonymised data carries within it the risk of being de-anonymised and turning into personal data. It can be argued that the risk of misuse is a mere possibility and not a sufficient reason for recognising privacy in such data, especially when anonymisation is normally understood to be irreversible and thus protective. There are two responses to this presumption of the relative sanctity of anonymised data. Firstly, such data may not need to be subject to the same level of privacy protection as personal data; the protection needs to be graded to protect the principal, by laying down strict standards of anonymisation and punishing de-anonymisation. Secondly, since the privacy right subsists in the culmination of the risk of de-anonymisation, namely the creation of personal data, a more nuanced regulatory framework is necessary. Meanwhile, it must be kept in mind that both the state and private parties are involved in using non-personal data and in making data-based decisions that affect us, individually or collectively.

It may also be argued that an individual does not have a subjective expectation of privacy in anonymised data, by virtue of its nature, and thus the question of carving out reasonable expectations of privacy does not arise. This argument does not hold much water, because the balancing exercise turns to the objective expectation precisely where a subjective expectation of privacy is absent. For example, a state university announcing and disclosing the details of a top scorer to newspapers does not imply that the person had no right to privacy in such information. While he or she may not want to conceal such information, it is legally protected.

The entire construct can also be looked at from another perspective. Personal data includes data which indirectly identifies an individual, whether through certain specified traits or in combination with other information. The degree of indirect identifiability has not yet been laid down in the Indian context. To that extent, any semblance of recognition of a person in an anonymised dataset may overlap with indirectly identifying personal data, in which reasonable expectations of privacy naturally subsist. Thus, the Authority would also do well to lay down the extent of indirect identifiability in contrast to anonymisation.

The impact of de-anonymisation on an individual under the Bill

This enquiry arises because the envisaged use of non-personal data under the Bill opens up a wide range of possibilities for public use of anonymised data. Even otherwise, anonymisation has generally been treated by companies as a legal route to circumvent the application of data protection law. In view of these practices, the primary concern is: what happens if the data is de-anonymised by any kind of processor or fiduciary after further processing, intentionally or otherwise?

Also read: Interview | Dilution of Privacy Bill Makes Govt Surveillance a Cakewalk: Justice Srikrishna

The moment anonymisation is removed from data, it becomes personal data and falls within the purview of the Bill. Ex-ante compliance with the irreversibility standards simply permits the conversion of personal data into anonymised data. There are two possibilities in the event of de-anonymisation: the fiduciary complies with the Bill, or it does not. These options are more pronounced in the case of de-anonymisation because there is no way for individuals or the Authority to know that the data has been compromised.

It is within the exclusive domain and knowledge of a fiduciary. Envisaging this possibility, the private member Data (Privacy and Protection) Bill, 2017 provided individuals the right to be informed of a personal data breach arising due to de-anonymisation. Similarly, an ex-post sanction on re-identification of de-identified data, which includes anonymised data, has been put in place in the UK Data Protection Act, 2018. This is necessary to allocate responsibility where it is due: the processor or fiduciary which collects the data should comply with the irreversibility standard, while the ultimate processor which handles the data and re-identifies it should be sanctioned for the negligence and offence.

As things currently stand, there is no way for individuals to be informed that data which was once part of an anonymised dataset has been de-anonymised and is being used for identification or profiling. To an extent, a data principal may obtain information from a data fiduciary under Section 17. However, this assumes active attention on the individual’s part to track their informational privacy, something society is still grappling to understand in real terms.

The exercise of Section 17 by the data principal, to identify a fiduciary which may be using an erstwhile anonymised dataset, stretches both the imagination and the provision. Given the lack of incentives and oversight for processors and fiduciaries to maintain the integrity of anonymised data, it is important to ensure efficient checks through audit oversight by the Authority and an ex-post sanction to curb de-anonymisation.

Section 82 of the Bill, in its current form, punishes only the reversal of de-identification, with no sanction for the reversal of anonymisation. It also punishes such re-identification with no exemptions for the research community. This has a perceptible chilling effect and is an inadequate safeguard for the anonymised data of individuals. Thus, any sanction on re-identification must expressly guard against this possibility.

The way forward

The possibility of de-anonymisation can be effectively curtailed by the irreversibility standards laid down by the Authority. But if global lessons are anything to go by, it is hard to imagine a standard which will ensure complete anonymity.

Also read: Final Privacy Bill Could Turn India into ‘Orwellian State’: Justice Srikrishna

If that is so, it is important to lay down safeguards: sanctioning de-anonymisation; granting principals a right to transparency over the use of such data; obliging fiduciaries to inform principals the moment they possess de-anonymised personal data (currently, such notice is required to be given ‘as soon as reasonably practicable’ under Section 7); and periodic audit requirements to check the integrity of anonymised data.

It is also hoped that the development of the irreversibility standard will be informed by a technical consideration of the reasonable expectations of privacy arising in such data.

Anushka M. is a research associate at ‘IT for Change,’ a Bangalore-based NGO.

Privacy Bill Will Allow Government Access to ‘Non-Personal’ Data

The final Bill also sees the Modi government back down from its previously strict stance on data localisation.

New Delhi: India’s personal data protection (PDP) Bill, which was approved by the cabinet recently, is set to be placed in parliament this week.

The Bill draws its origins from the Justice B.N. Srikrishna Committee on data privacy, which produced a draft piece of legislation that was made public in 2018.

Since then, the contents of the final Bill have been mostly secret; it was circulated to MPs only on Tuesday afternoon.

Most experts, including retired Justice Srikrishna, believe that the Bill should not be passed without being first sent to a parliamentary committee for further review.

The new PDP Bill also contains three key clauses that were not previously included in the Srikrishna draft version and have raised some concern amongst technology companies and privacy experts.

Also read: Centre Okays Bill Proposing Rs 15 Crore Fine for Data Misuse, Easy Storage Rules

These include sections that will allow the Centre to ask any “data fiduciary or data processor” to hand over anonymised personal data or “other non-personal data” that will allow better governance or targeting of citizen welfare services.


The relevant section of the Bill reads:

91. (1) Nothing in this Act shall prevent the Central Government from framing of any policy for the digital economy, including measures for its growth, security, integrity, prevention of misuse, insofar as such policy do not govern personal data.

 (2) The Central Government may, in consultation with the Authority, direct any data fiduciary or data processor to provide any personal data anonymised or other non-personal data to enable better targeting of delivery of services or formulation of evidence-based policies by the Central Government, in such manner as may be prescribed.

 Explanation.—For the purposes of this sub-section, the expression “non-personal data” means the data other than personal data.

Companies like Amazon and Flipkart have raised concerns over this provision, while others like Uber are more agreeable, as the company already provides bulk travel data through its ‘Uber Movement’ service.

Also read: Looking Beyond Privacy: The Importance of Economic Rights to Our Data

The final Bill also asks social media intermediaries, like Facebook and Twitter, to allow Indian users to “voluntarily verify” their accounts in a manner that can be prescribed in the future.

This method of voluntary verification has not been laid out by the Bill. It merely states that any user who voluntarily verifies his account “shall be provided with such demonstrable and visible mark of verification”.

Section 28 of the Bill notes:

(3) Every social media intermediary which is notified as a significant data fiduciary under sub-section (4) of section 26 shall enable the users who register their service from India, or use their services in India, to voluntarily verify their accounts in such manner as may be prescribed.

(4) Any user who voluntarily verifies his account shall be provided with such demonstrable and visible mark of verification, which shall be visible to all users of the service, in such manner as may be prescribed.

In background briefings, IT ministry officials say that this provision will help stop online trolling, but it is not clear how this will be possible if the process remains “voluntary”. Legal experts have raised concerns that mandating a means of verification prepares the ground for possible legislative interventions in the future that may make this authentication process mandatory for all Indian social media users.

Also read: Are ‘Vested Interests’ Delaying Passage of Data Protection Bill, Asks Mahua Moitra

Finally, the Narendra Modi government appears to have backed down from its strict stance on data localisation, which required all technology companies to store a copy of their users’ “personal data” on Indian soil.

The draft Srikrishna Bill noted that every data fiduciary had to ensure the storage of one copy of personal data on a “server or data centre located in India”.

Union law minister Ravi Shankar Prasad receives the report from Justice B.N. Srikrishna. Photo: PTI

This provision has not only attracted criticism from Silicon Valley-based companies, but it also figured as a significant bone of contention in the trade talks between New Delhi and Washington DC.

The final Bill slightly reverses India’s stand, noting that “sensitive personal data may be transferred outside India”, but should continue to be stored within the country.

“The sensitive personal data may only be transferred outside India for the purpose of processing…” the Bill notes, while adding that this doesn’t apply to ‘critical personal data’.

Cabinet Approves Personal Data Protection Bill

The Bill is likely to contain broad guidelines on the collection, storage, and processing of personal data.

New Delhi: The government on Wednesday approved the Personal Data Protection Bill that will spell out a framework for handling personal data including its processing by public and private entities.

The decision was taken at a Cabinet meeting headed by Prime Minister Narendra Modi.

Also read: How Much Facebook Has to Pay in Fines and Settlements This Year

Information and Broadcasting Minister Prakash Javadekar said the Bill will be introduced in Parliament during the current Winter Session.

The Bill is likely to contain broad guidelines on the collection, storage, and processing of personal data, consent of individuals, penalties and compensation, code of conduct and an enforcement model.

Last week, IT minister Ravi Shankar Prasad said the government will soon introduce a robust and balanced Personal Data Protection Bill in Parliament, adding that India will never compromise on data sovereignty.