Open Data movements

Earlier this week, transport minister Nitin Gadkari informed the Rajya Sabha that his department had earned Rs 65 crore in revenue by selling access to centralised databases containing millions of vehicle registration and driver licence records.

This data was snapped up by 87 private entities and 32 government entities, according to Gadkari’s reply to a question raised in the upper parliamentary house. The sale of this valuable bulk data was permitted earlier this year after the ministry rolled out a new data sharing policy, which charges commercial organisations and individuals a one-time fee of Rs 3 crore for access.

At the heart of this policy is a new ethos sweeping the Narendra Modi government. In the Economic Survey 2019, India’s chief economic adviser Krishnamurthy Subramanian devotes a whole chapter to this, asserting data in India needs to be treated as a public good.

Nobody can doubt the economic benefits of harnessing data – the proof lies in the rise of big technology giants and how they built their digital empires.

The problem is that its not always easy for the private sector, especially in domains like education and healthcare, where the social benefits are high but profits are low.

Also read: Looking Beyond Privacy: The Importance of Economic Rights to Our Data

To fix this problem, Subramanian has made the case that the Centre should step in and correct this disparity in the country’s social sectors by giving access to data held by the government or by making it lucrative enough for the private sector to come in.

For instance, the CEA proposes that access to data generated by the public welfare system be sold to the private sector. Information on students under various education government schemes can be sold to firms that could develop “innovative tutoring products tailored to the specific needs of specific districts”. Or data generated from the MNREGA scheme could be monetised by selling it to new-age fin-tech firms that want to give them customised micro-loans.

The money that the government collects through the sale of data can then be used to help defray the costs in funding the welfare state. At one level, this explains the government’s disinterest in continuing to support welfare schemes.

The prime takeaway from the Economic Survey though – which was raised during the Aadhaar debate as well – is that impoverished people should trade their data for welfare.

One could agree with the CEA purely in terms of its economic merits, but he completely ignores the politics behind the move, especially at a time when India still hasn’t passed a comprehensive privacy legislation that passes the rigorous test of public consultation.

In the ‘open data’ and ‘right to information’ communities, both of which have a rich history, public information and records have always been treated as a public good, to be publicly available and free for consumption by everyone. Yet, the very definition of ‘information’ is important – both the RTI and Open Data movements have never classified the personal details of an individual as public information, barring a few exceptions imposed for people in public office.

The Economic Survey makes this mistake when it classifies the personal data of citizens, held in the custody of the government, as a public good. Certainly, datasets of roads, maps or climate metrics that are collected by the government could be classified as a public good, but extending it to personal data is extreme.

There can be important exceptions carved out for personally identifiable information to be treated as a public good. For instance, personal healthcare data could be crucial in dealing with public medical emergencies, for controlling the spread of disease by identifying people that need to be quarantined. But still, sharing it with the private sector is unwarranted and comes with potentially harmful consequences.

The problem is that classification of personal data is being interpreted in multiple ways by different parties. The only official definitions of various data classifications are present in the National Data Sharing and Accessibility Policy. The policy under which the government’s open data portal was established clearly states that personal and sensitive data shall not be shared.

This is not being taken into consideration at all under the current proposed frameworks and instead is being interpreted as opening up trade with the private sector.

Also read: Amid Opposition Criticism, Government Tables DNA Technology Bill in Lok Sabha

The idea of trading one’s personal data for accessing welfare has been already pushed onto people of the country with the Aadhaar initiative and its allied projects. The makers and proponents of the Aadhaar project have hidden behind the rhetoric of ‘open data’, ‘open APIs’, ‘public good’, ‘consent’, and being a ‘data-rich’ country for over ten years now.

This has happened even as they – like Nero in ancient Rome – have done their best to ignore the disastrous privacy and security fires that have been set off as a result.

One can find all these terms embedded in the Economic Survey. A look at its acknowledgement section shows that it thanks Sharad Sharma, the founder of iSPIRT, a group which built Aadhaar, IndiaStack and continues to build other technology stacks inside the government.

The private sector sees value in personal data and is building all the tools for the government to capture this data, even as they engage in regulatory capture themselves. Other and upcoming projects include National Urban Innovation Stack, Digital Sky, Public Credit Registry and the Skills Stack. All these projects are in general have been pushed through without any public consultation. Any criticism of this results in personal abuse and trolling.

The survey’s proposal for the government to sell citizen data to the private sector comes from it treating data as public good and the ownership of data not being defined. Defining ownership of data further complicates the economics of data.

The state is thus ensuring you don’t claim property rights over it. The fundamental right to privacy does not grant you the right to ownership over data – it only gives further obligation for the state to protect your privacy.

More centralisation, the better?

The state, along with the private sector, is now devising interesting ways to interpret this. The transport ministry’s bulk data sharing policy is one example of this.

But even there, Gadkari’s department is wise enough to be aware of the privacy problems that inevitably result when you combine multiple datasets together.

Also read: India’s Proposed Data Protection Measures Don’t Do Enough to Protect Data or Privacy

“There is a possibility of ‘triangulation’ (matching different data-sets that together could enable individuals to be identified and their privacy compromised). It is the responsibility of the organisation that any such activity, which result in identifying individuals using the RC [registration certificate] data-set, shall not be undertaken…,” the bulk data sharing policy warns.

CEA Subramanian believes that building a data republic is only possible if one giant government database that encompasses all datasets is created. The push for such a database, one can argue, has been driven in part by the private sector’s business requirements.

Credit: Economic Survey 2019.

Subramanian’s proposal to link all government databases for building an enterprise architecture is inspired partly from Aadhaar-based hubs like e-Pragati/SRDH of Andhra Pradesh and IndiaStack. India Enterprise Architecture (IndiaEA), a proposal for unifying government databases, was submitted to government by National eGovernance Division with UIDAI chairman J. Satyanarayana as its lead architect.

In the Economic Survey, Subramanian also gleefully anticipates how such a massive database could be used to help reduce leakages:

“If the information embedded in these datasets is utilised together, data offers potential to reduce targeting error in welfare schemes. For example, consider a hypothetical individual who is affluent enough to own a car but is able to avail BPL welfare schemes, though unwarranted. When datasets are unconnected, the vehicle registry does not speak to, say, the public distribution system registry.”

Giant databases also lay the groundwork for a new evolving paradigm called real-time governance. J. Satyanarayana has been a strong proponent of this, with his work in Andhra Pradesh, where the ‘Real-time Governance Society’ was tasked to mix and match public and private data by former chief minister Chandrababu Naidu.

The problems that accompany this are conveniently ignored. Massive databases present a single point of failure. Real-time governance also effectively means real-time tracking of citizens.

Questions raised on this surveillance aspect have never been properly answered by Satyanarayana during his public appearances. Satyanarayana’s proposal of an integrated database (People Hub) in Andhra Pradesh has, in part, contributed to the voter profiling and alleged data theft of 7 crore people seen in the recent assembly and Lok Sabha elections.

Also read: How Can India Get Better Data Processing Laws?

In the Economic Survey, the chief economic adviser and his team propose the need for ‘Data Of the People, By the People, For the People’. A careful reading of the survey shows a lack of nuance and caution, clearly tilting the balance of power in favour of corporates at the expense of citizen rights.

While the economic benefit of monetising and using public data cannot be ignored, India’s current approach towards this is better expressed by is “Data Of the People, By the Government, For the Corporates”.

Srinivas Kodali is an independent researcher working on data and the internet.