In July, the Royal Society and the British Academy held a seminar to connect debates on the governance of data and its uses. Representatives from academia, government and business gathered to share their expertise about how their sector has dealt with new applications of data and what challenges might arise in the future.

Enabling the benefits through sustained debate

A complex landscape emerged from the discussion, a complexity which means we need to rethink what kind of governance is needed to address challenges such as accountability, transparency and equity thrown up by new technologies.

This should be framed as realising the benefits, rather than avoiding risks: how do we best enable new and exciting data applications, such as machine learning, to create opportunities and benefits for the economy and society?

A sustained framework may be required that can mediate well-informed debate and decision-making with input from both experts and the public. The connected and networked landscape of data use means that such a framework also needs to look across sectors, connecting debates when appropriate, and be adaptive enough to respond to new technological developments.

The UK, with its strength in data science and the tech sector, is in a good position to lead this discussion. To do this, we need to move beyond fragmented debates fixed in the 20th century.

A 21st century debate

Throughout the day, the conversation moved away from concepts rooted in notions of data as something static, such as privacy and consent. Instead, the focus moved to new issues emerging from increasingly complex and fluid applications of data.

It is in the combining, linking and application of data that innovation is happening and new knowledge is being generated. Datasets that might seem innocuous can reveal meaningful and sometimes sensitive things when combined with other datasets. Smart algorithms make use of multiple datasets to help predict behaviour and automate decisions, in a way that moves the use of data far beyond the original purpose for which it was collected.

At the same time, individual personas matter much less than what the data says about you, the predictions that can be made about your behaviour and how that can be of benefit to you and to others. This predictive power can be advantageous, but it can be problematic if there is bias in the datasets used, or where generalisations based on data lead to stereotyping. As a result, in the context of machine learning and its applications, and even data use more generally, issues of privacy and personal identity seem to have less significance than issues of freedom of choice, autonomy and equality.

Similarly, in the age of big data and analytics it is often the unforeseen use that has the most value. As it becomes more difficult to map the causal route from data generation through collection and processing to use, consent becomes a less useful concept. Rather, it is merely one component of a much more complex governance structure.

Context matters. So does flexibility.

With the wide range of sectors represented at the seminar, the importance of the changing contexts of competing interests became clear. Relationships identified during the day included: accountability and transparency versus algorithmic performance; freedom and autonomy versus efficiency and security; sensitivity versus openness; and individual rights and benefits versus social rights and benefits.

This goes beyond simple zero-sum trade-offs. Any future governance system will need to be flexible enough to allow for different balances of such relationships. It also needs to be able to deal with new technologies and applications.

It is not just about the data

Just as it no longer makes sense to think about data applications in any one sector on its own, it does not make sense to think about data on its own. The linking and combination of datasets, and the road from collection through analysis and processing in physical systems and algorithms, all mean that when we talk about data governance, what we are really talking about is the governance of the use of data. This means that data, algorithms, or any other single data-based technology should not be considered in isolation. In fact, it is new technologies such as machine learning that have brought new questions to the fore and thrown up new challenges around transparency, accountability and freedom of choice.

Ultimately, the questions are even larger than this. Given the prevalence of new technologies both now and in the future, these questions speak to the very core of what kind of society we want to live in.

Next steps

A public record of the meeting will be published later this summer. The Royal Society and the British Academy continue to build our understanding of which key questions need to be answered, and how we can help address them.