Computers and data

Recent advances in machine learning have enabled a range of applications, as we’ve discussed on this blog before. These systems can achieve highly accurate results, but their workings can be opaque, even to experts in the field. So, when might machines be able to explain themselves? And, if they can, what new insights might we gain?

On 23 September 2016, the Royal Society brought together researchers in machine learning, artificial intelligence and cybersecurity for a panel at New Scientist Live to explore these questions, as part of the Royal Society’s machine learning project. With Nicola Millard (BT) in the chair, Joanna Bryson (University of Bath), Miranda Mowbray (HP Labs) and Stephen Roberts (University of Oxford) discussed what we understand about how machine learning works, and the implications of this understanding.

This blog post highlights some of the points raised during the discussion.

What do we want machines to explain?

Much has been written about the ‘black box’ nature of some machine learning systems, which can achieve high levels of accuracy, but whose workings cannot always be easily understood by the user. Yet interpretability can be important; for example, in assuring safety-critical systems, or in cases where a machine learning system makes a decision about an individual that the individual might wish to appeal. In fact, a new European regulation implies that individuals have a right to an explanation of decisions reached through the use of algorithms.

As the discussion started, Joanna challenged the assumption that simply asking ‘why’ an outcome was reached would yield useful insights – this wasn’t always true of humans, and might not really be what we want from machines. So, if ‘why’ was not the right question, our panel discussed what we want machines to explain, agreeing that in many cases this boiled down to showing how a system was working.

How can machines explain their workings?

Joanna, Stephen and Miranda explained different approaches to showing how a machine learning system works. These approaches include:

• Making the code for the machine learning algorithm and the data on which it is trained open, to show the mechanics of the system.
• Generating system logs, which can be interrogated by analysts after events have occurred.
• Creating mechanisms which involve language or visualisations to show what inputs the model is looking at, and how influential these are.
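To make the third approach concrete, here is a minimal sketch of one common way to estimate how influential each input is: permutation importance, which shuffles one input at a time and measures how much the model's error grows. This is an illustration rather than a method the panel described; the `model` function, the synthetic data and all names below are hypothetical stand-ins.

```python
import random


def model(row):
    # Hypothetical stand-in for a trained model: it relies heavily on
    # input 0, slightly on input 1, and ignores input 2 entirely.
    return 3.0 * row[0] + 0.5 * row[1]


def mean_squared_error(predict, rows, targets):
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, seed=0):
    """Return, for each input column, the increase in error when it is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    scores = []
    for col in range(len(rows[0])):
        # Shuffle one column, breaking its link to the targets.
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        scores.append(mean_squared_error(predict, permuted, targets) - baseline)
    return scores


# Synthetic data (an assumption made purely for this illustration).
data_rng = random.Random(1)
rows = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
targets = [model(r) for r in rows]

importances = permutation_importance(model, rows, targets)
for index, score in enumerate(importances):
    print(f"input {index}: error increase {score:.3f}")
```

A larger error increase suggests the model leans more heavily on that input. Scores like these could then feed a visualisation or a plain-language summary for the user.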

Which of these approaches is desirable, if any, depends on the machine learning system: the type of interpretability required depends on the application and the needs of the user. For example, publishing code might achieve some degree of transparency, but seeing large datasets or complicated code might not really help a user to understand the system.

Are machines thinking about things differently to us?

Nicola reminded the audience of the recent success of AlphaGo – a machine which beat the human world champion at the game of Go using moves that no human player had seen – and asked whether this showed that machines were able to create new conceptual models of the world, interpreting their environment in a different way to humans.

While not disputing the technical achievements behind AlphaGo, our panel were a little sceptical about the extent to which its success indicated that the machine was thinking about Go in a fundamentally different way to people.

For example, Miranda noted that AlphaGo had played many more games than Lee Sedol; it was therefore possible that – in the vast amount of training data it had used – the program had seen many more of these rare moves than its human opponent. She explained that machine learning can set parameters in existing models – like an instruction to ‘add salt to taste’ in a recipe – which influence the model output. At what point would we consider this a new recipe?
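The ‘add salt to taste’ idea can be sketched in a few lines. In this illustration the recipe, meaning the form of the model, is fixed in advance as `y = w * x`, and learning only chooses the value of the parameter `w` that best fits the data. The function name and the data are hypothetical, chosen only to illustrate the point.

```python
def fit_weight(xs, ys):
    # Least-squares choice of w for the fixed model y = w * x.
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical observations that happen to follow y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = fit_weight(xs, ys)
print(w)
```

The data changes the amount of salt, but the recipe itself stays the one its author wrote down.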

When will machines be able to explain themselves?

Some machine learning systems can already explain their workings. For example, Stephen noted that the Automated Statistician project has created a system which can generate an explanation of its forecasts or predictions, by breaking complicated datasets into interpretable sections and explaining its findings to the user in accessible language.

The history of AI is littered with predictions about when significant advances might be made, many of which have turned out to be wrong. With this in mind, our panel suggested that, instead of making predictions about when technical advances might happen, we should concentrate on defining the future we want and how we want to work with machines, and then work towards it.


The Royal Society is currently carrying out a project looking at the potential of machine learning over the next 5-10 years, and barriers to realising that potential. As part of this project, overseen by an expert Working Group, we’ve been holding public events and industry roundtables. For further information, take a look at our website.