This week the Royal Society launched a report on machine learning, setting out the action needed to maintain the UK’s role in advancing this technology while ensuring careful stewardship of its development. We looked to our journals to see how machine learning is finding its way into many areas of research, and have provided some highlights below.
Improving our understanding of geographical differences and inequality in health, wealth and access to resources is central to meeting international development goals. The authors of this paper, published in Journal of the Royal Society Interface, have developed a new way of mapping this variation. They combine household survey data with satellite imagery using Bayesian and machine learning modelling methods to map gender differences in literacy, stunting and use of contraceptive methods across four low-income countries. The results show the potential of this approach, and also reveal challenges in constructing consistently accurate maps.
Accurately predicting DNA targets and interacting transcriptional regulators from genome-wide data can significantly advance our understanding of gene regulatory networks. In this paper, the authors use machine learning algorithms to predict new protein partners of NKX2-5, a protein critical for normal heart development and associated with a number of congenital heart diseases, and describe the validation of those predictions. The research highlights an intuitive approach to accessing protein–protein interactions, extending the knowledge of gene regulatory networks involved in normal development and in disease.
A bit of history
The 1970s saw computing emerge from the large corporation and move into the hands of enthusiasts, students and entrepreneurs. This led to the home computer, primarily used for gaming, in the 1980s. In the 1990s computing became more complex, with the mobile system market also gathering pace, and the 2000s brought improvements in performance efficiency. This decade has seen massive developments in mobility and power, and dramatic growth in cloud computing and machine learning. This Perspective provides a history of the microprocessor – a processing unit integrated onto a single microchip, now almost 50 years old, and central to all computing.
Advances in microscopy, such as multi-photon laser scanning microscopy, have revolutionised how we visualise cells and tissues in living systems, bringing with them an immense volume of imaging data. As the field progresses, machine learning will enable improvements in image acquisition and quality, in how tracking algorithms are generated, and in how data is stored, managed and visualised. This Review discusses the progress made in quantitative and automated tissue analysis from the perspective of bioimage informatics, drawing on examples from developmental biology. The remaining challenges and future directions in the post-image analysis era are also discussed.
We are generating more data than ever before, and with that comes the need to study and evaluate moral problems related to how we record, process and share data; how we use algorithms for artificial intelligence, machine learning, and robotics; and how we mediate related practices such as innovation, programming and hacking. This article introduces the theme issue ‘The ethical impact of data science’, published in Philosophical Transactions A following a workshop on ‘The Ethics of Data Science, The Landscape for the Alan Turing Institute’ hosted by the University of Oxford. It emphasises the complexity of the ethical challenges posed by data science.
Better molecular screening
The focus of this study was to develop a user-friendly and publicly accessible web server that allows high-throughput screening of small molecules for targeting specific protein–protein interactions. Three protein complexes known for their role in apoptosis were chosen as the model system for identifying new interactions, and machine learning was used to train the system. Experimentally, this kind of screening would be a laborious task, so servers such as this allow novel protein–protein interactions to be identified quickly and accurately, which is useful for designing potential new drugs for diseases like cancer. The authors named their server PPIMpred, which stands for Prediction of Protein-Protein Interactions Modulators.