How the predictions of algorithms used in healthcare provision change: an investigation using COVID-19 and Cardiovascular disease risk prediction case studies.

Study type
Protocol
Date of Approval
Study reference ID
21_000669
Lay Summary

Artificial Intelligence (AI) and Machine Learning (ML) are names for types of computer programs that find and use patterns in data. For several years these methods have been used in systems that automatically recognise images and in attempts to build self-driving cars. They are now also being developed for computerised systems for diagnosing and treating illnesses. Like everything else used in medical care, these systems will need to be checked before they can be used with patients. The checks are to make sure that they are safe, reliable, and actually help patients.

One difference between AI and other medical technology is that AI programs can be changed particularly easily. Adaptive AI software programs can learn and change as they receive new information. That means they can get better and more useful, but it also means newer versions may make decisions that differ from those made by the version that was originally approved. At some point, a system may change enough to need reassessment by regulators, but requiring this after every minor change would slow down possible improvements. There needs to be a balance between ensuring safety and obtaining the benefits of improved treatment.

This project will investigate how AI models predict the risks of COVID and heart disease, and how these predictions change as blocks of new data are added. It should benefit NHS patients in the UK by helping medical regulators write standards defining when AI systems will need to be reassessed.

Technical Summary

Artificial Intelligence (AI) and Machine Learning (ML) techniques can find and interpret patterns in data that people find difficult to detect reliably. They are starting to be applied to medical systems. A major advantage of these approaches is their ability to learn, updating their estimates in response to new information. However, that flexibility also poses a problem for regulation, as it is important to ensure that updates do not alter the benefit-risk ratio in a way that compromises patient safety.

This project applies AI techniques to subsets of CPRD Aurum primary care data in order to estimate risks associated with COVID and cardiovascular disease (CVD). It will fit models to initial datasets, then refit the models after adding a block of more recent data. The situation around COVID has changed rapidly, while that for CVD is likely to be more stable, so these represent two different scenarios of interest. Four types of model (logistic regression, Bayesian networks, neural networks, and random forest tree-based models) will be investigated. Changes in how well the models fit the data, in the models' internal structure, in their parameter estimates and associated uncertainties, and in their predictions will be examined. The aim is to understand the relative stability and informativeness of each of these measures, in order to develop a methodology for determining whether the way an algorithm works has changed significantly, and so to inform regulators of the need for reassessment.
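The fit-then-refit comparison described above can be illustrated with a minimal sketch. It is not the study's implementation: it uses synthetic data rather than CPRD Aurum records, a single hypothetical risk factor, and a hand-written gradient-descent logistic regression standing in for the four model families. All function names (`simulate_block`, `fit_logistic`, `mean_log_loss`, `compare_refit`) and parameter values are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_block(n, intercept, slope, rng):
    """Synthetic block of (risk_factor, outcome) pairs; not real patient data."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < sigmoid(intercept + slope * x) else 0
        rows.append((x, y))
    return rows


def fit_logistic(rows, steps=300, lr=0.5):
    """Estimate (intercept, slope) by gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in rows:
            err = sigmoid(b0 + b1 * x) - y
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def mean_log_loss(rows, coef):
    """Goodness of fit of a fitted model on a set of rows (lower is better)."""
    b0, b1 = coef
    total = 0.0
    for x, y in rows:
        p = min(max(sigmoid(b0 + b1 * x), 1e-12), 1.0 - 1e-12)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(rows)


def compare_refit(initial, new_block, grid=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    """Fit on the initial data, refit on initial + new block, and summarise
    changes in parameters, predictions, and fit."""
    old = fit_logistic(initial)
    new = fit_logistic(initial + new_block)
    pred_change = [
        abs(sigmoid(new[0] + new[1] * x) - sigmoid(old[0] + old[1] * x))
        for x in grid
    ]
    return {
        "coef_change": (new[0] - old[0], new[1] - old[1]),
        "max_pred_change": max(pred_change),
        "loss_on_new_block_old_model": mean_log_loss(new_block, old),
        "loss_on_new_block_new_model": mean_log_loss(new_block, new),
    }


rng = random.Random(0)
initial = simulate_block(400, -1.0, 0.5, rng)
# "Stable" scenario: the new block follows the same assumed relationship.
stable = compare_refit(initial, simulate_block(400, -1.0, 0.5, rng))
# "Changing" scenario: the new block follows a different assumed relationship.
shifted = compare_refit(initial, simulate_block(400, 1.0, 2.0, rng))
print("stable :", stable)
print("shifted:", shifted)
```

Comparing the two printed summaries shows the kind of contrast the project is concerned with: when the new block resembles the initial data, the refitted parameters and predicted risks should move little, whereas a block generated under a different relationship should move them noticeably. Uncertainty estimates and changes in internal model structure, which the study also examines, are outside the scope of this sketch.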

This work will benefit patients, including NHS patients in England and Wales, by informing the development of regulatory standards for AI algorithms used in diagnostic systems and other medical devices, particularly adaptive AI software programs, which can learn and change as they receive new information.

Health Outcomes to be Measured

Positive COVID test result; cardiovascular event (stroke/heart attack); hospitalisation/death.

Collaborators

Mike Lonergan - Chief Investigator - CPRD
Mike Lonergan - Corresponding Applicant - CPRD
Allan Tucker - Collaborator - Brunel University London
Puja Myles - Collaborator - CPRD
Ylenia Rotalinti - Collaborator - Brunel University London

Linkages

Practice Level Index of Multiple Deprivation; Rural-Urban Classification