Predicting risk of dementia using routine electronic health records

Study type
Date of Approval
Study reference ID
Lay Summary

Detecting the signs and symptoms of dementia as early as possible could help improve treatment of the condition. Several research teams have tried, with only partial success, to predict a future dementia diagnosis using the vast amount of information held in patients' electronic health records together with information collected directly from patients. Recently, a technique from computer science known as machine learning has demonstrated the potential to handle very large amounts of information better than the more traditional methods used in those earlier studies.

The aim of this project is to predict the risk of developing dementia from electronic health records using machine learning.

If successful, this will help to identify the patients most likely to benefit from treatments earlier, and provide personalised information about risk of dementia for each patient to inform shared decision-making between GPs and individuals at risk.

Technical Summary

Systematic reviews found 21 dementia risk prediction tools for use in population-based settings, but concluded that none of them performed particularly well. One tool, which used the primary care electronic health record (EHR), was developed using traditional statistical techniques and included a limited set of cross-sectional variables. In recent years, machine learning (ML) techniques have demonstrated the potential to outperform traditional prediction methods in large-scale datasets.

This project will use ML techniques and the longitudinal nature of EHRs to develop an improved EHR-based tool for estimating a patient's risk of developing dementia. The project will investigate a broad range of previously identified, newly emerging, and novel potential predictive factors, and will develop predictive models using both traditional techniques (i.e. logistic regression) and ML techniques.
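As an illustration of the traditional baseline named above, the following is a minimal sketch of a logistic regression risk model fitted by gradient descent. It is not the project's implementation: the data are synthetic, the two predictors (`predictor_a`, `predictor_b`) are hypothetical, and the use of the area under the ROC curve as the performance measure is an assumption rather than something stated in this summary.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent.

    Returns the weights as [intercept, w1, w2, ...].
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_risk(w, x):
    """Predicted probability of the outcome for one patient."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def auc(scores, labels):
    """Chance that a random case scores above a random non-case (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic cohort: two standardised, hypothetical predictors and a binary outcome.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    predictor_a = rng.gauss(0, 1)
    predictor_b = rng.gauss(0, 1)
    true_risk = sigmoid(-1.0 + 1.2 * predictor_a + 0.5 * predictor_b)
    X.append([predictor_a, predictor_b])
    y.append(1 if rng.random() < true_risk else 0)

weights = fit_logistic(X, y)
risks = [predict_risk(weights, x) for x in X]
model_auc = auc(risks, y)
```

An ML model would be fitted to the same data and compared on the same measure, ideally on records held out from model fitting rather than on the training data as in this sketch.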

A successful primary care-based dementia risk prediction tool will aid early identification of patients most likely to benefit from preventative interventions and provide personalised information about risk to inform shared decision-making between GPs and individuals at risk.

Health Outcomes to be Measured

- A diagnosis of any form of dementia (generic definition)
- A diagnosis of dementia excluding certain sub-types with a specific aetiology (restricted definition)
- A code for dementia or a dementia-related condition (e.g. memory loss) (extended definition)
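One way to read the three outcome definitions is as nested sets of diagnostic codes. The sketch below illustrates that reading only: every code name is a hypothetical placeholder, and the actual code lists, including which sub-types the restricted definition excludes, are not given in this summary.

```python
# Hypothetical placeholder codes, invented for illustration only.
EXCLUDED_SUBTYPE_CODES = {"DEMENTIA_SUBTYPE_SPECIFIC_AETIOLOGY"}
DEMENTIA_CODES = {"DEMENTIA_SUBTYPE_A", "DEMENTIA_UNSPECIFIED"} | EXCLUDED_SUBTYPE_CODES
RELATED_CONDITION_CODES = {"MEMORY_LOSS"}


def outcome_flags(patient_codes):
    """Return whether a patient's recorded codes meet each outcome definition."""
    codes = set(patient_codes)
    # Generic: any dementia diagnosis code.
    generic = bool(codes & DEMENTIA_CODES)
    # Restricted: a dementia code outside the excluded sub-types.
    restricted = bool(codes & (DEMENTIA_CODES - EXCLUDED_SUBTYPE_CODES))
    # Extended: any dementia code or a dementia-related condition code.
    extended = generic or bool(codes & RELATED_CONDITION_CODES)
    return {"generic": generic, "restricted": restricted, "extended": extended}


example = outcome_flags(["MEMORY_LOSS"])
```

Under this reading, a patient meeting the restricted definition also meets the generic one, and a patient meeting the generic definition also meets the extended one.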


Collaborators

David Reeves - Chief Investigator - University of Manchester
Stephen Pye - Corresponding Applicant - University of Manchester
Blossom Stephan - Collaborator - Newcastle University
Cathy Morgan - Collaborator - University of Manchester
Daniel Stamate - Collaborator - Goldsmiths University of London
Darren Ashcroft - Collaborator - University of Manchester
Elizabeth Ford - Collaborator - Brighton and Sussex Medical School
Evangelos Kontopantelis - Collaborator - University of Manchester
Fionn Murtagh - Collaborator - Goldsmiths University of London
Harm Van Marwijk - Collaborator - Brighton and Sussex Medical School
John Langham - Collaborator - Goldsmiths University of London
Mihai Ermaliuc - Collaborator - Goldsmiths University of London
Neil Pendleton - Collaborator - University of Manchester
Richard Smith - Collaborator - Goldsmiths University of London

Former Collaborators

Taposhri Ganguly - Collaborator - Goldsmiths University of London


HES Admitted Patient Care; Patient Level Index of Multiple Deprivation; Practice Level Index of Multiple Deprivation