Alzheimer’s disease subtype discovery in electronic health records using cluster analysis

Study type
Protocol
Date of Approval
Study reference ID
18_111
Lay Summary

Alzheimer’s disease (AD) is the most common form of dementia, which affects 850,000 people in the UK. Patients with AD can have a mixture of symptoms. This study will investigate whether there are distinct groups of patients who share a similar set of symptoms, using a mathematical method called cluster analysis. Previously, this type of research has been carried out using information from brain imaging and memory tests, but never using patients’ health records. Because healthcare data are collected over time, this will enable us to describe how the disease progresses, which might help distinguish groups of AD patients for whom the disease progresses differently. Using this, we can predict what is likely to happen to each patient who is diagnosed with AD. Lastly, a better understanding of these groups may also help inform diagnosis for the proportion of individuals who have dementia but no more specific diagnosis. We will compare these patients to diagnosed AD patients to see if they are similar enough to indicate that they actually have AD. This research will help to inform AD patient management by finding hidden subtypes.

Technical Summary

Dementia is projected to affect 75 million people globally by 2030. AD is the most common form of dementia; diagnosis is challenging and no cure currently exists. Difficulty in meeting the needs of these patients stems from the heterogeneous nature of the disorder. Each patient can have a wide variety of symptoms, so identifying AD and targeting treatments becomes more complex. In this study, we will use cluster analysis to identify AD subtypes in electronic health records (EHR). Clustering is an exploratory approach that identifies subtypes of a disease by finding groups of patients who, across a multitude of variables, are more similar to others in their group than to patients outside it.
This technique has been used to identify AD subtypes in smaller studies using neuropsychological and imaging data. However, cluster analysis of a larger EHR database, which covers a relatively unbiased population, could improve the identification and characterisation of AD phenotypes and in turn support better disease progression models. As EHR contain longitudinal data, we will also create trajectory models that may find more clinically relevant disease subtypes. Using the subtypes found, we hope to identify undiagnosed AD patients within an unspecified dementia cohort by comparing their progression with that of patients with AD.
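
The clustering idea described above can be illustrated with a minimal sketch. The protocol does not state which clustering algorithm will be used, so plain k-means stands in here purely as an example; it is not necessarily the best choice for binary data. The symptom names, the six patient records, and the helper functions (`sq_dist`, `init_centroids`, `kmeans`) are all hypothetical and are not drawn from the study data.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(rows, k):
    """Deterministic farthest-point start: first row, then the row
    farthest from all centroids chosen so far."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(rows, k, max_iter=100):
    """Plain k-means. Returns (labels, centroids)."""
    centroids = init_centroids(rows, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each patient joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:  # no change, so the clustering has converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical binary indicators (1 = symptom recorded), invented for illustration.
FEATURES = ["memory_loss", "language_difficulty",
            "visuospatial_problems", "behavioural_change"]
patients = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]

labels, centroids = kmeans(patients, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Each centroid can be read as the proportion of patients in that group with each symptom recorded, which is one simple way to describe a candidate subtype. In real EHR data the number of clusters is unknown in advance and would need to be chosen and validated separately.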

Health Outcomes to be Measured

• Alzheimer’s disease subtypes
• Predicting outcome in Alzheimer’s disease
• Alzheimer’s disease progression
• Improved Alzheimer’s detection
• Alzheimer’s disease phenotyping
• Inform personalised treatment

Collaborators

Spiros Denaxas - Chief Investigator - University College London (UCL)
Caroline Dale - Collaborator - University College London (UCL)
Daniel Alexander - Collaborator - University College London (UCL)
Kenan Direk - Collaborator - University College London (UCL)
Nonie Alexander - Collaborator - University College London (UCL)

Linkages

HES Admitted Patient Care; ONS Death Registration Data; Patient Level Index of Multiple Deprivation