Characterisation of multimorbidity clusters and trajectories using data-driven approaches in a nationally-representative population

Date of Approval
Application Number
21_000345
Technical Summary

Our proposal seeks to deliver new knowledge on multimorbidity through the application of hypothesis-generating, data-driven analytical tools to a large-scale electronic health record dataset from a nationally-representative population. We seek to expand on previous multimorbidity research (which focuses predominantly on 30-40 common chronic conditions) to define clusters and trajectories of multimorbidity across ~200 long-term conditions. This large set of conditions has been defined through a rigorous consensus-building process and extensive review of existing validated code sets (including previously-published CPRD studies undertaken by CALIBER and the Cambridge Multimorbidity group (1–3)).

We will begin our analyses with a cross-sectional design, applying data-driven clustering (unsupervised machine learning) to identify clusters of multimorbidity in the population of England across our extensively-curated set of ~200 long-term conditions, obtained from primary care data and linked Hospital Episode Statistics (HES). We will stratify our analysis by ethnicity to identify how multimorbidity clusters across ethnic groups (including White, Black African and Caribbean, and South Asian).
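The stratified clustering step described above can be illustrated with a minimal sketch. The protocol does not name a specific clustering algorithm, so the average-linkage agglomerative clustering and the Hamming distance used here are assumptions chosen for simplicity, not the study's method. The records, patient identifiers and group assignments are invented toy data with no epidemiological meaning.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set

# Invented toy records: (patient id, ethnic group, data source, condition).
records = [
    ("a1", "South Asian", "primary care", "type 2 diabetes"),
    ("a1", "South Asian", "HES", "coronary heart disease"),
    ("a2", "South Asian", "primary care", "type 2 diabetes"),
    ("a2", "South Asian", "primary care", "coronary heart disease"),
    ("a3", "South Asian", "primary care", "asthma"),
    ("a3", "South Asian", "HES", "eczema"),
    ("b1", "White", "primary care", "COPD"),
    ("b1", "White", "HES", "heart failure"),
    ("b2", "White", "HES", "COPD"),
    ("b2", "White", "primary care", "heart failure"),
    ("b3", "White", "primary care", "depression"),
    ("b3", "White", "primary care", "anxiety"),
]


def hamming(a: Set[str], b: Set[str]) -> int:
    """Number of conditions present in exactly one of the two patients."""
    return len(a ^ b)


def average_linkage(c1: List[str], c2: List[str], data: Dict[str, Set[str]]) -> float:
    """Mean pairwise distance between the members of two clusters."""
    total = sum(hamming(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(data: Dict[str, Set[str]], k: int) -> List[List[str]]:
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[pid] for pid in sorted(data)]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], data),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return [sorted(c) for c in clusters]


# Pool conditions from both sources per patient, separately for each stratum.
strata: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
for pid, group, _source, condition in records:
    strata[group][pid].add(condition)

clusters_by_group = {g: agglomerate(dict(p), k=2) for g, p in strata.items()}
```

Running the clustering separately within each stratum, rather than once on the pooled population, is what allows the resulting cluster structure to differ between groups.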

Following this hypothesis-generating clustering, we will use an observational cohort design to investigate how multimorbidity clusters develop from index conditions and how they change over time, including to death (using linked Office for National Statistics mortality data). At this point, we will focus on a limited number of multimorbidity clusters (up to 8) selected to ensure we generate new knowledge, e.g. clusters that include less-studied conditions such as cancer or human immunodeficiency virus, as well as those where there is variation by ethnicity. We will then use epidemiological analyses, including Cox models adjusted for socioeconomic characteristics (age, sex, ethnicity, Index of Multiple Deprivation) and time-dependent variables (e.g. risk factors, clinical measurements), to characterise these trajectories. Additionally, we will apply artificial intelligence techniques (including supervised and unsupervised machine learning) in focused analyses to examine the interrelationship between the specific multimorbidity clusters under study and prescribing/polypharmacy patterns.
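To make the Cox model mentioned above more concrete, the sketch below fits a proportional hazards model with a single binary covariate by Newton-Raphson on the partial likelihood. It is a teaching illustration under strong assumptions: no tied event times, one time-fixed exposure, and no adjustment for the covariates or time-dependent variables listed in the protocol. The data and all names are hypothetical; a real analysis would use an established survival analysis library.

```python
import math
from typing import List, Tuple

# Hypothetical toy data: (follow-up time in years, event indicator, exposure).
# event = 1 means the outcome (e.g. death) was observed; 0 means censored.
subjects: List[Tuple[float, int, int]] = [
    (1.0, 1, 1), (2.0, 1, 0), (3.0, 1, 1),
    (4.0, 1, 1), (5.0, 0, 0), (6.0, 1, 0),
]


def score_and_information(beta: float, data: List[Tuple[float, int, int]]) -> Tuple[float, float]:
    """First derivative and negative second derivative of the log partial likelihood."""
    score = 0.0
    information = 0.0
    for t_i, event_i, x_i in data:
        if not event_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under follow-up at time t_i.
        risk_set = [(x_j, math.exp(beta * x_j)) for t_j, _, x_j in data if t_j >= t_i]
        s0 = sum(w for _, w in risk_set)
        s1 = sum(x * w for x, w in risk_set)
        s2 = sum(x * x * w for x, w in risk_set)
        mean_x = s1 / s0
        score += x_i - mean_x
        information += s2 / s0 - mean_x ** 2
    return score, information


def fit_cox(data: List[Tuple[float, int, int]], tol: float = 1e-10, max_iter: int = 50) -> float:
    """Newton-Raphson iterations starting from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, information = score_and_information(beta, data)
        if abs(score) < tol:
            break
        beta += score / information
    return beta


beta_hat = fit_cox(subjects)
hazard_ratio = math.exp(beta_hat)  # hazard in exposed relative to unexposed
```

The exponentiated coefficient is the hazard ratio for the exposure; in the adjusted models described in the protocol, each covariate would have its own coefficient estimated jointly.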

Health Outcomes to be Measured

Primary outcome(s):
(a) The clusters of multimorbidity*; and,
(b) The multimorbidity cluster trajectories throughout the follow-up period.

*Multimorbidity will be defined as the presence of two or more long-term conditions (out of the 220 conditions derived by clinical consensus and described in Appendix 1) within the same individual. Unsupervised machine learning algorithms applied to the dataset will then derive and define clusters of multimorbidity by grouping similar entities together, based on a measure of distance between data points. A sample of the code sets used to select conditions is shown in Appendix 2.
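The definition above can be sketched in a few lines. The example flags individuals with two or more recorded conditions and computes one possible distance between two patients. The protocol does not specify which distance measure will be used, so the Jaccard distance here is an assumption, and the patients and conditions are invented toy data.

```python
from typing import Dict, Set

# Invented toy data: patient id -> set of long-term conditions recorded.
patients: Dict[str, Set[str]] = {
    "p1": {"type 2 diabetes", "hypertension", "chronic kidney disease"},
    "p2": {"type 2 diabetes", "hypertension"},
    "p3": {"asthma"},
    "p4": {"depression", "asthma"},
}


def is_multimorbid(conditions: Set[str], threshold: int = 2) -> bool:
    """True when the individual has at least `threshold` distinct conditions."""
    return len(conditions) >= threshold


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 minus the share of conditions that two patients have in common."""
    union = a | b
    if not union:
        return 0.0  # two empty records are treated as identical
    return 1.0 - len(a & b) / len(union)


# Only multimorbid individuals would be passed on to the clustering step.
multimorbid = {pid: c for pid, c in patients.items() if is_multimorbid(c)}
```

A distance of 0 means two patients share exactly the same conditions, and a distance of 1 means they share none; clustering algorithms group patients whose pairwise distances are small.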

Secondary outcomes:
(a) Index condition to multimorbidity cluster; and,
(b) Multimorbidity cluster to death
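The two secondary outcomes are time-to-event quantities, and the sketch below shows one way they might be derived from dated diagnoses. It assumes that the index condition is the earliest recorded condition and that multimorbidity onset is the date of the second distinct condition; the function name, field names and dates are hypothetical, and cluster membership is ignored for brevity.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple


def trajectory(
    diagnoses: List[Tuple[str, date]],
    death: Optional[date],
    study_end: date,
) -> Optional[Dict[str, object]]:
    """Derive follow-up intervals for one patient, or None if never multimorbid."""
    # Keep the earliest recorded date for each distinct condition.
    first_seen: Dict[str, date] = {}
    for condition, when in diagnoses:
        if condition not in first_seen or when < first_seen[condition]:
            first_seen[condition] = when

    ordered = sorted(first_seen.items(), key=lambda item: item[1])
    if len(ordered) < 2:
        return None  # fewer than two conditions: multimorbidity not reached

    index_condition, index_date = ordered[0]
    onset_date = ordered[1][1]  # second distinct condition

    # Follow-up ends at death if it occurs within the study period,
    # otherwise the patient is censored at the end of the study.
    died = death is not None and death <= study_end
    end_date = death if died else study_end

    return {
        "index_condition": index_condition,
        "days_index_to_multimorbidity": (onset_date - index_date).days,
        "days_multimorbidity_to_end": (end_date - onset_date).days,
        "died": died,
    }


example = trajectory(
    diagnoses=[
        ("hypertension", date(2010, 1, 1)),
        ("type 2 diabetes", date(2012, 1, 1)),
        ("hypertension", date(2011, 6, 1)),  # repeat record, ignored
    ],
    death=date(2015, 1, 1),
    study_end=date(2020, 12, 31),
)
```

The resulting intervals and the event indicator are the inputs a survival model would need for each patient.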

Collaborators

Sarah Finer - Chief Investigator - Queen Mary University of London
Fabiola Eto - Corresponding Applicant - Queen Mary University of London
Alisha Angdembe - Collaborator - Queen Mary University of London
Deborah Swinglehurst - Collaborator - Barts and the London Queen Mary's School of Medicine and Dentistry
Michael Barnes - Collaborator - Barts and the London Queen Mary's School of Medicine and Dentistry
Miriam Samuel - Collaborator - Queen Mary University of London
Nick Reynolds - Collaborator - Newcastle University
Rafael Henkin - Collaborator - Queen Mary University of London
Rohini Mathur - Collaborator - London School of Hygiene & Tropical Medicine (LSHTM)
Sally Hull - Collaborator - Barts and the London Queen Mary's School of Medicine and Dentistry
Steph Taylor - Collaborator - Barts and the London Queen Mary's School of Medicine and Dentistry
Tahania Ahmad - Collaborator - Queen Mary University of London

Linkages

HES Admitted Patient Care; ONS Death Registration Data; Patient Level Index of Multiple Deprivation; Practice Level Index of Multiple Deprivation; Pregnancy Register