External validation of longitudinal predictive models of health-related outcomes in patients older than 65 years of age

Study type
Protocol
Date of Approval
Study reference ID
23_002715
Lay Summary

Predictive models are being developed using Artificial Intelligence (AI) in many fields, including healthcare. Some AI techniques can take into account how a patient’s information has changed over time, so their predictions improve as more follow-up time becomes available. IDIAPJGol, a primary care research institute in Catalonia, Spain, has developed models that predict all-cause mortality, combinations of chronic diseases, and hospital admissions one, three, and five years in advance, using data from the entire Catalan population aged over 65 across 10 years of follow-up. These models can be used at the patient level, to adjust therapy and seek to improve quality of life according to the predicted outcomes, and at the administrative level, to plan human resources and infrastructure. However, their performance has so far been established only in the Catalan population. The aim of this project is to validate them using data from the United Kingdom (UK) population, to measure how the models’ performance changes in other populations in general, and in the UK population in particular. For this purpose, the sample of the UK population available in CPRD that meets the same eligibility and follow-up criteria as in Catalonia will be selected and used to measure the models’ performance. After this validation, the models will be trained further on this sample to specialise them for the UK population. As a result, once optimised, these models could be used in the UK in the same way as in Catalonia, at both the patient and administrative levels.

Technical Summary

The aim of this project is to externally validate three AI-based predictive models of all-cause mortality, multimorbidity pattern, and hospital admission one, three and five years in advance. These models were developed by IDIAPJGol (Catalonia, Spain), a primary care research institute, with funding from ISCIII, using primary care electronic health record data from 10 years of follow-up of the over-65s in Catalonia. They are based on recurrent neural networks, which can account for how each patient's data evolve over time, and they incorporate attention mechanisms to evaluate the importance of each variable at the individual level for each prediction. Their performance has only been established in this population, so new data from a different population are needed to test their validity elsewhere.

First, the same population will be defined in CPRD data, i.e. individuals aged 65 and over in the period 2010-2019, the variables needed by the models will be calculated, and the performance of the models in this population will be measured. Second, the existing models will be trained for a few more epochs using the CPRD data, aiming to specialise them for the UK population; this approach is known as transfer learning. Differences in performance will be studied in light of the differences between the Catalan and CPRD populations and of the use of transfer learning. Finally, a guide for using and adapting these models to other information systems will be created from the experience gained in the validation with CPRD data, reporting mainly on how the data need to be adapted, how the models should be loaded, and how the predictions are obtained.
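The two steps described above, measuring performance on the new population and then continuing training for a few more epochs, can be illustrated with a minimal, dependency-free sketch. It rests on several assumptions that are not part of the protocol: a one-feature logistic model stands in for the recurrent neural networks, the area under the ROC curve (AUROC) is used as the performance measure although the summary does not name a metric, and the function names, toy data, learning rate, and epoch count are all hypothetical.

```python
import math

def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def predict(weights, bias, rows):
    """Predicted probabilities from a logistic model (stand-in for the real networks)."""
    return [1.0 / (1.0 + math.exp(-(bias + sum(w * x for w, x in zip(weights, row)))))
            for row in rows]

def fine_tune(weights, bias, rows, labels, epochs=5, lr=0.1):
    """Transfer learning in miniature: start from existing parameters and
    run a few more epochs of batch gradient descent on the new data."""
    weights = list(weights)  # copy so the source parameters are left untouched
    n = len(rows)
    for _ in range(epochs):
        preds = predict(weights, bias, rows)
        errors = [p - y for p, y in zip(preds, labels)]
        for j in range(len(weights)):
            weights[j] -= lr * sum(e * row[j] for e, row in zip(errors, rows)) / n
        bias -= lr * sum(errors) / n
    return weights, bias

# Toy target-population data (made up purely for illustration).
rows = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 1, 1]
source_weights, source_bias = [-1.0], 0.0  # hypothetical "pretrained" parameters

before = auroc(labels, predict(source_weights, source_bias, rows))
tuned_weights, tuned_bias = fine_tune(source_weights, source_bias, rows, labels)
after = auroc(labels, predict(tuned_weights, tuned_bias, rows))
print("AUROC before fine-tuning:", before)
print("AUROC after fine-tuning:", after)
```

Comparing the metric before and after fine-tuning on the same target data mirrors the planned comparison only loosely; in practice the fine-tuned model would be evaluated on data held out from the fine-tuning step.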

Health Outcomes to be Measured

All-cause mortality;
Multimorbidity pattern;
Admissions due to heart failure (ICD-10 code: I50), cerebral infarction (ICD-10 code: I63), and "other chronic obstructive pulmonary diseases" (ICD-10 code: J44; see https://icd.who.int/browse10/2019/en#/J44)

Collaborators

Sara Khalid - Chief Investigator - University of Oxford
Lucía Carrasco-Ribelles - Corresponding Applicant - University of Oxford
Daniel Prieto-Alhambra - Collaborator - University of Oxford

Linkages

HES Admitted Patient Care; ONS Death Registration Data; Patient Level Index of Multiple Deprivation; Rural-Urban Classification