Predicting Type-II Diabetes Using Primary Care information

Study type
Protocol
Date of Approval
Study reference ID
18_040
Lay Summary

The prevalence of type 2 diabetes and its associated burden of disease have increased rapidly worldwide. Doctors often use prediction scores, which draw on a range of clinically measurable biomarkers (e.g. body mass index, blood pressure), to assist in the clinical management of patients and to predict how likely they are to develop diabetes. Although several prediction scores exist, none is widely accepted or established, and most use single values from a limited set of hand-picked measurements even though many more are available. The increasing use of electronic health records in general practice means that large amounts of potentially relevant information about a patient's medical history are available for analysis and could help guide diabetes care. For example, patient records contain a large number of test results, symptoms, prescriptions, lifestyle factors and more, which could improve the accuracy of diabetes prognostic models.

This project will develop and test new analytical approaches for using this information in disease prediction in order to take advantage of the richness of the available data. We will initially test these methods in predicting the risk of developing type 2 diabetes and compare performance of the new methods with existing validated prediction models.

Technical Summary

Type 2 diabetes (T2D) prevalence is rapidly increasing and is associated with a substantial burden of disease and significant costs to healthcare systems worldwide. Risk prediction models have been developed to assist clinicians with the clinical management of patients by identifying those who are at higher risk of developing T2D and who could potentially benefit from targeted healthcare interventions or lifestyle changes. Such models are based on classical regression techniques and use a single measurement for each of a set of a priori-defined predictors. Novel methodological approaches are required to take advantage of the richness and breadth of the clinical data and biomarkers available in patient records and to combine them with novel statistical approaches to create accurate T2D risk prediction tools.

In this study, we will implement and evaluate a T2D risk prediction model using a pre-defined set of clinical markers and supervised machine learning approaches such as decision trees, support vector machines and probabilistic graphical models. We will compare our results with findings from existing validated T2D risk prediction models and with clinical knowledge from the published literature.
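The protocol does not state which metric is used to compare models. As a minimal sketch, assuming discrimination is compared with the c-statistic (area under the ROC curve), the comparison between an existing score and a new model could look like the following. The function name, variable names and risk values are all hypothetical, and the data are made-up toy numbers, not study results.

```python
def c_statistic(risks, outcomes):
    """Fraction of (case, non-case) pairs in which the case has the higher
    predicted risk; tied pairs count as half. Equivalent to the ROC AUC."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                concordant += 1.0
            elif case_risk == non_case_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Toy, made-up data: 1 = developed T2D during follow-up, 0 = did not.
outcomes = [0, 0, 1, 0, 1, 1]
# Hypothetical predicted risks from an existing score and from a new model.
baseline_risk = [0.10, 0.40, 0.35, 0.20, 0.80, 0.30]
candidate_risk = [0.05, 0.30, 0.60, 0.10, 0.90, 0.45]

baseline_c = c_statistic(baseline_risk, outcomes)
candidate_c = c_statistic(candidate_risk, outcomes)
print(f"baseline c-statistic:  {baseline_c:.3f}")
print(f"candidate c-statistic: {candidate_c:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; on a full primary care cohort a rank-based implementation from an established statistics library would be the more practical choice.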

Health Outcomes to be Measured

The primary outcome measure of the study will be the first incident diagnosis of type 2 diabetes. We will define T2D using a previously published and validated algorithm (Shah A. et al., 2015).
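To illustrate what "first incident diagnosis" means operationally, the sketch below finds each patient's earliest T2D-coded event and keeps it only if it falls on or after the start of that patient's follow-up. This is not the validated algorithm cited above, which is not reproduced here. The record layout, function names and diagnosis codes are hypothetical placeholders, and the handling of earlier diagnoses as prevalent cases is an assumption.

```python
from datetime import date

# Hypothetical placeholder codes; a real study would use a validated code list.
T2D_CODES = {"T2D_CODE_A", "T2D_CODE_B"}


def first_t2d_diagnosis(events, t2d_codes=T2D_CODES):
    """Map each patient to the date of their earliest T2D-coded event.

    `events` is an iterable of (patient_id, event_date, code) tuples.
    """
    first = {}
    for patient_id, event_date, code in events:
        if code not in t2d_codes:
            continue
        if patient_id not in first or event_date < first[patient_id]:
            first[patient_id] = event_date
    return first


def incident_cases(first_dates, follow_up_start):
    """Keep diagnoses on or after follow-up start; earlier ones are
    treated as prevalent disease (an assumption of this sketch)."""
    return {
        pid: diagnosed
        for pid, diagnosed in first_dates.items()
        if diagnosed >= follow_up_start[pid]
    }


# Made-up example records.
events = [
    ("p1", date(2012, 5, 1), "T2D_CODE_A"),
    ("p1", date(2010, 3, 15), "T2D_CODE_B"),
    ("p2", date(2011, 7, 9), "OTHER_CODE"),
    ("p3", date(2009, 1, 20), "T2D_CODE_A"),
]
follow_up_start = {pid: date(2010, 1, 1) for pid in ("p1", "p2", "p3")}

first_dates = first_t2d_diagnosis(events)
incident = incident_cases(first_dates, follow_up_start)
print(incident)
```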

Collaborators

Spiros Denaxas - Chief Investigator - University College London (UCL)
Spiros Denaxas - Corresponding Applicant - University College London (UCL)
Anoop Shah - Collaborator - University College London (UCL)
Arturo Gonzalez-Izquierdo - Collaborator - University College London (UCL)
Harry Hemingway - Collaborator - University College London (UCL)
Maria Pikoula - Collaborator - University College London (UCL)

Linkages

HES Admitted Patient Care; ONS Death Registration Data; Patient Level Index of Multiple Deprivation