Characterizing type 2 diabetes mellitus (T2DM) patients trajectories and health outcomes - understanding and predicting co-morbidities, risk factors and complications

Study type
Protocol
Date of Approval
Study reference ID
18_270
Lay Summary

Type 2 diabetes mellitus (T2DM) is a serious, progressive illness with a silent onset that affects the way the body uses circulating blood sugar. As the disease progresses, it can lead to severe complications in several parts of the body, including the legs and feet (e.g. loss of sensation or slow-healing wounds that may lead to amputation) and the kidneys. Because the disease remains undetected for a relatively long time in some patients, it is worth investigating which disease characteristics lead to particular complications. Identifying these characteristics requires analysing large sets of real-world patient data with new analysis techniques that were not previously available.
The UK Clinical Practice Research Datalink (CPRD) is a large database of anonymized patient data that can be re-used in clinical studies and research. Using statistical methods, this study aims to analyse T2DM patient data from CPRD to assess which factors lead to a worse or a better disease course. Additionally, this work will use machine learning techniques: computer programs that learn from large data sets to discover previously unknown relationships in the data.
One expected outcome of the study is an assessment of whether certain procedures (such as amputation) or major long-term complications (such as diabetic kidney disease) can be predicted and prevented. The study will improve understanding of the natural course of T2DM in general and of the likely medical evolution for a given patient. This will improve T2DM management and patient follow-up.

Technical Summary

CPRD is a large database of anonymized primary care patient data from the UK. These data can be analysed to build disease progression models that provide insights into disease trajectories (i.e., the sequence of medical events observed in individual patients) and thereby help physicians to better manage a patient’s disease. The study will therefore provide insights into tertiary disease prevention, a novel but increasingly prominent domain in healthcare.
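
As an illustration of what a disease trajectory means here, the following minimal Python sketch groups medical events by patient and orders them in time. It is not the study's data model: the `MedicalEvent` fields, the helper functions and the event labels (`t2dm_diagnosis`, `foot_ulcer`) are hypothetical and do not correspond to CPRD field names or clinical codes.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MedicalEvent:
    patient_id: str
    event_date: date
    code: str  # hypothetical label, not a real clinical code


def build_trajectories(events):
    """Group events by patient and sort each group chronologically."""
    by_patient = defaultdict(list)
    for ev in events:
        by_patient[ev.patient_id].append(ev)
    return {
        pid: sorted(evs, key=lambda e: e.event_date)
        for pid, evs in by_patient.items()
    }


def days_between(trajectory, first_code, second_code):
    """Days from the first first_code event to the first second_code event
    on or after it; None if either event is missing."""
    start = next((e.event_date for e in trajectory if e.code == first_code), None)
    if start is None:
        return None
    end = next(
        (e.event_date for e in trajectory
         if e.code == second_code and e.event_date >= start),
        None,
    )
    return None if end is None else (end - start).days


# Synthetic example records, deliberately out of chronological order.
events = [
    MedicalEvent("p1", date(2015, 6, 1), "foot_ulcer"),
    MedicalEvent("p1", date(2010, 1, 15), "t2dm_diagnosis"),
    MedicalEvent("p2", date(2012, 3, 9), "t2dm_diagnosis"),
]
trajectories = build_trajectories(events)
```

Calling `days_between(trajectories["p1"], "t2dm_diagnosis", "foot_ulcer")` returns the interval between the two events for patient `p1`, while the same call for `p2` returns `None` because no second event was recorded.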

This research aims to analyse patterns in patient cohorts, comparing traditional statistical methods with more novel machine learning approaches that use the relationships between patients’ characteristics or medical events to make predictions based on modelling of patient trajectory data.

In the first step, we will apply descriptive and analytical statistical methods, including univariate analysis, multivariate analysis with binary logistic regression, and latent class analysis (latent growth mixture models), to characterize risk factors and outcomes. In the next step, we propose a longitudinal Bayesian tensor factorization approach that relies on the Bayesian probabilistic matrix factorization (BPMF) framework, a scalable approach able to jointly model times to events and longitudinal data. We will pursue this in two ways: (1) extensions of matrix factorization methods to allow the joint handling of time-dependent, mixed-type data and (2) new extensions of deep learning algorithms, such as Long Short-Term Memory (LSTM), with appropriate modifications.
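
To give a sense of the core idea behind matrix factorization on incomplete patient data, the sketch below fits a low-rank model to a small patients-by-measurements matrix using only the observed entries. It is a deliberate simplification and not the method proposed in this protocol: it computes a point estimate by stochastic gradient descent with L2 regularization (equivalent to a MAP estimate under Gaussian priors), whereas BPMF infers a posterior distribution over the factors, and it handles neither time dependence nor mixed data types. All names, hyperparameters and data values are hypothetical and synthetic.

```python
import random


def factorize(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    """Fit U (n_rows x rank) and V (n_cols x rank) so that U[i] . V[j]
    approximates observed[(i, j)], ignoring missing entries."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    entries = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(entries)
        for (i, j), x in entries:
            err = x - sum(U[i][k] * V[j][k] for k in range(rank))
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on squared error plus L2 penalty.
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    return U, V


def predict(U, V, i, j):
    """Reconstructed value for patient i and measurement j."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def rmse(observed, U, V):
    """Root mean squared error over the observed entries only."""
    sq = [(x - predict(U, V, i, j)) ** 2 for (i, j), x in observed.items()]
    return (sum(sq) / len(sq)) ** 0.5


# Synthetic rank-1 data: 4 hypothetical patients x 3 hypothetical measurements.
row_scale = [1.0, 2.0, 3.0, 1.5]
col_scale = [1.0, 0.5, 2.0]
missing = {(0, 2), (2, 1), (3, 0)}  # entries treated as unobserved
observed = {
    (i, j): row_scale[i] * col_scale[j]
    for i in range(4) for j in range(3)
    if (i, j) not in missing
}

U0, V0 = factorize(observed, 4, 3, epochs=0)  # untrained baseline
U, V = factorize(observed, 4, 3)
```

After fitting, `predict(U, V, i, j)` can be evaluated for the entries in `missing` as well, which is how a factorization model imputes values that were never recorded for a given patient.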

The two models will deliver two main insights: an analysis of the interactions between different variables (including the risk factors identified in the first step of statistical analysis) and a prognosis for patients.

Health Outcomes to be Measured

• Lower limb complications of T2DM (list of codes is detailed in the appendix at the end of this protocol)
• Diabetic kidney disease (DKD)

Collaborators

Yves Moreau - Chief Investigator - KU Leuven University
Marc Twagirumukiza - Corresponding Applicant - Janssen Pharmaceutica NV
Edward de Brouwer - Collaborator - KU Leuven University
Ákos Tonkol - Collaborator - Janssen Pharmaceutica NV

Linkages

HES Admitted Patient Care; HES Outpatient; ONS Death Registration Data; Practice Level Index of Multiple Deprivation