Comparison of modelling methods for risk prediction of cardiovascular outcomes in patients with hypertension
Date of Approval
Medications shown to be effective in rigorous clinical trials are subsequently used by GPs and other healthcare professionals in their daily practice. This daily practice is recorded in electronic patient records, which researchers can use with ethical approval. The Clinical Practice Research Datalink (CPRD) is one such “real-world” research database, and includes patient records from around 10% of UK GP practices. Such databases differ from clinical trial databases in key ways: they cover a broader range of patients but have lower data quality. Researchers wishing to use CPRD need to understand problems such as missing data and the fact that patients who move to a GP practice not covered by CPRD “disappear” from the database. In this proposal, our team plans to undertake a range of statistical analyses to determine what impact these data problems have on results and how best to overcome them.
To make the research concrete, we will analyse records for patients newly diagnosed with high blood pressure and build statistical models to predict which kinds of patients are at higher risk of cardiovascular disease after diagnosis. A particular modelling challenge is that blood pressure changes over time. This can be handled in different ways, and there is so far no agreement on which is best. The project will benefit patients and their GPs, by providing better information about cardiovascular disease risk, and researchers, by providing a framework for analysing real-world data such as CPRD.
Statistical models to estimate the risk of patient outcomes over time can be developed from either trial databases or real-world data. Methodological challenges such as loss to follow-up, time-varying covariates, measurement error, clustering and missing data can limit their validity, irrespective of data source, but the best approach to tackling these challenges is not known. Using hypertension and cardiovascular outcomes as our case study, we will use simulation to assess the impact on predicted cardiovascular disease (CVD) risk of a range of scenarios for informative drop-out and missing values. Simulation is used because the true relation between each risk factor and the outcome is then known. We will compare a range of modelling approaches to handle predictors such as systolic blood pressure (BP) and BMI that change over time. The simplest approach will be Cox models using aggregated patient-level predictors (BP etc.) and time-varying covariates. More advanced approaches to be compared with it will include mixed effects Cox models, joint modelling, and trajectory classification. The last of these will be used to describe the different ways in which BP changes over time, with the trajectory used as a predictor in other models. To deal with the competing risk of non-CVD death, we will also fit subdistribution hazard and cause-specific hazard models and compare their outputs with those of the Cox models. In all approaches, sensitivity analyses will assess the impact on risk estimates of measurement error in the outcome by using different definitions of the outcome, especially different ICD-10 codes for CVD.
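To illustrate why simulation helps with the informative drop-out question, the following is a minimal sketch rather than the study's actual simulation design. It generates exponential event times for a single binary risk factor with a known log hazard ratio, then estimates that log hazard ratio from events per unit of person-time, once under non-informative drop-out and once under a hypothetical informative mechanism in which some exposed patients leave follow-up shortly before their event. The function name, the rates, the 10-year follow-up, the 40% affected fraction and the 0.9 timing factor are all illustrative assumptions.

```python
import math
import random


def simulate_log_hr(n, beta, informative, seed=0):
    """Estimate the log hazard ratio for a binary risk factor from simulated data.

    All numeric settings below are illustrative assumptions, not study values.
    """
    rng = random.Random(seed)
    base_hazard = 0.05    # assumed baseline event rate per year
    dropout_rate = 0.05   # assumed non-informative drop-out rate per year
    followup_end = 10.0   # assumed administrative end of follow-up (years)

    events = [0, 0]           # observed events by risk-factor group
    person_time = [0.0, 0.0]  # observed follow-up time by group

    for _ in range(n):
        x = rng.randint(0, 1)  # binary risk factor (e.g. high BP at baseline)
        event_time = rng.expovariate(base_hazard * math.exp(beta * x))
        dropout_time = rng.expovariate(dropout_rate)
        u = rng.random()  # always drawn so both scenarios share event times

        # Hypothetical informative mechanism: 40% of exposed patients leave
        # the database shortly before their event would have been recorded.
        if informative and x == 1 and u < 0.4:
            dropout_time = min(dropout_time, 0.9 * event_time)

        observed = min(event_time, dropout_time, followup_end)
        person_time[x] += observed
        if observed == event_time:
            events[x] += 1

    # With constant hazards, the rate (events / person-time) is the maximum
    # likelihood estimate of the hazard in each group.
    rate0 = events[0] / person_time[0]
    rate1 = events[1] / person_time[1]
    return math.log(rate1 / rate0)


if __name__ == "__main__":
    true_beta = 0.5
    for label, flag in [("non-informative", False), ("informative", True)]:
        estimate = simulate_log_hr(100_000, true_beta, informative=flag, seed=1)
        print(f"{label} drop-out: estimated log HR = {estimate:.3f} "
              f"(true value {true_beta})")
```

Under the non-informative scenario the estimate should sit close to the true value, while the informative scenario should pull it well below, because events in the exposed group go unobserved. A fuller simulation would replace the simple rate ratio with the Cox-type models under comparison and vary the drop-out mechanism across scenarios.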
We aim to produce analytical guidelines for risk prediction modelling that advise on how to assess and handle these challenges.