Validation of a Risk Estimation Algorithm for Predicting the Early Risk of Chronic Kidney Disease in Newly Diagnosed Diabetics Based on General Practitioners’ Records in the United Kingdom

Study type
Protocol
Date of Approval
Study reference ID
19_040
Lay Summary

Chronic kidney disease (CKD) is one of the most frequent complications faced by people with diabetes. CKD is a gradually progressing disease that impairs renal function. It typically shows no symptoms in its early stages, but ultimately the kidneys are damaged to such an extent that the patient needs either regular blood cleansing (dialysis) to remove waste products from the body or, eventually, replacement of the kidney(s).
The ability to predict the onset of this disease accurately at an early stage, and to adjust medication and lifestyle accordingly, has the potential to prevent the progression of the disease or at least slow it substantially.
Thus far, published prediction algorithms have mostly been built on data from controlled clinical trials, which are not necessarily representative of data collected in routine care. This limits the applicability of those existing algorithms in real-world general practice.
In this study we verify whether an algorithm previously developed on a large US electronic health record database of routine care data can be transferred to routine care data collected in Europe. To this end, we use a patient population from the UK, assess how well the CKD prediction algorithm performs in that population, and discuss options for improving the method for tailored use in Europe.

Technical Summary

This is a retrospective cohort study using secondary data. People with diabetes mellitus (PwD) with and without the comorbidity of chronic kidney disease (CKD) will be selected from the CPRD data set based on pre-selected READ codes (code lists provided in Appendix A). Baseline characteristics will be described for all included patients. The relevant laboratory covariates will be extracted from the patients' records using the corresponding READ codes (see Appendix B). The performance of a previously developed multivariate logistic regression algorithm in predicting the risk of developing CKD within three years of the initial diabetes diagnosis will be compared against a selection of published benchmark algorithms developed using data from clinical trials. The comparison will be conducted both for the unrestricted real-world patient cohort (PwD with and without a CKD comorbidity) and for sub-cohorts defined by criteria that closely mimic the clinical trial settings from which those benchmark algorithms were built. Prediction performance will be measured primarily by the area under the receiver operating characteristic curve (AUC) of each algorithm, as in the original research. To obtain further insight into each algorithm's performance, we will also determine complete ROC curves, from which additional performance metrics can be calculated directly. We will discuss illustrative pairs of sensitivity and specificity in our analysis.
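
As an illustration of the evaluation described above, the following is a minimal sketch of how a complete ROC curve, its AUC, and a sensitivity/specificity pair could be derived from predicted risks and observed outcomes. It is not the study's implementation: the function names, the toy data, and the 0.5 threshold are hypothetical, and the trapezoidal rule is one common way of computing the area.

```python
from typing import List, Tuple

def roc_points(labels: List[int], scores: List[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, false positive rate, sensitivity) for each distinct score.

    labels: 1 = developed CKD within the follow-up window, 0 = did not.
    scores: predicted risk; a patient is flagged when score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(float("inf"), 0.0, 0.0)]  # nobody flagged
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all patients tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((threshold, fp / n_neg, tp / n_pos))
    return points

def trapezoid_auc(points: List[Tuple[float, float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def sensitivity_specificity(labels: List[int], scores: List[float],
                            threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when flagging score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return tp / n_pos, tn / n_neg

# Toy data, invented for illustration only (not study data).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_curve = roc_points(toy_labels, toy_scores)
print("AUC:", round(trapezoid_auc(toy_curve), 3))
print("sens/spec at 0.5:", sensitivity_specificity(toy_labels, toy_scores, 0.5))
```

Each point on the curve corresponds to one candidate risk threshold, so an illustrative sensitivity/specificity pair can be read off by choosing a threshold and taking specificity as one minus the false positive rate.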

Health Outcomes to be Measured

Overall performance of the prediction algorithm for newly diagnosed diabetes patients in the unrestricted cohort and the relevant sub-cohorts;
Comparison with algorithms from the literature;
Gender-specific and diabetes type-specific performance of the prediction algorithm;
Analysis of the feature importance of different clinical markers for the prediction of CKD.
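
A subgroup-specific performance figure, such as the gender-specific or diabetes type-specific outcome listed above, could be obtained along the following lines. This is a hedged sketch only: the record fields ("sex", "ckd", "risk"), the function names, and the toy records are hypothetical, and the AUC is computed here through its pairwise-comparison (Mann-Whitney) interpretation rather than by any method stated in the protocol.

```python
from collections import defaultdict
from typing import Dict, List, Optional

def pairwise_auc(labels: List[int], scores: List[float]) -> Optional[float]:
    """AUC as the share of (case, non-case) pairs in which the case has the
    higher predicted risk; ties count as half. Returns None when a group
    lacks either cases or non-cases, because the AUC is undefined there."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        return None
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def auc_by_group(records: List[dict], group_key: str) -> Dict[str, Optional[float]]:
    """Split records by a grouping field and compute the AUC within each group."""
    labels = defaultdict(list)
    scores = defaultdict(list)
    for record in records:
        group = record[group_key]
        labels[group].append(record["ckd"])    # hypothetical outcome field
        scores[group].append(record["risk"])   # hypothetical predicted risk
    return {g: pairwise_auc(labels[g], scores[g]) for g in labels}

# Toy records, invented for illustration only (not study data).
toy_records = [
    {"sex": "F", "ckd": 1, "risk": 0.8},
    {"sex": "F", "ckd": 0, "risk": 0.3},
    {"sex": "M", "ckd": 1, "risk": 0.5},
    {"sex": "M", "ckd": 0, "risk": 0.6},
    {"sex": "M", "ckd": 0, "risk": 0.2},
]
print(auc_by_group(toy_records, "sex"))
```

The same function could be called with a different grouping field, for example a diabetes type column, provided each subgroup contains both patients who developed CKD and patients who did not.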

Collaborators

Tony Huschto - Chief Investigator - Roche Diabetes Care GmbH
Tony Huschto - Corresponding Applicant - Roche Diabetes Care GmbH
Christian Ringemann - Collaborator - Roche
Claire Marriott - Collaborator - Roche
Helena Konig - Collaborator - Roche