Determining the applicability and feasibility of using regression discontinuity in electronic health record data

Study type
Protocol
Date of Approval
Study reference ID
20_125
Lay Summary

This study focuses on an analysis method, called regression discontinuity or “RD”, that aims to determine whether patients benefit from a given treatment. Clinical trials (that is, studies in which patients are randomly assigned to receive a treatment or not, often under highly controlled conditions) are considered the method of choice for testing whether a treatment is effective, but they may not fully capture the real-life effects that occur during routine care. Moreover, many clinical trials lack the long-run perspective necessary to evaluate treatments. RD can address both limitations by exploiting thresholds (cut-offs) in variables that influence whether someone receives a treatment or not (e.g., blood pressure above 160/100 mmHg). By “zooming in” around the threshold, it can be assumed that people just above and just below the threshold are similar to each other and can thus be compared in the same fashion as treated and untreated patients in a clinical trial. The method could potentially be applied widely in clinical medicine because clinical decisions are frequently based, at least in part, on such thresholds. RD, however, has not yet been applied to electronic health record data. The objective of this study is to determine to what degree RD can be applied to electronic health record data. If we are able to show that the method is feasible for a wide variety of variables and thresholds used in clinical medicine, then RD could be used to study the effectiveness of clinical interventions in routine care (as opposed to research settings).

Technical Summary

Regression discontinuity (RD) design, a quasi-experimental method that takes advantage of decision rules assigning patients to a clinical intervention if they fall above or below an arbitrary cut-off point, has the potential to assess causal effects of clinical interventions. This study seeks to determine the applicability and feasibility of RD in electronic health record data. Specifically, we aim to (1) determine which (if any) laboratory or physical measurements contain thresholds that are associated with a substantial change in the probability of receiving a clinical intervention, (2) evaluate whether patient characteristics are balanced within a small bandwidth surrounding these thresholds, and (3) investigate whether associations between the clinical intervention and patient outcomes are robust to different choices of bandwidth around the threshold. Exposure variables include laboratory and physical measurements such as BMI, blood pressure, HbA1c, blood glucose, age, low-density lipoprotein, thyroid-stimulating hormone level, hemoglobin, and T-score and Z-score for bone mineral density. Patient outcomes primarily include future measurements of the exposure variables, mortality (overall and by cause of death, such as due to cardiovascular disease when examining the effects of statins), and hospitalization (overall and by cause of admission). We will estimate “fuzzy” RD models, which allow the probability of treatment to change at the threshold without jumping from 0 to 1, using local linear regression to avoid overfitting the data and triangular weights to give more influence to observations close to the threshold. In addition, we will use an empirically derived mean squared error (MSE)-optimal bandwidth. We will assess the sensitivity of the results using alternative bandwidths (e.g. bandwidths that are 50%, 75%, 125%, and 150% of the MSE-optimal bandwidth).
If feasible and widely applicable, RD analyses in electronic health records could generate valuable insights into the real-life effects of clinical interventions on health and health care use, the unintended effects associated with these interventions, and potentially heterogeneous treatment effects across detailed patient subgroups.
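
The estimation approach described above can be illustrated with a minimal sketch. It is not the study's analysis code: the data are simulated, the variable names and the 160 mmHg cut-off are illustrative assumptions, and the reference bandwidth `h_ref` is a fixed placeholder rather than an empirically derived MSE-optimal bandwidth. The sketch fits a triangular-weighted local linear regression on each side of the threshold, takes the ratio of the jump in the outcome to the jump in treatment probability (the fuzzy RD estimate), and repeats this at 50%, 75%, 100%, 125%, and 150% of the reference bandwidth.

```python
import numpy as np


def triangular_weights(x, cutoff, h):
    """Triangular kernel: weight 1 at the cutoff, falling linearly to 0 at distance h."""
    return np.clip(1.0 - np.abs(x - cutoff) / h, 0.0, None)


def jump_at_cutoff(x, v, cutoff, h):
    """Estimate the discontinuity in v at the cutoff using a weighted
    local linear fit with separate intercepts and slopes on each side."""
    xc = x - cutoff
    w = triangular_weights(x, cutoff, h)
    keep = w > 0  # only observations strictly inside the bandwidth contribute
    xc, v, w = xc[keep], v[keep], w[keep]
    above = (xc >= 0).astype(float)
    # Columns: intercept, jump at cutoff, slope below, change in slope above
    X = np.column_stack([np.ones_like(xc), above, xc, above * xc])
    sw = np.sqrt(w)  # weighted least squares via rescaled ordinary least squares
    beta, *_ = np.linalg.lstsq(X * sw[:, None], v * sw, rcond=None)
    return beta[1]


def fuzzy_rd(x, treated, y, cutoff, h):
    """Fuzzy RD estimate: jump in outcome divided by jump in treatment probability."""
    first_stage = jump_at_cutoff(x, treated, cutoff, h)
    reduced_form = jump_at_cutoff(x, y, cutoff, h)
    return reduced_form / first_stage, first_stage


# Simulated example (all values are assumptions for illustration only)
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(120.0, 200.0, n)            # hypothetical baseline systolic blood pressure
p_treat = np.where(x >= 160.0, 0.7, 0.2)    # treatment probability jumps at the cutoff
treated = (rng.uniform(size=n) < p_treat).astype(float)
true_effect = -5.0                          # simulated treatment effect on the outcome
y = 0.5 * x + true_effect * treated + rng.normal(0.0, 2.0, n)

h_ref = 10.0  # placeholder reference bandwidth, NOT an MSE-optimal bandwidth
for scale in (0.5, 0.75, 1.0, 1.25, 1.5):
    estimate, first_stage = fuzzy_rd(x, treated, y, 160.0, h_ref * scale)
    print(f"bandwidth={h_ref * scale:5.1f}  first stage={first_stage:.3f}  "
          f"fuzzy RD estimate={estimate:.3f}")
```

In an applied analysis, the bandwidth would be chosen by a data-driven MSE-optimal procedure and inference would need appropriate standard errors; this sketch reports point estimates only. Estimates that stay broadly stable across the bandwidth multiples would be consistent with the robustness described in aim (3).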

Health Outcomes to be Measured

The primary outcomes that we will measure are: future measurements of the exposure variables (e.g. BMI, blood pressure, HbA1c, blood glucose, low-density lipoprotein, thyroid-stimulating hormone level, hemoglobin, and T-score and Z-score for bone mineral density), mortality (overall and by cause of death, such as due to cardiovascular disease when examining the effects of statins), and hospitalization (overall and by cause of admission).

Collaborators

Till Bärnighausen - Chief Investigator - University of Heidelberg
Julia Lemp - Corresponding Applicant - University of Heidelberg
Anant Jani - Collaborator - University of Oxford
Christian Bommer - Collaborator - University of Heidelberg
Duy Do - Collaborator - University of Heidelberg
Justine Davies - Collaborator - University of Birmingham
Michaela Theilmann - Collaborator - University of Heidelberg
Pascal Geldsetzer - Collaborator - University of Heidelberg
Sebastian Vollmer - Collaborator - Georg-August-Universität Göttingen