Investigating the applicability of regression discontinuity in electronic health record for assessing the effectiveness of prostate-specific antigen screening for the prevention of prostate cancer

Date of Approval
Application Number
Technical Summary

Regression discontinuity (RD) design is a quasi-experimental method that takes advantage of decision rules that assign patients to a clinical intervention if they fall above or below a cut-off point. RD can be used to estimate causal treatment effects in clinical medicine. This study seeks to determine the applicability and feasibility of RD in electronic health record (EHR) data in the field of clinical oncology. Specifically, we aim to (1) determine whether PSA thresholds are associated with a change in the probability of receiving a clinical intervention (here, prostate biopsy and referral to a urologist), (2) evaluate whether patient characteristics are balanced within a small bandwidth surrounding these thresholds, and (3) investigate whether associations between the clinical intervention and patient outcomes are robust to different choices of bandwidth around the threshold. Exposure variables include tumour types and laboratory and physical measurements (e.g., body-mass index, age, lactate dehydrogenase, and haemoglobin). Patient outcomes include oncology-specific measurements and outcomes such as number/duration of hospitalisations, (tumour) pain, and survival/mortality (all-cause and prostate cancer-specific). We will estimate "fuzzy" RD models using local linear regression, to avoid overfitting the data, and triangular weights, to give more influence to observations close to the threshold. In addition, we will use an empirically derived mean squared error (MSE)-optimal bandwidth. We will assess the sensitivity of the results using alternative bandwidths (e.g., bandwidths that are 50%, 75%, 125%, and 150% of the MSE-optimal bandwidth). If feasible and applicable, RD analyses in EHR data could generate valuable insights into the real-life effects of clinical interventions and help identify heterogeneous treatment effects across more granular patient subpopulations, particularly in individuals who are normally excluded from clinical trials.
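The estimation approach summarised above can be illustrated with a minimal sketch. It is not the study's analysis code. It assumes simulated data, a hypothetical PSA cut-off of 4.0, and a placeholder base bandwidth of 1.5 standing in for the MSE-optimal bandwidth, which in practice would come from a data-driven selector. The function names are illustrative. The fuzzy RD estimate is computed as the jump in the outcome at the cut-off divided by the jump in treatment probability, with each jump obtained from a triangular-weighted local linear fit.

```python
import numpy as np


def triangular_weights(x, cutoff, h):
    """Triangular kernel: weight 1 at the cutoff, falling linearly to 0 at distance h."""
    return np.clip(1.0 - np.abs(x - cutoff) / h, 0.0, None)


def local_linear_jump(x, y, cutoff, h):
    """Discontinuity in E[y | x] at the cutoff from a weighted local linear fit
    with separate intercepts and slopes on each side of the cutoff."""
    w = triangular_weights(x, cutoff, h)
    keep = w > 0
    xc = x[keep] - cutoff
    above = (xc >= 0).astype(float)
    design = np.column_stack([np.ones_like(xc), above, xc, above * xc])
    sw = np.sqrt(w[keep])  # weighted least squares via row rescaling
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y[keep] * sw, rcond=None)
    return beta[1]  # coefficient on the "above cutoff" indicator


def fuzzy_rd(x, treat, y, cutoff, h):
    """Fuzzy RD (Wald) estimate: outcome jump divided by treatment-probability jump."""
    first_stage = local_linear_jump(x, treat, cutoff, h)
    reduced_form = local_linear_jump(x, y, cutoff, h)
    return reduced_form / first_stage, first_stage


# Simulated illustration only; none of these numbers come from the study.
rng = np.random.default_rng(0)
n = 50_000
cutoff = 4.0  # hypothetical PSA threshold
psa = rng.uniform(0.0, 8.0, n)
# Probability of intervention rises from 0.2 to 0.7 at the threshold (fuzzy design).
treat = (rng.uniform(size=n) < 0.2 + 0.5 * (psa >= cutoff)).astype(float)
outcome = 2.0 + 0.3 * psa - 1.0 * treat + rng.normal(0.0, 1.0, n)

base_h = 1.5  # placeholder for the MSE-optimal bandwidth (assumption)
for scale in (0.5, 0.75, 1.0, 1.25, 1.5):
    est, fs = fuzzy_rd(psa, treat, outcome, cutoff, scale * base_h)
    print(f"bandwidth={scale * base_h:.3f}  first stage={fs:.3f}  fuzzy RD={est:.3f}")
```

The loop mirrors the planned sensitivity analysis: estimates that stay stable across 50% to 150% of the base bandwidth would suggest that the result is not driven by the bandwidth choice. A first-stage jump close to zero would indicate that the threshold does not meaningfully change the probability of intervention, in which case the ratio becomes unstable.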

Health Outcomes to be Measured

Primary outcomes will include mortality as recorded in CPRD. We do not plan to use ONS data, as previous research has shown a strong overlap between CPRD and ONS mortality data.[30] We provide further detail on this below.

We aim to include secondary outcomes that can be measured in primary care, including referral to a specialist (CPRD), diagnosis of prostate cancer (CPRD), and inclusion in active surveillance programmes (CPRD). We will also include subgroups of hospitalisations relating to complications from prostatectomy, using HES data.


Till Bärnighausen - Chief Investigator - University of Heidelberg
Maximilian Schuessler - Corresponding Applicant - University of Heidelberg
Anant Jani - Collaborator - University of Oxford
Christian Bommer - Collaborator - University of Heidelberg
Julia Lemp - Collaborator - University of Heidelberg
Justine Davies - Collaborator - University of Birmingham
Min Xie - Collaborator - University Hospital Heidelberg
Pascal Geldsetzer - Collaborator - University of Heidelberg
Sebastian Vollmer - Collaborator - Georg-August-Universität Göttingen


HES Admitted Patient Care;Patient Level Index of Multiple Deprivation;Practice Level Rural-Urban Classification