Estimating heterogeneous treatment effects from routinely collected health data: A case study on anticoagulant therapy for atrial fibrillation patients.

Application Number
Lay Summary

With the rapid evolution of medical interventions and the increasing demand for personalised information on the efficacy and safety of therapies, it has become difficult to keep track, in a timely manner, of how medical interventions perform across diverse patient populations. To fill this gap, clinical researchers have proposed leveraging the information contained in large, routinely collected health datasets, such as the Clinical Practice Research Datalink (CPRD). The purpose of this project is to appraise and validate statistical methods that use clinical practice data to estimate how treatment effects differ between population sub-groups. We propose to do so in the context of anticoagulant treatment for atrial fibrillation patients. Atrial fibrillation is the most common type of abnormal heart rhythm and can lead to blood clots forming in the heart. Anticoagulants are used in these patients to prevent stroke. In recent years, many studies have investigated the average effect of traditional and emerging anticoagulant therapies. Much less evidence exists on how to choose the optimal therapy based on individual patient characteristics. We aim to contribute to the research on methods that enable learning from clinical practice data, as well as to the understanding of optimal anticoagulation for atrial fibrillation patients.

Technical Summary

We aim to estimate heterogeneous treatment effects of anticoagulants in atrial fibrillation patients. This will involve pairwise comparisons between non-vitamin K antagonist oral anticoagulants (such as rivaroxaban and apixaban), vitamin K antagonists (such as warfarin), and no anticoagulant treatment. For each patient, the index date is the date of anticoagulant therapy initiation following a diagnosis of atrial fibrillation. When comparing against no treatment, the index date is the date of first diagnosis. The primary outcomes under consideration are stroke and systemic embolism, major bleeding, and all-cause mortality within two years of the index date. When therapy discontinuation or therapy switching takes place within the follow-up period, the patient is considered lost to follow-up from that point. We will use Targeted Maximum Likelihood Estimation (TMLE). This approach is doubly robust: the effect estimate remains consistent if at least one of the treatment and outcome models is correctly specified. Both the probability of treatment and the outcome will be modelled using Bayesian Additive Regression Trees. Findings will be compared with estimates from the more traditional Cox regression model. Estimates from these observational analyses will also be compared with those reported for comparable patients in existing clinical trials.
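
To make the targeting step concrete, the following is a minimal illustrative sketch, not the study's implementation. It assumes a binary outcome, a single binary covariate, and no censoring, and it replaces the Bayesian Additive Regression Trees named above with simple stratified proportions. All names (tmle_ate, make_cell, rows) and the toy counts are hypothetical. The initial outcome model deliberately ignores treatment, so that the update driven by the treatment model can be seen to correct it.

```python
import math


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def tmle_ate(rows, n_iter=50):
    """Targeted estimate of the average risk difference.

    rows: list of (x, a, y) tuples with binary covariate x, treatment a, outcome y.
    Assumes positivity: every stratum contains treated and untreated patients,
    and the outcome proportion in every stratum is strictly between 0 and 1.
    """
    strata = sorted({x for x, _, _ in rows})
    g, q_init = {}, {}
    for s in strata:
        in_s = [(a, y) for x, a, y in rows if x == s]
        # Treatment model: empirical P(A = 1 | X = s).
        g[s] = sum(a for a, _ in in_s) / len(in_s)
        # Initial outcome model: P(Y = 1 | X = s), deliberately ignoring treatment.
        q_init[s] = sum(y for _, y in in_s) / len(in_s)

    def clever(a, x):
        # "Clever covariate" built from the treatment model.
        return a / g[x] - (1 - a) / (1 - g[x])

    # Targeting step: one-parameter logistic fluctuation of the initial
    # outcome model, fitted by Newton's method on the score equation.
    eps = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for x, a, y in rows:
            h = clever(a, x)
            p = expit(logit(q_init[x]) + eps * h)
            score += h * (y - p)
            info += h * h * p * (1.0 - p)
        step = score / info
        eps += step
        if abs(step) < 1e-12:
            break

    # Plug-in estimate: average of the updated predictions under treatment
    # minus the updated predictions under no treatment.
    diffs = [
        expit(logit(q_init[x]) + eps * clever(1, x))
        - expit(logit(q_init[x]) + eps * clever(0, x))
        for x, _, _ in rows
    ]
    return sum(diffs) / len(diffs), eps


def make_cell(x, a, n, events):
    # n patients with covariate x and treatment a, of whom `events` have the outcome.
    return [(x, a, 1)] * events + [(x, a, 0)] * (n - events)


# Hypothetical toy cohort in which x confounds treatment and outcome.
rows = (
    make_cell(0, 1, 20, 4) + make_cell(0, 0, 60, 24)
    + make_cell(1, 1, 60, 24) + make_cell(1, 0, 20, 12)
)
ate, eps = tmle_ate(rows)
treated = [y for _, a, y in rows if a == 1]
untreated = [y for _, a, y in rows if a == 0]
crude = sum(treated) / len(treated) - sum(untreated) / len(untreated)
print(f"crude risk difference: {crude:.3f}, targeted estimate: {ate:.3f}")
```

In this toy cohort the risk difference is -0.20 within each covariate stratum, while the crude comparison gives about -0.10 because of confounding. Although the initial outcome model predicts no treatment effect at all, the targeted estimate recovers -0.20 because the treatment model is correct, which is the double-robustness property in miniature.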

Health Outcomes to be Measured

Ischaemic stroke and systemic embolism; Major bleeding; All-cause mortality.


Blanca Gallego Luxan - Chief Investigator - Macquarie University
Blanca Gallego Luxan - Corresponding Applicant - Macquarie University
James Sheppard - Collaborator - University of Oxford
Jiazhen He - Collaborator - University of Melbourne
Jie Zhu - Collaborator - Macquarie University
Karin Verspoor - Collaborator - University of Melbourne
Polina Putrik - Collaborator - Monash University
Thierry Wendling - Collaborator - Not from an Organisation
William Tong - Collaborator - Macquarie University


HES Admitted Patient Care; HES Outpatient; ONS Death Registration Data