Comparing direct and indirect methods to estimate prevalence of chronic diseases using real-world data

Study type
Protocol
Date of Approval
Study reference ID
24_003795
Lay Summary

Orphan medicines are treatments for diseases that affect fewer than 5 in 10,000 people. To encourage research into these treatments, medicines regulators have created faster pathways for their approval. To classify a disease as rare, we need to know how common it is. However, the use of “indirect” methods to calculate the number of individuals affected is under discussion.

Using pancreatic cancer, haemophilia, sickle cell disease, cystic fibrosis and pulmonary arterial hypertension as examples of rare diseases, we aim to understand how many people in the UK and other European countries have been living with these diseases in the past 10 years, how many people have been newly diagnosed, and how long the diseases last. This will allow us to check whether indirect methods can be used to identify rare diseases, which in turn could speed up the classification of orphan medicines.

Technical Summary

OBJECTIVE: To compare direct and indirect estimations of prevalence of rare, chronic diseases using routinely-collected primary care electronic health records (CPRD GOLD).

STUDY POPULATION: All individuals in CPRD GOLD during the study period 01/01/2010 to 31/12/2022 will contribute to the estimation of incidence and prevalence. All patients with the respective disease will be used to estimate median disease duration.

DISEASES OF INTEREST:
• Cystic fibrosis
• Haemophilia
• Pulmonary arterial hypertension
• Pancreatic cancer
• Sickle cell disease

STATISTICAL ANALYSES:
1) For each disease of interest, point prevalence at 01/01/2016 will be calculated. For each patient, the first diagnosis of the disease will be used, and the disease will be considered to last until the end of the patient's follow-up time. The denominator for point prevalence is the total number of persons under observation on this date.
2) For the incidence rate (over the total study period), only newly diagnosed patients contribute to the numerator. The denominator is the total number of person-years at risk, i.e. each patient's observation time within the study period, ending at diagnosis if one occurs.
3) Kaplan-Meier curves will be used to estimate survival probabilities, with time since first diagnosis as the time axis. Median disease duration is the time at which the survival probability first falls below 50%.
4) From the incidence rate and median disease duration, "indirect" prevalence will be calculated.
Analyses will be stratified by age group: children (age 01-17) and adults (age >=18).
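
The four steps above can be illustrated with a minimal Python sketch on made-up data. Everything in it is an assumption for illustration only: the `Patient` record, the function names, the use of years since study start instead of calendar dates, the `event_at_end` flag used to separate events from censoring, and the choice to compute indirect prevalence as incidence rate multiplied by median duration. It is not the study's analysis code.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List, Optional, Tuple


@dataclass
class Patient:
    start: float                       # observation start (years since study start)
    end: float                         # observation end (years since study start)
    diagnosis: Optional[float] = None  # time of first diagnosis, if any
    event_at_end: bool = False         # True if follow-up ended with the event, False if censored


def point_prevalence(patients: List[Patient], index_time: float) -> float:
    """Step 1: diagnosed patients among all persons under observation at index_time."""
    observed = [p for p in patients if p.start <= index_time <= p.end]
    cases = [p for p in observed if p.diagnosis is not None and p.diagnosis <= index_time]
    return len(cases) / len(observed)


def incidence_rate(patients: List[Patient], study_start: float, study_end: float) -> float:
    """Step 2: new diagnoses per person-year at risk within the study period."""
    new_cases = 0
    person_years = 0.0
    for p in patients:
        t0, t1 = max(p.start, study_start), min(p.end, study_end)
        if p.diagnosis is not None and p.diagnosis < t0:
            continue  # already diagnosed at entry, so never at risk
        if p.diagnosis is not None and p.diagnosis <= t1:
            new_cases += 1
            t1 = p.diagnosis  # time at risk stops at diagnosis
        person_years += max(t1 - t0, 0.0)
    return new_cases / person_years


def km_median(durations: List[Tuple[float, bool]]) -> Optional[float]:
    """Step 3: first time the Kaplan-Meier survival estimate falls below 50%."""
    at_risk = len(durations)
    survival = Fraction(1)  # exact arithmetic avoids rounding at the 50% boundary
    for t in sorted({d for d, _ in durations}):
        events = sum(1 for d, e in durations if d == t and e)
        if events:
            survival *= Fraction(at_risk - events, at_risk)
            if survival < Fraction(1, 2):
                return t
        at_risk -= sum(1 for d, _ in durations if d == t)
    return None  # the curve never drops below 50%


# Made-up cohort: six undiagnosed persons and four diagnosed patients.
COHORT = [Patient(0.0, 10.0) for _ in range(6)] + [
    Patient(0.0, 8.0, diagnosis=2.0, event_at_end=True),
    Patient(0.0, 9.0, diagnosis=5.0, event_at_end=True),
    Patient(0.0, 10.0, diagnosis=7.0, event_at_end=False),
    Patient(0.0, 5.0, diagnosis=3.0, event_at_end=True),
]

direct = point_prevalence(COHORT, index_time=6.0)
rate = incidence_rate(COHORT, study_start=0.0, study_end=10.0)
durations = [(p.end - p.diagnosis, p.event_at_end) for p in COHORT if p.diagnosis is not None]
median_duration = km_median(durations)

# Step 4 (assumed form): indirect prevalence = incidence rate x median duration.
indirect = rate * median_duration
print(f"direct={direct:.3f} indirect={indirect:.3f}")
```

The toy cohort is far too small and the disease far too common to say anything about rare diseases; it only shows how the numerators and denominators described above fit together.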

BENEFIT FOR PATIENTS:
Understanding the prevalence of diseases in the population is important to inform public health planning. This study aims to test if prevalence can be adequately estimated using information on incidence and average disease durations in real-world data. This could inform the reuse of existing estimations for public health planning and potentially reduce the number of newly needed analyses.
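
This summary does not state the formula behind the indirect estimate. A commonly used relationship, given here as an assumption rather than as the study's confirmed method, links prevalence to incidence and duration in a population at steady state. The symbols $P$ (prevalence proportion), $I$ (incidence rate) and $\bar{D}$ (mean disease duration) are notation introduced for this illustration.

```latex
\[
  \frac{P}{1 - P} = I \times \bar{D}
  \qquad\Longrightarrow\qquad
  P \approx I \times \bar{D} \quad \text{when } P \ll 1 .
\]
```

For rare diseases the condition $P \ll 1$ holds by definition, so the approximation error from dropping the $1 - P$ term is small. The relationship is stated for the mean duration; substituting a median is a further approximation whenever the distribution of disease durations is skewed.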

Health Outcomes to be Measured

Prevalence (direct and indirect), incidence and disease duration of cystic fibrosis, haemophilia, pulmonary arterial hypertension, pancreatic cancer and sickle cell disease

Collaborators

Annika Jodicke - Chief Investigator - University of Oxford
Annika Jodicke - Corresponding Applicant - University of Oxford
Antonella Delmestri - Collaborator - University of Oxford
Hezekiah Omulo - Collaborator - University of Oxford
Mandickel Kamtengeni - Collaborator - University of Oxford
Wai Yi Man - Collaborator - University of Oxford