Quality and completeness of breast cancer recording in CPRD Aurum compared to CPRD GOLD

Study type: Protocol
Date of Approval:
Study reference ID: 20_000062
Lay Summary

Patient medical data recorded using the Vision patient management software have been collected for over three decades and are managed by the CPRD. CPRD GOLD data have been well described and validated, with over 2,400 peer-reviewed publications in the last 30 years. CPRD is now providing data from a new medical record database, CPRD Aurum, which encompasses data from over 900 general practices and 30 million patients to date. Like CPRD GOLD, CPRD Aurum contains electronic healthcare records entered by GPs; however, CPRD Aurum uses a different GP patient management software, EMIS. Unlike the Vision patient management software that supports CPRD GOLD, EMIS was not developed with the additional intent to provide data for medical research. It is currently unknown what impact, if any, the differences between the two software platforms may have on the quality of the data for research purposes. Understanding the characteristics, strengths, and limitations of any new electronic medical data source is a critical step towards deciding whether it is suitable for use in medical research.
CPRD GOLD is regularly used to study many diseases and to estimate the prevalence and incidence of diseases, patient characteristics, laboratory measurements, treatment patterns, and drug effectiveness and safety. Understanding how the new and larger CPRD Aurum database compares with CPRD GOLD will enhance research capabilities in setting up and carrying out studies of different disease areas and drug exposures.

Technical Summary

CPRD is now providing data from a new medical record database, CPRD Aurum. Like CPRD GOLD, CPRD Aurum contains electronic healthcare records entered by GPs to facilitate clinical care; however, CPRD Aurum uses a different GP patient management software, EMIS. Unlike the Vision patient management software that supports CPRD GOLD, EMIS was not developed with the additional intent to provide data for medical research. While these data are sourced from the same UK health care system, it is unknown what, if any, impact the differences in the two software platforms may have on the quality of the data for research purposes.
We will describe the presence of breast cancer diagnosis codes, surgeries, drug treatments (e.g. tamoxifen, aromatase inhibitors), and other cancer care consistent with the treatment of breast cancer and compare the presence of these data elements in CPRD Aurum and CPRD GOLD. We will develop an algorithm to classify the likelihood that patients are ‘true’ breast cancer cases based on presence of these codes. We will estimate and compare crude and age-standardized incidence rates of breast cancer in CPRD Aurum and CPRD GOLD. Amgen researchers will conduct parallel analyses using the same patient population to confirm the findings.
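As an illustration of how a likelihood classification could be derived from the presence of these codes, the following is a minimal sketch. The protocol summary does not state the actual classification rules, so the rule below, the `PatientRecord` class, its field names, and the "Not a candidate" label are all hypothetical and chosen only to show the general idea.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Presence (yes/no) flags for one patient. Field names are illustrative."""
    diagnosis: bool = False          # any breast cancer diagnosis code
    drug_treatment: bool = False     # e.g. tamoxifen, aromatase inhibitors
    surgery: bool = False            # e.g. mastectomy, lumpectomy
    radiation: bool = False
    chemotherapy: bool = False
    supporting_codes: bool = False   # e.g. specialist referral, palliative care


def classify(rec: PatientRecord) -> str:
    """Assign a likelihood category using a HYPOTHETICAL rule.

    This is not the protocol's algorithm; it only shows how presence
    flags could be combined into Likely / Possible / Unsupported.
    """
    if not rec.diagnosis:
        # Patients without a diagnosis code are outside the case definition.
        return "Not a candidate"
    has_treatment = (
        rec.drug_treatment or rec.surgery or rec.radiation or rec.chemotherapy
    )
    if has_treatment:
        return "Likely"
    if rec.supporting_codes:
        return "Possible"
    return "Unsupported"


if __name__ == "__main__":
    print(classify(PatientRecord(diagnosis=True, surgery=True)))
    print(classify(PatientRecord(diagnosis=True, supporting_codes=True)))
    print(classify(PatientRecord(diagnosis=True)))
```

Applying the same function to patients drawn from each database would allow the distribution of categories to be compared between CPRD Aurum and CPRD GOLD.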
Using the data quality assessment methodologies described by Weiskopf and Weng (Weiskopf 2013), we will estimate correctness and completeness of records in CPRD Aurum and CPRD GOLD compared to those in Hospital Episode Statistics (HES) or in the Cancer Registry (NCRAS). Correctness = the proportion of patients with ≥1 breast cancer diagnosis recorded in CPRD Aurum or CPRD GOLD who also had a breast cancer diagnosis recorded in HES or NCRAS. Completeness = the proportion of CPRD linked patients with ≥1 breast cancer diagnosis recorded in HES or NCRAS who also had a breast cancer diagnosis recorded in CPRD Aurum or CPRD GOLD.
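The two definitions above differ only in their denominator, which a short sketch can make concrete. The function names and the toy patient identifiers below are assumptions for illustration; in practice both sets would first be restricted to patients eligible for linkage.

```python
def correctness(cprd_cases: set, reference_cases: set) -> float:
    """Share of CPRD-recorded cases that also appear in the reference source.

    Denominator: patients with a breast cancer diagnosis in CPRD.
    """
    if not cprd_cases:
        raise ValueError("no CPRD cases to evaluate")
    return len(cprd_cases & reference_cases) / len(cprd_cases)


def completeness(cprd_cases: set, reference_cases: set) -> float:
    """Share of reference-source cases that also appear in CPRD.

    Denominator: linked patients with a diagnosis in HES or NCRAS.
    """
    if not reference_cases:
        raise ValueError("no reference cases to evaluate")
    return len(cprd_cases & reference_cases) / len(reference_cases)


if __name__ == "__main__":
    # Toy patient identifiers, not real data.
    cprd = {1, 2, 3, 4}
    hes_or_ncras = {2, 3, 4, 5, 6}
    print(f"correctness:  {correctness(cprd, hes_or_ncras):.2f}")
    print(f"completeness: {completeness(cprd, hes_or_ncras):.2f}")
```

In this toy example three patients appear in both sources, so correctness is 3 of 4 CPRD cases and completeness is 3 of 5 reference cases.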

Health Outcomes to be Measured

We will:
• compare the proportion of patients who have the following data elements present (yes/no) in their CPRD Aurum record with the proportion of patients who have these same elements recorded in CPRD GOLD:
o Breast cancer diagnoses
o Breast cancer drug treatment (tamoxifen, aromatase inhibitors)
o Breast cancer surgery (mastectomy, lumpectomy)
o Radiation
o Chemotherapy
o Supporting Clinical Codes that indicate cancer care (e.g. specialist referrals and office visits, cancer care, palliative care)
• estimate a breast cancer likelihood classification (Likely, Possible, Unsupported) in CPRD Aurum compared to CPRD GOLD;
• calculate crude and age-standardized incidence rates of breast cancer in CPRD Aurum compared to CPRD GOLD;
• calculate correctness (e.g. accuracy, agreement) of breast cancer diagnoses recorded in CPRD Aurum and CPRD GOLD compared to HES (APC and OP) and NCRAS;
• calculate completeness (e.g. presence or missingness) of breast cancer diagnoses recorded in CPRD Aurum and CPRD GOLD compared to HES (APC and OP) and NCRAS.
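
The crude and age-standardized incidence rates listed above could be computed along the following lines. This sketch assumes direct standardization against a reference population, which the summary does not specify; the age bands, counts, and weights are invented toy values, and the function names are hypothetical.

```python
def crude_rate(cases: dict, person_years: dict, per: int = 100_000) -> float:
    """Total cases divided by total person-years, scaled to `per`."""
    return sum(cases.values()) / sum(person_years.values()) * per


def age_standardized_rate(
    cases: dict, person_years: dict, standard_pop: dict, per: int = 100_000
) -> float:
    """Direct standardization (an assumed method, not stated in the protocol).

    Each age-specific rate is weighted by that age band's share of the
    standard population, and the weighted rates are summed.
    """
    total_standard = sum(standard_pop.values())
    rate = 0.0
    for band, weight in standard_pop.items():
        age_specific = cases[band] / person_years[band]
        rate += age_specific * (weight / total_standard)
    return rate * per


if __name__ == "__main__":
    # Toy inputs keyed by age band; not real CPRD figures.
    cases = {"40-59": 10, "60+": 30}
    person_years = {"40-59": 10_000, "60+": 10_000}
    standard_pop = {"40-59": 3, "60+": 1}
    print(round(crude_rate(cases, person_years), 1))
    print(round(age_standardized_rate(cases, person_years, standard_pop), 1))
```

With these toy inputs the crude rate is 200 per 100,000 person-years and the standardized rate is 150, because the standard population gives more weight to the younger band, which has the lower age-specific rate. Standardizing both databases to the same reference population removes differences that are due only to age structure.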

Collaborators

Susan Jick - Chief Investigator - BCDSP - Boston Collaborative Drug Surveillance Program
Susan Jick - Corresponding Applicant - BCDSP - Boston Collaborative Drug Surveillance Program
Catherine Vasilakis-Scaramozza - Collaborator - BCDSP - Boston Collaborative Drug Surveillance Program
David Neasham - Collaborator - Amgen Ltd
George Kafatos - Collaborator - Amgen Ltd
Katrina Hagberg - Collaborator - BCDSP - Boston Collaborative Drug Surveillance Program
Rebecca Persson - Collaborator - BCDSP - Boston Collaborative Drug Surveillance Program

Linkages

HES Admitted Patient Care; HES Outpatient; NCRAS Cancer Registration Data; No additional NCRAS data required