COVID-19 recording in Clinical Practice Research Datalink primary care data and linked Second Generation Surveillance System SARS-CoV-2 virology test data

Date of ISAC Approval: 
11/02/2021
Lay Summary: 
CPRD supports public health research by providing de-identified primary care data and a number of linkages to other routinely collected health data sources. A new linkage to the Public Health England (PHE) Second Generation Surveillance System (SGSS) SARS-CoV-2 positive test data has recently been established (COVID-19 is the name of the disease and SARS-CoV-2 is the name of the virus). This work aims to describe the recording of COVID-19 diagnoses and test results across the CPRD primary care data and the SGSS test data, and to assess how well these agree. The groups of patients with a diagnosis or positive test result in the different databases will be described by age, gender, region, ethnicity and an indicator of how deprived the area the patient lives in is. We will establish how many of the COVID-19 records in primary care are confirmed in the linked test data, as well as how many of the patients with a positive PHE test record also have that information recorded in their primary care data. Understanding how well information on SARS-CoV-2/COVID-19 is recorded across different data sources will help researchers determine how best to use these datasets to identify COVID-19 positive study populations and to carry out research into COVID-19.
Technical Summary: 
CPRD provides de-identified primary care data for public health research. In light of the COVID-19 global pandemic, a linkage to the Public Health England (PHE) Second Generation Surveillance System (SGSS) SARS-CoV-2 positive virology pillar 1 test data was established. The added value of these data is currently poorly understood. We will establish the agreement between primary care recording of COVID-19 and the linked SGSS SARS-CoV-2 positive test data. This will assess the added value of the SGSS linkage and contribute to our understanding of how well COVID-19 diagnoses are being recorded in primary care data. This knowledge is crucial to support research into COVID-19 using these data sources, which in turn can improve our understanding of the disease and inform patient care and public health policy. All patients in CPRD Aurum and CPRD GOLD who are eligible for linkage to the SGSS SARS-CoV-2 test data and who have at least one day of registration between March 2020 and the end of the SGSS coverage period (currently July 2020) will be included. We will describe the COVID-19 case status data available via the primary care record and via the linked SGSS data in terms of numbers of events, age, gender, practice region, ethnicity, and patient-level IMD quintile. Two measures of agreement between the recording of COVID-19 diagnoses in primary care and the SGSS pillar 1 testing data will be estimated. The first is the proportion of patients with a record indicating a COVID-19 diagnosis in the primary care data who also have a positive SARS-CoV-2 test record in the SGSS data. The second is the proportion of patients within the cohort with a positive SARS-CoV-2 test record in the SGSS data who also have a record indicating a COVID-19 diagnosis in the primary care data.
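As an illustration only, the two agreement measures described in the Technical Summary could be computed along the following lines. This is a minimal sketch and not the study's analysis code: it assumes each data source has already been reduced to a set of patient identifiers restricted to the eligible cohort, and the function name, variable names and toy identifiers are hypothetical. How a "record indicating a COVID-19 diagnosis" is defined in the primary care data is not specified here and is left to the protocol.

```python
from typing import Optional, Set, Tuple


def agreement_measures(
    primary_care_cases: Set[str],
    sgss_positive: Set[str],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (measure_1, measure_2) for two sets of patient identifiers.

    measure_1: share of primary care COVID-19 cases that also have a
               positive SGSS test record.
    measure_2: share of SGSS-positive patients that also have a primary
               care COVID-19 diagnosis record.
    A measure is None when its denominator is empty.
    """
    # Patients identified as cases in both sources
    in_both = primary_care_cases & sgss_positive

    measure_1 = (
        len(in_both) / len(primary_care_cases) if primary_care_cases else None
    )
    measure_2 = len(in_both) / len(sgss_positive) if sgss_positive else None
    return measure_1, measure_2


# Toy, made-up identifiers (assumption: not real CPRD or SGSS data)
primary_care_cases = {"p1", "p2", "p3", "p4"}
sgss_positive = {"p3", "p4", "p5"}

measure_1, measure_2 = agreement_measures(primary_care_cases, sgss_positive)
print(f"Measure 1: {measure_1:.2f}")
print(f"Measure 2: {measure_2:.2f}")
```

In this toy example, two of the four primary care cases and two of the three SGSS-positive patients appear in both sources. Working with sets of patient identifiers counts each patient once, however many records they have in either source.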
Health Outcomes to be Measured: 
Record of positive SARS-CoV-2 test in the SGSS linked data; Record of SARS-CoV-2 test (positive, negative, or inconclusive) or COVID-19 diagnosis code in the primary care data
Collaborators: 

Eleanor Yelland - Chief Investigator - CPRD
Rachael Williams - Collaborator - CPRD
Susan Hodgson - Collaborator - CPRD

Linkages: 
HES Admitted; SGSS; Patient IMD