Characterisation of acute and long-term COVID-19 symptoms and complications in primary care electronic health records from the UK

Study type
Protocol
Date of Approval
Study reference ID
23_002603
Lay Summary

BACKGROUND
Many people suffer from long-term complications following an acute COVID-19 infection. The condition describing ongoing symptoms such as extreme tiredness or shortness of breath is called “long COVID”, while complications following COVID-19 such as thromboses, strokes or heart attacks are referred to as “post-acute complications”. However, it is not yet fully understood how frequent these conditions are in the population, or whether affected people share specific characteristics, such as demographics or co-morbidities.

PURPOSE
Our study will first assess the frequency of long COVID and post-acute complications in people with COVID-19 and the general population. Subsequently, we will provide summary characteristics for people with long COVID/post-acute complications. Finally, we will describe disease pathways for long COVID over time and describe clinical subgroups.

METHOD
This study will use electronic health records from primary care in the UK. The study population will consist of people with a COVID-19 infection. Among those, we will analyse how many people had long COVID symptoms or post-acute complication events and summarise their characteristics, e.g. age, sex, co-morbidities, medication before and after COVID-19, time of infection and vaccination status. Subsequently, we will study disease pathways and describe clinical subgroups of people with long COVID using advanced statistical techniques called “machine learning”.

POTENTIAL IMPORTANCE
Our results will help patients, clinicians and regulators to better understand both conditions, and inform future treatment and prevention programs.

Technical Summary

BACKGROUND
Many people suffer from long-term complications following acute COVID-19 infection, known as long COVID and “post-acute SARS-CoV-2 complications” (PASC). An increased risk of other medical conditions, e.g. diabetes, associated with COVID-19 has also been suggested. However, a comprehensive characterisation of these conditions has not yet been conducted.

AIMS
1) To study incidence and prevalence of long COVID, PASC and other medical conditions.
2) To characterise people with long COVID, PASC and other medical conditions.
3) To model temporal trajectories and define clinical subgroups of long COVID.

METHODS
Data: NHS records from CPRD GOLD/AURUM, linked to HES and the Index of Multiple Deprivation.
Participants: Individuals registered in CPRD GOLD/AURUM for >365 days comprise the source population, from which people with COVID-19, negative tests or influenza will be selected.
Outcome definition: Long COVID is defined as a positive test/clinical diagnosis with persistent symptoms for >90 days. Relevant symptoms will be identified from the WHO definition. PASC is defined as a positive test/clinical diagnosis with a record of a pre-defined complication (e.g. venous thrombosis) at >90 days.
Statistical analysis:
WP1: Incidence rates and period prevalence will be estimated (1) for long COVID, PASC and other medical conditions in people with COVID-19, negative test or influenza and (2) for long COVID, PASC, other medical conditions, COVID-19, negative test or influenza in the general population.
WP2: Large-scale characterisation of all cohorts (long COVID, PASC, COVID-19, test negative, influenza) will be done at baseline, including demographics, comorbidities, healthcare and drug utilisation before and after COVID-19 infection.
WP3: Longitudinal trajectories for long COVID will be constructed in the form COVID-19 infection -> long COVID symptom/diagnosis, with the results including sequences of codes, network-type figures, counts and significance tests. Next, long COVID symptoms will be clustered using machine-learning techniques.
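
As an illustration of the WP3 trajectories of the form COVID-19 infection -> long COVID symptom, the following minimal Python sketch tallies ordered symptom sequences per person. It is not the study's implementation: the record layout, the symptom labels, the function name `count_trajectories` and the choice to keep only each symptom's first qualifying record are assumptions made for the example, and no significance testing or clustering is attempted.

```python
from collections import Counter
from datetime import date


def count_trajectories(index_dates, symptom_records, min_days=90):
    """Count ordered symptom sequences after each person's COVID-19 index date.

    index_dates: dict of patient_id -> date of positive test/clinical diagnosis
    symptom_records: iterable of (patient_id, symptom_label, record_date)
    Only records more than `min_days` after the index date are kept, and only
    the first qualifying record of each symptom per person contributes.
    """
    first_seen = {}  # (patient_id, symptom) -> earliest qualifying date
    for pid, symptom, when in symptom_records:
        index = index_dates.get(pid)
        if index is None or (when - index).days <= min_days:
            continue
        key = (pid, symptom)
        if key not in first_seen or when < first_seen[key]:
            first_seen[key] = when

    per_patient = {}
    for (pid, symptom), when in first_seen.items():
        per_patient.setdefault(pid, []).append((when, symptom))

    sequences = Counter()
    for events in per_patient.values():
        # Sort by date; same-day symptoms are ordered alphabetically.
        ordered = tuple(symptom for _, symptom in sorted(events))
        sequences[("COVID-19",) + ordered] += 1
    return sequences


# Hypothetical toy data, not real patient records or code lists.
example_index_dates = {1: date(2021, 1, 10), 2: date(2021, 2, 1), 3: date(2021, 3, 1)}
example_records = [
    (1, "fatigue", date(2021, 5, 1)),
    (1, "dyspnoea", date(2021, 6, 1)),
    (2, "fatigue", date(2021, 6, 1)),
    (2, "dyspnoea", date(2021, 7, 1)),
    (3, "fatigue", date(2021, 4, 1)),  # only 31 days after infection: ignored
]
example_counts = count_trajectories(example_index_dates, example_records)
```

In this toy example, patients 1 and 2 share the sequence COVID-19 -> fatigue -> dyspnoea, while patient 3 contributes nothing because the only symptom record falls within 90 days of infection. Counts of this kind could serve as input to the network-type figures mentioned above.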

PUBLIC HEALTH BENEFIT
A better understanding of both conditions will inform future treatment and prevention programs.

Health Outcomes to be Measured

Long COVID symptoms, post-acute SARS-CoV-2 complications (PASC), and other pre-specified medical conditions (i.e. dementia, cancer, type 1 diabetes, renal failure, liver injury/failure, autoimmune diseases (arthritis, inflammatory bowel disease), MIS-C, myocarditis/pericarditis, dysautonomia) which were recorded >90 days after a positive test/clinical diagnosis of COVID-19 in CPRD.
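
The >90-day rule for these outcomes can be sketched in Python as below. This is an assumption-laden illustration rather than the protocol's algorithm: the function name `flag_outcomes`, the event layout and the labels `fatigue` and `venous_thrombosis` are placeholders standing in for the real symptom and complication code lists, and "more than 90 days" is read as a record date strictly later than the index date plus 90 days.

```python
from datetime import date, timedelta


def flag_outcomes(index_date, events, symptom_codes, complication_codes, min_days=90):
    """Return (long_covid, pasc) flags for one person.

    index_date: date of the positive test/clinical diagnosis of COVID-19
    events: iterable of (code, record_date) pairs from the person's record
    An event counts only if it is recorded more than `min_days` after index_date.
    """
    cutoff = index_date + timedelta(days=min_days)
    events = list(events)  # allow the iterable to be scanned twice
    long_covid = any(code in symptom_codes and when > cutoff for code, when in events)
    pasc = any(code in complication_codes and when > cutoff for code, when in events)
    return long_covid, pasc


# Placeholder code lists (hypothetical labels, not real clinical codes).
SYMPTOMS = {"fatigue", "dyspnoea"}
COMPLICATIONS = {"venous_thrombosis", "stroke"}

example_flags = flag_outcomes(
    date(2021, 1, 10),
    [("fatigue", date(2021, 2, 1)), ("venous_thrombosis", date(2021, 6, 15))],
    SYMPTOMS,
    COMPLICATIONS,
)
```

Here the fatigue record falls within 90 days of infection and is ignored, whereas the thrombosis record is later than the cutoff, so the person would be flagged for PASC but not for long COVID.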

OUTCOMES from analyses: (1) Incidence rates and period prevalence of long COVID, PASC and other medical conditions; (2) baseline characteristics (incl. demographics, comorbidities, frequency of healthcare utilisation, drug utilisation); (3) clusters (subgroups) of long COVID symptoms and long COVID disease trajectories.
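
For orientation, the two frequency measures in (1) can be written as simple ratios, as in the Python sketch below. The function names, the 365.25-day year and the reporting unit of 100,000 person-years are assumptions chosen for the example; the summary does not state how denominators will be constructed or in which units results will be reported.

```python
def incidence_rate(new_cases, person_days, per=100_000):
    """Incidence rate: new cases per `per` person-years at risk."""
    if person_days <= 0:
        raise ValueError("person_days must be positive")
    person_years = person_days / 365.25
    return new_cases * per / person_years


def period_prevalence(cases_in_period, population_in_period):
    """Period prevalence: proportion of the population with the condition
    at any time during the period."""
    if population_in_period <= 0:
        raise ValueError("population_in_period must be positive")
    return cases_in_period / population_in_period


# Hypothetical numbers for illustration only.
example_rate = incidence_rate(new_cases=25, person_days=1_826_250)
example_prevalence = period_prevalence(cases_in_period=120, population_in_period=10_000)
```

With these made-up inputs, 25 new cases over 5,000 person-years correspond to 500 cases per 100,000 person-years, and 120 cases among 10,000 people correspond to a period prevalence of 1.2%.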

Collaborators

Annika Jodicke - Chief Investigator - University of Oxford
Kim López-Güell - Corresponding Applicant - University of Oxford
Antonella Delmestri - Collaborator - University of Oxford
Daniel Dedman - Collaborator - CPRD
Daniel Prieto-Alhambra - Collaborator - University of Oxford
Jessie Oyinlola - Collaborator - CPRD
Junqing Xie - Collaborator - University of Oxford
Martí Català Sabaté - Collaborator - University of Oxford
Theresa Burkard - Collaborator - University of Oxford
Wai Yi Man - Collaborator - University of Oxford
Zara Cuccu - Collaborator - CPRD

Linkages

HES Admitted Patient Care; Patient Level Index of Multiple Deprivation