Development of a system-agnostic algorithm for a Mother-Baby Link in the CPRD primary care data.

Study type
Date of Approval
Study reference ID
Lay Summary

Electronic health record databases, such as the Clinical Practice Research Datalink (CPRD), are an important tool for monitoring how exposure to medicines during pregnancy and breastfeeding affects a child’s health. However, this requires an accurate link between mothers’ and babies’ medical records. There is a CPRD GOLD Mother Baby Link (MBL), but no equivalent MBL in CPRD Aurum.

This study aims to create a system-agnostic algorithm, based on what we know from the CPRD GOLD MBL, to link mothers and their children, and to apply it to CPRD Aurum. We will ascertain all relevant delivery, labour, pregnancy outcome, and postnatal data in CPRD Aurum and review available information on the family number to create an MBL algorithm.

We will validate the CPRD Aurum MBL in two ways: by comparing the number of successful mother-baby matches with the existing CPRD GOLD MBL, and by comparing the number of deliveries in the CPRD Aurum Pregnancy Register that are also found and linked to a child in the CPRD Aurum MBL, and vice versa. The system-agnostic algorithm will then be applied to CPRD GOLD.

The CPRD Aurum MBL will subsequently be made available, alongside the updated CPRD GOLD MBL, on a study-specific basis to users of CPRD data wishing to study pregnancy outcomes (subject to Research Data Governance approval). Given the larger number of patients represented in the CPRD Aurum database, this will make it easier for researchers to look at how new or rare exposures impact a child’s health.

Technical Summary

This methodological study aims to create a system-agnostic algorithm to link the medical records of mothers and children. We will ascertain all sources of delivery, pregnancy outcome, labour, and postnatal data in CPRD Aurum, and review available information on the family number to create the algorithm. The existing CPRD GOLD MBL will be used to develop and validate the algorithm. We will apply the algorithm to CPRD Aurum and CPRD GOLD to generate MBL registers. The algorithmic steps will be system-agnostic, but differences in data structure and coding systems between the two databases will be accounted for.
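
As an illustration only, the kind of linkage step described above could be sketched in Python as follows. This is not the CPRD algorithm: the record types, the field names (family_id, delivery_date, estimated_dob), the 60-day tolerance, and the rule of keeping a child only when exactly one candidate mother remains are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Delivery:
    # Hypothetical record: one delivery event for a mother.
    mother_id: str
    family_id: str
    delivery_date: date


@dataclass(frozen=True)
class Child:
    # Hypothetical record: one child with an estimated date of birth.
    child_id: str
    family_id: str
    estimated_dob: date


def link_mothers_to_children(deliveries, children, max_gap_days=60):
    """Return (mother_id, child_id) pairs that share a family identifier and
    whose delivery date and estimated date of birth are close together.

    max_gap_days is an assumed tolerance, not a published CPRD rule.
    """
    # Index deliveries by family identifier for quick lookup.
    by_family = {}
    for d in deliveries:
        by_family.setdefault(d.family_id, []).append(d)

    links = []
    for c in children:
        candidate_mothers = {
            d.mother_id
            for d in by_family.get(c.family_id, [])
            if abs((d.delivery_date - c.estimated_dob).days) <= max_gap_days
        }
        # Assumed rule: keep the link only when it is unambiguous.
        if len(candidate_mothers) == 1:
            links.append((candidate_mothers.pop(), c.child_id))
    return links


deliveries = [
    Delivery("m1", "f1", date(2010, 5, 10)),
    Delivery("m2", "f2", date(2011, 1, 1)),
]
children = [
    Child("c1", "f1", date(2010, 5, 15)),  # same family, dates agree
    Child("c2", "f2", date(2012, 1, 1)),   # same family, dates too far apart
    Child("c3", "f3", date(2010, 5, 10)),  # no delivery in this family
]
links = link_mothers_to_children(deliveries, children)
```

In this toy input only child c1 is linked, to mother m1. A real implementation would also have to handle the differences in data structure and coding systems between CPRD GOLD and CPRD Aurum.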

The mother cohort will include female patients in CPRD Aurum of childbearing age with delivery, pregnancy outcome, labour, or postnatal events. The child cohort will include patients in CPRD Aurum registered from 1986 onwards with year of birth recorded. Since CPRD does not collect the full date of birth (DOB), we will apply algorithmic rules to estimate DOB based on records within the data.
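
The estimation rules themselves are not specified here. Purely as a hedged example of what rule-based estimation can look like, the Python sketch below uses three assumed rules: take mid-month if a month of birth happens to be available, otherwise take the earliest record date if it falls in the birth year, otherwise fall back to mid-year. The function name, its parameters, and all three rules are hypothetical.

```python
from datetime import date
from typing import Optional


def estimate_dob(
    year_of_birth: int,
    month_of_birth: Optional[int] = None,
    earliest_record: Optional[date] = None,
) -> date:
    """Estimate a date of birth from partial information (assumed rules)."""
    # Assumed rule 1: a known month of birth gives the 15th of that month.
    if month_of_birth is not None:
        return date(year_of_birth, month_of_birth, 15)
    # Assumed rule 2: an earliest record in the birth year is used as a proxy.
    if earliest_record is not None and earliest_record.year == year_of_birth:
        return earliest_record
    # Assumed rule 3: otherwise fall back to the middle of the birth year.
    return date(year_of_birth, 7, 1)


dob_from_month = estimate_dob(2015, month_of_birth=3)
dob_from_record = estimate_dob(2015, earliest_record=date(2015, 9, 2))
dob_fallback = estimate_dob(2015, earliest_record=date(2016, 1, 5))
```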

We will validate the CPRD Aurum MBL by assessing, in both MBLs, the number of successful mother-baby matches per woman, age, and calendar year; the number and percentage of multiple births; and the mean, SD, median, and IQR of the number of children linked to a mother. We will also validate against the CPRD Aurum Pregnancy Register by assessing concordance of delivery, timing of delivery and birth date, and multiple births between the CPRD Aurum Pregnancy Register and the CPRD Aurum MBL.
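
The summary statistics for the number of children linked to a mother could be computed along the following lines. This is a minimal Python sketch using only the standard library; the input format (a list of mother and child identifier pairs) and the function name are assumptions, and the quartiles use the default method of statistics.quantiles, which may differ from the method used in the study.

```python
from collections import Counter
from statistics import mean, median, quantiles, stdev


def children_per_mother_summary(links):
    """Summarise the number of linked children per mother.

    links is an iterable of (mother_id, child_id) pairs (assumed format).
    At least two mothers are needed for the SD and quartiles to be defined.
    """
    counts = sorted(Counter(mother for mother, _ in links).values())
    q1, _, q3 = quantiles(counts, n=4)  # quartile cut points
    return {
        "mothers": len(counts),
        "mean": mean(counts),
        "sd": stdev(counts),  # sample standard deviation
        "median": median(counts),
        "iqr": (q1, q3),
    }


# Toy data: four mothers with 1, 1, 2 and 3 linked children.
example_links = [
    ("m1", "c1"),
    ("m2", "c2"),
    ("m3", "c3"), ("m3", "c4"),
    ("m4", "c5"), ("m4", "c6"), ("m4", "c7"),
]
summary = children_per_mother_summary(example_links)
```

Running the same function on the links from each MBL would give directly comparable summaries.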

The CPRD Aurum MBL will subsequently be made available on a study-specific basis to users of CPRD data wishing to study pregnancy outcomes (subject to Research Data Governance approval). Given the larger number of patients represented in the CPRD Aurum database, this will make it easier for researchers to look at how new or rare exposures impact a child’s health.

Health Outcomes to be Measured

Linking mothers and their children in CPRD Aurum.
• Proportion of infants matched to a mother
• Proportion of women with a live birth record who are matched to an infant
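
The two proportions above are simple ratios of matched patients to the relevant cohort. A small Python sketch, with a hypothetical helper name and made-up identifiers, might look like this:

```python
def proportion_matched(matched_ids, cohort_ids):
    """Share of the cohort that appears among the matched identifiers."""
    cohort = set(cohort_ids)
    if not cohort:
        return 0.0  # assumed convention for an empty cohort
    return len(cohort & set(matched_ids)) / len(cohort)


# Made-up (mother_id, child_id) links and cohorts, for illustration only.
links = [("m1", "c1"), ("m1", "c2"), ("m2", "c3")]
infants = ["c1", "c2", "c3", "c4"]
women_with_live_birth = ["m1", "m2", "m3", "m4"]

infants_matched = proportion_matched((c for _, c in links), infants)
women_matched = proportion_matched((m for m, _ in links), women_with_live_birth)
```

With these toy inputs, three of four infants and two of four women are matched.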


Jennifer Campbell - Chief Investigator - CPRD
Sonia Coton - Corresponding Applicant - CPRD
Rachael Williams - Collaborator - CPRD
Stephen Welburn - Collaborator - CPRD

Former Collaborators

Arlene Gallagher - Collaborator - CPRD
Rebecca Dliwayo - Collaborator - CPRD


CPRD Mother-Baby Link; Pregnancy Register