Use of routinely collected national data to determine the delays that lead to poor outcome for cancer patients in the UK

Date of Approval
Application Number
Technical Summary

Survival following cancer diagnosis in the UK is lower than in other Western and Northern European countries. The symptomatic patient pathway begins with the first symptom, and involves clinical events such as the first GP consultation, hospital appointments, investigations, and treatment. Our hypothesis is that detrimental events during the patient's diagnosis and treatment lead to delays, causing the poor 1-year survival seen in the UK. Our aim is to identify the causes of these delays by analysing CPRD linked data: GP consultations, HES admissions, the NCIN cancer dataset and ONS mortality data. CPRD is the largest source of routinely collected information available in the UK. Studies have shown CPRD is a reliable source of consultation dates, co-morbidities, complications, medications and cancer diagnoses. Further information regarding chemotherapy, radiotherapy and complications following surgery can be identified in HES. However, CPRD linked data are limited because the duration of symptoms before consultation and outpatient data are not routinely available.

For the purpose of this study, the patient pathway will be split into two phases: before and after diagnosis. Before diagnosis, we will determine the effect of the type of symptom, co-morbidity, demographics and geographical location on a delay in diagnosis, and how this delay affects survival. Following diagnosis, the type of operation (if relevant), stage of tumour, co-morbidities and complications will be used in multilevel regression modelling to determine their effect on survival.

The date of diagnosis will be determined using the hierarchical levels of evidence set out by the International Association of Cancer Registries, with first histological confirmation being the preferred date of diagnosis.
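
The hierarchical selection described above could be sketched as follows. This is only an illustration: the text states that first histological confirmation is preferred, but the lower levels of the ranking and every field name used here are assumptions, not the registry definition or the linked dataset's schema.

```python
from datetime import date
from typing import Dict, Optional

# Illustrative ranking only. Histological confirmation comes first, as stated
# in the text; the remaining levels and all key names are assumptions.
EVIDENCE_PRIORITY = [
    "first_histological_confirmation",
    "first_hospital_admission",
    "first_outpatient_consultation",
    "death_certificate",
]


def date_of_diagnosis(evidence: Dict[str, Optional[date]]) -> Optional[date]:
    """Return the date from the highest-ranked evidence source that is present."""
    for source in EVIDENCE_PRIORITY:
        found = evidence.get(source)
        if found is not None:
            return found
    return None  # no usable evidence recorded for this patient


# Hypothetical patient: admitted before histology was available.
example = {
    "first_hospital_admission": date(2010, 3, 1),
    "first_histological_confirmation": date(2010, 3, 9),
}
chosen = date_of_diagnosis(example)
```

In this example the histological date is chosen even though the admission came earlier, because the ranking, not chronology, decides which date is used.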

The study population will comprise all patients over the age of 18 who have a diagnosis of colorectal, breast, prostate, oesophagogastric or lung cancer in the NCIN cancer dataset and who have linked data in CPRD.
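
As a minimal sketch, the inclusion criteria could be expressed as a single filter. The function and argument names are hypothetical, the site labels are placeholders rather than the coding used in the cancer dataset, and "over the age of 18" is read here as strictly older than 18, which is an assumption.

```python
# Placeholder labels for the five tumour sites named in the text; the real
# dataset will identify sites with its own diagnostic codes.
ELIGIBLE_SITES = {"colorectal", "breast", "prostate", "oesophagogastric", "lung"}


def is_eligible(age_at_diagnosis: int, cancer_site: str, has_cprd_linkage: bool) -> bool:
    """Apply the three inclusion criteria: age, tumour site and CPRD linkage."""
    return (
        age_at_diagnosis > 18  # assumption: strictly older than 18
        and cancer_site.strip().lower() in ELIGIBLE_SITES
        and has_cprd_linkage
    )
```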

Patients who survive beyond one year after diagnosis will act as controls, as previous studies have shown that the largest difference in survival between the UK and northern Europe occurs in the first 12 months. A secondary analysis will compare patients who have had a curative resection with those who have not, with the absence of resection used as a surrogate for palliative treatment. A third analysis will compare patients diagnosed electively with those diagnosed as an emergency, as the latter group is known to have worse survival. Previous studies that have attempted to test this hypothesis using local datasets or notes have not been able to include sufficient patient numbers and confounders.
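
The primary case and control split might be derived from the diagnosis date and the ONS date of death roughly as below. This is a sketch under stated assumptions: the function name is hypothetical, a death on the first anniversary itself is counted as within one year, and patients with less than one year of follow-up are not handled.

```python
from datetime import date
from typing import Optional


def died_within_one_year(diagnosis: date, death: Optional[date]) -> bool:
    """True for patients who died on or before the first anniversary of diagnosis.

    Patients with no recorded death are treated as one-year survivors
    (controls). Assumption: incomplete follow-up is dealt with elsewhere.
    """
    if death is None:
        return False
    try:
        anniversary = diagnosis.replace(year=diagnosis.year + 1)
    except ValueError:
        # Diagnosis on 29 February: use 28 February of the following year.
        anniversary = diagnosis.replace(year=diagnosis.year + 1, day=28)
    return death <= anniversary
```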

The accuracy of HES has previously been verified by comparison with patient notes. Analysis of the CPRD data from within our department has demonstrated that there are sufficient numbers of clinical episodes regarding cancer symptoms before diagnosis, and complications in the community following treatment. A similar validation has also been carried out for CPRD and is described in a systematic review. That review found a median concordance of over 95% (range 74-100%) for cancer-related diagnoses. A concordance of 83% has also been noted when CPRD has been compared with the cancer registry. Although there remains a possibility of missing clinical episodes, these publications suggest that such occurrences should be infrequent.


Paul Ziprin - Chief Investigator - Imperial College London
Chanpreet Arhi - Corresponding Applicant - Imperial College London
Ara Darzi - Collaborator - Imperial College London
George Bouras - Collaborator - Imperial College London


HES Admitted Patient Care;NCRAS Cancer Registration Data;ONS Death Registration Data;Patient Level Index of Multiple Deprivation;Practice Level Index of Multiple Deprivation