Presented at the Neonatal Society 2014 Autumn Meeting.
Ibrahim B, Statnikov E, Modi N, Saxena S and the Medicines for Neonates Investigator Group
Imperial College London
Background: Data on hospital admissions in England are held in an administrative database, Hospital Episode Statistics (HES). Point-of-care, clinician-entered information extracted from the Electronic Health Records of all admissions to NHS neonatal units is entered, after cleaning, into the National Neonatal Research Database (NNRD) at the Neonatal Data Analysis Unit at Imperial College London. The utility of routine data, whether clinical or administrative, for research is limited by uncertainty around the reliability and completeness of key variables. Our aim was to examine data quality and completeness in HES and the NNRD, the agreement of key variables between them, and the feasibility of linking these databases to create a research birth cohort.
Methods: We extracted records from HES and the NNRD of all babies born in England between 1 January 2010 and 31 December 2010. We assessed the completeness of key variables (infant sex, gestational age, birth weight, multiple birth, maternal age and ethnicity) in both sources, and their agreement. We treated birth weights with standard deviation scores (SDS) above +4 or below -4 as potentially erroneous. We tested deterministic linkage using the NHS number as the common unique identifier. We performed a one-to-one merge of records from both sources on the NHS number and created a new dataset of single birth episodes containing the common key variables from each source. The data linkage rate was calculated as the number of true matches divided by the total number of records available for matching.
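The deterministic linkage step and the linkage rate described above can be illustrated with a minimal sketch. It is not the study's code: the field name `nhs_number`, the function name `link_on_nhs_number` and the toy records are hypothetical, and it assumes that "records available for matching" means NNRD records carrying an NHS number after duplicates have been removed.

```python
def link_on_nhs_number(hes_records, nnrd_records):
    """Deterministic one-to-one linkage on an exact NHS-number match.

    Returns the linked pairs and the linkage rate, taken here as
    matched NNRD records / NNRD records with an NHS number (an assumption).
    """
    # Index HES records by NHS number, skipping records without one.
    hes_by_id = {}
    for rec in hes_records:
        nhs = rec.get("nhs_number")
        if nhs:
            hes_by_id.setdefault(nhs, rec)  # keep the first record per number

    linked = []
    seen = set()
    eligible = 0
    for rec in nnrd_records:
        nhs = rec.get("nhs_number")
        if not nhs or nhs in seen:
            continue  # no identifier, or a duplicate NNRD record
        seen.add(nhs)
        eligible += 1
        match = hes_by_id.get(nhs)
        if match is not None:
            linked.append({"nhs_number": nhs, "hes": match, "nnrd": rec})

    rate = len(linked) / eligible if eligible else 0.0
    return linked, rate


# Toy records (invented for illustration only).
hes = [
    {"nhs_number": "A1", "sex": "F"},
    {"nhs_number": "A2", "sex": "M"},
    {"nhs_number": None, "sex": "F"},
]
nnrd = [
    {"nhs_number": "A1", "sex": "F"},
    {"nhs_number": "A3", "sex": "M"},
    {"nhs_number": None, "sex": "M"},
    {"nhs_number": "A2", "sex": "M"},
]

linked, rate = link_on_nhs_number(hes, nnrd)
```

In this toy example two of the three NNRD records with an NHS number find a HES match. A different choice of denominator (for example, all NNRD records) would give a different rate, so the definition should be stated alongside any reported figure.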
Results: For the calendar year 2010, records for 651,703 and 66,403 babies were extracted from HES and the NNRD, respectively. NNRD records were over 90% complete for all key variables except maternal age; completeness was higher than in HES for all variables except maternal age and ethnicity (Table 1). After data cleaning and removal of duplicates (16,334 records), 651,703 (95.2%) HES records were eligible for linkage to 45,513 (68.5%) records in the NNRD with an NHS number. Although 93% of NNRD records were successfully linked to HES births, this rate dropped to 61.4% when babies with implausible birth weights for gestational age were excluded (1.5% HES; 0.3% NNRD). The final cohort comprised 42,521 babies admitted to NHS neonatal units. The overall agreement between common fields derived from HES and the NNRD was >95% for all variables except gestational age (80.7%) and maternal ethnicity (87.1%). For maternal ethnicity and gestational age, kappa coefficient values ranged from 0.71 to 0.79; for other variables, agreement was almost perfect, with kappa values ranging from 0.92 to 0.99.
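The two agreement measures reported above, percentage agreement and the kappa coefficient, can be computed as in the following sketch. It assumes the unweighted Cohen's kappa for a categorical variable; the abstract does not state which form of kappa was used (a weighted form may have been applied to gestational age). The function name `agreement_and_kappa` and the example values are hypothetical.

```python
from collections import Counter


def agreement_and_kappa(source_a, source_b):
    """Return (observed agreement, unweighted Cohen's kappa) for paired values.

    Pairs with a missing value (None) in either source are dropped.
    """
    pairs = [(a, b) for a, b in zip(source_a, source_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no complete pairs to compare")

    # Proportion of pairs in which both sources record the same category.
    observed = sum(a == b for a, b in pairs) / n

    # Agreement expected by chance, from each source's marginal frequencies.
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        return observed, 1.0  # both sources hold a single identical category
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Invented example: infant sex as recorded in each source for eight linked babies.
hes_sex = ["M", "M", "F", "F", "M", "F", "M", "F"]
nnrd_sex = ["M", "M", "F", "F", "M", "F", "F", "F"]

observed, kappa = agreement_and_kappa(hes_sex, nnrd_sex)
```

Here seven of the eight pairs agree (observed agreement 0.875), chance agreement is 0.5, and kappa is 0.75. Kappa is lower than raw agreement because it discounts the matches expected by chance alone.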
Conclusion: We have demonstrated that record linkage between the NNRD and HES is feasible. This indicates that long-term health outcomes of babies admitted to neonatal units could potentially be identified using linked records. The utility of this powerful approach would be enhanced by improvements in data quality and completeness.
1. Murray, J., et al., Quality of routine hospital birth records and the feasibility of their use for creating birth cohorts. J Public Health (Oxf), 2013. 35(2): p. 298-307.
2. Foster, V., et al., The use of routinely collected patient data for research: a critical review. Health (London), 2012. 16(4): p. 448-63.