Abstract

Applying big-data science to real-world neonatal data

Greenbury S, Wu J, Ougham K, Hyde M, Glen R, Gale C, Angelini E, Modi N 
Institution(s) 
1) ITMAT Data Science Group, NIHR Imperial Biomedical Research Centre, Imperial College London  2) Section of Neonatal Medicine, Department of Medicine, Imperial College London 
Introduction 
The National Neonatal Research Database (NNRD) is a mature, longitudinal, relational database containing around 450 pre-defined variables, many recorded daily, that flow from the real-time, point-of-care, clinician-entered electronic patient records of all admissions to NHS neonatal units. The data form an NHS Information Standard for England (Neonatal Data Set, ISB 1595) and include demographics, diagnoses, outcomes, daily treatments and care processes. To date, the NNRD contains information on over one million patients. We aimed to develop methods to curate NNRD data for the application of machine learning (ML) and artificial intelligence (AI) techniques, and to conduct a proof-of-concept evaluation testing the hypothesis that these approaches can reliably identify clinically meaningful preterm feeding patterns.
Methods 
We studied a pseudonymised test cohort of 49,450 very and extremely preterm neonates (less than 32 weeks' gestation) born between 01 Jan 2012 and 08 Jan 2019 and admitted to neonatal units in England. We considered daily data relating to nutritional intake, applying processes to minimise missing data, ensure logical consistency of variables and convert each baby’s daily record into an aggregate summary of each nutrient type (maternal milk, donor milk, formula, fortifier, parenteral nutrition) delivered per day. We applied unsupervised ML/AI methods (k-means clustering and a more complex Dirichlet Process Gaussian Mixture Model) to cluster the cohort and to identify patterns in feeding regimens, and outliers, based on each baby’s entire length of stay. The National Research Ethics Service has approved the NNRD as a research database (16/LO/1093). The study was funded by the Imperial NIHR Biomedical Research Centre.
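The aggregation and clustering steps described above can be illustrated with a minimal sketch. It is not the study's pipeline: the nutrient labels, the `stay_profile` summary (the fraction of recorded days on which each nutrient type was given), the toy records and the farthest-first initialisation are all assumptions made for illustration, and only plain k-means is shown. A Dirichlet Process Gaussian Mixture Model could be fitted with, for example, scikit-learn's `BayesianGaussianMixture` using `weight_concentration_prior_type="dirichlet_process"`, though the abstract does not say which software was used.

```python
# Hypothetical nutrient labels; the real NNRD variable names are not given here.
NUTRIENTS = ["maternal_milk", "donor_milk", "formula", "fortifier", "parenteral"]


def stay_profile(daily_records):
    """Summarise a whole stay as the fraction of days each nutrient was given."""
    days = len(daily_records)
    return [sum(1 for day in daily_records if n in day) / days for n in NUTRIENTS]


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids


# Toy stays (entirely synthetic): each set lists the nutrients given on one day.
babies = {
    "a": [{"parenteral"}, {"parenteral", "maternal_milk"},
          {"maternal_milk"}, {"maternal_milk", "fortifier"}],
    "b": [{"parenteral"}, {"maternal_milk"},
          {"maternal_milk"}, {"maternal_milk", "fortifier"}],
    "c": [{"parenteral"}, {"formula"}, {"formula"}, {"formula"}],
    "d": [{"parenteral"}, {"parenteral", "formula"}, {"formula"}, {"formula"}],
}
profiles = [stay_profile(days) for days in babies.values()]
labels, centroids = kmeans(profiles, k=2)
print(dict(zip(babies, labels)))
```

On this toy input the two mostly maternal-milk stays share one cluster and the two mostly formula stays share the other. Summarising a stay by day fractions discards the order of feeding events, so the time-series view of each cluster would need a separate step.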
Results 
We demonstrated that our clustering approaches yielded clinically meaningful and interpretable findings. We identified around 10 typical feeding patterns that describe 80% of the population; a large number of rare patterns described the remainder. The largest group (~30%) clearly illustrated the well-recognised trade-off between mother’s milk and formula, with other groups trading fortifier and formula with the presence of a larger proportion of mother’s milk. The fifth-largest cluster (~7%) was a high-mortality group with shorter length of stay, receiving mostly parenteral nutrition. We additionally considered the average time series of feeding events associated with each group, identifying a small number of expected feeding transitions in dominant clusters, and more complicated transitions in the rare clusters.
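A statement such as "around 10 patterns describe 80% of the population" amounts to ranking clusters by size and accumulating their share of the cohort. The sketch below shows that calculation on made-up labels; the function name `clusters_to_cover` and the example data are hypothetical and do not reproduce the study's figures.

```python
from collections import Counter


def clusters_to_cover(labels, target=0.8):
    """Return (number of largest clusters, share covered) needed to reach target."""
    if not labels:
        raise ValueError("labels must not be empty")
    total = len(labels)
    covered = 0
    ranked = Counter(labels).most_common()  # largest cluster first
    for rank, (_, size) in enumerate(ranked, start=1):
        covered += size
        if covered / total >= target:
            return rank, covered / total
    return len(ranked), covered / total


# Made-up cluster labels for ten babies: cluster sizes 5, 3, 1 and 1.
example_labels = [0] * 5 + [1] * 3 + [2] + [3]
n_clusters, share = clusters_to_cover(example_labels)
print(n_clusters, share)
```

With these example labels the two largest clusters together reach the 80% target, and the remaining two small clusters play the role of the rare patterns.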
Conclusions 
We show that it is possible to apply agnostic ML/AI techniques to the NNRD and draw inferences that are in accord with clinician knowledge. This indicates potential to apply ML/AI techniques to the NNRD and wider linked datasets to obtain data-driven insights. Examples include detection of non-random associations to identify possible disease determinants, and predictive modelling using complex temporal longitudinal data to uncover patient pathways and consider interventions that might alter patient outcomes.
